TAC SCM Project Report
Anat Ganor Moshe Sapir
Supervisor: Ishai Menashe
Lab Director: Dr. Ilana David
Computer Science & Electronic Engineering departments
Technion – Israel Institute of Technology
1. Abstract
This report overviews our agent techtac that participated in the 2004 Trading Agent Competition on Supply Chain Management. The document summers the game principals, challenges posed by it and our general approach to the solution. It also presents our work process and efforts during the past year. We also report more detailed description of our design, algorithms we used and finally, performance analysis and conclusions.
2. Game overview
TAC SCM (Trading Agent Competition – Supply Chain Management) simulates PC assembly line controlled by agents, which compete for customer orders and for procuring the components needed to assemble the PCs. In each game instance, the agent competes against 5 other agents, where the final goal is to maximize the revenue.
The three basic entities in the game are the Agents, Customers and Suppliers. Each entity represents a set of interests in the supply chain scenario. The customers and suppliers are simulated by the game server. The different agents, which are implemented by the game competitors, interact with those entities to promote their interests in the game.
The following diagram illustrates the different entities’ role in the game:
In TAC SCM an agents task is to manufacture PC’s, win customer orders, and procure components
Supplier Model
The suppliers produce PC components in a build-to-order strategy. When receiving requests for components from the agents, they need to make two basic decisions: whether to accept the request, and which price tag to put on it. The first decision is made according a mechanism they employee which evaluates the agents by their willingness to actually buy the components they requested in the past. The second decision is made by evaluating the future production line’s capacity and utilization.
Customer Model
Customers request PCs from the agents. A request contains quantity, due date and maximal price constraints. The different agents need to bid on those requests. The customers will accept an agent bid that satisfies the request constraints and has the lowest price. The level of customer demand is modeled by knows stochastic functions.
A Day in Agents Life
As mentioned the agents need to interact with both the customers and suppliers. In addition, they need to maintain their PCs assembly line and employee a delivery policy that decides which PCs will be delivered to which costumers. The following list summarizes the agent’s daily activities:
1. Which customer RFQs to bid on.
2. Which components to procure (and issue RFQs to the suppliers).
3. Which supplier offers to accept.
4. Which PCs to assemble.
5. Which assembled PCs to ship to the customers.
The following diagram illustrates the different activities of the agent each day:
3. Motivation
In today’s global economy, with worldwide transactions in trillions of dollars, effective supply chain management is vital to the competitiveness of manufacturing enterprises. It impacts their ability to meet changing market in a timely and cost effective manner. The potential improvement in performance is tremendous, as today supply chains are essentially static. More flexible and dynamic practice will better match between suppliers and customers while the market conditions change.
There are advantages to testing strategies for that practice in a competitive setting where other agents and the scenario itself designed by others. It allows comparison of different approaches in the same domain.
4. The Challenge
The TAC SCM game scenario poses great challenges, as the agent has to act in a very high-dimensional, non-deterministic environment. The supply chain is also a challenging environment for planning and scheduling, for the following reasons. First, the agent faces substantial uncertainty and incomplete information. Though we may possess information about our own local state, we have at best general knowledge and access to only noisy data of other operations on the supply chain. Nevertheless, we cannot avoid making commitments before all relevant uncertainty is resolved. Second, the supply chain is highly dynamic. Decisions made at one time propagate and affect conditions elsewhere on the chain at disparate times. Also external conditions may cause sudden changes to our view of the game’s state. Third, the other agents are self-interested whose behavior is strategic. The game is far too complex to solve analytically or characterize optimal behavior, due to those issues of uncertainty, dynamism, and strategy.
5. Related Research Fields
The TAC SCM presents a large spectrum of problems residing in different scientific disciplines such as Computer Science, Electric Engineering and Industrial Engineering. Although the relevance of the following list to the problem is obvious, none of them presents comprehensive solution for planning and scheduling supply chains. The following list presents some relevant research fields.
1. Control Systems – From the game’s properties and the scenarios it presents, rises the need to evaluate the state of this complex system in a continuous or discrete manner, and to process that data in a controlled form under limited boundaries into a decision function. This kind of continuous decision-making also implies the use of feedback mechanisms.
2. Optimization – Some of the decisions the agent need to take and also the problem as a whole can be viewed as an optimization problem under well-defined set of constraints.
3. Dynamic programming – There is a strong connection to dynamic programming methods since most of the information we have changes continuously, the behavior of the customers and suppliers is dynamic. Our decisions that were taken at one point will be applied at different time and might have an affect at another time, as well as the decisions other agents take.
4. Distributed and Parallel Programming – The need to make real time complex decisions raises the possibility of using parallel and distributed computation. The field provides various methods to communicate and synchronize multiple computing modules that use and update shared data.
5. AI – The agent need to make various decisions during the game, many of them involve search in large spaces. AI provides various approaches and algorithms for searching large spaces and decision-making.
6. Computational Learning theory – The agent acts in a partial-knowledge environment while pieces of information arrive continuously giving us a feedback It implies that learning algorithms can be used to complete the missing data by identifying the parameters’ values used in the generally known models of the customers and suppliers, as well as the strategic behavior of the other agents.
7. Supply Chain Models – There is a lot of research on the supply chain management and relevant models to describe it. Also the behaviors of the game’s entities that compose the supply chain are practices of different known models (build-to-order, buy-to-build etc).
8. Non-Cooperative Game Theory – Since the TAC is a non-cooperative game there is plenty of relevant research in the game theory field concerning tactics, future profit expectation, equilibrium etc.
6. The Work Process
A lot of work involved in the development of the basic outline of our solution, the forming of the full design, the implementation stage and the participation in the competition. We try to summarize the stages we went through during the last two semesters. This may give a wider perspective on our work process and the final result that was achieved:
1. The preparation stage –
1.1. After choosing the SCM competition, we have studied the game’s rules and the behavior of the customers and suppliers, to acquire deep understanding of the problem we face, along with some intuition. Special attention was given to interesting stochastic calculations involved in the game’s scenario.
1.2. We analyzed the code provided to us and ran an example agent to ensure we know how to communicate with the server of the game. A thought was invested in order to build our solution in a way that will enable us maximize the use of the functionality given to us.
1.3. We surveyed published papers and books relating to supply chain management, in order to get acquainted with related research fields. During the passing year tens of papers were published, including different approaches to the game and detailed designs and algorithms used in last year competition. We relied on the experience of Ishai and other competitors. The background, reading material and ideas collected during the beginning of this project helped us form our solution to this complex problem.
2. Forming our solution’s design -
2.1. Looking on the game from several different aspects, we wrote initial drafts of guiding principles that will form our solution. These included possible strategies, interesting situations we might encounter during the game, data we can get, information we can evaluate using learning and search techniques, data reliability and the information that we would like to know for making the necessary decisions.
2.2. Based on the above analysis we defined a set of state parameters, which provide a dynamic observation of the game, and enable to analyze its status. We wrote guiding behavior rules that might be applied in sub domains of the problem. This led to the ideas of distribution architecture, dynamic shared information, changing strategic and feedback control that will be explained later.
3. The implementation stage –
3.1. First we implemented the basic structure and communication. After achieving working communication with the server of the game, our goal was to write all basic data structures within the agent, trying to save processed data in an efficient way without losing any information. This data will be used by all modules. We also paid special attention to the separation of independent parts in our design in order to enable maximum use of the hardware and time limitations.
3.2. When starting the implementation of each module we raised ideas from different connected research fields that might help us solve these sub problems ahead. The evaluation of the main ideas was based on the difficulties of the game, including uncertainty, dynamism and strategy. We used basic algorithms to process all relevant data efficiently into the necessary decisions. We tried to maximize the use of the data calculated and the limited time given each day.
3.3. The implementation raised issues of time efficiency, logic separation of the control and synchronization. We developed a synchronizing mechanism to maintain all modules synchronized with the server of the game using a timer controlled by the agent and a set of messages, some optional and some that occur every day during a specific event. The flow of messages was also designed to define the flow of control and distribution of responsibility concerning the transfer of meaningful decisions to the server of the game, on time.
3.4. Next we added advanced methods and more sophisticated ideas that require additional developing time to improve and stable the behavior of our agent. We tried to develop the algorithms used in each module and extended the data structures within each module that serve the entities in that module only.
4. Experiments and participating in the competition -
4.1. We ran our agent in simulation games in order to measure the performance of each module and the correct interaction between the modules. We tried to use different initial values to get variety of strategic behaviors. We tuned the agent with the initial parameters that produce the best performance according the measurements
4.2. Deploy of the agent in the competition environment to ensure best performance during the competition. Participating in the competition included working at strange hours depending on the server’s time in order to fix problems and bugs and to refine our agent in the new environment.
5. Analysis and conclusion –
During our work we collect ideas for future enhancements such as to refine the feedback process by making computations over a duplication of the State, throughout each day, exploiting the modules’ idle time. Such computations might lead to better decision-making according more insightful projection of the future state. We also discovered issues in our design that need to be improved in order to achieve better results. Looking at the games during the competition, exposing our agent to so many different strong agents helped us to better understand the game. We present our conclusions in that document.
7. Our Solution
8.1 Layout
Since the problem in hand is complex and previous research does not provide a comprehensive solution, the basic underlying idea of our approach is to perform a decomposition of the control architecture into three separated entities. The procurement module interacts with the suppliers and makes the decisions related to the biding for PCs components, the sales module interacts with the customers and makes RFQs for selling PCs and the factory module controls the manufacturing and shipping schedules. Apart from these three functional modules there’s the agent module that also holds the desired information on the game’s state. The collection of decisions the agent has to make each day consists of all the decisions made by the modules.
The decomposing of the complex problem has several advantages. It will allow us to concentrate at different aspects of the problem separately, applying existing approaches and tools designed for local operations within a supply chain context. It also might enable us the use of parallel programming to accelerate the agent reactions during the game. However, this approach makes it more difficult to achieve fast and uncompromising reactions during the game.
In order to coordinate between all modules and achieve a progress towards their shared purpose of maximizing the agent’s profit, we use a feedback control mechanism. The information gathered during the game reflects the decisions that were made so far by all modules as well as the effect others had on the game’s state. That way the modules receive a feedback on their performance and would be affected by other modules when evaluating future decisions according to the shared information.
The following is a schematic formula illustrating the feedback idea.