Planning in Autonomous Agents

A Three-Layer Hybrid Architecture for Planning in Autonomous Agents

M. Alwan and N. E. Cheikh Obeid

Automation Department

Higher Institute for Applied Sciences and Technology

P.O.Box 31983

DAMASCUS, SYRIA.

Fax: +963-11-223 7710

N. N. Kharma and P. Y. K. Cheung

Information Engineering Section

Department of Electrical & Electronic Engineering

Imperial College of Science, Technology and Medicine

University of London, Exhibition Road

LONDON SW7 2BT, UK.

Fax: +44-71-581 4419

Abstract

This paper describes an architecture for Autonomous Agents that combines the advantages of both the classical Hierarchical and the Layered Architectures. The aim is to build more ‘Intelligent’ Autonomous Agents that are capable of planning for high level goals as well as having the ability to react and respond to arising situations in a human-like manner.

1 Introduction

Planning is the process of reasoning, utilising every relevant piece of information available about the task and the environment, for the purpose of finding a sequence of steps leading to a goal. A planner was the heart of the first generation of robots. Early attempts by the Artificial Intelligence community to design Autonomous Mobile Robots relied on a crisp symbolic representation of a simplified static model of the environment. The planner reasoned over this internal model to produce plans that would achieve the goals set for the robot, and the robot then carried out the plan blindly [1]. This approach has given unsatisfactory results for Autonomous Agents: it showed brittleness, inflexibility, lack of speed and other negative effects whenever a situation arose to which the plan was unable to react [2]. In such situations the plan failed, and a model update and re-planning, both known to be slow processes [1], were needed.

This failure led researchers to doubt the need for planning, or even a world model, and resulted in a radically different view in which the emphasis is shifted from planning towards the tight linking between sensing and action in a distributed network of hardwired behaviours [3]. In such a system, interaction between behaviours, where higher level behaviours could suppress the output of lower level ones, has produced emergent behaviours that enabled the agent to operate autonomously and achieve goals. However, the emergent behaviour approach is inflexible in terms of having different and changing goals; for each goal there has to be a separate hardwired behaviour.

Another "middle of the road" approach to the planning issue was Payton's work [4], which adopted the 'Plans as Communication' view discussed by Agre and Chapman, who state that a plan should be a source of action rather than a program of actions [5]. Payton armed his plan with the ability to react opportunistically to unexpected changes in real time by calculating a gradient field description of a plan to achieve a certain goal or set of goals. The approach works well and generates plans that are much less likely to fail than classical plans. However, when a situation that necessitates re-planning arises, plan regeneration turns out to be very slow, because it must re-calculate the gradient field, which determines the appropriate action for every possible state in the state space.

In this paper we will argue in favour of planning in Autonomous Agents, show the requirements necessary for a plan to succeed in the real world, and propose an architecture for Autonomous Agents, explained in the context of an Autonomous Guided Vehicle. Further, we will outline a path planning method that fits into the overall architecture, and show its merits.

2 Why is Planning Important?

Planning is a characteristic of the behaviour of human beings, who are acknowledged to be very successful Autonomous Agents. Although planning is time consuming (in that it delays actions until their results are anticipated, evaluated and found to be serving the goal), people do take some time to think and plan before taking an action. When humans plan they try to utilise all the available information to anticipate possible results. They also often plan under uncertain circumstances.

An incentive behind planning is always an expected saving in execution time, energy or even money. Hence, incorporating planning in Autonomous Agents would surely enhance their performance, and would result in designing purposeful goal-directed agents.

3 Requirements for Successful Planning

In the following we present the requirements for achieving high level goal directed actions without compromising flexibility, that is, the ability to react to situations unaccounted for by the planner, or the speed of response needed when such a situation confronts the agent in a manner that is not only sudden but also threatening.

Successful planning requires, firstly, an internal representation of the environment in which uncertainty and incompleteness are explicitly represented, and upon which the planner operates. This model should be continuously updated to help the agent adapt to a dynamic or changing environment. Uncertainties are inherent to physical systems because of measurement errors and noise in the sensory system. Incompleteness at early stages of operation, that is, the absence of some features that are necessary for successful planning and operation, is a direct result of the limitations of the sensory systems. If such uncertainties are not included explicitly in the world model, plans based on it are likely to fail because of the discrepancies that are bound to arise between the environment itself and the crisp world model used to generate the plan. Additionally, if there are environmental constraints that are present but cannot be perceived by the sensory system, such as prohibited areas, there must be a provision to force such constraints into the model.

Secondly, a planner is needed that is capable of taking into consideration the uncertainties embodied in the world model and of coping with ambiguity, which is a distinctive characteristic of human commands. Handling ambiguity means that the planner should cater for multiple possible goals, leaving the conflict resolution to a later stage. Such a planner produces realistic plans that are less likely to fail when executed in the environment, unless an unexpected situation arises and prevents the agent from executing a certain step of the plan. To solve this problem of over-commitment to the plan, to reduce the invocations of the world model update and re-planning, and to allow for more reactivity and flexibility in the agent's performance, the plan can be abstracted by analysing it and identifying the critical points, bottlenecks, turning points, or compulsory passageways. The identified points, or subgoals, form the skeleton of a universal plan that the agent has to try to fulfil, leaving the executive system, which carries out the plan, to perform local planning, improvise and fill in the plan details. Flexibility can be increased further by declaring the subgoals themselves to be fuzzy subgoals: if plan execution brings the agent to the vicinity of the current subgoal in the state space, such that the next subgoal can be pursued, the executive considers the current subgoal satisfied, abandons it and starts pursuing the next one. This is useful when circumstances change and the complete fulfilment of one of the subgoals at hand becomes impossible, while its partial fulfilment still allows the current plan to be carried out without re-planning.
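The fuzzy subgoal idea can be illustrated with a minimal sketch. It is not the executive described in this paper: it assumes that "vicinity" can be reduced to Euclidean distance in a two-dimensional state space, and the names `subgoal_satisfaction` and `advance_subgoal`, the linear membership shape, and the `tolerance` and `threshold` values are all hypothetical.

```python
import math


def subgoal_satisfaction(position, subgoal, tolerance):
    """Degree in [0, 1] to which `position` satisfies `subgoal`.

    Assumed shape: 1.0 exactly at the subgoal, falling linearly to 0.0
    at a distance of `tolerance` from it.
    """
    distance = math.dist(position, subgoal)
    return max(0.0, 1.0 - distance / tolerance)


def advance_subgoal(position, subgoals, index, tolerance=1.0, threshold=0.5):
    """Return the index of the subgoal the executive should pursue next.

    The current subgoal is abandoned as soon as it is satisfied to at
    least `threshold`, even if it was never reached exactly.
    """
    if index < len(subgoals):
        degree = subgoal_satisfaction(position, subgoals[index], tolerance)
        if degree >= threshold:
            return index + 1
    return index


# Usage: the agent is 0.3 units short of the first subgoal, which is
# close enough (degree 0.7) to move on to the second one.
subgoals = [(2.0, 0.0), (2.0, 3.0)]
next_index = advance_subgoal((1.7, 0.0), subgoals, 0)
```

A fuller version would also test whether the next subgoal is actually reachable from the current state, which is the condition the text attaches to declaring a subgoal satisfied.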

Thirdly, a flexible executive having the ability to react to unexpected situations is necessary. The reactive executive uses on-line sensory information, together with the plan guidance, to generate the appropriate actions in real time. The power of fuzzy logic and qualitative reasoning can be utilised to fill in plan details by providing a conscious reaction, as opposed to the traditional blind execution. This kind of response is a reaction in terms of human behaviour.

Finally, a module linking sensory information directly to instantaneous actions is important for preserving the agent and protecting it from the effects of sudden threats. Such a response is a reflex in terms of human behaviour.

The above remarks provide guidelines for a general architecture suitable for Autonomous Agents that can plan purposefully, react to plan contingencies, and exhibit protective reflexes in a human-like manner.

In the next section we will briefly describe the architecture dictated by the above guidelines, in the context of an Autonomous Guided Vehicle.

4 The Hybrid Architecture

The architecture comprises three different layers: a reflexive layer, a reactive layer and a goal directed actions layer, divided according to their complexity, speed and range of effect as Figure 1 shows.

The goal directed actions layer is hierarchical, and has a long term and a long range of effects on the overall behaviour; it also requires most of the computations in the system for map building and planning. The reactive layer is faster and has a closer range of effects; it is still hierarchical in structure and requires some time for building a local view of the vehicle's surroundings and reasoning about it. The reflexive layer has the closest range, the fastest effect and, most importantly, the highest priority. The name Hybrid is derived from this composite structure, in addition to the capabilities inherited from both the Hierarchical and the Layered Architectures.

The components of the three above mentioned layers are described below.

4.1 The Perception System

The perception system uses an array of sonar range detectors to sense objects in the environment, builds a local view of obstacles surrounding the vehicle, and then uses the local view to update a global map of the environment. The sonar itself has a measurement error that introduces uncertainty in addition to measurement noise.

Fig. 1 A Block Diagram of the proposed Hybrid Architecture for an Autonomous Guided Vehicle with its three main Layers.

To model this uncertainty we use a degree of confidence measure to express to what degree the range reading is accurate, i.e. a fuzzy membership function of an obstacle area; a free area will have a membership function that is the fuzzy complement [6] of the obstacle area:

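$$\mu_{\mathrm{free}}(x) = 1 - \mu_{\mathrm{occ}}(x) \qquad (1)$$

where $\mu_{\mathrm{free}}(x)$ is the degree of confidence in free areas and $\mu_{\mathrm{occ}}(x)$ is the degree of confidence in occupied areas.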

Fig. 2 shows the simplified membership functions graphically.

In the figure R is the sonar range reading, e is the absolute maximum error; the actual shape of the occupancy membership function depends on noise distribution and can be constructed from probabilistic data. A probabilistic distribution study of sonar beams can be found in [7] and [8].

The resulting view and map will contain obstacle areas, free areas, and possibly occupied areas to a certain degree of possibility. Unknown areas can exist during the exploration stages, and they can be treated as possibly occupied areas to a degree of 0.5. Hence, the map built will also include incompleteness; however, the global map is built gradually and eventually it will be complete.
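As an illustration only, the degrees of confidence along a single sonar beam could be sketched as follows. The piecewise-linear shape is an assumption and not the function used in this paper: confidence in occupancy is taken to be 0 well in front of the echo, to rise to 1 at the range reading R, and to fall back to the "unknown" degree of 0.5 beyond R + e, where the beam carries no information. The function names are hypothetical.

```python
def occupancy_confidence(d, R, e, unknown=0.5):
    """Degree of confidence that the point at distance d along the beam
    is occupied, given range reading R and absolute maximum error e.

    Assumed piecewise-linear shape (not taken from the paper).
    """
    if e <= 0:
        raise ValueError("the maximum error e must be positive")
    if d <= R - e:
        return 0.0                                   # clearly in front of the echo
    if d <= R:
        return (d - (R - e)) / e                     # ramp from 0 up to 1 at R
    if d <= R + e:
        return 1.0 - (1.0 - unknown) * (d - R) / e   # ramp from 1 down to unknown
    return unknown                                   # behind the echo: not observed


def free_confidence(d, R, e):
    """Fuzzy complement of the occupancy confidence."""
    return 1.0 - occupancy_confidence(d, R, e)


# Usage: a reading of 2.0 m with a maximum error of 0.2 m.
samples = [(d, occupancy_confidence(d, 2.0, 0.2)) for d in (1.0, 1.9, 2.0, 2.1, 3.0)]
```

With the complement defined this way, unobserved points behind the echo receive a degree of 0.5 for both "free" and "occupied", which matches the treatment of unknown areas described above.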

Fig. 2 Shape of membership functions of free and occupied areas for a sonar range reading.

Moreover, the robot's kinematics can be included in the map if the map is transformed into a configuration space map [9] by growing obstacles [10].

In [11] we have shown a way to build a Fuzzy Configuration Space Map directly from on-line sonar readings.

4.2 The Human Interface System

The human interface system receives the human commands, analyses them and extracts a set of possible goals. Each goal has a degree of confidence describing either the possibility that it is what is really meant or its preference relative to the other goals. This task does not remove ambiguity; instead, it unveils it. This can be accomplished by analysing previous commands utilising a rule-base describing the behavioural pattern of the operator's commands, or alternatively by linking a command to a possible set of goals.

4.3 The Path Planner

The path planner reads the global map, the possible goals and the current position and orientation, and performs manipulations of representation. The planner then plans a low cost path that gets the vehicle from its current location to a desired goal. The cost of the path takes into account distance to a goal, degree of confidence in a goal, uncertainties, as well as path width and the vehicle's kinematics. Finally, the planner abstracts the plan and gives a list of subgoals to the navigation system. The particular path planning method is discussed in Section 5.

4.4 The Navigation System

The navigation system transforms a sequence of subgoals, which can sometimes be reached only fuzzily or approximately, into a smooth executable trajectory. To achieve this task the navigation system has two blocks. The first analyses the curvature of the planned path by comparing pairs of path segments to generate the appropriate navigational commands, that is, the desired direction and speed along the path segments. The analysis can be performed by a set of rules that take into account the constraints of the mechanical design of the vehicle, the turns that have to be performed between path segments, safety and, sometimes, the comfort of the passenger; such rules are naturally fuzzy. The second block takes an input representing a free direction from the reactions sub-system and performs a fuzzy aggregation [6] to find the modified direction of travel. It then decides the speed modification from the required change in travel direction, i.e. the difference between the direction of travel desired by the plan and the modified one.

4.5 Reactions

The reactions are rules that reason qualitatively about the local view information, allowing the vehicle to gradually avoid unexpected static or moving obstacles as long as they do not appear suddenly close to the vehicle. They generate a new travel direction, chosen because it is free while the original desired direction is not.

The rules take the following form: If there is an obstacle at a medium distance in the desired travel direction and the desired speed is high then if the space to the right of the desired direction is free then the new direction is to the right. In this example the obstacle is unexpected, because if it were known before, the path planner would have directed a turn away from it earlier, and hence the desired speed would not have been high.
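The example rule can be sketched as a single fuzzy rule evaluated with the minimum operator as the fuzzy AND. This is an illustrative sketch, not the rule base used in the vehicle: the membership shapes, the breakpoints in metres and metres per second, and the names `triangular`, `ramp_up` and `turn_right_strength` are all assumptions.

```python
def triangular(x, left, peak, right):
    """Triangular membership function, zero outside (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def ramp_up(x, low, high):
    """Membership rising linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def turn_right_strength(obstacle_distance, desired_speed, right_free):
    """Firing strength of the rule:

    IF obstacle distance is MEDIUM AND desired speed is HIGH
    AND the space to the right is FREE THEN new direction is RIGHT.

    `right_free` is a degree in [0, 1], e.g. taken from the local view.
    """
    medium = triangular(obstacle_distance, 1.0, 2.5, 4.0)  # assumed, metres
    high = ramp_up(desired_speed, 0.5, 1.5)                # assumed, m/s
    return min(medium, high, right_free)                   # fuzzy AND


# Usage: obstacle 2.5 m ahead, desired speed 1.0 m/s, right side fully free.
strength = turn_right_strength(2.5, 1.0, 1.0)
```

In a complete rule base the strengths of all fired rules would be combined into one free direction, which is then passed to the navigation system.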

In this sub-system, the rules are also tuned to produce smooth transitions in both direction and speed, fulfilling constraints similar to those of the path analysis rules.

4.6 Reflexes

The reflexes are meant to prevent the vehicle from colliding with static or moving obstacles that suddenly appear to the vehicle's sensors at a close distance. Again they are rules, but this time they link sensors directly to actuators. They are encoded and hardwired or prioritised so that they act as fast as required.

An Autonomous Guided Vehicle has limited reflexes, namely the emergency brake, because of the limited movement allowed by the mechanical design and the limitations of the sensor types fitted.

The rule for generating an emergency brake could be: If there is a sensor reading indicating a close obstacle in the direction of travel and the speed is high or medium then stop. This obstacle must have appeared suddenly, otherwise the reactive behaviour would have already reduced the speed while approaching it. In this sub-system the emphasis is on protecting the vehicle and its passenger from hitting anything or anyone, if possible. Thus this behaviour, although low level, has the top priority; it over-rides all other behaviours, as opposed to the layered architecture where higher level behaviours subsume lower level ones [1].

This approach reduces some of the redundancy present in the layered architecture.
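The priority of the reflex over the other layers can be sketched as below. This is a minimal illustration under stated assumptions: the rule is made crisp, the thresholds `close_range` and `medium_speed` are invented values, and a command is represented as a plain dictionary. None of these details come from the vehicle described here.

```python
def emergency_brake(obstacle_range, speed, close_range=0.5, medium_speed=0.5):
    """Crisp version of the reflex rule: a close obstacle in the direction
    of travel while the speed is medium or high triggers a stop.

    The thresholds (metres, m/s) are assumed for illustration.
    """
    return obstacle_range <= close_range and speed >= medium_speed


def actuator_command(obstacle_range, speed, navigation_command):
    """Select the command sent to the actuators.

    The reflex has top priority: when it fires, it overrides whatever the
    navigation system (plan plus reactions) has requested.
    """
    if emergency_brake(obstacle_range, speed):
        return {"direction": navigation_command["direction"], "speed": 0.0}
    return navigation_command


# Usage: an obstacle appears 0.3 m ahead while travelling at 1.2 m/s.
command = actuator_command(0.3, 1.2, {"direction": 10.0, "speed": 1.2})
```

Because the check reads the sensor value directly and does not wait for the local view or the map, it can run at the sensor rate, which is the point of placing it in its own layer.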

4.7 Conclusion

The proposed architecture benefits from both existing architectures of Autonomous Agents, the Layered and the Hierarchical; it exhibits the reflexes and emergent behaviour of the former and the purposeful high level planning of the latter. Finally, the proposed Hybrid Architecture seems closer to the behaviour of human beings than the other two architectures.

5 The Path Planning Method

The global Fuzzy Configuration Space Map, built by the perception system, is mapped to a resistive network in which obstacle nodes have an infinite resistance, i.e. they are disconnected from the network, free nodes have a relatively low resistance, and possibly occupied nodes have a resistance between the free node resistance and infinity, according to the degree of confidence in the node being occupied. This maps uncertainty as well as incompleteness onto the resistance model. More than one voltage source can be attached between the start point and the possible goal nodes. The start node is tied to the ground potential so that only one reference point exists in the network. The voltage sources can have different values according to the degree of confidence in the location represented by the goal node being the desired target, or according to the order of preference of the multiple targets. The path simply follows the maximum branch current at every node until a goal is reached.
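The method can be sketched on a small grid as follows. This is an illustrative sketch and not the implementation used in this work. It assumes a 4-connected grid, a branch resistance equal to the sum of the two node resistances it joins, strictly positive resistances for free nodes, and a simple Gauss-Seidel relaxation to approximate the node voltages. The function name `plan_path` and its parameters are hypothetical.

```python
import math


def plan_path(resistance, start, goals, iterations=2000):
    """Plan a path on a resistive grid.

    resistance: 2-D list of node resistances (> 0); math.inf marks an obstacle.
    start:      (row, col) of the start node, tied to ground (0 V).
    goals:      dict mapping goal (row, col) to its source voltage.
    Returns the list of nodes from start to the goal reached, or None.
    """
    rows, cols = len(resistance), len(resistance[0])

    def neighbours(node):
        r, c = node
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and math.isfinite(resistance[nr][nc]):
                yield (nr, nc)

    def conductance(a, b):
        # Assumption: branch resistance = sum of the two node resistances.
        return 1.0 / (resistance[a[0]][a[1]] + resistance[b[0]][b[1]])

    # Obstacle nodes are left out, i.e. disconnected from the network.
    voltage = {(r, c): 0.0 for r in range(rows) for c in range(cols)
               if math.isfinite(resistance[r][c])}
    voltage.update(goals)
    fixed = set(goals) | {start}

    # Gauss-Seidel relaxation of Kirchhoff's current law at every free node.
    for _ in range(iterations):
        for node in voltage:
            if node in fixed:
                continue
            nbrs = list(neighbours(node))
            if nbrs:
                total = sum(conductance(node, n) for n in nbrs)
                voltage[node] = sum(conductance(node, n) * voltage[n] for n in nbrs) / total

    # Follow the branch with the maximum current at every node.
    path = [start]
    while path[-1] not in goals:
        node = path[-1]

        def current(n):
            return conductance(node, n) * (voltage[n] - voltage[node])

        candidates = [n for n in neighbours(node) if n not in path]
        if not candidates:
            return None
        best = max(candidates, key=current)
        if current(best) <= 0.0:
            return None
        path.append(best)
    return path


# Usage: two routes of equal length around a central obstacle; the upper
# one passes through a possibly occupied node (resistance 10).
grid = [[1.0, 10.0, 1.0],
        [1.0, math.inf, 1.0],
        [1.0, 1.0, 1.0]]
route = plan_path(grid, (0, 0), {(2, 2): 1.0})
```

In this example the possibly occupied node sits at a higher potential than its free counterpart on the other route, yet it carries less current, so the path avoids it. This is why the sketch follows branch current rather than the voltage gradient alone.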