CONVERSATIONS AND VIRTUAL REALITY

by

Swapna Reddy Gouravaram

A report submitted in partial fulfillment

of the requirements for the degree

of

MASTER OF SCIENCE

in

Computer Science

Approved:

Dr. Vicki Allan, Major Professor

Dr. Steve Allan, Committee Member

Dr. Xiaojun Qi, Committee Member

UTAH STATE UNIVERSITY

Logan, Utah

2004

ACKNOWLEDGMENTS

I would first like to thank my advisor, Professor Vicki Allan, for all her wisdom and guidance throughout my years as a master’s student. I am extremely thankful to her for having helped me in all aspects throughout my education in the Computer Science Department. She has been a source of knowledge and inspiration. Professor Vicki has been very patient in helping me understand the new trends and concepts in computer science. I am greatly indebted to her for all the help she has rendered.

I am also grateful to the members of my committee, Dr. Steve Allan and Dr. Xiaojun Qi, for taking the time to provide advice and direction on my report. Finally, I would like to thank all my friends for their support.

Swapna Reddy Gouravaram

TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF FIGURES

LIST OF TABLES

ABSTRACT

CHAPTER

1 INTRODUCTION

1.1 Agent Conversations

1.2 Virtual Reality

2 MODELING CONVERSATIONS USING STOCHASTIC CONTEXT-FREE GRAMMARS

2.1 Introduction

2.2 Related work

2.3 Stochastic context-free grammars

2.4 MATES (Martial Agent Trait-Based Emotion System)

2.5 Conversations

2.5.1 IPIP-NEO Personality Survey

2.5.2 Rejection language mapping

2.5.3 Computing probabilities

2.5.3.1 Determining the Probability of Rejection Based on the “Anger” personality attribute

2.5.3.2 Determining the Probability of Counter-proposal Based on the “Sympathy” personality attribute

2.6 An Example: Movie Plan

2.7 Conclusions and Future work

3 EVALUATION OF EXISTING VIRTUAL REALITY TOOLS

3.1 Introduction

3.2 Requirements

3.3 Comparison of virtual reality systems

3.3.1 NetICE

3.3.2 Dive

3.3.3 Alpha wolves

3.3.4 Jack

3.3.5 Active Worlds

3.4 Discussion

3.5 Conclusions and Future work

4 RECOMMENDATIONS AND CONCLUSIONS

REFERENCES

LIST OF FIGURES

Figure

1.1 Conversations using Finite State machines

1.2 Conversations using Dooley graphs

1.3 Conversations using Petri nets before a token is fired

1.4 Conversations using Petri nets after a token is fired

2.1 Mapping between input (personality and situation data) and output

2.2 Stochastic finite state machine for rejection language

2.3 Determining rejection probability based on anger

2.4 Determining counter-proposal probability based on sympathy

3.1 NetICE virtual environment

3.2 Dive virtual environment

3.3 Wolves forming social relationships

3.4 Jack changing tools of a machine

3.5 Active Worlds virtual environment

3.6 Avatar in NetICE showing an anger face

3.7 Avatars in NetICE raising their hands

LIST OF TABLES

Table

2.1 IPIP-NEO Personality Survey report

2.2 Personality vector for Alice

2.3 Personality vector for Alice

3.1 Evaluation of existing virtual reality tools

ABSTRACT

Conversations and Virtual Reality

by

Swapna Reddy Gouravaram, Master of Science

Utah State University, 2004

Major Professor: Dr. Vicki Allan

Department: Computer Science

Many emotional and social agents exist that model various aspects of human behavior and personality. This report presents a prototype that generates responses for part of a conversation. The emphasis is on developing a framework for rejection language mapping in the domain of MATES (Martial Agent Trait–based Emotion System) using stochastic context-free grammars. The MATES agent is an intelligent agent with personality, emotions, goals, and plans. The personality of the agent is used to determine its responses during a conversation.

The report also analyses the current literature in the field of virtual reality. Based on opinions of key researchers and our own evaluations, we identify the key issues that must be addressed for evaluating various virtual reality tools.

Virtual reality tools can be coupled with response generation to enhance believability of agents. (64 pages)

CHAPTER 1

INTRODUCTION

Agents and virtual environments are two powerful tools in the world of computing. Each has aspects that could complement the other’s strengths, particularly when used in animation. This report explores that possibility. To begin, we briefly define agents and virtual reality.

An agent is an autonomous entity that acts on behalf of the user [40] by performing actions to meet its design objectives. Weiss [39] defines an intelligent agent as “one that is capable of performing flexible actions,” where flexibility means reactivity, pro-activeness, and social ability.

A virtual environment is a computer-generated environment that the user may manipulate or move through in real time. An intelligent virtual environment consists of intelligent agents that adapt to the user’s requirements [24]. Thus, agents in intelligent virtual environments guide users, perform actions, and present information to the users.

Virtual reality establishes multi-modal interactions among agents and humans [30, 31]. In order to achieve these interactions, agent design has to include the ability to communicate at a satisfactory level by including personality, emotions, and language [24].

Agents have been used in applications such as e-commerce [23] and large-scale distributed systems like Dive [10]. Intelligent agents are used extensively in applications that involve human interactions [29]. This report deals with agent conversations and virtual reality.

1.1 Agent Conversations

Current research in believable agents deals with developing agents with personality and emotion [26]. Software agents are expected to be believable: when interacting with an agent, the user should feel that the agent is a lifelike character rather than a lifeless one. These interactions should model believable, effective human communication [22]. The requirements for believability are personality and emotion. Personality distinguishes one character from another and includes everything unique and specific about the character. Emotion is the mental state of the agent and depends on the situation [26]. Characters with different personalities show different emotions in similar situations. Research in believable agents is important because designers need techniques for representing emotions and personality to depict user behavior.

Extensive research has been done in modeling agent conversations. In particular, finite state machines, Dooley graphs, and Petri nets have been used for developing conversations.

A finite state machine consists of states and transitions. States represent the possible states of the agents in a given conversation. Transitions represent a change in the conversation state based on communication to and/or from the agent [13].

Figure 1.1 shows a conversation between Agent A and Agent B modeled as a finite state machine. Agent A proposes an activity to Agent B, and Agent B can either accept or reject the proposal. Finite state machines can handle sequential conversations or conversations that occur in parallel. They represent each agent’s communication parameters in the form of transition rules.

Figure 1.1: Conversations using Finite State machines.
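The propose/accept/reject conversation described above can be sketched as a table of transition rules. The following is a minimal illustration only: the state names, the message labels, and the function run_conversation are hypothetical and are not taken from the figure.

```python
# Hypothetical state and message names; the text only specifies that
# Agent A proposes and Agent B either accepts or rejects.
TRANSITIONS = {
    ("start", "A:propose"): "proposed",
    ("proposed", "B:accept"): "accepted",
    ("proposed", "B:reject"): "rejected",
}


def run_conversation(messages, state="start"):
    """Apply the transition rules to a message sequence and return the last state."""
    for message in messages:
        key = (state, message)
        if key not in TRANSITIONS:
            # The message is not allowed in the current conversation state.
            raise ValueError(f"message {message!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


final_state = run_conversation(["A:propose", "B:reject"])
```

Each key pairs a conversation state with a message, so a message that has no rule in the current state is rejected as a protocol violation.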

A Dooley graph is represented by a 4-tuple <E, P, M, A>. E is a set of counting numbers indexing the chronologically ordered utterances in the conversation. P is the set of participants in the conversation. Two sets of ordered pairs are defined over P and E: the sender set S = {<p1, k> : participant p1 sends utterance k} and the addressee set R = {<p2, k> : participant p2 receives utterance k}. M is a relation between S and R indicating the sender and receiver of an utterance. A is the set of ordered triples of the form <p1, p2, k> defined over S and R; each triple in A becomes an arc in the graph [27]. Figure 1.2 represents the conversation of Figure 1.1 in terms of a Dooley graph.

Figure 1.2: Conversations using Dooley graphs

In Figure 1.2, utterances connect the participants. The decision made by B spawns a new component {B1, B2} in order to represent the complete conversation. A finite state machine model clarifies the various states through which a conversation may move, but it obscures the identity of the participants. Dooley graphs appear similar to finite state machines, but instead of representing possibilities, a Dooley graph represents a complete conversation. Each node represents a participant and state information, so a participant can have more than one state. Each state represents a part of the conversation, and the collection of states associated with a participant represents the role of that participant. Utterances link senders and receivers. Dooley graphs are said to be useful for capturing complex agent interactions and expressing them in an understandable manner, but they are not used for generating a conversation or choosing a possible conversation path.
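As a rough illustration of the 4-tuple definition of a Dooley graph given earlier, the sets can be derived from a recorded list of utterances. This sketch makes several assumptions: the two-utterance transcript is hypothetical, and the splitting of a participant into several states (such as B1 and B2) is omitted.

```python
# Hypothetical transcript: (utterance index k, sender, addressee, speech act).
utterances = [
    (1, "A", "B", "propose"),
    (2, "B", "A", "reject"),
]

# E: indices of the chronologically ordered utterances.
E = [k for k, _, _, _ in utterances]
# P: every participant that sends or receives an utterance.
P = {p for _, sender, addressee, _ in utterances for p in (sender, addressee)}
# S and R: sender and addressee pairs <p, k>.
S = {(sender, k) for k, sender, _, _ in utterances}
R = {(addressee, k) for k, _, addressee, _ in utterances}
# M: relates the sender pair and the addressee pair of the same utterance.
M = {((sender, k), (addressee, k)) for k, sender, addressee, _ in utterances}
# A: one arc <p1, p2, k> per utterance.
A = {(sender, addressee, k) for k, sender, addressee, _ in utterances}
```

Because every set is computed from a finished transcript, the sketch reflects the point made above: a Dooley graph records a complete conversation rather than the possible paths a conversation could take.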

A Petri net is represented by a 5-tuple <P, T, I, O, M0>, where P is the set of places and T is the set of transitions. I and O are the input and output functions, which map places to transitions and transitions to places, respectively. M0 is the marking vector that characterizes the initial state of the system by indicating the number of tokens at each place. Every transition has a predecessor place and a successor place. A transition is enabled when every predecessor place connected to it holds at least one token, that is, when all of its preconditions are fulfilled. An enabled transition fires by removing one token from each predecessor place and depositing one token at each successor place. Petri nets have been used for modeling conversations in complex, distributed, and concurrent systems [6]. The main reasons for their large-scale use are their graphical representation and well-defined semantics.
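The enabling and firing rule can be sketched in a few lines. This is a minimal illustration that assumes every arc has cardinality one; the place names and the dictionary layout of a transition are hypothetical.

```python
def is_enabled(marking, transition):
    """A transition is enabled when every predecessor place holds a token."""
    return all(marking.get(place, 0) >= 1 for place in transition["inputs"])


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    if not is_enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place in transition["inputs"]:
        new_marking[place] -= 1  # remove one token from each predecessor place
    for place in transition["outputs"]:
        new_marking[place] = new_marking.get(place, 0) + 1  # deposit one token
    return new_marking


# Hypothetical places for the step in which Agent A sends a proposal to Agent B.
propose = {"inputs": ["A_ready"], "outputs": ["B_has_proposal"]}
initial_marking = {"A_ready": 1, "B_has_proposal": 0}
next_marking = fire(initial_marking, propose)
```

The marking is copied before it is updated, so the state before firing and the state after firing can be compared side by side.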

Figures 1.3 and 1.4 represent the conversation in Figure 1.1 in terms of a Petri net. They also show how a transition moves tokens among places by removing and adding tokens.

Figure 1.3: Conversations using Petri nets before a token is fired

Figure 1.4: Conversations using Petri nets after a token is fired

The number of tokens removed or added depends on the cardinality of the arc. In the example shown in Figures 1.3 and 1.4, the cardinality is one. Petri nets provide support for complex, concurrent, distributed, and/or stochastic conversations. The tokens in a Petri net represent the dynamic components of the system. The widespread use of Petri nets is due to their useful graphical representation (used to model flow charts, block diagrams, and networks) and their mathematical formalism (needed for state equations and algebraic equations), which together represent the behavior of the system. Because the flow is driven by the triggering of events, it is easier to see the flow of a conversation with Petri nets than with Dooley graphs.

There are several ways to incorporate timing into a Petri net model. One way is to associate a firing delay with each transition. This delay specifies how long the transition must remain enabled before it can actually fire. If the delay is a random variable drawn from a probability distribution, the model is a stochastic Petri net. While Petri nets can be used for the stochastic generation of conversations, this approach seems more complicated than using stochastic grammars.

Stochastic context-free grammars extend context-free grammars by adding a probability to each production. They can be used to generate and/or recognize sentences, and they are easier to implement than Petri nets. We use the grammars to generate sentences of the language. Others use stochastic grammars to recognize sentences of the language by assigning a parse tree to each sentence.
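Generating a sentence from a stochastic context-free grammar can be sketched as a recursive expansion in which each production is chosen according to its probability. The grammar, its probabilities, and the wording of the responses below are hypothetical placeholders, not the grammar developed in this report.

```python
import random

# Hypothetical grammar: each nonterminal maps to (probability, right-hand side)
# pairs, and the probabilities for each nonterminal sum to one.
GRAMMAR = {
    "Response": [(0.6, ["Accept"]), (0.4, ["Reject"])],
    "Accept": [(1.0, ["sure, let's go"])],
    "Reject": [(0.7, ["no"]), (0.3, ["no,", "CounterProposal"])],
    "CounterProposal": [(1.0, ["how about tomorrow?"])],
}


def generate(symbol, rng):
    """Expand a symbol into a list of terminal strings."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit it unchanged
    productions = GRAMMAR[symbol]
    weights = [probability for probability, _ in productions]
    bodies = [body for _, body in productions]
    # Pick one production at random, weighted by its probability.
    body = rng.choices(bodies, weights=weights, k=1)[0]
    words = []
    for part in body:
        words.extend(generate(part, rng))
    return words


rng = random.Random(0)  # seeded only so that runs are repeatable
sentence = " ".join(generate("Response", rng))
```

Changing the probabilities attached to the Reject productions changes how often a rejection or a counter-proposal is generated, without changing the set of sentences the grammar can produce.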

Linking personality with emotions can affect decision-making. Gratch [12] gives an example of a virtual environment to illustrate how emotions can be used to affect behavior. The virtual environment consists of a platoon leader with several autonomous agents, such as other platoon leaders, civilians, and platoon members. The emotional traits of the agents are set based on situation data, and responses are generated based on the agent’s beliefs, plans and emotional values.

Kshirsagar [19] combines personality and emotions to generate responses for synthesized virtual humans. Kshirsagar defines interactions by means of finite state machines. The PEN model determines the personality traits. The PEN model comprises three personality dimensions: extraversion, neuroticism, and psychoticism.

Our MATES system (Martial Agent Trait-based Emotion system) consists of two agents, Bob and Alice, representing a couple considering marriage. These agents converse with each other based on current plans, goals, history, personality, and emotions. A conversational path is a collection of specifications that guide particular choices during a conversation. Specifications are a collection of internal information structures, such as with whom to communicate, when to communicate, and what to communicate. A variety of conversational paths are possible based on personality, emotions, plans, goals and history. For example, angry people tend to give more rejections. The selection process is determined by using stochastic context-free grammars.

We have attempted to generate multiple responses between two agents through the use of stochastic context-free grammars. A MATES agent has a personality that is established from the user’s responses to a series of questionnaires. Personality values are used to determine the responses generated by the agent. We have chosen the five-factor personality model for our system, as it is compatible with existing psychological theories.

1.2 Virtual Reality Systems

Virtual Reality (VR) is a computer-generated virtual environment that the user may move through and manipulate in real time. Virtual reality is accomplished via a combination of computer techniques and interface devices that present the user with the illusion of being in a three-dimensional world [37]. The main goal of virtual reality implementations is to provide the user with an interface that is easy to use, interesting, and intuitive.