U.S. patent application number 13/079766 was published by the patent office on 2011-07-21 for characterizing and predicting agents via multi-agent evolution. Invention is credited to Robert J. Bisson, Steven M. Brophy, Sven Brueckner, Robert S. Matthews, H. Van Dyke Parunak, John A. Sauter.

Application Number: 20110178978 (13/079766)
Family ID: 38233879
Publication Date: 2011-07-21

United States Patent Application 20110178978
Kind Code: A1
Parunak; H. Van Dyke; et al.
July 21, 2011
CHARACTERIZING AND PREDICTING AGENTS VIA MULTI-AGENT EVOLUTION
Abstract
A method of predicting the behavior of software agents in a
simulated environment involving modeling a plurality of software
agents representing entities to be analyzed, which may be human
beings. Using a set of parameters that governs the behavior of the
agents, the internal state of at least one of the agents is
estimated from its behavior in the simulation, including its movement
within the environment. This facilitates a prediction of the likely
future behavior of the agent based solely upon its internal state;
that is, without recourse to any intentional agent communications.
In one embodiment, the simulated environment is based upon a
digital pheromone infrastructure. The simulation integrates
knowledge of threat regions, a cognitive analysis of the agent's
beliefs, desires, and intentions, a model of the agent's emotional
disposition and state, and the dynamics of interactions with the
environment.
Inventors: Parunak; H. Van Dyke (Ann Arbor, MI); Brueckner; Sven (Dexter, MI); Matthews; Robert S. (Saline, MI); Sauter; John A. (Ann Arbor, MI); Brophy; Steven M. (Saline, MI); Bisson; Robert J. (Dexter, MI)
Family ID: 38233879
Appl. No.: 13/079766
Filed: April 4, 2011
Related U.S. Patent Documents

Application Number | Filing Date | Patent Number
11548909 | Oct 12, 2006 | 7921066
13079766 | |
60725854 | Oct 12, 2005 |
Current U.S. Class: 706/52
Current CPC Class: G06N 3/006 20130101
Class at Publication: 706/52
International Class: G06N 5/04 20060101 G06N005/04
Government Interests
[0002] This application is based in part upon work supported by the
Defense Advanced Research Projects Agency (DARPA) under Contract
No. NBCHC040153. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the
inventors and do not necessarily reflect the views of DARPA or
the Department of Interior-National Business Center (DOI-NBC).
Distribution Statement "A" (Approved for Public Release,
Distribution Unlimited).
Claims
1. A method of predicting the behavior of an agent in an
environment, comprising the steps of: executing a computer
simulation of an environment including a plurality of software
agents; estimating the internal state of at least one of the agents
based upon its behavior in the simulation, including its movement
within the environment; and predicting the likely future behavior
of the agent based upon the estimate of its internal state.
2. The method of claim 1, wherein the agent's internal state is
estimated by examining changes in the agent's observed
behavior.
3. The method of claim 1, wherein the agent's internal state is
estimated in conjunction with a model of the environment.
4. The method of claim 1, wherein the prediction of the agent's
future behavior is based in part on the agent's interaction with
the environment.
5. The method of claim 1, wherein the agents represent human
beings.
6. The method of claim 1, wherein the simulated environment
comprises digital pheromones.
7. The method of claim 6, wherein the digital pheromones are scalar
variables that agents can sense and which they deposit at their
current location in the environment.
8. The method of claim 7, wherein the agents respond to the local
concentrations of the digital pheromones tropistically through
climbing or descending local gradients.
9. The method of claim 6, wherein the pheromones run on the nodes
of a graph-structured environment.
10. The method of claim 9, wherein the graph-structured environment
is a rectangular lattice.
11. The method of claim 6, wherein each agent is capable of
aggregating pheromone deposits from individual agents, thereby
fusing information across multiple agents over time.
12. The method of claim 6, wherein each agent is capable of
evaporating pheromones over time to remove inconsistencies that
result from changes in the simulation.
13. The method of claim 6, wherein each agent is capable of
diffusing pheromones to nearby places, thereby disseminating
information for access by nearby agents.
14. The method of claim 6, wherein the movements of the agents
change their deposit patterns.
15. The method of claim 6, wherein the simulation integrates
knowledge of threat regions, a cognitive analysis of the agent's
beliefs, desires, and intentions, a model of the agent's emotional
disposition and state, and the dynamics of interactions with the
environment.
16. The method of claim 1, wherein the simulation involves urban
warfare.
17. The method of claim 1, wherein the simulation involves a
computer game.
18. The method of claim 1, wherein the simulation involves a
business strategy.
19. The method of claim 1, wherein the simulation involves sensor
fusion.
20. A system for predicting the behavior of an agent in an
environment, comprising: a processor or microprocessor coupled to a
memory, wherein the processor or microprocessor is programmed to
predict the agent's behavior by: executing a computer simulation of an
environment including a plurality of software agents; estimating
the internal state of at least one of the agents based upon its
behavior in the simulation, including its movement within the
environment; and predicting the likely future behavior of the agent
based upon the estimate of its internal state.
Description
[0001] This application is a continuation of and claims priority to
U.S. application Ser. No. 11/548,909, which claims the benefit of and
priority to U.S. Provisional Application No. 60/725,854, filed Oct.
12, 2005, and is entitled to that filing date for priority. The
specification, figures and complete disclosures of U.S. Provisional
Application No. 60/725,854 and application Ser. No. 11/548,909 are
incorporated herein by specific reference for all purposes.
FIELD OF INVENTION
[0003] This invention relates generally to agent behavior and, in
particular, to a system and method that characterizes an agent's
internal state by evolution against observed behavior, and predicts
future behavior, taking into account the dynamics of agent
interaction with their environment.
BACKGROUND OF THE INVENTION
[0004] Reasoning about agents that we observe in the world must
integrate two disparate levels. Our observations are often limited
to the agent's external behavior, which can frequently be
summarized numerically as a trajectory in space-time (perhaps
punctuated by actions from a fairly limited vocabulary). However,
this behavior is driven by the agent's internal state, which (in
the case of a human) may involve high-level psychological and
cognitive concepts such as intentions and emotions. A central
challenge in many application domains is reasoning from external
observations of agent behavior to an estimate of their internal
state. Such reasoning is motivated by a desire to predict the
agent's behavior. Work to date focuses almost entirely on
recognizing the rational state (as opposed to the emotional state)
of a single agent (as opposed to an interacting community), and
frequently takes advantage of explicit communications between
agents (as in managing conversational protocols).
[0005] It is increasingly common in agent theory to describe the
cognitive state of an agent in terms of its beliefs, desires, and
intentions (the so-called "BDI" model [4, 15]). An agent's beliefs
are propositions about the state of the world that it considers
true, based on its perceptions. Its desires are propositions about
the world that it would like to be true. Desires are not
necessarily consistent with one another: an agent might desire both
to be rich and not to work at the same time. An agent's intentions,
or goals, are a subset of its desires that it has selected, based
on its beliefs, to guide its future actions. Unlike desires, goals
must be consistent with one another (or at least believed to be
consistent by the agent).
[0006] An agent's goals guide its actions. Thus one ought to be
able to learn something about an agent's goals by observing its
past actions, and knowledge of the agent's goals in turn enables
conclusions about what the agent may do in the future.
[0007] There is a considerable body of work in the AI and
multi-agent community on reasoning from an agent's actions to the
goals that motivate them. This process is known as "plan
recognition" or "plan inference." A recent survey is available in
[2]. This body of work is rich and varied. It covers both
single-agent and multi-agent (e.g., robot soccer team) plans,
intentional vs. non-intentional actions, speech vs. non-speech
behavior, adversarial vs. cooperative intent, complete vs.
incomplete world knowledge, and correct vs. faulty plans, among
other dimensions.
[0008] Plan recognition is seldom pursued for its own sake. It
usually supports a higher-level function. For example, in
human-computer interfaces, recognizing a user's plan can enable the
system to provide more appropriate information and options for user
action. In a tutoring system, inferring the student's plan is a
first step to identifying buggy plans and providing appropriate
remediation. In many cases, the higher-level function is predicting
likely future actions by the entity whose plan is being
inferred.
[0009] Many realistic problems deviate from these conditions:
[0010] Increasing the number of agents leads to a combinatorial explosion of possibilities that can swamp conventional analysis.
[0011] The dynamics of the environment can frustrate the intentions of an agent.
[0012] The agents often are trying to hide their intentions (and even their presence), rather than intentionally sharing information.
[0013] An agent's emotional state may be at least as important as its rational state in determining its behavior.
[0014] Domains that exhibit these constraints can often be
characterized as adversarial, and include military combat,
competitive business tactics, and multi-player computer games.
SUMMARY OF INVENTION
[0015] In various embodiments, the present invention comprises a
method of predicting the behavior of software agents in a simulated
environment. The method involves modeling a plurality of software
agents representing entities to be analyzed, which may be human
beings. Using a set of parameters that governs the behavior of the
agents, the internal state of at least one of the agents is
estimated from its behavior in the simulation, including its movement
within the environment. This facilitates a prediction of the likely
future behavior of the agent based solely upon its internal state;
that is, without recourse to any intentional agent
communications.
[0016] In one embodiment, the simulated environment is based upon a
digital pheromone infrastructure. The digital pheromones are scalar
variables that agents can sense and which they deposit at their
current location in the environment. The agents respond to the
local concentrations of the digital pheromones tropistically
through climbing or descending local gradients. The pheromone
infrastructure runs on the nodes of a graph-structured environment,
preferably a rectangular lattice. Each agent is capable of
aggregating pheromone deposits from individual agents, thereby
fusing information across multiple agents over time. Each agent is
further capable of evaporating pheromones over time to remove
inconsistencies that result from changes in the simulation, and
diffusing pheromones to nearby places, thereby disseminating
information for access by nearby agents.
[0017] By reasoning from an entity's observed behavior, this
invention is capable of providing an estimate of the entity's
internal state, and extrapolating that estimate into a prediction
of the entity's likely future behavior. The system and method,
called BEE (Behavioral Evolution and Extrapolation), performs these
and other tasks using a faster-than-real-time simulation of
lightweight swarming agents, coordinated through digital
pheromones. This simulation integrates knowledge of threat regions,
a cognitive analysis of the agent's beliefs, desires, and
intentions, a model of the agent's emotional disposition and state,
and the dynamics of interactions with the environment. By evolving
agents in this rich environment, their internal state can be fitted
to their observed behavior. In realistic wargame scenarios, the
system successfully detects deliberately played emotions and makes
reasonable predictions about the entities' future behavior.
DESCRIPTION OF THE DRAWINGS
[0018] FIG. 1 is a graphic model of a tracking nonlinear dynamical
system wherein a=system state space; b=system trajectory over time;
c=recent measurements of system state; and d=short-range
prediction.
[0019] FIG. 2 is a diagram of a Behavioral Evolution and
Extrapolation (BEE) Integrated Rational and Emotive Personality
Model.
[0020] FIG. 3 is a graphical representation of an exemplary
embodiment of the BEE model, wherein each avatar generates a stream
of ghosts that sample the personality space of the entity it
represents. The ghosts evolve against the observed behavior of the
entity in the recent past, and the fittest ghosts then run into the
future to generate predictions.
[0021] FIG. 4 is a Delta Disposition chart for a "Chicken's Ghosts"
embodiment.
[0022] FIG. 5 is a Delta Disposition chart for a "Rambo"
embodiment.
[0023] FIG. 6 shows a table for evaluating predictions, where each
row corresponds to a successive prediction for a given unit, and
each column to a time in the real world that is covered by some set
of these predictions. The shaded cells show which predictions cover
which time periods. Each cell (a) contains the location error, that
is, the distance between the unit's actual position at the time
indicated by the column and the position that the prediction
indicated by the row said it would occupy. One can average these
errors across a single prediction (b) to estimate the prospective
accuracy of that prediction, across a single time (c) to estimate
the retrospective accuracy of all previous predictions referring to
that time, or across a given offset from the start of the
prediction (d) to estimate the horizon error, i.e., how prediction
accuracy varies with look-ahead depth.
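The three averaging schemes (b)-(d) can be sketched over a small error matrix; the error values, matrix shape, and NaN padding for times a prediction does not cover are illustrative assumptions, not data from the system:

```python
import numpy as np

# Hypothetical error matrix: rows = successive predictions for one unit,
# columns = real-world times; err[i, j] = location error (NaN where the
# i-th prediction does not cover time j). Values are illustrative only.
err = np.array([
    [2.0, 3.0, 5.0, np.nan, np.nan],
    [np.nan, 1.0, 2.0, 4.0, np.nan],
    [np.nan, np.nan, 1.5, 2.5, 6.0],
])

# (b) Prospective accuracy: average the errors across a single prediction.
prospective = np.nanmean(err, axis=1)

# (c) Retrospective accuracy: average across a single time, over all
# predictions that cover it.
retrospective = np.nanmean(err, axis=0)

# (d) Horizon error: average at a fixed offset from the start of each
# prediction, showing how accuracy varies with look-ahead depth.
def horizon_error(e):
    starts = [np.flatnonzero(~np.isnan(row))[0] for row in e]
    depth = max(np.count_nonzero(~np.isnan(row)) for row in e)
    out = []
    for k in range(depth):
        vals = [row[s + k] for row, s in zip(e, starts)
                if s + k < e.shape[1] and not np.isnan(row[s + k])]
        out.append(float(np.mean(vals)))
    return out

horizon = horizon_error(err)
```

Since these are error curves, lower values are better in all three views.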
[0024] FIG. 7 shows a graphic representation of path
characteristics: angle .theta., straight-line radius .rho., and
actual length .lamda..
[0025] FIG. 8 shows graphs for exemplary stepwise metrics,
including, from left to right, average prospective, retrospective,
and horizon error. The thin line is the average of metrics from 100
random walks. The vertical line indicates when the unit dies. Since
these are error curves, lower is better.
[0026] FIG. 9 shows graphs for exemplary component metrics. The
thin line is the random baseline. Since these metrics indicate
degree of agreement between prediction and baseline, higher is
better.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0027] In one exemplary embodiment, the present system provides a
Behavioral Evolution and Extrapolation (BEE) method and approach to
addressing the recognition of the rational and emotional state of
multiple interacting agents based solely on their behavior, without
recourse to intentional communications from them. It is inspired by
techniques used to predict the behavior of nonlinear dynamical
systems, in which a representation of the system is continually fit
to its recent past behavior. In such analysis of nonlinear
dynamical systems, the representation takes the form of a closed
form mathematical equation. In BEE, it takes the form of a set of
parameters governing the behavior of software agents representing
the individuals being analyzed.
[0028] In contrast to previous research in AI (plan recognition)
and nonlinear dynamical systems (trajectory prediction), embodiments
of the present invention focus on plan recognition in support of
prediction. An agent's plan is a necessary input to a prediction of
its future behavior, but hardly a sufficient one. At least two
other influences, one internal and one external, need to be taken
into account.
[0029] The external influence is the dynamics of the environment,
which may include other agents. The dynamics of the real world
impose significant constraints. The environment is autonomous (it
may do things on its own that interfere with the desires of the
agent) [3, 8]. Most interactions among agents, and between agents
and the world, are nonlinear. When iterated, these can generate
rapid divergence of trajectories ("chaos," sensitivity to initial
conditions).
[0030] A rational analysis of an agent's goals may enable one to
predict what it will attempt, but any nontrivial plan with several
steps will depend sensitively at each step on the reaction of the
environment, and predictions must take this into account as well.
Actual simulation of futures is one way to deal with these effects.
[0031] In the case of human agents, an internal influence also
comes into play. The agent's emotional state can modulate its
decision process and its focus of attention (and thus its
perception of the environment). In extreme cases, emotion can lead
an agent to choose actions that from the standpoint of a logical
analysis may appear irrational.
[0032] Current work on plan recognition for prediction focuses on
the rational plan, and does not take into account either external
environmental influences or internal emotional biases. BEE
integrates all three elements into its predictions.
Real-Time Fitting in Nonlinear Systems Analysis
[0033] Many systems of interest can be described in terms of a
vector of real numbers that changes as a function of time. The
dimensions of the vector define the system's state space.
Notionally, one typically analyzes such systems as vector
differential equations, e.g., dx/dt=f(x).
[0034] When f is nonlinear, the system can be formally chaotic, and
starting points arbitrarily close to one another can lead to
trajectories that diverge exponentially rapidly, becoming
uncorrelated. Long-range prediction of the behavior of such a
system is impossible in principle. However, it is often useful to
anticipate the system's behavior a short distance into the future.
To do so, a common technique is to fit a convenient functional form
for f to the system's trajectory in the recent past, and then
extrapolate this fit into the future, as seen in FIG. 1. [6] This
process is repeated constantly, in real time, providing the user
with a limited look-ahead into the system's future.
[0035] While this approach is robust and widely applied, it
requires systems that can efficiently be described in terms of
mathematical equations that can be fit using optimization methods
such as least squares. BEE applies this approach to agent
behaviors, which it fits to observed behavior using a genetic
algorithm.
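As a concrete sketch of the fit-and-extrapolate loop described above (the polynomial form, trajectory, and look-ahead distance are illustrative assumptions, not the patent's):

```python
import numpy as np

def fit_and_extrapolate(t, x, look_ahead, degree=2):
    """Fit a convenient functional form (here a polynomial) to the
    recent trajectory by least squares, then extrapolate ahead."""
    coeffs = np.polyfit(t, x, degree)             # least-squares fit
    return np.polyval(coeffs, t[-1] + look_ahead)

# Recent measurements of the system state (cf. FIG. 1, items c and d);
# a noise-free quadratic trajectory keeps the example transparent.
t = np.linspace(0.0, 1.0, 20)
x = t**2 + 0.5 * t

pred = fit_and_extrapolate(t, x, look_ahead=0.1)
```

In BEE this closed-form fit is replaced by evolving agent parameters against the observed track, but the repeated fit-then-extrapolate loop is the same.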
Architecture
[0036] BEE predicts the future by observing the emergent behavior
of agents representing the entities of interest in a fine-grained
agent simulation. Key elements of the BEE architecture include the
model of an individual agent, the pheromone infrastructure through
which agents interact, the information sources that guide them, and
the overall evolutionary cycle that they execute.
Agent Model
[0037] The agents in BEE are inspired by two bodies of work. The
first is our own previous work on fine-grained agents that
coordinate their actions stigmergically, through digital pheromones
in a shared environment [1, 11, 13, 14, 16]. The second inspiration
is the success of previous agent-based combat modeling in EINSTein
and MANA.
[0038] Digital pheromones are scalar variables that agents deposit
at their current location in the environment, and that they can
sense. Agents respond to the local concentrations of these
variables tropistically, typically climbing or descending local
gradients. Their movements in turn change the deposit patterns.
This feedback loop, together with processes of evaporation and
propagation in the environment, can support complex patterns of
interaction and coordination among the agents [12]. Table 1 shows
the pheromone flavors currently used in the BEE. In addition,
ghosts take into account their distance from distinguished static
locations, a mechanism that we call "virtual pheromones," since it
has the same effect as propagating a pheromone field from such a
location, but with lower computational costs.
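A minimal sketch of these pheromone dynamics on a rectangular lattice; the evaporation and diffusion rates, grid size, and deposit amount are illustrative assumptions:

```python
# Illustrative constants, not the patent's values.
EVAPORATION = 0.1   # fraction of each level lost per step
DIFFUSION = 0.2     # fraction spread to the four neighbors per step

def step(field):
    """One evaporation + propagation step over a 2-D pheromone field."""
    rows, cols = len(field), len(field[0])
    nxt = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            level = field[r][c] * (1.0 - EVAPORATION)   # evaporate
            nxt[r][c] += level * (1.0 - DIFFUSION)      # retained locally
            share = level * DIFFUSION / 4.0
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nxt[rr][cc] += share                # propagate
    return nxt

def climb_gradient(field, r, c):
    """Tropistic response: move to the neighboring cell with the
    highest local pheromone concentration."""
    best = (r, c)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(field) and 0 <= cc < len(field[0]):
            if field[rr][cc] > field[best[0]][best[1]]:
                best = (rr, cc)
    return best

# An agent deposits at its location; others sense and climb the gradient.
field = [[0.0] * 5 for _ in range(5)]
field[2][2] += 1.0            # deposit
field = step(field)           # evaporate + propagate
```

Gradient descent (for repulsive flavors) is the same scheme with the comparison reversed.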
TABLE-US-00001 TABLE 1: PHEROMONE FLAVORS IN RAID

Flavor | Source
RedAlive, RedCasualty, BlueAlive, BlueCasualty, GreenAlive, GreenCasualty | Emitted by a living or dead entity of the appropriate group (Red = enemy, Blue = friendly, Green = neutral)
WeaponsFire | Emitted by a firing weapon
KeySite | Emitted by a site of particular importance to Red
Cover | Emitted by locations that afford cover from fire
Mobility | Emitted by roads and other structures that enhance agent mobility
RedThreat, BlueThreat | Determined by external process
[0039] The use of agents to model combat is inspired by EINSTein
and MANA. EINSTein [5] represents an agent as a set of six weights,
each in [-1, 1], describing the agent's response to six kinds of
information. Four of these describe the number of alive friendly,
alive enemy, injured friendly, and injured enemy troops within the
agent's sensor range. The other two weights relate to the model's
use of a childhood game, "capture the flag," as a prototype of
combat. Each team has a flag, and seeks to protect it from the
other team while capturing the other team's flag. The fifth and
sixth weights describe how far the agent is from its own and its
adversary's flag. A positive weight indicates that the agent is
attracted to the entity described by the weight, while a negative
weight indicates that it is repelled.
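The weighted attraction/repulsion scheme can be sketched as a scored move choice; the weight values, feature tuples, and cell names below are hypothetical, and the real EINSTein model is considerably richer:

```python
def score(weights, features):
    """Weighted sum over the six information components:
    (alive_friendly, alive_enemy, injured_friendly, injured_enemy,
    dist_own_flag, dist_enemy_flag) as seen from a candidate cell."""
    return sum(w * f for w, f in zip(weights, features))

def choose_move(weights, candidates):
    """Pick the candidate cell whose features score highest."""
    return max(candidates, key=lambda cell: score(weights, candidates[cell]))

# A hypothetical fearful agent: repelled by alive enemies, drawn toward
# its own flag (a negative weight on distance-to-own-flag rewards
# positions closer to it). Weights lie in [-1, 1].
weights = (0.5, -1.0, 0.3, 0.0, -0.8, 0.0)
candidates = {
    "north": (2, 3, 0, 1, 5.0, 9.0),
    "south": (3, 0, 1, 0, 2.0, 11.0),
}
move = choose_move(weights, candidates)
```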
[0040] MANA [7] extends the concepts in EINSTein. Friendly and
enemy flags are replaced by the waypoints being pursued by each
side. MANA includes four additional components: low, medium, and
high threat enemies. In addition, it defines a set of triggers
(e.g., reaching a waypoint, being shot at, making contact with the
enemy, being injured) that shift the agent from one personality
vector to another. A default state defines the personality vector
when no trigger state is active.
[0041] The personality vectors in MANA and EINSTein reflect both
rational and emotive aspects of decision-making. The notion of
being attracted or repelled by friendly or adversarial forces in
various states of health is an important component of what we
informally think of as emotion (e.g., fear, compassion,
aggression), and the use of the term "personality" in both EINSTein
and MANA suggests that the system designers are thinking
anthropomorphically, though they do not use "emotion" to describe
the effect they are trying to achieve. The notion of waypoints to
which an agent is attracted reflects goal-oriented rationality.
[0042] BEE embodies an integrated rational-emotive personality
model. In one embodiment, a BEE agent's rationality is modeled as a
vector of seven desires, which are values in [-1, +1]: ProtectRed
(the adversary), ProtectBlue (friendly forces), ProtectGreen
(civilians), ProtectKeySites, AvoidCombat, AvoidDetection, and
Survive. Negative values reverse the sense suggested by the label.
For example, a negative value of ProtectRed indicates a desire to
harm Red.
[0043] Table 2 shows which pheromones A(ttract) or R(epel) an agent
with a given desire, and how that tendency translates into
action.
[0044] The emotive component of a BEE's personality is based on the
Ortony-Clore-Collins (OCC) framework [9], and described in detail
elsewhere [10]. OCC define emotions as "valenced reactions to
agents, states, or events in the environment." This notion of
reaction is captured in MANA's trigger states. An important advance
in BEE's emotional model with respect to MANA and EINSTein is the
recognition that agents may differ in how sensitive they are to
triggers. For example, threatening situations tend to stimulate the
emotion of fear, but a given level of threat will produce more fear
in a new recruit than in a seasoned combat veteran. Thus, the
present model includes not only Emotions, but Dispositions. Each
Emotion has a corresponding Disposition. Dispositions are
relatively stable, and considered constant over the time horizon of
a run of the BEE, while Emotions vary based on the agent's
disposition and the stimuli to which it is exposed.
[0045] Based on interviews with military domain experts, we
identified the two most crucial emotions for combat behavior as
Anger (with the corresponding disposition Irritability) and Fear
(whose disposition is Cowardice). Table 3 shows which pheromones
trigger which emotions. Emotions are modeled as agent hormones
(internal pheromones) that are augmented in the presence of the
triggering environmental condition and evaporate over time.
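The hormone-style dynamics just described (augmentation by a triggering stimulus, scaled by the disposition, with evaporation over time) might be sketched as follows; the rate constant and disposition values are illustrative assumptions:

```python
EVAPORATION = 0.2   # illustrative per-step decay rate

def update_emotion(emotion, disposition, trigger_level):
    """One time step: decay the current level, then add the
    disposition-scaled response to the triggering stimulus."""
    return emotion * (1.0 - EVAPORATION) + disposition * trigger_level

# Same sustained threat, different dispositions: a new recruit (high
# Cowardice) accumulates more Fear than a seasoned veteran.
recruit_fear = veteran_fear = 0.0
for _ in range(10):
    recruit_fear = update_emotion(recruit_fear, disposition=0.9,
                                  trigger_level=1.0)
    veteran_fear = update_emotion(veteran_fear, disposition=0.2,
                                  trigger_level=1.0)
```

With the trigger removed, both levels evaporate back toward zero, matching the "reaction" character of the OCC framework.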
TABLE-US-00002 TABLE 3: INTERACTIONS OF PHEROMONES AND DISPOSITIONS/EMOTIONS

Pheromone | X marks in the Irritability/Anger and Cowardice/Fear columns under the Red, Blue, and Green perspectives
RedAlive | X X
RedCasualty | X X
BlueAlive | X X X X
BlueCasualty | X X
GreenCasualty | X X X X
WeaponsFire | X X X X X X
KeySites | X X
[0046] The effect of a non-zero emotion is to modify actions. An
elevated level of Anger will increase movement likelihood, weapon
firing likelihood, and tendency toward an exposed posture. An
increasing level of Fear will decrease these likelihoods.
[0047] FIG. 2 summarizes one embodiment of the BEE's personality
model. The left two columns are a straightforward BDI model (where
we prefer the term "goal" to "intention"). The right-hand column is
the emotive component, where an appraisal of the agent's beliefs,
moderated by the disposition, leads to an emotion that in turn
influences the BDI analysis.
The BEE Cycle
[0048] A major innovation in BEE is an extension of the nonlinear
systems technique described herein to characterize agents based on
their past behavior and extrapolate their future behavior based on
this characterization. This section describes this process at a
high level, then discusses in more detail the multi-page pheromone
infrastructure that implements it.
Overview
[0049] FIG. 3 is an overview of one embodiment of the BEE process.
Each active entity in the battlespace has an avatar that
continuously generates a stream of ghost agents representing
itself. Ghosts live on a timeline indexed by .tau. that begins in
the past at the insertion horizon and runs into the future to the
prediction horizon. .tau. is offset with respect to the current
time t in the domain being modeled. The timeline is divided into
discrete "pages," each representing a successive value of .tau..
The avatar inserts the ghosts at the insertion horizon. In our
current system, the insertion horizon is at .tau.-t=-30, meaning
that ghosts are inserted into a page representing the state of the
world 30 minutes ago. At the insertion horizon, each ghost's
behavioral parameters (desires and dispositions) are sampled from
distributions to explore alternative personalities of the entity it
represents.
[0050] Each page between the insertion horizon and .tau.=t ("now,"
the page corresponding to the state of the world at the current
domain time) records the historical state of the world at the point
in the past to which it corresponds. As ghosts move from page to
page, they interact with this past state, based on their behavioral
parameters. These interactions mean that their fitness depends not
just on their own actions, but also on the behaviors of the rest of
the population, which is also evolving. Because .tau. advances
faster than real time, eventually .tau.=t (actual time). At this
point, each ghost is evaluated based on its location compared with
the actual location of its corresponding real-world entity.
[0051] The fittest ghosts have three functions:
[0052] 1. The personality of the fittest ghost for each entity is
reported to the rest of the system as the likely personality of the
corresponding entity. This information enables us to characterize
individual warriors as unusually cowardly or brave.
[0053] 2. The fittest ghosts are bred genetically and their
offspring are reintroduced at the insertion horizon to continue the
fitting process.
[0054] 3. The fittest ghosts for each entity form the basis for a
population of ghosts that are allowed to run past the avatar's
present into the future. Each of these ghosts explores a different
possible future of the battle, analogous to how some people plan
ahead by mentally simulating different ways that a situation might
unfold. Analysis of the behaviors of these different possible
futures yields predictions.
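A toy sketch of this evolutionary fitting loop: sample ghost personalities, score each by how well a behavior model reproduces the entity's observed track, and breed the fittest. The one-parameter `behave` model, the "true" personality value, and all numeric settings are hypothetical stand-ins for BEE's full swarming simulation:

```python
import random

random.seed(0)  # deterministic for the example

def behave(personality, steps=30):
    """Stand-in for running a ghost through the past pages: position
    drifts at a rate set by a single personality parameter."""
    return [personality * t for t in range(steps)]

def fitness(personality, observed):
    """Negative squared distance between the ghost's track and the
    observed track; higher (less negative) is fitter."""
    track = behave(personality, len(observed))
    return -sum((a - b) ** 2 for a, b in zip(track, observed))

def evolve(observed, pop_size=40, generations=25):
    pop = [random.uniform(-1, 1) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, observed), reverse=True)
        parents = pop[: pop_size // 4]                 # fittest ghosts
        children = [
            (random.choice(parents) + random.choice(parents)) / 2
            + random.gauss(0.0, 0.02)                  # crossover + mutation
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children                       # reinsert offspring
    return max(pop, key=lambda p: fitness(p, observed))

observed = behave(0.37)        # entity's actual recent track
estimate = evolve(observed)    # reported as the likely personality
```

The returned estimate plays the role of function 1 above; in BEE the same fittest individuals are also re-run forward to generate the predictions of function 3.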
[0055] A review of this process shows that BEE has three distinct
notions of time, all of which may be distinct from real-world
time.
[0056] 1. Domain time t is the current time in the domain being
modeled. This time may be the same as real-world time, if BEE is
being applied to a real-world situation. In our current
experiments, we apply BEE to a battle taking place in a simulator,
the OneSAF Test Bed (OTB), and domain time is the time stamp
published by OTB. During actual runs, OTB is often paused, so
domain time runs slower than real time. When we replay logs from
simulation runs, we can speed them up so that domain time runs
faster than real time.
[0057] 2. BEE time .tau. for a specific page records the domain
time corresponding to the state of the world represented on that
page, and is offset from the current domain time.
[0058] 3. Shift time is incremented every time the ghosts move from
one page to the next. The relation between shift time and real time
depends on the processing resources available.
Pheromone Infrastructure
[0059] BEE must operate very rapidly in order to keep pace with an
ongoing evolution of a battle or other complex situation. Thus we
use simple agents coordinated using pheromone mechanisms. We have
described the basic dynamics of our pheromone infrastructure
elsewhere [1]. This infrastructure runs on the nodes of a
graph-structured environment (in the case of BEE, a rectangular
lattice). Each node maintains a scalar value for each flavor of
pheromone, and provides three functions:
[0060] 1. It aggregates deposits from individual agents, fusing
information across multiple agents and through time.
[0061] 2. It evaporates pheromones over time. This dynamic is an
innovative alternative to traditional truth maintenance in
artificial intelligence. Traditionally, knowledge bases remember
everything they are told unless they have a reason to forget
something, and expend large amounts of computation in the
NP-complete problem of reviewing their holdings to detect
inconsistencies that result from changes in the domain being
modeled. Pheromone-based systems, like the ants that inspire them,
immediately begin to forget everything they learn unless it is
continually reinforced. Thus inconsistencies
[0062] 3. It diffuses pheromones to nearby places, disseminating
information for access by nearby agents.
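The three node functions might be sketched as a small class; the flavor name, rates, and two-node graph are illustrative assumptions:

```python
class PheromoneNode:
    EVAPORATION = 0.1   # illustrative rates
    DIFFUSION = 0.2

    def __init__(self):
        self.levels = {}       # flavor -> scalar concentration
        self.neighbors = []    # adjacent nodes in the graph

    def deposit(self, flavor, amount):
        """1. Aggregate deposits from individual agents."""
        self.levels[flavor] = self.levels.get(flavor, 0.0) + amount

    def evaporate(self):
        """2. Evaporate pheromones, so unreinforced information fades."""
        for flavor in self.levels:
            self.levels[flavor] *= 1.0 - self.EVAPORATION

    def diffuse(self):
        """3. Diffuse a fraction of each flavor to neighboring nodes."""
        for flavor, level in list(self.levels.items()):
            share = level * self.DIFFUSION / max(len(self.neighbors), 1)
            for n in self.neighbors:
                n.deposit(flavor, share)
            self.levels[flavor] = level * (1.0 - self.DIFFUSION)

# Two adjacent lattice nodes.
a, b = PheromoneNode(), PheromoneNode()
a.neighbors, b.neighbors = [b], [a]
a.deposit("RedAlive", 1.0)
a.evaporate()   # 1.0 -> 0.9
a.diffuse()     # 0.9 -> 0.72 here, 0.18 at the neighbor
```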
[0063] The distribution of each pheromone flavor over the
environment forms a scalar field that represents some aspect of the
state of the world at an instant in time. Each page of the timeline
discussed in the previous section is a complete pheromone field for
the world at the BEE time .tau. represented by that page. The
behavior of the pheromones on each page depends on whether the page
represents the past or the future.
[0064] In pages representing the future (.tau.>t), the usual
pheromone mechanisms apply. Ghosts deposit pheromone each time they
move to a new page, and pheromones evaporate and propagate from one
page to the next.
[0065] In pages representing the domain past (τ ≤ t), one
has an observed state of the real world. This has two consequences
for pheromone management. First, we can generate the pheromone
fields directly from the observed locations of individual entities,
so there is no need for the ghosts to make deposits. Second, we can
adjust the pheromone intensities based on the changed locations of
entities from page to page, so we do not need to evaporate or
propagate the pheromones. Both of these simplifications reflect the
fact that in our current system, we have complete knowledge of the
past. When we introduce noise and uncertainty, we will probably
need to introduce dynamic pheromones in the past as well as the
future.
[0066] Execution of the pheromone infrastructure proceeds on two
time scales, running in separate threads.
[0067] The first thread updates the book of pages each time the
domain time advances past the next page boundary. At each step:
[0068] 1. The former "now+1" page is replaced with a new current
page, whose pheromones correspond to the locations and strengths of
observed units;
[0069] 2. An empty page is added at the prediction horizon; and
[0070] 3. The oldest page is discarded, since it has passed the
insertion horizon.
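These three steps amount to rolling a fixed-length window of pages forward by one position. A schematic version, where the deque representation, the function name, and the index convention are ours:

```python
from collections import deque

def advance_book(book, now_index, observed_page, empty_page):
    """Roll the book of pages forward one domain-time step.

    book[0] is the insertion horizon (deepest past), book[-1] is the
    prediction horizon (furthest future), and book[now_index] is the
    page for tau = t.
    """
    book.popleft()                   # 3. discard page past the insertion horizon
    book.append(empty_page)          # 2. add an empty page at the prediction horizon
    book[now_index] = observed_page  # 1. the former "now+1" page becomes current
    return book
```

Because the window slides by one, the page that was at now_index + 1 lands at now_index, which is exactly the slot the fresh observation page overwrites.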
[0071] The second thread moves the ghosts from one page to the
next, as fast as the processor allows. At each step:
[0072] 1. Ghosts reaching the τ = t page are evaluated for
fitness and removed or evolved;
[0073] 2. New ghosts from the avatars and from the evolutionary
process are inserted at the insertion horizon;
[0074] 3. A population of ghosts based on the fittest ghosts is
inserted at τ = t to run into the future;
[0075] 4. Ghosts that have moved beyond the prediction horizon are
removed;
[0076] 5. All ghosts plan their next actions based on the pheromone
field in the pages they currently occupy;
[0077] 6. The system computes the next state of each page,
including executing the actions elected by the ghosts, and (in
future pages) evaporating pheromones and recording new deposits
from the recently arrived ghosts.
[0078] Ghost movement based on pheromone gradients is a very simple
process, so this system can support realistic agent populations
without excessive computer load. In our current system, each avatar
generates eight ghosts per shift. Since there are about 50 entities
in the battlespace (about 20 units each of Red and Blue and about 5
of Green), we must support about 400 ghosts per page, or about
24000 over the entire book.
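The gradient following behind this movement reduces each ghost's decision to a local comparison over neighboring cells. A deterministic sketch, with the field held as a dict from cell coordinates to total pheromone strength (our representation):

```python
def gradient_step(field, x, y):
    """Return the adjacent cell with the strongest pheromone.

    `field` maps (x, y) cell coordinates to total pheromone strength;
    ties are broken by cell order purely for determinism.
    """
    neighbors = [(x + dx, y + dy)
                 for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                 if (dx, dy) != (0, 0) and (x + dx, y + dy) in field]
    return max(sorted(neighbors), key=lambda n: field[n])
```

Each move costs one scan of at most eight neighbors, which is why thousands of ghosts per page remain tractable.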
[0079] How fast a processor do we need? Let p be the real-time
duration of a page in seconds. If each page represents 60 seconds
of domain time, and we are replaying a simulation at 2× domain
time, p = 30. Let n be the number of pages between the insertion
horizon and τ = t. In our current system, n = 30. Then a shift rate
of n/p shifts per second will permit ghosts to run from the
insertion horizon to the current time at least once before a new
page is generated. Empirically, we have found this rate to be a
workable lower bound for acceptable performance, and easily
achievable on stock WinTel platforms.
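For the figures quoted (60-second pages replayed at 2× domain time, n = 30), the required shift rate works out as follows; the function name is ours:

```python
def min_shift_rate(page_domain_seconds, replay_speedup, n_pages):
    """Minimum shifts per second so that ghosts can traverse the
    n past pages at least once per real-time page interval."""
    p = page_domain_seconds / replay_speedup  # real-time seconds per page
    return n_pages / p

rate = min_shift_rate(60, 2, 30)  # -> 1.0 shift per second
```

At one shift per second, each ghost crosses the whole past book exactly once before the next page rotation, the minimum needed for the fitness evaluation at τ = t.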
Information Sources
[0080] The flexibility of the BEE's pheromone infrastructure
permits the integration of numerous information sources as input to
our characterizations of entity personalities and predictions of
their future behavior. Our current system draws on three sources of
information, but others can readily be added.
[0081] Real-world observations.--Observations from the real world
are encoded into the pheromone field at each increment of BEE time,
a new "current page" is generated. Table 1 identifies the entities
that generate each flavor of pheromone.
[0082] Statistical estimates of threat regions.--An independent
process (known as SAD (Statistical Anomaly Detection) developed by
Rafael Alonso, Hua Li, and John Asmuth at Sarnoff Corporation) uses
statistical techniques to estimate the level of threat to each
force (Red or Blue), based on the topology of the battlefield and
the known disposition of forces. For example, a broad open area
with no cover is particularly threatening, especially if the
opposite force occupies its margins. The results of this process
are posted to the pheromone pages as "RedThreat" pheromone
(representing a threat to Red) and "BlueThreat" pheromone
(representing a threat to Blue).
[0083] AI-based plan recognition.--BEE is motivated by the
recognition that prediction requires not only analysis of an
entity's intentions, but also its internal emotional state and the
dynamics it experiences externally in interacting with the
environment. While plan recognition is not sufficient for effective
prediction, it is a valuable input. In the current system, a Bayes
net is dynamically configured based on heuristics to identify the
likely goals that each entity may hold. This process is known as
KIP (Knowledge-based Intention Projection). The destinations of
these goals function as "virtual pheromones." As described below,
ghosts include their distance to such points in their action
decisions, achieving the result of gradient following without the
computational expense of maintaining a pheromone field.
Experimental Results
[0084] BEE has been tested in a series of experiments in which
human wargamers make decisions that are played out in a real-time
battlefield simulator. The commander for each side (Red and Blue)
has at his disposal a team of pucksters, human operators who set
waypoints for individual units in the simulator. Each puckster is
responsible for four to six units. The simulator moves the units,
determines firing actions, and resolves the outcome of
conflicts.
Fitting Dispositions
[0085] To test the system's ability to fit personalities based on
behavior, one Red puckster responsible for four units was
designated the "emotional" puckster. He was instructed to select
two of his units to be cowardly ("chickens") and two to be
irritable ("Rambos"), without disclosing the assignment during the
run. He moved each unit according to the commander's orders until
the unit encountered circumstances that would trigger the emotion
associated with its disposition. Then he would maneuver chickens as
though they were fearful (typically avoiding combat and moving away
from Blue), and would move Rambos into combat as quickly as
possible.
[0086] It has been found that the difference between the two
disposition values (Cowardice-Irritability) of the fittest ghosts
is a better indicator of the emotional state of the corresponding
entity than either value by itself.
[0087] FIG. 4 shows the delta disposition for each of the eight
fittest ghosts at each time step, plotted against the time step in
seconds, for a unit played as a Chicken in an actual run. The
values clearly trend negative.
[0088] FIG. 5 shows a similar plot for a Rambo. Units played
with an aggressive personality tend to die very soon, and often do
not give their ghosts enough time to evolve a clear picture of
their personality, but in this case the positive Delta Disposition
is clearly evident before the unit's demise.
[0089] To distill such a series of points into a characterization
of a unit's personality, we maintain an 800-second exponentially
weighted moving average of the Delta Disposition, and declare the
unit to be a Chicken or Rambo if this value passes a negative or
positive threshold, respectively. Currently, this threshold is set
at 0.25. Other filters may be used. For example, a rapid rate of
increase enhances the likelihood of calling a Rambo; units that
seek to avoid detection and avoid combat are more readily called
Chicken.
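The filter can be sketched as follows. The window (800 seconds) and threshold (0.25) come from the text; the particular smoothing constant α = Δt/window is our assumption, since the patent does not specify the EWMA form:

```python
def classify(delta_dispositions, dt=1.0, window=800.0, threshold=0.25):
    """EWMA of the Delta Disposition series, sampled every dt seconds;
    label the unit once the average crosses +/- threshold."""
    alpha = dt / window          # assumed smoothing constant
    avg = 0.0
    for d in delta_dispositions:
        avg = alpha * d + (1 - alpha) * avg
    if avg <= -threshold:
        return "Chicken"
    if avg >= threshold:
        return "Rambo"
    return None
```

A long window keeps brief emotional flare-ups from triggering a label, at the cost of slower detection, which matters for short-lived Rambos.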
[0090] Table 4 shows the percentages of emotional units detected in
a recent series of experiments. A Rambo is never called a Chicken,
and examination of the logs for the one case where a Chicken is
called a Rambo shows that in fact the unit was being played
aggressively, rushing toward oncoming Blue forces. Because the
brave die young, we almost never detect units played intentionally
as Rambos.
TABLE 4: EXPERIMENTAL RESULTS ON FITTING DISPOSITIONS (16 runs)
            Called Correctly   Called Incorrectly   Not Called
  Chickens        68%                 5%                27%
  Rambos           5%                 0%                95%
[0091] In addition to these results on units intentionally played
as emotional, there are a number of cases where other units were
detected as cowardly or brave. Analysis of the behavior of these
units shows that these characterizations were appropriate: units
that flee in the face of enemy forces or weapons fire are detected
as Chickens, while those that stand their ground or rush the
adversary are denominated as Rambos.
Integrated Predictions
[0092] Each ghost that runs into the future generates a possible
future path that its unit might follow. The set of such paths for
all ghosts embodies a number of distinct predictions, including the
most or least likely future, the future that poses the greatest or
least risk to the opposite side, the future that poses the greatest
or least risk to one's own side, and so forth. In the experiments
reported here, the future whose ghost receives the most guidance
from pheromones in the environment was selected at each step along
the way. In this sense, it is the most likely future.
[0093] Assessing the accuracy of these predictions requires a set
of metrics, and a baseline against which they can be compared.
Metrics for Predictions
[0094] In one embodiment, two sets of metrics may be used. One set
evaluates predictions in terms of their individual steps. The other
examines several characteristics of an entire prediction.
[0095] The step-wise evaluations are based on the structure
summarized schematically in FIG. 6. Each row in the matrix is a
successive prediction. Each column describes a real-world time
step. A given cell records the distance between where the row's
prediction indicated the unit would be at the column's time, and
where it actually was.
[0096] The figure shows how these cells can be averaged
meaningfully to yield three different measures: the prospective
accuracy of a single prediction issued at a point in time, the
retrospective accuracy of all predictions concerning a given point
in time, or the offset accuracy showing how predictions vary as a
function of look-ahead depth.
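Concretely, if err[i][j] holds the distance between where the prediction issued at step i placed the unit at step j and where the unit actually was, the three measures are row, column, and diagonal means. In this sketch (our scaffolding), cells a prediction does not cover are None:

```python
def prospective(err, i):
    """Mean error of the single prediction issued at step i (row mean)."""
    row = [v for v in err[i] if v is not None]
    return sum(row) / len(row)

def retrospective(err, j):
    """Mean error of all predictions concerning real step j (column mean)."""
    col = [row[j] for row in err if row[j] is not None]
    return sum(col) / len(col)

def offset(err, k):
    """Mean error at look-ahead depth k (diagonal mean)."""
    diag = [err[i][i + k] for i in range(len(err))
            if i + k < len(err[i]) and err[i][i + k] is not None]
    return sum(diag) / len(diag)
```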
[0097] The second set of metrics is based on characteristics of an
entire prediction. FIG. 7 summarizes three such characteristics of
a path (whether real or predicted): the overall angle θ it
subtends, the straight-line radius ρ from start to end, and the
actual length λ integrated along the path. A fourth
characteristic of interest is the number of time intervals τ
during which the unit was moving. Each of these four values
provides a basis of comparison between a prediction and a unit's
actual movement (or between any two paths).
[0098] AScore (Angle Score).--Let θ_p be the angle associated with
the prediction, and θ_a the angle associated with the unit's actual
path over the period covered by the prediction. Let
Δθ = |θ_p - θ_a|. The angle score is (with angles expressed in
degrees)
AScore = 1 - Min(Δθ, 360 - Δθ)/180.
[0099] If Δθ = 0, AScore = 1. If Δθ = 180, AScore = 0. The average
of a set of random predictions will produce a score approaching
0.5.
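The formula translates directly into code; the modulo guard is our addition, merely keeping arbitrary angle inputs in range:

```python
def ascore(theta_p, theta_a):
    """Angle score: 1 for identical headings, 0 for opposite ones."""
    d = abs(theta_p - theta_a) % 360   # wrap, so 350 vs 10 differ by 20
    return 1 - min(d, 360 - d) / 180
```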
[0100] RScore (Range Score).--Let ρ_p be the straight-line
distance from the current position to the end of the prediction,
and ρ_a the straight-line distance for the actual path. The range
score is:
RScore = 1.0 - |ρ_p - ρ_a|/Max(ρ_p, ρ_a).
[0101] If the prediction is perfect, ρ_p = ρ_a, and RScore = 1. If
the ranges are different, RScore gives the percentage that the
shorter range is of the longer one. Special logic returns an RScore
of 0 if just one of the ranges is 0, and 1 if both are 0.
[0102] LScore (Length Score).--Let λ_p be the sum of path segment
distances for the prediction, and λ_a the sum of path segment
distances for the actual path. The length score is:
LScore = 1.0 - |λ_p - λ_a|/Max(λ_p, λ_a).
[0103] If the prediction is perfect, λ_p = λ_a, and LScore = 1. If
both lengths are non-zero, LScore indicates what percentage the
shorter path length is of the longer path length. Special logic
returns an LScore of 0 if just one of the lengths is 0, and 1 if
both are 0.
[0104] TScore (Time Score).--Let τ_p be the number of minutes that
the unit is predicted to move, and τ_a the number of minutes that
it actually moves. The time score is:
TScore = 1.0 - |τ_p - τ_a|/Max(τ_p, τ_a).
[0105] If the prediction is perfect, τ_p = τ_a, and TScore = 1. If
both times are non-zero, TScore indicates what percentage the
shorter time is of the longer time. Special logic returns a TScore
of 0 if just one of the times is 0, and 1 if both are 0.
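RScore, LScore, and TScore all share one form: one minus the normalized difference of two nonnegative magnitudes, with special handling at zero. A single helper (our factoring) therefore covers all three:

```python
def ratio_score(p, a):
    """Shared form of RScore, LScore, and TScore: the fraction the
    smaller magnitude is of the larger, with special cases at zero."""
    if p == 0 and a == 0:
        return 1.0             # both zero: perfect agreement
    if p == 0 or a == 0:
        return 0.0             # exactly one zero: total disagreement
    return 1.0 - abs(p - a) / max(p, a)
```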
Baseline
[0106] As a baseline for comparison, a random-walk predictor can be
implemented. This process starts at a unit's current location, then
takes 30 random steps. A random step consists of picking a random
integer r uniformly distributed between 0 and 120, indicating the
next cell to move to in an 11-by-11 grid with the current position
at the center. (The grid is size 11 because the BEE movement model
allows the ghosts to move from 0 to 5 cells in the x and y
directions at each step.) The random number r is translated into x
and y steps, Δx and Δy, using the equations (with integer division)
Δx = r/11 - 5, Δy = (r mod 11) - 5.
[0107] To compile a baseline, the random prediction is generated
100 times, and each of these runs is used to generate one of the
metrics discussed above. The baseline reported is the average of
these 100 instances.
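Decoding the draw r into a step follows the equations above, with the division read as integer division; the walk generator around it is our scaffolding:

```python
import random

def random_step(r):
    """Map r in [0, 120] onto the 11-by-11 neighborhood around the
    current cell: dx and dy each range over [-5, 5]."""
    return r // 11 - 5, r % 11 - 5

def random_walk(start, steps=30, rng=None):
    """One baseline prediction: `steps` random steps from `start`."""
    rng = rng or random.Random(0)
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        dx, dy = random_step(rng.randint(0, 120))
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```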
EXAMPLES
[0108] FIG. 8 illustrates the three stepwise metrics for a single
unit in a single run. In the case of this unit, BEE was able to
formulate good predictions, which are superior to the baseline in
all three metrics. It is particularly encouraging that the horizon
error increases so gradually. In a complex nonlinear system,
trajectories may diverge at some point, making prediction
physically impossible. One would expect to see a discontinuity in
the horizon error if the system were reaching this limit. The
gentle increase of the horizon error suggests that we are not near
this position.
[0109] FIG. 9 illustrates the four component metrics for the same
unit and the same run. In general, these metrics support the
conclusion that these predictions are superior to the baseline, and
make clear which characteristics of the prediction are most
reliable.
[0110] The BEE architecture lends itself to extension in several
areas. The various inputs currently integrated by the BEE are only
examples of the kinds of information that can be handled. The basic
principle of using a dynamical simulation to integrate a wide range
of influences can be extended to other inputs as well, requiring
far less additional engineering than more traditional ways of
reasoning about how different knowledge sources jointly influence
an agent's behavior.
[0111] The initial limited repertoire of emotions is a small subset
of those that have been distinguished by psychologists, and that
might be useful for understanding and projecting behavior. The set
of emotions and supporting dispositions that BEE can detect can be
extended.
[0112] The mapping between an agent's psychological (cognitive and
emotional) state and its outward behavior is not one-to-one.
Several different internal states might be consistent with a given
observed behavior under one set of environmental conditions, but
might yield distinct behaviors under other conditions. If the
environment in the recent past is one that confounds such distinct
internal states, one will be unable to distinguish them, and if the
environment shifts to a condition in which they yield different
behaviors, any predictions will suffer. One can probe the real
world, perturbing it in ways that would stimulate distinct
behaviors from entities whose psychological state is otherwise
indistinguishable. BEE's faster-than-real-time simulation can allow
the user to identify appropriate probing actions, greatly
increasing the effectiveness of intelligence efforts.
[0113] While BEE has been developed in the context of adversarial
reasoning in urban warfare, it is applicable in a much wider range
of applications, including computer games, business strategy, and
sensor fusion.
[0114] Thus, it should be understood that the embodiments and
examples described herein have been chosen and described in order
to best illustrate the principles of the invention and its
practical applications to thereby enable one of ordinary skill in
the art to best utilize the invention in various embodiments and
with various modifications as are suited for particular uses
contemplated. Even though specific embodiments of this invention
have been described, they are not to be taken as exhaustive. There
are several variations that will be apparent to those skilled in
the art.
REFERENCES
[0115] [1] S. Brueckner. Return from the Ant: Synthetic Ecosystem for Manufacturing Control. Dr.rer.nat. Thesis at Humboldt University Berlin, Department of Computer Science, 2000. Available at http://dochostrz.hu-berlin.de/dissertationen/brueckner-sven-2000-06-21/PDF/Brueckner.pdf.
[0116] [2] S. Carberry. Techniques for Plan Recognition. User Modeling and User-Adapted Interaction, 11(1-2):31-48, 2001. Available at http://www.cis.udel.edu/~carberry/Papers/UMUAI-PlanRec.ps.
[0117] [3] J. Ferber and J.-P. Muller. Influences and Reactions: a Model of Situated Multiagent Systems. In Proceedings of the Second International Conference on Multi-Agent Systems (ICMAS-96), pages 72-79, 1996.
[0118] [4] A. Haddadi and K. Sundermeyer. Belief-Desire-Intention Agent Architectures. In G. M. P. O'Hare and N. R. Jennings, editors, Foundations of Distributed Artificial Intelligence, pages 169-185. John Wiley, New York, N.Y., 1996.
[0119] [5] A. Ilachinski. Artificial War: Multiagent-based Simulation of Combat. World Scientific, Singapore, 2004.
[0120] [6] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge University Press, Cambridge, UK, 1997.
[0121] [7] M. K. Lauren and R. T. Stephen. Map-Aware Non-uniform Automata (MANA)--A New Zealand Approach to Scenario Modelling. Journal of Battlefield Technology, 5(1):27ff, March 2002. Available at http://www.argospress.com/jbt/Volume5/5-1-4.htm.
[0122] [8] F. Michel. Formalisme, methodologie et outils pour la modelisation et la simulation de systemes multi-agents. Doctoral Thesis at Universite des Sciences et Techniques du Languedoc, Department of Informatique, 2004. Available at http://www.lirmm.fr/~fmichel/these/index.html.
[0123] [9] A. Ortony, G. L. Clore, and A. Collins. The Cognitive Structure of Emotions. Cambridge University Press, Cambridge, UK, 1988.
[0124] [10] H. V. D. Parunak, R. Bisson, S. Brueckner, R. Matthews, and J. Sauter. Representing Dispositions and Emotions in Simulated Combat. In Proceedings of the Workshop on Defense Applications of Multi-Agent Systems (DAMAS05, at AAMAS05), pages forthcoming, 2005. Available at http://www.altarum.net/~vparunak/DAMAS05DETT.pdf.
[0125] [11] H. V. D. Parunak and S. Brueckner. Ant-Like Missionaries and Cannibals: Synthetic Pheromones for Distributed Motion Control. In Proceedings of the Fourth International Conference on Autonomous Agents (Agents 2000), pages 467-474, 2000. Available at http://www.altarum.net/~vparunak/MissCann.pdf.
[0126] [12] H. V. D. Parunak, S. Brueckner, M. Fleischer, and J. Odell. A Design Taxonomy of Multi-Agent Interactions. In Proceedings of Agent-Oriented Software Engineering IV, pages 123-137. Springer, 2003. Available at http://www.altarum.net/~vparunak/cox.pdf.
[0127] [13] H. V. D. Parunak, S. Brueckner, and J. Sauter. Digital Pheromones for Coordination of Unmanned Vehicles. In Proceedings of the Workshop on Environments for Multi-Agent Systems (E4MAS 2004), pages 246-263. Springer, 2004. Available at http://www.altarum.net/~vparunak/AAMAS04_UAVCoordination.pdf.
[0128] [14] H. V. D. Parunak, S. A. Brueckner, and J. Sauter. Digital Pheromone Mechanisms for Coordination of Unmanned Vehicles. In Proceedings of the First International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2002), pages 449-450, 2002. Available at http://www.altarum.net/~vparunak/AAMAS02ADAPTIV.pdf.
[0129] [15] A. S. Rao and M. P. Georgeff. Modeling Rational Agents within a BDI Architecture. In Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning (KR-91), pages 473-484. Morgan Kaufmann, 1991.
[0130] [16] J. A. Sauter, R. Matthews, H. V. D. Parunak, and S. Brueckner. Evolving Adaptive Pheromone Path Planning Mechanisms. In Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS02), pages 434-440, 2002. Available at http://www.altarum.net/~vparunak/AAMAS02Evolution.pdf.
* * * * *