Intention recognition is significant in many applications. In this paper, we focus on team intention recognition, which identifies the intention of each team member and the team working mode. To model the team intention as well as the world state and observation, we propose a Logical Hierarchical Hidden Semi-Markov Model (LHHSMM), which has the advantages of statistical relational learning and can represent a complex mission hierarchically. Additionally, the LHHSMM explicitly models the duration of the team working mode, the intention termination, and the relations between the world state and observation. A Logical Particle Filter (LPF) algorithm is also designed to infer team intentions modeled by the LHHSMM. In experiments, we simulate agents’ movements in a combat field and employ the agents’ traces to evaluate the performances of the LHHSMM and LPF. The results indicate that the team working mode and the target of each agent can be effectively recognized by our methods. When intentions are interrupted with a high probability, the LHHSMM outperforms a modified logical hierarchical hidden Markov model in terms of precision, recall, and F-measure.
Intention Recognition (IR) is the task of identifying the specific goals that an agent or a group of agents is attempting to achieve [
Recognizing intentions is significant in both real and virtual worlds. For example, in real-time strategy games, AI players can choose more efficient policies if their enemies’ actions are known [
As a well-known branch of probabilistic graphical models (PGMs), the hidden Markov model (HMM) is widely used for analyzing sequential data in many applications [
Even though the HMM looks suitable for modeling intentional behaviors and observations, several problems arise when it is applied in the IR domain. First, the HMM makes a strong Markov assumption: the duration of a state implicitly follows a geometric distribution, whose parameter is the state self-transition probability, but in many applications the durations of hidden states are not geometric. Second, the HMM has no hierarchical structure; when a mission is complex, we often need to decompose it into subtasks repeatedly until it consists only of primitive actions, and a hierarchical structure is necessary to represent the task decomposition and allocation. Third, the HMM is propositional: it has no concept of class or object, which limits its representational power, and a propositional model is not suitable for handling relations, which are important in IR. Finally, maximum-likelihood inference assumes that the intention is static, whereas the team intention may be interrupted and changed because of new situations or other reasons.
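To make the first limitation concrete, the following minimal sketch (our own illustration, not code from the paper) shows that a state's dwell time under HMM self-transitions is geometrically distributed, with mean 1 / (1 - p):

```python
import random

def sample_dwell_time(p_self, rng):
    """Dwell time of an HMM state with self-transition probability
    p_self; the result is geometrically distributed with mean
    1 / (1 - p_self)."""
    t = 1
    while rng.random() < p_self:
        t += 1
    return t

rng = random.Random(0)
samples = [sample_dwell_time(0.8, rng) for _ in range(20000)]
mean_dwell = sum(samples) / len(samples)
# For p_self = 0.8 the mean dwell time is 1 / (1 - 0.8) = 5 steps.
```

The point of the sketch is that the modeler cannot choose the dwell-time distribution: it is fixed by the self-transition probability, which motivates the explicit-duration models discussed next.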
To solve these problems, researchers have modified the HMM in different ways. For example, the Hidden Semi-Markov Model (HSMM) uses an arbitrary distribution such as Poisson or Gamma to model the state duration explicitly and sets the self-transition probability to zero [
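A minimal sketch of this explicit-duration idea (our own illustration; the Gamma parameters are arbitrary, not from any cited model): on entering a state, a dwell time is drawn from a chosen distribution, and self-transitions are forbidden:

```python
import random

def sample_hsmm_states(trans, dur_params, start, n_steps, rng):
    """State sequence from a toy explicit-duration HSMM: on entering a
    state, a dwell time is drawn from a Gamma distribution (rounded up
    to at least 1), and self-transitions are forbidden."""
    states, s = [], start
    while len(states) < n_steps:
        shape, scale = dur_params[s]
        d = max(1, round(rng.gammavariate(shape, scale)))
        states.extend([s] * d)
        r = rng.random()                  # draw the next state (never s itself)
        for s2, p in trans[s].items():
            r -= p
            if r <= 0:
                break
        s = s2
    return states[:n_steps]

rng = random.Random(1)
trans = {"A": {"B": 1.0}, "B": {"A": 1.0}}     # no self-transitions
dur = {"A": (4.0, 1.0), "B": (2.0, 1.0)}       # Gamma(shape, scale) per state
seq = sample_hsmm_states(trans, dur, "A", 30, rng)
```

Here the duration distribution is a free modeling choice, which is exactly what the geometric dwell time of the plain HMM forbids.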
Another type of extension deals with relational data. These models fall under the theory of Statistical Relational Learning (SRL), which integrates relational or logical representations and probabilistic reasoning mechanisms with machine learning [
The extensions of the HMM above focus on recognizing the intentions of a single agent. However, teamwork is quite common in many scenarios. In this paper, we focus on the problem of team intention recognition. Obviously, recognizing the intention of a team is more complex than that of a single agent, because (a) we need to recognize the goal of each agent as well as the team working mode; (b) task decomposition and allocation, which depend on the team working mode, need to be represented hierarchically; (c) the team intention may be interrupted for unknown reasons; and (d) observations are noisy and partially missing.
The research on the LHSMM has shown that (a) the logical predicates and instantiation process can represent the working mode and the composition of the team well and (b) modeling the duration of abstract states yields higher precision and a smoother recognition curve [
To solve the problems of applying existing models to team intention recognition, we propose a framework named the Logical Hierarchical Hidden Semi-Markov Model (LHHSMM). The LHHSMM borrows ideas from the LHHMM and combines them with the LHSMM as well as the MDP. Our LHHSMM has the following advantages. Compared with PGMs such as the HMM and its extensions, the LHHSMM has the advantages of SRL methods: by introducing first-order logic, it can infer complex relations and use logical inference to replace some probability computations; moreover, the predicates and instantiation process are well suited to representing the team working mode and changes of team members. A novel structure named the logical hierarchical Markov chain (LHMC) is proposed to represent the logical transitions and the decomposition of the intention and its subtasks (we call the intention and its subtasks policies). With this structure and transition conditions, our model inherits the compactness of representing policies hierarchically from the LHHMM, plus a mechanism that forcibly terminates executing subtasks when intentions change. Considering that the team intention may be interrupted for unknown reasons, we use a lognormal distribution to model the duration for which the team working mode is not interrupted, that is, the time remaining before an interruption event happens. With this explicit duration modeling, the current team intention depends not only on the team intention and world state at the previous step but also on how long the current intention has lasted, which is another difference between the LHHMM and the LHHSMM and the reason our model is called semi-Markov. Primitive actions are selected based on the current policies and previous world states; this MDP-like process lets the agents choose any primitive action without being limited by the action executed at the previous time. Observation functions are used to represent the probabilistic relations between the world state and the observation. This makes our model able to handle noisy and partially missing observations.
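As an illustration of the duration modeling described above, the following sketch draws interruption times from a lognormal distribution; the parameters mu and sigma are hypothetical, not values from the paper:

```python
import math
import random

def sample_interruption_time(mu, sigma, rng):
    """Time remaining before the team intention is interrupted, drawn
    from a lognormal distribution (mu and sigma are hypothetical)."""
    return rng.lognormvariate(mu, sigma)

rng = random.Random(2)
times = [sample_interruption_time(2.0, 0.5, rng) for _ in range(10000)]
mean_time = sum(times) / len(times)
expected = math.exp(2.0 + 0.5 ** 2 / 2)   # lognormal mean: exp(mu + sigma^2/2)
```

Unlike the geometric dwell time implied by a self-transition probability, the lognormal is always positive and right-skewed, so intentions that have already lasted a while are not forced toward a fixed hazard rate.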
To infer the team intentions modeled by the LHHSMM approximately, we provide a Logical Particle Filter (LPF) based on the logical definitions and dependencies of the LHHSMM. In the LPF, we use the simplest importance distribution and a forward sampling method to sample the particles, and the logical transitions and instantiation functions of the LHHSMM are introduced into this process.
We design a combat scenario to validate the LHHSMM and LPF: two agents move around and attack targets on a grid map. Our methods are used to infer the team working mode and the targets of the agents online according to the observed agents’ traces. Based on this scenario, we design a decision model for the agents and generate a dataset consisting of 100 traces. We use three traces (one changes intentions, the others do not) in the dataset to evaluate the LHHSMM and the LPF. Then, three metrics including precision, recall, and F-measure are used to compare the performances of the models.
The rest of the paper is organized as follows: Section
By making an intersection of psychology and artificial intelligence [
The logical reasoning based on event hierarchy [
In this section, we first review some research on applying PGMs to team intention recognition. Then, some extensions of the HMM related to our model will be analyzed.
A probabilistic graphical model encodes a complex distribution over a high-dimensional space by using a graph-based representation. The graph usually consists of nodes and edges, which correspond to the variables in the domain and the direct probabilistic interactions between them, respectively [
Masato et al. [
Some SRL methods such as the MLN and LHMM can also be regarded as special cases of PGMs. Sadilek and Kautz [
Even though the HMM has some advantages in behavior modeling, its strong Markov assumption limits its application in many areas. Thus, some research has been done to extend the HMM in the IR domain.
A hidden semi-Markov model (HSMM) is the same as the HMM, except that it models the duration of hidden states explicitly. Because of this advantage of modeling state durations precisely, HSMMs have been used to solve IR problems in digital games. For example, Hladky and Bulitko applied the HSMM to predict the position of a player in a first-person shooter game [
van Kasteren et al. compared the performances of activity recognition using the HMM, HSMM, CRF, and semi-Markov CRF, respectively; their activity data was recorded by real sensors in a smart home [
The AHMM is a stochastic model for representing the execution of a hierarchy of contingent plans [
Another famous extension of the HMM is the Coxian hidden semi-Markov model (CxHSMM) [
The LHHSMM is a fusion of the logical hierarchical hidden Markov model, the logical hidden semi-Markov model, and the Markov decision process. It is used to model the team to be recognized as well as the world state and observation. In this section, we give a formal definition of the LHHSMM and describe its dependencies with a DBN representation. Then, we explain how to use a logical hierarchical Markov chain to represent the logical transitions and the decomposition of policies.
LHHSMM in one time slice is a tuple
A level-
The world state
An instantiation function
For the
Policy termination variable
Intention duration variable
A conditional probabilistic logical transition has a form
A specific logical transition has a form
A unified logical transition has a form
These three kinds of logical transitions can be represented by solid edge, dashed edge, and dotted edge, respectively, in a FSM, as in the LHMM [
Three examples of logical transitions.
Solid edges
Dashed edges
Dotted edges
In this section, we use a DBN representation to describe the dependencies among the variables in the LHHSMM. However, the standard DBN cannot represent the logical transitions and the instantiation process in our model, since it is actually propositional. Thus, we can only show the full DBN after substituting all variables. To explain the logical dependencies underlying standard probabilistic transitions, we analyze the factors on which each policy, primitive action, termination, duration, and state depends, and then discuss the details of the logical transitions and the instantiation process. Figure
Subnetwork for a policy.
Full dependency
Dependency when
Dependency when
When
Subnetwork for duration.
Full dependency
Dependency when
Dependency when
Duration
Subnetwork for
Full dependency
Dependency when
Dependency when
Figure
Subnetwork for intention termination.
Full dependency
Dependency when
Dependency when
Figure
The relationships between the world state, observation, and primitive action are similar to those in a Markov decision process. First, given the level-1 policy
The full DBN representation in two time slices.
The DBN representation depicts the dependency in
Unlike the DBN structure, a logical hierarchical Markov chain (LHMC) ignores the variables of duration, the conditions of logical transitions, primitive actions, world states, and observations, and only depicts the decomposition of policies and the logical transitions among them. An LHMC is a tree structure in which each node is a logical Markov chain (LMC); an LMC can be regarded as an LHMM without any observation, just as the state transition process in a standard HMM is a Markov chain. In this paper, an abstract state in an LMC represents an abstract policy, and its child is an LMC at the next lower level, except for the policies in leaf nodes. Figure
An example of team policies represented by a two-layer LHMC.
The team policies in Figure
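The tree-of-chains idea can be sketched as a data structure (our own illustration; only the working-mode names Attack_I and Attack_C come from the scenario later in the paper, and the subtask names are hypothetical). Each abstract policy in the top-level chain owns a lower-level LMC over its subtasks:

```python
class LMC:
    """A logical Markov chain stripped to its skeleton: a set of
    abstract states (policies) and the transitions among them."""
    def __init__(self, states, trans):
        self.states = states              # abstract policies/subtasks
        self.trans = trans                # {state: {next_state: prob}}

# Top-level chain over the team working modes.
top = LMC(["Attack_I", "Attack_C"],
          {"Attack_I": {"Attack_C": 1.0}, "Attack_C": {"Attack_I": 1.0}})

# Each abstract policy owns a lower-level LMC over hypothetical subtasks.
children = {
    "Attack_I": LMC(["Approach", "Fire"], {"Approach": {"Fire": 1.0}}),
    "Attack_C": LMC(["Assemble", "Fire"], {"Assemble": {"Fire": 1.0}}),
}

def subtasks_of(policy):
    """Subtasks reachable from an abstract policy via its child LMC."""
    return children[policy].states
```

The nesting mirrors the LHMC: a leaf-level LMC has no children, while every abstract policy above the leaves expands into its own chain.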
In this section, we discuss how to infer the team intentions represented by the LHHSMM. Online intention recognition is essentially treated as a filtering problem. The policies, primitive actions, duration, and termination can be regarded as the states of a dynamic system. They cannot be observed directly but produce observations sequentially. Our goal is to infer the real state at each time from these observation series. Since the observations are noisy and partially missing, we use a particle filter (PF) to solve the approximate inference problem. However, the standard PF does not define any logical transition or instantiation process. Thus, we propose a new logical particle filter by introducing the logical definitions and dependencies of the LHHSMM.
Suppose that we have a dynamic system whose real state at time
When we choose optimal importance distribution
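For reference, one step of a generic bootstrap particle filter, which uses the transition prior as the importance distribution (the "simplest" choice), can be sketched as follows; the toy one-dimensional model is our own illustration, not part of the LPF:

```python
import math
import random

def bootstrap_pf_step(particles, weights, transition, likelihood, obs, rng):
    """One step of a generic bootstrap particle filter: propagate each
    particle through the transition prior (used as the importance
    distribution), then reweight by the observation likelihood."""
    new_particles = [transition(x, rng) for x in particles]
    new_weights = [w * likelihood(obs, x)
                   for w, x in zip(weights, new_particles)]
    total = sum(new_weights)
    if total == 0.0:                      # all weights vanished: reset
        new_weights = [1.0 / len(weights)] * len(weights)
    else:
        new_weights = [w / total for w in new_weights]
    return new_particles, new_weights

# Toy one-dimensional model: random-walk state, Gaussian-shaped likelihood.
rng = random.Random(3)
xs = [0.0] * 200
ws = [1.0 / 200] * 200
step = lambda x, r: x + r.gauss(0.0, 1.0)
lik = lambda y, x: math.exp(-0.5 * (y - x) ** 2)
xs, ws = bootstrap_pf_step(xs, ws, step, lik, 0.5, rng)
estimate = sum(w * x for w, x in zip(ws, xs))
```

The LPF described next keeps this propagate-and-reweight skeleton but replaces the plain transition function with logical transitions and the instantiation process.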
The standard PF cannot be applied to recognize intentions in our LHHSMM, since it does not allow logical transitions or the instantiation process. In this section, we present a logical particle filter based on the logical definitions and dependencies of our model. A particle
Initialization. Set the time to the first step and initialize the particle set: for each particle, sample the policies level by level through the logical transitions, instantiate them, and sample a primitive action.

Sampling. For each particle at a new time step, sample the termination and duration variables; if the current policies terminate, sample new policies from the logical transitions and instantiate them; otherwise, keep the current policies; then sample a primitive action.
Two points should be noted. One is that when an abstract policy is selected and executed for the first time, it should update the alphabets, instantiation functions, and logical transitions in its corresponding lower-level LMC; however, when a policy has not been terminated, the elements in its LMC do not need to be changed, and those in the top-level LMC are kept. The other is that when sampling an abstract policy from the logical transitions, we must follow the specific logical transition and the unified logical transition if their conditions are satisfied, and then obtain the new abstract policy by the conditional probabilistic logical transition immediately.
Update the weight of each particle by
In this paper, we only care about the abstract intention and the constants in the instantiated intention. Since they are discrete, we can compute them by
With the four steps above, we can compute the probabilities of the team working mode and goals of agents at each time.
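The final step, computing the probabilities of the discrete intention parts, can be sketched by summing particle weights per (working mode, target) pair; the particle fields and values below are hypothetical, with names following the scenario:

```python
def intention_probabilities(particles, weights):
    """Estimate the probability of each discrete team intention by
    summing the weights of particles that share the same
    (working mode, target) pair."""
    probs = {}
    for x, w in zip(particles, weights):
        key = (x["mode"], x["target"])
        probs[key] = probs.get(key, 0.0) + w
    return probs

# Hypothetical weighted particle set.
parts = [{"mode": "Attack_I", "target": "as1"},
         {"mode": "Attack_I", "target": "as1"},
         {"mode": "Attack_C", "target": "as2"}]
ws = [0.5, 0.2, 0.3]
probs = intention_probabilities(parts, ws)
```

Because the abstract intention and the instantiated constants are discrete, this weighted count is a consistent estimate of their posterior probabilities as the number of particles grows.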
To evaluate the performance of applying the LHHSMM and LPF to team intention recognition, we design a battle scenario. In this scenario, two agents execute an attacking mission individually or cooperatively in a known environment. Their team intention consists of two parts: the team working mode and the specific target of each agent. Our recognition task is to compute the probabilities of the team intention sequentially from continuous observations of the agents’ traces.
This scenario has several characteristics: first, the agents can act individually or form a team; second, the attacking mission can be decomposed into subtasks and primitive actions; third, the team intentions can be interrupted because of new orders or other unknown reasons; last, the observed traces are noisy, and at some times we may not have any position record at all. Furthermore, the observed data arrives sequentially, and we need to update our recognition results when new evidence arrives. The initial situation of the battlefield is shown in Figure
The initial situation of the battlefield.
The battlefield map consists of
A decomposed LHMC representation of the team policies.
The LMC in level-
The LMC in level-1 under
The LMC in level-1 under
Figure
The team has only one abstract primitive action
In this scenario, the selection of policies and actions is only related to the positions of agents. Thus, the world state is defined as the set of
There is only one possible logical transition in level-1, and its transition probability is always 1; the conditional transition probabilities of level-
B, C, D, E, and F represent policies shown in Figure
In our LHHSMM, termination of policy
When
In this scenario, the selection of primitive actions can be neglected since there is only one abstract primitive action. The instantiation function is defined piecewise over three cases.
After the abstract intention
Observation at one time is a set
To simplify the instantiation functions of intentions, the specific targets are selected according to their tactical values. We set the normalized values of
The instantiation probabilities of the specific targets under three conditions.

| Target | | | |
|---|---|---|---|
| as1 | 0.4 | 0.6 | 0.7 |
| as2 | 0.6 | 0.4 | 0.3 |
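If the instantiation probabilities are derived by normalizing tactical values, as described above, the computation can be sketched as follows (the values are illustrative and match only the first column of the table):

```python
def instantiation_probs(values):
    """Normalize tactical values into instantiation probabilities for
    the candidate targets."""
    total = sum(values.values())
    return {target: v / total for target, v in values.items()}

# Illustrative tactical values for the two targets.
probs = instantiation_probs({"as1": 0.4, "as2": 0.6})
```

Normalization guarantees that the instantiation probabilities of the candidate targets sum to one under each condition.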
In the approximate inference, we set the number of particles
We run the scenario repeatedly and produce a test dataset consisting of 100 traces. With this dataset, we compute the recognition results of specific traces to validate the LHHSMM and the LPF, and we compare the performances when the intention duration distributions are different.
Three traces in the test dataset are selected to compute the probabilities of team intentions with the LHHSMM. The details of these three traces are shown in Table
The details of three traces.
| Trace number | Durations | Working modes | Targets | Interrupted |
|---|---|---|---|---|
| 5 | | Attack_I ( | agent A: | No |
| 17 | | Attack_C ( | agent A: | No |
| 57 | | Attack_I ( | agent A: | Yes |
| 57 | | Attack_C ( | agent A: | Yes |
| 57 | | Attack_I ( | agent A: | No |
As shown in Table
The recognition results of trace number 5 computed by the LHHSMM.
The probabilities of working modes in trace number 5
The probabilities of targets of agent A in trace number 5
The probabilities of targets of agent B in trace number 5
The recognition results of trace number 17 computed by the LHHSMM.
The probabilities of working modes in trace number 17
The probabilities of targets of agent A in trace number 17
The probabilities of targets of agent B in trace number 17
The recognition results of trace number 57 computed by the LHHSMM.
The probabilities of working modes in trace number 57
The probabilities of targets of agent A in trace number 57
The probabilities of targets of agent B in trace number 57
Figure
Figure
Metrics of recognizing team working modes and targets of agent A computed by the LHHSMM and the LHHMM.
Precision of recognizing team working modes
Recall of recognizing team working modes
Precision of recognizing targets of agent A
Recall of recognizing targets of agent A
Figure
To show the advantages of the LHHSMM compared with the LHHMM, we add our observation model to the standard LHHMM (we still call it the LHHMM later) and also use a particle filter to infer intentions. However, since the LHHMM does not model intention interruption, the weights of all particles may be 0 after an interruption happens. In this situation, we reset the particle weights to
The performances of our model and the LHHMM are compared statistically using three metrics: precision, recall, and F-measure.
Since the traces in the dataset have different time lengths, we define a variable
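The three metrics at a given normalized time step can be computed across traces as follows (a generic sketch; the class labels are illustrative, not the scenario's intentions):

```python
def precision_recall_f(preds, truths, positive):
    """Precision, recall, and F-measure of recognizing one intention
    class at a given (normalized) time step across many traces."""
    tp = sum(1 for p, t in zip(preds, truths) if p == positive and t == positive)
    fp = sum(1 for p, t in zip(preds, truths) if p == positive and t != positive)
    fn = sum(1 for p, t in zip(preds, truths) if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = precision_recall_f(["A", "A", "B"], ["A", "B", "B"], "A")
# p = 1/2, r = 1/1, f = 2 * (0.5 * 1.0) / 1.5 = 2/3
```

Evaluating these quantities at each normalized time step produces the metric curves compared in the figures.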
The red solid curves and the blue dash-dotted curves are computed by the LHHSMM and the LHHMM, respectively. The LHHSMM outperforms the LHHMM in all three metrics, especially in the second half of the simulation. In the starting phase, the intention has only been held for a short time and the probability of it changing is small; thus, the LHHSMM and the LHHMM have similar performances. However, as more and more interruptions of intention occur, the LHHSMM shows its advantage: its recognition performance generally improves as more observations accumulate.
Our LHHSMM is semi-Markov because it uses a specific distribution to explicitly model the duration for which the team intention is not interrupted. To evaluate the effects of duration modeling, we compare the recognition metrics computed by LHHSMMs with three different duration distributions:
Recognition metrics with different duration distributions.
The blue, green, and red bars indicate the performances of our LHHSMMs with
In this paper, we proposed the LHHSMM to recognize team intentions. As a fusion of the LHHMM, LHSMM, and MDP, the LHHSMM has several advantages for solving team intention recognition problems in a complex environment: first, it uses a logical predicate to represent the team working mode, which can be recognized together with the goal of each agent; second, it has a hierarchical structure and can make use of domain knowledge to represent complex tasks; third, owing to the modeling of intention duration, the LHHSMM can update the result probabilities correctly even if the intention is interrupted and changed; last, the LHHSMM can deal with noisy and partially missing observations.
To solve the inference problems of the LHHSMM, an LPF is proposed based on the logical definitions and dependencies of the LHHSMM. We also design a combat scenario to evaluate the LHHSMM and LPF; the results show the following: first, whether or not intentions change, our methods can effectively recognize the team working mode and the targets of each agent; second, the LHHSMM outperforms the LHHMM in precision, recall, and F-measure.
In the future, we would like to continue our research in two directions: (a) learning the parameters of the LHHSMM and (b) obtaining the optimal importance distribution for the LPF. Moreover, applying our model to recognize intentions in a real scenario is also appealing.
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The work described in this paper is sponsored by the National Natural Science Foundation of China under Grants no. 61473300 and no. 61573369.