Random Finite Set Based Data Assimilation for Dynamic Data Driven Simulation of Maritime Pirate Activity

Maritime piracy poses a genuine threat to maritime transport. The main purpose of simulation is to predict the behavior of real systems, and it has been applied successfully in many fields, but its application in the maritime domain is still scarce. The rapid development of network and measurement technologies has brought higher accuracy and better availability of online measurements, which makes the simulation paradigm known as dynamic data driven simulation increasingly popular. It assimilates online measurements into running simulation models and thereby yields much more accurate predictions of the complex systems under study. In this paper, we study how to utilize online measurements in the agent based simulation of maritime pirate activity. A new random finite set based data assimilation algorithm is proposed to overcome the limitations of conventional vector based data assimilation algorithms. The random finite set based general data model, measurement model, and simulation model are introduced to support the proposed algorithm. The details of the proposed algorithm are presented in the context of agent based simulation of maritime pirate activity. Two groups of experiments are used to demonstrate the effectiveness and superiority of the proposed algorithm.


Introduction
In modern society, the global maritime transport system is a critical part of the world's transportation system. Nearly ninety percent of internationally traded goods are transported by sea [1]. Since 2010, insurance rates have increased more than tenfold for ships transiting areas where maritime piracy is rampant. In 2016, the insured losses caused by maritime piracy worldwide amounted to 2.46 billion US dollars, and they continue to rise. The number of pirate attacks against ships worldwide in 2016 was 191. Increasingly rampant pirate activity has become a great threat to the international maritime transport system.
Since the turn of the century, many researchers have studied how to fight the dramatic increase in maritime piracy [2]. Many of them offer insight into how the problem may be addressed through global politics, security studies, international relations, or other approaches. These approaches target the restoration of law and order in weak states, but they offer only the most general intelligence while state fragility remains [3][4][5]. For example, many approaches only aim to tell where attacks are most likely to happen without giving advice on how to prevent or stop them.
The main purpose of simulation is to predict the behavior of real systems, such as natural disasters, wildfires, and terrorist attacks. Simulation systems have long been used to support policy design and operational management in ground transportation [6] and air transportation [7]. Applications of simulation systems in the maritime transportation field, however, are scarce.
The agent based dynamic data driven simulation (DDDS) of maritime pirate activity focuses on the accurate prediction of pirate behavior in the real world and takes full advantage of real-time sensors. The primary purpose of the simulation system is to accurately assess the potential outcomes of actions taken by decision-makers (including ship commanders and operators) who interact with the system. It can also be used to find optimal penetration routes through designated piracy zones in real time by assimilating the real-time measurements from all kinds of sensors. Data assimilation techniques, which are the foundation of DDDS, fuse real-time measurements with simulated states to best determine the states of the studied system. Since actions are chosen according to some policy based on the current simulation states of the system, simulation is a natural way to determine the utility of these actions. The facts that make data assimilation necessary for DDDS of maritime pirate activity are as follows.
Highly Dynamic and Changing Real Operational Context. Future maritime mission planning environments are expected to be complex and distributed due to rapid advancement in cyber-physical systems and the ubiquitous use of intelligent heterogeneous assets. The maritime environmental context is always changing. Simulation results based on current data may no longer be accurate once additional data is received. Unforeseen pirates can appear in the designated piracy zone and critical rescue equipment can be damaged, thereby making the current simulation results useless. To adapt to this dynamic operational context, we need data assimilation to bring the changing environmental information into the simulation system.
Imprecise Simulation Data and Models. The accuracy of simulation is determined by many factors, such as the data used for initialization and the models that describe the real system. Taking the simulation of maritime pirate activity as an example, the accuracy of simulation depends on the weather data, merchant vessel data, and pirate vessel data in designated piracy zones. Due to current technical constraints, it is impossible to obtain all the required data without errors. Besides data errors, the simulation model of pirates also introduces errors. Subject to these constraints, the results of the simulation models will inevitably be inaccurate. To overcome the adverse impact of imprecise input data and models, we need data assimilation.
The differences between the simulation system and the actual system will continue to grow if we do not assimilate measurements from the actual system and dynamically adjust the simulation results. We propose a random finite set (RFS) based recursive algorithm for online assimilation of measurements in the presence of noise, data association uncertainty, detection uncertainty, and clutter. We model the simulation states and measurements as RFSs and use the RFS based Bayesian inference recursion to assimilate the real-time measurements as they arrive.
The paper is organized as follows. Section 2 reviews two strands of related work and highlights the gaps left by current algorithms. Section 3 presents the main RFS based models (the general data model, measurement model, and simulation model) which are the foundation of the proposed data assimilation algorithm. Section 4 introduces the agent based models and the combination of data assimilation and agents. Section 5 presents the detailed design and implementation of the RFS based data assimilation algorithm. Section 6 describes the experiment design and the results obtained from the experiments. Section 7 presents the conclusions.

Related Works
2.1. Maritime Piracy Simulation. Applications of simulation systems in the maritime transportation domain are especially scarce. The majority of existing works focus only on traffic in national and coastal waters and ports. They tend to use high level equation based models for modeling maritime piracy [8]. Some work has been done to optimize ship routing in the maritime domain, but it did not take the security aspect into consideration [9]. In recent years, some researchers have attempted to apply mathematical modeling and computer simulation to the study of maritime piracy. In [10], the authors use the discrete event simulator PANOPEA to model maritime pirates around the Gulf of Aden and evaluate the effectiveness of some command and control models. The limitation is that they consider only a specific scenario, and the method and models are not applicable to other scenarios.
In [11], the author employs intelligence reports, meteorological forecasts, and historical pirate incidents to describe the Next-generation Piracy Performance Surface (PPSN) model, which mainly focuses on prediction for the areas around the Horn of Africa. In [12], the authors introduce the Pirate Attack Risk Surface (PARS) model, which uses probabilistic models to further improve the PPSN model. Both are numerical models. Since they do not directly model the behaviors or interactions of individual ships, their ability to support what-if simulation and prediction is limited.
In [13], the authors use the agent based modeling framework MANA (Map-Aware, Non-Uniform Automata) to find the factors that are important for escort through the Gulf of Aden. MANA is used for defense related modeling and simulation by specifying the properties of agents. They model the escorting scenario on a tactical level by using MANA. In [14], the authors use the MANA framework to study the defense of vessels against pirate activities by employing nonlethal deterrents. They model single encounter behaviors in detail but do not take the whole system into consideration. A further limitation is that measurements of the real situation are not taken into consideration, which makes the approach impractical in applications.
Jakob et al. are the first to use agent based simulation to study international maritime transport security [15]. In [16], they describe and model the transportation system from a macro perspective. They describe the pirates and piracy countermeasures by using agent based models. They conduct the simulation with thousands of agents, each of which corresponds to a vessel. They can capture the dynamics of the real transportation system and evaluate the effectiveness of piracy countermeasures. However, they focus only on the whole maritime transportation system from a macro perspective and fail to support the decisions of the ship operator at a tactical level. In addition, the simulation is driven by offline data stored in a database instead of real-time measurements, so it cannot be used to support real-time decision-making.

Real dynamic mission environment
The agent based DDDS of maritime pirate activity studies the problem of a merchant vessel agent trying to cross or penetrate a piracy-infested area patrolled by one or more pirate vessel agents, assimilating online measurements into the simulation models as it does so. The merchant vessel agent in the simulation can choose its self-relief measures. The simulation system can be used to help minimize the chance of a hostile encounter or hijacking and to avoid detection and interception. It can therefore be used by merchant vessel operators or commanders to penetrate a piracy-infested area. The simulation can also be applied to related real-world problems, such as logistics in insecure regions, illegal border crossing, and smuggling interdiction. The schematic diagram of the simulation based decision support system for maritime counter-piracy is shown in Figure 1.
Many real-world problems would benefit from the ability to assimilate measurements and predict the state of complex systems in real time. Since many complex systems, such as counter-piracy operations, can take on very different states at critical decision points in the simulation, it is necessary for these simulation systems to support branching, either through the generation of separate Monte Carlo replications or through multiple hypotheses branching techniques that support multiple hypotheses within a single execution. Multiple hypotheses branching can also be used to explore what-if possibilities or alternative plans. Traditional estimation and prediction control theory is combined with optimistic simulation techniques to support the data assimilation of complex systems. A real-time simulation based decision support system for maritime counter-piracy projects the future through simulation. It uses real-time measurements to calibrate the current state and requires multiple hypothesis what-if branching capabilities to simultaneously explore the effectiveness of various decision options.
Figure 2 is the schematic diagram of data assimilation for agent based DDDS of maritime pirate activity. In the simulation based decision support system, the newly assimilated simulation states of the real system at time $k$ are obtained by optimally combining noisy and cluttered measurements with the previously predicted state. Then, the newly estimated state is extrapolated to time $k+1$ to form the next prediction of the system. The prediction/update cycle repeats as time evolves and new measurements become available. This iterative process is shown in Figure 2, where X represents the abstract state of the system. Accurate assimilation must balance the uncertainty of the available measurements against the unknown or unpredictable behaviors associated with the model.
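The prediction/update cycle can be illustrated with a deliberately simple stand-in. The sketch below assumes a scalar state with linear motion and Gaussian noise, which is not the RFS based algorithm of this paper; the function names and the parameters `drift`, `process_var`, and `meas_var` are hypothetical and chosen only for illustration.

```python
def predict(mean, var, drift, process_var):
    """Extrapolate the estimate one step ahead (the simulation step)."""
    return mean + drift, var + process_var


def update(mean, var, z, meas_var):
    """Blend the predicted state with one noisy measurement z."""
    gain = var / (var + meas_var)  # weight given to the measurement
    return mean + gain * (z - mean), (1.0 - gain) * var


def assimilate(measurements, mean0, var0, drift=1.0, process_var=0.5, meas_var=1.0):
    """Run the cycle once per incoming measurement and record each estimate."""
    mean, var = mean0, var0
    history = []
    for z in measurements:
        mean, var = predict(mean, var, drift, process_var)
        mean, var = update(mean, var, z, meas_var)
        history.append((mean, var))
    return history


history = assimilate([1.2, 1.9, 3.1], mean0=0.0, var0=1.0)
```

Each update shrinks the variance produced by the preceding prediction, which is the balance between measurement uncertainty and model uncertainty described above.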

DDDS and Data Assimilation.
Until now, conventional simulation systems have used static information as initial settings to simulate real systems. In dynamic maritime pirate activity modeling, static initial parameters fail to capture real-time environmental changes in a timely manner, so the simulation states are often quite different from the actual states of the real system. DDDS has a golden opportunity for development, because online measurements reflect the latest states of the real system. DDDS means dynamically assimilating online measurements while the simulation executes. This idea has been applied in research areas such as traffic management, disaster forecasting, environmental science, and wildfire [17,18]. The simulation system in the DDDS paradigm is continually adjusted by assimilating real-time measurements for much more accurate prediction of the studied systems [19]. The rapid development of network and measurement technologies contributes to the higher accuracy and better availability of online measurements, which has facilitated the development of DDDS.
DDDS depends heavily on data assimilation technology, which is in charge of assimilating online measurements into a running simulation system. Data assimilation has the potential to greatly improve the accuracy of simulation systems. As simulation models become increasingly complicated, traditional simulations will be largely decoupled from real systems if we continue to make little use of online measurements from the studied system. In data assimilation, the measurements are assimilated into the running simulation models to generate more accurate simulation results [20]. Data assimilation algorithms aim to minimize the differences between the simulated states and the real world. There are two main types of data assimilation algorithms: one is the constant statistical algorithms, which include 3DVAR [21] and 4DVAR [22], and the other is the adaptive statistical algorithms, such as the Kalman filter [23], ensemble Kalman filter (EnKF) [24], particle filter (PF) [25], and hierarchical Bayesian method [26]. The data assimilation algorithms that have been successfully used in simulation applications include 3DVAR, 4DVAR, Kalman filter, and PF based algorithms.
Figure 2: The relationship between data assimilation and simulation.
The basic principle of 3DVAR and 4DVAR is to avoid the explicit computation of the gain matrix, and the matrix inversion it involves, by minimizing a cost function instead. They use a suitable descent algorithm to evaluate the cost function iteratively. The limitation of variational methods is that, because they rely on the minimization of a cost function, they do not provide any estimate of predictive uncertainty. The Kalman filter and its variants [27,28] are important and conventional Bayesian estimation techniques used in data assimilation. Kalman filter based data assimilation algorithms are optimal for the assimilation of linear Gaussian systems. They require that the studied systems be modeled by linear equations and that the distributions of the process noise and observation noise be Gaussian. These constraints greatly restrict the applicability of Kalman filter based algorithms to systems that are nonlinear and non-Gaussian. In this paper, we extend the Kalman filter by using RFS based models and the unscented transformation.
These conventional data assimilation algorithms fail to support data assimilation in the simulation field because many simulation models are nonlinear and non-Gaussian. The sequential Monte Carlo method, also known as the particle filter (PF), is well suited to nonlinear and non-Gaussian applications. It has been the most widely used data assimilation algorithm in simulation applications in recent years [19]. PF based data assimilation in traffic simulation is presented by Xie et al. in [29] and Wu et al. in [30]. PF based data assimilation in wildfire simulation is presented by Xue et al. in [31,32]. They all use the same nonparametric statistical inference method based on the particle filter, because particle filter based data assimilation algorithms need no assumptions about the distribution or linearity of the studied simulation systems.
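As a point of reference for the discussion of PF based assimilation, the following is a minimal sketch of one bootstrap particle filter step (predict, weight, resample) for a single scalar state. It is a generic textbook form, not the implementation of [29-32]; the random-walk transition, the Gaussian likelihood, and all parameter values are assumptions made for illustration.

```python
import math
import random


def random_walk(x, rng, step_std=1.0):
    """Hypothetical transition model: the state drifts by Gaussian noise."""
    return x + rng.gauss(0.0, step_std)


def gaussian_likelihood(z, x, meas_std=2.0):
    """Unnormalized likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / meas_std) ** 2)


def bootstrap_pf_step(particles, z, transition, likelihood, rng):
    """One predict / weight / resample step for a single measurement z."""
    predicted = [transition(p, rng) for p in particles]
    weights = [likelihood(z, p) for p in predicted]
    if sum(weights) == 0.0:
        return predicted  # no particle explains z; keep the prediction
    # Multinomial resampling proportional to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 100.0) for _ in range(2000)]
for _ in range(5):
    particles = bootstrap_pf_step(particles, 50.0, random_walk, gaussian_likelihood, rng)
```

Note that the step takes exactly one measurement `z` per time step and a fixed number of single-state particles.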
PF based data assimilation algorithms are widely used because they handle nonlinear and non-Gaussian systems. However, they have the following essential disadvantages:
(i) They are based on the assumption that the studied systems are single dynamic systems that are permanently active. Moreover, they use presumed physical models of individual agents, and the transition of agents from one model to another is rarely supported.
(ii) They are based on the assumption that there is exactly one measurement at each time and that detection is perfect, with no false detections or misdetections. In other words, they assume that the sensors never fail to make a measurement of the state and never pick up false measurements.
(iii) They are based on the assumption that the measurements and the measurement models are always precise. They always model the measurements as points in the measurement space, and the number and ordering of measurements are designated in advance.
The standard PF based data assimilation algorithms cannot be used for data assimilation of dynamic systems that switch on and off randomly. Switching is quite common in maritime pirate activity, as when moving pirate vessels appear and vanish in the designated areas. The restriction that the studied system must be a single dynamic system limits the applicability of the PF based algorithms. In maritime pirate activity, multiple dynamic pirate vessels can appear and vanish during the measurement intervals. The measurement models used in PF based data assimilation algorithms are also unrealistic, since there are always false detections and misdetections in the real world, and the measurement functions cannot be known precisely. Nor can the PF based algorithms jointly estimate the number of pirates and the state of every pirate.
For the agent based simulation of maritime pirate activity, we employ RFS to assimilate online measurements into the simulation model. A major task for data assimilation is to dynamically update the simulation states of the real pirate vessels and then provide the updated states to the simulation model for follow-on simulations. The proposed RFS based data assimilation algorithm effectively overcomes the limitations of the standard PF based data assimilation algorithms.
The RFS theory proposed by Goodman et al. as finite set statistics (FISST) is a recently developed theory that unifies much of information fusion under a single Bayesian paradigm [33,34]. FISST is the first systematic treatment of multisensor multitarget filtering as part of a unified framework for data fusion using random set theory [35,36]. It is a means of directly extending the familiar methodology of single-sensor, single-target Bayes statistics to multisensor multitarget problems. FISST provides mathematical tools that enable direct application of Bayesian inference, including the concepts of integration and density for set-valued variables. More details can be found in [37].

RFS Based Models for Data Assimilation
The data assimilation algorithm based on RFS theory markedly widens the application fields of data assimilation. It supports both permanently active systems (i.e., present pirates) and on/off systems (i.e., appearing/vanishing pirates). It supports both single and multiple dynamic systems. It supports both perfect and imperfect detection. It can be applied to both precise and imprecise measurements. It can jointly estimate the number of agents and the state of each agent. We explain how the proposed method overcomes the three limitations of the traditional PF based algorithms in the context of the DDDS of maritime pirate activity.
The proposed data assimilation algorithm uses two kinds of system models to describe the real system: one is the Markov transition density based simulation model and the other is the measurement likelihood based measurement model. The simulation model describes the state transition of the studied system, and the measurement model describes the sensors. Both models are constructed on top of the general data model. The RFS based representation of data (including measurements and states) is the foundation of the RFS based data assimilation algorithm and the essential difference between the proposed algorithm and the traditional PF based algorithms.

RFS Based General Data Model.
The standard PF based data assimilation assumes precise measurements and measurement models. In reality, there is uncertainty in both. The uncertainty in the measurements can come from many sources, including uncertainty due to multiple agents, unreliability, false measurements, deception, evasive action, and unknown correlations. The details can be found in [33,37]. The challenges caused by the uncertainties in the data model have inhibited the maturation of DDDS as a unified, systematic, scientifically founded simulation paradigm. For example, the synthetic aperture radar (SAR) images used in maritime simulation are statistically uncharacterized and difficult to model. Natural-language statements or rules drawn from knowledge bases are widely used in the study of maritime pirate activity, but they are vague and cannot be processed statistically.
The limitations of the standard PF based data assimilation algorithms stem from the fact that they employ vectors to represent the states and measurements. The vector based representation cannot represent all occurrences of multiple agent states and, more importantly, does not admit a meaningful and mathematically consistent notion of estimation error. The commonly used vector based assimilation algorithm is summarized in Figure 3. It must assume that the dimension and the element ordering of the vectors are the same. It also needs operations outside the Bayesian recursion to ensure the consistency of the vectors used for assimilation. The measurement processing step is necessary for vector based algorithms. Newly observed and missed measurements are handled through vector augmentation or truncation, which is computationally intensive and irreversible. This is a further disadvantage of these algorithms, because it makes the computation complex and unstable. Because of the fundamental inconsistency of the vector based representation, stacking individual measurements into a large vector is not a satisfactory representation.
Here we use RFS to represent the multiple agent states and measurements, because an RFS can represent all possible occurrences of multiple agent states, and the distance between sets is a well understood concept. The RFS based data assimilation framework is summarized in Figure 4. It enables the joint estimation of errors in cardinality and assimilated states, because it jointly propagates the estimate of the number of agents and their states. Data association is handled automatically within the FISST framework, and consequently there is no need for separate data association and measurement management steps.
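To make the contrast between stacked vectors and sets concrete, the sketch below compares a Euclidean distance on stacked vectors with one classical set distance, the Hausdorff distance. The choice of the Hausdorff distance is an assumption made here for illustration; the text above does not name a particular set metric, and both function names are hypothetical.

```python
import math


def stacked_vector_distance(A, B):
    """Euclidean distance after stacking the points in their listed order."""
    flat_a = [c for point in A for c in point]
    flat_b = [c for point in B for c in point]
    return math.dist(flat_a, flat_b)  # raises ValueError if dimensions differ


def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (order-free)."""
    if not A and not B:
        return 0.0
    if not A or not B:
        return math.inf
    a_to_b = max(min(math.dist(a, b) for b in B) for a in A)
    b_to_a = max(min(math.dist(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


A = [(0.0, 0.0), (3.0, 4.0)]
B = [(3.0, 4.0), (0.0, 0.0)]  # the same two agents, listed in another order
```

Here `hausdorff(A, B)` is zero because the two sets are identical, while `stacked_vector_distance(A, B)` is not, and the stacked form cannot compare sets of different sizes at all.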
In the real maritime operational context, the number of measurements is a random variable, and there is no information on which measurement belongs to which state. Because the number of measurements is not fixed and their ordering is not relevant, the measurements are better modeled as an RFS. The RFS based data model generalizes the state space model for simulation to a more realistic situation in which the random number of agents and measurements, detection uncertainty, false alarms, and association uncertainty are all taken into consideration.
An RFS $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}$ is a finite set-valued random variable that can be described by a discrete probability distribution $\rho(n) = \Pr\{|\mathbf{X}| = n\}$ and a family of symmetric joint probability densities $p_n(\mathbf{x}_1, \ldots, \mathbf{x}_n)$ [33,38]. The joint probability densities conditioned on cardinality $n$ describe the probability distribution of its elements over the state space. The discrete probability distribution describes the cardinality of RFS $\mathbf{X}$ and is defined as follows:
$$\rho(n) = \frac{1}{n!} \int f(\{\mathbf{x}_1, \ldots, \mathbf{x}_n\}) \, d\mathbf{x}_1 \cdots d\mathbf{x}_n .$$
The relationship between the distribution of RFS $\mathbf{X}$ and its cardinality is
$$f(\{\mathbf{x}_1, \ldots, \mathbf{x}_n\}) = n! \, \rho(n) \, p_n(\mathbf{x}_1, \ldots, \mathbf{x}_n),$$
where $f(\mathbf{X})$ denotes the FISST probability density of $\mathbf{X}$.
We assimilate measurements with an RFS based formal measurement model that describes the sensors' behaviors and an RFS based formal simulation model that describes the pirates' behaviors. This means that we use the RFS formulation to treat the collection of individual pirate states as a random finite set-valued state and the collection of individual measurements as a random finite set-valued measurement.
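The two ingredients of an RFS, a cardinality distribution and a density for the elements given the cardinality, can be sketched as follows. The example assumes scalar states whose elements are independent and identically distributed given the cardinality, which is a special case chosen for brevity; `sample_rfs`, `fisst_density`, and the numerical values are hypothetical. The density uses the standard FISST relation $f(\{x_1, \ldots, x_n\}) = n!\,\rho(n)\,p_n(x_1, \ldots, x_n)$.

```python
import math
import random


def sample_rfs(cardinality_pmf, sample_element, rng):
    """Draw |X| from rho(n), then draw that many elements."""
    u, acc = rng.random(), 0.0
    n = len(cardinality_pmf) - 1  # fallback guards against rounding in the pmf
    for k, p in enumerate(cardinality_pmf):
        acc += p
        if u < acc:
            n = k
            break
    return [sample_element(rng) for _ in range(n)]


def fisst_density(points, cardinality_pmf, element_pdf):
    """f({x_1..x_n}) = n! * rho(n) * p_n(x_1..x_n) with i.i.d. elements."""
    n = len(points)
    if n >= len(cardinality_pmf):
        return 0.0  # cardinality outside the support of rho
    return math.factorial(n) * cardinality_pmf[n] * math.prod(element_pdf(x) for x in points)


rho = [0.2, 0.5, 0.3]  # Pr{|X| = 0}, Pr{|X| = 1}, Pr{|X| = 2}
rng = random.Random(0)
draws = [sample_rfs(rho, lambda r: r.uniform(0.0, 100.0), rng) for _ in range(2000)]
mean_cardinality = sum(len(d) for d in draws) / len(draws)
```

With this `rho` the expected cardinality is 0.5 + 2 × 0.3 = 1.1, and `mean_cardinality` should be close to that value.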

RFS Based Measurement Model.
We propose a general detection type measurement model based on the following assumptions, as shown in Figure 5:
(i) A single sensor with single-agent likelihood function $f_k(\mathbf{z} \mid \mathbf{x}) = f_{\mathbf{W}_k}(\mathbf{z} - h_k(\mathbf{x}))$ observes a scene involving an unknown number of pirate agents; here $h_k(\mathbf{x})$ is the deterministic generation of a measurement from state $\mathbf{x}$.
(ii) A single pirate agent with state $\mathbf{x}$ generates either a single measurement (an agent detection), occurring with probability $p_D(\mathbf{x})$, or no measurement at all (a missed detection), occurring with probability $1 - p_D(\mathbf{x})$.
(iii) The false alarm process $\mathbf{C}$ is Poisson-distributed in time with expected value $\lambda$ and distributed in space according to an arbitrary density $c(\mathbf{z})$, so that $\kappa_k(\mathbf{Z}_k) = e^{-\lambda} \lambda^{m} c(\mathbf{z}_1) \cdots c(\mathbf{z}_m)$ is the probability density that a set $\mathbf{Z}_k = \{\mathbf{z}_1, \ldots, \mathbf{z}_m\}$ of clutter observations will be generated.
For the given predicted multiple agent states $\mathbf{X}_k = \{\mathbf{x}_1, \ldots, \mathbf{x}_n\}$, the random measurement set collected by the sensor has the form $\mathbf{Z}_k = \Gamma_k(\mathbf{X}_k) \cup \mathbf{C}_k$, where $\Gamma_k(\mathbf{X}_k)$ is the agent detection set with the form $\Gamma_k(\mathbf{X}_k) = \Gamma_k(\mathbf{x}_1) \cup \cdots \cup \Gamma_k(\mathbf{x}_n)$; $\Gamma_k(\mathbf{x}_i)$ is the detection set for state $\mathbf{x}_i$; and $\mathbf{C}_k$ is Poisson with mean value $\lambda$ and spatial distribution $c(\mathbf{z})$. We assume that $\Gamma_k(\mathbf{x}_1), \ldots, \Gamma_k(\mathbf{x}_n), \mathbf{C}_k$ are statistically independent.
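The generation of one measurement set as the union of agent detections and clutter can be sketched as follows. The sketch assumes scalar positions, an identity measurement function with additive Gaussian noise, a constant detection probability, and clutter that is uniform over a region; all names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def sample_measurement_set(states, p_detect, meas_std, clutter_rate, region, rng):
    """Return one realization of the detections united with the clutter."""
    lo, hi = region
    # Each agent is detected independently with probability p_detect;
    # a detection is the true position plus Gaussian sensor noise.
    detections = [x + rng.gauss(0.0, meas_std) for x in states if rng.random() < p_detect]
    # Clutter: a Poisson number of false alarms, uniform over the region.
    clutter = [rng.uniform(lo, hi) for _ in range(sample_poisson(clutter_rate, rng))]
    measurements = detections + clutter
    rng.shuffle(measurements)  # the ordering carries no information
    return measurements


rng = random.Random(1)
pirates = [10.0, 40.0, 70.0]
Z = sample_measurement_set(pirates, 0.9, 1.0, 2.0, (0.0, 100.0), rng)
```

Under these assumptions the expected number of measurements is 3 × 0.9 + 2 = 4.7, and nothing in `Z` indicates which entries are detections and which are false alarms.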
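Any given agent can generate either a single measurement or no measurement at all. Consequently, $\Gamma_k(\mathbf{x}_i)$ must have the form $\Gamma_k(\mathbf{x}_i) = \emptyset^{p_D}(\mathbf{x}_i) \cap \{\mathbf{Z}_i\}$. Here, $\mathbf{Z}_i = h(\mathbf{x}_i, \mathbf{W}_i)$ is the sensor-noise model associated with the $i$th state $\mathbf{x}_i$, and $\emptyset^{p_D}$ is the discrete random subset of the baseline measurement space $\mathfrak{Z}_0$. Detection uncertainty is modeled as a Bernoulli RFS as follows:
$$\Gamma_k(\mathbf{x}_i) = \begin{cases} \emptyset, & \text{with probability } 1 - p_D(\mathbf{x}_i), \\ \{\mathbf{Z}_i\}, & \text{with probability } p_D(\mathbf{x}_i). \end{cases}$$
The likelihood $f_k(\mathbf{Z}_k \mid \mathbf{X}_k)$ of multiple agents describes the sensor measurement for multiple agents and characterizes the detections, clutter, and agent generated observations. For the state set $\mathbf{X}_k$, the probability density of receiving the measurement set $\mathbf{Z}_k$ is described by the true likelihood function. The true likelihood function for an arbitrary number of agents with missed detections and clutter is
$$f_k(\mathbf{Z}_k \mid \mathbf{X}_k) = e^{\lambda} \, \kappa_k(\mathbf{Z}_k) \, f_k(\emptyset \mid \mathbf{X}_k) \sum_{\theta} \prod_{i:\theta(i)>0} \frac{p_D(\mathbf{x}_i) \, f_k(\mathbf{z}_{\theta(i)} \mid \mathbf{x}_i)}{(1 - p_D(\mathbf{x}_i)) \, \lambda \, c(\mathbf{z}_{\theta(i)})};$$
here, $\kappa_k(\mathbf{Z}_k) = e^{-\lambda} \prod_{\mathbf{z} \in \mathbf{Z}_k} \lambda c(\mathbf{z})$ and $f_k(\emptyset \mid \mathbf{X}_k) = e^{-\lambda} \prod_{\mathbf{x} \in \mathbf{X}_k} (1 - p_D(\mathbf{x}))$; the summation is taken over all association hypotheses $\theta : \{1, \ldots, n\} \to \{0, 1, \ldots, m\}$.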
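For small scenes the multiple agent likelihood can be evaluated by brute-force enumeration of the association hypotheses. The sketch below assumes scalar states, a Gaussian single-agent likelihood, and a constant detection probability, and it follows the usual convention that no measurement is assigned to more than one agent. It evaluates the sum in expanded form, multiplying missed-detection, detection, and clutter factors directly, which is algebraically the same quantity but avoids dividing by $1 - p_D$. All names are hypothetical.

```python
import itertools
import math


def gaussian_pdf(z, mean, std):
    """Single-agent likelihood of measurement z given position mean."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def multi_agent_likelihood(Z, X, p_detect, meas_std, clutter_rate, clutter_pdf):
    """Sum over hypotheses theta: agent i -> measurement theta(i), 0 = missed."""
    n, m = len(X), len(Z)
    total = 0.0
    for theta in itertools.product(range(m + 1), repeat=n):
        assigned = [j for j in theta if j > 0]
        if len(assigned) != len(set(assigned)):
            continue  # one measurement cannot come from two agents
        term = 1.0
        for i, j in enumerate(theta):
            if j == 0:
                term *= 1.0 - p_detect  # agent i was missed
            else:
                term *= p_detect * gaussian_pdf(Z[j - 1], X[i], meas_std)
        for j in range(1, m + 1):
            if j not in assigned:
                term *= clutter_rate * clutter_pdf(Z[j - 1])  # explained as clutter
        total += term
    return math.exp(-clutter_rate) * total


def uniform_clutter(z):
    """Hypothetical clutter density: uniform over a region of length 100."""
    return 1.0 / 100.0


value = multi_agent_likelihood([10.5, 70.0], [10.0, 40.0], 0.9, 1.0, 2.0, uniform_clutter)
```

The number of hypotheses grows combinatorially with the number of agents and measurements, so this enumeration is only practical for very small scenes.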

RFS Based Simulation Model.
In the agent based simulation of maritime pirate activity, the number of agents can vary constantly because agents can appear and disappear. For example, completely new agents can enter a scene spontaneously, as when a pirate vessel or military vessel suddenly appears in the piracy zones. Existing agents can give rise to new agents through spawning, as when a pirate mothership acts as a floating platform from which smaller pirate vessels are launched, or when an apparently single agent at a distance resolves into multiple, closely spaced agents. Agents can likewise leave a scene, as when they disappear from the piracy zones or behind some occlusion, or they can be damaged or destroyed. All of these instances of multiple agent dynamics should be explicitly modeled by using RFS. If multiple agents are subject to birth, death, spawning, merging, and so on, the standard PF based assimilation algorithms are inapplicable, because they fail to accurately estimate the number of agents.
In DDDS of maritime pirate activity, we use the unified representation of pirates' states to get a complete description of the simulation model. We model the time evolution of the multiple agent states (e.g., the states of pirate vessels) by employing RFS. We take agent birth, death, spawning, and motion into consideration. The RFS based simulation model is based on the following assumptions, as shown in Figure 6:
(i) The likelihood that a single agent will have state vector $\mathbf{x}_k$ at time step $k$ if it had state vector $\mathbf{x}_{k-1}$ at time step $k-1$ is described by a Markov transition density $f_{k|k-1}(\mathbf{x}_k \mid \mathbf{x}_{k-1})$ corresponding to a single agent simulation model $\mathbf{x}_k = \varphi_{k-1}(\mathbf{x}_{k-1}, \mathbf{V}_{k-1})$. Hereafter this is assumed for notational simplicity to be additive: $\mathbf{x}_k = \varphi_{k-1}(\mathbf{x}_{k-1}) + \mathbf{V}_{k-1}$.
(ii) A single agent with state $\mathbf{x}_{k-1}$ at time step $k-1$ has a probability $p_S(\mathbf{x}_{k-1}) \overset{\text{abbr.}}{=} p_{S,k|k-1}(\mathbf{x}_{k-1})$ of surviving into time step $k$.
(iii) The probability density that a single agent with state $\mathbf{x}_{k-1}$ at time step $k-1$ will spawn a set $\mathbf{B}$ of new agents at time step $k$ is $b(\mathbf{B} \mid \mathbf{x}_{k-1}) \overset{\text{abbr.}}{=} b_{k|k-1}(\mathbf{B} \mid \mathbf{x}_{k-1})$, as when new pirate vessels launch from a mother ship.
(iv) The probability density that new agents with state set $\mathbf{B}_k$ will spontaneously appear at time step $k$ is $b(\mathbf{B}_k) \overset{\text{abbr.}}{=} b_{k|k-1}(\mathbf{B}_k)$, as when some pirate vessels ambush from concealment or enter the zone for the first time.
Figure 6: A summary of RFS based simulation model of pirate agents.
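The multiple agent Markov density $f_{k|k-1}(\mathbf{X}_k \mid \mathbf{X}_{k-1})$ characterizes the time evolution of the multiple agent states and takes agent births, deaths, and motions into consideration. Here we give the true multiple agent Markov density function $f_{k|k-1}(\mathbf{X}_k \mid \mathbf{X}_{k-1})$ for the proposed simulation model, where $\mathbf{X}_k = \{\mathbf{x}_1, \ldots, \mathbf{x}_{n_k}\}$ and $\mathbf{X}_{k-1} = \{\mathbf{x}'_1, \ldots, \mathbf{x}'_{n_{k-1}}\}$. The proposed representation of agents' states has the mathematical form
$$\mathbf{X}_k = \mathbf{S}_{k|k-1}(\mathbf{X}_{k-1}) \cup \mathbf{B}_{k|k-1}(\mathbf{X}_{k-1}) \cup \mathbf{B}_k,$$
where $\mathbf{X}_k$ is the predicted state set, $\mathbf{S}_{k|k-1}(\mathbf{X}_{k-1})$ are persisting agents that continue to chase the merchant vessel, $\mathbf{B}_{k|k-1}(\mathbf{X}_{k-1})$ are spawned agents that are launched from the mother ship, and $\mathbf{B}_k$ are spontaneous agents that ambush from concealment or enter the designated zone for the first time. $\mathbf{S}_{k|k-1}(\mathbf{X}_{k-1})$ has the form $\mathbf{S}_{k|k-1}(\mathbf{X}_{k-1}) = \mathbf{S}_{k|k-1}(\mathbf{x}'_1) \cup \cdots \cup \mathbf{S}_{k|k-1}(\mathbf{x}'_{n_{k-1}})$, and $\mathbf{B}_{k|k-1}(\mathbf{X}_{k-1}) = \mathbf{B}_{k|k-1}(\mathbf{x}'_1) \cup \cdots \cup \mathbf{B}_{k|k-1}(\mathbf{x}'_{n_{k-1}})$, where $\mathbf{B}_{k|k-1}(\mathbf{x}'_i)$ is spawned by $\mathbf{x}'_i$, which means that it is launched from the mother ship that has state $\mathbf{x}'_i$. We also assume that $\mathbf{S}_{k|k-1}(\mathbf{x}'_1), \ldots, \mathbf{S}_{k|k-1}(\mathbf{x}'_{n_{k-1}})$, $\mathbf{B}_{k|k-1}(\mathbf{x}'_1), \ldots, \mathbf{B}_{k|k-1}(\mathbf{x}'_{n_{k-1}})$, and $\mathbf{B}_k$ are statistically independent.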
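One step of this set-valued transition, the union of persisting, spawned, and spontaneously appearing agents, can be sketched as follows. The sketch assumes scalar positions, constant survival probability, Gaussian motion noise, a Poisson number of spawned agents placed near each parent, and a Poisson number of births uniform over a region; these distributions and all names are assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def step_agents(prev_states, p_survive, motion_std, spawn_rate, spawn_std,
                birth_rate, region, rng):
    """Return one realization of persisting, spawned, and newborn agents."""
    lo, hi = region
    # Persisting agents: survive with probability p_survive, then move.
    persisting = [x + rng.gauss(0.0, motion_std)
                  for x in prev_states if rng.random() < p_survive]
    # Spawned agents: each previous agent launches a Poisson number nearby.
    spawned = [x + rng.gauss(0.0, spawn_std)
               for x in prev_states
               for _ in range(sample_poisson(spawn_rate, rng))]
    # Spontaneous births: a Poisson number appearing anywhere in the region.
    births = [rng.uniform(lo, hi) for _ in range(sample_poisson(birth_rate, rng))]
    return persisting + spawned + births


rng = random.Random(3)
X_prev = [20.0, 50.0]
X_next = step_agents(X_prev, 0.8, 1.0, 0.3, 2.0, 0.5, (0.0, 100.0), rng)
```

With these values the expected number of agents after one step is 2 × 0.8 + 2 × 0.3 + 0.5 = 2.7, so the cardinality of the state set changes from step to step.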
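The RFS based simulation model is identical in mathematical form to the RFS based measurement model. The corresponding true multiple agent Markov density is
$$f_{k|k-1}(\mathbf{X}_k \mid \mathbf{X}_{k-1}) = e^{\mu(\mathbf{X}_{k-1})} \, f_B(\mathbf{X}_k) \, f_{k|k-1}(\emptyset \mid \mathbf{X}_{k-1}) \sum_{\theta} \prod_{i:\theta(i)>0} \frac{p_S(\mathbf{x}'_i) \, f_{k|k-1}(\mathbf{x}_{\theta(i)} \mid \mathbf{x}'_i)}{(1 - p_S(\mathbf{x}'_i)) \, \mu(\mathbf{X}_{k-1}) \, b(\mathbf{x}_{\theta(i)} \mid \mathbf{X}_{k-1})}.$$
The summation is taken over all association hypotheses $\theta : \{1, \ldots, n_{k-1}\} \to \{0, 1, \ldots, n_k\}$, where
$$f_B(\mathbf{X}_k) = e^{-\mu(\mathbf{X}_{k-1})} \prod_{\mathbf{x} \in \mathbf{X}_k} \mu(\mathbf{X}_{k-1}) \, b(\mathbf{x} \mid \mathbf{X}_{k-1}), \qquad f_{k|k-1}(\emptyset \mid \mathbf{X}_{k-1}) = e^{-\mu(\mathbf{X}_{k-1})} \prod_{\mathbf{x}' \in \mathbf{X}_{k-1}} (1 - p_S(\mathbf{x}')),$$
$$\mu(\mathbf{X}_{k-1}) = \mu_0 + \sum_{i=1}^{n_{k-1}} \mu(\mathbf{x}'_i), \qquad b(\mathbf{x} \mid \mathbf{X}_{k-1}) = \frac{\mu_0 \, b_0(\mathbf{x}) + \sum_{i=1}^{n_{k-1}} \mu(\mathbf{x}'_i) \, b(\mathbf{x} \mid \mathbf{x}'_i)}{\mu(\mathbf{X}_{k-1})}.$$
Here, $\mu_0$ is the expected number of spontaneously generated new agents and $b_0(\mathbf{x})$ is their physical distribution. Also, $\mu(\mathbf{x}'_i)$ is the expected number of new agents spawned by an agent with previous state $\mathbf{x}'_i$, and $b(\mathbf{x} \mid \mathbf{x}'_i)$ is their physical distribution.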
The calculation of the true Markov density is impractical, and the main difficulty of RFS based Bayesian inference for multiple agents is its computational complexity. To solve this problem, Mahler proposed the PHD (probability hypothesis density) filter [34]. The PHD $D_{k|k}(\mathbf{x} \mid \mathbf{Z}^k)$ of the posterior probability density $f_{k|k}(\mathbf{X}_k \mid \mathbf{Z}^k)$ is a density function defined on the single agent state $\mathbf{x} \in \mathbf{X}_0$ as follows:
$$D_{k|k}(\mathbf{x} \mid \mathbf{Z}^k) = \int f_{k|k}(\{\mathbf{x}\} \cup \mathbf{X} \mid \mathbf{Z}^k)\, \delta\mathbf{X}.$$
In point process theory, $D_{k|k}(\mathbf{x})$ is known as the intensity density [39]. It is not a probability density; $D_{k|k}(\mathbf{x})$ represents the density of the expected number of points at $\mathbf{x}$. Given any region $S$ of the single pirate agent state space $\mathbf{X}_0$, the integral $\int_S D_{k|k}(\mathbf{x})\, d\mathbf{x}$ is the expected number of pirate agents in $S$, which is a real number rather than an integer. In particular, if $S = \mathbf{X}_0$ is the entire state space, then $N_{k|k} \triangleq \int D_{k|k}(\mathbf{x})\, d\mathbf{x}$ is the total expected number of pirate agents in the scene.
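Here, we assume that the PHD of the RFS $\mathbf{B}_k$ for the newborn pirate agents is a Gaussian mixture defined as follows:
$$\gamma_k(\mathbf{x}) = \sum_{i=1}^{J_{\gamma,k}} w_{\gamma,k}^{(i)}\, \mathcal{N}\big(\mathbf{x}; \mathbf{m}_{\gamma,k}^{(i)}, \mathbf{P}_{\gamma,k}^{(i)}\big), \qquad (9)$$
where $w_{\gamma,k}^{(i)}$, $\mathbf{m}_{\gamma,k}^{(i)}$, and $\mathbf{P}_{\gamma,k}^{(i)}$ are the weights, means, and covariance matrices of the mixture birth PHD.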
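As an illustration of how a Gaussian mixture PHD can be handled numerically, the following minimal sketch evaluates a two-dimensional (position only) mixture and computes the expected number of agents, which for a Gaussian mixture equals the sum of the weights. The function names, the tuple layout of a component, and the birth locations are assumptions made for this example and are not taken from the paper.

```python
import math

def gaussian_2d(x, mean, cov):
    """Density of a 2-D Gaussian N(x; mean, cov) with cov = ((a, b), (b, d))."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) via the closed-form 2x2 inverse.
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def gm_phd(x, components):
    """Evaluate a Gaussian mixture PHD at x; components are (weight, mean, cov)."""
    return sum(w * gaussian_2d(x, m, P) for w, m, P in components)

def expected_number(components):
    """Integral of the PHD over the whole space, i.e. the sum of the weights."""
    return sum(w for w, _, _ in components)

# Hypothetical birth PHD with two components (values chosen for illustration only).
birth = [
    (0.1, (250.0, 250.0), ((100.0, 0.0), (0.0, 100.0))),
    (0.1, (-250.0, -250.0), ((100.0, 0.0), (0.0, 100.0))),
]
```

Because the PHD is an intensity rather than a probability density, `expected_number(birth)` is not required to equal one; here it is the expected number of newborn agents per time step.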

Agent Model of Vessels
Multiple agent based simulation models are well suited to characterizing complex systems because they can capture the properties of reasoning processes. This makes agent based modeling a natural choice for the simulation of maritime pirate activity: actual pirate activities involve many priorities, objectives, constraints, and complex scheduling dependencies, and agent based simulation has been applied successfully to problems of this kind.
Here, the agent based simulation models are used to represent the behaviors of the real pirate vessels. The pirate vessels are characterized by autonomous agents. These agent models describe the movement and interactions of different kinds of pirate vessels. If there are no interactions between the vessel agents, each vessel agent pursues its individual goals. Like the real pirate vessels, however, the pirate vessel agents can also interact with other kinds of vessels, and these interactions are difficult to capture with mathematical models or DEVS.
The data assimilation algorithm is implemented in the merchant vessel agent model. By assimilating the measurements from the real system, the merchant vessel agent can interact with the real system more conveniently and effectively. The interactions between the merchant agent and the real system are shown in Figure 7.

RFS Based Data Assimilation Algorithm
We propose a new RFS based data assimilation algorithm that combines the basic ideas of the Gaussian mixture approximation, the Cardinalized Probability Hypothesis Density (CPHD) filter, and the unscented Kalman filter [40]. Unlike the standard unscented Kalman filter, the proposed algorithm represents the states by an RFS rather than by a vector.
Let $p_k(\mathbf{X} \mid \mathbf{Z}_{1:k})$ denote the multiple agent posterior probability density. Since we have the RFS based measurement model and simulation model, we can use these two models to obtain the optimal Bayes filter. The optimal Bayesian inference is captured by the recursion as follows:
$$p_{k|k-1}(\mathbf{X}_k \mid \mathbf{Z}_{1:k-1}) = \int f_{k|k-1}(\mathbf{X}_k \mid \mathbf{X})\, p_{k-1}(\mathbf{X} \mid \mathbf{Z}_{1:k-1})\, \mu_s(d\mathbf{X}),$$
$$p_k(\mathbf{X}_k \mid \mathbf{Z}_{1:k}) = \frac{g_k(\mathbf{Z}_k \mid \mathbf{X}_k)\, p_{k|k-1}(\mathbf{X}_k \mid \mathbf{Z}_{1:k-1})}{\int g_k(\mathbf{Z}_k \mid \mathbf{X})\, p_{k|k-1}(\mathbf{X} \mid \mathbf{Z}_{1:k-1})\, \mu_s(d\mathbf{X})},$$
where $f_{k|k-1}$ is the multiple agent Markov density given by the simulation model, $g_k(\mathbf{Z}_k \mid \mathbf{X}_k)$ is the multiple agent likelihood given by the measurement model, and $\mu_s$ is the reference measure on all finite subsets of the state space [41].
The CPHD filter was introduced by Mahler in [42]; it propagates both the PHD and the discrete distribution of the number of pirates, and this distribution can be arbitrary. Let $p_{\Gamma,k}(n)$ denote the distribution of the number of new agents at time $k$. The proposed data assimilation algorithm propagates both the cardinality distribution $p_{k|k}(n)$ and the PHD $D_{k|k}(\mathbf{x})$.
The main idea of the proposed data assimilation algorithm is shown in Figure 8. It is a rollback based algorithm, in which the Gaussian components are recursively generated by a rollback based sampling step. Rollback has its natural meaning here: restoring the simulation to a previous state. The agent based simulation system, which runs in parallel with the real system, is used in the prediction step of the data assimilation algorithm, and the rollback strategy is applied while the simulation runs during this step. The update step is the central step of the data assimilation algorithm. It uses the predicted measurement statistics, the real-time measurements, and the predicted cardinality as inputs.
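The rollback idea can be sketched as follows, under the assumption that the simulator engine exposes a way to save and restore its full state. `ToySimulator`, its constant-velocity `step`, and `propagate_with_rollback` are hypothetical stand-ins written for this illustration; they are not the paper's simulator.

```python
import copy

class ToySimulator:
    """Hypothetical stand-in for the agent simulator engine (one vessel agent)."""

    def __init__(self, state):
        self.state = list(state)  # assumed layout: [x, y, vx, vy]
        self.time = 0

    def snapshot(self):
        """Save everything needed to restore the simulation later."""
        return copy.deepcopy((self.state, self.time))

    def rollback(self, snap):
        """Restore the simulation to a previously saved state."""
        self.state, self.time = copy.deepcopy(snap)

    def set_state(self, state):
        self.state = list(state)

    def step(self, dt=1.0):
        """Advance one time step with a toy constant-velocity motion."""
        x, y, vx, vy = self.state
        self.state = [x + dt * vx, y + dt * vy, vx, vy]
        self.time += 1

def propagate_with_rollback(sim, sample_states):
    """Run one simulation step from each sample state, rolling back in between."""
    snap = sim.snapshot()
    propagated = []
    for state in sample_states:
        sim.rollback(snap)      # return to time step k-1
        sim.set_state(state)    # inject the sample (e.g. a sigma point)
        sim.step()              # execute the simulation to time step k
        propagated.append(list(sim.state))
    sim.rollback(snap)          # leave the simulator where it started
    return propagated
```

The design choice illustrated here is that the simulator itself acts as the state transition function, so each sample is propagated through the same model that produces the forecast.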

Predict with RFS Based Simulation Model.
The predicted cardinality is obtained in two stages. First, the cardinality distribution of the survived agents, that is, the pirate agents that continue to attack the merchant agent, is computed from the cardinality distribution at time $k-1$ and the survival probability. Then the predicted distribution of the number of pirates is characterized by the convolution of the cardinality distribution of the survived agents with that of the new agents. Based on the RFS based simulation model of pirate agent movement, the predicted PHD is the sum of the PHD of the survived agents and the PHD of the RFS of new agents appearing between time $k-1$ and time $k$.
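The convolution structure of the cardinality prediction can be sketched as below. This is a minimal illustration assuming a state-independent survival probability and cardinality distributions truncated to finite lists; the function name and arguments are hypothetical.

```python
from math import comb

def predict_cardinality(prior, birth, p_survive):
    """Predict the cardinality distribution for the next time step.

    prior[l]  : probability of l agents at time k-1
    birth[b]  : probability of b new agents appearing at time k
    p_survive : survival probability (assumed state independent here)
    """
    n_max = len(prior) - 1
    # Survivors: each of l agents survives independently (binomial thinning).
    survived = [
        sum(
            comb(l, j) * p_survive ** j * (1.0 - p_survive) ** (l - j) * prior[l]
            for l in range(j, n_max + 1)
        )
        for j in range(n_max + 1)
    ]
    # Total number = survivors + births, so the distributions are convolved.
    predicted = [0.0] * (len(survived) + len(birth) - 1)
    for j, s in enumerate(survived):
        for b, p_b in enumerate(birth):
            predicted[j + b] += s * p_b
    return predicted
```

For example, with exactly two agents at time $k-1$, a survival probability of 0.5, and an equal chance of zero or one birth, the call `predict_cardinality([0.0, 0.0, 1.0], [0.5, 0.5], 0.5)` spreads the probability over zero to three agents.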
Since the single vessel agent simulation model is nonlinear, we use the unscented transformation to approximate the first two moments of the Markov transition density $f_{k|k-1}(\mathbf{x} \mid \mathbf{x}')$. The resulting Gaussian approximation can then be handled in the same way as a linear-Gaussian model.
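A minimal sketch of the unscented transformation for one Gaussian component is given below. It uses the basic symmetric sigma point set with a single scaling parameter `kappa`, which may differ from the exact variant in [43]; the function names and the additive process noise covariance argument are assumptions made for this example.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Return the 2n+1 symmetric sigma points and their weights."""
    n = mean.size
    # Lower-triangular square root of the scaled covariance.
    root = np.linalg.cholesky((n + kappa) * cov)
    points = [mean]
    for i in range(n):
        points.append(mean + root[:, i])
        points.append(mean - root[:, i])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return np.array(points), weights

def unscented_predict(mean, cov, transition, process_cov, kappa=1.0):
    """Approximate the predicted mean and covariance of one Gaussian component."""
    points, weights = sigma_points(mean, cov, kappa)
    # Push every sigma point through the (possibly nonlinear) simulation model.
    propagated = np.array([transition(p) for p in points])
    mean_pred = weights @ propagated
    diff = propagated - mean_pred
    cov_pred = (weights[:, None] * diff).T @ diff + process_cov
    return mean_pred, cov_pred
```

For a linear transition the transformation is exact, which gives a convenient sanity check before plugging in a nonlinear vessel model.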
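Under the assumption that the posterior PHD at time $k-1$ is a Gaussian mixture with components $\mathcal{N}(\mathbf{x}; \mathbf{m}_{k-1}^{(i)}, \mathbf{P}_{k-1}^{(i)})$, the predicted PHD $D_{k|k-1}(\mathbf{x})$ is also a Gaussian mixture, $D_{k|k-1}(\mathbf{x}) = D_{S,k|k-1}(\mathbf{x}) + \gamma_k(\mathbf{x})$, where $\gamma_k(\mathbf{x})$ is given as in (9) and the predicted PHD of the survived agents, $D_{S,k|k-1}(\mathbf{x})$, is a Gaussian mixture whose components have predicted means $\mathbf{m}_{S,k|k-1}^{(i)}$ and predicted covariances $\mathbf{P}_{S,k|k-1}^{(i)}$. For each Gaussian component, the predicted mean and covariance are computed by using the unscented transformation and the simulation model, in the following steps.
Step 1. Choose the set of sigma points and their corresponding weights, denoted by $\{\mathbf{x}_{k-1}^{(l)}, u^{(l)}\}_{l=0}^{L}$, to approximate the mean $\mu_{k-1}^{(i)}$ and covariance $\mathbf{C}_{k-1}^{(i)}$ as introduced in [43].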
Step 6. Calculate the predicted measurements and their corresponding statistics.
Step 7. Compute the Kalman gain and the posterior covariance matrix.

5.2. Update with Online Measurements.
Without measurements, the covariance matrix grows exponentially over time.
Assimilating the real-time measurements can be regarded as calibrating or estimating the real-time state of the simulation. The update step of the proposed data assimilation algorithm depends on Bayesian inference and yields the updated cardinality distribution $p_{k|k}(n)$. Under the assumption that the predicted PHD $D_{k|k-1}(\mathbf{x})$ at time $k$ is a Gaussian mixture with components $\mathcal{N}(\mathbf{x}; \mathbf{m}_{k|k-1}^{(i)}, \mathbf{P}_{k|k-1}^{(i)})$, the updated PHD $D_{k|k}(\mathbf{x})$ can also be represented by a Gaussian mixture. In the update equations, $\langle f_1, f_2 \rangle$ denotes the inner product, defined for sequences by $\langle f_1, f_2 \rangle = \sum_{n=0}^{\infty} f_1(n) f_2(n)$; $g_k(\cdot \mid \mathbf{x})$ is the single pirate agent measurement likelihood density at time $k$ based on state $\mathbf{x}$; and $\kappa_k(\cdot)$ is the PHD of clutter measurements at time $k$. The sequence $\Upsilon_k^{u}[D_{k|k-1}, \mathbf{Z}_k](n)$ is defined for $u \in \{0, 1\}$, and the term $p_{K,k}(n)$ in (21) is the distribution of the number of clutter measurements at time $k$. In the remaining terms, $e_j(\mathbf{Z})$ is the elementary symmetric function of order $j$ for the random finite set $\mathbf{Z}$; the details can be found in [37].
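The elementary symmetric functions that appear in the update can all be computed in one pass, since $e_0, \ldots, e_m$ are the coefficients of the polynomial $\prod_{i}(1 + z_i t)$. The sketch below is a generic illustration of that recursion; the function name is hypothetical and the input is taken to be a plain list of real numbers.

```python
def elementary_symmetric(values):
    """Return [e_0, e_1, ..., e_m] for a list of m real numbers.

    Uses the recursion e_j(new set) = e_j(old set) + v * e_{j-1}(old set),
    applied once for every value v that is added to the set.
    """
    e = [1.0]  # e_0 of the empty set is 1
    for v in values:
        previous = e + [0.0]  # pad so that previous[j] exists for j = len(e)
        e = [1.0] + [previous[j] + v * previous[j - 1] for j in range(1, len(previous))]
    return e
```

For the values 1, 2, and 3 this returns the coefficients 1, 6, 11, and 6, that is, the sum, the sum of pairwise products, and the product. The cost is quadratic in the number of values, which is cheaper than enumerating subsets.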

Experiments
6.1. Experimental Design. We use the identical-twin experiment introduced in [31, 32] to verify the proposed data assimilation algorithm in ideal situations. In the identical-twin experiment, a simulation whose results are considered as the real states is executed first, and the corresponding measurements are collected during the run. These measurements are treated as the real measurements, since they are collected from the real states. We then assimilate the measurements into DDDS by using the proposed data assimilation algorithm and check whether the assimilated results are close to the real system states.
We use three terms to represent the results of the experiments: real pirate movement, simulated pirate movement, and assimilated pirate movement. A real pirate movement is the simulated pirate movement from which the online measurements are obtained. A simulated pirate movement is the sequence of simulation states based on some erroneous parameters, for example, an imprecise turn rate; this reflects the fact that pirate movement simulation usually relies on parameters that are imprecise compared to the real pirate movement. Finally, an assimilated pirate movement is the sequence of data assimilation enhanced simulation states based on the same erroneous parameters as in the simulated pirate movement. The goal of these experiments is to verify that an assimilated pirate movement produces more accurate simulation states by assimilating online measurements from the real pirate movement, even though it uses the erroneous parameters of the simulated pirate movement.

Experiments on Single Pirate Vessel in Clutter.
The state space for the pirate vessel is four-dimensional, with two position coordinates in the usual $(x, y)$ Cartesian coordinate system and two velocity coordinates, $(\dot{x}, \dot{y})$. We establish the 2D horizontal turn motion model based on ship kinematics, using a widely used ship motion model [44]. In the dynamics equation, $(x, y)$, $\Omega$, and $\varphi$ are the ship position, the velocity vector turn rate, and the heading, respectively, with $\Omega_0 = \Omega / V_0$, and $T$ is the sampling interval.
The simulation model $\mathbf{x}_k = f_k(\mathbf{x}_{k-1}, \mathbf{V}_{k-1})$ of the vessel motion is nonlinear, where $f_k$ is the nonlinear state transition function and $\mathbf{V}_{k-1}$ is an independent zero-mean Gaussian process noise with covariance matrix $\mathbf{Q}_{k-1}$. The vessel simulation model can be reformulated so that the process noise enters through a gain matrix. The sensor measurement model in the experiments is a bearing and range model with additive measurement noise $\varepsilon_k \sim \mathcal{N}(\cdot; 0, \mathbf{R}_k)$, where $\mathbf{R}_k = \mathrm{diag}([\sigma_\theta^2, \sigma_r^2]^T)$; in the simulation experiments, we assume that $\sigma_\theta = 2(\pi/180)$ rad and $\sigma_r = 10$ m.
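A bearing and range measurement with the noise levels quoted above can be sketched as follows. The state layout `[x, y, vx, vy]`, the sensor located at the origin, and the convention of measuring the bearing from the y axis are assumptions made for this illustration; the paper's exact measurement equation may use a different convention.

```python
import math
import random

SIGMA_THETA = 2.0 * math.pi / 180.0  # bearing noise standard deviation in rad (from the text)
SIGMA_R = 10.0                       # range noise standard deviation in m (from the text)

def bearing_range(state):
    """Noise-free bearing (rad) and range (m) of a vessel seen from the origin."""
    x, y = state[0], state[1]
    bearing = math.atan2(x, y)  # assumed convention: angle measured from the y axis
    distance = math.hypot(x, y)
    return bearing, distance

def noisy_measurement(state, rng):
    """Add independent zero-mean Gaussian noise to the bearing and the range."""
    bearing, distance = bearing_range(state)
    return (
        bearing + rng.gauss(0.0, SIGMA_THETA),
        distance + rng.gauss(0.0, SIGMA_R),
    )
```

A call such as `noisy_measurement([300.0, 400.0, 0.0, 0.0], random.Random(1))` returns one noisy measurement of a vessel 500 m from the sensor. Clutter and missed detections, which the experiments also include, are not modeled in this sketch.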
In our experiments, the differences between a real pirate movement and a simulated one are caused by the imprecise turn rate $\dot{\varphi}$, which serves as the erroneous parameter. The real turn rate is $\pi/60 \approx 0.0524$ rad/s, with random perturbations added every second. The first experiment uses the real turn rate to run the agent based simulation system and collects the measurements at every time step. The second experiment introduces an error into the turn rate and keeps all other configurations exactly the same as in the first experiment; the erroneous turn rate is $0.0524 + 0.01/3 \approx 0.05573$ rad/s, with the same random perturbations as in the first experiment. The third experiment uses the same erroneous turn rate as the second experiment, with all other configurations unchanged, and additionally applies the proposed data assimilation algorithm to the measurements collected in the first experiment.
We assume that the merchant vessel is stationary at a particular location, which is taken as the origin of the Cartesian coordinate system. The pirate's states are expressed relative to this location. The states of the pirate vessel generated by the three experiments are shown in Figure 9. This set of experiments verifies the effectiveness of the data assimilation algorithm when the turn rate has errors. Figure 9 shows that the real pirate vessel movement and the simulated one deviate substantially because of the turn rate error: the simulated pirate states gradually drift away from the real pirate states. With the proposed data assimilation algorithm, the assimilated pirate states overcome this problem and match the real pirate states with only minor differences.
Figure 10(a) shows the PHD of positions and Figure 10(b) shows the contours of the marginal PHD in all time steps. In the initial phase, the uncertainty of the PHD is large due to the limited and imprecise prior information; as the simulation time evolves, the uncertainty gradually decreases. This happens because the knowledge about the PHD (related to the posterior probability density) accumulates over time, so the uncertainty about the means of the Gaussian mixture that represents the PHD decreases at the same time. This result shows that the proposed data assimilation algorithm can keep the uncertainty of the estimated results within a limited range and can cope with the uncertainty of the measurements, including clutter, false alarms, missed detections, and noise.
To capture the average performance, we run 1000 Monte Carlo trials for each experiment with the same setup. Figure 11 shows the Monte Carlo average of the mean squared error at every step for the assimilated pirate states and the simulated ones. The mean squared errors between the real pirate states and the assimilated ones are smaller than those between the real pirate states and the simulated ones. As time evolves, the mean squared errors of the simulated states gradually increase, while those of the assimilated states remain stable and bounded. This figure reflects the difference between using data assimilation and not using it.
At each time step, the data assimilation algorithm assimilates the measurements by using the RFS based Gaussian mixture. To show the data assimilation process more clearly, we analyze the result at a single time step, time step 57, and give the surface plot of the distribution of the PHD. Figure 12 shows that the assimilated states of the pirate vessel are the states having the largest PHD. Since there is only one pirate vessel in the experiment, there is one two-dimensional Gaussian distribution in the $(x, y)$ plane.
Figure 13 shows the real pirate vessel state with cluttered measurements, the simulated state, and the assimilated state at time step 57. The assimilated state is much closer to the real one than the simulated one, even though the measurements are heavily influenced by clutter and noise. We also plot the mean value of the measurements. Neither the mean value nor the individual measurements are closer to the real state than the assimilated one. This indicates that using the measurements directly, without data assimilation, is not adequate.

Experiments on Multiple Pirate Vessels.
Now consider a nonlinear bearing and range scenario with a total of 5 pirate vessels and a mother ship. The birth process follows a Poisson RFS with a Gaussian mixture PHD. As in Section 6.2, we carry out three experiments with the same measurement model and simulation model. The configurations of the three experiments are identical except for the turn rate. The real turn rates and the erroneous ones of all 5 pirate agents and the mother ship are shown in Table 1. To show that the proposed data assimilation algorithm can handle a varying number of agents, we make the number of pirate vessel agents vary by setting different start and stop times.
We assume that the merchant vessel is stationary at a particular location, which is taken as the origin of the Cartesian coordinate system. The pirate vessels' states are expressed relative to this location and are shown in Figure 14.
Figure 14 shows that, by assimilating the measurements, the assimilated pirate vessel states are closer to the real ones than the simulated ones. Figure 15 shows the PHD of the multiple pirate agents over time. In the initial phase, the uncertainty of the PHD is large due to the limited and imprecise prior information, and as the simulation time evolves, the uncertainty of the PHD gradually decreases. This phenomenon is consistent with the experiment for the single pirate agent. The result indicates that a varying number of agents does not affect the ability of the proposed algorithm to keep the uncertainty of the estimated results within a limited range and to cope with the uncertainty of the measurements.
Unlike the analysis of a single pirate vessel, the analysis of multiple pirate vessels cannot use the distance error directly, because the real and estimated state sets may contain different numbers of elements. Here we use the metric for sets proposed by Schuhmacher et al., named the optimal subpattern assignment (OSPA) metric [45], as the performance evaluation criterion.
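The OSPA distance between two finite sets of points can be sketched as follows. This minimal version finds the optimal assignment by brute force over permutations, which is adequate for the handful of vessels considered here but not for large sets; the cutoff `c` and order `p` defaults are illustrative assumptions, not the values used in the experiments.

```python
import itertools
import math

def ospa(X, Y, c=100.0, p=1.0):
    """OSPA distance of order p with cutoff c between two lists of points."""
    if len(X) > len(Y):
        X, Y = Y, X  # make X the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # both sets are empty
    # Best assignment of the m points in X to distinct points in Y,
    # with every base distance cut off at c.
    best = min(
        sum(min(c, math.dist(x, Y[j])) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Each of the n - m unassigned points is penalized with the cutoff c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)
```

The metric combines a localization error (the assigned distances) with a cardinality error (the penalty for unmatched points), so a missed or spurious vessel raises the distance even if the remaining estimates are exact.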
To capture the average performance, we also run 1000 Monte Carlo trials for each experiment with the same setup. Figure 16 shows the Monte Carlo average of the OSPA distance: the OSPA distance of the simulated states gradually increases, while the OSPA distance of the assimilated states stays within a smaller range. This result reflects the fact that the multiple agent simulation using data assimilation is far more accurate than the one not using it. Figure 17 shows the number of agents in the real experiment and the number of agents estimated in the assimilated experiment. Because the assimilated number of agents is estimated by rounding the mean of the updated cardinality, a few estimation errors occur, but in most time steps the assimilated numbers are consistent with the real numbers of agents. This figure shows that the proposed data assimilation algorithm can not only assimilate the simulation states of the agents but also estimate the varying number of agents in the simulation. It is difficult for the standard particle filter (PF) based data assimilation algorithms to estimate the number of agents at every time step.
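The rounding step described above can be illustrated with a short sketch that takes an updated cardinality distribution and returns the rounded mean together with the standard deviation. The function name and the list representation of the distribution are assumptions made for this example.

```python
import math

def cardinality_estimate(pmf):
    """Estimate the number of agents from a cardinality distribution.

    pmf[n] is the probability that there are n agents. Returns the rounded
    mean, the mean itself, and the standard deviation of the distribution.
    """
    mean = sum(n * p for n, p in enumerate(pmf))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(pmf))
    # Round half up explicitly; Python's built-in round() rounds halves to even.
    estimate = int(math.floor(mean + 0.5))
    return estimate, mean, math.sqrt(variance)
```

With the distribution `[0.0, 0.1, 0.2, 0.7]` the mean is 2.6, so the estimate is 3 agents. When the mean lies close to a half-integer, rounding can flip between neighboring values, which is one way the occasional estimation errors mentioned above can arise.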
To describe the working mechanism of the proposed data assimilation algorithm more clearly, we give the assimilated results at a single time step. Here, we analyze time step 48, at which all six pirate vessels are present. Figure 18 gives the surface plot of the distribution of the PHD: the assimilated states of the pirate vessels are those with the largest PHD, and the number of two-dimensional Gaussian distributions equals the number of pirate vessels. Figure 19 shows the real pirate vessels' states, the simulated ones, the assimilated ones, and the measurements, which include clutter and noise. The assimilated pirate vessels' states are closer to the real ones than the simulated ones, and they are not noticeably affected by the disorganized distribution of the measurements.

Figure 1: Agent and simulation based decision support system for maritime counter-piracy.

Figure 4: RFS based data assimilation for a single prediction/update cycle.

Figure 5: The major assumptions of RFS based measurement model.

Figure 7: Interactions between merchant vessel agent and real system.

Figure 8: Gaussian mixture based implementation of RFS based data assimilation algorithm.
For the $i$th Gaussian component, its predicted mean value $\mathbf{m}_{S,k|k-1}^{(i)}$ and predicted covariance $\mathbf{P}_{S,k|k-1}^{(i)}$ at time $k$ are computed by using the unscented transformation and the simulation model. Here, we assume that the mean of the $i$th Gaussian component is $\mu_{k-1}^{(i)}$ and its covariance is $\mathbf{C}_{k-1}^{(i)}$.

Figure 9: States of identical-twin experiments on single pirate vessel in clutter.

Figure 10: PHD of single pirate agent versus time: (a) surface plot of the distribution of PHD; (b) contours of the marginal PHD.

Figure 12: Surface plot of the distribution of assimilated PHD at time step 57.

Figure 13: The real state, assimilated state, simulated state, and corresponding measurements at time step 57.

Figure 14: States of identical-twin experiments on multiple pirates.

Figure 15: PHD of multiple pirate agents versus time: (a) surface plot of the distribution of PHD; (b) contours of the marginal PHD.

Table 1: Configurations of different pirate vessels.