Partition Learning for Multiagent Planning



Introduction
The advancement of computing technology has enabled the practical development of intelligent autonomous systems. Such systems can be used to perform difficult sensing tasks. One such task is to search for and track targets over large geographic areas. Much research has gone into this task, resulting in a standard approach that decomposes the problem into two steps:
(1) Target track estimation.
(2) Agent path optimization based on target track estimation.
Significant research has been accomplished for each of these steps. Target track estimation has largely been solved [1][2][3][4][5], and this paper proposes no new methods for it. Agent path optimization based on target track estimation has been solved for many scenarios. However, the general scenario in which the number of targets is unknown still requires more development.
The standard approach works particularly well when it can be assumed that there is a single target [6][7][8][9][10][11][12][13][14]. In many scenarios it also works well when there are multiple targets [15,16]. However, when there are multiple targets, methods following the standard approach start to require limiting assumptions on the problem. For example, many methods require that the geographic area be easily scanned so that there are frequent target detections throughout the area. When this can be assumed, it allows for simpler estimation, such as with Gaussian distributions, and consequently agent paths can be more readily optimized. However, these methods do not extend well when the geographic area is too large to scan quickly. When the geographic area is large, agents frequently do not detect targets. These no-detection events must be utilized to estimate the target tracks [6,10,12]. General recursive Bayesian estimation methods are required to accomplish estimation in this case, which results in a general, nonparametric estimate of the target tracks. Yet, because of this generality, it becomes very difficult to optimize sensor paths over the target track estimates as the number of targets increases.
The focus of this paper is then the problem of searching for and tracking an unknown number of possibly multiple targets in a large geographic area utilizing a team of autonomous sensing agents. As such, very few assumptions are placed on the problem. All that is assumed is that time-changing target track estimates are provided and shared by the team of autonomous agents [10,11,13,15,16]. The target track estimates are completely general and nonparametric, and the geographic area is too large to scan quickly.
A solution to this general problem is provided by proposing a new approach to autonomous search and tracking that inherently handles the case of an unknown number of targets in a general, nonparametric estimation setting. The performance of this new approach is then compared with the standard approach. The specific method used for comparison is direct optimization over the target estimation distribution [6,10,14].

Problem Structure: New Approach
In this paper a new approach is presented to perform autonomous search and tracking over large geographic areas. The union of such large geographic areas will be referred to as the surveillance area $S$ in this paper. This new approach is designed to inherently handle the case of the number of targets being unknown and target track estimation being general and nonparametric. Figure 1 provides an overview of this approach. Notice that instead of the standard two-step approach there are three steps. These three steps are (1) target density estimation [17], (2) partition learning based on target density estimation, (3) agent path planning based on partitions.
With this decomposition of the problem, separate subproblems are defined for each step. Each step will be described in the following sections. Two of these steps, target density estimation and agent path planning based on partitions, leverage existing work by extending existing methods to fit the structure of the problem presented in this paper. The required extensions will be presented in this paper. Partition learning requires further development. Consequently, most of the discussion in this paper will focus on the partition learning step. Also note that these steps are repeated at each time instance. As the target estimation changes with time, so do the partitions and the agent paths.

Step 1: Target Density Estimation
The first step in the problem decomposition is target density estimation [18]. Recall that the first step of the standard approach is target track estimation. Target track estimation is not identical to target density estimation. In order to understand the difference, consider a search and tracking application that estimates the position of $N$ targets utilizing the standard approach. The first step of the standard approach is target track estimation. Assume that target positions are within some region $S$ of the plane ($\mathbb{R}^2$). Call this region the surveillance area. The estimation space is then $N \times S$. The dimension of this space is potentially very high. In order to estimate the position of the targets, a probability distribution is determined which is defined over this high-dimensional space. The second step of the standard approach is then agent path planning by optimizing over this high-dimensional space. However, from the perspective of agent path planning it would be beneficial to optimize paths over the significantly lower dimensional space $S$ instead of $N \times S$. This is the purpose of the target density distribution.
The target density distribution captures the complexity of the high-dimensional target track estimation space and maps it to the single planar space of the surveillance area. This process is depicted in Figure 2. This figure shows multiple single-target distributions combining to form a single target density distribution. A sample target density distribution, defined over a planar surveillance area, is provided in Figure 3. In this figure the target density distribution is represented by contour lines. Red lines have high density and blue lines have low density. The details of this distribution's shape are not important. What is important is that a target density distribution captures the complexity of combining the high-dimensional target track estimation.
The space of the target density distribution is the surveillance area $S$. Now consider some subspace $A \subset S$ of the surveillance area. The target density can be defined as a distribution $f(x)$ over the surveillance area that satisfies

$$EN_A = \int_A f(x)\, d\mu(x), \qquad (1)$$

where $N_A$ is the number of targets in the subspace $A$ and $\mu(x)$ denotes the measure with respect to which the integral is performed. For example, if the space is discretized then the integral represents a summation. Notice that the target density distribution provides the estimated number of targets within regions of the surveillance area. The expected total number of targets in the surveillance area is then

$$EN = \int_S f(x)\, d\mu(x). \qquad (2)$$

The target density distribution can be computed from target track estimation. For example, consider the case when the positions of $N$ targets are estimated independently. Target track estimation then provides a set of target distributions $\{P(X_1), \ldots, P(X_N)\}$. The target density distribution can be computed as $f(x) = \sum_{i=1}^{N} P(X_i = x)$. Consequently there is no need to develop new methods for estimating the target density distribution. Instead, the vast body of target track estimation work can be leveraged.
However, it is not necessary to obtain the target density distribution from target track estimation. Instead, it can be estimated directly from sensor observations. One approach [5] that accomplishes this utilizes random set theory [19] to obtain an approximate target density distribution.
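As a minimal sketch of the computation described above, assume the surveillance area is discretized on a small grid and each independently estimated target is given as a probability map over that grid. The function names `target_density` and `expected_targets` and the two example maps are hypothetical and chosen only for illustration.

```python
def target_density(single_target_maps):
    """Sum per-target probability maps cell by cell to obtain the target density."""
    rows, cols = len(single_target_maps[0]), len(single_target_maps[0][0])
    return [[sum(p[r][c] for p in single_target_maps) for c in range(cols)]
            for r in range(rows)]


def expected_targets(density, region=None):
    """Expected number of targets in a region (iterable of (row, col) cells).

    With region=None the whole grid is used, which gives the expected total.
    On a discretized area the integral of the density becomes a plain sum.
    """
    if region is None:
        return sum(sum(row) for row in density)
    return sum(density[r][c] for r, c in region)


# Two hypothetical targets on a 2x3 grid; each map sums to one.
p1 = [[0.5, 0.25, 0.0], [0.25, 0.0, 0.0]]
p2 = [[0.0, 0.0, 0.25], [0.0, 0.25, 0.5]]

f = target_density([p1, p2])
total = expected_targets(f)                         # expected total number of targets
left_column = expected_targets(f, [(0, 0), (1, 0)])  # expected targets in a subregion
```

Because each single-target map sums to one, the density sums to the number of targets, while the sum over a subregion gives the expected number of targets in that subregion.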

Step 2: Partition Learning
Recall that the information contained in a target density distribution can be very complex. This is because it combines the information of all target track estimates and maps them onto the surveillance area. To aid in distributing agents across the surveillance area, the complexity of the target density distribution can be used to partition the surveillance area into disjoint regions. For example, recall the sample target density distribution depicted in Figure 3. One possible set of partitioned regions is depicted in Figure 4. Note that in this figure the target density distribution is represented by contour lines and the boundaries of the partitions are represented by thick, straight lines. The particular choice of partitions displayed is of little significance. What is significant is that the partitions are determined based on the information content provided by the target density distribution. Instead of partitioning the surveillance area into arbitrary regions, the approach taken in this paper is to partition the surveillance area into regions that correspond to different types of information content. Partitioning the surveillance area into these types of regions gives agent path planning algorithms the flexibility to switch between modes of surveillance exploration, target search, and target tracking. These modes are depicted in Figure 5. In order to compute each of these partitions a different classifier is designed. These partition classifiers, along with their corresponding modes, are described further below.
The combination of these partition classifiers constructs the overall partition learning classifier. The structure of this classifier is presented in Figure 6. The general steps are to classify (1) the null target partition, (2) the exploration partition, and (3) a set of search and tracking partitions.
Before describing the details of these steps, the computational flow of the overall classifier can be understood by considering a set $\Gamma$ and how it changes as it moves through the classifier. Let $\Gamma$ be the set of all points in the surveillance area. As such, the target density distribution is defined over $\Gamma$. The flow of the classifier can then be understood as follows.
(1) The null target partition $S_{\text{null}}$ is classified and removed from $\Gamma$ ($\Gamma \leftarrow \Gamma \setminus S_{\text{null}}$) in block (1) of Figure 6.
(2) $\Gamma$ is now a subset of the surveillance area. Within $\Gamma$, the exploration partition $S_{\text{explore}}$ is classified and then removed from $\Gamma$ ($\Gamma \leftarrow \Gamma \setminus S_{\text{explore}}$) in block (2) of Figure 6.
(3) $\Gamma$ now consists of only the subset of the surveillance area that will be partitioned into search and tracking partitions. In block (3) of Figure 6 an ordered set of search and tracking partitions is classified within $\Gamma$.
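The flow of the working set through the three classifier blocks can be sketched as plain set operations. This is an illustrative sketch only: the function `partition_flow`, the membership tests `is_null` and `is_explore`, and the `cluster` callable are hypothetical stand-ins for the classifiers described in the following sections.

```python
def partition_flow(surveillance_area, is_null, is_explore, cluster):
    """Run the three classifier blocks on a working set gamma."""
    gamma = set(surveillance_area)            # gamma starts as the whole area S
    s_null = {x for x in gamma if is_null(x)}
    gamma -= s_null                           # block (1): remove the null target partition
    s_explore = {x for x in gamma if is_explore(x)}
    gamma -= s_explore                        # block (2): remove the exploration partition
    return s_null, s_explore, cluster(gamma)  # block (3): partition what is left


# Toy one-dimensional area of ten cells with made-up membership rules.
s_null, s_explore, parts = partition_flow(
    range(10),
    is_null=lambda x: x < 3,
    is_explore=lambda x: x < 6,
    cluster=lambda g: [sorted(x for x in g if x % 2 == 0),
                       sorted(x for x in g if x % 2 == 1)],
)
```

Note that the exploration test is only ever applied to points that survived the null step, which is why cells 0 to 2 do not appear in the exploration partition even though the toy rule would accept them.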
Each of these partition learning classifier steps will now be discussed in more detail.

Partitioning Step 1: Null Target Partition.
Over time some regions will be repeatedly observed. Many of the observed regions will never have targets detected. Depending on the anticipated target mobility, it may be concluded that no targets exist in these areas. It is necessary to maintain a partition that classifies regions in which no targets exist. These regions form the null target partition. The first step of partition learning is to classify this null target partition. The importance of this partition is seen by considering two scenarios. The first is when there is no target in the surveillance area. At some point in time the conclusion should be reached that there is no target. The second scenario is when there is a vast exploration partition. As regions become fully observed without any target being detected, these observed regions should cease to be explored. The method of classification for this step of partition learning will now be described. The features used for classification are described first, followed by the classifier.

Features.
Low values of target density define the null target partition. The only feature required to classify this partition is then simply the value of the target density distribution, which was defined previously in (1).

Classifier.
Recall that $\Gamma$ represents the set over which the classifier operates. As such, $\Gamma$ is initially the entire surveillance area ($\Gamma \leftarrow S$). The first step (block (1) of Figure 6) of partition learning is to determine the null target partition $S_{\text{null}}$ and remove it from $\Gamma$ ($\Gamma \leftarrow \Gamma \setminus S_{\text{null}}$). This step of the classifier is visualized in Figure 7 for a simple one-dimensional target density distribution.
To determine which points in $\Gamma$ belong to the null target partition, a target nullity threshold $\epsilon_{\text{null}}$ is required. This threshold specifies the value of target density below which it is assumed no targets exist. With this target nullity threshold given, all points in the surveillance area that correspond to regions of essentially no targets can then be defined by

$$S_{\text{null}} = \{x \in \Gamma : f(x) < \epsilon_{\text{null}}\}. \qquad (3)$$

The set of points in $S_{\text{null}}$ then forms the null target partition. This set of points is removed from $\Gamma$ and the classifier continues by classifying the exploration partition.
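The thresholding step can be illustrated with a short sketch on a one-dimensional, discretized density. The name `eps_null` for the nullity threshold, the function `split_null_partition`, and the density values are assumptions made for this example.

```python
def split_null_partition(density, gamma, eps_null):
    """Return (S_null, remaining gamma) for a density given as {cell: value}."""
    s_null = {x for x in gamma if density[x] < eps_null}
    return s_null, gamma - s_null


# Hypothetical one-dimensional density over five cells.
density = {0: 0.001, 1: 0.02, 2: 0.3, 3: 0.0, 4: 0.05}
gamma = set(density)  # gamma starts as the whole surveillance area

s_null, gamma = split_null_partition(density, gamma, eps_null=0.01)
```

Cells 0 and 3 fall below the threshold and form the null target partition; the remaining cells are passed on to the exploration step.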

Partitioning Step 2: Exploration Partition.
The second step of partition learning is to classify the exploration partition. The exploration partition consists of areas within the surveillance area that have low-to-no information bias. For example, a region over which a uniform probability distribution is defined would be included in the exploration partition. Because there is no bias in information, an exploration-oriented mode of path planning may be preferred for these regions [20,21]. To allow this exploration-oriented mode of path planning, these types of regions are classified into a separate partition. The method of classification for this step of partition learning will now be described. The features used for classification are described first, followed by the classifier.

Features.
In order to classify points in $\Gamma$ into the exploration partition, two features are required. These features are (1) local uncertainty, (2) target density.
Local uncertainty is used to determine regions of locally uniform value. Initially it may appear that only local uncertainty is required to define the exploration partition completely. However, there is a subtle aspect that requires the addition of target density in order to capture the entire exploration partition.
This subtle aspect can be understood by considering the case when the entire surveillance area is initially uniformly distributed. An agent makes imperfect no-detection observations. As such, some regions will have low target density (regions that have been observed well), completely unobserved regions will have an unchanged uniform value, and others will have a value somewhere in between the low value and the unchanged uniform value (due to poor observation in these areas). These in-between-valued areas will not have locally uniform value, yet will still belong to the exploration partition. This case suggests that target density, in addition to local uncertainty, is required to catch the complete exploration partition.
Before defining local uncertainty a definition for local area is required. The local area $S_r(x_0) \subseteq S$ of some point $x_0 \in S$ in the surveillance area (where $S$ is the space of the surveillance area) is defined as

$$S_r(x_0) = \{x \in S : d(x, x_0) \le r\}, \qquad (4)$$

where $d(x, x_0)$ is some measure of distance.
Local uncertainty is defined by first selecting a measure for uncertainty. In this paper local uncertainty is based on entropy. As such, local uncertainty is computed by evaluating the local entropy defined as

$$H_r(x_0) = -\int_{S_r(x_0)} f_r(x, x_0) \log f_r(x, x_0)\, d\mu(x), \qquad (5)$$

where $f_r(x, x_0)$ is the locally normalized target density function defined by

$$f_r(x, x_0) = \frac{f(x)}{\int_{S_r(x_0)} f(x')\, d\mu(x')}. \qquad (6)$$

Note that if local uncertainty were not defined on a locally normalized target density distribution there would not be a well-established maximum value for local uncertainty.
A locally normalized density is then required so that the maximum value for local uncertainty can be referenced during classification. This maximum value allows the classifier to determine if a particular region contains significantly biased information of target density. To aid in understanding local uncertainty, Figure 8 presents the computation of local entropy over a surveillance area when the underlying target density distribution is a simple Gaussian probability distribution centered in the middle of the surveillance area. In this figure, local uncertainty is represented by shade value, where white is high value and black is low value.

Classifier.
At this point of the classifier $\Gamma$ consists of a subset of the surveillance area defined by $\Gamma := S \setminus S_{\text{null}}$. In this step (block (2) of Figure 6) of partition learning the exploration partition $S_{\text{explore}}$ is classified and removed from $\Gamma$ ($\Gamma \leftarrow \Gamma \setminus S_{\text{explore}}$). Define this resulting state of $\Gamma$ to be the search partition (or search set) $S_{\text{search}}$. The process of this stage of the classifier is visualized in Figure 9 for a simple one-dimensional target density distribution.
To determine which regions in $\Gamma$ are approximately locally uniform, a new set $S_{HE}$, called the high-entropy set, is computed. The high-entropy set is defined as

[equation (7) missing in source]

where $H_r(x)$ is the local entropy feature as defined in (5) and $H_{\max}$ is the maximum local entropy possible, defined as

$$H_{\max} = \log |S_r|, \qquad (8)$$

where $|S_r| := \int_{x \in S_r} d\mu(x)$. Take, for example, the case when the target density distribution is discretized on a fixed grid defined over the surveillance area. Recall the definition of $S_r$ in (4). In (4), note that in order to define $S_r$ it is required to define a measure of distance $d(x, x_0)$ between two points $x$ and $x_0$ in the surveillance area. For a fixed grid, this distance is defined as

[equation (9) missing in source]

where the numbers 1 and 2 specify the indices of the points.
Then, according to the definition of $S_r$, the maximum local entropy is $H_{\max} = \log N^2$, where $N$ is the number of rows/columns in the square local area $S_r$.
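The local entropy feature and its maximum can be checked numerically on a small grid. The sketch below assumes that the local area of a cell is the square block of cells within `radius` index steps along each axis (so that N = 2 * radius + 1), which is consistent with the square local area mentioned above but is otherwise an assumption; the function names are hypothetical.

```python
import math


def local_area(x0, radius, shape):
    """Cells within `radius` index steps of x0 along each axis, clipped to the grid."""
    r0, c0 = x0
    return [(r, c)
            for r in range(max(0, r0 - radius), min(shape[0], r0 + radius + 1))
            for c in range(max(0, c0 - radius), min(shape[1], c0 + radius + 1))]


def local_entropy(density, x0, radius):
    """Entropy of the density after normalizing it over the local area of x0."""
    shape = (len(density), len(density[0]))
    cells = local_area(x0, radius, shape)
    mass = sum(density[r][c] for r, c in cells)
    if mass == 0.0:
        return 0.0  # assumption: an empty neighbourhood is reported as zero entropy
    h = 0.0
    for r, c in cells:
        p = density[r][c] / mass  # locally normalized density
        if p > 0.0:
            h -= p * math.log(p)
    return h


uniform = [[0.1] * 5 for _ in range(5)]   # locally uniform density
peaked = [[0.0] * 5 for _ in range(5)]    # all mass in one cell
peaked[2][2] = 1.0

h_max = math.log(3 ** 2)                  # radius 1 gives a 3x3 local area, N = 3
h_uniform = local_entropy(uniform, (2, 2), 1)
h_peaked = local_entropy(peaked, (2, 2), 1)
```

A locally uniform density attains the maximum log N^2, while a density concentrated in a single cell has zero local entropy; the classifier compares values between these two extremes.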
To catch regions for which the subtlety mentioned above applies, another set is computed. This set is called the low-density set $S_{NI}$. The definition of this set is simple; however, it requires explanation. At the beginning of a surveillance task an initial target density (or prior distribution) is constructed that expresses prior belief in possible target locations. At a minimum this prior consists of two pieces. These pieces are (1) a prior distribution $f_{\text{prior}}(x)$ of previously known target positions, (2) an estimated number of additional targets $EN_{\text{additional}}$ that may exist in the surveillance area.
The prior target density distribution $f_{\text{prior}}(x)$ provides a target density bias based on where targets have most recently been observed and where they might be now. For example, $f_{\text{prior}}$ may consist of a summation of Gaussian distributions, with each Gaussian representing the possible location of a particular target whose position was once known or whose position is simply guessed. The estimated number of additional targets $EN_{\text{additional}}$ affects the initial target density distribution by defining a uniform target density distribution

$$U = \frac{EN_{\text{additional}}}{|S|}, \qquad |S| := \int_{x \in S} d\mu(x). \qquad (10)$$

The resultant prior target density distribution is then

$$f_{\text{prior}}(x) + U. \qquad (11)$$

Note that some regions of the prior target density distribution will have a characteristic uniform value $U$. This characteristic low-information value can then be used to define the low-density set. The regions that must be captured in the low-density set are those regions that have target density in between the locally uniform density and the null target threshold. Yet, because $\Gamma = S \setminus S_{\text{null}}$ at this point of the classifier, the low-density set can be defined simply as

$$S_{NI} = \{x \in \Gamma : f(x) < U\}. \qquad (12)$$

Then, combining the low-density set with the high-entropy set, the exploration partition is defined as

$$S_{\text{explore}} = S_{HE} \cup S_{NI}. \qquad (13)$$

$S_{\text{explore}}$ can then be removed from $\Gamma$ as $\Gamma \leftarrow \Gamma \setminus S_{\text{explore}}$. Let this state of $\Gamma$ be called the search partition (or search set) $S_{\text{search}}$. $S_{\text{search}}$ is then defined as

$$S_{\text{search}} = S \setminus \{S_{\text{null}} \cup S_{\text{explore}}\}. \qquad (14)$$

After removing the exploration partition from $\Gamma$, the classifier continues on to classifying the search and tracking partitions. In terms of information content, the opposite type of region to the exploration partition is a tracking partition. Tracking partitions are small spatially and are partitions in which there is strong bias of target density and high certainty.
These are partitions in which targets have been detected consistently. Consequently, tracking partitions define locations of known targets. These partitions must continue to be tracked according to the mobility of the tracked targets. The search strategy then becomes that of keeping the known positions of the targets under observation. For example, the points of maximum density within tracking partitions are kept under observation. The search strategy for tracking partitions is then the most constraining on agent motion. For example, if an agent is a fixed-wing aircraft it will have to fly orbit-like paths encircling the known position of the target [14,[22][23][24][25].
Similar to tracking partitions are search partitions. The similarity is that both consist of an information set that provides a bias to aid in optimizing search plans. However, search partitions differ from tracking partitions in that the information content is not very certain. Consequently, not much can be said about exactly where a target may be located. However, there is bias over which parts of the regions have a high possibility of target existence. It then becomes the duty of a search plan to optimize paths based on the information content in order to yield a new distribution with higher certainty. The search strategy then becomes to maximize some type of information gain, and a searcher's paths are guided to improve the information content in order to ultimately observe a target [6-8, 12, 26, 27].
The method of classification for this step of partition learning will now be described. The features used for classification are described first, followed by the classifier.

Features.
In order to classify points in $\Gamma$ into search and tracking partitions, two features are required. These features are (1) normalized position within the surveillance area, (2) local expected number of targets.
Normalized position is computed for some point $x \in S$ by dividing its coordinates by the size of the surveillance area $S$. Local expected number of targets comes readily from the target density distribution. To understand this, recall the definition of the target density distribution. From this definition it is apparent that the expected number of targets within some region $A$ is just the integral of the target density distribution over that region. Recalling the definition of local area, local expected number of targets is then computed by integrating the target density distribution over the local area as

$$\int_{S_r(x_0)} f(x)\, d\mu(x). \qquad (15)$$

It can be observed that local expected number of targets acts as a smoothing filter over the target density distribution.
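The two features can be computed on a grid as in the following sketch. It assumes a square local area of `radius` cells around each point and normalizes a cell index by the number of rows and columns; both choices, along with the function names, are assumptions made for illustration.

```python
def local_expected_targets(density, x0, radius):
    """Sum of the density over the square local area around x0 (clipped to the grid)."""
    rows, cols = len(density), len(density[0])
    r0, c0 = x0
    return sum(density[r][c]
               for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
               for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)))


def features(density, x0, radius):
    """Feature vector: (normalized row, normalized column, local expected targets)."""
    rows, cols = len(density), len(density[0])
    return (x0[0] / rows, x0[1] / cols, local_expected_targets(density, x0, radius))


density = [[0.0] * 4 for _ in range(4)]
density[1][1] = 1.0   # a single sharp spike holding one expected target

near = features(density, (2, 2), 1)   # the spike lies inside this local area
far = features(density, (3, 3), 1)    # the spike lies outside this local area
```

The cell at (2, 2) holds no density itself but still reports one expected target locally, which illustrates the smoothing effect of the feature.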

Classifier.
At this point in partition learning (block (3) of Figure 6) the set on which classification operates is $\Gamma = S_{\text{search}} = S \setminus \{S_{\text{null}} \cup S_{\text{explore}}\}$. $\Gamma$ then consists of regions in the surveillance area with some level of biased information of possible target locations. In this step of partition learning $\Gamma$ is partitioned into a set of search and tracking partitions. This step cannot be performed by a simple set computation as was done for the previous two steps of partition learning. Instead, points in $\Gamma$ are clustered according to the target density feature space. To help visualize the action that occurs at this stage of the classifier, observe Figure 11.
This figure shows how a set of points in the target density distribution feature space is convexly partitioned. There are many convex clustering methods [28] that would work for this level of the classifier. The approach taken in this paper is to perform classification by utilizing both K-means and Gaussian Mixture Model EM [28]. K-means is used to initialize search and tracking partitioning at the beginning of the surveillance task. Gaussian Mixture Model EM is then used at subsequent steps in time, where the Gaussian Mixture Model EM is seeded with the previous partition means [17]. Both K-means and Gaussian Mixture Model EM operate in the target density distribution feature space consisting of normalized position within the surveillance area and local expected number of targets.
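The idea of clustering in the feature space and seeding a later run with earlier means can be sketched with plain K-means (Lloyd iterations). This sketch covers only the K-means part; the Gaussian Mixture Model EM stage is not implemented here, and the warm start below merely illustrates seeding with previous means. The feature values and function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, means, iterations=10):
    """Lloyd iterations started from `means`; returns (means, labels)."""
    means = [tuple(m) for m in means]
    for _ in range(iterations):
        labels = [min(range(len(means)), key=lambda k: sq_dist(p, means[k]))
                  for p in points]
        for k in range(len(means)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old mean if a cluster loses all its points
                means[k] = tuple(sum(c) / len(members) for c in zip(*members))
    labels = [min(range(len(means)), key=lambda k: sq_dist(p, means[k]))
              for p in points]
    return means, labels


# Features: (normalized x, normalized y, local expected number of targets).
points = [(0.1, 0.1, 0.9), (0.2, 0.1, 0.8), (0.1, 0.2, 0.85),
          (0.8, 0.9, 0.2), (0.9, 0.8, 0.3)]

# Initial partitioning at the start of the task.
means, labels = kmeans(points, [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0)])

# Later time step: the features have drifted, and the previous means seed the run.
moved = [(x + 0.05, y, n) for x, y, n in points]
means_next, labels_next = kmeans(moved, means)
```

Seeding with the previous means keeps partition identities stable from one time step to the next, so partition 0 at the later step corresponds to partition 0 at the earlier one.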
In order to perform this classification, the number of partitions to classify must be initialized. The initial number of partitions is determined from the total expected number of targets in the surveillance area

[equation (16) missing in source]

After performing K-means or Gaussian Mixture Model EM to convexly partition $\Gamma = S_{\text{search}}$ into a set of search and tracking partitions, these partitions are then ordered. The partitions are ordered so that tracking partitions appear first and uncertain searching partitions appear last. This enables path planning algorithms to prioritize the various search and tracking partitions. To accomplish this ordering the partition density $\rho_{P_i}$ of each partition $P_i$ is computed. Partition density is defined as

[equation (17) missing in source]

At this point in this step of partition learning an ordered set of search and tracking partitions has been classified. Some of these partitions may correspond to regions with many densely located targets. It may be beneficial to allocate more searching resources to these types of partitions. Accordingly, these partitions are further subpartitioned.
In order to determine if some partition $P_i$ should be subpartitioned, its expected number of targets $EN_{P_i}$ is computed. $EN_{P_i}$ is easily computed from the target density distribution $f(x)$ as

$$EN_{P_i} = \int_{P_i} f(x)\, d\mu(x). \qquad (18)$$

If $EN_{P_i} > 1$ then it is expected that there is more than one target within $P_i$. In order to track all of these possible targets multiple agents may be required. To account for the possible need of multiple agents, any $P_i$ with $EN_{P_i} > 1$ is subpartitioned into $\lceil EN_{P_i} \rceil$ new partitions. This is where the classifier ends. The end result is one null target partition, one exploration partition, and a set of ordered search and tracking partitions.
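The ordering and subpartitioning rules can be sketched as follows. The partition density used here, expected number of targets divided by the number of cells in the partition, is an assumption made for this example and may differ from the definition used in the paper; the function name and the density values are likewise hypothetical.

```python
import math


def order_and_split(partitions, density):
    """Order partitions by density and report how many subpartitions each needs.

    partitions: list of lists of cells; density: {cell: target density value}.
    """
    stats = []
    for cells in partitions:
        expected = sum(density[x] for x in cells)   # expected targets in the partition
        stats.append({
            "cells": cells,
            "expected": expected,
            "rho": expected / len(cells),           # assumed partition density
            # Partitions expecting more than one target are split into ceil(EN) parts.
            "subpartitions": max(1, math.ceil(expected)),
        })
    return sorted(stats, key=lambda s: s["rho"], reverse=True)


density = {0: 0.75, 1: 0.25, 2: 0.125, 3: 0.125, 4: 0.125, 5: 0.125, 6: 1.5, 7: 1.0}
ordered = order_and_split([[2, 3, 4, 5], [0, 1], [6, 7]], density)
order = [s["cells"] for s in ordered]
splits = [s["subpartitions"] for s in ordered]
```

The compact, dense partition is placed first (tracking-like), the diffuse one last (search-like), and the partition expecting 2.5 targets is flagged for splitting into three.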

Step 3: Path Planning over Partitions
The final step of the approach presented in this paper for autonomous search and tracking is path planning. This path planning is performed over the set of partitions. In order to plan paths over partitions, path planning is decomposed into two steps as depicted in Figure 12. These steps are (1) partition task allocation, (2) target density-distribution-based path optimization.
In the first step partitions are allocated to the team of agents [29,30]. In the second step agent paths are determined within allocated partitions by optimizing directly over partition level target density distributions [6-8, 10, 14].
By decomposing path planning in this manner the vast amount of work that has been developed for vehicle routing (for partition task allocation) and receding horizon path optimization (for target density-distribution-based path optimization) can be leveraged. All that is then required is extensions of existing methods where necessary. As such, this section refers the reader to the body of work that is leveraged and then presents any required extensions.

Path Planning Step 1: Partition Task Allocation.
The partitions generated by the classifier define areas over which subsets of the target density distribution can be extracted. This suggests the application of some kind of task allocation algorithm that takes each of the partitioned search areas as tasks with varying levels of certainty or priority. The exact method of task allocation is beyond the scope of this paper since it has been well developed by researchers already. Refer specifically to [29,30] for methods that directly apply. For task allocation algorithms that have been developed for projects at the Center for Collaborative Control of Unmanned Vehicles refer to [29].

Path Planning Step 2: Distribution-Based Optimization.
Optimizing a path over a target density distribution is almost identical to optimizing a path over a probability distribution. Fortunately, much work has been done to develop path optimization over probability distributions [6-8, 10, 14]. These existing methods are leveraged in this paper. The general approach of these methods is to define a function for measuring the utility of a path based on an underlying probability distribution. Then, this utility function is used to optimize paths through the surveillance area. These methods are extended by defining a utility function based on target density distributions. Defining this utility function is the focus of this section.
In order to define a path's utility, the utility of a point in the surveillance area must be defined. However, before defining the utility of a point, an agent's sensor observation coverage $f_C(x, x_0)$ about a point $x_0$ in the surveillance area must be determined. $f_C(x, x_0)$ essentially specifies how applicable some point $x$ in the surveillance area is to a particular agent when the agent is located at $x_0$. For example, consider the case of a fixed sensor. This sensor is free to rotate in order to observe its surroundings. However, it cannot see beyond $r$ meters. Consequently, any point farther away than $r$ meters is of little significance to this sensor. An agent's observation coverage is determined by the properties of the agent's sensor. For example, if an agent can make observations perfectly within a radius $r$, then the observation coverage is an indicator function defined by

$$f_C(x, x_0) = \begin{cases} 1, & d(x, x_0) \le r, \\ 0, & \text{otherwise}. \end{cases} \qquad (19)$$

However, in general, the observation coverage is determined by the sensor's field of view as well as the resolution of observable points within the field of view and the probability of missed detection [31]. In order to further visualize possible sensor observation coverages, consider two cases.
(1) An agent can view its surroundings perfectly within 25 meters. Beyond that, the agent's view linearly degrades until it cannot make any observation at 70 meters.
(2) An agent cannot view anything near it until a distance of 45 meters away. After that, observations quickly become perfect but then start to fade around 65 meters. By 90 meters, observations are no longer possible.
The first of these cases is similar to what is true for many sensing agents [32]. They are designed such that their observations improve with proximity. Figure 13 depicts this sensor coverage. In this figure the quality of a point's coverage by the agent's sensor is represented by a shaded value, where white is high utility and black is low utility.
The second of these cases may seem odd, but is actually similar to what was used in an experiment performed with autonomous aircraft equipped with visual spectrum cameras [33]. In this experiment, a camera on board an aircraft was zoomed in to detect features of a pedestrian. The zoom was designed so that good resolution would be provided when the aircraft orbited the pedestrian. Consequently, it was designed to make good observations at an orbit's radius away from the aircraft. Figure 14 depicts this sensor coverage. In this figure the quality of a point's coverage by the agent's sensor is represented by a shaded value, where white is high utility and black is low utility.
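The two coverage cases can be written as functions of the distance between the agent and the observed point. The text specifies the first case fully (perfect to 25 m, linear decay to zero at 70 m). For the second case only the 45 m, 65 m, and 90 m distances are given, so the piecewise-linear shape and the 50 m distance at which observations become perfect are assumptions of this sketch.

```python
def coverage_proximity(d, perfect=25.0, limit=70.0):
    """Case (1): perfect within `perfect` meters, linear decay to zero at `limit`."""
    if d <= perfect:
        return 1.0
    if d >= limit:
        return 0.0
    return (limit - d) / (limit - perfect)


def coverage_standoff(d, start=45.0, full=50.0, fade=65.0, limit=90.0):
    """Case (2): blind near the agent, good around an orbit radius, zero by `limit`.

    `full` (where observations become perfect) and the linear ramps are assumed.
    """
    if d <= start or d >= limit:
        return 0.0
    if d < full:
        return (d - start) / (full - start)
    if d <= fade:
        return 1.0
    return (limit - d) / (limit - fade)


samples = [10.0, 47.5, 60.0, 77.5, 95.0]
case1 = [coverage_proximity(d) for d in samples]
case2 = [coverage_standoff(d) for d in samples]
```

Evaluating either function over the distances from an agent position to every grid cell would give a coverage map of the kind shown in Figures 13 and 14.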
The utility of an agent's point $x_0$ in the surveillance area can then be defined utilizing sensor coverage. First, consider a zero horizon path. The utility of a point $x_0$ is

$$\int_S f_C(x, x_0)\, f(x)\, d\mu(x), \qquad (20)$$

where $f(x)$ is the target density distribution. Extending this to finite horizon planning, define the $H$-step horizon observation coverage over the path $x_{0:H} = (x_0, \ldots, x_H)$ as

[equation (21) missing in source]

The utility of a point $x_0 \in S$, and consequently the path $x_{0:H}$, is then defined as

[equation (22) missing in source]

where $R(x)$ is the set of all points within the reach set of $x$ [14]. Intuitively, (22) represents the expected number of targets within the sensor coverage over an $H$-step path originating from the point $x_0$. Maximizing (22) then corresponds to choosing the point $x_0$ that yields the maximum expected number of targets within the observation coverage of a path originating from $x_0$. This definition of path utility was based on the entire target density distribution. In order to optimize paths through specific partitions the utility must be defined for partition level target density distributions. Fortunately, this extension is easily accomplished. First, let the set of partitions defined over the surveillance area be $P = \{P_1, \ldots, P_n\}$. The partition level target density distribution $f_{P_i}(x)$ for partition $P_i \in P$ is then defined as

$$f_{P_i}(x) = \begin{cases} f(x), & x \in P_i, \\ 0, & \text{otherwise}. \end{cases} \qquad (23)$$

Then, replacing $f(x)$ with $f_{P_i}(x)$, the partition level utility of a path starting at $x_0$ is defined as

[equation (24) missing in source]

This equation fully specifies partition level target density-distribution-based path optimization. Further development and application-specific details can be found in [14,17].
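The zero horizon case can be illustrated on a discretized density. This sketch covers only the zero horizon utility (the coverage-weighted sum of the density); the finite horizon extension and the reach set are not implemented. The helper names, the indicator coverage with a 20 m radius, and the restriction of the density to a partition by zeroing it elsewhere are assumptions of the example.

```python
import math


def zero_horizon_utility(density, x0, coverage):
    """Expected number of targets covered by a sensor located at x0.

    density: {(x, y): target density}, coverage: function of distance in meters.
    """
    return sum(coverage(math.dist(x, x0)) * fx for x, fx in density.items())


def restrict(density, partition):
    """Partition level density: unchanged inside the partition, zero outside (assumed)."""
    return {x: (fx if x in partition else 0.0) for x, fx in density.items()}


def indicator(d, r=20.0):
    """Perfect observation within r meters, none beyond."""
    return 1.0 if d <= r else 0.0


density = {(0, 0): 0.5, (10, 0): 0.25, (100, 0): 1.0, (105, 0): 0.5}

u_west = zero_horizon_utility(density, (0, 0), indicator)
u_east = zero_horizon_utility(density, (100, 0), indicator)
best = max(density, key=lambda x0: zero_horizon_utility(density, x0, indicator))

# Utility of the same eastern point when planning only over the western partition.
west = restrict(density, {(0, 0), (10, 0)})
u_east_in_west = zero_horizon_utility(west, (100, 0), indicator)
```

Over the full density the eastern cluster is preferred, but once the density is restricted to the western partition the eastern point has zero utility, so an agent allocated to that partition is drawn to it.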

Results
In this paper a new approach for autonomous search and tracking was presented. This new approach was designed for the case when the surveillance area is large, the number of targets is unknown, and target estimation is general and nonparametric. In this section, results of the performance of this approach are presented.
The performance of partition learning to aid in agent path planning was tested by constructing a simulation environment. In this environment the team of agents consisted of autonomous aircraft equipped with visual spectrum gimballed camera sensors. The capabilities of these agents were designed to closely represent behaviors observed in flight experiments [32,33]. The camera characteristics were designed to represent a field of view resulting from a 0.9273 rad view angle. Effects of resolution were included by limiting the distance of observations to 250 m. The agents were designed to fly at 25 m/s and 100 m altitude with a maximum turn rate of 0.2 rad/s. The targets were allowed to move according to a transition model defined by where $r$ and $\theta$ were distributed as with $\mu = 2$ m in one second and $\sigma^2 = 10$ m$^2$. The time interval of each simulation iteration was 4 seconds.

It is additionally necessary to provide a comparison in order to see how well the presented methods perform. Recall the standard approach for autonomous search and tracking, which optimizes agent paths directly over the distribution. The comparison provided in this section is then between using partition learning to aid path planning and optimizing paths directly over the target distribution. This standard approach will be referred to as the state-of-the-art. Note, however, that in order to compute some of the metrics above, it is necessary to run the partition classification algorithm for both cases. The partition learning classifier was therefore run for both path planning approaches (that presented in this paper and the state-of-the-art). Yet only the path planning approach presented in this paper utilized the partitions for path planning purposes.
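The simulation settings quoted above can be sketched as follows. The numeric constants are taken from the text, but the form of the models is assumed, since the transition equations are not reproduced here: the target takes a polar step with a Gaussian radius and a uniformly distributed heading once per second, and the agent follows a simple constant-speed kinematic model with a clamped turn rate. The function names are hypothetical.

```python
import math
import random

MU = 2.0         # mean target displacement in one second (m), from the text
SIGMA2 = 10.0    # displacement variance (m^2), from the text
DT = 4           # seconds per simulation iteration, from the text
SPEED = 25.0     # agent speed (m/s), from the text
MAX_TURN = 0.2   # maximum agent turn rate (rad/s), from the text

def step_target(pos, rng, seconds=DT):
    """Advance a target by one iteration using an assumed polar random walk."""
    x, y = pos
    for _ in range(seconds):
        r = rng.gauss(MU, math.sqrt(SIGMA2))      # assumed Gaussian radius
        theta = rng.uniform(0.0, 2.0 * math.pi)   # assumed uniform heading
        x += r * math.cos(theta)
        y += r * math.sin(theta)
    return (x, y)

def step_agent(state, turn_rate, seconds=DT):
    """Advance an agent (x, y, heading) with the turn rate clamped to MAX_TURN."""
    x, y, heading = state
    turn = max(-MAX_TURN, min(MAX_TURN, turn_rate))
    for _ in range(seconds):
        # One-second Euler sub-steps at constant speed.
        x += SPEED * math.cos(heading)
        y += SPEED * math.sin(heading)
        heading += turn
    return (x, y, heading)

rng = random.Random(0)
target = step_target((0.0, 0.0), rng)
straight = step_agent((0.0, 0.0, 0.0), turn_rate=0.0)
hard_turn = step_agent((0.0, 0.0, 0.0), turn_rate=1.0)
```

With these numbers an agent flying straight covers 100 m per iteration, while a target is expected to move only a few meters, which is consistent with the large-area search setting described in the paper.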
In this paper, results are provided for the scenario in which there are six agents and three targets. Additionally, a sample sequence of partition learning is provided as a visual aid to understand how these partitions may look for the case when there are six agents and six targets. This sample sequence is found in Figure 21. This figure spans an entire page, so it is provided after all other figures. Additional scenarios are provided in [17].
The results provided in this section demonstrate the scenario when there are sufficient resources to perform surveillance. From these results it is concluded that the approach presented in this paper performed well. This conclusion is determined by observing Figures 15 and 16. Figure 15 presents the number of targets in search or tracking partitions over time. Figure 16 presents the number of search or tracking partitions over time.
From Figures 15 and 16 it is apparent that all targets are quickly captured within search or tracking partitions and that the number of partitions is bounded. However, the performance of the two path planning approaches is very different. From Figure 17, it can be seen that the average partition size does not decrease for state-of-the-art path planning, whereas it decreases substantially for partition learning classification path planning.
A similar result holds for the average size of partitions containing targets, plotted in Figure 18. Additionally, according to Figure 19, the exploration size continually decreases for partition learning classification but tends to level off for state-of-the-art path planning. Furthermore, the state-of-the-art path planning did not localize targets well in this scenario. In contrast, partition learning classification path planning eventually localized all targets. This can be seen in Figure 20. From the results presented here, it is apparent that partition learning classification path planning finds and localizes targets well in this scenario, as compared to state-of-the-art path planning.
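The per-time-step quantities plotted in Figures 15 to 18 can be illustrated with a small sketch. How the paper computes them is not stated in this section, so the grid representation, the cell lookup, and the names `cell_of` and `partition_metrics` are assumptions for illustration only.

```python
import math

def cell_of(pos, cell_size):
    """Map a continuous position to the index of the grid cell containing it."""
    return (math.floor(pos[0] / cell_size), math.floor(pos[1] / cell_size))

def partition_metrics(labels, targets, cell_size):
    """Summarize search and tracking partitions at one time step.

    labels maps a cell index to the id of the search or tracking partition
    it belongs to; cells in the null target or exploration partitions are
    simply absent. targets is a list of true target positions.
    """
    cell_area = cell_size * cell_size
    sizes = {}
    for pid in labels.values():
        sizes[pid] = sizes.get(pid, 0.0) + cell_area

    captured = 0
    occupied = set()
    for t in targets:
        pid = labels.get(cell_of(t, cell_size))
        if pid is not None:
            captured += 1
            occupied.add(pid)

    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    return {
        "num_partitions": len(sizes),
        "targets_in_partitions": captured,
        "mean_partition_size": mean(sizes.values()),
        "mean_size_with_targets": mean(sizes[p] for p in occupied),
    }

labels = {(0, 0): 1, (0, 1): 1, (5, 5): 2}
targets = [(3.0, 4.0), (55.0, 52.0), (200.0, 200.0)]
metrics = partition_metrics(labels, targets, cell_size=10.0)
```

Averaging such per-step values over repeated simulation runs would give sample means of the kind reported in the figure captions.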

Conclusions
In this paper a new approach for autonomous search and tracking was presented. This new approach was designed for the case when the geographic area is large, the number of targets is unknown, and target track estimation is general and nonparametric. This is a challenging problem because very little is assumed. All that was assumed is that some form of target track estimation is available and shared among the team of autonomous agents performing the search. This new approach decomposes the search and tracking problem into three steps. The first step is target density distribution estimation. The second step is partition learning classification based on the target density distribution. The third step is path planning based on the partitions.
The vast body of work available for target track estimation and path planning over probability distributions was leveraged to provide solutions for the first and third steps. As such, the main focus of this paper was on partition learning. In order to determine the performance of this new approach, it was compared with the standard approach of directly optimizing paths over target estimation distributions. From this comparison, it is concluded that the approach presented in this paper performs well and provides an improved solution for this very general form of the autonomous search and tracking problem.

Figure 1: Problem structure of the new approach for autonomous search and tracking. The problem is decomposed into three steps. Each of these steps is performed at each instance in time. As the target estimation changes with time, so do the partitions and the agent paths.

Figure 2: Depiction of capturing the complexity of the high-dimensional target track estimation space and mapping it to the surveillance area. This forms the target density distribution.

Figure 3: Sample target density distribution, represented by contour lines, defined over a surveillance area. The details of this distribution's shape are not important. What is important is to note that a target density distribution captures the complexity of the high-dimensional target track estimation space.

(1) a null target partition, (2) an exploration partition, (3) a set of search and tracking partitions.

Figure 5: The path planning modes that may exist when surveillance is performed by region-based planning.

Figure 6: Cascade of classifiers that partition the surveillance area into (1) a null target partition, (2) an exploration partition, and (3) a set of search and tracking partitions. Note that each time the target density distribution changes, this classifier is processed on the new target density distribution. In this sense partitions change with time.

Figure 7: Visualization of a simple target density distribution with corresponding null target partition noted.

Figure 8: A sample computation of local uncertainty over a surveillance area when the underlying target density distribution is a simple Gaussian probability distribution. Local uncertainty is represented by shade value, where white is high local uncertainty and black is low local uncertainty. In this sample, notice that the peak and the tails of the Gaussian have high local uncertainty, whereas the regions in between the peak and tails have low uncertainty.

4.3. Partitioning Step 3: Search and Tracking Partitions. The third step of partition learning is to classify a set of search and tracking partitions. Both search and tracking partitions are classified by the same classifier.

Figure 10: A sample computation of local expected number of targets $I_r$ over a surveillance area when the underlying target density distribution is a simple Gaussian probability distribution. Notice that local expected number of targets acts as a smoothing filter over the target density distribution.

Figure 11: Visualization of how points in the target density distribution feature space are convexly partitioned. In this example three partitions are classified. The resulting three partitions are represented by circles, stars, and rectangles.

Figure 12: Structure of path planning decomposed into task allocation over partitions and path planning by optimizing directly over the target density distribution.

Figure 13: A sample sensor observation coverage where the quality of coverage is represented by shaded value, white being high quality and black being low quality. This type of coverage is applicable when a sensor's observations are improved with close proximity.

Figure 14: A sample sensor observation coverage where the quality of coverage is represented by shaded value, white being high quality and black being low quality. This type of coverage is applicable when a sensor makes good observations at some specified distance away.

Figure 15: Comparison of sample mean number of targets within search or tracking partitions over simulation time between state-of-the-art direct distribution optimization and partition learning classification path planning approaches.

Figure 16: Comparison of sample mean number of search and tracking partitions over simulation time between state-of-the-art direct distribution optimization and partition learning classification path planning approaches.

Figure 17: Comparison of sample mean average search and tracking partition size over simulation time between state-of-the-art direct distribution optimization and partition learning classification path planning approaches.

Figure 19: Comparison of sample mean exploration partition size over simulation time between state-of-the-art direct distribution optimization and partition learning classification path planning approaches.

Figure 20: Comparison of sample mean number of targets localized over simulation time between state-of-the-art direct distribution optimization and partition learning classification path planning approaches.