The underlying goal of a competing agent in a discrete real-time strategy (RTS) game is to defeat an adversary. Strategic agents or participants must define an a priori plan to maneuver their resources in order to destroy the adversary and the adversary's resources and to secure physical regions of the environment. This a priori plan can be generated by leveraging collected historical knowledge about the environment. This knowledge is then employed to generate a classification model for real-time decision-making in the RTS domain. The best way to generate a classification model for a complex problem domain depends on the characteristics of the solution space. One experimental way to determine solution space (search landscape) characteristics is to analyze historical algorithm performance on the specific problem. We select a deterministic search technique and a stochastic search method for a priori classification model generation. These approaches are designed, implemented, and tested for a specific complex RTS game, Bos Wars. Their performance allows us to draw various conclusions about the complex search landscapes associated with RTS games and about building competing agents for them.

The real-time strategy (RTS) domain [

The objective of a competing agent in an RTS game is to defeat an adversary (or adversaries) by directly and indirectly moving and maneuvering resources in order to destroy the adversary’s resources, capture and destroy the adversary, and secure physical regions of the environment [

A

Nevertheless, an existing method for development of a

We present an AI strategy-based agent which collects information and learns about an opponent by examining its past performance. Past performance can be captured through a collection of

Before the subject of agent generation can be approached, a reliable method of generating a classification model needs to be created. The RTS domain is relatively new; while many different AI search approaches have been applied to agent generation, little research has been done into determining the underlying characteristics of the domain. Are there many different feature combinations which lead to victory? Are they in close proximity to each other, or are they spread out around the domain? Is the solution space (fitness landscape) jagged, where good feature combinations are in very close proximity to bad feature combinations, or are the transitions between the two more gradual? By answering these questions, we can determine an algorithm to use which can leverage the characteristics of the domain to find the better solutions in a reasonable amount of time.

In this paper, related RTS investigations including RTS games are summarized in Section

Related to the development of RTS games are appropriate contemporary RTS agent development methods, some current applications, and supporting generic feature selection and class identification methods.

Over the past three decades, there have been a variety of imperfect-information games (note that perfect-information games include tic-tac-toe, checkers, chess, backgammon, and Go; RTS and RTT approaches have been applied to these games with some success, depending upon the depth of look-ahead search [

Details of these individual games can usually be found by name via the internet. We address specific RTS game attributes that have a direct bearing on our “optimal” agent algorithmic approach: Case-Based Reasoning, Reinforcement Learning, Dynamic Scripting, and Monte-Carlo planning, along with available RTS software platforms.

A Case-Based Reasoning approach was used by Ontañón et al. [

A Hybrid Case-Based Reasoning/Reinforcement Learning approach was used by Sharma et al. [

Graepel et al. apply extended Q-learning reinforcement in order to find “good” Markov decision policies for a fighting agent game [

Continuous action model-learning in an RTS environment was addressed by Molineaux et al. [

Dynamic Scripting is a method developed by Spronck et al. [

A Monte-Carlo planning approach was used by Chung et al. [

In general, these approaches for solving RTS games generate acceptable but nonoptimal and nonrobust RTS solutions. This situation is generally due to the highly dimensional RTS search space being jagged and very rough. Moreover, we show this characteristic empirically via a more appropriate stochastic search.

Note that contemporary AI techniques in RTS games continue to be in the development stage, with limited implementation. Observe that currently all such RTS games can be beaten by a knowledgeable human opponent, thus making RTS games quite interesting and, one would hope, playable. Also, no single AI or human approach has been shown to be better or more promising than the others; therefore, there probably is no generic robust RTS game strategy-based agent that leads to victory in all cases! One can think of this situation as a reflection of the no-free-lunch theorem [

There are a number of RTS platforms on which to implement an RTS game along with collection of algorithmic game data. For example,

We choose to use the Bos Wars platform for determining general RTS search space characteristics. This choice provides an efficient and effective computational platform for gaining initial insight to the RTS search space. Knowing these characteristics, generic RTS platforms can be used later to explicitly search for RTS strategic solutions using appropriate stochastic AI algorithms.

The goal of generic feature selection is to find a

A general overview of feature selection and classification methods is given by Blum and Langley [

Blum and Langley [

The algorithm designed in this paper takes an embedded approach to a priori feature selection and classification. In each method, possible class separability and clustering functions are based upon a distance function. Such metrics include error probability, interclass distance, k-means clustering, entropy, consistency-based feature selection, and correlation-based feature selection.

A good overview of the feature selection problem domain is presented by Jain et al. [

Collections of RTS game traces can be used to construct a generalization of a particular game given many runs. By using machine learning techniques, specifically the generation of classification models for the game traces, the feature value combinations which tend to lead to victory and the feature value combinations which tend to lead to defeat can be determined. These good and bad feature values can then be given to an agent that would seek to avoid the bad feature combinations and approach the use of good combinations in the temporal decision process of the game.

There are numerous approaches to feature selection, using many different algorithms and heuristics. For example, search algorithms include deterministic depth-first search and breadth-first search (best-first search), and stochastic simulated annealing and genetic algorithm techniques. The Feature Selection problem is known to be NP-Complete [

For example, to reduce the problem search space, Somol et al. [

As an example in the marketing domain, feature selection is used to determine customers who are likely to buy a product, based on the other products they have bought. Genetic algorithms were used by Jarmulak and Craw [

There are numerous examples of feature selection methods, in many different domains. However, feature selection is usually a domain specific problem; a feature selection algorithm which gives a good solution in one problem domain does not necessarily give the same quality of solution in a different domain. Our embedded algorithm uses a priori stochastic feature selection as motivated in the following sections.

A classifier is a system created from quantitative labeled data which can then be used to generalize qualitative data. In a more general sense, building a classifier is the process of learning a set of rules from instances. These rules can be used to assign new samples to classes. In an AI taxonomy, classification falls into the realm of supervised machine learning [

A classifier is often generated from an initial dataset, called the training set. This training set is a series of samples of feature values, where a feature is some measurable aspect of a specific problem domain. Each sample has values for all the features and is labeled as to what class in the problem domain it came from.

There are numerous methods of generating classifiers.

The family of

Two basic IBL exemplar models are

Another classification method based upon the K-NN approach is the K-winner machine (KWM) model [

One method of creating a best-example model from the training set is the

Our Real-Time Strategy Prediction Problem (RTSPP) is a classification problem which is formulated as a basic search problem. Any search problem definition including the RTSPP can be defined by its input, output, and fitness function.

The input to the RTSPP is a set of game traces from RTS games. Each game trace consists of “snapshots” taken at constant intervals or epochs. Each snapshot contains the value of all the possible features which an agent can observe. In the RTS domain, features could be the number and type of units, the amount of energy or fuel, or the rate at which energy and fuel are collected or used. Features could also be the rate of change of any of the static features across some time interval. Each snapshot is labeled as to whether it came from a game which was won or lost from player one’s perspective.

All features are defined as the difference between player one’s value and player two’s value. For example, if at some point in a game player one has two infantry units and player two has three, then the value of the infantry unit feature is negative one. Expressing features as a difference cuts the space required to store game traces in half.
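As an illustrative sketch of this difference encoding (the snapshot structure and field names here are hypothetical, not the paper's data format):

```python
def difference_features(p1_snapshot, p2_snapshot):
    """Encode each feature as (player one value - player two value)."""
    return {name: p1_snapshot[name] - p2_snapshot[name]
            for name in p1_snapshot}

# The infantry example from the text: two units versus three gives -1.
sample = difference_features({"infantry": 2, "fuel": 40},
                             {"infantry": 3, "fuel": 25})
```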

The output (solution) of the RTSPP is a classifier: a subset of features, a set of winning

The classifier is then used to predict the outcome of a game based on only the current state. During a game, the values for the features in the solution are measured. Then, the distance to each center in the sets of centers is measured. The closest center is determined. If this center is a winning center, then the game state is predicted to result in a win. If it is a losing center, then the game state should result in a loss.
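The nearest-center prediction rule just described can be sketched as follows; Euclidean distance and all function and parameter names are our assumptions:

```python
import math

def predict(state, features, win_centers, lose_centers):
    """Predict win/loss from the current game state alone: measure the
    distance from the state to every center and take the class of the
    closest one."""
    def dist(center):
        return math.sqrt(sum((state[f] - c) ** 2
                             for f, c in zip(features, center)))
    best_win = min(dist(c) for c in win_centers)
    best_lose = min(dist(c) for c in lose_centers)
    return "win" if best_win <= best_lose else "loss"
```

For example, a state lying near a winning center and far from every losing center is predicted as a win.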

The quality of a solution to the RTSPP can be measured by testing its classification performance.

The RTSPP is formally defined to remove any ambiguity of understanding. There is a set

The output of the problem is a set of features

The fitness of a solution can be determined by using it to classify all the samples in

Next, a function which determines the accuracy of a prediction is needed. The function

Total fitness

The

The concluding step in the problem definition is an analysis of the number of possible RTSPP solutions. This information is important because it determines the difficulty of the search.

In the RTSPP, there are two components to a solution: the features in the set

Center solution space analysis is more complicated. If centers are restricted to being a sample

Combining the two solution spaces leads to a total solution space of

One of the easiest reductions to the problem domain is to reduce the number of features in

The two constraints significantly reduce the size of the solution space. The feature selection portion is now

With the reduction based on the constraints, the solution space is polynomial in the number of features and samples in the input data.

Any search problem can be solved using one of two general search types: deterministic and stochastic [

In a stochastic search, the algorithm is a probabilistic search over the solution space. The next state (solution) of a stochastic search algorithm is not always the same. Instead, the search is guided towards profitable areas using some heuristic. A stochastic algorithm does not search the entire solution space; instead, it seeks to exploit characteristics of the problem domain to find good solutions. Stochastic search algorithms require the assumption that the search is allowed to run forever to guarantee optimality. This is clearly unrealistic. However, the solution yielded by a stochastic algorithm is a solution in the original problem domain which may be near optimal or at least acceptable.

In some problem domains, a near optimal solution to the original problem is better than an optimal solution. In others, the converse is true. One way to determine this is to test both approaches on the problem domain. To do this, the problem domain must be explicitly defined. Next, a specific search algorithm can be developed and tailored to the problem. In this chapter, both deterministic and stochastic search algorithms are developed to solve the RTS classification problem. They are tested on a data set from an RTS application, and their performance is compared. Finally, a selection is made between the deterministic and stochastic families for further development. To appreciate the subtle aspects of these feature selection search techniques for RTS games, the following sections are provided.

In general, features work in combinations to determine the fitness of a given RTS state. To find a subset of features, deterministic search in the RTS domain faces an immediate problem because of the complexity and roughness of the solution space. There is no way to search the entire problem space in a reasonable amount of time, which would be required to guarantee an optimal classification solution. Moreover, classification, when conducted on a problem with dependent variables, does not lend itself to implicit searching. The RTSPP, for example, probably has dependent variables.

In problems with independent variables, a solution can be constructed by adding features to a solution one by one, adding the feature at each level which has the greatest positive effect on the classification accuracy of the model. Dependent variables provide no such guarantee; because they work in combinations, the addition or deletion of a feature from a solution can have a large and unpredictable effect on classification model accuracy.

Basically, this means there is no admissible heuristic [

When reducing the size of the solution space via classification, we need to find a heuristic which preserves the high fitness solutions of the entire space, while discarding the solutions with low fitness. If we start with the solution space in Figure

A hypothetical solution space.

A hypothetical solution space which has been pruned through the use of a heuristic.

Another hypothetical solution space which has been pruned through the use of a heuristic.

One of the easiest ways to reduce solution space size is to determine a way to pair features with centers. If at each step a triple could be selected which consisted of one feature, one winning center and one losing center, the number of combinations would be greatly reduced. This requires a means of determining good feature values when features are selected.

One way of determining good features involves the use of the

The BC is calculated by taking a histogram of all the data and determining the probability of a sample falling in a bin for both classes. The two probabilities for each bin are multiplied together and summed over the entire histogram. Formally, this is

A visualization of the Bhattacharyya coefficient (BC) on feature
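The histogram computation described above can be sketched as follows; we use the standard BC form, the sum over bins of the square root of the product of the two per-bin probabilities, and the bin count is illustrative:

```python
import math

def bhattacharyya_coefficient(win_values, lose_values, bins=10):
    """Histogram-based BC over one feature: build a common histogram for
    both classes, convert counts to probabilities, and sum the per-bin
    terms. A BC near 0 means little class overlap (a discriminative
    feature); a BC near 1 means the classes overlap heavily."""
    lo = min(min(win_values), min(lose_values))
    hi = max(max(win_values), max(lose_values))
    width = (hi - lo) / bins or 1.0
    def probabilities(values):
        counts = [0] * bins
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(values) for c in counts]
    p, q = probabilities(win_values), probabilities(lose_values)
    return sum(math.sqrt(pb * qb) for pb, qb in zip(p, q))
```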

The BC pairs each feature with two centers (one winning, one losing), so at each step of the

Of course, BC is not an admissible heuristic. The optimization function (percent classified correctly) is not directly related to the BC. However, if the triple with the lowest BC is chosen at each step, it should drive the greatest improvement in classification accuracy because the overlap between the winning/losing sets is as small as possible. If the feature with the lowest BC remaining is selected and it does not improve the value of the optimization function, the next one picked should not do any better; the solution samples are close together.

When choosing a search algorithm, we must keep in mind our goal: to determine the characteristics of the RTSPP solution space. We have a heuristic which we would like to test, the BC. A

Another way to test the effectiveness of the heuristic is to use a

The increased space searchable with the greedy search portion.

The BC heuristic also prunes the search space. The BC pairs each feature with a center, as described. This significantly reduces the space, allowing us to completely search the space in a reasonable amount of time. However, we eliminate many possible combinations. To test the effectiveness of the heuristic from this perspective, some other method of search must be used which searches other possibilities missed by the deterministic search.

The best choice for a deterministic search algorithm is to begin with a greedy search which chooses some number of feature/center triples for a partial solution. Then, we begin a best first search which tries all the possible combinations of triples which can be used to form a solution, subject to the constraints on the number of features in a solution.

These algorithm choices lead to two different search parameters: the depth of the greedy search and the total number of features in a solution. By varying these parameters, we can gauge the effectiveness of the heuristic, as well as determine some characteristics of the solution space. But, because of the deterministic algorithm computational characteristics, a stochastic local search algorithm is selected.

It is assumed because of the combinatorics that the solution landscape of the RTSPP has many local maxima and minima. Most of these would exist in close proximity to each other; some features should be more closely related to the eventual outcome of a game. For instance, the total number of units for one player compared to the units for another player is one feature which would probably give good prediction accuracies, while the total amount of money or fuel which could possibly be stored is probably not in a solution. Local maxima should be near the global maximum, while local minima should be near the global minimum. As a result of these search landscape characteristics, a stochastic algorithm that is initially biased towards exploration, but then tends to exploitation is suggested.

This tentative analysis of the solution space shows the RTSPP may be responsive to a relatively simple stochastic algorithm like

In simulated annealing, the same approach is taken, but worse solutions can be accepted with some probability. Hill-climbing is subject to getting caught in a local maximum since it has no way of escaping. The probabilistic acceptance provided by simulated annealing allows the algorithm to possibly escape from a local maximum. The probability of selecting a worse solution is based on the current temperature, which changes based on a cooling parameter. At the beginning of the algorithm, the temperature is high so almost all solutions are accepted. As the search continues, the temperature falls such that lower quality solutions are accepted less frequently. By the end of the algorithm, SA becomes hill climbing.
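A minimal sketch of the acceptance test and geometric cooling just described; the function names and the example values are ours, not the paper's:

```python
import math
import random

def accept(current_fitness, candidate_fitness, temperature):
    """Metropolis-style SA acceptance: improvements are always taken; a
    worse candidate is accepted with probability exp(delta / T), which
    shrinks as the temperature falls, so SA degenerates to hill
    climbing by the end of the run."""
    delta = candidate_fitness - current_fitness
    if delta >= 0:
        return True
    return random.random() < math.exp(delta / temperature)

def cool(temperature, alpha):
    """Geometric cooling schedule: T_{k+1} = alpha * T_k."""
    return alpha * temperature
```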

Simulated annealing is easy to implement and runs quickly. It is a good choice to test the performance of a stochastic algorithm on the RTSPP.

In order to appreciate the important design evolution of our SA method, the SA algorithm refinement is presented. Initially, we need to consider a complete formal SA specification, which requires a solution form, fitness function, neighborhood function, and cooling function for the problem domain.

A solution to the RTSPP is a set of features along with a set of centers. There are

The fitness function is determined by the chromosome string representing the current solution

To generate the next solution, the current solution may be

The cooling function is a geometric decreasing function defined by a parameter

The combination of the algorithm constructs and specification generates the program specification in Algorithm


The algorithm complexity depends on the time it takes to compute the fitness function

The problem with the program as currently designed is in the neighborhood function. Allowing flipped bits can potentially change the number of features/centers in a solution. Since the two constraints are limits on the number of features and centers, this means the algorithm may generate infeasible solutions. To deal with this problem, a repair function could be introduced to “fix” infeasible solutions, or the neighborhood function could be changed. Since one of the main concerns with the search is complexity, and introducing a repair function increases complexity, changing the neighborhood function is the best course.

Instead of allowing “flipped” bits, only swaps are allowed, and bits must be swapped in the same portion of the binary solution so a bit in the feature portion of the solution is not swapped with a bit in the center portion. Three swaps are made based upon problem insight: one in the feature portion and two in the center portion of the solution. For ease of notation, this function is called
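A sketch of such a swap-based neighborhood function; the name `neighbor` and the bit-string layout (feature bits first, center bits after) are our assumptions for illustration:

```python
import random

def neighbor(solution, n_features):
    """Generate a neighboring solution by swaps only: one swap in the
    feature portion of the bit string and two in the center portion.
    Swapping a 1-bit with a 0-bit inside one portion leaves the number
    of selected features/centers unchanged, so feasibility is
    preserved without a repair function."""
    s = list(solution)
    def swap(lo, hi):
        ones = [i for i in range(lo, hi) if s[i] == 1]
        zeros = [i for i in range(lo, hi) if s[i] == 0]
        if ones and zeros:
            i, j = random.choice(ones), random.choice(zeros)
            s[i], s[j] = s[j], s[i]
    swap(0, n_features)        # one swap among the feature bits
    swap(n_features, len(s))   # two swaps among the center bits
    swap(n_features, len(s))
    return s
```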

Proximity in solution space.

As already stated, the solution

Additionally, the data array is used to compute the entire fitness function. As in the deterministic solution, the data is stored in an array for fast access, the array

The best solution is

Instead of having the user specify the initial solution, it is generated randomly by picking

The data structures lead to the final program refinement in Algorithm

RTS problem domain data is used to test the two designed classification search algorithms, the parameters used in each algorithm, and the performance metrics used to gauge their performance.

The algorithms are tested on data from the RTS platform

There are three scripted AI search techniques packaged with the development version of the game:

Additionally, there are three different difficulty levels for the game:

To collect data, the Bos Wars source code is modified to take a snapshot of the game state at intervals of five seconds and output the feature values to a text file.

Altogether,

Win/loss prediction is easier the closer one gets to the end of the game, and almost impossible at the beginning. The goal of the RTSPP is to capture the important part of a game, where one player obtains an advantage over the other. To facilitate this, only game states in the third quarter of a game, those starting after 50% of the game had elapsed and before 75% of the game had elapsed, are used as input. The shortest game was about ten minutes long, while the longest was more than forty minutes. Predictions thus ranged from samples 2.5 minutes from the end of the game to 20 minutes from the end of the game. Table
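The third-quarter filtering can be sketched as follows (the snapshot-list representation of a game trace is assumed):

```python
def third_quarter(snapshots):
    """Keep only the snapshots from the third quarter of a game: after
    50% and before 75% of the game's duration has elapsed."""
    n = len(snapshots)
    return snapshots[n // 2 : (3 * n) // 4]
```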

Records for each agent combination on listed map (1st agent wins—2nd agent wins).

Map/Agent combination | Battlefield | Island Warfare | Wetlands
---|---|---|---
Rush versus Blitz | 6–3 | 9–0 | 9–0
Tank Rush versus Blitz | 9–0 | 9–0 | 9–0
Rush versus Tank Rush | 0–9 | 2–7 | 9–0

Average standard deviation in game length (seconds) for agent combinations on specific maps.

Map/Agent combination | Battlefield | Island Warfare | Wetlands
---|---|---|---
Rush versus Blitz | 0.00 | 31.30 | 38.10
Tank Rush versus Blitz | 0.00 | 0.00 | 34.78
Rush versus Tank Rush | 0.77 | 30.41 | 0.26

Extracting all the third quarter samples from the game leads to a sample size of about 4500. This data is split into two portions: the first, of around 3000 samples, is used by both algorithms to develop classifiers. This data is referred to as the Bos Wars Training Set. The remaining 1500 samples are held out and used to compare the best classifiers found by the two algorithms. This data is referred to as the Bos Wars Testing Set. Holding out a portion of the data so neither algorithm is allowed to train on it leads to a fair comparison. The percentage of winning samples in each data set is presented in Table

Number of winning samples for each fold of the Bos Wars Training Set and the number of winning samples in the Bos Wars Test Set.

Data set | Winning samples | Samples | Percentage
---|---|---|---
Fold One | 311 | 998 |
Fold Two | 311 | 998 |
Fold Three | 310 | 997 |
Bos Wars Test Set | 452 | 1463 |

When generating classifiers, both algorithms use 3-fold cross validation to develop their classifiers. In 3-fold cross validation, the data is split into three sections. The algorithm takes two of these sections to train a classifier and then uses the final third to test the performance of the classifier.
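The 3-fold split can be sketched as follows; the interleaved fold assignment is illustrative, as the paper does not specify how samples are assigned to folds:

```python
def three_fold_splits(samples):
    """Yield (train, test) pairs for 3-fold cross validation: the data
    is split into three sections, and each section serves once as the
    held-out test set while the other two are used for training."""
    k = 3
    folds = [samples[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [s for j in range(k) if j != i for s in folds[j]]
        yield train, held_out
```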

The Bos Wars Training Set is used to determine the best search parameters for each algorithm. Solutions obtained using the best search parameters on the Training Set are then tested on the Bos Wars Testing Set.

The developed deterministic algorithm is a

The depth of the DFS portion of the search is limited to values less than or equal to four because of computational complexity, or the constraint

The chosen stochastic search algorithm is simulated annealing. The SA algorithm has three search parameters: the initial temperature

Table

Parameter combinations for testing of the stochastic search algorithm.

Parameter | Range | Step | Unique values
---|---|---|---
 | 50–200 | 25 | 7
 | 0.2–0.8 | 0.1 | 7
 | 2–8 | 1 | 7

To assess the performance of each classification algorithm, two metrics are used: the fitness of the generated classifiers and the time to complete a search. The fitness of a classifier is its classification accuracy on the test set.

For the deterministic solution, every time the algorithm is run with the same parameter settings on the same data set, it finishes with the same solution. Repeated iterations are not required. For each parameter setting, the algorithm is run on each of the three folds in the data set. The best classifier found is tested on the appropriate fold, and the fitness across all three folds is averaged, giving an average classification accuracy for the parameter setting. The time to complete each search is expressed in seconds required for the search; this is also averaged across all three folds for the specific parameter setting.

In the stochastic search, subsequent runs of the algorithm do not necessarily result in the same answer, so one hundred iterations are run for each parameter combination on each fold. The average time required to complete one iteration is computed for each fold.

Finally, to compare the two algorithms, the classifiers for the top five parameter settings are tested on the Bos Wars Test Set. The average fitness for each parameter setting is computed and can be used for comparison of the performance of the two algorithms, along with the average time to complete a search.

This section displays the results of the deterministic and stochastic search algorithms and compares their performance. First, the best performing deterministic search parameters are determined by examining algorithm performance on the Bos Wars Training Set. The process is repeated for the stochastic search algorithm. Next, the classifiers generated using the best performing parameters are compared on the Bos Wars Test Set.

Deterministic search algorithm performance is measured in terms of time to search and classification performance. The chosen deterministic search algorithm was a Depth First Search with Backtracking (DFS-BT).

Results for the DFS-BT across all folds on the Bos Wars Training Set.

 | | Fitness | St Dev | Time (s) | St Dev
---|---|---|---|---|---
0 | 1 | 73.6% | 0.039 | <1 | 0.00
0 | 2 | 80.5% | 0.050 | 1.33 | 0.58
0 | 3 | 86.7% | 0.027 | 46.67 | 0.58
0 | 4 | 90.0% | 0.004 | 1013.00 | 5.29
1 | 2 | 70.7% | 0.060 | <1 | 0.00
1 | 3 | 83.1% | 0.046 | 3.00 | 0.00
1 | 4 | 85.6% | 0.039 | 69.33 | 1.16
1 | 5 | 89.0% | 0.019 | 1345.33 | 17.79
2 | 3 | 67.8% | 0.060 | <1 | 0.58
2 | 4 | 80.5% | 0.040 | 3.33 | 0.58
2 | 5 | 83.6% | 0.036 | 90.67 | 0.58
2 | 6 | 84.7% | 0.008 | 1673.00 | 51.18
3 | 4 | 65.2% | 0.050 | <1 | 0.58
3 | 5 | 77.5% | 0.020 | 4.67 | 0.58
3 | 6 | 80.9% | 0.023 | 113.00 | 1.00
3 | 7 | 85.5% | 0.022 | 1962.33 | 15.54
4 | 5 | 67.5% | 0.010 | <1 | 0.58
4 | 6 | 75.9% | 0.031 | 6.00 | 0.00
4 | 7 | 83.4% | 0.019 | 141.67 | 0.58
4 | 8 | 85.5% | 0.009 | 2279.00 | 6.56
5 | 6 | 73.0% | 0.017 | 1.00 | 0.00
5 | 7 | 83.4% | 0.019 | 7.00 | 0.00
5 | 8 | 85.5% | 0.009 | 164.00 | 0.00
6 | 7 | 72.9% | 0.016 | <1 | 0.58
6 | 8 | 83.1% | 0.009 | 9.00 | 0.00
7 | 8 | 75.7% | 0.082 | <1 | 0.57

In the deterministic search, there are two parameters: the greedy search depth and the total search depth,

Effect of search depth on classification accuracy, (a) and (b).

Effect of DFS depth on classification accuracy

Effect of greedy search depth on classification accuracy

In the first, the direct relationship between classification accuracy and DFS depth

However, the greedy search portion, which is reflected in the second graph, is not as effective. Although not as definitive, the trend in the classification accuracy as

The solutions with the best fitness are generated for the parameter values

The Bhattacharyya coefficient (BC) is computed for each training set in the Bos Wars data before beginning the deterministic search. In Figure

The BC for the features in the Bos Wars training sets.

As a heuristic for the greedy search portion of the deterministic algorithm, the BC is ineffective. In almost all cases, adding more levels to the greedy search decreased performance. However, using the BC to pair features with centers is effective: using these triples, the deterministic search is able to attain accuracies over

To fine-tune the simulated annealing stochastic algorithm, the effects of various parameters on solution fitness are explored. Figure

Effect of different parameter settings on overall classification accuracy and search time for Simulated Annealing on the Bos Wars data set.

Both the number of features in a solution and the cooling parameter have a direct relationship with both classification accuracy and search time. For alpha values, the relationship appears to be linear. An increase of 0.1 in

The number of features in a solution has a large impact on fitness at the low ends, but less at the high ends. Again, two-sample

The

The detailed analysis of the effect of the parameter values leads to a selection of the best values for the Bos Wars data set. In this case, those values are

To choose whether to develop a deterministic or stochastic algorithm, we must compare the solutions found by each. For each algorithm, the best performing search parameters are determined. In the deterministic algorithm, these parameters are

Instead of comparing the results of the algorithms on the data sets already observed, they are tested on a different Bos Wars data set on which neither was allowed to train. The deterministic algorithm uses the three different solutions developed for the parameter settings. Each solution is the result of a DFS on a different fold of the Bos Wars training set. The stochastic algorithm is run fifty times on each fold, so there are 150 different solutions for the best parameter set. All these solutions are tested on the novel data set.

The classification accuracy, along with the time which is required to generate each solution from the training data, is presented in Table

Results for the solutions found with the best parameters by the deterministic and stochastic algorithms.

Algorithm | Accuracy | Search time
---|---|---
Deterministic | | 1013.0
Stochastic | | 75.0

The results are unequivocal: the stochastic algorithm outperforms the deterministic algorithm on both performance metrics. In the RTSPP domain, a near-optimal solution to the original problem is better than an optimal solution to the reduced-dimension problem.

The simulated annealing solution gives good performance on this data set. However, simulated annealing is a simple stochastic search algorithm which was chosen for the ease with which it could be implemented. It would be more complicated to refine or tune the algorithm for a specific RTSPP search landscape.

On the other hand, the SA solution exposes information about the problem domain. Figure

The frequency of each feature in the 150 SA solutions evaluated on the Bos Wars Test Set.

We conclude there is

The failure of the BC metric to generate good classification accuracies for the deterministic solution indicates that the features are dependent. Features work in combinations to determine the outcome of an RTS game.

This study was conducted to determine the characteristics of the RTSPP. While the stochastic search method was able to find good classification accuracies, that was not our main objective; instead, we used the results to determine the characteristics of the space, which allows us to develop a search algorithm tailored to our specific RTSPP problem.

The deterministic search relies on a heuristic. In many searches, a heuristic is used to guide the search in profitable directions. If admissible, it can also be used to implicitly search much of the domain, using a best-first search strategy like

In the RTSPP, we do not have that luxury. No admissible heuristic could be found. Instead, we used a heuristic to reduce the size of the solution space. Our hope was the heuristic would preserve the high fitness solutions in the space, while discarding the lower fitness solutions. For example, if the entire problem domain looked as in Figure

Our results show this does not work for the RTSPP. The stochastic algorithm is allowed to search the entire space. Even though it is only able to explore a small portion of solutions on each run, it finds solutions superior to those from the deterministic solution. Instead of the ideal reduced solution space, we have found a space looking more like Figure

The stochastic search results tell us the solution space is quite jagged and rough. However, it also tells us the fitness of the solution at the top of each

Our goal is to use the understanding of the solution space characteristics determined in this study and develop a more complicated RTSPP algorithm. This innovative generic RTSPP method would employ a hybrid genetic algorithm/evolutionary strategy [

Specific to the RTS game domain, Bakkes et al. [

This investigation is a research effort of the AFIT Center for Cyberspace Research (CCR), Director: Dr. Rick Raines.