Application of a Theorem in Stochastic Models of Elections

Previous empirical research has developed stochastic electoral models for Israel, Turkey, and other polities. The work suggests that convergence to an electoral center, often predicted by electoral models, is a nongeneric phenomenon. In an attempt to explain nonconvergence, a formal model based on intrinsic valence is presented. This theory shows that there are necessary and sufficient conditions for convergence. The necessary condition is that a convergence coefficient $c$ is bounded above by the dimension $w$ of the policy space, while a sufficient condition is that the coefficient is bounded above by 1. This coefficient is defined in terms of the difference in exogenous valences, the "spatial coefficient", and the electoral variance. The theoretical model is then applied to empirical analyses of elections in the United States and Britain. These empirical models include sociodemographic valence and electoral perceptions of character traits. It is shown that the model implies convergence to positions close to the electoral origin. To explain party divergence, the model is then extended to incorporate activist valences. This extension gives a first-order balance condition that allows the party to calculate the optimal marginal condition to maximize vote share. We argue that the equilibrium positions of presidential candidates in US elections and of party leaders in British elections are principally due to the influence of activists, rather than the centripetal effect of the electorate.


Introduction
Electoral models based on the work of Hotelling [1] and Downs [2] suggest that parties will converge to an electoral center (at the electoral median) when the policy space has a single dimension. Although a pure strategy Nash equilibrium generically fails to exist in competition between two agents under majority rule in high enough dimension, there will exist mixed-strategy equilibria whose support is located near the electoral center [3]. However, previous empirical research has developed stochastic electoral models for Argentina, Israel, Russia, Turkey, and other polities [4-10], and has suggested that divergence from the electoral center is a generic property of electoral systems.
This paper presents the formal stochastic model based on electoral valence to explain nonconvergence of candidates in the 2008 elections in the United States, and in an earlier election in Britain. The key idea is that the convergence result need not hold if there is an asymmetry in the electoral perception of the "quality" of party leaders [11, 12]. The average weight given to the perceived quality of the leader of the $j$th party is called the party's intrinsic (or exogenous) valence. In empirical models, a party's valence is assumed to be independent of the party's position, and adds to the statistical significance of the model. It is obtained from the intercept of the empirical model, and reflects a common perception of the quality of the candidate or party leader. In general, intrinsic valence reflects the overall degree to which the party or candidate is perceived to be able to govern effectively [13, 14].
We assume here that, in addition to intrinsic valence, there are three further kinds of valence. The first kind is a sociodemographic valence. Empirical models show that different subgroups in the electorate respond to leaders or candidates in different ways. These sociodemographic valences reflect the fact that particular party leaders have established specific political relationships with various political groups that are, at least in the short run, independent of the party's position.
The second type of valence is individual-specific, and is defined by individual perception of the character traits of the candidates or party leaders.
The third kind of valence is called activist (or endogenous) valence. When party $j$ adopts a policy position $z_j$ in the policy space $X$, the activist valence of the party is denoted as $\mu_j(z_j)$. Implicitly we adopt a model originally due to Aldrich [15]. In this model, activists provide crucial resources of time and money to their chosen party, and these resources are dependent on the party position. For convenience, it is assumed that $\mu_j(z_j)$ is only dependent on $z_j$, and not on $z_k$, $k \neq j$, but this is not a crucial assumption. The party then uses these resources to enhance its image before the electorate, thus affecting its overall valence. Although activist valence is affected by party position, it does not operate in the usual way by influencing voter choice through the distance between a voter's preferred policy position, say $x_i$, and the party position. Rather, as party $j$'s activist support, $\mu_j(z_j)$, increases due to increased contributions to the party, in contrast to the support $\mu_k(z_k)$ received by party $k$, all voters in the model become more likely to support party $j$ over party $k$.
However, activists are likely to be more extreme than the typical voter. By choosing a policy position to maximize activist support, the party will lose centrist voters. The party must therefore determine the "optimal marginal condition" to maximize vote share. The first result presented here gives this as a first-order balance condition. Moreover, because activist support is denominated in terms of time and money, it is reasonable to suppose that the activist function will exhibit decreasing returns. We point out that when these activist functions are sufficiently concave, the model will exhibit a Nash equilibrium, where each party or political candidate adopts a position that maximizes its vote, in response to the positions adopted by the other agents.
This stochastic model is also applied to the case with intrinsic valence alone. For this model, it can be shown that the joint electoral origin satisfies the first-order condition for a Nash equilibrium. Because the vote share functions are differentiable, we make use of calculus techniques, and therefore use the notion of local Nash equilibrium (LNE). To determine whether the origin is an LNE, it is necessary to examine the Hessian of the vote share function of the political agent with lowest intrinsic valence. We thus obtain the necessary and sufficient conditions for the validity of the mean voter theorem that all agents should converge to the electoral origin. The second result gives these conditions in terms of a "convergence coefficient" incorporating all the parameters of the intrinsic valence model. This coefficient, $c$, involves the differences in the intrinsic valences of the agents, and the "spatial coefficient" $\beta$. When the policy space, $X$, is assumed to be of dimension $w$, the necessary condition for existence of a Nash equilibrium when all agents are located at the electoral origin is that the coefficient $c$ is bounded above by $w$. When the necessary condition fails, agents, in equilibrium, will adopt divergent positions.
In the next section we briefly sketch the nature of the local Nash equilibria in political games involving these different types of electoral valences. We focus on candidates in the 2008 US Presidential election, in order to illustrate our results. The formal models are presented in Section 3. There we formally introduce the notion of a local Nash equilibrium, and then show that the unique Nash equilibrium in the presidential campaign of 2008 should be where both candidates adopted positions very close to the electoral origin. Since the candidates did not adopt such convergent positions, we can estimate the effect of activists in this election. We then follow up with a brief analysis of the 1979 general election in Britain, and show again that the empirical model indicated convergence. Again, nonconvergence of the parties allows us to estimate the effect of activists. In the conclusion, we offer some general remarks about Madison's argument about the "probability of a fit choice."

Activist Support for the Parties
The main result of this paper can be applied to analysis of the equilibrium candidate positions $(z^*_{dem}, z^*_{rep})$ in a two-candidate game of vote maximization in a US election. It is shown here that the first-order condition is given by a balance equation. This means that, for each party $j = dem$ or $rep$, there is a weighted electoral mean for party $j$, given by the expression

$$z_j^{el} = \sum_i \varpi_{ij} x_i, \qquad (2.1)$$

which is determined by the set of voter preferred points $\{x_i\}$. Notice that the coefficients $\{\varpi_{ij}\}$ for candidate $j$ will depend on the position of the other candidate, $k$. Define the centripetal marginal electoral pull for candidate $j$, at $z_j$, by

$$\frac{dE_j^*}{dz_j}\bigg|_{z_j} = \big[z_j^{el} - z_j\big]. \qquad (2.2)$$

The influence of activists on candidate $j$ is given by the marginal activist pull for party $j$,

$$\left[\frac{d\mu_j}{dz_j}\bigg|_{z_j}\right]. \qquad (2.3)$$

The first-order balance equation for equilibrium is that the position $z_j^*$ for each $j$ must satisfy the gradient equation

$$\big[z_j^{el} - z_j^*\big] + \frac{1}{2\beta}\left[\frac{d\mu_j}{dz_j}\bigg|_{z_j^*}\right] = 0.$$

The locus of points satisfying this equation is called the balance locus for the party. To illustrate this model, consider Figure 1, which illustrates elections in the US. Our empirical analysis indicates that there are two dimensions, economic and social. Consider initial positions $R$ and $D$, on either side of and approximately equidistant from the origin, as in the figure. Both social conservative activists, represented by $C$, and social liberal activists, represented by $S$, would be indifferent between both parties. A Democratic candidate, by moving to position $D^*$, will benefit from activist support of the social liberals, but will lose some support from the economic liberal activists at $L$. The "contract curve" between the two activist groups, centered at $L$ and $S$, represents the set of conflicting interests or "bargains" that can be made between these two groups over the policy to be followed by the candidate. In the figure, the indifference curves of the activist groups are shown to be eccentric, with economic activists much less concerned about social policy, and social activists less concerned about economic policy. Under this assumption, it can be shown that this contract curve is a catenary whose curvature is determined by the "eccentricities" of the utility functions of the activist groups. We therefore call this contract curve the Democratic activist catenary. The balance locus for the party is obtained by shifting the appropriate activist catenary towards the weighted electoral mean of the party.

The marginal activist pull for party $j$ at a position $z_j$ is a gradient vector, $d\mu_j/dz_j|_{z_j}$, which represents the marginal effect of the activist groups on the party's valence. The gradient term $dE_j^*/dz_j|_{z_j}$ is the marginal electoral pull of party $j$ at $z_j$, and this pull is zero at $z_j^* = z_j^{el}$. Otherwise, it is a vector pointing towards $z_j^{el}$. To illustrate, the pair of positions $(D^*, R^*)$ in Figure 1 are equilibrium candidate positions that maximize each candidate's vote share.

The positioning of $R^*$ in the lower right electoral quadrant in Figure 1 and of $D^*$ in the upper left quadrant is meant to indicate the realignment that has occurred since the election victory of Kennedy over Nixon in 1960. By 1964 Lyndon Johnson had moved away from a typical New Deal Democratic position, $L$, to a position comparable to $D^*$. The long-term effect of this transformation was that by 2000 most of the southern states had become dominated by the Republican party. Empirical analysis of the 1964 election suggests that the intrinsic valence of Johnson was greater than that of Goldwater (see [8, 16]). According to the activist model, this implies that Goldwater's dependence on activist support was greater than Johnson's. This is reflected in Figure 1, where the balance locus for Goldwater is shown to be further from the electoral origin than the balance locus for Johnson. From this we can infer the influence of activists on the two candidates, thus providing an explanation why socially conservative activists responded so vigorously to the new Republican position adopted by Goldwater, and came to dominate the Republican primaries in support of his proposed policies. These characteristics of the balance solution appear to provide an explanation for Johnson's electoral landslide in 1964.

International Journal of Mathematics and Mathematical Sciences
In this paper we shall apply the electoral model to account for the positions of Obama and McCain in the 2008 presidential election in the context of an electoral distribution obtained from the American National Election Study (ANES). Figure 2 shows the estimated voter distribution together with these estimated candidate positions.
We first present the formal stochastic model and then give the empirical analysis of this election.

The Formal Stochastic Model
Details of the spatial stochastic electoral models are published in [17, 18]. This model is an extension of the standard multiparty stochastic model [19], modified by inducing asymmetries in terms of valence.
We define a stochastic electoral model, which utilizes sociodemographic variables and voter perceptions of character traits. For this model we assume that the utility of voter $i$ for agent $j$ is given by the expression

$$u_{ij}(x_i, z_j) = \lambda_j + \mu_j(z_j) + (\theta_j \cdot \eta_i) + (\alpha_j \cdot \tau_i) - \beta\|x_i - z_j\|^2 + \varepsilon_j = u^*_{ij}(x_i, z_j) + \varepsilon_j.$$

Here $u^*_{ij}(x_i, z_j)$ is the observable component of utility, while $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)$ is the intrinsic valence vector, which we assume satisfies the ranking condition $\lambda_p \geq \lambda_{p-1} \geq \cdots \geq \lambda_2 \geq \lambda_1$. The political agents (who may be presidential candidates, in US elections, or party leaders, as in British elections) are denoted as $P = \{1, \ldots, p\}$. The points $\{x_i : i \in N\}$ are the preferred policies, in a space $X$, of the voters and $z = \{z_j : j \in P\}$ are the positions, in $X$, of the agents. The term $\|x_i - z_j\|$ is simply the Euclidean distance between $x_i$ and $z_j$. The error vector $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_j, \ldots, \varepsilon_p)$ is distributed by the type I extreme value distribution, as assumed in empirical conditional logit estimation. In empirical models, the valence vector $\lambda$ is given by the intercept term for each agent in the model. The symbol $\theta$ denotes a set of $k$-vectors $\{\theta_j : j \in P\}$ representing the effect of the $k$ different sociodemographic parameters (class, domicile, education, income, religious orientation, etc.) on voting for agent $j$, while $\eta_i$ is a $k$-vector denoting the $i$th individual's relevant "sociodemographic" characteristics. The compositions $\{(\theta_j \cdot \eta_i)\}$ are scalar products, called the sociodemographic valences for $j$.
The terms $\{(\alpha_j \cdot \tau_i)\}$ are scalars giving voter $i$'s perceptions and beliefs. These can include perceptions of the character traits of agent $j$, or beliefs about the state of the economy, and so forth. We let $\alpha = (\alpha_p, \ldots, \alpha_1)$. A trait score can be obtained by factor analysis from a set of survey questions asking respondents about the traits of the agent, including "moral", "caring", "knowledgeable", "strong", "honest", "intelligent", and so forth. The perception of traits can be augmented with voter perception of the state of the economy, in order to examine how anticipated changes in the economy affect each agent's electoral support.
The terms $\{\mu_j : j \in P\}$ are the activist valence functions. The full model including activists is denoted as $M(\lambda, \mu, \theta, \alpha, \beta)$.
Partial models are: (i) pure sociodemographic, denoted as $M(\lambda, \theta)$, with only intrinsic valence and sociodemographic variables; (ii) pure spatial, denoted as $M(\lambda, \beta)$, with only intrinsic valence and $\beta$; (iii) joint spatial, denoted as $M(\lambda, \theta, \beta)$, with intrinsic valence, sociodemographic variables and $\beta$; (iv) joint spatial model with traits, denoted as $M(\lambda, \theta, \alpha, \beta)$, without the activist components.
In all models, the probability that voter $i$ chooses agent $j$, when agent positions are given by $z$, is

$$\rho_{ij}(z) = \Pr\big[u_{ij}(x_i, z_j) > u_{il}(x_i, z_l) \text{ for all } l \neq j\big].$$

A strict local Nash equilibrium (LNE) for a model $M$ is a vector, $z$, such that each agent, $j$, chooses $z_j$ to locally strictly maximize the expected vote share

$$V_j(z) = \frac{1}{n}\sum_{i \in N}\rho_{ij}(z),$$

holding the positions of the other agents fixed.

In these models, political agents cannot know precisely how each voter will choose at the vector $z$. The stochastic component as described by the vector $\varepsilon$ is one way of modeling the degree of risk or uncertainty in the agents' calculations. Implicitly we assume that they can use polling information and the like to obtain an approximation to this stochastic model in a neighborhood of the initial candidate locations. For this reason we focus on LNE. Halpern [20] gives some objections to the concept of Nash equilibrium, in terms of computability and the knowledge requirement of agents, and this provides some basis for our use of LNE. In the empirical work presented below, we find that LNE and PNE coincide. Note, however, that as agents adjust position in response to information in search of equilibrium, the empirical model may become increasingly inaccurate.
A strict Nash equilibrium (PNE) for a model $M$ is a vector $z$ such that each agent, $j$, chooses $z_j$ to globally strictly maximize $V_j(z)$. Obviously if $z$ is not an LNE then it cannot be a PNE.
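The choice probabilities and expected vote shares defined above can be sketched numerically. The following is a minimal illustration for the pure spatial model only, assuming observable utility of the form valence minus $\beta$ times squared distance; the function names and the four voter ideal points are hypothetical and are not taken from the ANES sample.

```python
import numpy as np


def choice_probabilities(voters, positions, valences, beta):
    """Return rho[i, j], the logit probability that voter i picks agent j.

    Observable utility (pure spatial model, an assumption of this sketch):
        u*_ij = valence_j - beta * ||x_i - z_j||^2
    """
    voters = np.asarray(voters, dtype=float)        # shape (n, w)
    positions = np.asarray(positions, dtype=float)  # shape (p, w)
    valences = np.asarray(valences, dtype=float)    # shape (p,)
    diff = voters[:, None, :] - positions[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)             # shape (n, p)
    utility = valences[None, :] - beta * sq_dist
    # Subtract the row maximum before exponentiating, for numerical stability.
    utility = utility - utility.max(axis=1, keepdims=True)
    expu = np.exp(utility)
    return expu / expu.sum(axis=1, keepdims=True)


def vote_shares(voters, positions, valences, beta):
    """Expected vote share V_j(z): the average of rho[i, j] over voters."""
    return choice_probabilities(voters, positions, valences, beta).mean(axis=0)


# Hypothetical ideal points (not the survey data).
voters = np.array([[-1.0, 0.5], [0.0, 0.0], [1.0, -0.5], [0.5, 1.0]])
valences = np.array([-0.84, 0.0])   # agent 0 has the lower valence
beta = 0.85

same_place = np.zeros((2, 2))                   # both agents at the origin
apart = np.array([[1.0, 0.0], [-1.0, 0.0]])     # agents on opposite sides

shares_same = vote_shares(voters, same_place, valences, beta)
shares_apart = vote_shares(voters, apart, valences, beta)
print(shares_same, shares_apart)
```

When both agents occupy the same position the distance terms cancel, so the shares depend only on the valence difference; moving the agents apart makes the shares depend on where the voters are.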
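It follows from [21] that, for the model $M(\lambda, \mu, \theta, \alpha, \beta)$, the probability, $\rho_{ij}(z)$, that voter $i$, with ideal point, $x_i$, picks $j$ at the vector, $z$, of agent positions is given by

$$\rho_{ij}(z) = \Big[1 + \sum_{k \neq j}\exp(f_{jk})\Big]^{-1}, \quad \text{where } f_{jk} = u^*_{ik}(x_i, z_k) - u^*_{ij}(x_i, z_j).$$

Thus

$$\frac{d\rho_{ij}(z)}{dz_j} = \Big\{2\beta(x_i - z_j) + \frac{d\mu_j}{dz_j}\bigg|_{z_j}\Big\}\big[\rho_{ij} - \rho_{ij}^2\big]. \qquad (3.8)$$

We use this gradient equation in the form of MATLAB algorithms, given in Appendices A and B, to obtain the LNE. This equation shows that the first-order condition for $z^*$ to be an LNE is given by

$$0 = \frac{dV_j(z)}{dz_j} = \frac{1}{n}\sum_{i \in N}\big[\rho_{ij} - \rho_{ij}^2\big]\Big\{2\beta(x_i - z_j) + \frac{d\mu_j}{dz_j}\bigg|_{z_j}\Big\}. \qquad (3.9)$$

Hence

$$z_j^* = \frac{\sum_{i}\big[\rho_{ij} - \rho_{ij}^2\big]x_i}{\sum_{k \in N}\big[\rho_{kj} - \rho_{kj}^2\big]} + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}\bigg|_{z_j}.$$

This can be written as

$$z_j^* = z_j^{el} + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}\bigg|_{z_j}, \qquad (3.11)$$

where

$$z_j^{el} = \sum_{i \in N}\varpi_{ij}x_i, \qquad \varpi_{ij} = \frac{\rho_{ij} - \rho_{ij}^2}{\sum_{k \in N}\big[\rho_{kj} - \rho_{kj}^2\big]}. \qquad (3.12)$$

Here $z_j^{el}$ is the weighted electoral mean of agent $j$. Because this model is linear, it is possible to modify these weights to take account of the differential importance of voters in different constituencies. For example, presidential candidates may attempt to maximize total electoral votes, so voters can be weighted by the relative electoral college seats of the state they reside in. We can therefore write the first-order balance condition at an equilibrium, $z^* = (z_1^*, \ldots, z_j^*, \ldots, z_p^*)$, as a set of gradient balance conditions

$$\big[z_j^{el} - z_j^*\big] + \frac{1}{2\beta}\frac{d\mu_j}{dz_j}\bigg|_{z_j^*} = 0. \qquad (3.13)$$

The first term in this equation is the centripetal marginal electoral pull for agent $j$, defined at $z_j$ by

$$\frac{dE_j^*}{dz_j}\bigg|_{z_j} = \big[z_j^{el} - z_j\big]. \qquad (3.14)$$

The second gradient term, $d\mu_j/dz_j|_{z_j}$, is the centrifugal marginal activist pull for $j$, at $z_j$.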
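As a rough numerical check of the balance condition, the sketch below computes weights proportional to $\rho_{ij}(1 - \rho_{ij})$, the resulting weighted electoral mean, and the residual of the balance equation for a given activist gradient. It assumes the pure spatial utility; the helper names, the five voter points, and the activist gradient $(0.2, 0.0)$ are hypothetical choices made only for illustration.

```python
import numpy as np


def probabilities_for(j, voters, positions, valences, beta):
    """rho_ij for agent j under the pure spatial logit model (assumed form)."""
    sq_dist = ((voters[:, None, :] - positions[None, :, :]) ** 2).sum(axis=2)
    utility = valences[None, :] - beta * sq_dist
    utility = utility - utility.max(axis=1, keepdims=True)
    expu = np.exp(utility)
    return expu[:, j] / expu.sum(axis=1)


def weighted_electoral_mean(j, voters, positions, valences, beta):
    """Weights proportional to rho_ij * (1 - rho_ij), and the weighted mean."""
    rho = probabilities_for(j, voters, positions, valences, beta)
    raw = rho * (1.0 - rho)
    weights = raw / raw.sum()
    return weights @ voters, weights


def balance_residual(j, voters, positions, valences, beta, activist_gradient):
    """Electoral pull plus scaled activist pull; zero at a balance solution."""
    z_el, _ = weighted_electoral_mean(j, voters, positions, valences, beta)
    electoral_pull = z_el - positions[j]
    return electoral_pull + np.asarray(activist_gradient, dtype=float) / (2.0 * beta)


# Hypothetical, mean-centered voter ideal points.
voters = np.array([[-1.0, 0.5], [0.0, 0.0], [1.0, -0.5], [0.5, 1.0], [-0.5, -1.0]])
valences = np.array([-0.84, 0.0])
beta = 0.85
positions = np.zeros((2, 2))  # both agents at the electoral origin

z_el, weights = weighted_electoral_mean(0, voters, positions, valences, beta)
residual_no_activists = balance_residual(0, voters, positions, valences, beta, [0.0, 0.0])
residual_with_activists = balance_residual(0, voters, positions, valences, beta, [0.2, 0.0])
print(z_el, weights, residual_with_activists)
```

With every agent at the origin the weights are uniform, so the weighted electoral mean coincides with the ordinary mean. A nonzero activist gradient leaves a nonzero residual there, which is the sense in which activists pull the agent away from the origin.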
To determine the LNE for the model $M(\lambda, \mu, \theta, \alpha, \beta)$, it is of course necessary to consider the Hessians $d^2V_j(z)/dz_j^2$. These will involve the second-order terms $d^2\mu_j/dz_j^2$. In the next section, we suggest that there will be natural conditions under which these will be negative definite. Indeed if the eigenvalues are negative and of sufficiently large modulus, then we may expect the existence of PNE.
For the pure spatial model, $M(\lambda, \beta)$, it is clear that when the agents adopt the same positions then $\rho_{kj} = \rho_j$ is independent of the voter suffix, $k$. Thus all weights satisfy $\varpi_{ij} = 1/n$, so the first-order condition for an LNE places each agent at the electoral mean. By a change of coordinates, it follows that $z_0 = (0, \ldots, 0)$ is a candidate for an LNE. Note however that this argument does not follow for the model $M(\lambda, \theta, \alpha, \beta)$, and generically $z^{el} = (z_1^{el}, z_2^{el}, \ldots, z_p^{el}) \neq (0, \ldots, 0)$. Since the valence functions are constant in the model $M(\lambda, \theta, \alpha, \beta)$, the marginal effects, $d\mu_j/dz_j$, will be zero. However, since the weights in the weighted electoral mean for each agent will vary from one individual to another, it is necessary to simulate the model to determine the LNE $z^{el} = (z_1^{el}, z_2^{el}, \ldots, z_p^{el})$. Notice also that the marginal vote effect, $d\rho_{ij}/dz_j$, for a voter with $\rho_{ij}(z) \simeq 1$ will be close to zero. Thus in searching for LNE, each agent will seek voters with $\rho_{ij}(z) < 1$.
The necessary and sufficient second-order condition for LNE at $z_0$ in the pure spatial model, $M(\lambda, \beta)$, is determined as follows. When all agents are at the electoral origin, and agent 1 is, by definition, the lowest valence agent, then the probability that a generic voter picks agent 1 is given by

$$\rho_1 = \Big[1 + \sum_{k \neq 1}\exp(\lambda_k - \lambda_1)\Big]^{-1}.$$

To compute the Hessian of agent 1, we proceed as follows:

$$\frac{d^2\rho_{i1}}{dz_1^2} = \big[\rho_{i1} - \rho_{i1}^2\big]\Big\{4\beta^2\big[1 - 2\rho_{i1}\big](x_i - z_1)^{T}(x_i - z_1) - 2\beta I\Big\}. \qquad (3.16)$$

Here $I$ is the $w$ by $w$ identity matrix, and we use $T$ to denote a column vector, so that $(x_i - z_1)^{T}(x_i - z_1)$ is a $w$ by $w$ matrix. When all agents are at the same position, then $\rho_{i1} = \rho_1$ is independent of $i$. Moreover,

$$\nabla_0 = \frac{1}{n}\sum_{i \in N}x_i^{T}x_i$$

is the $w$ by $w$ covariance matrix of the distribution of voter ideal points, taken about the electoral origin. Thus the Hessian of the vote share function of agent 1 at $z_0 = (0, \ldots, 0)$ is given by

$$\frac{1}{n}\sum_{i \in N}\frac{d^2\rho_{i1}}{dz_1^2} = 2\beta\big[\rho_1 - \rho_1^2\big]\Big\{2\beta\big[1 - 2\rho_1\big]\nabla_0 - I\Big\}. \qquad (3.18)$$

Since $\rho_1 - \rho_1^2 > 0$ and $\beta > 0$, this Hessian can be identified with the $w$ by $w$ characteristic matrix for agent 1, given by

$$C_1 = 2\beta(1 - 2\rho_1)\nabla_0 - I.$$

Then the necessary and sufficient second-order condition for LNE at $z_0$ is that $C_1$ has negative eigenvalues. For convenience we focus on a strict local equilibrium associated with negative eigenvalues of the Hessian.
It follows from this that a necessary condition for $z_0 = (0, \ldots, 0)$ to be an LNE is that the trace of the matrix $C_1$ is strictly negative. For a weak LNE we require the trace to be nonpositive. In turn this means that a convergence coefficient, $c$, defined by

$$c = 2\beta(1 - 2\rho_1)\sigma^2,$$

satisfies the critical convergence condition, $c < w$. Here $\sigma^2 = \operatorname{trace}(\nabla_0)$ is the sum of the variance terms on all axes.
A sufficient condition for convergence to $z_0$ in the two-dimensional case is that $c < 1$.
When the necessary condition fails, then the lowest valence agent has a best response that diverges from the origin.In this case there is no guarantee of existence of a PNE.
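To make the convergence test concrete, the sketch below evaluates the lowest-valence probability, the characteristic matrix, and the convergence coefficient for a case where the necessary condition fails. It assumes the forms $C_1 = 2\beta(1 - 2\rho_1)\nabla_0 - I$ and $c = 2\beta(1 - 2\rho_1)\operatorname{trace}(\nabla_0)$; the function names and the three-agent parameter values are hypothetical and do not come from any of the estimated models.

```python
import numpy as np


def lowest_valence_probability(valences):
    """rho_1: probability of choosing the lowest-valence agent at a common position."""
    lam = np.asarray(valences, dtype=float)
    return 1.0 / np.exp(lam - lam.min()).sum()


def characteristic_matrix(valences, beta, covariance):
    """C_1 = 2 * beta * (1 - 2 * rho_1) * covariance - I."""
    cov = np.asarray(covariance, dtype=float)
    rho_1 = lowest_valence_probability(valences)
    return 2.0 * beta * (1.0 - 2.0 * rho_1) * cov - np.eye(cov.shape[0])


def convergence_coefficient(valences, beta, covariance):
    """c = 2 * beta * (1 - 2 * rho_1) * trace(covariance)."""
    cov = np.asarray(covariance, dtype=float)
    rho_1 = lowest_valence_probability(valences)
    return 2.0 * beta * (1.0 - 2.0 * rho_1) * np.trace(cov)


def convergence_report(valences, beta, covariance):
    """Summarize the necessary condition and the eigenvalue test at the origin."""
    cov = np.asarray(covariance, dtype=float)
    w = cov.shape[0]
    c = convergence_coefficient(valences, beta, cov)
    eigenvalues = np.linalg.eigvalsh(characteristic_matrix(valences, beta, cov))
    return {
        "c": float(c),
        "necessary_condition": bool(c < w),
        "origin_is_strict_lne": bool(np.all(eigenvalues < 0)),
    }


# Hypothetical three-agent, two-dimensional example with large valence gaps.
example_valences = [-0.5, 0.0, 1.0]
example_cov = np.diag([1.0, 0.6])
report = convergence_report(example_valences, beta=1.2, covariance=example_cov)
print(report)
```

In this hypothetical case $c$ exceeds the dimension $w = 2$, the characteristic matrix has positive eigenvalues, and the origin is therefore not a local maximum of the lowest-valence agent's vote share.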
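We can also consider a model $M(\lambda, \mu, \theta, \alpha, \beta)$ where we use different coefficients $\beta = (\beta_1, \ldots, \beta_w)$ on the axes, so the spatial component has the form

$$\sum_{t=1}^{w}\beta_t(x_{it} - z_{jt})^2. \qquad (3.21)$$

Then the characteristic matrix can be taken to be

$$C_1 = 2(1 - 2\rho_1)\,\beta\nabla_0\beta - \beta,$$

where $\beta$ is the diagonal matrix of the $\beta$ coefficients, while $\beta\nabla_0\beta$ is the covariance matrix where each axis is weighted by the $\beta$ coefficients $\beta_1, \beta_2, \ldots, \beta_w$. The necessary condition is thus that $\operatorname{trace}(C_1) < 0$, or

$$2(1 - 2\rho_1)\operatorname{trace}(\beta\nabla_0\beta) < \beta_1 + \beta_2 + \cdots + \beta_w.$$

Because the model is linear, we can obtain a similar result where there are multiple electoral groups, each weighting the axes differently.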
In the empirical analyses, we can use Newton's method with gradient information to compute best responses, in order to determine LNE in the various models.
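A best-response computation of this kind might look as follows. This is only a sketch for the pure spatial model, not the MATLAB code of the appendices: it takes a Newton step when the Hessian of the agent's vote share is negative definite, falls back to the gradient direction otherwise, and halves the step until the vote share does not decrease. The function names, the five voter points, and the starting position are hypothetical.

```python
import numpy as np


def share_gradient_hessian(z_j, j, voters, positions, valences, beta):
    """Vote share of agent j at z_j, with its gradient and Hessian in z_j."""
    pos = np.array(positions, dtype=float)
    pos[j] = z_j
    sq_dist = ((voters[:, None, :] - pos[None, :, :]) ** 2).sum(axis=2)
    utility = valences[None, :] - beta * sq_dist
    expu = np.exp(utility - utility.max(axis=1, keepdims=True))
    rho = expu[:, j] / expu.sum(axis=1)
    a = rho * (1.0 - rho)
    d = voters - z_j                                  # rows are x_i - z_j
    grad = 2.0 * beta * (a[:, None] * d).mean(axis=0)
    outer = d[:, :, None] * d[:, None, :]             # (n, w, w) outer products
    curv = 4.0 * beta ** 2 * (a * (1.0 - 2.0 * rho))[:, None, None] * outer
    hess = curv.mean(axis=0) - 2.0 * beta * a.mean() * np.eye(d.shape[1])
    return rho.mean(), grad, hess


def best_response(j, voters, positions, valences, beta, tol=1e-8, max_iter=50):
    """Damped Newton ascent on agent j's vote share, other positions fixed."""
    z = np.array(positions[j], dtype=float)
    for _ in range(max_iter):
        share, grad, hess = share_gradient_hessian(z, j, voters, positions, valences, beta)
        if np.linalg.norm(grad) < tol:
            break
        # Newton direction if the Hessian is negative definite, else steepest ascent.
        if np.all(np.linalg.eigvalsh(hess) < 0):
            step = -np.linalg.solve(hess, grad)
        else:
            step = grad
        t = 1.0
        # Backtrack until the vote share no longer decreases.
        while t > 1e-8 and share_gradient_hessian(
            z + t * step, j, voters, positions, valences, beta
        )[0] < share:
            t *= 0.5
        z = z + t * step
    return z


# Hypothetical, mean-centered voters; agent 0 starts away from the origin.
voters = 0.5 * np.array([[-1.0, 0.5], [0.0, 0.0], [1.0, -0.5], [0.5, 1.0], [-0.5, -1.0]])
valences = np.array([-0.84, 0.0])
positions = np.array([[0.3, -0.2], [0.0, 0.0]])
z_star = best_response(0, voters, positions, valences, beta=0.85)
print(z_star)
```

An LNE search would alternate such best responses across agents until no position changes. In this small example the electoral variance is low, so the lower-valence agent's best response to an opponent at the origin is the origin itself.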

Application to the Case with Multiple Activist Groups
We adapt the model presented by Schofield and Cataife in [4], where there are multiple activist groups for each party.
(i) For each agent, $j$, let $A_j$ be a family of potential activists, where each $k \in A_j$ is endowed with a utility function, $U_k$, which is a function of the position $z_j$. The resources allocated to $j$ by $k$ are denoted as $R_{jk}(U_k(z_j))$. The total activist valence function for agent $j$ is the linear combination

$$\mu_j(z_j) = \sum_{k \in A_j}\mu_{jk}\big(R_{jk}(U_k(z_j))\big),$$

where $\{\mu_{jk}\}$ are functions of the contributions $\{R_{jk}(U_k(z_j))\}$, and each $\mu_{jk}$ is a concave function of $R_{jk}$.

(ii) Assume that the gradients of the valence functions for $j$ are given by

$$\frac{d\mu_{jk}}{dz_j} = a_k^{*}\frac{dR_{jk}}{dz_j} = a_k^{*}a_k^{**}\frac{dU_k}{dz_j},$$

where the coefficients $\{a_k^{*}, a_k^{**}\} > 0$ are differentiable functions of $z_j$.

(iii) Under these assumptions, the first-order equation $d\mu_j/dz_j = 0$ becomes

$$\frac{d\mu_j}{dz_j} = \sum_{k \in A_j}a_k\frac{dU_k}{dz_j} = 0, \quad \text{with } a_k = a_k^{*}a_k^{**}. \qquad (3.26)$$

The Contract Curve generated by the family $A_j$ is the locus of points satisfying the gradient equation

$$\sum_{k \in A_j}a_k\frac{dU_k}{dz_j} = 0, \quad \text{where } \sum_{k \in A_j}a_k = 1 \text{ and all } a_k > 0. \qquad (3.27)$$

The Balance Locus for agent $j$, defined by the family $A_j$, is the solution to the first-order gradient equation

$$\big[z_j^{el} - z_j^*\big] + \frac{1}{2\beta}\sum_{k \in A_j}a_k\frac{dU_k}{dz_j} = 0.$$

The simplest case, discussed in [4], is in two dimensions, where each agent has two supporting activist groups. In this case, the contract curve for each agent's supporters will, generically, be a one-dimensional arc. Miller and Schofield [22] also supposed that the activist utility functions were ellipsoidal, mirroring differing saliences on the two axes. As discussed earlier, in this case the contract curves would be catenaries, and the balance locus would be a one-dimensional arc. The balance solution for each agent naturally depends on the position(s) of the opposed agent(s), and on the coefficients, as indicated above, of the various activists. The determination of the balance solution can be obtained by computing the vote share Hessian along the balance locus.

Since the activist valence function for agent $j$ depends on the resources contributed by the various activist groups to this agent, we may expect the marginal effect of these resources to exhibit diminishing returns. Thus the activist valence functions can be expected to be concave in the activist resources, so that the Hessian of the overall activist valence, $\mu_j$, can be expected to have negative eigenvalues. When the activist functions are sufficiently concave (in the sense that the Hessians have negative eigenvalues of sufficiently large modulus), we may infer not only that the LNE will exist, but that they will be PNE.
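The two-group, two-dimensional case can be traced numerically. The sketch below assumes ellipsoidal activist utilities of the form $U_k(z) = -(z - y_k)^{T}S_k(z - y_k)$ with diagonal salience matrices, and, as a simplification, treats the weighted electoral mean as fixed when solving the balance equation. The function names, the ideal points for the economic and social activist groups, and the salience values are all hypothetical.

```python
import numpy as np


def contract_curve(ideal_1, salience_1, ideal_2, salience_2, num=11):
    """Points where a * grad U_1 + (1 - a) * grad U_2 = 0 for a in [0, 1].

    Assumes ellipsoidal utilities U_k(z) = -(z - y_k)^T S_k (z - y_k),
    so grad U_k = -2 S_k (z - y_k) and each point solves a linear system.
    """
    y1, y2 = np.asarray(ideal_1, dtype=float), np.asarray(ideal_2, dtype=float)
    s1, s2 = np.asarray(salience_1, dtype=float), np.asarray(salience_2, dtype=float)
    weights = np.linspace(0.0, 1.0, num)
    points = np.array([
        np.linalg.solve(a * s1 + (1.0 - a) * s2,
                        a * (s1 @ y1) + (1.0 - a) * (s2 @ y2))
        for a in weights
    ])
    return weights, points


def balance_point(z_el, beta, a1, ideal_1, salience_1, a2, ideal_2, salience_2):
    """Solve [z_el - z] + (1 / (2 beta)) * sum_k a_k grad U_k(z) = 0 for z.

    Simplification: z_el is held fixed, although in the model it depends on z.
    """
    z_el = np.asarray(z_el, dtype=float)
    s1, s2 = np.asarray(salience_1, dtype=float), np.asarray(salience_2, dtype=float)
    y1, y2 = np.asarray(ideal_1, dtype=float), np.asarray(ideal_2, dtype=float)
    pull = (a1 * s1 + a2 * s2) / beta
    target = (a1 * (s1 @ y1) + a2 * (s2 @ y2)) / beta
    return np.linalg.solve(np.eye(len(z_el)) + pull, z_el + target)


# Hypothetical activist groups: economic liberals at L, social liberals at S.
L_econ, sal_L = np.array([-1.0, 0.0]), np.diag([4.0, 1.0])
S_social, sal_S = np.array([0.0, 1.0]), np.diag([1.0, 4.0])

weights, curve = contract_curve(L_econ, sal_L, S_social, sal_S)
_, straight = contract_curve(L_econ, np.eye(2), S_social, np.eye(2))
d_star = balance_point(np.zeros(2), 0.85, 0.3, L_econ, sal_L, 0.3, S_social, sal_S)
print(curve[5], straight[5], d_star)
```

With equal saliences the curve is the straight segment between the two ideal points; with unequal saliences it bows outward, and the balance point lies between the electoral mean and the curve.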
If we associate the utilities $\{U_k\}$ with leaders of the activist groups for the agents, then the combination

$$\sum_{k \in A_j}a_k\frac{dU_k}{dz_j} \qquad (3.29)$$

may be interpreted as the marginal utility of the candidate of party $j$, induced by the activist support.

To see this, suppose that each agent were to maximize a function combining the expected vote share $V_j(z)$ with a term $\delta\mu_j(z_j)$, where $\mu_j$ is no longer an activist function, but a policy-determined component of the agent's utility function, while $\delta$ is the weight given to the policy preference. See [23] for such a model of policy-motivated agents. Then the first-order condition is almost precisely as we obtained above, namely

$$\big[z_j^{el} - z_j^*\big] + \frac{\delta}{2\beta}\frac{d\mu_j}{dz_j}\bigg|_{z_j^*} = 0. \qquad (3.31)$$

Here $d\mu_j/dz_j|_{z_j^*}$ is a gradient pointing towards the policy preferred position of the agent. Thus we can make the identity

$$\delta\frac{d\mu_j}{dz_j}\bigg|_{z_j} = \sum_{k \in A_j}a_k\frac{dU_k}{dz_j} \qquad (3.32)$$

and infer that the agent's marginal policy preference can be identified with a combination of the marginal preferences of the party activists. In principle such a model could be used to determine optimal resource-raising strategies in an environment as complex as a presidential election.

Methodology: A Spatial Model of the 2008 Election
The 2008 American National Election Study (ANES) introduced many new questions on political issues in addition to the existing set. Assignment of respondents into the "new" or "old" set was random, with 1,059 respondents assigned to the "new" condition and having completed the follow-up post-election interview. Due to both Hispanic and African-American voter oversampling and follow-up attrition, the post-election weights are used for all analyses.
As with all survey data, there was missing data for most of the survey items used in this study (varying from 0 to 8.6% by item). We used multiple imputation to correct for missing data.
The post-election interviews asked respondents whom they voted for, if at all. Since we use a conditional logit model, which requires data for both respondents (which we have) and candidates (which we only have for the major party candidates), we deleted 7 observations where respondents claimed to have voted for a presidential candidate other than McCain or Obama. The final sample size was thus 788 respondents.
To create the two-dimensional policy space, 29 survey items were selected to broadly represent the economic and social policy dimensions of American political ideology (see Appendix C for question wording). Some issues were overrepresented amongst these items, with seven questions about abortion, four each on gay rights and on policies concerning aid for African-Americans, and two about immigration issues. To avoid the policy space measure becoming dominated by these issues, with abortion a particular concern, separate scales were estimated for each of these policy areas, either using confirmatory factor analysis or a simple average in the case of the two immigration items (see Tables 1, 2, and 3). Finally, a confirmatory factor analysis was run using these four scales in conjunction with the remaining 12 survey items. Only two factors achieved eigenvalues greater than one. Each factor corresponded closely to a priori conceptualizations of economic and social policy, with the possible exceptions of the equality and gun access items, which loaded more strongly on the economic than on the social dimension (see Table 4 for factor loadings). These factor scores were used as measures of individual locations on the policy space.
The ANES also includes questions on seven qualities or traits associated with Obama and McCain. Confirmatory factor analysis run on the 14 items produced a two-factor solution which corresponded perfectly with the named candidate. The resulting factor scores were used as estimates of voter perceptions of the candidate's personal traits (see Table 5).
Respondents were coded as activists if they claimed to have donated money to a candidate or party and nonactivists if they donated money to no candidate.Table 6 gives the descriptive data for activists and nonactivists.
The survey also gave data on whether the respondent was African-American, female, working class, from the South, as well as the number of years of education and level of income.These data were used to construct the sociodemographic models of voting.
To calculate the presidential candidate positions, we took advantage of new survey questions which asked respondents to locate the positions of Obama and McCain on seven distinct issues.
These seven questions (government spending, universal health care, citizenship for immigrants, abortion when nonfatal, abortion when gender incorrect, aid to blacks, and liberal-conservative) were otherwise worded the same as the corresponding items from the policy issue questions. We ran two linear regression models on the voter economic policy and social policy factor scores using only the seven policy items corresponding in wording to the seven candidate location items as predictors. The estimated coefficients from these two linear models enabled us to construct equations to map the data from the candidate location questions onto the complete voter policy space. These equations were able to predict the scores of the voter policy space fairly accurately. The coefficients of determination $R^2$ for the economic and social policy equations were 0.63 and 0.75, respectively. To find McCain's ideal point, we simply took the average response for each of his seven candidate location questions, entered these into the economic and social policy prediction equations, and used the corresponding predicted values. We then repeated the process using Obama's candidate location questions. See Table 7 for the estimated positions of the two candidates.
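The mapping procedure described above can be sketched on synthetic data. The following is an illustration under stated assumptions, not the authors' code: the respondent items, factor scores, and candidate placements are random stand-ins generated here, ordinary least squares via `numpy.linalg.lstsq` stands in for the linear regressions, and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_items = 300, 7

# Synthetic stand-ins for the seven self-placement items and the two factor scores.
items = rng.normal(size=(n_respondents, n_items))
econ_scores = items @ rng.normal(size=n_items) + rng.normal(scale=0.5, size=n_respondents)
social_scores = items @ rng.normal(size=n_items) + rng.normal(scale=0.5, size=n_respondents)


def fit_linear_map(items, scores):
    """Least-squares fit of a factor score on the seven items, with intercept."""
    design = np.column_stack([np.ones(len(items)), items])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    fitted = design @ coef
    r2 = 1.0 - np.sum((scores - fitted) ** 2) / np.sum((scores - scores.mean()) ** 2)
    return coef, r2


def predict_position(coef_econ, coef_social, mean_placements):
    """Map average candidate placements on the seven items into the policy space."""
    row = np.concatenate([[1.0], mean_placements])
    return np.array([row @ coef_econ, row @ coef_social])


coef_econ, r2_econ = fit_linear_map(items, econ_scores)
coef_social, r2_social = fit_linear_map(items, social_scores)

# Hypothetical respondent placements of one candidate on the same seven items.
candidate_placements = rng.normal(loc=0.3, size=(n_respondents, n_items))
candidate_position = predict_position(
    coef_econ, coef_social, candidate_placements.mean(axis=0)
)
print(r2_econ, r2_social, candidate_position)
```

The same two fitted equations are reused for each candidate, so the estimated candidate positions lie in the same coordinates as the voter factor scores.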
Figure 2 previously gave a plot of the voter distribution, while Figure 3 shows the perspective plot of the voter distribution. The plots of the activist positions are shown in Figure 4. Finally, Figure 5 gives a smoothed contour plot of the probability density function of the voter distribution. (The outer contour line is at the 0.05 level, while Democrat activists are denoted in red, and Republicans in blue.) Figure 5 also shows the estimated threshold dividing likely Democrat candidate voters from Republican candidate voters. This partisan cleavage line was derived from a binomial logit model, designed to test the effects of each policy dimension on vote choice. We call this the pure positional binomial logit model.
According to the positional model, a voter $i$, with preferred position $(x_i, y_i)$, is estimated to vote Republican with probability

Computation of Equilibria for the US 2008 Election
As above, we first assume that the utility of voter $i$ for candidate $j$ is given by the pure spatial model

$$u_{ij}(x_i, z_j) = \lambda_j - \beta\|x_i - z_j\|^2 + \varepsilon_j.$$

We assume that each candidate, $j$, chooses $z_j$ to locally strictly maximize the expected vote share

$$V_j(z) = \frac{1}{n}\sum_{i \in N}\rho_{ij}(z),$$

subject to the position(s) of the other candidates. We essentially assume therefore that candidates cannot know precisely how voters choose, but they can estimate the relationship between their own position, that of the competing candidate, and the aggregate vote total. As we shall see, the induced candidate preference correspondences are convex valued, indicating existence of Nash equilibria. The local pure strategy Nash equilibria (LNE) can be computed as follows.
The electoral covariance matrix for the sample is given by ∇ 0 0.80 −0.127 −0.127 0.83 .4.5 The principal component of the electoral distribution is given by the vector 1.0, −1.8 with variance 1.02, while the minor component is given by the orthogonal eigenvector 1.8, 1.0 with variance 0.61.All models in Table 8 are given with Obama as the base, so the results give the estimations of the probability of voting for McCain.The table also shows the loglikelihood, Akaike information criterion AIC , and Bayesian information criterion BIC for the various models.Model 1 in Table 8 shows the coefficients for the β-spatial conditional logit model in 2008 to be λ Obama , λ McCain , β 0, −0.84, 0.85 .

These parameters are estimated when the candidates are located at the estimated positions. We assume that the parameters of the model remain close to these values as we modify the candidates' positions in order to determine the equilibria of the model.

The "convergence coefficient" $c$ is then computed from these estimates. The sufficient condition for convergence to $z_0$ is that $c < 1$. Our estimate for $c$ exceeds this critical value for convergence. However, the necessary condition is satisfied, and the determinant of $C_{\text{McCain}}$ is positive, while the trace is negative. Thus both of the eigenvalues of $C_{\text{McCain}}$ are negative, and the origin is a maximum of McCain's vote share function. The best response functions of the candidates are well behaved, so the LNE is a PNE. We also considered a spatial model where the two axes had different coefficients, estimated to be $\beta_1 = 0.8$, $\beta_2 = 0.92$. Again, the determinant was found to be positive and the trace negative, so the origin is also a maximum of McCain's vote share function for this model. Simulation of these models confirmed that the joint origin was an LNE.
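The determinant and trace test can be checked numerically from the estimates quoted in the text. The sketch below is illustrative only: it assumes that the origin vote probability follows the standard logit form $\rho_j = e^{\lambda_j} / \sum_k e^{\lambda_k}$, that the characteristic matrix is $C = 2\beta(1 - 2\rho)\nabla_0 - I$, and that the convergence coefficient is $c = 2\beta(1 - 2\rho)\,\mathrm{trace}(\nabla_0)$. The last definition in particular is an assumption, since the displayed formula for $c$ is not reproduced in this section. Function and variable names are hypothetical.

```python
import math

# Estimates quoted in the text for the 2008 pure spatial model.
LAMBDA_OBAMA, LAMBDA_MCCAIN, BETA = 0.0, -0.84, 0.85
COV = [[0.80, -0.127], [-0.127, 0.83]]  # electoral covariance matrix


def origin_vote_prob(lam_j, lam_other):
    """Logit probability of choosing candidate j when both sit at the origin."""
    return math.exp(lam_j) / (math.exp(lam_j) + math.exp(lam_other))


def characteristic_matrix(beta, rho, cov):
    """Return C = 2*beta*(1 - 2*rho)*cov - I for a 2x2 covariance matrix."""
    k = 2.0 * beta * (1.0 - 2.0 * rho)
    return [[k * cov[0][0] - 1.0, k * cov[0][1]],
            [k * cov[1][0], k * cov[1][1] - 1.0]]


rho_mccain = origin_vote_prob(LAMBDA_MCCAIN, LAMBDA_OBAMA)
C = characteristic_matrix(BETA, rho_mccain, COV)
trace_C = C[0][0] + C[1][1]
det_C = C[0][0] * C[1][1] - C[0][1] * C[1][0]

# Assumed definition of the convergence coefficient (see lead-in).
c = 2.0 * BETA * (1.0 - 2.0 * rho_mccain) * (COV[0][0] + COV[1][1])

if __name__ == "__main__":
    print("rho_McCain:", round(rho_mccain, 3))
    print("trace(C):", round(trace_C, 3), "det(C):", round(det_C, 3))
    print("c:", round(c, 2))
```

For a symmetric 2x2 matrix, a positive determinant together with a negative trace means both eigenvalues are negative, which is the condition used in the text to conclude that the origin is a local maximum of McCain's vote share function.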
We now turn to the models with traits and sociodemographics. Table 8 also gives the various spatial models with these additional valences.
Comparison of the loglikelihoods for the pure spatial model and the model with traits shows that the perception of character traits is important for the statistical significance of the model. We use Bayes' factors, or differences in loglikelihoods, as a measure of statistical difference between two models [24]. For example, the spatial model with traits has a very large Bayes' factor of 114 over the pure traits model, while the spatial model with traits and sociodemographics has a Bayes' factor of 150 over the traits model.
Like the pure spatial model, the induced preference correspondences in the joint model with sociodemographic valences are all convex valued, indicating existence of a PNE. Simulation of the spatial model with sociodemographic valences showed that the PNE was one where both candidates adopt the origin. Although the sociodemographic valences add significance to the model, they do not affect the equilibrium positions. On the other hand, simulation of the full model with traits showed that the PNE was one where the candidates adopted the positions $z_{\text{Obama}} = (0.10, -0.07)$ and $z_{\text{McCain}} = (0.13, -0.12)$. This equilibrium is only a slight perturbation from the joint origin. We can infer that though the traits add to the statistical significance of the stochastic model, they do not significantly affect the equilibrium. Figures 6, 7, and 8 show the relationship of the perception of Obama and McCain traits. Figure 6 shows there is a slight negative correlation between these perceptions, while Figures 7 and 8 suggest that there are correlations between perceptions of candidate traits and vote choice. These weak correlations have only a slight effect on the strong convergence induced by the electoral pull.

We can therefore write $z^{el} = (z^{el}_{\text{Obama}}, z^{el}_{\text{McCain}}) = ((0.10, -0.07), (0.13, -0.12))$, since the joint model with traits has no activist valence terms. The argument of Section 3 implies that $z^{el}$ can be interpreted as the vector of "weighted electoral means" in a full model with activists. Assuming that the estimated candidate positions, $z^*$, are in equilibrium with respect to the activist model, then by the balance condition, as given above, we obtain the pair of direction gradients, induced by activist preferences, acting on the two candidates. The difference between $z^*$ and $z^{el}$ thus provides an estimate of the activist pull on the two candidates. In this election, we estimate that activists pull the two candidates into opposed quadrants of the policy space. The estimated distributions of activist positions for the two parties, in these two opposed quadrants (as given in Figure 4), are compatible with this inference. The means of these activist positions are:

Miller and Schofield [16, 22] propose a model where activists have eccentric utility functions. If we assume that the Democrat activists tend to be more concerned with social policy and Republican activists with economic policy, then we have an explanation for the candidate shifts from the estimated equilibrium. Note in particular that the distribution of activist positions for the two parties looks very different from the distribution of voter positions. The latter is much more heavily concentrated near the electoral origin, while the former tends to be dispersed.
When the candidates are at their estimated positions, the estimated vote shares, according to the traits model, are $(V_{\text{Obama}}, V_{\text{McCain}}) = (0.68, 0.32)$. Since the actual vote shares are $(0.52, 0.48)$, it appears that the trait model may give a statistically plausible account of voter choice, but it does not provide, by itself, a good model of how candidates obtain votes. We suggest that the missing characteristic of this model of the election is due to the contributions of party activists.
Indeed, we suggest that the addition of activists to the model can account for the difference between convergent, equilibrium positions and divergent, estimated candidate positions, as obtained by Enelow and Hinich [25] and Poole and Rosenthal [26], respectively, in their various analyses of US elections.
The section on the formal model presented an extension where there are many activists for each candidate. This model suggests that the activist pulls on the two candidates will be particularly influenced by those activists who have more extreme policy preferences. This inference is corroborated by the above analysis, since it appears that the Democratic activists are more concerned with social policy, while the Republican activists are more concerned with economic policy.
Since the balance condition is obtained from a first-order gradient condition, then as shown in Section 3, we could also interpret $d\mu/dz$ as the gradient obtained from a model where candidates have policy preferences derived from utility functions $\mu_{mc}$, $\mu_{ob}$. Duggan and Fey [23] have explored such a model for the case of a deterministic vote model, and obtained symmetry conditions for equilibrium. However, in such a model of policy seeking candidates, a candidate must be willing to adopt a losing position because of strong preferences for particular policies. In the activist model presented here, candidates act as though they have policy preferences, but these are induced from activist preferences, and are compatible with vote maximizing strategies by the candidates.

Using the pure spatial model as presented in Table 10 for just three parties in Great Britain, the coefficients are

$$(\lambda_{LAB}, \lambda_{LIB}, \lambda_{CON}, \beta)_{1979} = (-0.011, -1.574, 0.0, 0.272),$$

so that

$$\rho_{LIB} = \frac{e^{0}}{e^{0} + e^{1.563} + e^{1.574}} = 0.094.$$

When all parties are located at the origin, the model suggests that the Liberals would gain just under 10% of the vote. In fact, in 1979 they gained 13.8%. The model suggests that the divergence of the two major parties from the origin allowed the Liberals to gain a further 4% of the vote. The electoral variance is 0.587 on the first (economic) axis and 0.444 on the second axis, with negligible covariance.

In the joint model, only one group-specific valence was statistically significant at the 1% level. The $\beta$-spatial coefficient was also significant at the 1% level. The loglikelihoods of the joint and pure spatial models were very similar, and the Bayes' factor of the joint model over the pure spatial model was 16.
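The origin vote share quoted for the Liberals can be reproduced from the three valence estimates. The following sketch assumes the multinomial logit form $\rho_j = e^{\lambda_j} / \sum_k e^{\lambda_k}$ when all parties sit at the electoral origin, which is algebraically the same as the ratio of exponentials given in the text. It also computes a convergence coefficient under the assumed definition $c = 2\beta(1 - 2\rho_{LIB})\sigma^2$, with $\sigma^2$ taken as the sum of the two axis variances; that definition is an assumption, since the corresponding displayed formula is not reproduced in this section. Function and variable names are hypothetical.

```python
import math

# Valence and spatial estimates quoted in the text (Britain, 1979).
VALENCES = {"LAB": -0.011, "LIB": -1.574, "CON": 0.0}
BETA = 0.272
AXIS_VARIANCES = (0.587, 0.444)  # covariance treated as negligible


def origin_vote_shares(valences):
    """Multinomial logit vote shares when every party is at the origin."""
    denom = sum(math.exp(v) for v in valences.values())
    return {party: math.exp(v) / denom for party, v in valences.items()}


shares = origin_vote_shares(VALENCES)
rho_lib = shares["LIB"]

# Assumed definition of the convergence coefficient (see lead-in).
c_britain = 2.0 * BETA * (1.0 - 2.0 * rho_lib) * sum(AXIS_VARIANCES)

if __name__ == "__main__":
    print("rho_LIB:", round(rho_lib, 3))
    print("c:", round(c_britain, 2))
```

Under these assumptions the coefficient falls below 1, which is consistent with the statement that the pure spatial model implies the joint origin is a vote share maximizing equilibrium.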

According to the joint model, the weighted electoral mean for the Labour Party should give greater weight to those voters who are manual laborers. Since these voters will tend to have preferred positions on the left of the economic dimension, we may infer that the Labour Party activists will be positioned on the left of the economic dimension. However, the simulated LNE in the joint model was found to be the joint origin.
Thus the impacts of the sociodemographic valences on the simulated equilibrium are insignificant. Although these valences are useful in modeling the voting behavior of the electorate, they appear to have little bearing on the policy positioning of the parties. If we assume that the party positions in Figure 9 are the LNE in the full activist model, then the balance condition can be applied as in the US case. As in the analysis of the United States, we find that the overall effect of the activist groups on the two major parties is to pull these parties apart. This leaves the Liberals in the center. With low valence, they only gain about 14% of the vote.

Concluding Remarks
Valence, whether intrinsic or based on electoral perceptions of character traits, is intended to model that component of voting which is determined by the judgments of the citizens. In this respect, the formal stochastic valence model provides a framework for interpreting Madison's argument in Federalist X over the nature of the choice of Chief Magistrate in the Republic. Schofield [30] has suggested that Madison's argument may well have been influenced by Condorcet's work on the so-called "Jury Theorem" [31]. However, Madison's conclusion about the "probability of a fit choice" depended on the assumption that electoral judgment would determine the political choice. The analysis presented here does indeed suggest that voters' judgments, as well as their policy preferences, strongly influence their political choice.
Condorcet's work has recently received renewed attention (McLennan [32]). This paper can be seen as a contribution to the development of a Madisonian conception of elections in representative democracies as methods of aggregation of both preferences and judgments. One inference from the work presented here does seem to belie Riker's arguments [33, 34] that there is no formal basis for populist democracy. Since voters' perceptions about candidate traits strongly influence their political decisions, the fundamental theoretical question is the manner by which these perceptions are formed. We argue that the low convergence coefficients in the majoritarian polities of the United States and Great Britain imply that the electorate is not polarized. Since candidates or party leaders do not adopt convergent positions, we can infer that democratic equilibria in these polities reflect the preferences of interest groups rather than the electorate at large.
On the other hand, empirical work on Israel [9] and Turkey [10] shows that the convergence coefficients in recent elections in these two polities are very large. The estimates are 3.98 for Israel in 1996 and 5.4 for Turkey in 2002. These estimates indicate that the polities in these countries are polarized. We can infer that parties in these polities diverge away from the electoral center, even in the absence of activism.
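The comparison of these estimates with the convergence thresholds can be made explicit. The sketch below encodes the two conditions stated in the abstract (the sufficient condition that $c$ is bounded above by 1, and the necessary condition that $c$ is bounded above by the dimension $w$ of the policy space) and applies them to the Israeli and Turkish estimates. The function name is hypothetical, the treatment of the boundary cases as `c < 1` and `c <= w` is an assumption, and so is the choice of $w = 2$ for both polities, which is not stated in this section.

```python
def convergence_status(c, w):
    """Classify a convergence coefficient c for a policy space of dimension w."""
    if c < 1:
        return "sufficient condition holds: convergence to the origin"
    if c <= w:
        return "necessary condition holds, sufficient condition fails"
    return "necessary condition fails: the origin cannot be an equilibrium"


# Estimates quoted in the text; w = 2 is an assumed dimension.
ESTIMATES = {"Israel 1996": 3.98, "Turkey 2002": 5.4}

if __name__ == "__main__":
    for election, c in ESTIMATES.items():
        print(election, "->", convergence_status(c, w=2))
```

On these assumptions both estimates exceed the dimension of the policy space, so the necessary condition fails and divergence from the electoral center is expected even without activist valence.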

Figure 1: The balance loci in the United States.

Figure 2: Distribution of voter ideal points and candidate positions.

Figure 5: Smoothed voter distribution; candidate positions and activist positions in 2008. Democrat activists are red and Republican activists are blue, together with cleavage line.

According to the model $M(\lambda, \beta)$, the probability that a voter chooses McCain, when the McCain and Obama positions are at the electoral origin, is $\rho_{\text{McCain}}$. Then $\beta(1 - 2\rho_{\text{McCain}}) = 0.85 \times 0.4 = 0.34$. The characteristic matrix (essentially the Hessian of McCain's vote function at $z_0$) is

$$C_{\text{McCain}} = 2\beta(1 - 2\rho_{\text{McCain}})\nabla_0 - I = 2 \times 0.34 \times \nabla_0 - I = 0.68\,\nabla_0 - I.$$

Figure 8: Scatterplot of Obama and McCain traits for McCain voters.

Figure 9 shows the estimated positions of the three major parties in Britain in 1979, as obtained by Quinn et al. [27], with the electoral distribution obtained from the survey data from Eurobarometer [28] and the party positions obtained from the middle level Elites Study [29]. Tables 9(a) and 9(b) give the election results for five parties in Great Britain and five in Northern Ireland.

Figure 9: Smoothed electoral distribution in Britain and party positions in 1979.
are the LNE in the full activist model, then we obtain

Table 1: Factor loadings for abortion.

Table 2: Factor loadings for gay issues.

Table 3: Factor loadings for Black issues.

Table 4: Factor loadings for economic and social policy.

Table 5: Factor loadings for candidate traits scores, 2008.

Table 7: Obama and McCain perceived positions.

Table 9(a): Election in Great Britain, 1979.

Table 10: British pure spatial model for 1979, with respect to the Conservative Party.

Table 11: British joint model in 1979, normalized with respect to the Conservative Party.
* prob < 0.05.

The pure spatial model of the 1979 election in Britain implies that the electoral joint origin is a vote share maximizing equilibrium. We next consider a joint multinomial conditional logit model, $M(\lambda, \theta, \beta)$, with the sociodemographic variables used by Quinn et al. [27]. These variables are denoted income, religion (relig), manual labor (manlab), size of town (stown), and education (educ), respectively, in Table 11. As Table 11 makes clear, only the group specific valence