Empirical analysis and agent-based modeling of Lithuanian parliamentary elections

In this contribution we analyze the parties' vote share distribution across the polling stations during the Lithuanian parliamentary elections of 1992, 2008 and 2012. We find that the distribution is rather well fitted by the Beta distribution. To reproduce this empirical observation we propose a simple multi-state agent-based model of the voting behavior. In the proposed model agents change the party they vote for either idiosyncratically or due to a linear recruitment mechanism. We use the model to reproduce the vote share distribution observed during the election of 1992. We discuss the model extensions needed to reproduce the vote share distributions observed during the other elections.


Introduction
While any individual vote is equally important in determining the outcome of an election, the probability that a single vote decides the outcome is extremely small. As the utility of casting a vote in this context seems to be small and as there is at least a minor associated cost, it seems that the rational choice would be simply not to vote.
Yet this simplistic context may be extended to provide sound reasoning for why people vote. Some argue that people vote to show support for the political system [1] or to avoid the risk of regret [2]; there might also be a social cost of abstention [3]. Some of the aforementioned works, as well as numerous earlier game theoretic approaches, such as [4][5][6], had shown promise that game theoretic voting models would soon provide a rich and sophisticated explanation of the voting behavior. Yet further research has shown that general game theoretic models of the voting behavior with a pure Nash equilibrium, and even a mixed Nash equilibrium, might be impossible except under certain specific conditions [7][8][9]. Moreover, people are rarely well informed and ideally rational, as they are neither homo economicus nor Laplace's demons [10][11][12].
The above context provides good reasoning to consider the modeling of the voting behavior from the perspective of psychology [13][14][15][16][17][18][19][20]. The main drawback of these psychologically motivated models is that they are usually rather complicated, at least when compared with game theoretic models, which makes them hard to implement and their results hard to interpret. These models also usually involve a large number of parameters, which may lead to overfitting the data or to different parameter sets providing similar results. Notably, a psychologically motivated model was recently used successfully to predict the Polish election of 2015 [17,19].
Another possible approach to the modeling of the voting behavior has its roots in statistical physics. This perspective could be neatly summarized by quoting Boltzmann's molecular chaos hypothesis [21]: The molecules are like so many individuals, having the most various states of motion, and the properties of gases only remain unaltered because the number of these molecules which on the average have a given state of motion is constant.
During the last three decades physicists have approached social and economic systems from this perspective, looking for universal laws and important statistical patterns, while proposing simple theoretical models to explain the empirical observations. This effort by numerous physicists became what is now known as sociophysics and econophysics [22][23][24][25][26][27]. The opinion dynamics, with the voting behavior as a proxy of opinion, is still one of the major topics in sociophysics [25][26][27][28][29][30].
This paper contributes to the understanding and description of the voting behavior from a couple of different points of view. First of all, Lithuania is a young democratic nation and the analysis of the Lithuanian parliamentary elections' data sets seems interesting in the context of similar analyses carried out on the data sets gathered in the mature democratic nations, such as Brazil, England, Germany, France, Finland, Norway or Switzerland [31][32][33][34][35]. In the political science and sociological literature one would find numerous previous approaches to the Lithuanian parliamentary elections, e.g., [36][37][38][39][40]. Yet most of these approaches had a quite different perspective: most of these papers discuss general electoral trends in the context of social, demographic and economic changes. For this kind of discussion highly aggregated (e.g., on a municipal district level) data sets prove to be sufficient, while in this paper we consider the data on the smallest scale available (polling station level).
Another key contribution of this paper is a simple agent-based model, which is used to explain the statistical patterns uncovered during the empirical analysis. The proposed model is built upon a two-state herding model originally proposed by Kirman in [41]. In recent years the two-state herding model was quite frequently and rather successfully applied to reproduce the statistical patterns observed in the empirical data of the financial markets [42][43][44][45][46][47][48]. In this paper we extend the two-state herding model to allow the agents to switch between more than two states. We discuss the similarity between the proposed model and the well known Voter model [49][50][51][52][53][54].
Our approach is unique in the sense that we consider reproducing the parties' vote share distribution observed in the Lithuanian parliamentary elections. In the previous literature there was only a single attempt to model, and predict, the popular vote (aggregated vote share) in the Lithuanian parliamentary elections using a regression model, see [55]. Numerous previous sociophysics papers have mainly ignored the vote share distribution, likely due to the belief that the vote share distribution reflects electoral sensitivity to the policies promoted by the parties and less the endogenous interactions between the voters (a similar argument is given in [31]). To some extent this belief is supported by game theoretic models, see [5,6]. Notably there were a couple of sociophysics papers considering two-state agent-based models of the voting behavior, e.g., voting for or against certain proposals in a referendum [29]. It is interesting to note that recently the binary models considered in [29] were used to construct a simple financial market model [56], while we start from the financial market model [47] and move towards the model of the voting behavior. While the vote share distribution was mainly ignored in the previous sociophysics papers, other statistical patterns arising during many different elections were considered for the empirical analysis and modeling: a branching process model was proposed to reproduce the vote share distribution of individual politicians nominated via open party lists [31], a network model was used to explain how people decide whether to take part in the municipal elections [34], and a diffusive model for the turnout was proposed in [33,35]. One of the more similar approaches was taken by [57], in which a generative model was proposed to reproduce the rank-size distribution of parties' vote share. Another similar approach, taken by [54], considered the vote share distribution observed in the elections of the House of Representatives in Japan. The latter approach [54] also used a mean-field Voter model to explain the empirical observations.
This paper is organized as follows. In Section 2 we discuss the Lithuanian parliamentary election system as well as carry out the empirical analysis. Next, in Section 3, we briefly introduce the two-state herding model and extend it to account for the multiple states. Afterwards, in Section 4, we apply the extended model to reproduce the statistical patterns uncovered during the empirical analysis. Finally we end the paper with a discussion (see Section 5).

Empirical analysis of the data from the Lithuanian parliamentary elections
Let us start by discussing the parliamentary voting system used in Lithuania. Lithuanian parliamentary elections are held every 4 years. During every election all of the 141 parliamentary seats are distributed using a two-tier voting system. Namely, 71 seats in the parliament are taken by elected district representatives (there are 71 electoral districts in total), while the other 70 seats are distributed according to the popular vote among the parties that received more than 5% of the popular vote. In other words, each individual voter is able to vote for a single candidate to represent his electoral district (a two-round voting system is used) and for an open party list (listing up to 5 individuals from that list). Every electoral district is further subdivided into multiple polling stations (their number varies over the years). Every eligible voter is assigned to a single polling station based on the location of their residence. The polling stations may have widely different numbers of assigned voters: some of the smallest polling stations have as few as 100 assigned voters, while the largest have up to 7000 assigned voters.
In this paper we consider only votes cast for the open party lists in each of the local polling stations. We do not analyze the ranking of the individuals on the party lists (similar analysis was carried out in, e.g., [31]), voting for the representative of an electoral district (similar data was previously considered in, e.g., [54,57]) nor turnout rates (modeling and analysis of which was previously considered in, e.g., [33,35]). We ignore votes cast in the polling stations abroad and votes cast by post. In the analysis that follows we consider only parties that were elected to the parliament (total vote share larger than 5%), while all other less successful parties were combined into a single party, which we have labeled as the "Other" party.
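The grouping of the less successful parties can be illustrated with a short sketch. It assumes a hypothetical in-memory layout (a dictionary mapping a party label to its list of per-station vote counts) and applies the 5% threshold to the votes present in that dictionary, which need not coincide exactly with the official popular vote; the function name and the example numbers are illustrative only.

```python
def combine_minor_parties(votes_by_party, threshold=0.05):
    """Merge parties at or below `threshold` of the total vote into "Other".

    votes_by_party: dict mapping party label -> list of per-station vote counts
    (a hypothetical layout; all lists must have the same length).
    """
    totals = {party: sum(counts) for party, counts in votes_by_party.items()}
    grand_total = sum(totals.values())
    n_stations = len(next(iter(votes_by_party.values())))

    combined = {}
    other = [0] * n_stations
    for party, counts in votes_by_party.items():
        if totals[party] / grand_total > threshold:
            combined[party] = list(counts)
        else:
            # Add this party's votes to "Other", station by station.
            other = [o + c for o, c in zip(other, counts)]
    combined["Other"] = other
    return combined


# Illustrative (made-up) counts for two polling stations.
example = {"A": [60, 50], "B": [36, 46], "C": [2, 2], "D": [2, 2]}
combined_example = combine_minor_parties(example)
```

In this made-up example parties C and D each hold 2% of the votes, so they end up in the combined "Other" party.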
In this paper we consider three data sets from the Lithuanian parliamentary elections of 1992, 2008 and 2012. All of the original data sets were made publicly available by the Central Electoral Commission of the Republic of Lithuania (at https://www.rinkejopuslapis.lt/ataskaitu-formavimas). We have downloaded the original data sets from the website on August 31, 2016. During the preliminary phase of the empirical analysis we have found some small inconsistencies within the original data. The original 1992 election data set had seven polling stations with incorrect total vote counts. We have identified three pairs of polling stations which were, most likely, swapped among themselves, as the number of missing votes in one polling station matched the number of surplus votes in the other. We have dealt with the remaining polling station by simply adjusting the total vote count to match the sum of votes cast for each of the parties in that polling station. We have also found that data from 51 (out of 2034) polling stations is missing (the data was filled with zeros) from the original 2008 election data set. We have not identified any issues with the original 2012 election data set. These minor inconsistencies would not impact the overall result of any of the considered elections nor the results reported in this section. We have made the modified data sets available online at https://github.com/akononovicius/lithuanian-parliamentary-election-data.
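A consistency check of this kind might be sketched as follows. This is only an illustration under assumed inputs (a list of reported totals and a dictionary of per-party counts, both hypothetical layouts), not the procedure actually used to clean the data sets.

```python
def find_inconsistent_stations(reported_totals, votes_by_party):
    """Return (station index, reported total - summed party votes) for mismatches."""
    issues = []
    for j, total in enumerate(reported_totals):
        summed = sum(counts[j] for counts in votes_by_party.values())
        if summed != total:
            issues.append((j, total - summed))
    return issues


def matching_pairs(issues):
    """Pairs of stations whose discrepancies cancel out (candidate swaps)."""
    pairs = []
    for idx, (a, diff_a) in enumerate(issues):
        for b, diff_b in issues[idx + 1:]:
            if diff_a == -diff_b:
                pairs.append((a, b))
    return pairs


# Made-up example: stations 0 and 1 look swapped, station 3 is simply off by 5.
reported = [100, 120, 80, 50]
votes = {"A": [60, 70, 50, 30], "B": [50, 40, 30, 25]}
issues = find_inconsistent_stations(reported, votes)
pairs = matching_pairs(issues)
```

Stations flagged in `issues` but absent from `pairs` would correspond to the case handled by adjusting the total vote count.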
In the analysis that follows we consider the parties' vote share distribution across the polling stations. The vote share, $v_{ij}$, is defined as the total number of votes cast for the party, $V_{ij}$, divided by the total number of votes cast in that polling station:
$$ v_{ij} = \frac{V_{ij}}{\sum_{l=1}^{K} V_{lj}} , $$
here index $i$ varies over the parties ($K$ is the total number of parties participating in the election) and index $j$ varies over the polling stations. We consider the probability and rank-size distributions of $v_{ij}$ across all of the polling stations during the same parliamentary election. The probability distribution is described by the standard probability density function (abbr. PDF). The rank-size distributions are often used if the data varies significantly in scale, e.g., word occurrence frequency [58], earthquake magnitudes [59], city sizes [60], cross country income distributions [61]. When using this technique the original empirical data is sorted in descending order. Afterwards the sorted data is plotted with the rank being the abscissa coordinate and the actual value being the ordinate coordinate. In our case we sort the parties' vote shares, $v_{ij}$, for each party $i$ separately to produce $\tilde{v}_{ik}$, for which
$$ \tilde{v}_{i1} \geq \tilde{v}_{i2} \geq \ldots \geq \tilde{v}_{iM} $$
is true. In the above $M$ is the total number of polling stations, so that index $k$ corresponds to the rank. Note that in this representation the same polling station may be ranked differently for different parties (namely, $k$ might be different for the same polling station for different $i$). Evidently these two approaches, PDFs and rank-size distributions, are inter-related, but using both of them allows us to uncover different statistical patterns.
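The two definitions above translate directly into code. The following is a minimal sketch assuming the same hypothetical dictionary layout (party label mapped to per-station vote counts); stations with no votes at all are skipped, which is an assumption made here to avoid division by zero.

```python
def vote_shares(votes_by_party):
    """Vote share of each party at each station: v_ij = V_ij / sum_l V_lj."""
    n_stations = len(next(iter(votes_by_party.values())))
    totals = [sum(counts[j] for counts in votes_by_party.values())
              for j in range(n_stations)]
    return {
        party: [counts[j] / totals[j] for j in range(n_stations) if totals[j] > 0]
        for party, counts in votes_by_party.items()
    }


def rank_size(shares):
    """Sort one party's vote shares in descending order and attach ranks k = 1..M."""
    return list(enumerate(sorted(shares, reverse=True), start=1))


# Made-up counts for two polling stations.
shares = vote_shares({"A": [30, 10], "B": [10, 30]})
ranked_a = rank_size(shares["A"])
ranked_b = rank_size(shares["B"])
```

In this toy example the first polling station has rank 1 for party A but rank 2 for party B, which illustrates why the rank $k$ of a given station may differ between parties.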

The parliamentary election of 1992
The parliamentary election of 1992 was held in 2061 local polling stations. 17 parties competed in the parliamentary election, but only 4 of them were able to obtain more than 5% of popular vote. For the sake of simplicity we will use the following abbreviations for these parties: SK - "Sąjūdžio koalicija", LSDP - "Lietuvos socialdemokratų partija", LKDP - "Lietuvos Krikščionių demokratų partijos, Lietuvos politinių kalinių ir tremtinių sąjungos ir Lietuvos demokratų partijos jungtinis sąrašas", LDDP - "Lietuvos demokratinė darbo partija". We have combined the other 13 parties to form the "Other" party (abbr. O) and considered the votes cast for the combined "Other" party alongside the votes cast for the 4 main parties.
As can be seen from Figs. 1 and 2 as well as Table 1, the vote shares of all of the parties, with the notable exception of the "Other" party, are very well fitted by assuming that the data is distributed according to the Beta distribution, the PDF of which is given by
$$ p(v) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} v^{\alpha - 1} (1 - v)^{\beta - 1} , $$
here $\alpha$ and $\beta$ are the shape parameters.
The "Other" party stands out, because it includes the "Lietuvos lenkų sąjunga" party (abbr. LLS; en. Association of Poles in Lithuania). The LLS party had heavily relied on the support of the ethnic minorities, which were spatially segregated. Namely, the representatives of ethnic minorities mostly live in larger cities and Vilnius County. The observed spatial segregation could easily cause the segregation observed in the voting data.
In Fig. 3 we confirm this intuition by splitting the LLS party away from the "Other" party. After the split the rank-size distribution of the "Other" party is well approximated by a single Beta distribution. To provide a good fit for the LLS rank-size distribution we assume that the underlying data is distributed according to a mixture of two Beta distributions: one (95% of points) with parameters $\alpha_1 = 0.08$, $\beta_1 = 10$ and the other (5% of points) with parameters $\alpha_2 = 1.22$, $\beta_2 = 1.37$.
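To make the mixture concrete, the following sketch evaluates the PDF of a mixture of Beta distributions using only the Python standard library. The function names are illustrative, and the component list simply restates the LLS weights and parameters quoted above; how the mixture was actually fitted is not shown here.

```python
import math


def beta_pdf(v, alpha, beta):
    """PDF of the Beta distribution; zero outside the open interval (0, 1)."""
    if not 0.0 < v < 1.0:
        return 0.0
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1.0) * math.log(v)
                    + (beta - 1.0) * math.log(1.0 - v))


def beta_mixture_pdf(v, components):
    """Weighted sum of Beta PDFs; components are (weight, alpha, beta) tuples."""
    return sum(w * beta_pdf(v, a, b) for w, a, b in components)


# Mixture quoted in the text for the LLS party (1992 election).
LLS_1992_MIXTURE = [(0.95, 0.08, 10.0), (0.05, 1.22, 1.37)]

density_low = beta_mixture_pdf(0.01, LLS_1992_MIXTURE)   # dominated by component 1
density_high = beta_mixture_pdf(0.90, LLS_1992_MIXTURE)  # dominated by component 2
```

The first component concentrates almost all of its mass near zero vote share, while the second, much broader component accounts for the few polling stations in which the LLS obtained a large vote share.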
The parties' vote share rank-size distributions were previously considered in [57]. Unlike in this paper, Fenner and others assumed that the parties' vote share is distributed according to the Weibull distribution, and they have obtained rather good fits for the UK election data. Yet the fits obtained here, assuming the Beta distribution, are also rather good. We believe that the Beta distribution is superior for this purpose from the theoretical point of view. Namely, the Beta distribution has a reasonable support, as probabilities are defined for v ∈ [0; 1], while the Weibull distribution needs to be arbitrarily truncated, as probabilities are defined for v ∈ [0; +∞). Interestingly, Fenner and others also use a mixture distribution (of two Weibull distributions) to fit the UK election data. Similar observations were also made when studying Brazilian presidential election data [62]. In [63] it was noted that multiple different distributions, Weibull, log-normal and normal, provide good fits for the distribution of religions' adherents. To discriminate between the possibilities a deeper theoretical insight is needed.

The parliamentary election of 2008
The parliamentary election of 2008 was held in 2034 polling stations, yet we have only 1983 points in the data set as the data from 51 polling stations is missing. In this election a slightly smaller number of parties participated (16), but now 7 of them were able to obtain more than 5% of popular vote. For the sake of simplicity we will use the following abbreviations for them: LSDP - "Lietuvos socialdemokratų partija" (formed by LSDP and LDDP, which participated in the 1992 election), TS-LKD - "Tėvynės sąjunga - Lietuvos krikščionys demokratai" (could be considered to be a successor of the SK and LKDP, which participated in the 1992 election), TPP - "Tautos prisikėlimo partija", DP - "Koalicija Darbo partija + jaunimas", LRLS - "Lietuvos Respublikos liberalų sąjūdis", TT - "Partija Tvarka ir teisingumas", LiCS - "Liberalų ir centro sąjunga". As previously, all other parties (9 of them) were combined to form the "Other" party (abbr. O). We have considered the votes cast for the combined "Other" party alongside the votes cast for the 7 main parties.
As is evident from Figs. 4 and 5, in the 2008 election the vote share distributions of most of the parties are well fitted by a mixture of two Beta distributions. Although now there is no clear-cut explanation for this phenomenon, we would like to conjecture that this observation indicates that other spatial segregations (e.g., by income) of voters in Lithuania have started to play an important role. Namely, most of the parties could now be identified with specific socio-economic classes, e.g., some party starts favoring higher income voters (gaining support in cities), consequently losing the support of poorer voters (losing support in rural areas).

The parliamentary election of 2012
The parliamentary election of 2012 was held in 2017 polling stations (thus we have 2017 data points). 18 parties participated in the election, and 7 of them were able to obtain more than 5% of popular vote.
[Figure caption: the empirical values are shown as black circles, while theoretical fits using a mixture of Beta distributions are shown as solid curves (the values of the parameters are given in Table 2).]
Once again, as in the 2008 parliamentary election data set, it is evident that the vote share distributions of most of the parties in the 2012 parliamentary election are also well described by a mixture of two Beta distributions (see Figs. 6 and 7).

A multi-state agent-based model of the voting behavior
In this section we propose a simple multi-state agent-based model, which describes the voting behavior within a small non-specific geographic region covered by a single polling station. Unlike in some previous approaches [13][14][15][16][17][18][19][20], our aim is not to incorporate complex ideas from psychology, but to reproduce the empirical parties' vote share distribution. It is known that the agent-based herding model proposed by Alan Kirman in [41] reproduces the Beta distribution. So let us start by introducing Kirman's herding model.
[Figure caption: the empirical values are shown as black circles, while theoretical fits using a mixture of Beta distributions are shown as solid curves (the values of parameters are given in Table 3).]
Originally in [41] Kirman noted that biologists and economists observe similar behavioral patterns. Apparently both ants and people show interest in things which are more popular among their peers regardless of their objective properties [64][65][66][67][68]. In [41] a simple two-state model was proposed to explain these observations. In the contemporary interpretations of Kirman's model the following mathematical form of the one step transition probabilities is used (see [42][43][44][45]):
$$ p(X \rightarrow X + 1) = (N - X) (\sigma_1 + h X) \Delta t , \tag{4} $$
$$ p(X \rightarrow X - 1) = X \left[ \sigma_2 + h (N - X) \right] \Delta t , \tag{5} $$
here $N$ is the total number of agents acting in the system, $X$ is the total number of agents occupying the first state (consequently there are $N - X$ agents occupying the second state), $\sigma_i$ are the perceived attractiveness parameters (may differ for different states), $h$ is the recruitment efficiency parameter and $\Delta t$ is a relatively short time step.
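A two-state model of this kind can be simulated directly from its one step transition probabilities. The sketch below assumes the commonly used form, $p(X \rightarrow X+1) = (N-X)(\sigma_1 + hX)\Delta t$ and $p(X \rightarrow X-1) = X[\sigma_2 + h(N-X)]\Delta t$; the function name, parameter values and the choice of the time step are illustrative assumptions rather than the settings used in the paper.

```python
import random


def simulate_kirman(n_agents, sigma1, sigma2, h, dt, n_steps, x0, seed=0):
    """Simulate the two-state herding model; returns the trajectory of x = X / N."""
    rng = random.Random(seed)
    x = x0
    trajectory = []
    for _ in range(n_steps):
        p_up = (n_agents - x) * (sigma1 + h * x) * dt
        p_down = x * (sigma2 + h * (n_agents - x)) * dt
        if p_up + p_down > 1.0:
            raise ValueError("time step dt is too large for these parameters")
        r = rng.random()
        if r < p_up:
            x += 1          # one agent joins the first state
        elif r < p_up + p_down:
            x -= 1          # one agent leaves the first state
        trajectory.append(x / n_agents)
    return trajectory


# Illustrative run: N = 100, sigma_1 = sigma_2 = 2, h = 1.
example_trajectory = simulate_kirman(100, 2.0, 2.0, 1.0, 1e-4, 200000, 50, seed=1)
```

With $\sigma_1 = \sigma_2$ the long-run time average of $x$ should fluctuate around 1/2, and a histogram of a long trajectory can be compared with the Beta distribution discussed in the text.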
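Let us now show that the two-state model produces the Beta distribution. This can be done by using the birth-death process formalism, which is well described in [69]. The dynamics of $x = X/N$ (let $N$ be large) can be alternatively described by the following master equation:
$$ \frac{\partial \omega(x, t)}{\partial t} = N^2 h \left\{ (E - 1) \left[ \pi^{-}(x) \omega(x, t) \right] + (E^{-1} - 1) \left[ \pi^{+}(x) \omega(x, t) \right] \right\} , $$
here $\omega(x, t)$ is the time-dependent distribution, $\pi^{\pm}(x)$ are the transition rates per unit of time (defined as $\pi^{\pm}(x) = \frac{p(X \rightarrow X \pm 1)}{N^2 h \Delta t}$), $E$ and $E^{-1}$ are the one step increment and decrement operators. Let us expand these operators in Taylor series up to the second order term:
$$ E[f(x)] = f(x + \Delta x) \approx f(x) + \Delta x \frac{\partial f(x)}{\partial x} + \frac{(\Delta x)^2}{2} \frac{\partial^2 f(x)}{\partial x^2} , $$
$$ E^{-1}[f(x)] = f(x - \Delta x) \approx f(x) - \Delta x \frac{\partial f(x)}{\partial x} + \frac{(\Delta x)^2}{2} \frac{\partial^2 f(x)}{\partial x^2} , $$
here $\Delta x = 1/N$. Putting these Taylor expansions back into the master equation, taking the small time step limit and neglecting the terms of order $1/N$ yields the following Fokker-Planck equation:
$$ \frac{\partial \omega(x, t)}{\partial t} = - h \frac{\partial}{\partial x} \left\{ \left[ \varepsilon_1 (1 - x) - \varepsilon_2 x \right] \omega(x, t) \right\} + h \frac{\partial^2}{\partial x^2} \left[ x (1 - x) \omega(x, t) \right] , $$
where $\varepsilon_i = \sigma_i / h$. The steady-state distribution of this Fokker-Planck equation can be obtained by solving:
$$ 0 = - \frac{d}{dx} \left[ A(x) \omega_{st}(x) \right] + \frac{1}{2} \frac{d^2}{dx^2} \left[ B(x) \omega_{st}(x) \right] , $$
with the drift function $A(x) = h [\varepsilon_1 (1 - x) - \varepsilon_2 x]$ and the diffusion function $B(x) = 2 h x (1 - x)$. In the general case the solution of this ordinary differential equation is given by:
$$ \omega_{st}(x) = \frac{C_0}{B(x)} \exp\left[ 2 \int^{x} \frac{A(u)}{B(u)} du \right] , $$
where $C_0$ is the normalization constant. In our specific case we obtain the PDF of the Beta distribution, $\mathrm{Be}(\varepsilon_1, \varepsilon_2)$:
$$ \omega_{st}(x) = \frac{\Gamma(\varepsilon_1 + \varepsilon_2)}{\Gamma(\varepsilon_1) \Gamma(\varepsilon_2)} x^{\varepsilon_1 - 1} (1 - x)^{\varepsilon_2 - 1} . $$
As in a typical parliamentary election there are more than two competitors, we need to generalize the model to incorporate more than two states. From the conservation of the total number of agents $N$, we have:
$$ p(X_i \rightarrow X_i + 1) = \sum_{j \neq i} p(X_j \rightarrow X_j - 1, X_i \rightarrow X_i + 1) , \qquad p(X_i \rightarrow X_i - 1) = \sum_{j \neq i} p(X_i \rightarrow X_i - 1, X_j \rightarrow X_j + 1) . $$
Assuming that the right hand side probabilities have the same form as Eqs. (4) and (5), we obtain:
$$ p(X_i \rightarrow X_i + 1) = \sum_{j \neq i} X_j (\sigma_{ji} + h_{ji} X_i) \Delta t , $$
$$ p(X_i \rightarrow X_i - 1) = X_i \sum_{j \neq i} (\sigma_{ij} + h_{ij} X_j) \Delta t . $$
These one step transition probabilities for $X_i$ depend not only on $X_i$ (as in the two-state model), but also on all the other $X_j$ (here $j \neq i$). To circumvent this potentially cumbersome dependence one needs to assume that $\sigma_{ij} = \sigma_j$ and $h_{ij} = h$ (where $j \neq i$). The first assumption, $\sigma_{ij} = \sigma_j$, means that the perceived attractiveness of any party does not depend on who is attracted to it. The second assumption, $h_{ij} = h$, means that the recruitment mechanism is uniform (symmetric and independent of the interacting agents). Note that these assumptions contrast with the assumptions underlying the bounded confidence model [15,16]. Yet these assumptions are needed to ensure that $x_i = X_i / N$ is distributed according to the Beta distribution. After making these assumptions we can further simplify the one step transition probabilities:
$$ p(X_i \rightarrow X_i + 1) = (N - X_i) (\sigma_i + h X_i) \Delta t , \tag{18} $$
$$ p(X_i \rightarrow X_i - 1) = X_i \left[ \sigma_{-i} + h (N - X_i) \right] \Delta t , \tag{19} $$
here $\sigma_{-i} = \sum_{j \neq i} \sigma_j$ is the total attractiveness of all of the competitors of the party $i$. By analogy with the two-state model it should be evident that
$$ x_i \sim \mathrm{Be}(\varepsilon_i, \varepsilon_{-i}) , $$
where $\varepsilon_{-i} = \sigma_{-i} / h$.
Unlike for the two-state model, it seems impossible to provide a useful general aggregated macroscopic description of the generalized $M$-state model using the Fokker-Planck equation or a set of stochastic differential equations. In [70] a three-state model was considered and given an aggregated macroscopic description by a system of two stochastic differential equations, yet this was possible only under specific conditions.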
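The step from the general steady-state solution to the Beta distribution can be verified explicitly. The short derivation below assumes that the Fokker-Planck equation of the two-state model has the drift $A(x) = h[\varepsilon_1 (1 - x) - \varepsilon_2 x]$ and the diffusion $B(x) = 2 h x (1 - x)$, and that the steady state has the usual zero-flux form $\omega_{st}(x) \propto B(x)^{-1} \exp[2 \int^x A(u)/B(u) \, du]$; these symbols are introduced here only for the purpose of this check.

```latex
\begin{aligned}
\frac{2 A(x)}{B(x)} &= \frac{\varepsilon_1 (1 - x) - \varepsilon_2 x}{x (1 - x)}
  = \frac{\varepsilon_1}{x} - \frac{\varepsilon_2}{1 - x} , \\
2 \int^{x} \frac{A(u)}{B(u)} \, du &= \varepsilon_1 \ln x + \varepsilon_2 \ln (1 - x) , \\
\omega_{st}(x) &\propto \frac{1}{x (1 - x)} \, x^{\varepsilon_1} (1 - x)^{\varepsilon_2}
  = x^{\varepsilon_1 - 1} (1 - x)^{\varepsilon_2 - 1} .
\end{aligned}
```

Normalizing the last expression gives $\mathrm{Be}(\varepsilon_1, \varepsilon_2)$. Under the simplifying assumptions of the multi-state model the same calculation, with $\varepsilon_1$ replaced by $\varepsilon_i$ and $\varepsilon_2$ replaced by $\varepsilon_{-i}$, gives the marginal distribution of $x_i$.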
The one step transition probabilities, while describing the agents' behavior, are still an aggregate description of the individual agent level dynamics. So some discussion on what Eqs. (18) and (19) represent is relevant. Selecting one random agent per time step and setting his switching probability to $\varepsilon_{-i} \Delta t_s$ gives us the idiosyncratic behavior term, $X_i \varepsilon_{-i} \Delta t_s$ (here $\varepsilon_{-i} = \sigma_{-i} / h$ and $\Delta t_s = h \Delta t$ is the scaled time step). Selecting another random agent and, if the two agents vote for different parties, allowing the first agent to copy the second agent's voting preference gives us the recruitment term, $X_i (N - X_i) \Delta t_s$. This description of the agent level dynamics could be further generalized to allow the model to be run on randomly generated networks [71,72]. This agent-based algorithm might be seen as a special case of the well known Voter model [25, 49-51, 53, 54].
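One possible agent level realization of these dynamics is sketched below. It is a uniformized variant (an assumption of this sketch, not necessarily the authors' algorithm): at every update a random agent either picks a party idiosyncratically, with probabilities proportional to the attractiveness parameters, or copies the party of another randomly chosen agent. The mixing probability is chosen so that each agent switches to party $j$ at a rate proportional to $\varepsilon_j + X_j$, which matches the aggregate rates implied by the idiosyncratic and recruitment terms. All names and parameter values are illustrative.

```python
import random


def simulate_station(n_agents, eps, n_updates, seed=0):
    """Agent level sketch of the multi-state model for one polling station.

    eps: list of attractiveness parameters, one per party.
    Returns the final vote shares x_i = X_i / N.
    """
    rng = random.Random(seed)
    n_parties = len(eps)
    eps_total = sum(eps)
    parties = range(n_parties)
    state = [rng.randrange(n_parties) for _ in range(n_agents)]
    # Probability that an update is idiosyncratic rather than imitative.
    p_idio = eps_total / (eps_total + n_agents - 1)
    for _ in range(n_updates):
        a = rng.randrange(n_agents)
        if rng.random() < p_idio:
            # Idiosyncratic choice: party j is picked with probability eps_j / eps_total.
            state[a] = rng.choices(parties, weights=eps)[0]
        else:
            # Recruitment: copy the party of a different, uniformly chosen agent.
            b = rng.randrange(n_agents - 1)
            if b >= a:
                b += 1
            state[a] = state[b]
    return [state.count(k) / n_agents for k in parties]


# Illustrative run with three parties and made-up attractiveness values.
example_shares = simulate_station(200, [1.0, 2.0, 3.0], 20000, seed=3)
```

Repeating such runs with different seeds gives a sample of vote shares for many hypothetical polling stations, which could then be compared with the Beta distribution expected for each party.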

The modeling of the parliamentary election
Now let us apply the proposed model to reproduce the empirical vote share PDFs and rank-size distributions, which were observed during the 1992 parliamentary election. Here we consider only the simplest case by ignoring the "Other" party and in this way removing the distortions caused by the ethnic segregation (see the discussion in Section 2.1). We do not consider the segregated data, the full data of the 1992 parliamentary election or the data of the 2008 and 2012 parliamentary elections, as a more sophisticated approach is needed to account for the vote segregation. Namely, in order to account for the full complexity of the empirical data additional information, such as spatial polling data or socio-demographic data, would be needed. In general, though, one could try to infer the correct partition of the polling stations, such that the vote share distribution within each part would be modelled using the proposed model with a single parameter set.
[Figure caption fragment: numerical results of the proposed model, Eqs. (18) and (19) (the same model run as in Fig. 9). The parameters of the model are given in Table 4.]
In Figs. 8 and 9 we compare the vote share PDFs and the rank-size distributions numerically generated by the proposed model, Eqs. (18) and (19), and the respective empirical vote share distributions of the 1992 parliamentary election. We cannot use the previously empirically estimated Beta distribution parameters as model parameters by assuming $\alpha_i = \varepsilon_i$ and $\beta_i = \varepsilon_{-i}$, because an important model implication, $\sum_{l \neq i} \alpha_l = \beta_i$, does not hold for the empirical data. Yet one may obtain the parameter values by fitting the empirical data with the model implication in mind.
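The model implication is easy to test for any set of independently fitted Beta parameters. The sketch below computes, for each party, the gap between the fitted $\beta_i$ and the sum of the other parties' $\alpha_l$; the function name and the numbers are hypothetical and are not the values reported in the tables.

```python
def implication_gaps(alphas, betas):
    """Return beta_i - sum_{l != i} alpha_l for every party i."""
    total_alpha = sum(alphas)
    return [betas[i] - (total_alpha - alphas[i]) for i in range(len(alphas))]


# Hypothetical parameter sets: the first satisfies the implication, the second does not.
consistent = implication_gaps([1.0, 2.0, 3.0], [5.0, 4.0, 3.0])
inconsistent = implication_gaps([1.0, 2.0, 3.0], [6.0, 4.0, 3.0])
```

Gaps that differ noticeably from zero indicate that the independently fitted parameters cannot be used as model parameters without refitting under the constraint.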
As can be seen from Figs. 8 and 9 as well as Table 4, the proposed model fits three of the four parties very well, while for the LSDP the fit is not as good as one might expect. Note that the model overestimates the success of LSDP (the solid curve is above the black circles for small k in subfigure (d)) and underestimates the electoral support of LDDP (the solid curve is below the black circles for small k in subfigure (h)). It is likely that the LSDP had a small perceived chance to win the 1992 election (their aggregated vote share was near 5%), so voters who would otherwise consider voting for the LSDP cast their votes for the other left-wing party, which had a better perceived chance of winning the 1992 election (the aggregated vote share of LDDP was above 40%).
We can check this intuition by relaxing the assumption that the perceived attractiveness should not depend on the current state of the agent (the agent's currently supported party), namely instead of $\varepsilon_i$ we now have $\varepsilon_{ji}$. Previously this assumption was needed to ensure that the vote share is distributed according to the Beta distribution.
[Figure caption: the empirical values are shown as black circles, while solid curves represent numerical results obtained from the proposed model, driven by Eqs. (18) and (19) (the same model run as in Fig. 8). The parameters of the model are given in Table 4.]
Let us introduce a single exception: if $j$ corresponds to LSDP and $i$ corresponds to LDDP, then $\varepsilon_{ji}$ can differ in value from $\varepsilon_i$. This gives us a matrix of $\varepsilon_{ji}$ values in which every off-diagonal element equals $\varepsilon_i$, except for the single element describing the switch from LSDP to LDDP. Note that the diagonal elements of the matrix $\varepsilon$ are set to zero, as it is not possible to switch to the state the agent is already in. As can be seen in Figs. 10 and 11, the fit provided by the model has significantly improved for the LSDP just by making this small change.
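The structure of such a matrix can be sketched in a few lines. The example assumes that rows index the currently supported party $j$ and columns index the destination party $i$; the party order, the base values and the value of the exceptional element are hypothetical and are not the fitted parameters.

```python
def build_eps_matrix(eps, exceptions=None):
    """Matrix of eps_ji: zero diagonal, eps_i elsewhere, plus explicit exceptions.

    exceptions: dict mapping (j, i) index pairs to the overriding value.
    """
    n = len(eps)
    matrix = [[0.0 if j == i else eps[i] for i in range(n)] for j in range(n)]
    for (j, i), value in (exceptions or {}).items():
        matrix[j][i] = value
    return matrix


# Hypothetical three-party example with one exceptional transition from party 1 to party 2.
eps_matrix = build_eps_matrix([1.0, 2.0, 3.0], exceptions={(1, 2): 9.0})
```

Every column of the resulting matrix is constant apart from the zero on the diagonal and the single overridden element, which mirrors the single exception introduced for the LSDP to LDDP switch.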

Conclusions
In this paper we have considered the parties' vote share PDFs and the rank-size distributions observed during the Lithuanian parliamentary elections. Namely, we have considered the 1992, 2008 and 2012 parliamentary elections' data sets. We have determined that the empirical vote share PDFs and the rank-size distributions are rather well fitted by assuming that the underlying distribution is the Beta distribution or a mixture of two Beta distributions. Reviewing literature we have found that [54,57,62,63] have reported somewhat similar results. In [57,62] it was reported that the empirical data is rather well fitted by a mixture of Weibull distributions.
In [63] it was noted that multiple different distributions, Weibull, log-normal and normal, provide good fits for the distribution of religions' adherents. We argue that the Beta distribution is more suitable as it has the correct support (probabilities are defined for v ∈ [0; 1]; although the other distributions could be arbitrarily truncated) and it arises from a simple, easily tractable agent-based model. From our empirical analysis it follows that a mixture of Beta distributions is needed to fit the data if there is an underlying spatial segregation of the electorate. In [54] it is also reported that the empirical data is rather well fitted by the Beta distribution, which arises from a noisy Voter model. Yet [54] did not observe the vote share segregation pattern.
Having in mind the stark difference between the proposed model and the psychologically motivated models, such as the bounded confidence model [15,16], we would like to point out that the observed statistical patterns as well as the applicability of the model could arise due to numerous unrelated reasons. One of the alternative possibilities would be the mobility patterns of people. In the proposed model a single agent switching from supporting one party to supporting another party could also represent one agent moving away from the modeled geographic location, due to social or economic reasons, and another agent, holding different political views, moving in. A similar idea was raised in [53].
In the near future we will consider spatial modeling of the Lithuanian parliamentary elections. Another possible approach, with forecasting potential, would be to consider a temporal regression model for the attractiveness parameters of the proposed model, $\varepsilon_i$, as well as the estimation of the agent interaction rates, $h$.