In this paper, artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), and fuzzy rule-based system (FRBS) models are developed to predict attendance demand in European football games. To determine the most successful method, each method is analyzed under different conditions. Elman backpropagation, feedforward backpropagation, and cascade-forward backpropagation network types are developed to identify the best-performing ANN model. Backpropagation and hybrid optimization methods are used to train the fuzzy inference system (FIS) and identify the best-performing ANFIS model. The fuzzy logic model is developed after experimenting with different forms of membership functions. The data of 236 soccer games are used to train the ANN and ANFIS models, and the 2017/2018 season data of the same clubs are used to test all of the models. The results of all models are compared with each other and with the observed data. To assess the performance of each model, two error measures, Mean Absolute Percent Error (MAPE) and Mean Absolute Deviation (MAD), are used. These measures reveal that the ANN model with the Elman network type outperforms the other models. Finally, the results indicate that the proposed ANN model can be used effectively for prediction purposes.
In recent years, the economic impact of sports events has grown significantly [
The usual approach used in previous studies in the sports literature is based on predicting a linear demand equation. For a detailed literature review, readers can refer to Borland and MacDonald [
The central aim of this study is to evaluate the performance of three alternative forecasting techniques, namely NN, ANFIS, and fuzzy logic, and to identify the most accurate technique for predicting attendance at European football games. For this purpose, real data from soccer games are used, and the attendance of each game is forecasted by each of the methods. The attendance rates are predicted from five variables: the day of the game, the distance in miles between the stadia of the home and away clubs, the uncertainty of outcome, and the performances of the home and away teams. These factors were determined after examining the literature in detail and interviewing experts. The uncertainty of outcome is included as a determining factor because it captures the effect of significant factors such as injured and suspended players. Each model is tested comprehensively under different scenarios. The forecasting results of the three methods are compared with each other and with the observed data. The proposed models are not limited to forecasting demand for European football games; with some alterations, they can be applied to a variety of sports disciplines.
Fuzzy logic is a computation and reasoning system in which the objects of computation and reasoning are classes with fuzzy boundaries. In fuzzy logic, everything is a matter of degree [
Neural networks simulate the functioning of biological neurons. An NN can learn from experience and information to enhance its performance [
ANFIS was first introduced by Jang [
This study extends the literature in the following ways. First, unlike the previous study [
Data are required to train and validate the ANN and ANFIS models and to test all of the models, including the fuzzy logic model. Thus, the attendance data of a Spanish football club, FC Barcelona, and two Italian football clubs, AC Milan and FC Inter, are obtained. The data of 236 games of the three clubs are used to predict the attendance rates of their 2017–2018 season games. Data sources of the input variables are given in Table
Data sources of input variables.
Input variable  Data source 

Performance of the home team  The official websites of home teams 
Day of the game  The official websites of home teams 
Performance of away teams  The official websites of away teams 
Distance 

Uncertainty of outcome 

The data are standardized to obtain better forecasting results. This is fulfilled by employing maximum linear standardization whose formula is given as follows [
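As a rough illustration of this preprocessing step, the sketch below assumes the common form of maximum linear standardization in which every value of a variable is divided by that variable's maximum. This form, the function name `max_linear_standardize`, and the sample values are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def max_linear_standardize(x):
    """Divide each column by its maximum (assumed form: x' = x / max(x))."""
    x = np.asarray(x, dtype=float)
    col_max = x.max(axis=0)
    if np.any(col_max <= 0):
        raise ValueError("each column needs a positive maximum")
    return x / col_max

# Hypothetical raw inputs: distance in miles and match day index
raw = np.array([[120.0, 3.0],
                [600.0, 19.0],
                [300.0, 38.0]])
scaled = max_linear_standardize(raw)
print(scaled)
```

With non-negative inputs, every standardized column lies in [0, 1] and its largest entry equals 1, which keeps variables with very different units on a comparable scale.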
Fuzzy logic is a modeling technique that attempts to mechanize two human capabilities: the ability to reason and the ability to perform various mental tasks [
Fuzzy logic system.
To design a fuzzy rule-based model, the following steps are essential. First, the input and output variables are determined. Second, the fuzzy sets are determined for all variables. Third, the membership functions of all fuzzy inputs and outputs are created. There are different kinds of membership functions, such as triangular, Gaussian, and trapezoidal. Since the type of membership function affects the design of the fuzzy logic controller, it should be chosen carefully. Fourth, the fuzzy IF-THEN rules are generated to relate input and output variables. Fifth, the inference process is set. The two most common FIS types are Sugeno and Mamdani, which differ in two respects. The output of a Sugeno system is linear or constant, whereas the output of a Mamdani system consists of membership functions that may be trapezoidal, triangular, and so on. Additionally, a Sugeno system is trained using a data set, whereas a Mamdani system does not require a data set and relies on expert knowledge. In this study, the Mamdani-type fuzzy inference system is preferred in the rule-based fuzzy logic model so that expert knowledge can be utilized. The Mamdani type comprises the following processes. First, the input variables are fuzzified, that is, the degree to which each input belongs to each fuzzy set is established through the membership functions. Next, an “AND” or “OR” fuzzy operator is used to combine the fuzzified inputs of a rule into a single number. Next, the rule’s weight is set before the implication is applied for each rule. Next, all of the fuzzy rules are combined and evaluated. The outputs are aggregated by one of the aggregation methods, which include max (maximum), probor (probabilistic OR), and sum (simply the sum of each rule’s output set). Thus, the outputs of all rules are combined into a single fuzzy set that needs defuzzification [
The ANFIS can handle complex, nonlinear problems effectively by combining the advantages of the NN and fuzzy logic. It combines numerical and linguistic knowledge by utilizing fuzzy methods. It also uses the ANN’s ability to classify data and identify patterns. Additionally, the ANFIS produces fewer memorization errors and is more transparent to the user than the ANN.
The ANFIS is fundamentally a rule-based fuzzy model whose fuzzy rules are formed through the training process [
To describe the architecture of ANFIS, which is shown in Figure
The architecture of the ANFIS model with two inputs, one output, and two rules.
The ANFIS architecture has five layers that can be described as follows, where
In layer 1, every node is defined by the function as
In layer 2, every node calculates the firing strength of a rule by multiplication:
In layer 3, the evaluated firing strengths are normalized:
In layer 4, node
In layer 5, the single node computes the overall output of the ANFIS:
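The five layers can be traced numerically with a minimal sketch of a two-input, two-rule, first-order Sugeno ANFIS forward pass. Gaussian membership functions, the parameter layout, and all names and values below are assumptions for illustration; the sketch does not reproduce the parameters of the models in this study.

```python
import numpy as np

def gauss_mf(x, center, sigma):
    """Gaussian membership grade of x (assumed membership function shape)."""
    return float(np.exp(-0.5 * ((x - center) / sigma) ** 2))

def anfis_forward(x, y, premise, consequent):
    # Layer 1: membership grades of each input in the fuzzy set of each rule
    mu_a = [gauss_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gauss_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: firing strength of each rule by multiplication
    w = np.array([mu_a[i] * mu_b[i] for i in range(2)])
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: weighted first-order consequents f_i = p_i*x + q_i*y + r_i
    f = np.array([p * x + q * y + r for p, q, r in consequent])
    # Layer 5: overall output as the sum of the weighted rule outputs
    return float(np.sum(w_bar * f)), w_bar

# Hypothetical premise (center, sigma) and consequent (p, q, r) parameters
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 2.0)]
output, w_bar = anfis_forward(0.5, 0.5, premise, consequent)
print(output, w_bar)
```

In this symmetric example both rules fire equally, so the output is the plain average of the two rule consequents. Training would adjust the premise and consequent parameters, which the sketch leaves fixed.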
The ANFIS can be trained with backpropagation or hybrid learning algorithms, both of which aim to minimize the error between the observed and forecasted data [
The neural network, one of the Artificial Intelligence (AI) techniques, can be defined as a computational tool whose processing is similar to the behavior of biological neurons. In other words, the NN may be described as a mathematical representation of the human neural architecture [
Depending on the arrangement of neurons and the composition of the layers, ANN architectures are classified as recurrent NN, single-layer feedforward NN, and multilayer feedforward NN [
Multilayer Perceptron (MLP) network architecture.
The optimal NN structure is generally found by trial and error [
The performance of forecasting results of each model is evaluated using the MAPE and MAD, which are calculated by the following formulae:
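In their standard form, which is assumed here because it yields fractional values such as the 0.08 and 0.05 reported in the results, the two measures can be written as follows. The symbols are illustrative: $A_t$ is the observed attendance rate of game $t$, $F_t$ is the predicted rate, and $n$ is the number of test games.

```latex
\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{\lvert A_t - F_t\rvert}{A_t},
\qquad
\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n}\lvert A_t - F_t\rvert
```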
For both statistical indicators, MAPE and MAD, smaller values indicate more accurate predictions. In this study, the MAPE and MAD values for each model are obtained by comparing the predicted results with the observed data.
The selection of input variables is one of the fundamental issues in designing effective forecasting models. The input variables should be chosen so that the model relates input and output variables effectively and provides accurate results. To predict attendance at European football games, five input variables have been identified through a thorough evaluation of the literature and expert knowledge. Considering the characteristics of European football games, the following factors are chosen. The first one is the ground distance between the home and away teams’ stadia [
The structure of the developed fuzzy rule-based model is illustrated in Figure
The structure of the proposed fuzzy rule-based model.
Fuzzy sets of the variables.
Day of game (input)  Distance in miles (input)  Performance of home team (input)  Performance of away team (input)  Uncertainty of outcome (input)  Attendance rate (output) 

Early  Small  Low  Low  Low  Very low 
Middle  Medium  Medium  Medium  Medium  Low 
Late  Large  High  High  High  Medium 
—  —  —  —  —  High 
—  —  —  —  —  Very high 
Next, the membership functions of all variables are formed. The most appropriate type is generally chosen after experimenting with different types. The membership functions of the proposed model are shown in Figure
Membership functions formed for the input variables (a) day of game, (b) distance, (c) performance of home team, (d) performance of away team, (e) uncertainty of outcome, and the output variable (f) attendance rate.
If (
As seen in the fuzzy rule, the five conditions are combined with AND operators. As explained before, the Mamdani-type inference system is chosen as the FIS. Finally, to obtain crisp values, the centroid method, which takes the center of the area under the curve, is chosen as the defuzzification method.
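The Mamdani steps described above (fuzzification, AND combination, implication, aggregation, and centroid defuzzification) can be sketched in a few lines. For brevity the sketch uses two inputs instead of five, triangular membership functions, min for AND and for implication, and max for aggregation; the fuzzy sets, the rules, and all names are hypothetical and are not the rule base of the proposed model.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function with feet a and c and peak b (a < b < c)."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a)
    right = (c - x) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

# Hypothetical fuzzy sets on a [0, 1] scale, shared by inputs and output
SETS = {"low": (-0.5, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.5)}

# Hypothetical rules: IF home performance is s1 AND uncertainty is s2 THEN attendance is out
RULES = [(("high", "high"), "high"),
         (("medium", "medium"), "medium"),
         (("low", "low"), "low")]

def mamdani(home_perf, uncertainty, n_points=1001):
    y = np.linspace(0.0, 1.0, n_points)   # discretized output universe
    aggregated = np.zeros_like(y)
    for (s1, s2), out in RULES:
        # Fuzzify both inputs and combine them with AND (min)
        strength = min(float(trimf(home_perf, *SETS[s1])),
                       float(trimf(uncertainty, *SETS[s2])))
        # Implication (min) clips the output set at the rule strength
        clipped = np.minimum(strength, trimf(y, *SETS[out]))
        # Aggregation (max) merges the clipped sets of all rules
        aggregated = np.maximum(aggregated, clipped)
    if aggregated.sum() == 0.0:
        raise ValueError("no rule fired for these inputs")
    # Centroid defuzzification: center of the area under the aggregated curve
    return float(np.sum(y * aggregated) / np.sum(aggregated))

print(mamdani(0.9, 0.9), mamdani(0.5, 0.5), mamdani(0.1, 0.1))
```

The crisp output moves toward the "high" region as both inputs grow, which is the behavior the IF-THEN rules encode.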
In the ANFIS models, subtractive clustering is chosen to generate the FIS since it yields higher prediction accuracy than grid partitioning. Three membership functions are formed for each input variable. Two different optimization methods, hybrid and backpropagation, are used for training the FIS, with 100 and 1000 training epochs, respectively. The parameters for subtractive clustering and the features of the ANFIS models are given in Table
The properties of the proposed ANFIS models.
Parameter  Description/value 

Type of fuzzy inference system  Sugeno 
FIS generation method  Subtractive clustering 
Range of influence  0.87 
Squash factor  1.1 
Accept ratio  0.5 
Reject ratio  0.1 
Input number  5 
Output number  1 
Number of input membership functions  3, 3, 3, 3, 3 
Optimization methods  Hybrid; backpropagation 
Training epoch numbers  100; 1000 
The structure of the proposed ANFIS models.
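To show how the four clustering parameters in the table interact, the following is a minimal sketch of subtractive clustering in its commonly described form. It assumes the data are already scaled to the unit hypercube; the function name and the toy data are hypothetical, and the sketch is not the MATLAB routine used in this study.

```python
import numpy as np

def subtractive_clustering(data, radius=0.87, squash=1.1, accept=0.5, reject=0.1):
    """Return cluster centers for data assumed to lie in the unit hypercube."""
    x = np.asarray(data, dtype=float)
    alpha = 4.0 / radius ** 2                 # width of the potential kernel
    beta = 4.0 / (squash * radius) ** 2       # wider kernel used for subtraction
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * sq_dist).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while True:
        k = int(np.argmax(potential))
        peak = potential[k]
        if peak <= 0.0:
            break
        ratio = peak / first_peak
        if not centers or ratio > accept:
            take = True                        # clearly strong candidate
        elif ratio < reject:
            break                              # remaining potential is too low
        else:
            # Grey zone: accept only if far enough from the existing centers
            d_min = min(np.linalg.norm(x[k] - c) for c in centers)
            take = d_min / radius + ratio >= 1.0
        if take:
            centers.append(x[k].copy())
            # Reduce the potential of points near the new center
            potential = potential - peak * np.exp(-beta * sq_dist[k])
        else:
            potential[k] = 0.0
    return np.array(centers)

# Toy data: two tight groups of points in two dimensions
offsets = np.array([[0.0, 0.0], [0.02, 0.0], [0.0, 0.02], [-0.02, 0.0], [0.0, -0.02]])
data = np.vstack([np.array([0.1, 0.1]) + offsets, np.array([0.9, 0.9]) + offsets])
centers = subtractive_clustering(data)
print(centers)
```

Each center found this way would seed one fuzzy rule of the initial Sugeno FIS. A larger range of influence generally yields fewer, broader clusters.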
In the ANN models, three different network types, Elman backpropagation, feedforward backpropagation, and cascade-forward backpropagation, are used to determine the best-performing one. For all network types, a single hidden layer is used since the number of inputs, five, is not high. The designed Elman, feedforward, and cascade-forward backpropagation ANN models are shown in Figures
The proposed one-layer Elman NN model (10 neurons).
The proposed one-layer feedforward NN model (15 neurons).
The proposed one-layer cascade-forward NN model (20 neurons).
The features of the ANN models.
Network type  Elman  Feedforward  Cascade-forward 

Number of layers  2  2  2 
Input  5  5  5 
Neurons  Hidden: 10; 15; 20  Hidden: 10; 15; 20  Hidden: 10; 15; 20 
Output  1  1  1 
Training algorithm  Levenberg–Marquardt  Levenberg–Marquardt  Levenberg–Marquardt 
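The structural difference between the three network types can be illustrated with a minimal sketch of their forward passes for five inputs, one hidden layer, and one output. The weights are random and untrained, the tanh hidden activation and linear output are assumptions, and Levenberg–Marquardt training is not shown; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HID = 5, 10                       # five inputs, ten hidden neurons

def init(shape):
    return rng.normal(scale=0.5, size=shape)

W_ih, b_h = init((N_HID, N_IN)), init(N_HID)   # input -> hidden
W_ho, b_o = init((1, N_HID)), init(1)          # hidden -> output
W_io = init((1, N_IN))                          # direct input -> output (cascade-forward only)
W_ch = init((N_HID, N_HID))                     # context -> hidden (Elman only)

def feedforward(x):
    h = np.tanh(W_ih @ x + b_h)
    return float((W_ho @ h + b_o)[0])

def cascade_forward(x):
    # Same as feedforward, plus a direct connection from the inputs to the output
    h = np.tanh(W_ih @ x + b_h)
    return float((W_ho @ h + W_io @ x + b_o)[0])

def elman(sequence):
    # The hidden state of the previous step is fed back through context units
    context = np.zeros(N_HID)
    outputs = []
    for x in sequence:
        h = np.tanh(W_ih @ x + W_ch @ context + b_h)
        outputs.append(float((W_ho @ h + b_o)[0]))
        context = h
    return outputs

x = np.full(N_IN, 0.5)                    # hypothetical standardized input vector
print(feedforward(x), cascade_forward(x), elman([x, x]))
```

With an empty context, the first Elman output coincides with the feedforward output. From the second step on, the recurrent connection makes the output depend on earlier inputs as well.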
The proposed models are designed and implemented in MATLAB R2017a. The observed attendance rate of a game is calculated by dividing the number of spectators who attended the game by the stadium capacity.
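As a small worked example of this ratio, the sketch below uses hypothetical spectator and capacity figures that are not taken from the data set.

```python
def attendance_rate(spectators, capacity):
    """Observed attendance rate: spectators divided by stadium capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return spectators / capacity

# Hypothetical figures: 60,000 spectators in an 80,000-seat stadium
rate = attendance_rate(60_000, 80_000)
print(rate)
```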
The proposed fuzzy logic model provides the following predicted attendance rates as shown in Table
The observed and predicted attendance rates of the proposed fuzzy logic model.
Number  Game  Observed attendance rate  Predicted attendance rate  Difference 

1  Barcelona-Espanyol  0.733  0.7076  0.026 
2  Barcelona-Malaga  0.703  0.7076  −0.004 
3  Barcelona-Sevilla  0.695  0.7075  −0.012 
4  Barcelona-Getafe  0.746  0.7075  0.038 
5  Barcelona-Girona  0.860  0.7073  0.152 
6  Milan-SPAL  0.567  0.7076  −0.141 
7  Milan-Roma  0.771  0.9046  −0.134 
8  Milan-Bologna  0.494  0.5076  −0.014 
9  Milan-Atalanta  0.572  0.6255  −0.054 
10  Milan-Crotone  0.532  0.4545  0.077 
11  Milan-Lazio  0.635  0.7075  −0.072 
12  Milan-Sampdoria  0.582  0.5821  0.000 
13  Inter-SPAL  0.715  0.7076  0.008 
14  Inter-Genoa  0.625  0.7076  −0.082 
15  Inter-Milan  0.979  0.7827  0.196 
16  Inter-Sampdoria  0.680  0.7076  −0.027 
17  Inter-Atalanta  0.652  0.7075  −0.055 
18  Inter-Chievo  0.667  0.7072  −0.040 
19  Inter-Udinese  0.650  0.7076  −0.058 
20  Inter-Lazio  0.773  0.8226  −0.050 
21  Inter-Roma  0.713  0.8847  −0.172 
22  Inter-Crotone  0.585  0.6737  −0.089 
23  Inter-Benevento  0.589  0.6153  −0.026 
To evaluate the performance of the model, the following MAPE and MAD indicators are computed as given in Table
MAPE and MAD values of the proposed fuzzy logic model.
Error measures  Fuzzy logic 

MAD  0.07 
MAPE  0.1 
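The two error measures can be recomputed from the observed and predicted rates listed for the fuzzy logic model. The sketch below assumes the standard definitions, with MAPE expressed as a fraction rather than a percentage; the function names are hypothetical.

```python
# Observed and fuzzy-logic predicted attendance rates of the 23 test games
observed = [0.733, 0.703, 0.695, 0.746, 0.860, 0.567, 0.771, 0.494,
            0.572, 0.532, 0.635, 0.582, 0.715, 0.625, 0.979, 0.680,
            0.652, 0.667, 0.650, 0.773, 0.713, 0.585, 0.589]
predicted = [0.7076, 0.7076, 0.7075, 0.7075, 0.7073, 0.7076, 0.9046, 0.5076,
             0.6255, 0.4545, 0.7075, 0.5821, 0.7076, 0.7076, 0.7827, 0.7076,
             0.7075, 0.7072, 0.7076, 0.8226, 0.8847, 0.6737, 0.6153]

def mad(actual, forecast):
    """Mean absolute deviation between observed and predicted values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percent error, expressed as a fraction of the observed value."""
    return sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

fuzzy_mad = mad(observed, predicted)
fuzzy_mape = mape(observed, predicted)
print(round(fuzzy_mad, 2), round(fuzzy_mape, 2))
```

Under these assumed definitions, the two values round to 0.07 and 0.10, in line with the values in the table.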
As mentioned before, two ANFIS models are designed. The prediction results of the models are shown in Table
Predicted attendance rates of the ANFIS models.
Number  ANFIS–hybrid predicted attendance rate  ANFIS–hybrid difference  ANFIS–backpropagation predicted attendance rate  ANFIS–backpropagation difference 

1  0.7153  0.018  0.6534  0.080 
2  0.7431  −0.040  0.8420  −0.139 
3  0.8172  −0.122  0.7728  −0.078 
4  0.7029  0.043  0.7108  0.035 
5  0.8300  0.030  0.7590  0.101 
6  0.4912  0.075  0.4628  0.104 
7  0.7612  0.010  0.7188  0.052 
8  0.4381  0.056  0.4377  0.056 
9  0.7510  −0.179  0.7506  −0.179 
10  0.4522  0.079  0.4211  0.110 
11  0.5932  0.042  0.5678  0.067 
12  0.4967  0.086  0.4756  0.107 
13  0.7521  −0.037  0.6733  0.042 
14  0.5960  0.029  0.5958  0.030 
15  0.9718  0.007  0.9815  −0.003 
16  0.6102  0.070  0.5702  0.110 
17  0.6037  0.048  0.5730  0.079 
18  0.6050  0.062  0.6022  0.065 
19  0.6638  −0.014  0.6453  0.005 
20  0.8284  −0.055  0.8537  −0.081 
21  0.7902  −0.077  0.7295  −0.016 
22  0.6445  −0.060  0.5682  0.017 
23  0.5989  −0.010  0.5328  0.057 
To evaluate the performance of both models, the MAPE and MAD values are computed against the observed data, as shown in Table
MAPE and MAD values of the proposed ANFIS models.
Error measures  ANFIS–hybrid  ANFIS–backpropagation 

MAD  0.05  0.07 
MAPE  0.09  0.11 
Nine ANN models are designed in total. Three network types, each with three different numbers of hidden neurons, provide different attendance predictions as shown in Table
Predicted attendance rates of the ANN models.
Number  Elman (10 n.)  Elman (15 n.)  Elman (20 n.)  Feedforward (10 n.)  Feedforward (15 n.)  Feedforward (20 n.)  Cascade (10 n.)  Cascade (15 n.)  Cascade (20 n.) 

1  0.6824  0.8157  0.6821  0.6965  0.6513  0.5026  0.7222  0.8148  0.8153 
2  0.7167  0.7558  0.7168  0.7055  0.6342  0.8690  0.8293  0.5956  0.7328 
3  0.8484  0.7915  0.7848  0.7986  0.7488  0.8274  0.8096  0.7770  0.7009 
4  0.7577  0.7347  0.6988  0.7218  0.7064  0.7379  0.7127  0.6668  0.6234 
5  0.7748  0.7924  0.8128  0.7732  0.8198  0.7810  0.7802  0.8027  0.6234 
6  0.5293  0.5795  0.5135  0.5800  0.5484  0.5966  0.5373  0.6586  0.7041 
7  0.8783  0.8702  0.8009  0.7958  0.8384  0.8027  0.7532  0.7873  0.6618 
8  0.4377  0.4396  0.4449  0.4500  0.4514  0.4454  0.4530  0.4381  0.4478 
9  0.6335  0.5802  0.6561  0.5224  0.7232  0.8754  0.7309  0.6917  0.6388 
10  0.4167  0.4521  0.4239  0.4689  0.4874  0.9415  0.5034  0.4456  0.4907 
11  0.7226  0.6883  0.5958  0.5237  0.6519  0.3890  0.5491  0.7369  0.4813 
12  0.5552  0.5135  0.5436  0.5367  0.4743  0.4676  0.5075  0.5539  0.5352 
13  0.8922  0.8611  0.7636  0.8545  0.8776  0.6948  0.8228  0.7760  0.6995 
14  0.6605  0.6632  0.5511  0.7128  0.5111  0.6442  0.7148  0.7101  0.6906 
15  0.8709  0.9132  0.9031  0.9593  0.9143  0.9723  0.8558  0.8923  0.9336 
16  0.6431  0.6420  0.6833  0.8884  0.7360  0.6439  0.7098  0.8056  0.8008 
17  0.7518  0.7169  0.5774  0.5762  0.6571  0.6943  0.6172  0.5982  0.6691 
18  0.7603  0.7430  0.6260  0.6906  0.7074  0.5893  0.6552  0.6540  0.5848 
19  0.7046  0.6786  0.6261  0.6920  0.7004  0.6014  0.6402  0.6698  0.5929 
20  0.8080  0.7675  0.8248  0.7449  0.8608  0.9138  0.7650  0.8357  0.8793 
21  0.8424  0.8067  0.8229  0.8493  0.8167  0.9099  0.7388  0.8230  0.7678 
22  0.6025  0.6041  0.5912  0.5570  0.6711  0.8681  0.5898  0.6774  0.6387 
23  0.5210  0.5235  0.5170  0.5642  0.5030  0.6915  0.5251  0.5185  0.5831 
Differences between the predicted and observed attendance rates.
Number  Elman (10 n.)  Elman (15 n.)  Elman (20 n.)  Feedforward (10 n.)  Feedforward (15 n.)  Feedforward (20 n.)  Cascade (10 n.)  Cascade (15 n.)  Cascade (20 n.) 

1  0.051  −0.082  0.051  0.037  0.082  0.231  0.011  −0.081  −0.082 
2  −0.013  −0.053  −0.014  −0.002  0.069  −0.166  −0.126  0.108  −0.030 
3  −0.153  −0.096  −0.090  −0.103  −0.054  −0.132  −0.114  −0.082  −0.006 
4  −0.012  0.011  0.047  0.024  0.039  0.008  0.033  0.079  0.122 
5  0.085  0.067  0.047  0.087  0.040  0.079  0.079  0.057  0.236 
6  0.037  −0.013  0.053  −0.013  0.018  −0.030  0.029  −0.092  −0.137 
7  −0.107  −0.099  −0.030  −0.025  −0.067  −0.032  0.018  −0.016  0.109 
8  0.056  0.054  0.049  0.044  0.042  0.048  0.041  0.056  0.046 
9  −0.062  −0.009  −0.084  0.049  −0.152  −0.304  −0.159  −0.120  −0.067 
10  0.115  0.079  0.108  0.063  0.044  −0.410  0.028  0.086  0.041 
11  −0.087  −0.053  0.039  0.111  −0.017  0.246  0.086  −0.102  0.154 
12  0.027  0.069  0.039  0.046  0.108  0.115  0.075  0.028  0.047 
13  −0.177  −0.146  −0.048  −0.139  −0.162  0.020  −0.107  −0.061  0.016 
14  −0.035  −0.038  0.074  −0.087  0.114  −0.019  −0.089  −0.085  −0.065 
15  0.108  0.066  0.076  0.020  0.065  0.007  0.123  0.087  0.045 
16  0.037  0.038  −0.003  −0.208  −0.055  0.037  −0.029  −0.125  −0.120 
17  −0.100  −0.065  0.075  0.076  −0.005  −0.042  0.035  0.054  −0.017 
18  −0.093  −0.076  0.041  −0.023  −0.040  0.078  0.012  0.013  0.083 
19  −0.055  −0.029  0.024  −0.042  −0.050  0.049  0.010  −0.020  0.057 
20  −0.035  0.006  −0.052  0.028  −0.088  −0.141  0.008  −0.063  −0.106 
21  −0.129  −0.093  −0.110  −0.136  −0.104  −0.197  −0.026  −0.110  −0.055 
22  −0.018  −0.019  −0.006  0.028  −0.086  −0.283  −0.005  −0.092  −0.054 
23  0.068  0.066  0.072  0.025  0.086  −0.102  0.064  0.071  0.006 
By looking at Table
To determine the most accurate ANN model, the MAPE and MAD values for all models are obtained as given in Table
MAPE and MAD values of the proposed ANN models.
Error measures  Elman (10 n.)  Elman (15 n.)  Elman (20 n.)  Feedforward (10 n.)  Feedforward (15 n.)  Feedforward (20 n.)  Cascade (10 n.)  Cascade (15 n.)  Cascade (20 n.) 

MAD  0.07  0.06  0.05  0.06  0.07  0.12  0.06  0.07  0.07 
MAPE  0.11  0.09  0.08  0.09  0.10  0.19  0.09  0.11  0.11 
In this section, the best-performing models of the ANN, ANFIS, and fuzzy logic approaches are compared with the observed data and with each other as shown in Table
The comparison of the predictions of the proposed models.
Game  Observed attendance rate  Predicted attendance rate (Elman NN–20 n.)  Predicted attendance rate (ANFIS–hybrid)  Predicted attendance rate (fuzzy logic) 

Barcelona-Espanyol  0.733  0.6821  0.7153  0.7076 
Barcelona-Malaga  0.703  0.7168  0.7431  0.7076 
Barcelona-Sevilla  0.695  0.7848  0.8172  0.7075 
Barcelona-Getafe  0.746  0.6988  0.7029  0.7075 
Barcelona-Girona  0.860  0.8128  0.8300  0.7073 
Milan-SPAL  0.567  0.5135  0.4912  0.7076 
Milan-Roma  0.771  0.8009  0.7612  0.9046 
Milan-Bologna  0.494  0.4449  0.4381  0.5076 
Milan-Atalanta  0.572  0.6561  0.7510  0.6255 
Milan-Crotone  0.532  0.4239  0.4522  0.4545 
Milan-Lazio  0.635  0.5958  0.5932  0.7075 
Milan-Sampdoria  0.582  0.5436  0.4967  0.5821 
Inter-SPAL  0.715  0.7636  0.7521  0.7076 
Inter-Genoa  0.625  0.5511  0.5960  0.7076 
Inter-Milan  0.979  0.9031  0.9718  0.7827 
Inter-Sampdoria  0.680  0.6833  0.6102  0.7076 
Inter-Atalanta  0.652  0.5774  0.6037  0.7075 
Inter-Chievo  0.667  0.6260  0.6050  0.7072 
Inter-Udinese  0.650  0.6261  0.6638  0.7076 
Inter-Lazio  0.773  0.8248  0.8284  0.8226 
Inter-Roma  0.713  0.8229  0.7902  0.8847 
Inter-Crotone  0.585  0.5912  0.6445  0.6737 
Inter-Benevento  0.589  0.5170  0.5989  0.6153 
The game-by-game comparisons of the predictions are shown in Figure
The comparison of the outperforming ANFIS, ANN, and fuzzy logic models with the observed data.
To determine the best-performing model among the twelve prediction models, the MAPE and MAD values of the best models of each technique are compared as shown in Table
Comparison of the MAPE and MAD values of the proposed models.
Error measures  ANN–Elman–20 neurons  ANFIS–hybrid  Fuzzy logic 

MAD  0.05  0.05  0.07 
MAPE  0.08  0.09  0.1 
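The selection of the best-performing model can be made explicit by ranking all twelve models on the MAD and MAPE values reported in the tables above. Ranking by MAPE first and using MAD as a tie-breaker is an assumption made for this sketch; the paper does not state a tie-breaking rule.

```python
# Reported (MAD, MAPE) values of the twelve models
reported = {
    "Fuzzy logic": (0.07, 0.10),
    "ANFIS-hybrid": (0.05, 0.09),
    "ANFIS-backpropagation": (0.07, 0.11),
    "Elman (10 n.)": (0.07, 0.11),
    "Elman (15 n.)": (0.06, 0.09),
    "Elman (20 n.)": (0.05, 0.08),
    "Feedforward (10 n.)": (0.06, 0.09),
    "Feedforward (15 n.)": (0.07, 0.10),
    "Feedforward (20 n.)": (0.12, 0.19),
    "Cascade (10 n.)": (0.06, 0.09),
    "Cascade (15 n.)": (0.07, 0.11),
    "Cascade (20 n.)": (0.07, 0.11),
}

# Sort by MAPE, then by MAD (assumed tie-breaker)
ranking = sorted(reported, key=lambda name: (reported[name][1], reported[name][0]))
best = ranking[0]
print(best, reported[best])
```

Under this ordering the Elman network with 20 hidden neurons ranks first and the ANFIS–hybrid model second, which agrees with the comparison in the table.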
As seen from the table and explained before, all three techniques provide accurate predictions. The Elman network with one hidden layer of 20 neurons is the most accurate, although the proposed ANFIS model performs comparably, with the same MAD and a slightly higher MAPE. Training the ANFIS model with additional data from different clubs and countries might improve its performance. In addition, the performance of the proposed fuzzy logic model might be enhanced by a few modifications, for example to the fuzzy rules.
In this study, nine ANN models, two ANFIS models, and a fuzzy logic model are designed to predict attendance demand in European football games. Since the results of demand forecasting are crucial inputs for decision making and planning, the accuracy of the forecasts is vital. Therefore, the most effective attendance determinants are chosen after a comprehensive literature review and interviews with experts. The distance, game day, performance of the home team, performance of the away team, and uncertainty of outcome are selected and used in all twelve models in this study.
The data of 236 games of three European football clubs are used for training the ANN and ANFIS models, and one season’s data of these clubs are used for testing the ANN, ANFIS, and fuzzy logic models. The performance of each model is evaluated by two statistical indicators, MAPE and MAD. Based on the prediction results, it can be inferred that the ANN, ANFIS, and fuzzy logic models provide effective and competitive predictions, since the MAPE and MAD values are generally around or below 0.10. However, the Elman ANN model with one hidden layer of 20 neurons delivers the most accurate results among the twelve models. Its MAPE and MAD values are 0.08 and 0.05, respectively, indicating high prediction accuracy.
Even though NN, ANFIS, and fuzzy logic models have been proposed for a similar purpose before, this study extends the literature in the following ways. First, including the uncertainty of outcome, which captures the effect of factors such as injured and suspended players, in the ANN and ANFIS models improved the accuracy of the predictions. Second, a large, diverse data set is used to train and test the models, and different input variables are established in the fuzzy logic model. Finally, nine ANN models are designed, which allows a comprehensive analysis of the network types. Thus, the Elman network type, which provides the most accurate predictions among the ANN models, is proposed for this purpose for the first time in this study.
Future research may analyze different network types of the ANN technique. In addition, alternative demand factors may be included in the models to evaluate whether the prediction results improve. Finally, a larger data set may be used to train the models.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that there are no conflicts of interest regarding the publication of this article.