Analysis and Modeling of Football Team’s Collaboration Mode and Performance Evaluation Using Network Science and BP Neural Network



Introduction
Football is known as the "world's first game" and is the most influential sport in the world [1]. The standard 11-player football match consists of 10 outfield players and 1 goalkeeper on each team, 22 players in total, who attack and defend on a rectangular grass field [2]. In the game, players from both sides try to shoot the ball into the other side's goal. The more goals a team scores, the better its chances of victory [3]. When the game is over, the team that has scored more goals wins [4]. Draws are generally allowed, except in certain situations such as knockout rounds. If the scores are level at the end of the specified playing time, the winner is determined by the rules of the competition, for example by drawing lots, extra time, a replay, or penalty kicks (12 yards) [5].
Network science is widely used for classic problems of social systems [6], such as the stimulation of cooperation among individuals, epidemic infection, or the spread of information on social networks [7]. A football team's playing style and the contributions of individual players can be revealed through indicators from network science [8]. Current research on cooperative systems mainly focuses on the cooperative process and the cooperative results [9][10][11]. Research on the collaborative process mainly concerns the behavioral characteristics of the partners, which can be explicit or implicit. For instance, Reid et al. [12] proposed a semistructured interview method to study the collaborative process between nurses and patients and found that this collaboration was conducive to the formulation of care planning. Research on collaborative results mainly focuses on team performance evaluation: the factors at work during team collaboration and the degree of their influence on the collaborative results are analyzed. For example, Yang et al. [13] proposed a moderated mediation model to analyze the relationship among team reflexivity, team diversity, and team performance and found that team diversity can enhance the mediation relationship between team reflexivity and team performance. This paper mainly uses the complex network science (CNS) method to establish and analyze the football team's cooperation network and uses the back-propagation neural network (BPNN) method to evaluate the team's performance.
The CNS method can analyze the whole course of a football match, and the collaborative network it establishes is both flexible and dynamic: it can reflect the overall situation of the team and also track the dynamics of each player during the game, as in the study by Buldu et al. [5,14]. This paper uses the CNS method to analyze cooperation in football teams, builds a matching network from a large amount of match data, and then reveals the mechanism of team cooperation, focusing on the team's attack and defense strategy. The passing network is constructed with the CNS method used by Buldu et al. [3,15]. However, the weights among nodes are not determined by the entropy method; instead, corresponding improvements are made according to the actual game data.
One of the classic methods of performance evaluation is the key performance indicator (KPI) method, but it is gradually being replaced by BPNN because of its limitations in weight assignment. For instance, Dony et al. [16] proposed a hybrid principal component neural network method to evaluate the compression performance of digital chest radiographs. The evaluation process is fully quantifiable, and the evaluation results are highly reliable. This paper builds on the BPNN-based performance evaluation model for nonprofit hospitals proposed by Li and Yu [17], improves its irrational weighting, and applies the model to team performance evaluation. Because team performance evaluation is relatively complex, it has a great impact on the accuracy and convergence speed of the network [18][19][20]. The solution proposed in this paper is to enhance the adaptability of the evaluation network by using a large amount of data, thus greatly improving the accuracy of the network's interpretation.

Complex Network Science.
Complex network science (CNS) is a development of the graph theory approach that can analyze more complex and dynamic systems. The advantages of using CNS to analyze a system are as follows [21][22][23]. For a large system, the behavior of the system cannot be explained from an individual perspective. The point of network science is to represent individuals by nodes and the connection between two individuals by directed or undirected links among nodes. When enough nodes and links are used to describe the system, a network is formed, usually represented by an adjacency matrix, so that the system is rationally characterized [24]. Then the network method is used to analyze the distribution of nodes and links and the interaction between nodes. Finally, the results are mapped back to the system to describe the relationships among system members.
The aggregation state of system members can be explained and analyzed by using the clustering coefficient of the network. In a network, the weights and directions of the links among nodes determine the degree of node aggregation [25]. The clustering coefficient expresses the probability of a connection between two indirectly related nodes. The members of the system may also move to another state: at a certain point in the future, the relationships among the members will change with the continuous evolution of the system [26][27][28]. The CNS method can then be used to analyze how the system evolves and how its members will be connected in the future [29].
System cooperation and division of labor can be characterized by the degree of nodes in the network. In reality, the members of a system play different roles, while in a network the degree of a node reflects whether its connections are tight or loose. By mapping the analysis results to the system, the cooperative division of labor among members can be analyzed [30].
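As an illustration of how node degree can expose the division of labor, the following minimal sketch computes the weighted out-degree (passes made) and in-degree (passes received) of each player in a directed passing network. The pass counts and the function name `node_strengths` are hypothetical and serve only to show the calculation; the player labels follow those used later in the paper.

```python
# Hypothetical pass counts: passes[(passer, receiver)] = completed passes.
passes = {
    ("G", "d"): 12, ("d", "a"): 20, ("a", "II"): 15,
    ("II", "I"): 9, ("a", "d"): 18, ("I", "a"): 4,
}

def node_strengths(pass_counts):
    """Return {player: (out_strength, in_strength)} for a weighted directed network."""
    strengths = {}
    for (src, dst), n in pass_counts.items():
        out_s, in_s = strengths.get(src, (0, 0))
        strengths[src] = (out_s + n, in_s)        # passes made by src
        out_d, in_d = strengths.get(dst, (0, 0))
        strengths[dst] = (out_d, in_d + n)        # passes received by dst
    return strengths

strengths = node_strengths(passes)
for player, (out_s, in_s) in sorted(strengths.items()):
    print(f"{player}: passes made={out_s}, passes received={in_s}")
```

A player with a high out-degree and a high in-degree acts as a hub of the passing game, whereas a player with a high in-degree and a low out-degree is more likely a finisher.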
Going back to the system to be analyzed, it is necessary to consider the errors generated by replacing the system analysis with the CNS method [31]. Since the distribution of system members is extremely random, and the change of nodes in the network is limited within a certain probability range, the difference itself will bring errors [32].

BPNN.
BPNN is an artificial neural network whose analytical process is similar to the decision-making process of neurons in the human brain [33]. BPNN is a supervised learning method, that is, it learns under the guidance of known target outputs. Using BPNN for team performance evaluation has theoretical advantages [34][35][36].
First of all, team performance is a continuous function of many complex factors, and a change in any factor affects the overall performance of the team [37][38][39]. These factors can be at the member level: for example, a member's behavior, such as passing, shooting, and fouling, has a great impact on the team's score and therefore on the team's performance [40]. They can also be at the team level [41]: for example, even when each member performs consistently, the team's flexibility and changes of strategy can affect the team [42][43][44]. Theoretical study shows that a three-layer BPNN structure can approximate any continuous function, which lays a foundation for analyzing performance evaluation with BPNN [45].
Secondly, determining the input of the BPNN is the foundation of network construction, and the data that affect team performance must be analyzed at both levels for the specific team [45][46][47]. Determining the number of hidden-layer nodes is the key to ensuring the accuracy of the BPNN. In this paper, the theoretical formula and the trial-and-error method are combined, and the analysis results are satisfactory. Determining the network output is also a necessary step, and a reasonable choice of output makes the results intuitive.
Finally, the network is judged and improved by using the regression level and the error training state. The regression level reflects how well the BPNN approximates the performance evaluation function. The error training state reflects the variation of the error during network training.
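The two diagnostics can be made concrete with a short sketch. It assumes that the regression level is the correlation coefficient R between network outputs and targets, as neural network toolboxes commonly report it, and that the training error is the mean squared error; neither definition is stated explicitly in this paper, and the function names are hypothetical.

```python
import math

def regression_r(targets, outputs):
    """Pearson correlation between targets and network outputs (assumed 'regression level')."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)

def mse(targets, outputs):
    """Mean squared error, tracked per epoch to follow the training error."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

# Hypothetical targets and outputs for four matches.
targets = [0.0, 1.0, 1.0, 0.0]
outputs = [0.1, 0.8, 0.9, 0.2]
print(regression_r(targets, outputs), mse(targets, outputs))
```

A value of R close to 1 together with a steadily decreasing error curve indicates that the network approximates the evaluation function well.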

Data Processing and Assumptions.
When analyzing the competition, all the matches of Team A (38 games in total) are analyzed, which makes the relevant data of Team A and its competitors available. For each player, the number of successful passes, the number of shots, the x-coordinate relative to the network center of mass, the y-coordinate relative to the network center of mass, and the number of passes made to each of the other ten teammates are counted. These statistics are weighted to calculate each player's position, personal strength, playing style, and likelihood of passing to each of the other ten teammates. Several basic assumptions simplify the analysis. (1) It is assumed that in every game of Team A, no condition other than performance affects the result. (2) It is assumed that there is no correlation between the team-level factor set and the member-level factor set. (3) It is assumed that a change of team coach only affects the internal factors of the team and has no other external influence.
When analyzing the team performance, the data used in this model are mainly from the full event.csv, passing event.csv, and matches.csv files provided by the Consortium for Mathematics and Its Applications (COMAP). After processing, the data are stored in a file named data1.mat, which contains two subdatasets named P and T. Each column of dataset P corresponds to one of Team A's 38 matches, and each row corresponds to one of 11 event types, including foul and pass. Dataset P mainly serves as the original input data for BP neural network training. When the U-AHP method is used to analyze the weights, the relevant data in the three files are needed; this is not discussed here. The data are normalized according to equation (1).
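The normalization step can be sketched as follows. The sketch assumes min-max scaling of each row of P (one event type across all matches) to the interval [0, 1], which is a common choice before training a network with sigmoid transfer functions; this is an assumption, not the formula stated by the authors, and the function name and sample counts are hypothetical.

```python
def min_max_normalize(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical pass counts of one team over five matches (one row of P).
row = [310, 402, 355, 298, 420]
print(min_max_normalize(row))
```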

Analysis and Modeling Based on Network Science.
Python is used to create a passing network among players, in which each player is a node and each pass constitutes a link between players. Using network science, the team is regarded as a complex network whose nodes (players) interact to overcome the opponent's network. Different network indexes are used to extract the characteristics of Team A, including the number of passes, the number of shots, the x-coordinate relative to the network center of mass, the y-coordinate relative to the network center of mass, the shortest path length, the maximum eigenvalue of the adjacency matrix, and the algebraic connectivity. At the same time, the influence of the opponent's network attributes on Team A's network attributes is considered. A variety of scales are explored, from micro (individual) to macro (interaction of all players), and from short (minute to minute) to long (the whole game and the whole season). The main steps for establishing the network are as follows.
(1) Estimate each player's position by averaging the x-coordinate and the y-coordinate of the player relative to the network centroid, and take these averages as the node coordinates.
(2) Calculate the node radius R. The size R of a player's node depends on the player's number of passes P and number of shots S.
(3) Determine the link width W. The width W of the pass link between a player and a teammate depends on the number of passes P between the player and each of the other ten players.
Based on the analysis above, the network obtained for the competition at one time is shown in Figure 1 (refer to the appendix for the player number represented by each label).
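The construction steps above can be sketched in a few lines of Python. The exact expressions for R and W are not reproduced in this text, so the sketch uses simple linear placeholders (R proportional to passes plus shots, W proportional to the pass count on the link); these scalings, the event format, the sample data, and the function name `build_network` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical pass events: (passer, receiver, x, y), with (x, y) the passer's position.
events = [
    ("a", "II", 40.0, 50.0),
    ("a", "II", 44.0, 54.0),
    ("II", "I", 60.0, 30.0),
    ("a", "I", 42.0, 49.0),
]
shots = {"I": 3, "II": 1}  # hypothetical shot counts per player

def build_network(events, shots, r_scale=1.0, w_scale=1.0):
    """Return (nodes, widths): node position/radius and link widths (placeholder formulas)."""
    xs, ys = defaultdict(list), defaultdict(list)
    links = defaultdict(int)
    pass_count = defaultdict(int)
    for passer, receiver, x, y in events:
        xs[passer].append(x)
        ys[passer].append(y)
        links[(passer, receiver)] += 1
        pass_count[passer] += 1
    players = set(pass_count) | set(shots) | {r for _, r, _, _ in events}
    nodes = {}
    for p in players:
        # Step 1: node coordinates are the player's average position (None if unseen).
        pos = (sum(xs[p]) / len(xs[p]), sum(ys[p]) / len(ys[p])) if xs[p] else None
        # Step 2: placeholder radius, linear in passes P plus shots S.
        radius = r_scale * (pass_count[p] + shots.get(p, 0))
        nodes[p] = {"pos": pos, "radius": radius}
    # Step 3: placeholder link width, linear in the number of passes on the link.
    widths = {pair: w_scale * n for pair, n in links.items()}
    return nodes, widths

nodes, widths = build_network(events, shots)
print(nodes["a"], widths[("a", "II")])
```

The resulting dictionaries can be handed to any plotting or graph library to draw a diagram in the style of Figure 1.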
According to the network diagram above, a lot of information can be extracted.
(1) Team A is good at defense in the game. It is not hard to see that Team A's defenders and forwards are positioned relatively deep; accordingly, this team is more defensive than offensive. It can be inferred that the playing style of Team A follows a "5,4,1" pattern. The highlight of the team's defense is that it can not only maintain the width of the defense but also quickly build up enough defensive strength in the middle and maintain a "2 vs. 1" situation. Of course, such a team also has a natural disadvantage: it is weak on the counterattack.
(2) The players of Team A have a clear division of duties, especially in passing. Player a passes the ball most often. In 38 games, player a made 1225 passes, while player II made 859 successful passes and player I only 238. However, when processing the data, it is found that the number of times I passes to striker F2 is half of the number of times that II passes to striker I, who is farther away. Almost all of the 38 games are like this, and it is reasonable to speculate that the division of duties between them is very different.
In terms of the lineup, Team A adopts "5-3-2" (a defensive lineup) rather than the best lineup, "4-4-2". In the first game, the position of the central defender was crucial, but he did not play a role, and the defensive strength in the center was sacrificed. Besides, the defensive width was also reduced. In the second game, defenders 1, 9, and 7 were too dispersed, although they kept the defensive width. The "4-2-2" formation was regarded as the standard, and the clustering analysis of the adopted formations showed that the clustering coefficients were 0.5924 and 0.6785, respectively. From the perspective of personal strength, only four of the predicted best players were selected in the first game and seven in the second, but none played their due role because of the lineup. The two games ended in 1-5 and 2-5 defeats, respectively.
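Two of the network indexes named in this section, the maximum eigenvalue of the adjacency matrix and the shortest path length, can be computed directly from a matrix of pass counts. The sketch below is a minimal pure-Python illustration: it estimates the largest eigenvalue by power iteration and the average shortest path length by the Floyd-Warshall algorithm. Taking the distance of a link as the inverse of its pass count is an assumption made here, as are the sample matrix and the function names; algebraic connectivity is not covered.

```python
import math

def largest_eigenvalue(adj, iterations=200):
    """Power-iteration estimate; assumes a unique eigenvalue of largest magnitude."""
    n = len(adj)
    vec = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0
        vec = [x / norm for x in nxt]
        value = norm
    return value

def average_shortest_path(adj):
    """Floyd-Warshall with distance 1/weight; mean over reachable ordered pairs."""
    n = len(adj)
    dist = [[0.0 if i == j else (1.0 / adj[i][j] if adj[i][j] > 0 else math.inf)
             for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    finite = [dist[i][j] for i in range(n) for j in range(n)
              if i != j and dist[i][j] < math.inf]
    return sum(finite) / len(finite) if finite else math.inf

# Hypothetical pass counts among three players.
pass_matrix = [[0, 4, 2], [4, 0, 1], [2, 1, 0]]
print(largest_eigenvalue(pass_matrix), average_shortest_path(pass_matrix))
```

A larger leading eigenvalue indicates a more strongly connected passing network, and a shorter average path indicates that the ball can reach any player in fewer, or more frequently used, passes.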

Analysis and Modeling Based on BPNN.
Firstly, the structure of the neural network needs to be determined. Theoretical research shows that a BP neural network with a three-layer structure can approximate any differentiable nonlinear function, but this structure can fall into a local optimum. Therefore, this paper introduces a momentum factor, which helps the network avoid local optima and accelerates convergence. Figure 3 shows a flowchart of performance evaluation using the BP neural network.
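The role of the momentum factor can be illustrated with the usual update rule, in which each weight change is the sum of a fraction of the previous change and a gradient step. The sketch below applies this rule to a one-dimensional quadratic rather than to a neural network; the objective, the parameter values, and the function name are assumptions chosen only to show the mechanics.

```python
def gradient_descent_momentum(grad, w0, learning_rate=0.1, momentum=0.9, steps=500):
    """Minimize a 1-D function given its gradient, using a momentum term."""
    w = w0
    velocity = 0.0
    for _ in range(steps):
        # New step = momentum * previous step - learning_rate * current gradient.
        velocity = momentum * velocity - learning_rate * grad(w)
        w = w + velocity
    return w

# Hypothetical objective f(w) = (w - 3)^2 with gradient 2 (w - 3).
w_final = gradient_descent_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

Because the previous step is carried forward, the search can roll through shallow dips in the error surface and moves faster along directions in which the gradient keeps the same sign.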
Determining the number of hidden-layer nodes of the BP neural network is crucial for the computational stability of the network. Although there are many empirical formulas in practice, relying on them alone often produces large errors. The method selected in this paper combines the empirical formula with the trial-and-error method, which not only avoids the error of relying on the empirical formula alone but also reduces blindness. Choosing a suitable transfer function for the BP neural network helps accelerate convergence. Since there is no negative number transformation in the data of this problem, the log-sig function and the tan-sig function are selected as the transfer functions.
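For reference, the two transfer functions can be written out explicitly. The sketch below uses the standard definitions of the log-sigmoid, which maps any input to (0, 1), and the tangent-sigmoid, which maps any input to (-1, 1) and is mathematically equal to the hyperbolic tangent; the Python function names are chosen here for illustration.

```python
import math

def logsig(x):
    """Log-sigmoid transfer function: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tansig(x):
    """Tangent-sigmoid transfer function: output in (-1, 1), equal to tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

for x in (-2.0, 0.0, 2.0):
    print(x, logsig(x), tansig(x))
```

Both functions are smooth and bounded, which is what allows the error to be back-propagated through the layers by the chain rule.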
To study the problem in layers, the contribution of the scheme layer to the target layer is obtained through the weight relationships between the criterion layer and the target layer and between the criterion layer and the scheme layer. This method simplifies a complex problem layer by layer and is especially suitable for the quantitative analysis of qualitative problems. There is a class of problems in which the scheme layer and the criterion layer are independent of each other but the target layer is consistent, which is the case for the performance evaluation studied in this paper. In such problems, the uncrossed weights between factors can be solved by using the U-AHP method: the weight of each criterion layer is obtained first, and then the total weight of the scheme layer with respect to the target layer is obtained by weight transfer. Compared with the analytic hierarchy process (AHP), the U-AHP is not different in essence but differs slightly in the calculation of weights. The specific application is shown in the modeling process of this paper and is not described in detail here.
In this model, the input layer has 9 nodes and the output layer has 4 nodes. The input data mainly comprise nine actions of team members, such as passing and shooting, in 38 games. The output data comprise four indexes: own score, opponent's score, game result (1 for a win, 0 for a loss), and number of successful passes, as shown in Table 1. The number of hidden-layer nodes and the network parameters are then determined; the network parameters mainly include the maximum number of learning iterations, the learning rate, the learning target error, and the momentum factor. The programming language is shown in Table 2.
The general empirical formula for the number of hidden-layer nodes is $hl = \log_2(inl + oul) + exp$, where $inl$ and $oul$ are the numbers of input-layer and output-layer nodes and $exp$ is the error term, which takes a value between 1 and 10. The number of hidden-layer nodes selected in this paper is 12. After the network parameters are determined, the weights of team members' behaviors can be obtained by using the test data. When analyzing team-level factors, adaptability, flexibility, rhythm control, and the opponent's strategy are selected as indicators. The specific contents of each indicator are shown in Table 3.
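The empirical formula narrows the trial-and-error search to a small set of candidates. The sketch below evaluates it for 9 inputs and 4 outputs with integer values of the error term from 1 to 10; rounding the result to the nearest integer and restricting the error term to integers are assumptions, as is the function name.

```python
import math

def hidden_node_candidates(n_inputs, n_outputs, exp_values=range(1, 11)):
    """Candidate hidden sizes from hl = log2(inl + oul) + exp, rounded to integers."""
    base = math.log2(n_inputs + n_outputs)
    return sorted({round(base + e) for e in exp_values})

candidates = hidden_node_candidates(9, 4)
print(candidates)
```

Each candidate would then be tried in turn, keeping the size that gives the lowest validation error; the value 12 chosen in this paper lies inside the candidate range.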
By using the existing data and the entropy weight method, the evaluation network at the team level is constructed. The U-AHP method is used to build the total performance evaluation index of the team, and the classical AHP method of calculating weights is used to obtain the total weights of the member level and the team level with respect to the goal. It should be noted that because the weights of the member level and the team level with respect to the overall goal are relatively large, the KPI method of performance appraisal is introduced to evaluate the weights of these two overall indicators.

Input: P_i, T_i, exp_i, ne, ng, nl, nmc
Output: ps_i, ts_i, pre, bhl, psout_i, tsout_i, outsim_i
for i = 1, ..., n_iter do
    Find the best number of hidden layers using equation (5)
    if exp ≥ 1 and exp ≤ 10 then
        Normalize data using equation (1)
        Calculate partial derivatives of network weights and bias using equation (2)
        Initialize and update the network weights and bias
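The core of the training procedure, computing the partial derivatives of the error and updating the weights and biases with a momentum term, can be sketched as a small three-layer network in pure Python. This is a toy illustration and not the authors' implementation: it has a single output instead of four, uses the log-sig function in both layers, and is trained on a hypothetical two-input data set; the function name, the parameter values, and the data are assumptions.

```python
import math
import random

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(samples, n_hidden=3, learning_rate=0.5, momentum=0.8, epochs=2000, seed=0):
    """Batch back-propagation with momentum; returns (predict, mse_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each weight row ends with a bias term.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    v_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    v_o = [0.0] * (n_hidden + 1)

    def forward(x):
        hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in w_h]
        out = logsig(sum(w * h for w, h in zip(w_o, hidden)) + w_o[-1])
        return hidden, out

    history = []
    for _ in range(epochs):
        g_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_o = [0.0] * (n_hidden + 1)
        sq_err = 0.0
        for x, target in samples:
            hidden, out = forward(x)
            err = out - target
            sq_err += err * err
            delta_o = err * out * (1.0 - out)            # output-layer error signal
            for j in range(n_hidden):
                g_o[j] += delta_o * hidden[j]
                delta_h = delta_o * w_o[j] * hidden[j] * (1.0 - hidden[j])
                for i in range(n_in):
                    g_h[j][i] += delta_h * x[i]
                g_h[j][-1] += delta_h
            g_o[-1] += delta_o
        history.append(sq_err / len(samples))
        # Momentum update of output-layer and hidden-layer weights and biases.
        for j in range(n_hidden + 1):
            v_o[j] = momentum * v_o[j] - learning_rate * g_o[j]
            w_o[j] += v_o[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                v_h[j][i] = momentum * v_h[j][i] - learning_rate * g_h[j][i]
                w_h[j][i] += v_h[j][i]
    return (lambda x: forward(x)[1]), history

# Hypothetical toy data: the target is 1 only when both inputs are 1.
samples = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]
predict, history = train_bp(samples)
print(history[0], history[-1], predict((1.0, 1.0)))
```

The list `history` plays the role of the error training curve; in the model of this paper the same loop would run on the normalized match data with 9 inputs, 12 hidden nodes, and 4 outputs.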

Character of Team A in Competition.
By selecting the specific data of Team A's passing and shooting over the whole season (14043 passes and 320 shots), it is concluded that the weight of passing is far greater than that of shooting. The number of successful passes of each player is statistically analyzed, and the best positions of players with a passing success weight of 91.84% (as shown in Figure 4(a)) over the whole season are highlighted (as shown in Figure 4(c)).
By further analyzing the data, the passing data of each player is obtained and shown in Table 4 and Figure 5.
It can be seen from Figure 5(a) that G is the goalkeeper and that II and d play in every game. Players 1-7, I, a, c, and f have high attendance (more than half of the total number of games), which shows that the positions of these players have a great influence on rhythm control, passing, and shooting. There are also players with similar attacking positions and passing methods, such as 2, f and 3, 4, and 5, 6 and 7. Coaches may employ alternate tactics. From Figures 5(a) and 5(b), considering both a larger number of passes and a higher success rate, players a, II, 1, c, 3, 5, G, 4, 2, f, d, 7, and others pass better and can be given priority in the analysis of structural strategy. It can be seen from Figure 5(b) that I, II, IV, f, and other players shoot the most; they should be the shooters of the team and to a large extent determine the choice of strikers in each game.
Finally, ten football indexes are compared between Team A and the opposing teams over the season (as shown in Figure 6). For each index, the left column is the average value for Team A and the right column is the average value for the 19 opposing teams. The indexes in the figure are the number of free kicks, the number of duels, the number of fouls, goalkeeper leaving the line, offside, other on the ball, passes, save attempts, shots, and times of transmission. In Figure 6, the difference between Team A and the other 19 teams lies mainly in the numbers of shots and passes, mainly because Team A is a team with conservative tactics. It can be seen that Team A has good overall consistency on the ball. The members of Team A have strong cohesion, so they control the rhythm of the game well. Figure 6 compares the 10 parameters directly related to the topology of the average passing network and gives a detailed description. In the previous figure (Figure 2), the game data related to the number of triangles created among any three players are plotted. The clustering coefficient is an indicator of the local robustness of the network: when a triangle connects three nodes and the link between two of them is lost, the other node can still be reached through the other two sides of the triangle. In football, the clustering coefficient measures the triangulation among three players.
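The triangle interpretation of the clustering coefficient can be made concrete with a short sketch. It uses the standard unweighted, undirected definition (the fraction of pairs of a player's neighbours that are themselves linked), which is a simplification of the weighted directed networks analyzed in this paper; the neighbour sets and function names are hypothetical.

```python
from itertools import combinations

def local_clustering(neighbors, node):
    """Fraction of neighbour pairs of `node` that are directly linked to each other."""
    nbrs = sorted(neighbors[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
    return 2.0 * links / (k * (k - 1))

def average_clustering(neighbors):
    """Mean local clustering coefficient over all players."""
    return sum(local_clustering(neighbors, n) for n in neighbors) / len(neighbors)

# Hypothetical passing links: a, II, and I form a triangle; d is linked only to a.
neighbors = {
    "a": {"II", "I", "d"},
    "II": {"a", "I"},
    "I": {"a", "II"},
    "d": {"a"},
}
print(local_clustering(neighbors, "a"), average_clustering(neighbors))
```

Here players II and I each sit in a complete triangle and score 1, while player d, who can only be reached through a, scores 0 and marks a fragile part of the network.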
In the first question, it was roughly inferred from the passing network that Team A plays conservatively. Figure 6 illustrates this even more strongly: Team A is more defensive in the game. For example, Team A has more passes and duels, which are indicators of a defensive strategy. Another indicator of a defensive strategy is the number of fouls, and not surprisingly Team A commits a great number of fouls.

Team Performance Evaluation of Team A in Competition.
Firstly, the neural network method is used to evaluate the member-level indicators; the regression level and error training are shown in Figure 7. The regression level reflects how well the variables explain the results: the closer the value is to 1, the better the interpretability. The error training level reflects when and how the error decreases. The method used in this model is highly successful. Applying it further to each indicator gives the degree to which the indicator explains the results, which can be used as the basis for performance evaluation at this level; the specific results are presented in the result analysis.

Mathematical Problems in Engineering
Secondly, in the process of team-level analysis, values must be assigned to the indicators. Using the entropy weight method and the known data, the corresponding weight matrix is obtained, and the corresponding concrete results follow from analyzing this matrix.
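For readers unfamiliar with the entropy weight method, a minimal sketch of its usual form is given below: each indicator column is converted to proportions, its normalized information entropy is computed, and indicators with lower entropy (more variation across matches) receive larger weights. The sample data and the function name are hypothetical, and the sketch does not reproduce the weight matrix obtained by the authors.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for the columns of a samples-by-indicators matrix (values >= 0)."""
    m, n = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            if x > 0:
                p = x / total
                entropy -= p * math.log(p)
        entropy /= math.log(m)                 # normalize entropy to [0, 1]
        diversities.append(1.0 - entropy)      # degree of diversification
    s = sum(diversities)
    if s == 0.0:
        return [1.0 / n] * n                   # all indicators equally uninformative
    return [d / s for d in diversities]

# Hypothetical indicator values (rows: matches, columns: three team-level indicators).
data = [[0.6, 0.2, 0.5], [0.7, 0.9, 0.5], [0.5, 0.1, 0.5], [0.8, 0.6, 0.5]]
print(entropy_weights(data))
```

The third column is constant, so it carries no information for distinguishing matches and receives a weight of essentially zero.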
Using the BP neural network method, the influence weight of each competition behavior on the results of the output layer of the competition in all competitions also can be easily obtained, as shown in Table 5.
Behavior analysis explains the influence of each behavior on the result of the game very well. Specifically, the factors that affect the number of successful passes are free kick, foul, pass, offside, and other on the ball; the factors that affect the result of the game are offside, other on the ball, pass, and shot; the main factors that affect the score of one's own side are shot and pass; the main factors that affect the score of the opponent are shot and pass.
Finally, to facilitate the subsequent analysis, the weight values at the member level should be given. Shot, pass, offside, and duel are selected as the main influencing factors at the member level, and the weight vector is $w = (0.3075, 0.3823, 0.1706, 0.1396)$. The weights can be determined by the entropy weight method, because the gray level of the data is too large to get the weight matrix by normal factor analysis (Table 6). Table 6 shows the contribution of the two factors to team performance evaluation: the weight of the team level is 0.8 and that of the member level is 0.2. The results strongly suggest that, in a competitive sport, teamwork and interaction are far more important than individual performance.
At the team level, there are three main subfactors, namely, team adaptability, team flexibility, and team rhythm control. The weight of team rhythm control is 0.6236, which makes it the biggest subfactor at the team level, followed by team flexibility and adaptability, whose weights are 0.2165 and 0.1608, respectively. The results suggest that a team's rhythm control largely determines the outcome of the game, so it is very important to develop appropriate strategies to ensure the team's control of the rhythm of the game.
At the member level, passing and shooting are the biggest subfactors, with weights of 0.3823 and 0.3075, respectively. Offside and duel behaviors have less impact, with weights of 0.1706 and 0.1396, respectively. Based on this analysis, Team A's players should pass more and duel less.
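The weight transfer described in the modeling section can be illustrated with the reported numbers: the global weight of each subfactor is the product of its level weight (0.8 for the team level, 0.2 for the member level) and its weight within that level. The sketch below uses the weights given in this section, taking the shot weight from the weight vector w; the data structure, the function names, and the idea of combining subfactor scores linearly are assumptions made for illustration.

```python
level_weights = {"team": 0.8, "member": 0.2}
sub_weights = {
    "team": {"rhythm control": 0.6236, "flexibility": 0.2165, "adaptability": 0.1608},
    "member": {"pass": 0.3823, "shot": 0.3075, "offside": 0.1706, "duel": 0.1396},
}

def global_weights(level_weights, sub_weights):
    """Weight transfer: global weight = level weight * weight within the level."""
    return {(level, name): level_weights[level] * w
            for level, subs in sub_weights.items() for name, w in subs.items()}

def overall_score(scores, weights):
    """Weighted sum of subfactor scores (assumed linear aggregation)."""
    return sum(weights[key] * scores[key] for key in weights)

gw = global_weights(level_weights, sub_weights)
for key, value in sorted(gw.items(), key=lambda item: -item[1]):
    print(key, round(value, 5))
```

Under these weights, rhythm control alone accounts for roughly half of the overall evaluation, which is consistent with the emphasis placed on it above.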

Conclusions
This paper proposed a new method for analyzing and modeling the performance of Team A, and the following four conclusions are obtained: (i) The best lineup is "4-4-2", obtained by restoring the players' most commonly used lineup positions in the season; network science, cross-validated with the data, shows that Team A is a tactically conservative team. (ii) A performance index evaluation model is established based on the BP-(U-AHP) method to comprehensively evaluate the team. The conclusion is that Team A ought to pay attention to the importance of individual behavior at the member level and to the influence of factors including adaptability and flexibility on the strength of the team. (iii) The conservative style of the team's game can be determined through the analysis of the game data, the best selection of personnel for the team's on-field structural positions can be obtained, and the best structural strategy guidance can be designed for the next season. (iv) The performance of complex systems can be evaluated from three aspects: supervision rules, team performance, and member performance.
The construction of this model reflects the application of group dynamics in practice. Through the construction of the football network, the paper analyzes a series of internal and external factors in which Team A performs better than the other 19 teams, so that the advantages of the team can be better displayed and the effectiveness of the team can be fully realized.

Supplementary Materials
In this paper, three datasets are used as supplementary materials. The datasets contain the real-time information of Team A and 38 teams in the game, the results of the games, and the information on successful passes. Because the datasets are relatively large, they are uploaded as an attachment. (Supplementary Materials)