An Improved Logistic Regression Method for Assessing the Performance of Track and Field Sports

Track and field is an important part of sports. Track and field athletes are an important reserve force for the development of national sports. An accurate assessment of track and field athletes' performance can help them develop more appropriate training programs and improve their performance. In order to assess the performance of track and field athletes better, this paper proposes an improved logistic regression method. Firstly, this method uses factor analysis to reduce the data dimensions of the factors that affect the performance of track and field athletes, and uses the principal component analysis to select common factors and their corresponding values. Then, according to the common factors, a binary logistic regression model is established to evaluate the performance of track and field athletes. Experiments show that the method can effectively evaluate the performance of track and field athletes and is suitable for athletes of different track and field sports. It has high accuracy, fast evaluation efficiency, and good universality of performance evaluation. For different numbers of athletes, the proposed method has a lower error evaluation index, higher evaluation accuracy, and better evaluation quality. Compared with the other two methods, the proposed method has the shortest evaluation time and is more effective for the performance evaluation of track and field athletes.


Introduction
Athletes are an important reserve force for the development of national sports, and an accurate assessment of their performance makes it possible to develop more suitable training plans and improve their results [1]. In addition to a complete training system, an objective and fair evaluation system is particularly important for training track and field talent. An objective evaluation system for track and field performance aims to explore the potential of track and field athletes and helps the country select more suitable track and field talent [2]. An evaluation model reflects the objective training effects of track and field athletes, identifies their strengths and weaknesses, and thereby promotes track and field training reform and maximizes its effects and benefits. At the same time, the feedback from the model helps track and field athletes clearly recognize their training situation in future training and continuously adjust their training status to achieve the highest training efficiency. This can also serve as a guideline for track and field athletes' career planning.
Factors such as training intensity and the athletes' own physical quality directly affect the performance of track and field athletes. Accurately understanding how their performance changes can help ensure better results [3]. This makes it very important to assess track and field athletes' performance. The assessment of track and field athletes' performance is an important part of their training activities. This work plays a role in diagnosing, regulating, and strengthening the training process of track and field athletes, as well as in making value judgments about the effectiveness of their training [4]. The evaluation of the training effectiveness of track and field athletes should cover both the training effect and the training process. The evaluation should not only emphasize the function of screening and selection, but also strengthen the function of motivation and development [5]. How to evaluate the training performance of track and field athletes so as to cultivate their interest in training, stimulate their subjective initiative, and meet their psychological needs is directly related to the functional orientation of their training and the realization of training goals. This is a problem that urgently needs to be solved.
Track and field performance assessment is not only a test of sports training effects, but also a comprehensive judgment of athletes' sports ability. Whether the assessment is comprehensive, objective, and fair, and whether it truly reflects an athlete's actual level in sports [6], is often a concern of the athletes. Therefore, it is particularly important to construct a diversified performance assessment system. The diversification of the assessment system is reflected first in the diversification of the assessment content. Athletic performance assessment should not be limited to physical fitness and motor skills, but should also include training attitude, physical exercise, training participation, and competition results. The assessment should cover factors such as the cognition, emotion, cooperation, learning, and practice of the athletes [7]. The diversification of the assessment system is also reflected in the setting of dual subjects of assessment: the summative assessment made by the coach, relying on the assessment results of training programs, and the formative assessment made by the athlete, with training activities as the main content [8]. Constructing a diversified performance assessment system broadens the dimensions and connotations of track and field athletes' performance assessment. This is of practical significance for improving the fairness and comprehensiveness of the assessment, enhancing the athletes' participation and leading role in the process of performance formation, and helping them understand, discover, and transform themselves more objectively. The regression model is a predictive model that studies the relationship between dependent and independent variables; multiple regression models integrate the various possible influencing factors to assess athletes' performance and training effects [9].
The research methodology in this paper takes the factors affecting the training performance of track and field athletes as the object of study, uses them as predictors of the performance assessment, which serves as the target variable, and establishes a logistic regression model. The historical performance of track and field athletes was selected as the dataset. The assessment variables were competition ranking, competition time, age, gender, training duration, BMI, and blood pressure. First, factor analysis is carried out on the evaluation indicators to reduce the dimension of the data, eliminate the correlation between the variables, and determine the final indicators. Then, a logistic regression model is established based on the final indicators. Finally, the assessment effects of the models are compared. Compared with the other methods, the method in this paper achieves a high-quality assessment of track and field athletes' performance, which is very important for their training planning. An accurate assessment of track and field athletes' performance can help them understand themselves and their training plans, which in turn helps improve their performance. This paper has the following innovative points.
(1) The factors affecting track and field athletes' performance are numerous. In order to assess track and field athletes' performance effectively, this paper simplifies the data and influencing factors by the factor analysis method. Discarding secondary factors and selecting primary factors as evaluation variables allows the algorithm to run in a simpler and more efficient way.
(2) The common factors affecting the performance of track and field athletes were selected using principal component analysis, then classified and assigned different weight values according to their degree of influence, which can improve the accuracy of the evaluation.
This paper consists of five parts: the first is the introduction, the second is the state of the art, the third is the methodology, the fourth is the experiment and analysis, and the fifth is the conclusion.

Research Status.
At present, with the deepening of the concept of "Internet Plus," information technology has been widely used in sports training activities. A large number of scholars have conducted in-depth research on sports performance assessment and constructed many assessment models. Under the guiding principles of advancement and comprehensiveness, the literature [10] established indicators such as training hours to improve the quality of sports training and to further promote the internalization of athletes' knowledge. The literature [11] established an evaluation model from three aspects of the basic needs theory. The model used hierarchical analysis to analyse the indicator weights and found that the greatest weight was given to the autonomy needs and the least weight to the competence needs. In a summary of methodological studies on the quantification of performance evaluation in universities, hierarchical analysis was found to be the most representative method, but in real-world problems it is very difficult to test whether the judgment matrix is consistent, and the method struggles to reflect the fuzzy nature of human evaluation [12]. Therefore, the literature [13] addresses the shortcomings of expert scoring in the hierarchical analysis method and integrates the principles of fuzzy mathematics to establish a mathematical model that evaluates training quality more objectively. The literature [14] established an evaluation system from three aspects: training platform, coaches, and athletes. The method uses AHP to determine the index weights and introduces a fuzzy comprehensive evaluation model to handle the differences between the consistency of the judgment matrix and the consistency of human thinking. This provides a new perspective for athlete training quality assessment. With the continuous improvement of the fuzzy complementary judgment matrix theory, the literature [15] established athlete satisfaction indicators.
The theory indicates that the influence of personal factors on the index system is the highest and the influence of gymnasium factors is the lowest, which provides a more scientific and reasonable reference basis for athlete training strategies. The literature [16] investigated an athlete performance prediction method that integrates knowledge mapping and collaborative filtering, establishing a training knowledge map that depicts training information. The algorithm calculates the similarity of training at the knowledge level by a neighbour node-based method and a knowledge graph-based learning method, and integrates the obtained similarity into a collaborative filtering performance prediction framework to obtain athlete performance prediction results. The literature [17] investigates an athlete performance prediction model based on a two-way attention mechanism. The model obtains the attention scores of different attribute features on the first-stage and second-stage competition performance through two attention calculations, and combines them with a multi-feature fusion approach to obtain the competition performance prediction results. Historical data-driven prediction methods, as the name suggests, make predictions from historical data.
There are many historical data-driven prediction methods, such as hidden Markov models, chaotic prediction, and support vector machines [18]. Support vector machines have the advantages of learning from small samples and high learning ability in prediction, and hence they are also used in the historical data-driven athlete performance estimation method. This method uses the KNN algorithm to pre-process the historical performance of athletes, removing the effect of distracting data and classifying the data accurately. It uses support vector machines to build regression prediction models and introduces Lagrangian functions for data transformation to keep the computation from becoming trapped in local optima [19]. The parameters of the support vector regression prediction model are optimized using the particle swarm algorithm to reduce the interference of input noise and the complexity of the computation. These evaluation models are built for different application scenarios and are important for promoting the scientific training of athletes.

Factor Analysis Method.
Factor analysis is a technique to reduce the dimensionality of data and simplify it. It explores the underlying structure of the observed data by examining the internal dependencies among many variables and represents that structure with a few "abstract" variables. These abstract variables are called "factors." By discarding secondary factors and selecting primary factors as evaluation variables, the model is simplified and the efficiency of the algorithm is improved, while the retained factors still reflect the main information of the original set of variables. The original variables are observable explicit variables, while the factors are generally unobservable latent variables. The common factors in factor analysis are common influences that are not directly observable but exist objectively. Each variable can be expressed as the sum of a linear function of the common factors and a special factor [20]. The mathematical model can be expressed as

\[ I_x = g_{x1} F_1 + g_{x2} F_2 + \cdots + g_{xw} F_w + \varepsilon_x, \qquad x = 1, 2, \ldots, u, \]

or, in matrix form, $I = GF + \varepsilon$. Here $I = (I_1, \ldots, I_u)^{T}$ is the vector of original observed variables and $F = (F_1, \ldots, F_w)^{T}$ is the vector of $w$ common factor variables, with $w \le u$. The common factors appear in the expressions of every original observed variable and are mutually independent, unobservable theoretical variables. The matrix $G = (g_{xy})$ is called the factor loading matrix, and $g_{xy}$ is called the factor loading. It represents the correlation coefficient between the $x$-th original variable and the $y$-th common factor variable. A larger $|g_{xy}|$ indicates a stronger correlation between the common factor $F_y$ and the original variable $I_x$. $\varepsilon$ is the special factor. It represents the part of the original variables that cannot be explained by the common factor variables, and is equivalent to the residual part in multiple linear regression analysis.
Factor analysis utilizes the idea of dimensionality reduction. Starting from the dependencies within the correlation matrix of the original variables, it groups the original variables according to the magnitude of their correlations, so that the correlation between variables within the same group is high and the correlation between variables in different groups is low [21]. Each group of variables represents a basic structure and is represented by an unobservable composite variable. This underlying structure is called the common factor. Capturing these main factors can help us analyse and interpret complex problems.
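As a rough illustration of the factor model above, the following sketch estimates a loading matrix by the principal component method: the loadings are the leading eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues. The function name `principal_factor_loadings` and the synthetic data are assumptions made for this example only; they are not taken from the paper, and the sketch assumes NumPy is available.

```python
import numpy as np

def principal_factor_loadings(data, n_factors):
    """Estimate a loading matrix G by the principal component method."""
    # Correlation matrix of the observed variables (columns of `data`).
    corr = np.corrcoef(data, rowvar=False)
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loading g_xy = x-th entry of the y-th eigenvector times sqrt(eigenvalue y).
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    # Communality: share of each variable's variance explained by the factors.
    communalities = (loadings ** 2).sum(axis=1)
    return loadings, communalities, eigvals

# Synthetic data (illustrative only): 6 observed variables driven by
# 2 latent factors plus independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
observed = latent @ mixing + 0.3 * rng.normal(size=(500, 6))

G, h2, eigvals = principal_factor_loadings(observed, n_factors=2)
print("loading matrix G:\n", np.round(G, 3))
print("communalities:", np.round(h2, 3))
```

The part of each variable's variance not captured by the communality corresponds to the special factor $\varepsilon$ in the model.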

Regression Model Where the Dependent Variable Is a Qualitative Variable.
(1) Qualitative variables. The dependent variable takes only two outcomes: $j = 0$ means that the event did not occur, and $j = 1$ means that the event occurred. Consider the following simple linear regression model:

\[ j_x = \beta_0 + \beta_1 i_x + \varepsilon_x. \]

Since $j_x$ is a 0-1 Bernoulli random variable, the following probability expression is obtained:

\[ P(j_x = 1) = \pi_x, \qquad P(j_x = 0) = 1 - \pi_x. \]

According to the definition of the expectation of a discrete random variable, the following function is obtained:

\[ E(j_x) = 1 \cdot \pi_x + 0 \cdot (1 - \pi_x) = \pi_x = \beta_0 + \beta_1 i_x. \tag{4} \]

Computational Intelligence and Neuroscience

(2) Error term. For a dependent variable that takes only the values 0 or 1, the error term $\varepsilon_x = j_x - (\beta_0 + \beta_1 i_x)$ can take only two values:

\[ \varepsilon_x = \begin{cases} 1 - \beta_0 - \beta_1 i_x, & j_x = 1, \\ -\beta_0 - \beta_1 i_x, & j_x = 0. \end{cases} \]

The error term follows a two-point discrete distribution, and thus the normal error regression model cannot be assumed.

(3) Zero-mean heteroskedasticity. The error term has zero mean, but its variances are not equal:

\[ E(\varepsilon_x) = 0, \qquad \operatorname{Var}(\varepsilon_x) = \pi_x (1 - \pi_x) = (\beta_0 + \beta_1 i_x)(1 - \beta_0 - \beta_1 i_x). \]

If a multiple linear regression equation is used to analyse the quantitative relationship between the dependent variable and the independent variables, the relationship is expressed as follows:

\[ j = \beta_0 + \beta_1 i_1 + \beta_2 i_2 + \cdots + \beta_u i_u + \varepsilon. \]

The left side of the equation, $j$, takes only the values 0 or 1, while the right side can take any real number; the two sides do not match in their range of values. Therefore, multiple linear regression cannot be used to fit a qualitative dependent variable.
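The two problems described in this section can be checked numerically. The sketch below uses hypothetical coefficients ($\beta_0 = 0.1$, $\beta_1 = 0.2$, chosen only for illustration) to show that a linear model for a 0-1 variable produces a "probability" above 1 for large enough inputs, and that the Bernoulli error variance $\pi_x(1 - \pi_x)$ changes with $i_x$.

```python
def linear_probability(beta0, beta1, i_x):
    """Fitted value of the linear model, read as P(j_x = 1)."""
    return beta0 + beta1 * i_x

def bernoulli_error_variance(pi_x):
    """Variance of the error term when j_x is a 0-1 variable with mean pi_x."""
    return pi_x * (1.0 - pi_x)

# Hypothetical coefficients, not estimated from any data in the paper.
beta0, beta1 = 0.1, 0.2

for i_x in range(6):
    pi_x = linear_probability(beta0, beta1, i_x)
    if 0.0 <= pi_x <= 1.0:
        note = f"error variance = {bernoulli_error_variance(pi_x):.3f}"
    else:
        note = "outside [0, 1]: not a valid probability"
    print(f"i_x = {i_x}: fitted value = {pi_x:.2f} ({note})")
```

The variance differs from row to row (heteroskedasticity), and the last input yields a fitted value greater than 1, which is the range mismatch that motivates the logistic transformation.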

Logistic Regression Model.
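The logistic function has the form [22]

\[ f(i) = \frac{e^{i}}{1 + e^{i}} = \frac{1}{1 + e^{-i}}. \]

The range of values of its independent variable is $(-\infty, +\infty)$ and the range of values of the function is $(0, 1)$. The dependent variable $j$ itself takes only the two discrete values 0 and 1, so it is not suitable as the dependent variable in the regression model. Instead, let

\[ \pi_x = P(j_x = 1), \]

where $\pi_x$ is the probability that the random variable $j$ takes the value 1. Its value varies continuously in the interval $[0, 1]$; thus, $\pi_x$ can be used as the dependent variable instead of $j$.

Let $j$ be a 0-1 variable, and let the $t$ sets of observations be $(i_{x1}, \ldots, i_{xu}, j_x)$, $x = 1, 2, \ldots, t$, where $j_1, j_2, \ldots, j_t$ are random variables that take the value 0 or 1. The expression for the expected value is as follows:

\[ E(j_x) = \pi_x = f(\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu}). \]

The function for the logistic regression model [23] is as follows:

\[ \pi_x = \frac{\exp(\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu})}{1 + \exp(\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu})}. \]

Thus, $j_x$ is a 0-1 random variable with mean $\pi_x = f(\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu})$, and its probability function is

\[ P(j_x = 1) = \pi_x, \qquad P(j_x = 0) = 1 - \pi_x. \]

The probability function of $j_x$ can be written compactly as

\[ P(j_x) = \pi_x^{\,j_x} (1 - \pi_x)^{1 - j_x}, \qquad j_x = 0, 1. \]

The likelihood function of $j_1, j_2, \ldots, j_t$ is thus

\[ L = \prod_{x=1}^{t} \pi_x^{\,j_x} (1 - \pi_x)^{1 - j_x}. \]

Taking the logarithm of the likelihood function gives

\[ \ln L = \sum_{x=1}^{t} \left[ j_x \ln \pi_x + (1 - j_x) \ln (1 - \pi_x) \right] = \sum_{x=1}^{t} \left[ j_x \ln \frac{\pi_x}{1 - \pi_x} + \ln (1 - \pi_x) \right]. \]

Substituting the logistic regression expression for $\pi_x$ into this equation gives

\[ \ln L = \sum_{x=1}^{t} \left[ j_x (\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu}) - \ln \left( 1 + \exp(\beta_0 + \beta_1 i_{x1} + \cdots + \beta_u i_{xu}) \right) \right]. \tag{16} \]

Maximum likelihood estimation yields the estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_u$ of $\beta_0, \beta_1, \ldots, \beta_u$.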
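To make the maximum likelihood step concrete, the sketch below maximizes the log-likelihood of equation (16) by plain gradient ascent. This is only one possible optimizer and is an assumption of this example; the source does not state how the estimates were computed (statistical packages typically use Newton-type iterations). The toy data, learning rate, and iteration count are likewise illustrative.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function f(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_predictor(beta, i_x):
    """beta_0 + beta_1 * i_x1 + ... + beta_u * i_xu."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], i_x))

def log_likelihood(beta, rows, labels):
    """ln L = sum_x [ j_x * eta_x - ln(1 + exp(eta_x)) ], as in equation (16)."""
    total = 0.0
    for i_x, j_x in zip(rows, labels):
        eta = linear_predictor(beta, i_x)
        # Stable evaluation of ln(1 + exp(eta)).
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += j_x * eta - softplus
    return total

def fit_logistic(rows, labels, lr=0.1, n_iter=5000):
    """Gradient ascent on ln L; d(ln L)/d(beta_k) = sum_x (j_x - pi_x) * i_xk."""
    beta = [0.0] * (len(rows[0]) + 1)
    t = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for i_x, j_x in zip(rows, labels):
            resid = j_x - sigmoid(linear_predictor(beta, i_x))
            grad[0] += resid
            for k, v in enumerate(i_x):
                grad[k + 1] += resid * v
        beta = [b + lr * g / t for b, g in zip(beta, grad)]
    return beta

# Toy data (illustrative only): one predictor and overlapping classes,
# so a finite maximum likelihood estimate exists.
rows = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

beta_hat = fit_logistic(rows, labels)
print("estimated coefficients:", [round(b, 3) for b in beta_hat])
print("log-likelihood at estimate:", round(log_likelihood(beta_hat, rows, labels), 4))
```

Because the log-likelihood is concave in the coefficients, a sufficiently small step size increases it monotonically toward the maximum.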

Data Processing.
The data used in this paper come from the historical performance of athletes in a sports school. They mainly contain factors such as competition ranking, competition time, age, gender, training hours, and physical fitness across events at all levels. The dataset contains data related to 100 athletes. It consists of several predictor variables and one target variable for track and field sports. The predictor variables are the athletes' competition ranking, competition time, age, gender, training time, BMI, and blood pressure; the variable descriptions are shown in Table 1. The goal is to predict track and field athletes' performance from the measurements contained in the dataset. Data cleaning took the following points into account. The data come from a standard database, and hence there are no duplicate or irrelevant records, and a check found no missing data points. Since blood pressure, age, and body mass index cannot in general be 0, a value of 0 is an abnormal data point, and rows with a 0 in any of the blood pressure, age, or body mass index features were filtered out. After processing, 724 valid records remained.
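The zero-value filter described above can be sketched in a few lines. The record layout, field names (`age`, `bmi`, `blood_pressure`), and sample values below are hypothetical and serve only to show the filtering rule; they are not the paper's data.

```python
# Hypothetical records; the values are made up for illustration.
records = [
    {"age": 19, "bmi": 21.4, "blood_pressure": 72, "training_hours": 14},
    {"age": 0,  "bmi": 22.0, "blood_pressure": 70, "training_hours": 10},
    {"age": 21, "bmi": 0.0,  "blood_pressure": 68, "training_hours": 12},
    {"age": 20, "bmi": 23.1, "blood_pressure": 75, "training_hours": 0},
]

# Features in which a value of 0 is treated as an abnormal data point.
NONZERO_FEATURES = ("blood_pressure", "age", "bmi")

def drop_zero_rows(rows, features=NONZERO_FEATURES):
    """Keep only rows in which every listed feature is non-zero."""
    return [row for row in rows if all(row[name] != 0 for name in features)]

cleaned = drop_zero_rows(records)
print(f"{len(cleaned)} of {len(records)} records kept")
```

Note that a zero in a feature outside the listed three (here, `training_hours`) does not cause a row to be dropped, which matches the rule stated in the text.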

Factor Analysis
(1) Applicability Test of Factor Analysis. The results of the KMO and Bartlett's sphericity tests on the data of track and field athletes, computed using SPSS software, are shown in Table 2. It is generally considered that factor analysis can be performed if the KMO measure is greater than 0.5. The significance value of p = 0 for Bartlett's test indicates that the original variables are correlated, so the conditions for factor analysis are met.
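For readers without SPSS, both statistics can be computed directly from the correlation matrix. The sketch below uses the standard definitions: the KMO measure compares squared correlations with squared partial correlations, and Bartlett's statistic is $-(n - 1 - (2p + 5)/6)\ln|R|$ with $p(p-1)/2$ degrees of freedom. The function names and the synthetic data are assumptions of this example, and the p-value (which would need a chi-square distribution function) is left out.

```python
import numpy as np

def kmo_measure(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diag] ** 2).sum()
    p2 = (partial[off_diag] ** 2).sum()
    return r2 / (r2 + p2)

def bartlett_sphericity(corr, n_samples):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    p = corr.shape[0]
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n_samples - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

# Synthetic correlated data (illustrative only): 6 variables, 300 samples.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
data = latent @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(300, 6))
corr = np.corrcoef(data, rowvar=False)

kmo = kmo_measure(corr)
chi2, dof = bartlett_sphericity(corr, n_samples=data.shape[0])
print(f"KMO = {kmo:.3f}, Bartlett chi-square = {chi2:.1f} with {dof} degrees of freedom")
```

A large chi-square value relative to its degrees of freedom corresponds to a small p-value, meaning the correlation matrix differs clearly from the identity matrix.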
(2) Extraction of Common Factors. Factor analysis was performed on the data, with the factors extracted by principal component analysis. Classifying the factors according to their degree of influence and assigning them different weight values improves the assessment accuracy. Under the criterion of retaining eigenvalues greater than 1, three principal factors were retained, i.e., the seven variables were grouped into three categories. This reduces the amount of computation, but the grouping causes information loss: only 64.49% of the information is retained, which is a large loss. A fourth common factor was therefore added to bring the information loss within an acceptable range. The variance explained table (Table 3) shows how much of the total variance of the original variables each principal component contains; after the improvement, the retained information is 77.08%.
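The retention decision in this step can be sketched as follows. The eigenvalues below are made up for illustration (they are not the values from Table 3), and the 75% retention target is an assumed threshold, since the source does not state what loss it considers acceptable.

```python
def kaiser_count(eigenvalues):
    """Number of factors with eigenvalue greater than 1."""
    return sum(1 for value in eigenvalues if value > 1.0)

def cumulative_variance(eigenvalues):
    """Cumulative share of total variance explained by the leading factors."""
    total = sum(eigenvalues)
    shares, running = [], 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value
        shares.append(running / total)
    return shares

def factors_needed(eigenvalues, target):
    """Smallest number of factors whose cumulative share reaches the target."""
    for count, share in enumerate(cumulative_variance(eigenvalues), start=1):
        if share >= target:
            return count
    return len(eigenvalues)

# Illustrative eigenvalues for seven standardized variables (they sum to 7).
eigenvalues = [2.1, 1.4, 1.1, 0.9, 0.7, 0.5, 0.3]

n_kaiser = kaiser_count(eigenvalues)
n_target = factors_needed(eigenvalues, target=0.75)
shares = cumulative_variance(eigenvalues)
print(f"eigenvalue > 1 rule keeps {n_kaiser} factors ({shares[n_kaiser - 1]:.1%} retained)")
print(f"a 75% target needs {n_target} factors ({shares[n_target - 1]:.1%} retained)")
```

With these illustrative numbers the eigenvalue rule alone keeps three factors, and adding one more factor lifts the retained share above the target, which mirrors the reasoning in the text.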
(3) Common Factor Naming. The four common factors were extracted, and the original factor loading matrix was rotated by varimax (maximum-variance orthogonal) rotation to obtain the rotated component matrix shown in Table 4.
Based on the rotated component matrix, the four common factors can be named. The first factor Z1 has large loadings on the gender and age indicators. The second factor Z2 has large loadings on the race ranking and race time indicators. The third factor Z3 has large loadings on BMI and training duration. The fourth factor Z4 has a large loading on blood pressure. The evaluation indexes corresponding to Z1 are indirect influence data, those corresponding to Z2 are race performance-related data, those corresponding to Z3 are other physical data, and Z4 represents blood pressure. The four factors are named indirect factors, competition performance, physical quality, and blood pressure, respectively.
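The rotation step can be sketched with the widely used SVD-based iteration for the varimax criterion. This is a generic implementation assuming NumPy, not the routine SPSS uses internally, and the unrotated loading matrix below is hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix toward the varimax criterion."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient-like target matrix of the varimax criterion.
        col_ss = (rotated ** 2).sum(axis=0)
        target = rotated ** 3 - rotated @ np.diag(col_ss) / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # product of orthogonal matrices stays orthogonal
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings for six variables on two factors.
unrotated = np.array([
    [0.70, 0.45],
    [0.65, 0.50],
    [0.72, 0.40],
    [0.60, -0.55],
    [0.58, -0.60],
    [0.66, -0.48],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality is unchanged; only the distribution of the loadings across factors changes, which is what makes the factors easier to name.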

Binary Logistic Regression
(1) Hosmer-Lemeshow test. The null hypothesis H0 is that the model fits the observations well. The results are shown in Table 5: p = 0.279 > 0.05, so the null hypothesis is not rejected and the regression model fits the data well.
(2) The fitted regression equation is

\[ \operatorname{Logit} U = -0.82 + 0.65 k_1 + 0.86 k_2 + 0.67 k_3 + 0.38 k_4. \tag{17} \]

(3) As shown in Table 7, the p-values are all reported as 0, indicating that BMI, age, gender, and training duration have highly significant effects on the performance of track and field athletes. Ranked from highest to lowest, the effects are BMI > age > gender > training duration.
(4) The accuracy is shown in Table 6: the model attains an accuracy of 74.9%, which indicates that its predictions are reasonably accurate.
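To show how a fitted equation such as (17) is used, the sketch below converts the logit into a probability with the logistic function and scores predictions against actual labels. Several things here are assumptions: that $k_1, \ldots, k_4$ are the scores on the four common factors, that a probability of 0.5 is the classification cutoff, and the sample labels, which are invented for illustration. Only the five coefficients come from equation (17).

```python
import math

# Coefficients of equation (17): intercept, then k_1 ... k_4.
COEFFICIENTS = (-0.82, 0.65, 0.86, 0.67, 0.38)

def logit_u(k):
    """Linear predictor of equation (17) for four scores k = (k1, k2, k3, k4)."""
    intercept, *slopes = COEFFICIENTS
    return intercept + sum(b * value for b, value in zip(slopes, k))

def probability(k):
    """Logistic transform of the logit: P = 1 / (1 + exp(-Logit U))."""
    return 1.0 / (1.0 + math.exp(-logit_u(k)))

def classify(k, cutoff=0.5):
    """Predict 1 when the probability reaches the (assumed) cutoff."""
    return 1 if probability(k) >= cutoff else 0

def accuracy(predicted, actual):
    """Share of predictions that match the actual labels."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Hypothetical factor scores and labels, for illustration only.
scores = [(0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0), (-1.0, 0.5, 0.2, 0.0), (0.8, 1.2, -0.3, 0.4)]
actual = [0, 1, 1, 1]

predicted = [classify(k) for k in scores]
for k, p in zip(scores, predicted):
    print(f"k = {k}: P = {probability(k):.3f}, predicted class = {p}")
print(f"accuracy on these samples: {accuracy(predicted, actual):.2f}")
```

All four coefficients on $k_1, \ldots, k_4$ are positive, so a higher score on any factor raises the predicted probability.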

Result Analysis and Discussion
Using track and field athletes of a sports school as the experimental subjects, 10 groups of 200 m sprinters were randomly selected as the research subjects. The evaluation results are shown in Figure 1. According to Figure 1, the method of this paper can effectively evaluate the performance of 200 m sprinters, and the estimated value is very close to the actual value. The experiment shows that the method in this paper can estimate the performance of track and field athletes accurately. Athletes from 10 types of track and field sports were then randomly selected in this sports school to verify the generality of the method. Using the method of this paper, the athletes' performance in these 10 types of track and field sports was evaluated and compared with the actual values; the evaluation accuracy for the 10 types of track and field sports is shown in Figure 2. According to Figure 2, the method in this paper can accurately assess the performance of track and field athletes for different types of track and field sports, and the estimation accuracy generally stays above 96%. The experiment shows that the method in this paper has good generality and a high estimation accuracy for different types of track and field sports. The method of this paper was then compared with the methods in the literature [16] and the literature [17] by applying all three to the assessment of athletes' performance in the above-mentioned 10 types of track and field sports. The evaluation accuracy and evaluation efficiency of the three methods were compared; the literature [16] is a student performance prediction method integrating knowledge mapping and collaborative filtering, and the literature [17] is a student performance prediction model based on a two-way attention mechanism.
The performance of 100 athletes in each type of track and field sport was selected for testing, and the average value was taken to enhance the credibility of the experiment. The accuracy and assessment efficiency of the three methods for assessing athletes' performance in the 10 types of track and field sports are shown in Figures 3 and 4. According to Figure 3, for athletes of different types of track and field sports, the assessment accuracy of this paper's method is significantly higher than that of the other two methods: the average assessment accuracy is 97.7% for this paper's method, 81.8% for the literature [16], and 86.5% for the literature [17]. The experiment shows that, when assessing the performance of athletes in different types of track and field sports, the method in this paper has the highest assessment accuracy, which significantly reduces the estimation error of athletes' performance and increases the credibility of the assessment results.
According to Figure 4, for athletes of different types of track and field sports, the evaluation time of this paper's method is significantly lower than that of the other two methods. It always stays within 20 s with little variation, while the estimation time of the other two methods is more variable and less stable. This is because the algorithm in this paper introduces factor analysis to optimize the parameters affecting the evaluation, which reduces the number of parameters and the computational effort. The experiment shows that this paper's method has the shortest evaluation time and evaluates athletes' performance more efficiently.

Taking the performance of 200 m sprinters as an example, the accuracy of the three methods was tested with different numbers of athletes. The accuracy was evaluated by the Mean Absolute Percentage Error (MAPE), an error evaluation index. The results of the error evaluation index tests for the three methods with different numbers of athletes are shown in Figure 5. According to Figure 5, the MAPE values of all three methods increased as the number of athletes increased. Generally, a MAPE value lower than 10 indicates that an evaluation method has high accuracy. With different numbers of athletes, the MAPE values of this paper's method are significantly lower than those of the other two methods. The MAPE value of this method always stayed within 6, whereas the MAPE values of the other two methods were lower than 10 only when the number of athletes was less than 200. When the number of athletes was more than 200, the MAPE values of the other two methods were greater than 10. The experiment shows that this paper's method has the lowest MAPE value for every number of athletes tested. This indicates that the evaluation value of this paper's method is closest to the actual value, with higher evaluation accuracy and high evaluation quality.
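For reference, MAPE is the mean of the absolute relative errors, expressed as a percentage. The sketch below computes it for a handful of hypothetical 200 m times; the numbers are invented for illustration and are not taken from Figure 5.

```python
def mape(actual, estimated):
    """Mean Absolute Percentage Error, in percent."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - e) / a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)

# Hypothetical 200 m times in seconds (actual vs. estimated).
actual_times = [22.0, 23.5, 25.0, 24.0]
estimated_times = [22.44, 23.03, 25.5, 24.0]

error = mape(actual_times, estimated_times)
print(f"MAPE = {error:.2f}")
```

On this scale, the threshold of 10 mentioned in the text corresponds to an average relative error of 10%.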

Conclusion
Based on the model obtained in this paper, the performance of track and field athletes can be effectively evaluated. The main objective of this paper is to evaluate the performance of track and field athletes using a logistic regression model. The method adopts the idea of factor analysis, reduces and simplifies the data, and improves the evaluation effect. The experiments show that the method can estimate the performance of track and field athletes with high accuracy. At the same time, the method has good generality across track and field events, a shorter evaluation time, and higher evaluation efficiency. Overall, the method in this paper can achieve high-quality athlete performance assessment, which is very important for athletes' training planning. An accurate estimation of athletes' performance can help them understand the training plans they need and thereby improve their performance. Therefore, we study the historical data-driven athlete performance estimation method to improve the accuracy and efficiency of athlete performance estimation, provide more valuable information for athlete training planning, and develop better athletes for the country. However, using this method well requires a large amount of historical competition performance data for track and field athletes, and its effectiveness in evaluating the performance of individual track and field athletes is yet to be verified. The experimental data all concern track and field athletes, so the generality of the prediction for sports other than track and field is also yet to be verified. The next step will be to explore the effectiveness of the model in evaluating the performance of athletes in a wider range of sports, to verify its generality, and to expand the scope of its application.

Data Availability
The labeled dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
The author declares no competing interests.