This paper presents the use of linear and nonlinear multivariable models as tools to support training process of race walkers. These models are calculated using data collected from race walkers’ training events and they are used to predict the result over a 3 km race based on training loads. The material consists of 122 training plans for 21 athletes. In order to choose the best model leave-one-out cross-validation method is used. The main contribution of the paper is to propose the nonlinear modifications for linear models in order to achieve smaller prediction error. It is shown that the best model is a modified LASSO regression with quadratic terms in the nonlinear part. This model has the smallest prediction error and simplified structure by eliminating some of the predictors.
1. Introduction
The level of today’s high-performance sport is very high and very even. Both coaches and competitors are forced to search for and use newer and sometimes innovative solutions in the process of sports training [1]. A solution supporting this process may be the application of various types of regression models.
Prediction in sport concerns many aspects including the prediction of performance results [2, 3] or predicting sporting talent [4, 5]. Models predicting results in sport, taking into account the seasonal statistics of each team, were also constructed [6]. The application of predictive models in athletics was described by Maszczyk et al. [2], where the regression was used to predict results in a javelin throw. These models were applied to support the choice and selection of prospective javelin throwers.
Prediction of sports results using linear regression was also presented in the work by Przednowek and Wiktorowicz [7]. A linear predictive model, implemented by ridge regression, was applied to predict the outcomes of a walking race after the immediate preparation phase. As input for the model, the basic somatic features (height and weight) and training loads (training components) for each day of training were provided, and the output was the result expected over a distance of 5 km. In addition to linear models, artificial neural networks, whose parameters were specified in cross-validation, were also used to implement this task.
In the paper by Drake and James [8], the regressions estimating the results over distances of 5, 10, 20, and 50 km and the levels of the selected physiological parameters (e.g., VO2max) were presented. The regressions applied were the classical linear models, and the R2 criterion was chosen for the quality evaluation. This study included 23 women and 45 men. The amount of collected data was different depending on the task and ranged from 21 to 68 records.
A nonlinear regression equation to predict the maximum aerobic capacity of footballers was proposed by Chatterjee et al. [9]. The data came from 35 young players aged from 14 to 16. The experiment was to verify the use of the 20 m MST (Multistage Shuttle Run Test) in evaluating the performance of VO2max. The talent of young hockey players was identified by Roczniok et al. [5] using a regression equation. The research involved 60 boys aged between 15 and 16, who attended selection camps. The applied regression model classified candidates for future training, based on selected parameters of the players. The logistic regression was used in the model as the classification method.
The nonlinear predictive models used in sport are also based on the selected methods of “data mining” [10]. Among them, an important role is played by fuzzy logic expert systems. Papić et al. [4] described practical application of such a system. The proposed system was based on knowledge of experts in the field of sport, as well as the data obtained as a result of motor tests. The model suggested the most suitable sport and it was designed to search for prospective sports talents.
The application of fuzzy modeling techniques in sports prediction was also presented by Mężyk and Unold [11]. The goal of their paper was to find the rules that can express swimmer’s feelings the day after in-water training. The data was collected for two months among competitors practicing swimming. The swimmers were characterized by a good level of sports attainment (2nd sport class). The material obtained consisted of 12 attributes, and the total number of models was 480, out of which 136 were used in the final stage. The authors proved that their method was characterized by better predictive ability than the traditional methods of classification.
Other papers also concern the use of artificial neural networks in sports prediction [6]. Neural models are used to analyze the effectiveness of the training of swimmers, to identify handball players’ tactics, or to predict sporting talent [12]. Many studies present the application of neural networks in various aspects of sports training [13–15]. These models support the planning of training loads, practice control, or the selection of sports.
An approach developed by the authors is the construction of models performing the task of predicting the results achieved by a competitor in the proposed sports training. This allows for the proper selection of training components and thus supports the achievement of the desired result. The aim of this study is to determine the effectiveness of selected linear and nonlinear models in predicting the outcome in a 3-kilometer walking race for the proposed training. The research hypothesis of the paper is stated as follows: the prediction error of 3 kilometers’ result in race walking for nonlinear models can be smaller than for linear models.
The paper is organized as follows. In Section 2, the training data of the race walkers recorded during annual training cycle is described. Section 3 contains the methods used to build the linear and nonlinear predictive models, including ordinary least squares regression, regularized methods, that is, ridge, LASSO, and elastic net regressions, nonlinear least squares regression, and artificial neural networks as multilayer perceptron and radial basis function network. In Section 3, the criterion used to evaluate the performance of the models, calculated using mean square error in the process of cross-validation, is also defined. Section 4 describes the procedures used for building models and their evaluation in R language and STATISTICA software. The obtained results are analyzed and discussed in Section 5. Finally, in Section 6, the performed work is concluded.
2. Material
The predictive models were built using the training data of athletes practising race walking. The analysis involved a group of colts and juniors from Poland. Among the competitors were the finalists in the Polish Junior Indoor Championships and the Polish Junior Championships. The data of race walkers was recorded during the 2011-2012 season in the form of training means and training loads. The training mean is the type of work performed while the training load is the amount of work at a particular intensity done by an athlete during exercise [1]. In the material, which has been collected, 11 means of training were distinguished. The material was drawn from the annual training cycle for the following four phases: transition, general preparation, special preparation, and starting phase. The training data has the form of sums of training loads completed in one month of the chosen training phase. The material included 122 training patterns made by 21 race walkers.
Control of the training process in race walking requires different tests of physical fitness at every training level. Because this research concerns the competitors in colt and junior categories, thus in order to determine a unified criterion of the level of training, a result for 3000 m race walking was used. The choice of the distance of 3000 m is valid because this is the indoor walking competition.
The description of the variables under consideration and their basic statistics are presented in Table 1. The variables are as follows: arithmetic mean of x-, minimum value xmin, maximum value xmax, standard deviation SD, and coefficient of variation V=SD/x-·100%. The qualitative variables are X1, X2, X3, X4, which take their values from the set {0,1}. The other variables, that is, X5,…,X18, are quantitative variables. If the value at inputs X1, X2, X3 is 0, it means that the transitional period is considered. Setting the value 1 on one of the inputs X1, X2, X3, it means the training period is selected. The variable X4 represents the gender of the competitor, where the value 0 denotes a female, while the value 1 denotes a male, and the age is represented by X5. Basic somatic features of race walkers such as weight and height are presented in the form of BMI (X6) expressed by the formula(1)BMI=MH2kg/m2,where M is the body weight [kg] and H is the body height [m]. The variable X7 denotes the current result over 3 km in seconds. Training loads are characterized by the following variables: running exercises (X8), walking with different levels of intensity (X9,X10,X11), exercises forming different types of endurance (X12,X13,X14), exercises forming techniques (X15), exercises forming muscle strength (X16), exercises forming general fitness (X17), and warming up exercises (X18).
The variables and their basic statistics.
Variable
Description
x-
xmin
xmax
SD
V [%]
Y
Result over 3 km [s]
936.9
780
1155
78.4
8.4
X1
General preparation phase
—
—
—
—
—
X2
Special preparation phase
—
—
—
—
—
X3
Starting phase
—
—
—
—
—
X4
Competitor’s gender
—
—
—
—
—
X5
Competitor’s age [years]
18.9
14
24
3.0
15.6
X6
BMI (body mass index) [kg/m^{2}]
19.3
16.4
22.1
1.7
8.7
X7
Current result over 3 km [s]
962.6
795
1210
87.7
9.1
X8
Overall running endurance [km]
30.9
0
56
10.6
34.4
X9
Overall walking endurance in the 1st intensity range [km]
224.6
57
440
96.1
42.8
X10
Overall walking endurance in the 2nd intensity range [km]
53.2
0
120
34.6
65.1
X11
Overall walking endurance in the 3rd intensity range [km]
7.9
0
30
9.4
119.7
X12
Short tempo endurance [km]
8.9
0
24
5
56.0
X13
Medium tempo endurance [km]
8.3
0
32.4
8.6
103.2
X14
Long tempo endurance [km]
12.9
0
56
16.1
125.0
X15
Exercises forming technique (rhythm) of walking [km]
4.4
0
12
4.2
96.0
X16
Exercises forming muscle strength [min]
90.2
0
360
104.8
116.3
X17
Exercises forming general fitness [min]
522.0
120
720
109.9
21.0
X18
Universal exercises (warm up) [min]
317.3
150
420
72.5
22.8
An example of data used for building the model has the form(2)x5=0,1,0,0,23,22.09,800,32,400,112,20,16,32.4,48,8,280,640,400,y5=800.The vector x5 represents a 23-year-old race walker with BMI=22.09 kg/m^{2}, who completes training in the special preparation phase. The result both before and after the training was the same and is equal to 800 s.
3. Methods
In this study, two approaches were considered. The first approach was based on white box models realized by modern regularized methods. These models are interpretable because their structure and parameters are known. The second approach was based on black box models realized by artificial neural networks.
3.1. Constructing Regression Models
Consider a multivariable regression model with the inputs (predictors or regressors) Xj, j=1,…,p, and one output (response) Y shown in Figure 1. We assume that the model is linear and has the form (3)Y^=w0+X1w1+⋯+Xpwp=w0+∑j=1pXjwj,where Y^ is the estimated response and w0, wj are unknown weights of the model. The weight w0 is called constant term or intercept. Furthermore, we assume that the data is standardized and centered and the model can be simplified to the form (see, e.g., [16])(4)Y^=X1w1+⋯+Xpwp=∑j=1pXjwj.
A diagram of a system with multiple inputs and one output.
Observations are written as pairs (xi,yi), where xi=[xi1,…,xip], i=1,…,n, xij is the value of the jth predictor in the ith observation, and yi is the value of the response in the ith observation. Based on formula (4), the ith observation can be expressed as(5)y^i=xi1w1+⋯+xipwp=∑j=1pxijwj=xiw,where w=[w1,…,wp]T. Introducing matrix X in the form of(6)X=x11x12⋯x1px21x22⋯x2p⋮⋮⋱⋮xn1xn2⋯xnpformula (5) can be written as(7)y^=Xw,where y^=[y^1,…,y^n]T.
In order to construct regression models, an error (residual) is introduced as the difference between the real value yi and the estimated value y^i in the form of(8)ei=yi-y^i=yi-∑j=1pxijwj=yi-xiw.Using matrix (6), the error can be written as(9)e=y-y^=y-Xw,where e=[e1,…,en]T and y=[y1,…,yn]T.
Denoting by J(w,·) the cost function, the problem of finding the optimal estimator can be formulated as to minimize the function J(w,·), which means solving the problem(10)w^=argminwJw,·,where w^ is the vector of solutions.
Depending on the function J(w,·), different regression models can be obtained. In this paper, the following models are considered: ordinary least squares regression (OLS), ridge regression, LASSO (least absolute shrinkage and selection operator), elastic net regression (ENET), and nonlinear least squares regression (NLS).
3.2. Linear Regressions
In OLS regression (see, e.g., [16–18]) the model is calculated by minimizing the sum of squared errors:(11)Jw=eTe=y-XwTy-Xw=y-Xw22,where ·2 denotes the Euclidean norm (L2). Minimizing the cost function (11), which is the quadratic function of w, we get the following solution:(12)w^=XTX-1XTy.
It should be noted that solution (12) does not exist if the matrix XTX is singular (due to correlated predictors or if p>n). In this case, different methods of regularization, including the previously mentioned ridge, LASSO, and elastic net regressions, can be used.
In ridge regression by Hoerl and Kennard [19], the cost function includes a penalty and has the form(13)Jw,λ=eTe+λwTw=y-XwTy-Xw+λwTw=y-Xw22+λw22.The parameter λ≥0 determines the size of the penalty: for λ>0, the model is penalized, for λ=0, ridge regression reduces to OLS regression. Solving problem (10) for ridge regression, we get(14)w^=XTX+λI-1XTy,where I is the identity matrix with the size of p×p. Because the diagonal of the matrix XTX is increased by a positive constant, the matrix XTX+λI is invertible and the problem becomes nonsingular.
In LASSO regression by Tibshirani [20], similarly to ridge regression, the penalty is added to the cost function, where the L1-norm (the sum of absolute values) is used:(15)Jw,λ=eTe+λzTw=y-XwTy-Xw+λzTw=y-Xw22+λw1,where z=[z1,…,zp]T, zj=sgn(wj), and ·1 denotes the Manhattan norm (L1). Because problem (10) is not linear in relation to y (due to the use of L1-norm), the solution cannot be obtained in the compact form as in ridge regression. The most popular algorithm used in this case is the LARS algorithm (least angle regression) by Efron et al. [21].
In elastic net regression by Zou and Hastie [22], the features of ridge and LASSO regressions are combined. The cost function in the so-called naive elastic net has the form of(16)Jw,λ1,λ2=eTe+λ1zTw+λ2wTw=y-XwTy-Xw+λ1zTw+λ2wTw=y-Xw22+λ1w1+λ2w22.To solve the problem, Zou and Hastie [22] proposed the LARS-EN algorithm, which was based on the LARS algorithm developed for LASSO regression. They used the fact that elastic net regression reduces to LASSO regression for the augmented data set (X∗,y∗).
3.3. Nonlinear Regressions
To take into account the nonlinearity in the models, we can apply the transformation of predictors or use nonlinear regression. In this paper, the latter solution is applied.
In OLS regression, the model is described by formula (5), while in more general nonlinear regression the relationship between the output and the predictors is expressed by a certain nonlinear function f(·) in the form of(17)y^i=fxi,w.In this case, the cost function J(w) is formulated as(18)Jw=∑i=1nei2=∑i=1nyi-y^i2=∑i=1nyi-fxi,w2.Since the minimization of function (18) is associated with solving nonlinear equations, numerical optimization is used in this case. The main problem connected with the construction of nonlinear models is the choice of the appropriate function f(·).
3.4. Artificial Neural Networks
Artificial neural networks (ANNs) were also used for building predictive models. Two types of ANNs were implemented: a multilayer perceptron (MLP) and networks with radial basis function (RBF) [18].
The MLP network is the most common type of neural models. The calculation of the output in 3-layer multiple-input-one-output network is performed in feed-forward architecture. In the first step, m linear combinations, or the so-called activations, of the input variables are constructed as(19)ak=∑j=1pxjwkj1,where k=1,…,m and wkj(1) denotes the weights for the first layer. From the activations ak, using a nonlinear activation function h(·), hidden variables zk are calculated as(20)zk=hak.The function h(·) is usually chosen as logistic or “tanh” function. The hidden variables are used next to calculate the output activation(21)a=∑k=1mzkwk2,where wk(2) are weights for the second layer. Finally, the output of the network is calculated using an activation function σ(·) in the form of(22)y=σa.
For regression problems, the function σ(·) is chosen as identity function, and so we obtain y=a. The MLP network utilizes iterative supervised learning known as error backpropagation for training the weights. This method is based on gradient descent applied to the sum of squares function. To avoid the problem with overtraining the network, the number m of hidden neurons, which is a free parameter, should be determined to give the best predictive performance.
In the RBF network, the concept of radial basis function is used. Linear regression (5) is extended by linear combinations of nonlinear functions of the inputs in the form of(23)y^i=∑j=1pϕjxijwj=φxiw,where φ=[ϕ1,…,ϕp]T is a vector of basis functions. Using nonlinear basis functions, we get a nonlinear model, which is, however, a linear function of parameters wj. In the RBF network, the hidden neurons perform a radial basis function whose value depends on the distance from selected center cj:(24)φxi,c=φxi-c,where c=[c1,…,cp] and · is usually the Euclidean norm. There are many possible choices for the basis functions, but the most popular is Gaussian function. It is known that RBF network can exactly interpolate any continuous function; that is, the function passes exactly through every data point. In this case, the number of hidden neurons is equal to the number of observations and the values of coefficients wj are found by simple standard inversion technique. Such a network matches the data exactly, but it has poor predictive ability because the network is overtrained.
3.5. Choosing the Model
In this paper, the best predictive model is chosen using leave-one-out cross-validation (LOOCV) method [23], in which the number of tests is equal to the number of data n and one pair (xi,yi) creates a testing set. The quality of the model is evaluated by means of the square root of the mean square error (RMSECV) defined as(25)MSECV=1n∑i=1nyi-y^-i2,RMSECV=MSECV,where y^-i denotes the output of the model built in the ith step of validation process using a data set containing no testing pair (xi,yi) and MSECV is the mean square error.
In order to describe the measure to which the model fits the training data, the root mean square error of training (RMSET) is considered. This error is defined as(26)MSET=1n∑i=1nyi-y^i2,RMSET=MSET,where y^i denotes the output of the model built in the ith step using the full data set and MSET is the mean square error of training.
4. Implementation of the Predictive Models
All the regression models were calculated using R language with additional packages [24].
The lm.ridge function from “MASS” package [25] was used for calculating OLS regression (where λ=0) and ridge regression (where λ>0). With the function enet included in the package “elastic net” [26], LASSO regression and elastic net regression were calculated. The parameters of the enet function are s∈[0,1] and λ≥0, where s is a fraction of the L1 norm, whereas λ denotes λ2 in formula (16). The parameterization of elastic net regression using the pair (λ,s) instead of (λ1,λ2) in formula (16) is possible because elastic net regression can be treated as LASSO regression for an augmented data set (X∗,y∗) [22]. Assuming that λ=0, we get LASSO regression with one parameter s for the original data (X,y).
All the nonlinear regression models were calculated using the nls function coming from the “stats” package [27]. It calculates the parameters of the model using the nonlinear least squares method. One of the parameters of the nls function is a formula that specifies the function f(·) in model (18). To calculate the weights, Gauss-Newton’s algorithm was used which was selected by default in the nls function. In all the calculations, it was assumed that the initial values of the weights are equal to zero.
For the implementation of artificial neural networks, StatSoft STATISTICA [28] software was used. The learning of MLP networks was implemented using the BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm [18]. While calculating the RBF network, the parameters of the basis functions were automatically set by the learning procedure.
The parameters in all models were selected using leave-one-out cross-validation. In the case of regularized regressions, the penalty coefficients were calculated, while, in the case of neural networks, the number of neurons in the hidden layer was calculated. The primary performance criterion of the model was RMSECV error. Cross-validation functions in the STATISTICA program were implemented using Visual Basic language.
5. Results and Discussion
From a coach’s point of view, the prediction of results is very important in the process of sport training. A coach using the model, which was constructed earlier, can predict how the training loads will influence the sport outcome. The presented models can be used for predictions based on the proposed monthly training introduced as the sum of the training loads of each type implemented in a given month.
The results of the research are presented in Table 2; the description of the selected regressions will be presented in the next paragraphs. Linear models such as OLS, ridge, and LASSO regressions have been calculated by the authors in work [3]. They will be briefly described here. The nonlinear models implemented using nonlinear regression and artificial neural networks will be discussed in greater detail. All the methods will be compared taking into account the accuracy of the prediction.
Coefficients of linear models and linear part of nonlinear model NLS1 and error results.
Regression
OLS
RIDGE
LASSO, ENET
NLS1
w0
237.2
325.7
296.6
2005
w1
45.67
34.67
32.75
41.24
w2
90.61
74.84
71.91
77.12
w3
39.70
27.49
24.45
−3.439
w4
−2.838
2.424
15.45
w5
−0.9755
−1.770
−1.416
−22.44
w6
1.072
0.5391
−24.71
w7
0.7331
0.6805
0.7069
−1.782
w8
−0.2779
−0.3589
−0.3410
−1.500
w9
−0.1428
−0.1420
−0.1364
−0.0966
w10
−0.1579
−0.0948
−0.0200
0.7417
w11
0.7472
0.4352
0.0618
0.6933
w12
0.4845
0.3852
0.1793
−0.6726
w13
0.1216
0.1454
0.1183
−0.0936
w14
−0.1510
−0.0270
2.231
w15
−0.5125
−0.3070
0.7349
w16
−0.0601
−0.0571
−0.0652
−0.2685
w17
−0.0153
−0.0071
0.0358
w18
−0.0115
−0.0403
−0.0220
−0.0662
RMSECV [s]
26.90
26.76
26.20
28.83
RMSET [s]
22.70
22.82
22.89
20.21
5.1. Linear Regressions
The regression model calculated by the OLS method generated the prediction error RMSECV=26.90 s and the training error RMSET=22.70 s (Table 2). In the second column of Table 2, the weights w0 and wj are presented.
The search for the ridge regression model is based on finding the parameter λ, for which the model achieves the smallest prediction error. In this paper, ridge regression models for λ changing from 0 to 2 with step of 0.1 were analyzed. Based on the results, it was found that the best ridge model is achieved for λopt=1. The prediction error RMSECV=26.76 s was smaller than in the OLS model, while the training error RMSET=22.82 s was greater (Table 2). The obtained ridge regression improved the predictive ability of the model. It is seen from Table 2 that as in the case of OLS regression, all weights are nonzero and all the input variables are used in computing the output of the model.
The LASSO regression model was calculated using the LARS-EN algorithm, in which the penalty is associated with the parameter s changing from 0 to 1 with step of 0.01. It was found that the optimal LASSO regression is calculated for sopt=0.78. The best LASSO model generates the error RMSECV=26.20 s, which improves the results of OLS and ridge models. However, it should be noted that this model is characterized by the worst data fit with the greatest training error RMSET=22.89 s. The LASSO method is also used for calculating an optimal set of input variables. It can be seen in the fourth column of Table 2 that the LASSO regression eliminated the five input variables (X4, X6, X14, X15, and X17), which made the model simpler than for OLS and ridge regression.
The use of elastic net regression model has not improved the value of the prediction error. The best model was obtained for a pair of parameters sopt=0.78 and λopt=0. Because the parameter λ is zero, the model is identical to the LASSO regression (fourth column of Table 2).
5.2. Nonlinear Regressions
Nonlinear regression models were obtained using various functions f(·) in formula (18). It was assumed that the function f(·) consists of two components: the linear part, in which the weights are calculated as in OLS regression, and the nonlinear part containing expressions of higher orders in the form of a quadratic function of selected predictors:(27)fxi,w,v=∑j=1pxijwj+∑j=1pxij2vj,where w=[w1,…,wp]T is the vector of the weights of the linear part and v=[v1,…,vp]T is the vector of the weights of the nonlinear part. The following cases of nonlinear regression were considered (Table 3), wherein each of the following models does not take into account the squares of qualitative variables X1, X2, X3, and X4 (v1=v2=v3=v4=0):
NLS1: both the weights of the linear part and the weights v5,…,v18 of the nonlinear part are calculated.
NLS2: the weights of the linear part are constant, and their values come from the OLS regression (the second column of Table 2); the weights v5,…,v18 of the nonlinear part are calculated (the third column of Table 3).
NLS3: the weights of the linear part are constant, and their values come from the ridge regression (the third column of Table 2); the weights v5,…,v18 of the nonlinear part are calculated (the fourth column of Table 3).
NLS4: the weights of the linear part are constant, and their values come from the LASSO regression (the fourth column of Table 2); the weights v5,v7,…,v13,v16,v18 of the nonlinear part are calculated (the fifth column of Table 3).
Coefficients of nonlinear part of nonlinear models and error results (all coefficients have to be multiplied by 10^{−2}).
Regression
NLS1
NLS2
NLS3
NLS4
v5
53.35
−0.3751
−0.3686
−0.6995
v6
59.43
−1.0454
−1.3869
v7
0.1218
0.0003
0.0004
0.0001
v8
1.880
0.0710
0.0372
−0.0172
v9
−0.0016
0.0093
0.0093
0.0085
v10
−0.6646
−0.0577
−0.0701
−0.1326
v11
−3.0394
−0.3608
0.0116
0.8915
v12
4.8741
0.3807
0.4170
1.0628
v13
0.4897
−0.2496
−0.2379
−0.1391
v14
−4.7399
−0.1141
−0.1362
v15
−13.6418
1.3387
0.8183
v16
0.0335
−0.0015
−0.0003
−0.0004
v17
−0.0033
−0.0006
−0.0006
v18
0.0054
0.0012
0.0013
−0.0002
RMSECV [s]
28.83
25.24
25.34
24.60
RMSET [s]
20.21
22.63
22.74
22.79
Based on the results shown in Table 3, the best nonlinear regression model is the NLS4 model, that is, the modified LASSO regression. This model is characterized by the smallest prediction error and the reduced number of predictors.
5.3. Neural Networks
In order to select the best structure of a neural network, the number of neurons m∈[1,10] in the hidden layer was analyzed. In Figures 2, 3, and 4, the relationships between cross-validation error and the number of hidden neurons are presented. The smallest cross-validation errors for the MLP(tanh) and MLP(exp) networks were obtained for one hidden neuron (18-1-1 architecture) and they were, respectively, 29.89 s and 30.02 s (Table 4). For the RBF network, the best architecture was the one with four neurons in the hidden layer (18-4-1) and cross-validation error in this case was 55.71 s. Comparing the results, it is seen that the best model is the MLP(tanh) network with the 18-1-1 architecture. However, it is worse than the best regression model NLS4 (Table 3) by more than 5 seconds.
The number of hidden neurons and error results for the best neural nets.
ANN
MLP(tanh)
MLP(exp)
RBF
m
1
1
4
RMSECV [s]
29.89
30.02
55.71
RMSET [s]
25.19
25.17
52.63
Cross-validation error (RMSECV) and training error (RMSET) for MLP(tanh) neural network; vertical line drawn for m=1 signifies the number of hidden neurons chosen in cross-validation.
Cross-validation error (RMSECV) and training error (RMSET) for MLP(exp) neural network; vertical line drawn for m=1 signifies the number of hidden neurons chosen in cross-validation.
Cross-validation error (RMSECV) and training error (RMSET) for RBF neural network; vertical line drawn for m=4 signifies the number of hidden neurons chosen in cross-validation.
6. Conclusions
This paper presents linear and nonlinear models used to predict sports results for race walkers. Introducing a monthly training schedule for a selected phase in the annual cycle, a decline in physical performance may be predicted based on the generated results. Thanks to that, it is possible to take into account earlier changes in the scheduled training. The novelty of this research is the use of nonlinear models, including modifications of linear regressions and artificial neural networks, in order to reduce the prediction error generated by linear models. The best model was the nonlinear modification of LASSO regression for which the error was 24.6 seconds. In addition, the method has simplified the structure of the model by eliminating 9 out of 32 predictors. The research hypothesis was confirmed. Comparing with other results is difficult because there is a lack of publications concerning predictive models in race walking.
Experts in the fields of sports theory and training were consulted during the construction of the models in order to maintain the theoretical and practical principles of sport training. The importance of the work is that practitioners (coaches) can use predictive models for planning of training loads in race walking.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of the paper.
Acknowledgment
This work has been partially supported by the Polish Ministry of Science and Higher Education under Grant U-649/DS/M.
BompaT. O.HaffG.MaszczykA.ZającA.RygułaI.A neural network model approach to athlete selectionPrzednowekK.WiktorowiczK.Prediction of the result in race walking using regularized regression modelsPapićV.RoguljN.PleštinaV.Identification of sport talents using a web-oriented expert system with a fuzzy moduleRoczniokR.MaszczykA.StanulaA.CzubaM.PietraszewskiP.KantykaJ.StarzyńskiM.Physiological and physical profiles and on-ice performance approach to predict talent in male youth ice hockey players during draft to hockey teamHaghighatM.RastegariH.NourafzaN.BranchN.EsfahanI.A review of data mining techniques for result prediction in sportsPrzednowekK.WiktorowiczK.Neural system of sport result optimization of athletes doing race walkingDrakeA.JamesR.Prediction of race walking performance via laboratory and field testsChatterjeeP.BanerjeeA. K.DasbP.DebnathP.A regression equation to predict VO_{2} max of young football players of NepalOfoghiB.ZeleznikowJ.MacMahonC.RaabM.Data mining in elite sports: a review and a frameworkMężykE.UnoldO.Machine learning approach to model sport trainingPfeifferM.HohmannA.Applications of neural networks in training scienceRygułaI.Artificial neural networks as a tool of modeling of training loadsProceedings of the 27th Annual International Conference of the Engineering in Medicine and Biology Society (IEEE-EMBS '05)September 2005IEEE298529882-s2.0-33846937113SilvaA. J.CostaA. M.OliveiraP. M.ReisV. M.SaavedraJ.PerlJ.RouboaA.MarinhoD. A.The use of neural network technology to model swimming performanceMaszczykA.RoczniokR.WaśkiewiczZ.CzubaM.MikolajecK.ZajacA.StanulaA.Application of regression and neural models to predict competitive swimming performanceHastieT.TibshiraniR.FriedmanJ.MaddalaG. S.BishopC. M.HoerlA. E.KennardR. W.Ridge regression: biased estimation for nonorthogonal problemsTibshiraniR.Regression shrinkage and selection via the LassoEfronB.HastieT.JohnstoneI.TibshiraniR.Least angle regressionZouH.HastieT.Regularization and variable selection via the elastic netArlotS.CelisseA.A survey of cross-validation procedures for model selectionR Development Core TeamRipleyB.VenablesB.HornikK.GebhardtA.FirthD.ZouH.HastieT.Package “elasticnet”, version 1.1, CRAN, 2012R Development Core Team and Contributors WorldwideStatSoft