Visual Analysis of Sports Actions Based on Machine Learning and Distributed Expectation Maximization Algorithm

To improve the scientific rigor of sports action analysis, this paper constructs a machine-learning-based sports action analysis model built on the greedy algorithm and the bat algorithm. According to the structural characteristics of the model, its structure is represented in the form of a face order, that is, a face neighborhood structure. The paper judges the degree of similarity between model faces through the relative quality of this order and applies it to the structural similarity matrix between models. In addition, corresponding mathematical models are established for the shape and structure of the model, and the shape similarity matrix, the face neighborhood structure similarity matrix, and the structural similarity matrix between the source model and the target model are constructed. Finally, the paper designs and implements CAD model retrieval methods based on the greedy algorithm and the bat algorithm and conducts experiments comparing the proposed algorithm with traditional algorithms. The experimental results show that the proposed algorithm has clear advantages over traditional algorithms in sports action analysis.


Introduction
Scientific analysis of sports movements can effectively improve the effects of sports training, provide athletes with more scientific and effective training guidance, and reduce training injuries. Therefore, assisting sports training through intelligent sports movement recognition and analysis models is the direction of future sports training development. With the development of cost-effective sensors and the advancement of human body pose estimation algorithms, it has become easier to obtain three-dimensional and two-dimensional skeletal point data from human motion video. The motion trajectory of the human skeletal points over time can represent actions in the time sequence and is not affected by lighting, clothing, skin color, and so on. In the recognition of local limb movements with higher fine-grained requirements, the geometric characteristics between the skeletal points can be expressed naturally, and the resulting movement features have good interpretability. These characteristics make human action recognition based on skeletal point data an important direction in computer vision.
Action recognition based on skeletal points requires designing different classification algorithms according to the different features extracted from the skeletal points. The relative motion of the human bones produces posture, so the relative position changes of the skeletal points can characterize the posture changes in a sequence. Moreover, the angle change of connected bones indirectly reflects the change of movement there. Therefore, the spatial structure features used in this paper are the vector modulus ratios of characteristic skeletal points and the vector angles between skeletal points.
Video retrieval searches for useful or needed information in a large amount of video data. It is mainly based on a given example or a designed feature description to find the video clips that meet the given conditions. With the widespread use of video capture equipment, the amount of available video data is increasing rapidly. If video data are processed manually, labor costs rise substantially. If, instead, a computer analyzes the human motion behavior in the video, excessive labor costs can be avoided. Moreover, videos can be labeled and indexed faster, more comprehensively, and more accurately, thereby helping people find important information of interest more conveniently.

Related Work
Most feature extraction models describe only a certain part of a model, and it is difficult for them to extract and describe all functional models. Some research proposed a 3D shape descriptor and a weight combination optimization scheme and used it to retrieve CAD models [1]. This scheme judges the similarity between models based on the weighted sum of the L1 distances on the distance histograms corresponding to the descriptor and uses the weight combination optimization scheme to improve the retrieval accuracy of the 3D model database. Some research proposed three combined feature descriptors and one class-based feature descriptor and applied them to model retrieval [2,3]. Some research introduced coupled machining feature clusters to represent the three-dimensional CAD model as a structured model with machining features as the carrier and used multilevel feature descriptors to establish a machining feature similarity evaluation model. Some research used fuzzy clustering of physical descriptors to generate a multiscale index of the 3D model database for fast matching.
Some research proposed an ontology-based 3D CAD model retrieval algorithm. The CAD model is divided into several related subcomponents, and semantic descriptions and annotations are added to complete the similarity evaluation between models. Some research used a hierarchical feature ontology and ontology mapping to generate semantic descriptors for ontology reasoning so that the retrieval system can obtain better performance [4]. Some research used an ontology to describe the functional semantics of the CAD model and semiautomatically annotated the functional semantics according to the attribute adjacency graph, thereby establishing a CAD model library that supports functional semantic model retrieval based on existing model retrieval technology and feature extraction methods [5,6]. Some research used deep learning to train model features and built a deep neural network classifier for three-dimensional computer-aided design models. Some research extracted the feature vector of the three-dimensional model in another way to predict and match models, effectively assessing the difference between two models [7]. In addition, many researchers have begun to convert three-dimensional models into two-dimensional graphics to extract the edge and shape information of the model and describe its characteristics. Some research divided the surface adjacency graph into convex, concave, and flat areas, used area attribute codes to represent the surface areas, and compared the area attribute codes to measure the similarity between models. Some research used B-Rep information and labeled attribute adjacency graphs to represent 3D CAD models and used a two-stage filtering strategy combined with graphic code indexing to construct filtering and verification frameworks to speed up the retrieval of 3D models [8,9].
Many researchers have also begun to use distributed frameworks to optimize and improve traditional machine learning algorithms. Some research studied clustering algorithms under the Hadoop cloud platform [10]. Some research used the MapReduce distributed framework to parallelize and improve the traditional ant colony algorithm, making it faster and more effective when processing large-scale data sets [11,12]. Some research used Newton's method to solve for the beta distribution parameters and proposed a suitable initial value selection algorithm to enable the EM (Expectation Maximization) algorithm to effectively solve the parameters of an implicit regression model. Some research proposed an EM algorithm based on density detection, which selects the initial value based on density and distance to reduce the influence of the traditional EM algorithm's initial value selection on the convergence effect [13-15]. Some research proposed a fast and robust finite Gaussian mixture model clustering algorithm and applied an entropy penalty operator to the mixing coefficients of the model components and the probability coefficients of the components to which the samples belong, making the algorithm converge to a fixed value in a few iterations. Some research used the greedy algorithm and set appropriate thresholds for the hidden parameters so that the traditional EM algorithm can obtain the number of model components of the Gaussian mixture model in a few iterations without presetting it.

EM Algorithm
The EM algorithm is an iterative method used to solve the maximum likelihood estimation or maximum a posteriori estimation of parameters in a probability model, which can greatly reduce the computational complexity of solving the maximum likelihood estimation.
The specific algorithm flow is as follows.
If we assume that the sample set is $X = \{x_1, x_2, x_3, \ldots, x_m\}$ and obeys a Gaussian mixture distribution, the probability density function $f_k(x)$ of the Gaussian mixture distribution is

$$f_k(x_i) = \sum_{j=1}^{k} w_j \, \Phi_j(x_i; \theta_j).$$

Among them, $x_i$ is a $p$-dimensional vector, $\Phi_j(x_i; \theta_j)$ is the probability density of the $j$th Gaussian model component, and $\theta_j$ is its parameter. $w_j$ is the mixing coefficient of the $j$th component, describing the proportion of the samples covered by the $j$th Gaussian model component in the total sample, and $k$ is the number of model components of the Gaussian mixture model.
$\mu_j$ is the mean of a Gaussian model component, and $\Sigma_j$ is the covariance matrix of a Gaussian model component. The Gaussian component probability density $\Phi_j(x_i; \theta_j)$ is

$$\Phi_j(x_i; \theta_j) = \frac{1}{(2\pi)^{p/2} \, |\Sigma_j|^{1/2}} \exp\!\left(-\frac{1}{2}(x_i - \mu_j)^{T} \Sigma_j^{-1} (x_i - \mu_j)\right).$$

The initial values $\mu^0, \Sigma^0, w^0$ are given, and steps E and M are repeated until the algorithm converges.

Step E: according to the initial value of the parameter $\theta$ or the parameter value obtained in the last iteration, the posterior probability of the latent variable (i.e., the expectation of the latent variable) is calculated as its current estimate:

$$\gamma_{ij} = \frac{w_j \, \Phi_j(x_i; \theta_j)}{\sum_{l=1}^{k} w_l \, \Phi_l(x_i; \theta_l)}.$$

Step M: by maximizing the likelihood function, new parameter values are obtained:

$$w_k^{t+1} = \frac{1}{m}\sum_{i=1}^{m} \gamma_{ik}, \qquad \mu_k^{t+1} = \frac{\sum_{i=1}^{m} \gamma_{ik}\, x_i}{\sum_{i=1}^{m} \gamma_{ik}}, \qquad \Sigma_k^{t+1} = \frac{\sum_{i=1}^{m} \gamma_{ik}\,(x_i - \mu_k^{t+1})(x_i - \mu_k^{t+1})^{T}}{\sum_{i=1}^{m} \gamma_{ik}}.$$

Among them, $w_k^{t+1}$ is the $k$th class weight, $\mu_k^{t+1}$ is the $k$th class mean, $\theta^{t+1}$ is the updated parameter vector, and $\Sigma_k^{t+1}$ is the $k$th class covariance matrix.
Finally, through iteration, the parameter values of the Gaussian mixture model are obtained, and each sample is assigned to the class to which it belongs, giving the final clustering result.
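The E and M steps above can be sketched in a few lines of pure Python for a one-dimensional mixture. This is a minimal illustration, not the paper's implementation: the two-cluster toy data, the initialization that spreads the means over the data range, and all variable names are assumptions.

```python
import math
import random

def gaussian_pdf(x, mu, var):
    """Probability density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, k=2, iters=50):
    """Plain EM for a 1-D k-component Gaussian mixture."""
    weights = [1.0 / k] * k
    lo, hi = min(data), max(data)
    # Spread the initial means evenly over the data range (illustrative choice).
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    variances = [1.0] * k
    for _ in range(iters):
        # E step: posterior responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(sum(r[j] * (x - means[j]) ** 2
                                   for r, x in zip(resp, data)) / nj, 1e-6)
    return weights, means, variances

# Toy data: two well-separated 1-D clusters (illustrative only).
rng = random.Random(1)
data = ([rng.gauss(0.0, 1.0) for _ in range(100)]
        + [rng.gauss(5.0, 1.0) for _ in range(100)])
weights, means, variances = em_gmm(data)
```

On this toy data the estimated means converge near the true cluster centers 0 and 5, and the mixing coefficients sum to one by construction of the M step.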

Greedy EM Algorithm
Based on the original probability density function $f_k(x)$ of the Gaussian mixture distribution in the EM algorithm, when a new component $\delta(x; \theta)$ is added to the existing $k$-component mixture density function $f_k(x)$, a new Gaussian mixture model density function is generated:

$$f_{k+1}(x) = (1 - \alpha)\, f_k(x) + \alpha\, \delta(x; \theta).$$

Among them, $\alpha$ is the mixing coefficient of the newly added model component, $0 < \alpha < 1$.
Then, the newly generated log-likelihood function is

$$L_{k+1} = \sum_{i=1}^{m} \log f_{k+1}(x_i).$$

After the new Gaussian mixture model, including the existing mixture components and the new component, is obtained, the mixture model $f_k(x)$ is held unchanged.
Therefore, the core of the greedy EM algorithm is to optimize the mixing coefficient $\alpha$ of the new model component and the parameters of the new component so as to maximize the newly generated log-likelihood function $L_{k+1}$. First, a set of initial parameters $\mu^0$, $\Sigma^0$, and $\alpha^0$ of the new component is found through a global search. At $\alpha^0$, $L_{k+1}$ is expanded by the second-order Taylor formula, and the resulting quadratic function of $\alpha$ is maximized to obtain an approximation of the likelihood function:

$$L_{k+1}(\alpha) \approx L_{k+1}(\alpha^0) + L_{k+1}'(\alpha^0)\,(\alpha - \alpha^0) + \frac{1}{2}\, L_{k+1}''(\alpha^0)\,(\alpha - \alpha^0)^2.$$

In the formula, $L_{k+1}'$ and $L_{k+1}''$ are the first and second derivatives with respect to $\alpha$. Maximizing this quadratic near $\alpha^0 = 0.5$ gives the estimate $\hat{\alpha}$ of the newly added model component's mixing coefficient:

$$\hat{\alpha} = \alpha^0 - \frac{L_{k+1}'(\alpha^0)}{L_{k+1}''(\alpha^0)}.$$

Thus, the estimated value of the new model component is obtained, and the optimal solution $(\alpha_{k+1}, \mu_{k+1}, \Sigma_{k+1})$ of the new model parameters is then extracted iteratively to obtain the log-likelihood function value $L_{k+1}$ of the new Gaussian mixture model.

The actual processing of MapReduce proceeds in stages; the specific workflow is shown in Figure 1. In the Map phase, the algorithm processes the key-value pairs parsed in the Input phase. First, it initializes and then adds a new model component to the single Gaussian model initialized in each node. The new model component is a standard normal distribution with a mean of 0 and a standard deviation of 1, from which the initialized mixing coefficient $\alpha^0$ of the new model component is obtained; the initial parameter values of the $l$th node's model are set from this initialization. Before output to the Reduce stage, there is a series of intermediate processing steps. In the Sort stage, because the key in each key-value pair is set to the iteration number, little additional handling is needed. In the Partition stage, the number of keys equals the number of iterations, and the key-value pairs generated under the same iteration level have the same key value.
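The one-step quadratic maximization of $L_{k+1}$ over the mixing coefficient can be sketched as follows. This is a hedged illustration: the closed-form first and second derivatives of the mixture log-likelihood with respect to $\alpha$, and the clamping of the result to (0, 1), are assumptions of this sketch rather than the paper's exact formulas.

```python
import math

def log_likelihood(alpha, f_old, f_new):
    """L(alpha) = sum_i log((1 - alpha) * f_k(x_i) + alpha * delta(x_i))."""
    return sum(math.log((1 - alpha) * fo + alpha * fn)
               for fo, fn in zip(f_old, f_new))

def newton_alpha(f_old, f_new, alpha0=0.5):
    """One quadratic (Newton) step for alpha from the point alpha0."""
    # First and second derivatives of L with respect to alpha, in closed form.
    d1 = sum((fn - fo) / ((1 - alpha0) * fo + alpha0 * fn)
             for fo, fn in zip(f_old, f_new))
    d2 = -sum((fn - fo) ** 2 / ((1 - alpha0) * fo + alpha0 * fn) ** 2
              for fo, fn in zip(f_old, f_new))
    alpha = alpha0 - d1 / d2                 # maximize the quadratic model
    return min(max(alpha, 1e-6), 1 - 1e-6)  # keep 0 < alpha < 1

# f_old: old-mixture densities at each sample; f_new: new-component densities.
f_old = [0.1, 0.2, 0.9]
f_new = [0.5, 0.4, 0.05]
a = newton_alpha(f_old, f_new)
```

Because $L$ is concave in $\alpha$ (the second derivative is negative everywhere), the step moves $\alpha$ toward the maximizer and increases the log-likelihood on this example.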
Therefore, for each key-value pair generated in the Mapper phase, a Reduce job is assigned; the key and value of the key-value pair are serialized into a byte array, and the result is then written into the buffer to await the call.
In the Combine stage, the key-value pairs obtained by all nodes are integrated. Under the same number of iterations, the keys of the key-value pairs obtained from each node are the same, so the values in these key-value pairs are integrated into a value set; namely, list⟨key, value⟩ ⟶ ⟨key, list⟨value⟩⟩. Among them, key is the number of iterations, and value is the Gaussian mixture model density.
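The Combine stage's grouping, list⟨key, value⟩ ⟶ ⟨key, list⟨value⟩⟩, amounts to collecting all densities emitted under the same iteration number. A minimal sketch, with illustrative names and toy values:

```python
from collections import defaultdict

def combine(pairs):
    """pairs: iterable of (iteration, density) tuples emitted by every node.
    Returns {iteration: [density, ...]} preserving emission order."""
    grouped = defaultdict(list)
    for iteration, density in pairs:
        grouped[iteration].append(density)
    return dict(grouped)

# Toy emissions from three nodes across two iteration levels.
emitted = [(1, 0.12), (1, 0.40), (2, 0.33), (1, 0.09), (2, 0.27)]
print(combine(emitted))  # {1: [0.12, 0.40, 0.09], 2: [0.33, 0.27]}
```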
In the Reduce phase, the key-value pair generated in the Combine phase is ⟨key, list⟨value⟩⟩. That is, the set of Gaussian mixture model density values in the key-value pair is summed, the logarithm is taken, and the integrated log-likelihood function is obtained as

$$F_k = \sum_{i=1}^{m} \log f_k(x_i).$$

Then, it is judged whether $F_k$ satisfies the convergence condition, with the judgment factor set as the relative change

$$\lambda = \frac{\left|F_k^{\,t+1} - F_k^{\,t}\right|}{\left|F_k^{\,t}\right|}.$$

When $\lambda < 10^{-6}$, the algorithm satisfies the convergence condition and proceeds to the next judgment step. If it does not converge, the algorithm restarts the maximum likelihood estimation operation.
In the Output stage, it is judged whether the convergence value $F_k$ satisfies $F_{k+1} \leq F_k$. If this condition is not met, a new model component is added again. If it is met, the output condition is satisfied, and the result is output as key-value pairs according to the format of the output file. For the value $F_k$ that satisfies the output condition, the corresponding $k$ is the ideal number of model components to output. The final algorithm flowchart is shown in Figure 2.
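The outer control flow of the Output stage can be sketched as a loop that keeps adding components while the integrated log-likelihood improves and stops once $F_{k+1} \leq F_k$ or the relative change falls below the judgment threshold. The precomputed `likelihoods` sequence below is an illustrative stand-in for the per-$k$ values an actual greedy-EM pass would produce.

```python
def select_components(likelihoods, tol=1e-6):
    """likelihoods[k-1] = F_k for k = 1, 2, ...; return the chosen k."""
    best_k = 1
    for k in range(2, len(likelihoods) + 1):
        f_prev, f_new = likelihoods[k - 2], likelihoods[k - 1]
        lam = abs(f_new - f_prev) / max(abs(f_prev), tol)  # judgment factor
        if f_new <= f_prev or lam < tol:  # no real improvement: stop adding
            break
        best_k = k
    return best_k
```

For example, a likelihood sequence that improves through k = 3 and then stalls yields k = 3.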

Implementation of Distributed Greedy EM Algorithm
According to the programming model of MapReduce, corresponding functions are designed for program operation.In the realization of the distributed greedy EM algorithm, because the main process of the distributed greedy EM algorithm includes the reading of global data, the update calculation of Gaussian mixture model parameters, and the three aspects of iteration, the function design and implementation of these three main steps are carried out.

Reading of Global Data.
In the algorithm of this paper, the data needs to be globally searched and initialized, and all the data in the distributed file system can be obtained through the setup function in the Mapper class. Because the global variables mentioned sometimes need to be acquired and used, the Map function can be defined after the setup function is defined. First, the path of the model parameters of the distributed greedy EM algorithm is obtained and stored in the Configuration object. For the first iteration, the initial parameter values of the model are obtained from the path stored in the Configuration object. Then, in subsequent iterations, the output path is obtained from the Reduce function of the previous iteration.
Then, in the Mapper class, the setup function is reassigned, and the corresponding data is obtained from the global file referenced by the Configuration object. After reading is completed, the parameter values are read into the cluster object through a BufferedReader, thus completing the reading of the global data.

Update of Gaussian Mixture Model Parameters.
When the parameters are updated through maximum likelihood estimation of the Gaussian mixture model, the parameters include the mixing coefficient, mean, and covariance matrix of the new model component. Among them, computing the covariance matrix requires the mixing coefficient and mean of the newly added model component. Therefore, two functions need to be designed to update them separately.
The constant MapReduce function is defined to obtain the mixing coefficient and mean of the new model component, and the matrix MapReduce function is defined to obtain the covariance matrix and output all parameter values.
For the constant MapReduce function, every time the mixing coefficient and mean of the new model component are computed, all the parameter values obtained from the previous iteration (the mixing coefficient, mean, and covariance matrix of the new model component) are required. Therefore, the data in the global cluster object is read in the setup function. For the matrix MapReduce function, however, the mixing coefficient and mean of the new model component obtained in the current update by the constant MapReduce function are required. Moreover, the covariance matrix obtained in this update will be used as the input of the constant MapReduce function in the next iteration, so the setup function must read the mixing coefficient, mean, and covariance matrix of the new model component obtained this time into the cluster_new object.
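The chaining of the two update functions can be sketched in one dimension: a "constant" step produces the mixing coefficient and mean, and a "matrix" step then uses that mean to produce the covariance. The function names, the responsibility vector, and the single-dimension math are illustrative assumptions, not the paper's Hadoop code.

```python
def constant_step(data, resp):
    """Mixing coefficient and mean of the new component from responsibilities."""
    n = sum(resp)
    alpha = n / len(data)
    mean = sum(r * x for r, x in zip(resp, data)) / n
    return alpha, mean

def matrix_step(data, resp, mean):
    """Covariance (variance in 1-D) using the mean from constant_step."""
    n = sum(resp)
    return sum(r * (x - mean) ** 2 for r, x in zip(resp, data)) / n

data = [1.0, 2.0, 3.0, 10.0]
resp = [1.0, 1.0, 1.0, 0.0]  # the new component "owns" the first three points
alpha, mean = constant_step(data, resp)
var = matrix_step(data, resp, mean)  # alpha=0.75, mean=2.0, var=2/3
```

The ordering matters exactly as described above: `matrix_step` consumes the mean that `constant_step` just produced, and its output would feed the next iteration's constant step.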

Iteration of Distributed Greedy EM Algorithm.
When the distributed greedy EM algorithm iterates, it needs to store the output result of the previous MapReduce job in HDFS and use it as the input of the next iteration's MapReduce job. Moreover, after an iteration is completed, the corresponding intermediate data needs to be deleted. Therefore, the iterative process of the distributed greedy EM algorithm is designed as shown in Figure 3.
Our work used the bat algorithm to find the optimal face matching sequence. In the process of the bats searching the structural similarity matrix FC of the source model and the target model, a corresponding fitness value is set for each bat so that it can move toward the best position. The position solution sequence of the $i$th bat in the structural similarity matrix FC is $(1, j(1)), (2, j(2)), \ldots, (i, j(i)), \ldots, (m, j(m))$. Among them, $j(i)$ represents the target face number, $i = 1, 2, \ldots, m$, that matches the $i$th face of the source model, and the fitness function of the $i$th bat is computed from this sequence. In the process of searching for the optimal solution, the points acquired by the bats are discrete. In a $d$-dimensional search space, the size of the population is $N$. $x_i$ represents the position of the $i$th bat, that is, a two-dimensional sequence pair composed of the row and column indices of the structural similarity matrix, and $v_i$ represents the flight speed of the $i$th bat. At time $t$, the speed of bat $i$ is updated as follows:

$$v_i^{t} = \operatorname{round}\!\left(v_i^{t-1} + \left(x_i^{t-1} - x_*\right) Q(i)\right).$$

Among them, round represents the rounding operation, and $Q(i)$ is the frequency of the $i$th bat:

$$Q(i) = Q_{\min} + \left(Q_{\max} - Q_{\min}\right) \cdot \text{rand}.$$

rand is a uniformly distributed random number in [0, 1], and $x_*$ is the current optimal position solution. Taking source model A and target model B as examples, a set of velocity solutions for the current $i$th bat can be obtained with the above formulas. From the sequence of these solutions, a set of sequence solutions is obtained to simulate the whole process of the bats searching for the optimal position sequence.
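The frequency-driven, rounded velocity update above can be sketched as follows; the standard bat-algorithm form v = v + (x − x*)·Q with rounding is assumed, and the parameter names (q_min, q_max) are illustrative.

```python
import random

def update_bat(position, velocity, best_position, q_min, q_max, rng):
    """One velocity/position step for a bat whose position is an index vector."""
    q = q_min + (q_max - q_min) * rng.random()  # frequency of this bat
    new_velocity = [round(v + (x - b) * q)      # rounded to keep indices integral
                    for v, x, b in zip(velocity, position, best_position)]
    new_position = [x + v for x, v in zip(position, new_velocity)]
    return new_position, new_velocity

rng = random.Random(42)
pos, vel = update_bat([3, 7, 2], [0, 1, -1], [1, 5, 2], 0.0, 1.0, rng)
```

Components of the position equal to the best position contribute no velocity change, so converged coordinates stay put while others are pulled toward the current optimum.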
The algorithm then applies the remainder operation to reacquire the current bat position solution:

$$x_i^{t} \leftarrow \operatorname{mod}\!\left(x_i^{t}, n\right),$$

where mod is the remainder function and $n$ is the number of target model faces. After the mod operation, the second component of the bat position solution may become 0. If 0 appears, the algorithm executes Step 2; otherwise, the algorithm executes Step 3.
Step 2. The second component of the processed position solution is 0 (replace operation). A random sequence of the source model face numbers 1, ..., m is obtained, and the values that do not appear among the second components of the position solution after the remainder operation are found. The found values are then used to fill in the 0s in the second components of the position solution. The replace operation is shown in Figure 4. The black solid points on the left are the position solution sequence after taking the remainder: (1, 0), (2, 0), (3, 4), (4, 5), (5, 5), (6, 8), (7, 6), (8, 4), (9, 5).
Step 3. Delete repeated position solutions (unique operation). The algorithm removes duplicate position solutions from the current position solution sequence.

Step 4. Fill the position solution sequence (fill operation). From the position solution sequence after the unique operation, the algorithm collects all the first components into a set FSet and all the second components into a set SSet. It then obtains a random sequence of the source model face numbers 1, ..., m and finds all the values that do not appear in FSet to form the first component set FS₁. After that, it obtains a random sequence of the target model face numbers 1, ..., n and finds all the values that do not appear in SSet to form the second component set SS₁. From FS₁ and SS₁, the algorithm randomly selects values to form position solutions and adds them to the position solution sequence until FS₁ and SS₁ are empty.
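The mod → replace → unique → fill pipeline can be sketched end to end as a repair routine that turns a raw position solution into a valid one-to-one face assignment. This is a hedged illustration: deterministic orderings replace the paper's random sequences so the result is reproducible, and all names are assumptions.

```python
def repair(solution, m, n):
    """solution: list of (source_face, target_face); faces numbered 1..m, 1..n."""
    # mod: wrap target indices with mod n (a result of 0 marks a slot to repair).
    wrapped = [(s, t % n) for s, t in solution]
    # replace: fill 0 second components with target face numbers not yet used.
    used = {t for _, t in wrapped if t != 0}
    spare = [t for t in range(1, n + 1) if t not in used]
    repaired = [(s, t if t != 0 else spare.pop(0)) for s, t in wrapped]
    # unique: keep a pair only if neither of its components appeared before.
    seen_s, seen_t, uniq = set(), set(), []
    for s, t in repaired:
        if s not in seen_s and t not in seen_t:
            uniq.append((s, t))
            seen_s.add(s)
            seen_t.add(t)
    # fill: pair up the remaining unused source and target faces.
    free_s = [s for s in range(1, m + 1) if s not in seen_s]
    free_t = [t for t in range(1, n + 1) if t not in seen_t]
    uniq.extend(zip(free_s, free_t))
    return uniq

# The example sequence from the text, after the remainder operation.
raw = [(1, 0), (2, 0), (3, 4), (4, 5), (5, 5), (6, 8), (7, 6), (8, 4), (9, 5)]
out = repair(raw, 9, 9)
```

On this input the routine returns nine pairs in which every source face and every target face appears exactly once.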

Since the points in the structural similarity matrix between the source model and the target model are all discrete points, when the bat algorithm searches this matrix, the position solutions obtained are also discrete. This requires processing these position solutions to reacquire the current optimal position solution sequence.
The face matching process based on the bat algorithm is as follows:

Step 1. The algorithm calculates the shape similarity and the neighborhood structure similarity between the source model faces and the target model faces and constructs the structural similarity matrix FC between the source model and the target model.

Step 2. Initialize the population. The algorithm sets the population size $N$, the iteration counter count, the maximum number of iterations Maxk, the initial loudness $A_i^0$ of the $i$th bat at time $t = 0$, the pulse rate $r_i^0$, and the maximum value $Q_{\max}$ and minimum value $Q_{\min}$ of the frequency.

Step 3. The algorithm calculates the fitness function value $f(i)$ $(i = 1, 2, \ldots, n)$ of each bat and obtains the smallest fitness function value $f_{\min}$ and its corresponding optimal position solution $x_*$.

Step 4. If count exceeds Maxk, the algorithm ends and the current optimal position solution $x_*$ is output; otherwise, count is incremented and the algorithm goes to Step 5.

Step 5. The algorithm updates the relevant values and rounds the speed $v_i^t$ and position $x_i^t$ of the $i$th bat at time $t$.

Step 6. The algorithm performs mod processing on $x_i^t$ and judges whether there is a 0 solution in the face sequence position solution. If there is a 0 solution, the algorithm goes to Step 7; otherwise, it executes Step 8.

Step 7. Through the replace operation, the algorithm handles the 0 solutions in the position solution sequence.

Step 8. The algorithm performs unique processing to delete the repeated position solutions in the current face sequence.

Step 9. Through the fill operation, the algorithm handles the empty solutions in the face sequence position solution and obtains a new candidate solution $x_{new}$.

Step 10. The algorithm generates a random number rand1. If rand1 $> r_i^0$, the position of the best bat in the current group shifts to the next position and the algorithm goes to Step 6; otherwise, it goes to Step 11.

Step 11. The algorithm generates a random number rand2. If rand2 $< A_i^0$ and $f(x_{new}) < f(i)$, then $f(i) = f(x_{new})$ and $x_i^{t-1} = x_{new}$; otherwise, $f(i)$ and $x_i^{t-1}$ are unchanged.

Step 12. If $f(x_{new}) < f_{\min}$, then $x_* = x_{new}$; otherwise, $x_*$ is unchanged, and the algorithm returns to Step 4. The bat algorithm thus searches the structural similarity matrix, and the final optimal position solution vector gives the face matching sequence.
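The overall loop of these steps can be sketched on a toy similarity matrix. This is a simplified, hedged illustration, not the paper's tuned algorithm: face indices are 0-based, the loudness and pulse-rate schedules are omitted, and the repair of invalid assignments uses a deterministic first-occurrence rule. Fitness is the negative total similarity, so lower is better, matching the $f_{\min}$ convention above.

```python
import random

def fitness(assign, fc):
    """Negative total similarity of the matching source face s -> target assign[s]."""
    return -sum(fc[s][t] for s, t in enumerate(assign))

def bat_search(fc, pop=10, max_iter=200, seed=0):
    rng = random.Random(seed)
    m = len(fc)
    # Initial population: valid random permutations (one target face per source face).
    bats = [rng.sample(range(m), m) for _ in range(pop)]
    best = min(bats, key=lambda a: fitness(a, fc))
    for _ in range(max_iter):
        for i, a in enumerate(bats):
            q = rng.random()  # frequency in [0, 1)
            # Rounded move toward the current best, wrapped into valid indices.
            moved = [round(x + (x - b) * q) % m for x, b in zip(a, best)]
            # Repair duplicates: keep first occurrences, fill with unused targets.
            seen, fixed = set(), []
            for t in moved:
                fixed.append(t if t not in seen else None)
                seen.add(t)
            spare = [t for t in range(m) if t not in seen]
            fixed = [t if t is not None else spare.pop() for t in fixed]
            if fitness(fixed, fc) < fitness(a, fc):
                bats[i] = fixed
            if fitness(bats[i], fc) < fitness(best, fc):
                best = bats[i]
    return best
```

Run on a diagonal-heavy 4×4 similarity matrix, the search returns a valid permutation with a fixed seed, so repeated calls reproduce the same matching.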

The Analysis Effect of the Model on Sports Movements.
The scientific analysis of sports actions is carried out through the model constructed above. The algorithm in this paper is named the HB algorithm, and its performance is compared with that of a neural network algorithm (NN) and a deep learning algorithm (DL). First, this paper compares the effectiveness of sports action feature recognition across 40 sets of actions. The results are shown in Figure 7.
It can be seen from the above chart that the hybrid model constructed in this paper performs well in recognizing sports action features, while the recognition rates of the traditional neural network and deep learning algorithms are lower. Next, this paper analyzes the scientific evaluation of the actions by each algorithm; the results are expressed by a scoring method, and the action correction opinions are also scored. The results are shown in Figures 8 and 9.
It can be seen from the above figures that the algorithm in this paper performs very well in action evaluation and action correction, scoring much higher than the traditional algorithms. It can be seen that the algorithm in this paper is effective.

Conclusion
This paper combines the greedy algorithm and the bat algorithm to construct an intelligent model that can be applied to sports action analysis. Moreover, this paper designs and implements CAD model retrieval methods based on the greedy algorithm and the bat algorithm. In addition, this paper focuses on the retrieval process based on the greedy algorithm and the bat algorithm and compares and analyzes the advantages and disadvantages of the two algorithms in model retrieval based on experimental data.
The results show that the bat algorithm is feasible in model retrieval and outperforms the greedy algorithm when measuring subtle differences between CAD models. The relative quality of the face order is used to judge the degree of similarity between the model faces and is applied to the structural similarity matrix between the models. At the same time, this paper establishes the corresponding mathematical models for the shape and structure of the model and constructs the shape similarity matrix, the face neighborhood structure similarity matrix, and the structural similarity matrix between the source model and the target model. Finally, this paper designs experiments to verify the performance of the model. The research results show that the model constructed in this paper is effective.

Then, the new model component parameters $\mu_k^{t+1}$ and $\Sigma_k^{t+1}$ of each node are calculated to obtain the new model component; the mixing coefficient $\alpha_k^{t+1}$ of the new model component is then obtained, and the new Gaussian mixture model density function is computed. The generated key is the iteration number, and the key-value pair with the Gaussian mixture model density as the value is output to the next stage.

Figure 3: Iterative process of distributed greedy EM algorithm.

Figure 9: Comparison diagram of scores of action correction opinions.