Research on the Optimization of the Physical Education Teaching Mode Based on Cluster Analysis under the Background of Big Data

With the rise of digital campuses and online learning platforms and the improvement of educational technology, the interaction between teachers and students has entered a new stage. Especially under the influence of the recently emerged CSCL (Computer Supported Collaborative Learning) and E-learning (Electronic Learning, i.e., networked digital learning), new technologies are changing the way people learn. Students' learning is no longer limited to the one-way absorption of knowledge taught by teachers; interaction between students and teachers, among students, and between students and the teaching environment increasingly appears in modern classrooms. This paper optimizes physical education management based on a clustering algorithm under the background of big data, implements all the functions of the algorithm in the Java language, and uses some small examples to verify the correctness of the fuzzy clustering algorithm. The algorithm reads a file through the character input streams FileReader and BufferedReader; the content of the file is the relevant information on the physical education mode. The implementation provides methods to calculate the cluster centers (i.e., the centers of mass), calculate the distances, correct the fuzzy classification matrix, display (i.e., output) the matrix, convert the result to a deterministic classification, and calculate the classification coefficient and the average fuzzy entropy. According to the data, if the previous average method is used as the evaluation basis instead of the FCM clustering method, the results are undesirable, whereas the evaluation results obtained with the FCM algorithm are more specific. From the analysis results, thematic teaching + basic teaching seems to be the most popular mode of physical education.


Introduction
With the rapid development of today's society, a large amount of data is generated by people's close communication. This era has promoted the rise of big data and the development of the Internet, and the amount of data keeps increasing. Cloud computing has made big data, which used to be difficult to use, easy to use. Data mining based on clustering algorithms has become the future development trend of physical education model optimization, and data mining technology will provide new innovation points for future education reform. Clustering is a process of dividing things without any prior knowledge. Through clustering, the similarity between things can be found, so that the results can be better revealed. First of all, what is meant by the quality of teaching should be clear: in addition to the evaluation of teaching facilities, teaching tools, and other hardware, the evaluation of teaching quality also includes the level of teaching and the seriousness of teaching. This paper selects the most widely used FCM clustering algorithm; however, it is difficult to achieve the optimum by relying only on the FCM algorithm, which also has many limitations.
This paper introduces some improved versions of the FCM algorithm. Popular improvements generally address the selection of the cluster centers and the adjustment of the cluster centers. After the program is improved, simulation experiments are needed to verify whether the algorithm is effective.
The physical education management model designed with the clustering algorithm in this paper mainly analyzes teaching data and student information, including the comprehensive evaluation of students, the evaluation of student performance, the evaluation of classroom teaching, student status, student achievement, student affairs, psychological health, and comprehensive assessment. Based on the clustered information, students can also be taught according to their aptitude [1][2][3][4][5][6][7][8][9]. The research results of this paper, combined with current teaching management methods, can reduce the unreasonable and unscientific defects caused by subjective factors in the previous evaluation system and can make teaching evaluation more scientific and reasonable.

Related Work
At present, the physical education teaching system in colleges and universities in our country is not standardized enough. The evaluation methods that have been used include AHP, the fuzzy comprehensive evaluation method, one-way analysis of variance, the neural network method, and so on. Although there are many teaching evaluation methods, we often cannot get the information we need from the evaluation data obtained, or what we finally find is superficial and worthless information. Therefore, we need to uncover hidden valuable information. Clustering methods are generally divided into hard clustering and soft clustering. Hard clustering is an algorithm in which the membership degree can only be 1 or 0. It takes less time and can draw conclusions quickly. However, it also has significant shortcomings: because it ignores the connections between the data, the results obtained can deviate considerably from the correct results. The fuzzy clustering algorithm requires the sum of the membership degrees of each data point to be 1. This constraint greatly improves the correctness of the clustering results, because the algorithm considers not only the membership of the data but also the influence of noisy data on the clustering results. It can avoid the influence of noisy data on the final result and is suitable for cluster analysis with noisy data. Research on the optimization of teaching modes has proceeded through improvements of the method. In the early twenty-first century, the main research direction was to use more complex mathematical models to process teaching information, and the fuzzy comprehensive evaluation method is a typical example. Some domestic scholars have proposed methods based on UML and KDD. They have carried out detailed research on the subject and process of teaching, but this work is still at the stage of theoretical verification and far from practical application.
Data mining technology integrates fuzzy mathematics, statistics, machine learning, logical reasoning, and many other fields. It has now received extensive attention in many fields such as business, finance, medical care, education, Internet, and government. Based on the background of big data, this paper conducts a cluster analysis on the physical education teaching mode and optimizes it through quantitative analysis based on the current physical education teaching mode [10][11][12][13][14].

Big Data.
The definition of the connotation of big data mainly focuses on three aspects. As shown in Table 1, the connotation of big data can be summarized in terms of the data set, the technical system, and the way of thinking. First, from the perspective of data sets, big data is an information data set with enormous value. Unlike previous data sets, it is characterized by the diversification and complex association of data objects. This is because the data sources of big data include not only the regular data of information systems but also scraped network log data and original user content. The emergence of new mobile devices such as mobile phones and tablets, as well as sensors and the Internet of Things, has accelerated the diversification and rapid growth of data. Data forms include structured, semi-structured, and unstructured data, with semi-structured and unstructured data accounting for the majority. Structured data mostly refer to databases that have been organized in advance, while unstructured data refer to actively generated data such as videos, voice recordings, web logs, and original texts. Big data can also be called massive data. Second, from a technical point of view, big data is a technology for obtaining valuable information from massive data, comprising a series of technical systems such as new data storage, mining, data processing, data analysis, and data visualization technologies. The most widely used technologies in education are educational data mining, learning analysis technology, and technology applications in large-scale online education platforms. Third, in terms of mindset, big data has a broader meaning.
This study reflects the connotation of big data from this perspective with the help of the discussion in "The Age of Big Data". Big data is a worldview, a quantitative worldview holding that "the essence of the world is data"; big data is an empirical methodology, including three major theses: "not samples, but all data", "not causality, but correlation", and "not accuracy, but hybridity"; and it forms a big data value chain with massive data and a series of data technologies, which are intertwined vertically and horizontally to form a big data system. However, as far as its nature is concerned, big data is a technology for recognizing and solving problems, and its role depends on people's rational control over it. Massive data is the realistic basis and environment for generating and using the technology, analysis and other technologies are the paths by which its purpose is realized, and data thinking is the value load of the technical connotation, which specifies the purpose and means of the technology [15].

Cluster Analysis.
Clustering is a very common data mining method. General algorithms are good at dealing with spherical clusters without isolated points. The CURE algorithm can better handle clusters that are not spherical and can also better handle isolated points (outliers). It selects c points from a cluster and shrinks them toward the centroid by a shrinking factor; these points can better represent the shape of the cluster. Clustering is a kind of classification that does not rely on any prior knowledge but only on the characteristics of the data; it finds sets of objects where the density of objects in one area is higher than in other areas. In cluster analysis, outliers are a special class of points. Because outliers behave differently from other points, we want to avoid their impact on the cluster analysis, so they should be excluded during analysis to reduce this impact [16][17][18][19].
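The shrinking step described for CURE can be illustrated with a small Java sketch. It is only an illustration under stated assumptions: the class and method names (CureShrinkSketch, centroid, shrink) and the parameter alpha for the shrinking factor are hypothetical, and the selection of the representative points themselves is not shown.

```java
/** Sketch: shrinking CURE representative points toward the cluster centroid. */
final class CureShrinkSketch {

    /** Component-wise mean of the points in one cluster. */
    static double[] centroid(double[][] points) {
        int dim = points[0].length;
        double[] mean = new double[dim];
        for (double[] p : points) {
            for (int d = 0; d < dim; d++) {
                mean[d] += p[d];
            }
        }
        for (int d = 0; d < dim; d++) {
            mean[d] /= points.length;
        }
        return mean;
    }

    /**
     * Moves each representative point a fraction alpha of the way toward the centroid.
     * alpha = 0 leaves the points unchanged; alpha = 1 collapses them onto the centroid.
     */
    static double[][] shrink(double[][] representatives, double[] center, double alpha) {
        double[][] shrunk = new double[representatives.length][center.length];
        for (int r = 0; r < representatives.length; r++) {
            for (int d = 0; d < center.length; d++) {
                shrunk[r][d] = representatives[r][d] + alpha * (center[d] - representatives[r][d]);
            }
        }
        return shrunk;
    }

    public static void main(String[] args) {
        double[][] cluster = {{0, 0}, {4, 0}, {4, 4}, {0, 4}};
        double[] center = centroid(cluster);            // the centroid is (2, 2)
        double[][] reps = shrink(cluster, center, 0.5); // each point moves halfway to the centroid
        System.out.println(java.util.Arrays.deepToString(reps));
    }
}
```

Because the shrunken points lie closer to the centroid, a single outlier among the representatives pulls the cluster boundary less than it would otherwise.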
The fuzzy C-means clustering algorithm (FCM) is one of the most commonly used algorithms in cluster analysis and is widely applied in data analysis and pattern recognition. In essence, it is a hill-climbing method.
The FCM algorithm is a clustering algorithm based on fuzzy partitioning. Because the objective function of fuzzy clustering is not a convex function, a poor initialization causes the final result to converge to a local extreme point, and the resulting classification of the data is then not optimal. The algorithm is also very time-consuming when processing large amounts of data, which is a big disadvantage in practical industrial and scientific applications. The FCM algorithm measures the degree of clustering by the membership function. Its objective function evolved from that of HCM; applying fuzzy theory to that clustering algorithm yields the fuzzy C-means clustering algorithm, that is, the FCM algorithm. A difficulty of the FCM algorithm is the selection of the value of c: when the distribution of the data is unknown, it is difficult to determine c based on experience alone.
Many scholars have improved the FCM algorithm, and the FCM algorithm based on information entropy is one of these improvements. It helps to reduce the error generated during iteration and improves the efficiency of the system, and a weighting coefficient helps to improve the accuracy of the initial values of the cluster centers. The FCM algorithm based on information entropy uses information entropy to initialize the cluster centers. After initialization, the number of cluster centers can be obtained, which has a great effect on reducing errors. The resulting algorithm runs efficiently and greatly reduces the possibility of a local optimum caused by improper selection of the initial values in the traditional fuzzy C-means clustering algorithm. The fuzzy C-means clustering algorithm based on entropy weighting evolves from the above algorithm by introducing a weighting coefficient into the traditional fuzzy clustering algorithm. Such an algorithm can continuously reselect the cluster centers during clustering and bring them as close as possible to the actual cluster center positions.
Assume a set of n objects X = {x_1, ..., x_n}, each a p-dimensional vector. The clustering algorithm divides the objects into c clusters, and the centroid of the ith cluster is a p-dimensional vector v_i. The set of fuzzy classifications is defined as follows:

$$M_{fc} = \left\{ U \in R^{c \times n} \;\middle|\; u_{ik} \in [0, 1],\ \sum_{i=1}^{c} u_{ik} = 1 \ \forall k,\ 0 < \sum_{k=1}^{n} u_{ik} < n \ \forall i \right\}$$

where U is a matrix of c rows and n columns. The objective function is as follows:

$$J_m(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^m \, d_{ik}^2$$

Among them, $U \in M_{fc}$, $V \in R^{p \times c}$, and $m \in [1, \infty)$ is the weight coefficient. The distance between the element $x_k$ and the centroid $v_i$ of the ith cluster is as follows:

$$d_{ik} = \lVert x_k - v_i \rVert$$

The FCM algorithm obtains the fuzzy classification of the object set by iteratively optimizing this objective function.

Table 1: Connotation of big data.

Data set
Connotation: Very large data collections, which differ from typical databases in that their data objects and data forms are more diverse and complex, and which exceed the ability of traditional databases to acquire, store, manage, and analyze data.
In education: Massive teaching resources (text, video) and student learning behavior data, student learning logs, teacher-student interaction, student original data, etc.

Technology system
Connotation: Big data technology for mining the value of data: new data acquisition, data storage, data analysis, data interpretation, data warehouse, data query, data visualization technology, etc.
In education: Educational data mining and learning analysis technology, adaptive learning systems, MOOCs, and other open education platforms.

Way of thinking
Connotation: Quantitative worldview and quantitative empirical methodology, full-sample data analysis methods, pursuit of correlation, tolerance for hybridity.
In education: Educational quality quantification and evaluation thinking, etc.

Construction of the Optimization Model of Physical Education Teaching Mode Based on Cluster Analysis under the Background of Big Data
If each teacher is regarded as an observation object, the scores of all teachers form a matrix X, which is the data object. The optimization of the physical education management system is a fuzzy process, and the FCM algorithm is, to a certain extent, a mature algorithm, so we choose this fuzzy clustering algorithm for the teaching management system.

Model Construction.
We built the model shown in Figure 1 based on the above ideas, where A is the clustering process, X is the matrix of teacher scores, and c (the number of categories, initially set to 5) and m (the smoothing coefficient, initially set to 5) are parameters.

Example of Cluster Analysis Model Implementation.
The scores of each teacher on the 10 factors constitute a 10-dimensional column vector. The number of columns is the same as the number of teachers, so these vectors form a matrix X, which is used as the initial data for the clustering algorithm.
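Reading such a score matrix with FileReader and BufferedReader, the two classes named in the abstract, could look like the following minimal sketch. The file layout is an assumption made for illustration (one teacher per line, scores separated by whitespace), as are the class name ScoreFileReaderSketch, the method name readScores, and the placeholder path scores.txt. The sketch stores each teacher as a row, that is, the transpose of the column layout described above, because this is the more convenient array layout in Java.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch: loading teacher scores from a text file (assumed format: one teacher per line). */
final class ScoreFileReaderSketch {

    /** Returns one row per non-empty line; every row must have the same number of scores. */
    static double[][] readScores(String path) throws IOException {
        List<double[]> rows = new ArrayList<>();
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] tokens = line.split("\\s+");
                double[] scores = new double[tokens.length];
                for (int j = 0; j < tokens.length; j++) {
                    scores[j] = Double.parseDouble(tokens[j]);
                }
                if (!rows.isEmpty() && rows.get(0).length != scores.length) {
                    throw new IOException("Inconsistent number of scores in line: " + line);
                }
                rows.add(scores);
            }
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) throws IOException {
        // "scores.txt" is a placeholder; pass the real path as the first argument
        double[][] x = readScores(args.length > 0 ? args[0] : "scores.txt");
        System.out.println(x.length + " teachers read");
    }
}
```

Rejecting lines with a different number of scores early keeps malformed input from surfacing later as an index error inside the clustering code.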

Clustering Algorithm
(1) Algorithm steps.
① Each teacher is a sample point in the 10-dimensional space defined by the system, and each such sample point is initialized as a class of its own before the program runs.
② The distance between the sample points and the classes is calculated. In this system, we choose the Euclidean distance.
③ The two points with the smallest distance are defined as a cluster, and the cluster center is set to the midpoint of these two points.
④ Steps ② and ③ are repeated until the cluster centers become stable.
(2) The flowchart of the clustering algorithm is shown in Figure 2.
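The steps above can be sketched in Java as follows. This is a minimal illustration, not the paper's original listing. Since the text does not spell out the stability test of the last step, the sketch assumes that merging stops once a requested number of clusters (the hypothetical parameter targetClusters) remains. The class and method names are also hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the merge-closest-pair clustering described in the steps above. */
final class SystematicClusteringSketch {

    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        }
        return Math.sqrt(sum); // Euclidean distance
    }

    /** Returns, for every sample, the index of the cluster it ends up in. */
    static int[] cluster(double[][] samples, int targetClusters) {
        if (targetClusters < 1 || targetClusters > samples.length) {
            throw new IllegalArgumentException("targetClusters must be between 1 and the sample count");
        }
        // Step 1: every sample starts as its own class, with itself as center.
        List<double[]> centers = new ArrayList<>();
        List<List<Integer>> members = new ArrayList<>();
        for (int i = 0; i < samples.length; i++) {
            centers.add(samples[i].clone());
            List<Integer> own = new ArrayList<>();
            own.add(i);
            members.add(own);
        }
        while (centers.size() > targetClusters) {
            // Step 2: find the pair of centers with the smallest Euclidean distance.
            int bestA = 0;
            int bestB = 1;
            double best = Double.POSITIVE_INFINITY;
            for (int a = 0; a < centers.size(); a++) {
                for (int b = a + 1; b < centers.size(); b++) {
                    double d = distance(centers.get(a), centers.get(b));
                    if (d < best) {
                        best = d;
                        bestA = a;
                        bestB = b;
                    }
                }
            }
            // Step 3: merge the pair; the new center is the midpoint of the two centers.
            double[] ca = centers.get(bestA);
            double[] cb = centers.get(bestB);
            double[] mid = new double[ca.length];
            for (int d = 0; d < mid.length; d++) {
                mid[d] = (ca[d] + cb[d]) / 2.0;
            }
            centers.set(bestA, mid);
            members.get(bestA).addAll(members.get(bestB));
            centers.remove(bestB); // bestB > bestA, so index bestA stays valid
            members.remove(bestB);
        }
        int[] labels = new int[samples.length];
        for (int c = 0; c < members.size(); c++) {
            for (int index : members.get(c)) {
                labels[index] = c;
            }
        }
        return labels;
    }
}
```

For example, calling cluster on the four points (0, 0), (0, 1), (10, 10), (10, 11) with targetClusters = 2 groups the first two points together and the last two together.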
Fuzzy Mean Clustering Algorithm: The fuzzy mean clustering algorithm used in this paper is more complicated than the systematic clustering algorithm. We set m as a fixed value and c as a variable.
(1) Algorithm steps:
① Set an initial c, and randomly select c samples from the sample points as the cluster centers;
② For all i, update the fuzzy classification matrix U(t) to U(t + 1);
③ Update the centroids V(t + 1) according to U(t + 1);
④ If the change between two successive iterations is no greater than ε, the algorithm terminates; otherwise, let t = t + 1 and go to step ② (where ε is set to 0.001).
(2) The flow chart of the algorithm is shown in Figure 3.
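The iteration can be sketched in Java as shown here. The sketch uses the standard fuzzy C-means update rules, u_ik = 1 / Σ_j (d_ik / d_jk)^(2/(m-1)) and v_i = Σ_k u_ik^m x_k / Σ_k u_ik^m, on the assumption that these are the rules the steps refer to. The stopping test (largest centroid shift no greater than ε), the iteration cap maxIterations, the random seed, and all class and method names are likewise assumptions for illustration. The sketch requires m > 1.

```java
import java.util.Random;

/** Minimal fuzzy C-means sketch; rows of x are samples, rows of v are centroids, u is c x n. */
final class FcmSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            sum += (a[d] - b[d]) * (a[d] - b[d]);
        }
        return sum;
    }

    /** Step 2: membership update for every sample k and cluster i (requires m > 1). */
    static double[][] updateMemberships(double[][] x, double[][] v, double m) {
        int c = v.length;
        double[][] u = new double[c][x.length];
        for (int k = 0; k < x.length; k++) {
            double[] d2 = new double[c];
            int coincident = -1; // index of a centroid lying exactly on sample k, if any
            for (int i = 0; i < c; i++) {
                d2[i] = squaredDistance(x[k], v[i]);
                if (d2[i] == 0.0 && coincident < 0) {
                    coincident = i;
                }
            }
            for (int i = 0; i < c; i++) {
                if (coincident >= 0) {
                    u[i][k] = (i == coincident) ? 1.0 : 0.0; // avoids division by zero
                } else {
                    double sum = 0.0;
                    for (int j = 0; j < c; j++) {
                        // squared distances are used, so the exponent is 1/(m-1) rather than 2/(m-1)
                        sum += Math.pow(d2[i] / d2[j], 1.0 / (m - 1.0));
                    }
                    u[i][k] = 1.0 / sum;
                }
            }
        }
        return u;
    }

    /** Step 3: each centroid is the mean of the samples weighted by u^m. */
    static double[][] updateCentroids(double[][] x, double[][] u, double m) {
        int p = x[0].length;
        double[][] v = new double[u.length][p];
        for (int i = 0; i < u.length; i++) {
            double weightSum = 0.0;
            for (int k = 0; k < x.length; k++) {
                double w = Math.pow(u[i][k], m);
                weightSum += w;
                for (int d = 0; d < p; d++) {
                    v[i][d] += w * x[k][d];
                }
            }
            for (int d = 0; d < p; d++) {
                v[i][d] /= weightSum;
            }
        }
        return v;
    }

    /** Runs steps 1 to 4 and returns the final membership matrix U. */
    static double[][] run(double[][] x, int c, double m, double epsilon, int maxIterations, long seed) {
        Random random = new Random(seed);
        int[] order = new int[x.length];
        for (int k = 0; k < order.length; k++) {
            order[k] = k;
        }
        double[][] v = new double[c][];
        for (int i = 0; i < c; i++) { // step 1: c distinct random samples as initial centroids
            int pick = i + random.nextInt(order.length - i);
            int tmp = order[i];
            order[i] = order[pick];
            order[pick] = tmp;
            v[i] = x[order[i]].clone();
        }
        double[][] u = updateMemberships(x, v, m);
        for (int t = 0; t < maxIterations; t++) {
            double[][] next = updateCentroids(x, u, m);
            double change = 0.0; // largest squared shift of any centroid
            for (int i = 0; i < c; i++) {
                change = Math.max(change, squaredDistance(v[i], next[i]));
            }
            v = next;
            u = updateMemberships(x, v, m);
            if (Math.sqrt(change) <= epsilon) {
                break; // step 4: the centroids are stable
            }
        }
        return u;
    }
}
```

A call such as FcmSketch.run(x, 5, 2.0, 0.001, 100, 42L) would cluster the rows of x into five fuzzy classes with ε = 0.001; the value m = 2.0 and the seed are example choices, not values taken from the text. Each column of the returned matrix sums to 1, as the fuzzy classification requires.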

Several Optimized Physical Education Teaching Modes.
Aiming at the problems that the prominent differences between categories of college students bring to education and teaching, this paper proposes a teaching mode optimization plan of classified teaching, that is, teaching students in accordance with their aptitude. That is to say, on the basis of knowing the teaching objectives and the groups to be taught, it is necessary to understand the groups to be taught in detail and analyze their category characteristics. Four types of students are involved in this paper, as shown in Table 2. Class A is active exploratory teaching + basic teaching: active exploratory teaching methods are added to the basic teaching model, which not only improves students' ability to learn independently but also helps divergent thinking, strengthens language organization and expression ability, and further deepens the understanding of theoretical knowledge.
Class B is guided teaching + basic teaching. Guided teaching methods are added to the basic teaching model to enhance students' learning initiative.
Class C is attraction-based teaching + basic teaching, which adds attraction-based teaching to the basic teaching method.
Class D is subject-based teaching + basic teaching, which adds subject-based teaching to the basic teaching method. With this way of teaching students in accordance with their aptitude and providing targeted education, it is expected to comprehensively promote the cultivation of college students' ideological awareness, behavior habits, knowledge, and skills.
The code to calculate the distance is as follows:
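A minimal Java sketch of such a distance calculation might look as follows. It assumes the Euclidean distance chosen earlier for this system; the class name DistanceSketch and the method names euclidean and distanceMatrix are hypothetical.

```java
/** Sketch: Euclidean distances between cluster centers and samples. */
final class DistanceSketch {

    /** Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Entry [i][k] is the distance between center i and sample k. */
    static double[][] distanceMatrix(double[][] centers, double[][] samples) {
        double[][] dist = new double[centers.length][samples.length];
        for (int i = 0; i < centers.length; i++) {
            for (int k = 0; k < samples.length; k++) {
                dist[i][k] = euclidean(centers[i], samples[k]);
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        // the classic 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.0
        System.out.println(euclidean(new double[]{0, 0}, new double[]{3, 4}));
    }
}
```

Arranging the distances with one row per cluster center matches the c-by-n shape of the fuzzy classification matrix, so the two matrices can be indexed in the same way.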

Calculating Cluster Centers.
The following code implements the calculation of cluster centers (also known as centroids).
Figure 9 shows the structure levels under different teaching modes.
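For the cluster-center calculation discussed in this section, a minimal Java sketch is given here. It assumes the standard fuzzy C-means rule, in which each center is the average of all samples weighted by their membership raised to the power m. The class name CentroidSketch, the method name centroids, and the small data set in main are hypothetical.

```java
/** Sketch: cluster centers as membership-weighted means (standard fuzzy C-means rule). */
final class CentroidSketch {

    /**
     * x holds one sample per row, u holds one cluster per row and one sample per column,
     * and m is the smoothing coefficient.
     */
    static double[][] centroids(double[][] x, double[][] u, double m) {
        int dims = x[0].length;
        double[][] v = new double[u.length][dims];
        for (int i = 0; i < u.length; i++) {
            double weightSum = 0.0;
            for (int k = 0; k < x.length; k++) {
                double w = Math.pow(u[i][k], m); // weight of sample k in cluster i
                weightSum += w;
                for (int d = 0; d < dims; d++) {
                    v[i][d] += w * x[k][d];
                }
            }
            for (int d = 0; d < dims; d++) {
                v[i][d] /= weightSum; // normalize by the total weight
            }
        }
        return v;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}};
        double[][] u = {{1.0, 0.5, 0.0}, {0.0, 0.5, 1.0}};
        // with m = 2 the middle sample has weight 0.25 in both clusters
        System.out.println(java.util.Arrays.deepToString(centroids(x, u, 2.0)));
    }
}
```

In the example, the first center is (1 + 0.25 · 3) / 1.25 = 1.4 in the first coordinate, so the half-member sample pulls the center only slightly away from the full member.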

Statistical Results of Teaching Evaluation.
The statistical results are shown in Table 3.

Results and Analysis.
The final output of the code is a number of non-fixed matrices composed of 1 and 0, obtained by clustering the input data. According to the obtained data, if the previous average method is used as the evaluation basis instead of the FCM clustering method, the following result appears: Category D has the highest score, and the focus should be on inquiry-based teaching + basic teaching. With FCM clustering, by contrast, the classification results are fixed: the evaluated indicators are clustered and a clustering threshold is set, which achieves the clustering effect and completes the fuzzy clustering. The method can also be used to evaluate the management mode of physical education teaching.
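The conversion of a fuzzy classification matrix into a matrix of 1 and 0, together with the classification coefficient and the average fuzzy entropy mentioned in the abstract, can be sketched in Java as follows. The sketch assumes that these two indices correspond to the usual partition coefficient, (1/n) Σ u_ik², and partition entropy, -(1/n) Σ u_ik ln u_ik, and that each sample is assigned to its cluster of largest membership. The class and method names are hypothetical.

```java
/** Sketch: summarizing a fuzzy classification matrix u (c rows, n columns). */
final class FuzzyPartitionSummarySketch {

    /** Deterministic classification: 1 for the cluster of largest membership, 0 elsewhere. */
    static int[][] toHardPartition(double[][] u) {
        int n = u[0].length;
        int[][] hard = new int[u.length][n];
        for (int k = 0; k < n; k++) {
            int best = 0;
            for (int i = 1; i < u.length; i++) {
                if (u[i][k] > u[best][k]) {
                    best = i; // ties keep the lower cluster index
                }
            }
            hard[best][k] = 1;
        }
        return hard;
    }

    /** Classification coefficient: mean over samples of the sum of squared memberships. */
    static double partitionCoefficient(double[][] u) {
        double sum = 0.0;
        for (double[] row : u) {
            for (double value : row) {
                sum += value * value;
            }
        }
        return sum / u[0].length;
    }

    /** Average fuzzy entropy: mean over samples of -sum(u * ln u), with 0 * ln 0 taken as 0. */
    static double averageFuzzyEntropy(double[][] u) {
        double sum = 0.0;
        for (double[] row : u) {
            for (double value : row) {
                if (value > 0.0) {
                    sum -= value * Math.log(value);
                }
            }
        }
        return sum / u[0].length;
    }

    public static void main(String[] args) {
        double[][] u = {{0.9, 0.2}, {0.1, 0.8}};
        System.out.println(java.util.Arrays.deepToString(toHardPartition(u)));
        System.out.println(partitionCoefficient(u));
        System.out.println(averageFuzzyEntropy(u));
    }
}
```

A classification coefficient close to 1 and an entropy close to 0 indicate a nearly crisp classification, while for fully ambiguous memberships the coefficient drops to 1/c and the entropy rises to ln c.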

Conclusion
The clustering algorithm is a very commonly used algorithm, and fuzzy clustering is a clustering algorithm that applies fuzzy theory. This paper first gives a brief overview of the background of big data and the development status of cluster analysis at home and abroad. It then introduces the operating environment and language used to develop the algorithm and presents some of the code. This paper studies the application of the fuzzy clustering algorithm to the teaching management system. Among the many clustering algorithms, the FCM clustering algorithm is the most widely used. It processes the physical education data by applying the principle of nonlinear programming; it is a fuzzy clustering method in which each iteration moves toward the minimum point. Although the FCM algorithm still has many imperfections, with its continuous improvement by scholars it has good prospects. The main task of this paper is to construct a new teaching mode optimization algorithm, improve the original physical education teaching mode, and propose a new cluster analysis model. Judging from the research results, thematic teaching + basic teaching seems to be the most popular mode of physical education.
Data Availability
The dataset can be accessed upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.