Analysis and Optimization of Flute Playing and Teaching System Based on Convolutional Neural Network

Influenced by cultural background, economic development, social system, education system, and other factors, there is still a big gap in flute teaching between Chinese colleges and universities and those of developed countries, and even between China and its neighbors South Korea and Japan. Because of its local perception and weight-sharing structure, the convolutional neural network is closer to the biological neural network in the real world. The weight-sharing structure reduces the complexity of the neural network, which avoids complicated feature extraction and classification during data reconstruction. This paper studies the analysis and optimization of a flute playing and teaching system based on a convolutional neural network. By applying local receptive fields and parameter sharing at the same time and adding multiple filters, the network not only effectively reduces the number of parameters but also extracts features layer by layer. In the process of convolution, the size of the feature maps obtained by each layer decreases layer by layer, while their number increases gradually. Based on an analysis of the problems faced by flute performance teaching, this paper puts forward corresponding solutions in order to help flute performance teaching in China achieve better results.


Introduction
The flute is not a traditional Chinese musical instrument but was introduced from Western countries. The flute is light and portable. There is a group of loyal fans in China. Some higher education programs have also set up flute-related courses so that students can receive professional and systematic flute performance training [1]. Influenced by cultural background, economic development, social system, education system, and other factors, there is still a big gap in flute teaching between Chinese institutions and those of developed countries, and even between China and its neighbors South Korea and Japan. At the same time, there are not many excellent Chinese flute teachers, let alone very systematic training and teaching. As a result, most Chinese flute learners have weak basic skills, including poor finger flexibility, inaccurate sound production, and incorrect breathing methods [2,3]. Vibrato ("kneading" the sound) is one of the most important skills in flute playing. The correct and reasonable use of vibrato can not only beautify the timbre of the flute and improve and enrich its artistic expression and appeal but also reflect the artistic style of the flutist to a certain extent. Therefore, vibrato is a skill that everyone who studies the flute must master. In current universities and some training institutions, the teaching quality of flute performance is uneven. In order to enable students to pass grade examinations, teachers and students of many music schools and training institutions pay attention only to teaching progress and ignore teaching practice [1].
Based on a convolutional neural network, the personalized music recommendation system designed in this paper starts with the analysis of the characteristic data of flute performance. Such a system not only brings great convenience to music users but is also a goal that every music software provider hopes to achieve, so it has important research significance and broad application prospects [2,4]. Before 2006, the development of artificial neural networks can be roughly divided into two periods. In 1943, McCulloch and Pitts put forward the earliest artificial neuron, which has the ability to learn. This was the beginning of the artificial neural network. During this period, research focused on learning algorithms for a single neuron [4]. The convolutional neural network is closer to a real-world biological neural network because of its local perception and weight-sharing structure. The weight-sharing structure reduces the complexity of the neural network, which avoids complicated feature extraction and classification during data reconstruction. In the mid-1980s, Nobel Prize winner John Hopfield proposed the Hopfield neural network model, which is dynamic and may be used to solve complex problems [5][6][7]. At the same time, the back-propagation algorithm of the multilayer feedforward neural network was rediscovered. In this paper, the convolutional neural network is used to analyze the flute performance and teaching system. In machine learning, the convolutional neural network is a deep feedforward artificial neural network. Its artificial neurons respond to surrounding units within their receptive field, which makes the network suitable for analyzing flute performance and teaching. It has been successfully applied to music recognition [7][8][9][10].
A convolutional neural network differs from an ordinary neural network in its structure: in addition to an input layer and an output layer, it includes multiple hidden layers. Each layer is composed of several two-dimensional planes, and each two-dimensional plane is composed of many independent neurons. The convolutional neural network has powerful feature extraction and feature learning capabilities. Starting from the initial features of the input, it learns intermediate features during multilayer convolution and finally learns the high-level features useful to the flute performance and teaching system [11][12][13][14][15][16]. Convolutional neural networks have been widely used in flute performance and teaching systems. However, this paper mainly studies a flute performance and teaching system based on a convolutional neural network that is essentially a hybrid recommendation model based on music content and users' historical behavior [17][18][19][20][21]. The music features extracted from the audio signal express the essential characteristics of the music, which not only fits people's intuitive perception of music better but also effectively avoids the cold start problem. This paper describes the components involved in implementing the recommendation algorithm, such as the latent factor (latent semantic) model, matrix decomposition, audio feature representation, and the architecture design of the convolutional neural network model, in order to achieve a better recommendation effect [22][23][24][25][26]. By simultaneously applying local receptive fields and parameter sharing in a convolutional neural network and adding multiple filters, the network not only effectively reduces the number of parameters but also extracts features layer by layer. In the process of convolution, the size of the feature maps obtained by each layer decreases layer by layer, while their number increases gradually [27][28][29][30][31].
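The effect of local receptive fields and parameter sharing on the parameter count can be illustrated with a small calculation. The sketch below is not the network used in this paper: the 128 x 128 single-channel input, the 3 x 3 kernels, the stride of 2, and the filter counts are hypothetical values chosen only for illustration.

```python
def dense_params(in_h, in_w, in_ch, out_units):
    """Weights plus biases of a fully connected layer on a flattened input."""
    return in_h * in_w * in_ch * out_units + out_units


def conv_params(k_h, k_w, in_ch, n_filters):
    """Weights plus biases of a convolutional layer (independent of input size)."""
    return k_h * k_w * in_ch * n_filters + n_filters


def conv_output_shape(in_h, in_w, k_h, k_w, stride=1):
    """Spatial size of a convolution output without padding."""
    return ((in_h - k_h) // stride + 1, (in_w - k_w) // stride + 1)


if __name__ == "__main__":
    # Hypothetical 128 x 128 single-channel spectrogram patch.
    print("dense layer, 32 units:", dense_params(128, 128, 1, 32))
    print("conv layer, 32 filters of 3 x 3:", conv_params(3, 3, 1, 32))

    # Feature maps shrink while their number grows (assumed filter counts).
    size, channels = (128, 128), 1
    for n_filters in (16, 32, 64):
        size = conv_output_shape(size[0], size[1], 3, 3, stride=2)
        channels = n_filters
        print("feature maps:", channels, "of size", size)
```

With these assumed values, the shared 3 x 3 kernels need a few hundred parameters where a fully connected layer of the same width needs several hundred thousand, and each strided layer yields smaller but more numerous feature maps.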

Research Status at Home and Abroad.
Lv J, Sun Q, and Li Q put forward that vibrato (kneading the sound) is an important skill not only for the flute, oboe, bassoon, saxophone, and other woodwind instruments but also for brass instruments such as the horn, trumpet, and trombone. For stringed instruments, playing without vibrato is even more unthinkable. Guo Y, He Y, and Song H pointed out that, compared with foreign students of the same age, most flute students in China have had basically no systematic training, their playing skills are not up to standard, and many students have not correctly understood and grasped the style of Western music. Yao P A, Rh A, and Hy A pointed out that the flute is one of the most important wind instruments and that it depends mainly on control of the mouth and playing posture to complete the performance of a whole piece of music. The whole process requires not only good listening but also coordinated movements to control the breath. Alvarez J and Leon A pointed out that, for basic flute performance teaching in colleges, the ability level of teachers must be addressed first: only teachers with the relevant professional qualities can complete the teaching task well. The difficulty of basic flute playing is relatively high, so more attention should be paid to it. Xu C, Yang J, and Lai H pointed out that there are not many excellent flute teachers in China, which has led many teachers to pay little attention to the cultivation of students' music literacy, basic skills, and even playing style. In short, Chinese flute teaching currently faces many problems, which is also an important factor restricting the sustainable development of the flute in China. Bl A, Adfc B, and Rc C pointed out that, as far as China's domestic situation is concerned, many colleges and universities regard basic flute performance as one aspect of students' comprehensive quality education.
However, because of limitations in the schools' own level, the lack of enough excellent teachers leaves the teaching staff understrength, and teachers' comprehensive quality cannot keep up, which directly affects the teaching effect of basic flute performance. Ishino M points out that, since new media technology has been combined with flute teaching, many teachers have used new media to explain music theory and the essentials of sound production, which has helped flute lovers make breakthroughs in playing technique. Meenakshi, Khosla, and Keith pointed out that many colleges and universities in China currently regard flute playing as one aspect of students' comprehensive quality education. However, many schools are limited by their own circumstances: they have no professional teachers, or their teachers are not very strong, and the overall level of the teaching staff is not very high. This directly leads to a poor teaching effect, so that most students do not receive systematic training, which is very unfavorable for their healthy development in the future. Satar H. M. and Wigham CR put forward that basic flute performance teaching in contemporary China is a continuation of the traditional teaching mode, which is relatively inflexible in teaching form.
Therefore, it is necessary to make corresponding reasonable plans in teaching. According to Guizzo E, Weyde T, and Tarroni G, new media technology should be used as an auxiliary teaching method. However, many flute teachers rely too much on new media technology in teaching and do not pay enough attention to students' performance technique, so many students fail to master the key points of performance and make technical mistakes when playing.

Research Status of Flute Playing and Teaching System Based on Convolutional Neural Network.
This paper studies the analysis and optimization of a flute performance and teaching system based on a convolutional neural network.
There are two main aspects in the cultivation of basic flute performance skills. First, we need to improve students' music literacy and performance skills. Music literacy is the first thing a student needs in order to learn music, followed by proficient and rich performance skills, which also affect students' ability and occupy an important position in many respects. Second, we need to improve students' mastery of breathing skills. A correct breathing method is an important basis for mastering the basic flute playing methods. It is very important for a flute player, because it directly affects the accuracy of sound quality during playing. Therefore, in order to provide students with systematic training and high-quality teaching, we must strengthen the cultivation of teachers, improve teachers' professional skills, and ensure that teachers have high music literacy. A teaching system based on a convolutional neural network is delivered virtually: it supports teaching and learning anytime and anywhere, without the limitations of time and space. This advantage is very helpful for flute performance teaching, because it makes it possible to guide students continuously, give them more systematic training, and improve the classroom quality of flute performance teaching, so as to lay a good foundation for students' sustainable development.

Principle and Model of Convolutional Neural Network
A convolutional neural network is a kind of feedforward neural network with convolution calculation and a deep structure. Research on convolutional neural networks can be traced back to the 1980s and 1990s. TDNN and LeNet-5 are generally regarded as the earliest convolutional neural networks. In the system studied here, teachers and administrators can log in to the management end of the teacher system, upload exam tracks, set scoring weights, maintain school and student information, and create exams. The overall flow chart of the flute performance and educational art quality monitoring system based on a convolutional neural network is shown in Figure 1.
From the perspective of candidates, the basic operation process can be divided into the following five steps:

Figure 1: Flow chart of flute playing teaching system based on convolutional neural network.

Mathematical Problems in Engineering
③ Candidates prepare for audition.
④ Candidates perform formally and can adjust according to the beat prompt.
⑤ Data are submitted to the background server.

The convolutional neural network can supplement the information sources for the flute performance and teaching system and, to a certain extent, alleviate the common cold start, sparsity, and scalability problems of recommendation systems, so that it better meets increasingly strong demands for personalized applications. The overall system design block diagram is shown in Figure 2.
As can be seen from Figure 2, the system mainly includes a user modeling module, a music feature extraction module, and a recommendation algorithm module. The user modeling module is mainly used to collect the historical behavior data of music users in the system and construct the user preference feature model. The music (performance) feature extraction module is mainly used to preprocess the performance content and extract the spectrum features, so as to prepare for training the convolutional neural network to obtain the regression model that predicts the latent features of flute performance. The recommendation algorithm module is mainly used to calculate the matching degree between users and music from the latent features of flute performance predicted by the regression model, combined with the user preference features, and finally to generate a recommendation list of music objects that users may be interested in. The basic characteristics used in music processing include pitch, loudness, and timbre. Pitch is the most intuitive parameter that people can perceive. It is determined by the frequency of the sound signal, and its unit is hertz. The higher the pitch, the sharper the sound feels. This is why girls' voices generally sound sharper than boys' voices in daily life. Loudness intuitively reflects the intensity of the sound and is measured in decibels. During network model training, the difference between the output result and the real label is measured by the loss function. The essence of parameter training is to find an optimal parameter set in the parameter space that minimizes the difference between all output results and the corresponding real labels. The convolution layer mainly completes the feature extraction of the flute performance and teaching system. The convolution kernel is convolved with the input features of the flute performance.
When the convolution kernel is working, it slides along the horizontal and vertical directions of the flute performance input with a certain stride, sweeping the input features regularly at each step. Within the receptive field, the elements of the input and the elements of the filter at the corresponding positions are multiplied and then summed, and finally the bias term is added. The result is placed in the output feature map at the position corresponding to the convolution kernel. When the sliding ends, the complete output feature map of the flute performance input is obtained. Among the many loss functions, two are commonly used with convolutional neural networks: the mean square error loss function and the cross-entropy loss function.
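The sliding multiply-sum-add-bias operation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `conv2d` and the toy input are hypothetical, no padding is applied, and, as in most deep learning libraries, the kernel is not flipped (strictly a cross-correlation).

```python
def conv2d(image, kernel, bias=0.0, stride=1):
    """Slide `kernel` over `image` (lists of lists) and return the feature map.

    At each position the overlapping elements are multiplied, summed,
    and the bias term is added. No padding; the kernel is not flipped.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = bias
            for m in range(k_h):
                for n in range(k_w):
                    acc += image[i * stride + m][j * stride + n] * kernel[m][n]
            row.append(acc)
        feature_map.append(row)
    return feature_map


if __name__ == "__main__":
    # Toy 3 x 3 input and 2 x 2 kernel, chosen only for illustration.
    toy_input = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    toy_kernel = [[1, 0], [0, 1]]
    print(conv2d(toy_input, toy_kernel, bias=1.0))
```

A time-frequency spectrum of a flute recording would take the place of `toy_input` in practice, with one such kernel per output feature map.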
This feature gives the convolutional neural network an obvious advantage in processing grid- or matrix-structured data such as images. Because a time-frequency spectrum can be extracted from an audio signal, the convolutional neural network is increasingly applied to the identification and processing of audio signals. The two key steps in a convolutional neural network are the convolution operation and the pooling operation. The expression of the convolution operation is

$$s(t) = (x * w)(t) = \sum_{a} x(a)\, w(t - a)$$

In a convolutional neural network, $x(t)$ is the input feature, $w(t)$ is the convolution kernel, and $s(t)$ is the feature map. When processing two-dimensional matrix data, the above formula can be written as

$$s(i, j) = (x * w)(i, j) = \sum_{m} \sum_{n} x(m, n)\, w(i - m, j - n)$$

The mean square error loss function, also known as the square loss function, uses the Euclidean distance to characterize the difference between the output value and the label value, and the expression is

$$l = \frac{1}{2N} \sum_{x} \left\| y(x) - a^{L}(x) \right\|^{2}$$

where $l$ is the loss, $x$ is the input sample, $a^{L}(x)$ is the final output of the network model, $y(x)$ is the label value, and $N$ is the number of samples. Minimizing the loss function is the goal of parameter optimization. The general optimization method is to train the parameters by back-propagating the loss. Taking the number of samples $N = 1$ and writing $z = wx + b$ and $a = \sigma(z)$, the derivatives with respect to the learnable parameters $w$ and $b$ in the network are

$$\frac{\partial l}{\partial w} = (a - y)\, \sigma'(z)\, x, \qquad \frac{\partial l}{\partial b} = (a - y)\, \sigma'(z)$$
where $\sigma'(z)$ is the gradient of the activation function $\sigma(z)$ with respect to the neuron's pre-activation value $z$. From the above two equations, the rate at which the learnable parameters are updated is directly proportional to $\sigma'(z)$. When $\sigma'(z)$ is small, parameter training is slow, and the network is difficult to converge. Another commonly used loss function is the cross-entropy function. Cross-entropy is generally used to characterize the similarity between the probability distributions of two sample sets. The expression of the cross-entropy loss function is

$$l = -\frac{1}{N} \sum_{x} \left[ y \ln a + (1 - y) \ln (1 - a) \right]$$

where $l$ is the loss, $x$ is the input sample, $a$ is the output, $y$ is the label value, and $N$ is the number of samples. Taking the number of samples $N = 1$, the derivatives with respect to the learnable parameters $w$ and $b$ in the network are

$$\frac{\partial l}{\partial w} = x\, (\sigma(z) - y), \qquad \frac{\partial l}{\partial b} = \sigma(z) - y$$
where $\sigma(z)$ is the activation function applied to the neuron's pre-activation value $z$. From the above formula, it can be seen that the rate at which the parameters $w$ and $b$ are learned is proportional to $\sigma(z) - y$ but independent of the derivative $\sigma'(z)$. Because this rate is proportional to the output error $\sigma(z) - y$, learning is fast when the error is large and slows down as the output approaches the label, which helps prevent the parameters from overshooting the optimal solution. This advantage makes the cross-entropy loss function more widely used than the mean square error loss function in training CNN parameters. The momentum algorithm adds a momentum parameter to the SGD algorithm, so that each parameter update considers not only the current gradient but also an exponentially decaying accumulation of previous gradients. Specifically, the SGD parameter update is rewritten as

$$v \leftarrow \alpha v - \eta\, g, \qquad \theta \leftarrow \theta + v$$

where $v$ is the velocity parameter, which is updated together with the unbiased gradient estimate $g$, $\theta$ denotes the learnable parameters, $\eta$ is the learning rate, and $\alpha$ is the momentum parameter.
In this way, we can ensure stable convergence and reduce oscillation while training better. In application, the momentum parameter $\alpha$ is usually selected as 0.9 or 0.99.
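The two observations above, that the mean square error gradient is scaled by the activation derivative while the cross-entropy gradient is not, and that momentum accumulates past gradients, can be checked numerically. The following is a minimal sketch for a single sigmoid neuron; the sigmoid activation, the function names, the learning rate of 0.1, and the sample values are assumptions made for illustration, not settings reported in this paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_w(x, y, w, b):
    """Gradient of 0.5 * (a - y)^2 with respect to w, where a = sigmoid(w*x + b)."""
    a = sigmoid(w * x + b)
    return (a - y) * a * (1.0 - a) * x  # contains sigma'(z) = a * (1 - a)


def ce_grad_w(x, y, w, b):
    """Gradient of -[y ln a + (1 - y) ln(1 - a)] with respect to w."""
    a = sigmoid(w * x + b)
    return (a - y) * x  # sigma'(z) cancels out


def momentum_step(theta, velocity, grad, lr=0.1, alpha=0.9):
    """One momentum update: v <- alpha * v - lr * grad, theta <- theta + v."""
    velocity = alpha * velocity - lr * grad
    return theta + velocity, velocity


if __name__ == "__main__":
    # Saturated neuron with a wrong output (hypothetical values): z = 10, label 0.
    x, y, w, b = 1.0, 0.0, 5.0, 5.0
    print("MSE gradient:", mse_grad_w(x, y, w, b))
    print("cross-entropy gradient:", ce_grad_w(x, y, w, b))

    # A few momentum updates on w using the cross-entropy gradient.
    v = 0.0
    for _ in range(3):
        w, v = momentum_step(w, v, ce_grad_w(x, y, w, b))
    print("w after 3 momentum steps:", w)
```

For the saturated neuron the mean square error gradient is orders of magnitude smaller than the cross-entropy gradient, which is the slow-convergence effect described in the text.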

Analysis and Optimization of Flute Performance and Teaching System Based on Convolutional Neural Network.
Teachers can organize students to take part in discussions in their spare time. Students should raise anything they do not understand in the teaching process promptly, so that teachers and students can discuss it together and put forward countermeasures and reasonable suggestions. In this way, the various problems in teaching basic flute performance can be solved effectively, because making plans according to students' level of mastery also keeps teachers from making unreasonable training arrangements. The analysis and optimization of the flute playing and teaching system based on a convolutional neural network mainly cultivates students' flute playing skills in two respects. First, the cultivation of good sound quality is strengthened. Both teachers and students must pay attention to good sound quality, which is jointly affected by the player and the musical instrument. The player's own factors, in turn, are closely related to the player's experience, skills, and methods. Second, the cultivation of students' breathing methods is strengthened. With a system based on the convolutional neural network, information is transmitted through the network, and a large amount of information can reach the receiver in a very short time. It is precisely because of this advantage that the convolutional neural network is more and more favored by teachers in flute teaching. However, the convolutional neural network is good at theoretical teaching, while actual playing skills still need to be taught hand-in-hand by teachers: simply by watching slides or videos, students cannot really master the performance methods. Therefore, in teaching, teachers should integrate theory and practice to comprehensively improve students' performance level.
Using a convolutional neural network in flute teaching requires research and analysis based on the actual situation of students, including detailed analysis of the teaching content and class management, reasonable arrangement of courses, and innovative teaching methods, so as to improve students' learning ability and teachers' teaching level. At present, people agree that the most reasonable breathing method is thoracoabdominal (chest-abdominal) breathing, which fully meets flute players' requirements for playing and breathing. When students use thoracoabdominal breathing while playing, the amount of air inhaled is relatively large, and the intercostal muscles and diaphragm take part in the breathing process, so that the evenness of the breath can be controlled to the greatest extent and good sound quality is ensured, which helps flute players play better. According to the teaching requirements of basic flute performance based on a convolutional neural network, all students should be trained in basic performance methods after entering the school. Through this training, the performance of each student should be observed, and timely correction should be given according to the different conditions of students, so as to help students develop good performance methods. Different teaching methods are established according to the observed progress of students. At the same time, students' learning attitude is corrected, so as to lay a solid foundation for each student to learn flute playing.

Experimental Results and Analysis.
If a school wants to provide students with systematic training and high-quality teaching, it is necessary to strengthen the construction of the teaching staff, improve its quality, improve teachers' professional skills, and raise teachers' teaching level, so as to cultivate students' abilities and improve their knowledge and playing skills with a brand-new model. In the end-to-end neural network structure, the first layer can learn a primary feature representation of the input signal, which is equivalent, in its effect on the final classification performance, to the primary features obtained when a traditional time-frequency transform is used as input. From a certain point of view, the positive influence of a convolutional neural network on flute teaching and performance is not limited to the improvement of technology and communication mode but also extends to traditional teaching ideas. The hardware environment of the experimental platform is an Intel i7-7800X CPU, clocked at 3.5 GHz with a turbo frequency of 4.0 GHz, 6 cores and 12 threads, and 15 GB of memory, and the graphics card is a dual GPU of NVIDIA GTX 2080. Each time frame with a length of 8194 is segmented by a sliding window before features are extracted with logarithmic frequency-domain filter banks. Here, we use a cosine window; applying a window function helps reduce the spectrum leakage caused by the boundary effect. The length of the window is 2045 sampling points, and the stride of window movement is 253 sampling points to prevent the loss of boundary information.
Then, (8193-2046)/255 + 1 = 25 TXs per frame, that is, pT = 25 in the previous section, and the length s = 2047. The feature size changes in the whole process are shown in Table 1.
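The window count in the expression above can be reproduced with integer arithmetic. The sketch below is only illustrative: it takes the frame length, window length, and stride from that expression as given, rounds the quotient down, and assumes that the "cosine window" is a Hann (raised-cosine) window, which the text does not confirm. The function names are hypothetical.

```python
import math


def num_windows(frame_len, win_len, hop):
    """Number of full windows that fit in one frame (quotient rounded down)."""
    return (frame_len - win_len) // hop + 1


def hann_window(n):
    """Raised-cosine (Hann) window of length n; assumed form of the cosine window."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def split_frame(frame, win_len, hop):
    """Cut one frame into overlapping windowed segments."""
    window = hann_window(win_len)
    count = num_windows(len(frame), win_len, hop)
    return [
        [frame[k * hop + i] * window[i] for i in range(win_len)]
        for k in range(count)
    ]


if __name__ == "__main__":
    # Values taken from the expression in the text.
    print(num_windows(8193, 2046, 255))

    # Placeholder frame of constant samples, used only to check the shapes.
    segments = split_frame([1.0] * 8193, 2046, 255)
    print(len(segments), len(segments[0]))
```

Because the stride is much smaller than the window length, neighbouring segments overlap heavily, so samples attenuated near one window's edge are near the centre of another.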
In order to explore how the primary features extracted by the log-frequency filter bank with manually defined weights influence the classification effect, we replaced the log-frequency filter bank with a double-layer ReLU network to extract the primary features of audio as a comparative experiment. The double-layer ReLU network can be regarded as a set of filter banks whose weights need to be learned. Because these weights are learned from random initialization, it is uncertain whether they show a topological structure sorted from low frequency to high frequency, like logarithmic frequency-domain filter banks; it is also possible that they learn a self-organizing-map-like topological structure in the parameter space, as shown in Table 2.
Table 2 reports the pitch recognition precision P, recall R, and F1 score of the recognition model under different frame lengths when the two-layer network and the filter bank, respectively, are used to extract the primary features of audio. These three metrics are also the multipitch estimation criteria used by MIREX (the Music Information Retrieval Evaluation eXchange). With the convolutional neural network, it is faster and more convenient for teachers to convey the teaching content and the points that need attention in playing. Information can be transmitted in real time, without the limitations of time and space, which brings new opportunities to flute teachers. At the same time, flute teachers can take advantage of the opportunities brought by new media to reform the traditional teaching mode, make new media serve flute teaching, combine the two effectively, and construct new teaching methods. Through multimedia, students can study the different styles of different performers, understand the basics of performance, reflect on what they have seen, take the essence and discard the dross, and gradually form their own unique performance style, which is continually refined and revised in actual performance until it is recognized by the audience.
In order to verify the feasibility of a convolutional neural network and measure the quality of the recommendation results generated by the model, the recommendation accuracy under different recommendation list lengths was tested experimentally. Three experiments were carried out for comparison. In the experiments, the recommendation list was set to lengths of 15, 20, 25, 30, 35, 40, 45, and 50, and the quality of the recommendation list was quantitatively evaluated by accuracy, recall, and F1 value.
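For reference, the three metrics for a recommendation list of a given length can be computed as follows. This is a generic sketch of precision (called accuracy in the text), recall, and F1 for a top-k list, not the evaluation code used in the experiments; the function name and the sample track identifiers are hypothetical.

```python
def precision_recall_f1_at_k(recommended, relevant, k):
    """Precision, recall, and F1 of the first k recommended items.

    recommended: ranked list of item ids produced by the recommender.
    relevant: items the user actually interacted with in the test set.
    """
    top_k = recommended[:k]
    relevant_set = set(relevant)
    hits = sum(1 for item in top_k if item in relevant_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical ranked list and ground truth.
    ranked = ["t1", "t2", "t3", "t4"]
    liked = ["t1", "t3", "t5"]
    for k in (2, 4):
        print(k, precision_recall_f1_at_k(ranked, liked, k))
```

Lengthening the list can only add hits, so recall never decreases as k grows, whereas precision tends to fall once the extra items are mostly misses.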
The recommendation results under different recommendation list lengths are shown in Figures 3-5. From the experimental results in Figures 3-5, it can be seen that the length of the recommendation list has a certain impact on the recommendation results: as the length of the recommendation list increases, the accuracy decreases, while the recall rate and F1 value increase. When the length of the recommendation list is 15, the highest accuracy rate is about 0.45, and the lowest recall rate is about 0.234. When the length of the recommendation list increases to 30, the accuracy rate decreases to about 0.45, and the recall rate increases to about 0.361, which basically conforms to the general law of recommendation systems. In order to show the effectiveness of the convolutional neural network more objectively, this paper selects other recommendation algorithms that can be implemented on the existing data sets for comparative experiments. Three experiments were conducted to test the accuracy, recall, and F1 value of different recommendation algorithm models, namely Funk SVD, user CF, and CB, under different recommendation list lengths. The experimental results are shown in Figures 6-8.
It can be seen from Figures 6 to 8 that under the same length of recommendation list, the recommendation results generated by a convolution neural network in this paper are better in accuracy, recall, and F1 value than the other three traditional methods.
This may be because the traditional recommendation algorithm models use only a sparse score matrix or the content of a single item for recommendation. The recommendation algorithm in this paper not only uses the historical behavior data of users' interaction with music but also introduces the characteristics of audio content through deep learning, and the deep convolutional neural network can better learn the characteristics of the data. When making teaching plans, flute teachers must take into account students' actual level of mastery and acceptance. For example, teachers can organize students to talk together, raise problems in the teaching process promptly, and put forward corresponding solutions and reasonable suggestions, which will effectively solve various problems in the flute teaching process and avoid unreasonable training arrangements. Of course, the convolutional neural network in this paper does not greatly improve the recommendation effect.
This is because the focus of this paper is to explore the feasibility of a convolutional neural network for music recommendation. At the same time, to mitigate the cold start problem of traditional recommendation algorithms, it supplements the information sources available to the music recommendation system. If more user and item attributes are integrated and the model is further improved, the overall performance of the recommendation system is expected to improve greatly.
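The way audio content can supply information for a track with no interaction history can be sketched as follows. The sketch assumes that matching is a dot product between a user's preference vector and the latent vector that the regression model predicts from a track's audio; the paper describes a matching degree but does not give its formula, so this scoring rule, the function names, and all vectors below are hypothetical.

```python
def match_score(user_factors, item_factors):
    """Assumed matching degree: dot product of user and item latent vectors."""
    return sum(u * v for u, v in zip(user_factors, item_factors))


def recommend(user_factors, predicted_item_factors, k):
    """Return the k item ids with the highest matching degree.

    predicted_item_factors maps item id -> latent vector predicted from audio,
    so a new track can be ranked without any user interaction data.
    """
    ranked = sorted(
        predicted_item_factors,
        key=lambda item: match_score(user_factors, predicted_item_factors[item]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical two-dimensional latent space.
    user = [1.0, 0.0]
    tracks = {
        "new_track_1": [0.2, 0.9],
        "new_track_2": [0.8, 0.1],
        "new_track_3": [0.5, 0.5],
    }
    print(recommend(user, tracks, k=2))
```

In this sketch none of the tracks needs a rating history: the predicted latent vectors alone determine the ranking.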

Conclusions
To sum up, the teaching of flute performance still faces many difficulties in China. We must pay attention to it, strengthen the training of teachers, strengthen the training of students' basic performance skills, and correct students' learning attitude, so as to lay a good foundation for students' healthy development in the future. This paper uses a convolutional neural network to analyze and optimize a flute performance and teaching system. For music educators, an important topic is how to combine traditional teaching methods with new media technology to improve classroom efficiency, stay close to the reality of students' lives, and help students improve their mastery of basic theoretical knowledge and performance skills. Flute performance teachers should consider the positive and negative effects of new media technology, formulate reasonable teaching strategies, make use of the speed and simplicity of new media technology, give real-time guidance to students' learning, and help students' performance skills achieve a qualitative leap. Through this training, the performance of each student should be observed, and timely correction should be given according to the different conditions of students, so as to help students develop good performance methods.
Data Availability
The figures and tables used to support the findings of this study are included in the article.

Conflicts of Interest
The author declares that there are no conflicts of interest.