Research on Multimedia Music Teaching Based on Artificial Intelligence

In order to improve the efficiency of music teaching, this paper constructs a multimedia music teaching system based on artificial intelligence. Moreover, this paper focuses on the technical research of intraframe prediction, intraframe filtering, and transformation technology, respectively proposes the intraframe prediction method based on brightness change and the intraframe filtering technique based on the iterative update, and respectively proposes corresponding optimization algorithms. In addition, this paper proposes intraframe prediction technology based on brightness changes and intraframe filtering technology based on an iterative update, which brings improvements in coding performance, analyzes and verifies the experimental results, and proposes corresponding optimization algorithms, respectively, which saves coding and decoding time. Finally, this paper constructs a corresponding model structure based on actual needs, and analyzes the performance of this model through experimental research. The research results show that the teaching system constructed in this paper has better performance.


Introduction
At the same time as the progress of science and technology, online education has formed a certain scale, and the music discipline has introduced multimedia technology into music teaching without exception [1]. Since multimedia technology entered music teaching, great changes have taken place in music teaching methods. Some scholars have proposed that the successful use of multimedia teaching makes music teaching more intuitive, interesting, and visual than traditional methods [2]. It can be said that because students can participate interactively, multimedia technology can stimulate students' interest in learning more and make teaching courses alive. In the modern information society, the Internet is not just a simple teaching medium. e Internet can promote education reform [3]. For example, the development of student-centered teaching can fully realize the integration of personalized education, social, school, and family education. Education network application software development, and network design and construction are of great significance to the realization of network education courses. Online education is the main driving force for the establishment of modern information technology based on computer and network technology [4]. Moreover, online education will become the main form of online distance education, an important way to modernize higher education, and will also become a basic lifelong education system. erefore, it is necessary to make full use of the distance education methods of online education to better achieve the goals of education [5]. In recent years, many schools have established "online education universities," "distance education," and modern distance education institutions.
How to introduce modern educational technology and methods into modern distance education to reduce the pressure of shortage of educational resources and existing education shortages, and to adapt to the rapid growth of the demand for technical talents in our country is a new topic [6]. It is necessary to carry out an in-depth and meticulous research and bold exploration of this issue, and to establish a modern distance education system in China that meets the national conditions as soon as possible. e advantage of distance teaching is that it breaks through the limitations of time and space, and increases learning opportunities. It is conducive to expanding the scale of education, reducing teaching costs, sharing excellent teaching resources, realizing curriculum advantages, and building a lifelong education system [7]. Learners can use more abundant teaching and learning resources according to their own time, place, and style, which is also the need of current knowledge and social development [8]. e remote teaching system is used as a remote teaching platform to realize the basic teaching activities of the network live broadcast, so the research of the remote teaching broadcast system becomes very meaningful [9].

Related Work
For online education, the courseware editing system is a very important part, and it is also an extremely active research direction at home and abroad. Many different experimental and commercial products have appeared one after another. e first category is internationally mature and widely used commercial software such as PowerPoint, Authorware, Dreamweaver, and Flash. Most of this kind of courseware production software adopts the page editing mode, which has a certain degree of independence, and the formed courseware is relatively single and highly targeted, but not flexible enough. At the same time, this type of software has higher requirements for users, requires a certain amount of learning, and is relatively complicated to operate [10]. e second category is a large-scale network teaching system based on the Web. Currently, the most popular systems in the world, with relatively complete functions, are WebCT (Web Course Tools), Blackboard, and Moodle. ese three systems can provide teaching staff with various tools and services such as real-time teaching, interactive communication and teaching resource management, support a large number of users, and have a large market share [11].
is article takes WebCT as an example. It is an asynchronous course delivery and management system developed by the Canadian British Columbia Computer Science Department for colleges and universities. It focuses on course content and integrates various related learning tools. Moreover, it can not only publish existing course content on the Internet, but also develop new courses. In recent years, the domestic educational technology community has also begun to pay attention to and independently develop network teaching systems, such as the Chinese version of Blackboard and eyouCT, Longteng multimedia distance education system, Tsinghua network learning platform, Peking University network teaching platform, Nanjing University's sky classroom network teaching system. Although a similar network teaching system has the function of courseware production, its system is huge and the functions are too comprehensive. It tries to cover the entire teaching link of traditional campus education and provide comprehensive support services for online teaching [12]. e third category is an independent network courseware generation system. is type of system or the design of the courseware structure template and wizard mechanism simplifies the steps of network courseware development and greatly reduces the technical threshold of courseware development. Moreover, the courseware-making process is simple, fast, and easy to operate, which is suitable for the development of a large number of network courseware [13,14].

The Best Reference Template Selection
3.1. Template Definition. As shown in Figure 1, we first use the upper row and left column of the currently encoded image block as the current template and search for a reference template in a certain range of a reconstruction area. Among them, the dark area represents the template. en, we use the MRSAD (mean-removed sumo absolute differences) of the current template and the searched reference template as a measurement standard to characterize the consistency and matching degree of the current coding block.
In formula (1), T Rec represents the template of the current block to be coded, T Pred represents the reference template of the reconstructed block, T avg represents the difference between the mean values of T Pred and T Rec , and L represents the length of the template. In formula (2), MRSAD(T Pred , T Rec ) represents the matching degree between T Pred and T Rec . Figure 1, by searching the reconstructed part of the current frame, the best matching result obtained by matching all the reconstructed reference templates T Rec with the current template T Pred is the smallest MRSAD between T Rec and T Pred . At this time, T Rec is called the optimal reference template T Rec−best .

Template Search. As shown in
In the abovementioned formula, D represents the search range of the reference template T Rec in the reconstructed block of the current frame.
In order to solve the problem of inconsistency between the reference template T Rec and the corresponding reconstruction block Rec, we assume that the current template T Pred and the reference template T Rec approximately satisfy a linear relationship as follows: In the abovementioned formula, T Pred ′ is the approximate representation of the current template T Pred , a is the scaling factor, and b is the compensation factor. In order to get the best fitting effect, we use the least square method, and the problem is transformed into an optimization problem as follows: 2 Computational Intelligence and Neuroscience erefore, the problem is equivalent to It is to find the scaling factor a and the compensation factor b to make the abovementioned formula the smallest. a and b obtained by solving the optimization problem are as follows: e obtained scaling factor a and compensation factor b will be used for intraframe brightness compensation prediction.

Intraframe Brightness Compensation Prediction.
After obtaining the optimal reference template m and the optimal compensation factors a and b of the current template, intraframe compensation prediction based on brightness can be performed, and the prediction value of the block to be coded is as follows: In the abovementioned formula, x and y represent the position of the pixel with coordinate (x, y) in the coding block, Pred represents the predicted value of the current block to be coded, and Rec represents the predicted value of the reconstructed block corresponding to the optimal ref- Pred(i).
In order to obtain a better prediction effect, as shown in the abovementioned formula and Figure 2, we selects N sets of optimal candidate templates T Rec (1), T Rec (2), . . . T Rec (N), and linear brightness compensation is performed on each set of candidate templates T Rec (i) to obtain the prediction block Pred(i) of the candidate reference template. Finally, the candidate N groups of prediction blocks are averaged to obtain a new prediction block Pred.
When performing intraframe video encoding, the intraframe mode selection at the encoder is determined based on the rate-distortion function, and the mode with the least rate-distortion cost will be selected. e rate-distortion cost function is as follows: at is, the coding distortion is minimized as much as possible at a predetermined code rate. Among them, R c represents the specified code rate, R represents the actual number of bits used to encode the mode, and D represents the distortion generated in the mode. In HEVC and H.266, both SSD and SATD are used to express distortion D. e reason is that in intraframe coding, two stages are used to complete the intraframe mode decision process. e first stage is the mode primary selection stage, which uses a lowcomplexity SATD-based cost function to select N intraprediction modes. e second stage is the mode selection stage, which uses a more complex SSD-based rate-distortion cost function to further select the final intraprediction mode.
As shown in the abovementioned formula, SSD represents the sum of squares between the errors of the original pixels and the reconstructed pixels.
e abovementioned formula represents the rate-distortion function corresponding to SSD. Among them, λ mode is the quantization parameter setting, and R all represents the number of bits required to encode all the coding information of the current mode, including the following: prediction mode number, division information, and residual coefficient. Figure 3 shows the SSD-based intra-frame rate-distortion cost calculation process. It can be seen that each mode needs to go through the process of prediction, transform and quantization, inverse quantization and inverse transform, and entropy coding to finally find the optimal prediction mode, which greatly increases the complexity of the encoding end. erefore, in the first mode selection stage, Computational Intelligence and Neuroscience 3 SATD is used instead of SSD to calculate encoding distortion, which greatly reduces the complexity of the encoding end.
As shown in the abovementioned formula, SATD first transforms the residual block by the Hada code, and then adds the absolute values of the transformed coefficients. It eliminates the process of transform quantization, inverse quantization and inverse transform, and entropy coding.
e abovementioned formula represents the rate-distortion function corresponding to SATD. R mode only calculates the number of bits consumed to encode the current mode.

Intraframe Iterative Filtering Technology.
In order to solve the problem of inaccurate prediction of the pixels to be coded in the intraframe prediction far away from the reference pixel, we first analyze the theoretical coding performance of intraframe coding based on the first-order Markov model, compare the theoretical coding performance with the performance of the existing encoder, give a theoretical explanation for the inaccurate prediction of the encoder, and find out the reason for the inaccurate prediction of the encoder. Next, we propose a method based on intraframe iterative filtering, which can effectively improve the performance of the encoder. Finally, considering the implementation complexity of intraframe iterative filtering, a related optimization algorithm is proposed, which greatly reduces the complexity of the encoder and decoder while sacrificing little coding performance.
In HEVC, Chen proposed a theory based on the firstorder Gauss-Markov model to analyze the ideal coding performance of intraframe prediction. He pointed out that when performing intraprediction, we need to consider the impact of errors, including quantization errors, noise, and edge pixels. In fact, the availability of reference pixels and errors directly affects the accuracy of intraprediction.
When considering a one-dimensional signal sequence X � [x 0 , x 1 , . . . , x N ] T , if it is assumed that the signal satisfies the first-order Gauss-Markov chain model with zero mean unit variance, its autocorrelation matrix is as follows: where ρ represents the correlation coefficient, which is close to 1. We assume that the first signal sample x 0 in X is the reference pixel, which is used to predict x i , i � 1, · · · , N. As shown in the following formula, there is a weight matrix P which generates a one-dimensional prediction signal x as follows: where p i , i � 1, . . . , N represents the weight corresponding to other signals x i , i � 1, . . . , N predicted by x 0 . It can be seen from the abovementioned formula that only the first column of the weight matrix P is nonzero. is is because the predicted values of other signals x i are only predicted by x 0 , as shown in the following formula: As shown in the following formula, in order to minimize the prediction residual, it is necessary to find the optimal prediction matrix P opt .
As shown in the following formula, the optimal weight p i can be obtained by derivation.
Combining the obtained autocorrelation matrix, the optimal prediction weight can be obtained as shown in the following formula.   Computational Intelligence and Neuroscience erefore, the optimal prediction weight matrix is as follows: If we set the residual signal as then, the residual matrix is as follows: From this, the autocorrelation matrix of Y can be shown as the following formula:

(24)
In the one-dimensional signal source distortion-free model, since the reference pixel does not change, the variance of the reference pixel is not considered when calculating the coding performance. e intraframe coding performance can be described as the following formula: Since the signal has zero mean, the autocorrelation coefficient shown in the following formula is the variance: It is synthesized to obtain the final optimal coding performance as shown in the following formula: Similarly, the coding performance of traditional intraprediction can be obtained by formulas. It can be seen from Figure 4 that the black line is the optimal coding performance, and the red line is the coding performance of the traditional prediction method. Here, ρ � 0.95 and N vary from 4 to 32. It can be seen that the optimal coding performance is far better than that of traditional prediction, and as N increases, the advantage becomes more obvious. e reason is that when N is small, the reference sample point can be very close to other signal samples. When they are very close, the correlation is high. As N increases, it can be found that the approximation effect of the reference sample is worse. e reason is that as the distance increases, the correlation between them becomes smaller. When N is very large, the prediction effect based on traditional prediction methods is poor, and smaller weights should be used to predict the current signal.

Construction of the Multimedia Music Teaching Model
In order to achieve the purpose of online music teaching, the system intends to use computer technology, communication technology, streaming media technology, and online video live broadcast technology to provide users with a set of online music teaching systems that can be cross-regional, highly shared, and has practical teaching significance. rough field surveys and visits to many places and users, it is determined that the main business characteristics of the system that need to be met are as follows: practical functionality, support for streaming online music appreciation, support for online video live broadcast, provide online teaching and examination functions, and support users consumption management. In addition, it needs to support students to be able to listen to lectures online in real time, independently and flexibly choose past courses, exchange discussions, and online tests. For online teaching, the application of streaming media is also crucial. It needs to compress and transcode music files to achieve the effect of a fast side-by-side and real-time playback. Finally, the system needs to provide powerful system management functions. e online music teaching activities are investigated, and users are divided into three categories: managers, students, and teachers. e music teaching system based on streaming media is designed with a typical media system, implemented with a standardized network protocol, supports multiple applications such as network video playback, and has strong scalability.
e system is composed of streaming media server, web server, storage workstation, video acquisition station, and client. e streaming media server edits, stores, and maintains the video information received from the collection station. e application server executes the web applications of the system. e specific design is shown in Figure 5. Traditional intra prediction under one-dimensional signal Intra prediction under one-dimensional source distortion-free model Computational Intelligence and Neuroscience e nonlinear editor can perform special effects rendering, adding letters, and adding watermarks to the recorded video clips. After the video is processed, the video and audio information are transmitted to the video server, and then stored in the video server or directly broadcasted live. e live broadcast processing part of the video server can transfer the analog signal that does not need to be processed to the live broadcast line or store it in the disk. e live broadcast part and the publishing server work together to balance the load of network data. e publishing server is designed and installed in the core computer room of the online music teaching system and publishes the live video information to the student client on the Internet. e student can watch the live course in the live broadcast room of the virtual classroom. At the same time, the video publishing server creates an on-demand service and points the publishing path to the recorded video file in the video storage server for students to review and watch.
e system architecture is divided according to the multilayer structure, and the system includes the following: user layer, business logic layer, persistence layer, and data layer. e user layer mainly completes accepting user input and returning the execution result to the user. e business logic layer and the persistence layer are between the user layer and the data layer. ey are the bridge between the two and save a large amount of user operation logic. In fact, the database and media library are accessed according to the operation logic. e data layer is responsible for data storage and retrieval, and management of media files.
In the online music teaching system, the transmission of music and tutorials is designed based on streaming media technology. e design of streaming media transmission is based on IP technology, which realizes streaming transmission through grouping and digitization. e realization principle is that the video data is compressed through the compression algorithm, the video data is packaged and transmitted to the client according to the relevant IP protocol, and then, these data packets are combined, and after decompression and decoding, they are restored to the original video data. It realizes the purpose of transmitting video online through the streaming media protocol. e basic process of streaming media transmission in this system is shown in Figure 6.
In the music teaching system, the design of streaming transmission is as follows: (1) when the user chooses to play the streaming media course on the client browser, the client browser will establish HTTP or TCP exchange information with the web server, thereby identifying the required video data from the network data. en, on the client browser, the streaming media player is started to initialize the acquired video data.
e initialized information includes program information, encoding type, and server address. (2) e streaming media player and the streaming media server are based on RTSP (real-time streaming control protocol) to exchange video control information. In addition, the realtime flow control protocol can also support common commands such as play, pause, fast forward, rewind, and record. (3) e streaming media server uses the RTP/UDP protocol to send streaming media data to the streaming media player. When the video data arrives at the client, the streaming media player can decompress and decode the data packet, and then play it. In the transmission and playback of streaming media, two different types of communication protocols, RTP/UDP and RTSP/TCP, can be used to send redirect commands from the server to streaming media players at different addresses.
After constructing the abovementioned model, this paper studies the teaching performance of the model. Combined with the current multimedia music teaching needs, this paper evaluates the teaching efficiency and teaching effect of multimedia music teaching. First, the teaching efficiency is evaluated. e teaching efficiency mainly refers to the speed improvement of the streaming media model compared with the traditional teaching method. rough the analysis of 87 sets of teaching data, the results are shown in Figure 7.
It can be seen from the chart that the model constructed in this paper has a significant effect in improving the efficiency of music teaching. On this basis, this paper evaluates the teaching effect of the model and verifies the teaching effect through the scores of 66 experimental subjects. e results are shown in Figure 8.
From Figure 8, we can see that the multimedia teaching system constructed in this paper performs well in music teaching and has certain practical effects.

Conclusion
e online music teaching system is designed and implemented based on streaming media technology, and its purpose is to solve the problems encountered in the actual application of online music teaching. When encoding in the rate-distortion mode, considering the directivity of the residual in the current mode, a detailed analysis of the residual energy within the frame is made. When performing transformation, a transformation method based on mode dependence is proposed, which eliminates the traversal process of multicore transformation in H.266 and saves coding time. Moreover, considering the complexity of time coding, this paper does not perform iterative filtering for intraprediction in DC and planar modes, and reduces the number of filtering iterations according to the simulation results to optimize the iterative filtering technology. In addition, this paper conducts research and analysis based on the next-generation video coding standard H.266, and proposes intraframe prediction technology based on brightness changes and intraframe filtering technology based on iterative updating, which will improve coding performance. Finally, this paper verifies the performance of this system through experimental research. e research results show that the model constructed in this paper has a certain effect.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.