Aided Recognition and Training of Music Features Based on the Internet of Things and Artificial Intelligence

With the development of the Internet of Things, many industries have entered the information age, and digital audio technology is also constantly developing. Music retrieval has gradually become a research hotspot in the music industry, and within it, the auxiliary recognition of music characteristics is a particularly important task. Music retrieval has mainly relied on manually extracting music signals, but this extraction technology has now encountered a bottleneck. The article uses Internet of Things and artificial intelligence technology to design an SNN music feature recognition model to identify and classify music features. The research results of the article show the following. (1) In the statistical graphs of the main melody and accompanying melody of different music, the absolute value of the interval of the main melody and the accompanying melody mainly fluctuates in the range of 0–7; the highest interval proportion of the main melody reaches 36%, and that of the accompanying melody reaches 17%. After the absolute value of the interval reaches 13, the interval ratios of the main melody and the accompanying melody tend to be stable, maintaining between 0.6 and 0.9, and the melody interval ratio values completely coincide. The relative difference of the main melody fluctuates greatly in the interval X(1)–X(16); after the absolute value of the interval reaches 17, the interval ratios of the main melody and the accompanying melody tend to be stable, maintaining between 0.01 and 0.04, and the difference value of the main melody is always higher than that of the accompanying melody. (2) When the number of feature maps is 24 * 5, the recognition result is the most accurate: the MAP recognition result can reach 78.8, and the recognition result of precision@ is 79.2. When the feature map size is 5 * 5, the recognition result is the most accurate: the MAP recognition result can reach 78.9, the recognition result of precision@ is 79.2, and the recognition result of HAM2 (%) is 78.6.
The detection accuracy of the SNN music recognition model proposed in the article is the highest. When the number of bits is 64, the detection accuracy of the original SNN detection model is 59.2%, and the detection accuracy of the improved SNN music recognition model is 79.3%, which is 61.4% higher than the 17.9% detection rate of the ITQ music recognition model. The experimental data further show that the detection efficiency of the SNN music recognition model is the highest. (3) The SNN music recognition model proposed in the article has the highest detection accuracy of the 5 music recognition models, whether in a noisy or noise-free music environment, with an accuracy rate of 97.97% and a detection accuracy value of 0.88. The ITQ music recognition model has the lowest detection accuracy, with 67.47% in the absence of noise and 70.23% in the presence of noise. Although a certain noise removal technology can suppress noise interference to some extent, it cannot accurately describe the music information, so the detection accuracy rate remains low.


Introduction
Because the network has the advantages of fast information dissemination, ease of use, and abundant network resources, it is widely used in people's work, study, and daily life. At present, with the rapid development of popular music in our country, music is everywhere, and the wave of music has also affected us.
When faced with a wide variety of music types, users will inevitably feel at a loss and need to spend a lot of time choosing the type of music they are interested in. This method is not only a waste of time but also very inefficient. Against this background, designing an intelligent auxiliary model of music characteristics is inevitable. Literature [1] studied the ability of self-organizing neural maps to serve as a music style classifier for music fragments. The article cuts the music melody into many segments of equal length, then analyzes the melody and rhythm, and presents the analyzed data to the SOM. Literature [2] discloses a system and method implementing a simple and fast real-time single-note recognition algorithm based on fuzzy pattern matching. The system can accept the music rhythm and notes during a performance and then compare them with the correct music rhythm to determine whether the rhythm of the performance is standard. Literature [3] proposed a new method for automatic music genre recognition in the visual domain using two texture descriptors. Literature [4] introduces a dynamic classifier selection scheme and creates a classifier pool to perform automatic music genre classification. The classifier works on the principle of the support vector machine, which can extract effective information from the spectrum image of music; the research results show that the accuracy of music extraction can reach 83%. Literature [5] introduced optical music recognition technology and proposed a method for computers to automatically recognize music scores. The system can scan printed images of music scores to extract effective information and then automatically generate audio files to provide users with a listening function. Literature [6] proposed a statistical method for the task of handwritten music recognition in early notation.
This method of processing music differs from the traditional method in that it directly recognizes the music signal without dividing it into many paragraphs. Literature [7] investigated various aspects of automatic emotion recognition in music. Music is also a good way to express emotions, and different classifications and timbres in music produce different musical effects; this article explores the extensive research on music emotion recognition. Literature [8] studied the utility of state-of-the-art pretrained deep audio embedding methods for the task of music emotion recognition. Literature [9] proposed a music emotion recognition method based on an adaptive aggregation regression model. Emotion recognition of music is an important task for evaluating the influence of music on the emotions of listeners. The article proposes an emotion estimation model that uses the variance obtained by Gaussian process regression to measure the confidence of the estimation results of each regression model. Literature [10] proposed a new method using template matching and pixel pattern features in computer games. The general music model is largely insensitive to changes of font, but the beats and notes of some symbols do not maintain the original shape of the music signal; the model proposed in the article can be applied to these music symbols. Literature [11] proposed a method to solve the problem of multidimensional music emotion recognition, combining standard and melodic audio features. Literature [12] studied the reduction of the number of training examples in music genre recognition. The article studies the impact of reducing the number of training examples on the detection results in the process of music style recognition. The experimental results show that although the number of training examples is greatly reduced during the detection process, a high classification performance can still be maintained in many cases.
Literature [13] presents a method to parse solo performances into individual note components and uses support vector machines to adjust the back-end classifier. In order to generalize instrument recognition to ready-made, commercial solo music, Literature [14] proposed a method of musical instrument recognition in chord recordings. Literature [15] proposed a method for analyzing and recognizing music speech signals based on speech feature extraction. The method extracts effective music information from the music signal and then reorganizes the signal to a certain extent, so as to achieve noise reduction. The experimental results show that the reorganized music signal has good noise reduction ability compared with the original music signal.

Overall Structure of Music Feature Recognition.
The music feature recognition system based on Internet of Things technology is mainly composed of a physical perception layer, a capability layer, an adaptation layer, and a system application layer. The overall structure of the system is shown in Figure 1.

Design of Music Collection Module.
To identify the music signal, it is necessary to collect it first. The music collection module is composed of two parts, namely, the collection submodule and the encoding submodule. The music collection submodule is composed of sound sensors installed in different positions and is responsible for collecting the original music signal [16]. Each sound sensor has a built-in capacitive electret microphone that is sensitive to sound; its output is converted by an A/D converter and transmitted to the voice coding submodule [17]. The voice coding submodule is mainly responsible for high-fidelity, lossless compression of the original music signal; it converts the music signal into transmittable data and then passes it to the music signal processing module.

Music Signal Processing Module Design.
The music signal processing module is designed around a DSP processor [18]. The module uses a fixed-point DSP chip suitable for voice signal processing; the chip has low power consumption and a fast running speed. It carries 2 McBSPs, can be connected to a CODEC for voice input, and has an 8-bit enhanced host parallel port for establishing a communication connection with the host, together with 4 KB of ROM and 16 KB of DARAM. Its structure is shown in Figure 2. The following pitch features are used:
(1) Pitch mean: Pitch = (1/n) Σ p_i, where p_i represents the pitch of the i-th note and n represents the number of notes in the music.
(2) Treble changes: the pitch mean square error can be used to express the pitch change: σ_P = sqrt((1/n) Σ (p_i − Pitch)²).
(3) Range: the range describes the breadth of the pitch of the music: range = Max(P_1, P_2, ..., P_n) − Min(P_1, P_2, ..., P_n).
(4) Time value: D = Σ D_i, where D_i represents the length of the i-th note.
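The pitch statistics above can be sketched in Python; this is a minimal illustration of the formulas, with the example pitch list chosen arbitrarily:

```python
def pitch_features(pitches):
    """Compute the pitch mean, the pitch mean square error (pitch change),
    and the range for a list of note pitches, following the formulas above."""
    n = len(pitches)
    mean = sum(pitches) / n
    # Pitch mean square error expresses the degree of pitch change.
    mse = (sum((p - mean) ** 2 for p in pitches) / n) ** 0.5
    # The range describes the breadth of the pitch of the music.
    rng = max(pitches) - min(pitches)
    return mean, mse, rng

mean, mse, rng = pitch_features([60, 62, 64, 67, 65])
```

Here pitches are given as MIDI note numbers, but any consistent pitch scale works, since the statistics only depend on differences.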

Tone and Music Feature Extraction.
The frequency spectrum distribution of music signals and the emotions expressed by timbre perception are shown in Table 1 [19]. The music intensity is extracted from the signal energy, and the degree of musical intensity change can be expressed as the variation of this intensity over time.
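As a sketch only: a common way to realize such an intensity measure is short-time RMS energy per frame, with the intensity change taken as the frame-to-frame difference. The frame length and the RMS choice here are assumptions, not the article's exact formulas:

```python
import numpy as np

def frame_rms(signal, frame_len=1024):
    """Short-time RMS energy per frame, a common proxy for music intensity."""
    x = np.asarray(signal, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt((frames ** 2).mean(axis=1))

def intensity_change(rms):
    """Degree of intensity change: absolute frame-to-frame difference
    of the RMS curve."""
    return np.abs(np.diff(rms))
```

A constant signal yields a flat RMS curve and zero intensity change, which matches the intuition that intensity change measures dynamics rather than loudness itself.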

Melody Direction Recognition.
The music melody is expressed in terms of note durations, where D represents the total length of all notes and D_i represents the length of the i-th note [20]. The melody direction, the pronunciation point density, the change intensity of the rhythm, and the music mutation degree can be expressed in similar terms. The expression of BarCapacity is piecewise in the fundamental frequency f_0 [21]: it takes one value when 20 Hz < f_0 < 500 Hz, the value 1 when 500 Hz < f_0 < 1000 Hz, and another value when 1000 Hz < f_0 < 5000 Hz.

Musical Inference Rules.
Sudden changes in treble, or the stability of the tone, appear in the sequence variance. In order to measure these change points, the music is first expressed as the time series Y_k = μ + ε_k, where μ represents the unknown constant mean value of the time series Y_k and σ² represents the unknown constant variance of Y_k (and ε_k).
From this series, the iterative residual sequence is obtained, the corresponding statistics are formed, and after centralized processing the change points can be measured.
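A minimal sketch of this change-point procedure, assuming the standard cumulative-sum (CUSUM) form of the residual statistic (the article's exact statistic is not fully specified):

```python
import numpy as np

def change_point_statistic(y):
    """Locate a candidate mean-shift point in the series Y_k = mu + eps_k.
    Residuals are taken against the sample mean (centralized processing),
    and the statistic is the maximum absolute cumulative residual sum."""
    y = np.asarray(y, dtype=float)
    resid = y - y.mean()               # residual sequence
    cusum = np.cumsum(resid)           # cumulative statistics
    k = int(np.argmax(np.abs(cusum)))  # index where the statistic peaks
    return k, float(np.abs(cusum[k]))
```

On a series that jumps from one level to another, the statistic peaks at the last index before the jump, which is the estimated change point.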

Music Separation Algorithm.
According to the difference between the impact sound and the harmonic sound in the frequency spectrum, we can separate the original spectrum W_fτ into the impact spectrum P_fτ and the harmonic spectrum H_fτ, that is, W_fτ = P_fτ + H_fτ. The separation of the impact sound and the harmonic sound is then obtained by minimizing a cost function over P_fτ and H_fτ.
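One common realization of this harmonic/percussive split, offered here as a hedged sketch rather than the article's exact minimization, uses median smoothing of the magnitude spectrogram: harmonic energy is smooth along the time axis, while impact (percussive) energy is smooth along the frequency axis. The kernel size and the hard masking are assumptions:

```python
import numpy as np

def median_smooth(x, k, axis):
    """Running median of length k along the given axis (edges clipped)."""
    pad = k // 2
    idx = np.arange(x.shape[axis])
    windows = [np.take(x, np.clip(idx + o, 0, x.shape[axis] - 1), axis=axis)
               for o in range(-pad, pad + 1)]
    return np.median(np.stack(windows), axis=0)

def hpss(W, k=17):
    """Split a magnitude spectrogram W[f, t] into harmonic and impact parts.
    Harmonic content is smooth in time; impact content is smooth in frequency."""
    H = median_smooth(W, k, axis=1)  # smooth in time      -> harmonic evidence
    P = median_smooth(W, k, axis=0)  # smooth in frequency -> impact evidence
    mask = H >= P                    # assign each bin to the stronger side
    return W * mask, W * ~mask
```

With hard masks, the two parts exactly partition the original spectrogram, so the constraint W = H-part + P-part holds by construction.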

Algorithm Definition.
The algorithm definition is shown in Table 2.

Experimental Data and Research.
The article uses Internet of Things and artificial intelligence technology to design an SNN music feature assisted recognition model. In order to test the recognition efficiency of this model, the experiment selected more than 50 pieces of music of multiple types for music feature recognition and counted the main melody and accompanying melody curves of the different pieces separately. The main melody lines of different types of music are different, and the main characteristics of a music melody are linearity and fluidity. The abscissa of the experimental statistics graph represents the absolute value of the interval, and the ordinate represents the percentage of that absolute value. The specific experimental results are shown in Figure 3.
From the data in Figure 3, we can conclude that the absolute value of the interval of the main melody and the accompanying melody mainly fluctuates in the range of 0–7. In the interval line chart of the main melody, the second accounts for the highest proportion of the interval melody, reaching 36%; in the interval line chart of the accompanying melody, the fifth accounts for the highest proportion, up to 17%. After the absolute value of the interval reaches 13, the interval ratios of the main melody and the accompanying melody tend to be stable, maintaining between 0.6 and 0.9, and the melody interval ratio values completely coincide.
According to the experimental data in Figure 4, we can conclude that the relative difference of the main melody fluctuates greatly within the interval X(1)–X(16); when the interval variable is X(3), the relative difference is the largest, reaching up to 0.79. The relative difference value of the accompanying melody fluctuates greatly in the interval X(1)–X(10); when the interval variable is X(3), the relative difference value is the largest, reaching up to 0.61. After the absolute value of the interval reaches 17, the interval ratios of the main melody and the accompanying melody tend to be stable, maintaining between 0.01 and 0.04, and the difference value of the main melody is always higher than that of the accompanying melody.

The Influence Experiment of Feature Maps.
Recognition results on the same features directly reflect the recognition accuracy of different models, so the experiment studies the influence of the number and size of the feature maps on the detection results. In the experiment on the number of feature maps, different distributions of convolutional layers were selected, with distribution sizes from 8 to 64. In the experiment on feature map size, 11 feature maps of different sizes were selected. The experimental data are shown in Tables 3 and 4.
According to the data in Table 3 and Figure 5, we can conclude that when the number of feature maps is 24 * 5, the recognition result is the most accurate: the MAP recognition result can reach 78.8, and the recognition result of precision@ is 79.2. In general, the detection accuracy of the 6 different numbers of feature maps stays above 74%. According to the data in Table 4 and Figure 6, we can conclude that when the feature map size is 5 * 5, the recognition result is the most accurate: the MAP recognition result can reach 78.9, the recognition result of precision@ is 79.2, and the recognition result of HAM2 (%) is 78.5. When the feature map size is 14 * 14, the recognition accuracy is the lowest: the MAP recognition accuracy is 74.1, the recognition result of precision@ is 75.7, and the recognition result of HAM2 (%) is 75.8. In general, the detection accuracy of the 11 different feature map sizes stays above 74%.

Comparison with Other Methods.
In order to test the performance of the music recognition model, the experiment improved the SNN music recognition model proposed in the article and compared it with the detection performance of the other three models. The experiment chose 5 different bit numbers. The number of bits, like the sampling accuracy, determines resolution: the higher the bit rate, the more detailed the changes in the music that can be reflected. The detection accuracy rates of the 5 different models were observed under the different bit numbers. The specific experimental data are shown in Table 5.
According to the data in Table 5 and Figure 7, we can conclude that the detection accuracy of the SNN music recognition model proposed in the article is the highest among the 5 music recognition models. When the number of bits is 64, the detection accuracy rate of the original SNN music recognition model is 59.2%, and the detection accuracy rate of the improved SNN music recognition model is 79.3%, which is 61.4% higher than the 17.9% detection rate of the ITQ music recognition model. The experimental data further show that the SNN music recognition model has the highest detection efficiency, which greatly promotes the efficiency of music feature auxiliary recognition.

Evaluation Criteria.
The evaluation criteria are shown in Table 6:
Accuracy: the ratio of the number of correctly recognized music types to the number of all music types [24]. The larger the index value, the more accurate the recognition result.
Recall rate: the proportion of recognized music characteristics out of the theoretical maximum number of hits [25]. The larger the index value, the more accurate the recognition result.
F1 measurement: the F1 index effectively balances the accuracy rate and the recall rate by favoring the smaller value. The larger the index value, the more accurate the recognition result.

In order to test the performance of the SNN music feature assisted recognition model, we run the model proposed in the article and the other music recognition models under noisy and noise-free music conditions and observe the detection accuracy of the different models. To make the experimental results more analyzable, we selected 5 different types of music data, detected these 5 types with and without noise, and observed the experimental results. The music sample data are shown in Table 7, and the specific detection results are shown in Tables 8 and 9.
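The three criteria in Table 6 can be sketched as simple ratio computations; the counting protocol and function names here are illustrative, not taken from the article:

```python
def accuracy(correct, total):
    """Ratio of correctly recognized music types to all music types."""
    return correct / total

def recall(hits, max_hits):
    """Proportion of recognized characteristics out of the theoretical
    maximum number of hits."""
    return hits / max_hits

def f1(precision_val, recall_val):
    """Harmonic mean of precision and recall: balances the two by
    favoring the smaller value."""
    return 2 * precision_val * recall_val / (precision_val + recall_val)
```

Because F1 is a harmonic mean, a model with precision 1.0 but recall 0.5 scores only about 0.67, illustrating how the measure penalizes the weaker of the two rates.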
According to the data in Table 8 and Figure 8, we can conclude that the SNN music recognition model proposed in the article has the highest detection accuracy among the 5 music recognition models, with an accuracy rate of 97.97% and a detection accuracy value of 0.88. The ITQ music recognition model has the lowest detection accuracy rate of 67.47%, and its highest detection accuracy value is 0.3. The CNNH music recognition model and the KSH music recognition model lie between the highest and lowest values. From Figure 9 we can find that the ITQ music recognition model has the lowest detection accuracy: 67.47% in the absence of noise and 70.23% in the presence of noise. Although a certain noise removal technology can suppress noise interference to some extent, it cannot accurately describe the music information, so the detection accuracy rate remains low. The detection accuracy of the KSH music recognition model is higher than that of the ITQ music recognition model; it can accurately describe the changes of the music signal, but it has certain defects in noise processing, and its music detection error rate is relatively large. The SNN music feature assisted recognition model proposed in the article has the highest detection accuracy among the five models and covers many types of music, so it can analyze music signals more comprehensively and systematically, with an accuracy rate as high as 99.12%, thus greatly improving the efficiency of music detection. It is believed that the detection accuracy can be further improved by using feature extraction approaches.

Conclusion
Today we are in an era of informatization and intelligence. The use of intelligent methods to study music has attracted more and more attention. Computer music has also achieved many results and has a very broad market prospect. Using a computer to simulate music signals involves not only computers and music but also a great deal of complex professional knowledge. At present, there are still many problems in the artificial intelligence assisted recognition of music characteristics; although the auxiliary music feature recognition model designed in the article can analyze and identify music signals efficiently, the way music is represented needs further research.

Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.