The Evaluation of Classifier Performance during Fitting Wrist and Finger Movement Task Based on Forearm HD-sEMG

The transmission of human body movement signals to other devices through wearable smart bracelets has attracted increasing attention in the field of human-machine interfaces. However, owing to the limited data collection range of wearable bracelets, it is necessary to study the relationship between the superposition of the wrist and fingers and their cooperative motions to simplify the data collection system of such devices. Multichannel high-density surface electromyogram (HD-sEMG) signals exhibit high spatial resolution, and they can help improve the accuracy of multichannel fitting. In this study, we quantified the HD-sEMG forearm spatial activation features of 256 channels during hand movement and performed a linear fitting of the data obtained for finger and wrist movements in order to verify the linear superposition relationship between the cooperative and independent movements of the wrist and fingers. This study aims to classify and predict the fitted and measured cooperative actions of the fingers and wrist using four commonly adopted classifiers and to evaluate the performance of the classifiers in gesture fitting. The results indicated that linear discriminant analysis afforded the highest classification performance, whereas the random forest method achieved the worst performance. This study can serve as a guide for gesture signal simplification in the future.


Introduction
The transmission of human body movement signals to other devices through wearable smart bracelets [1] has attracted increasing attention in the field of human-machine interfaces (HMI) [2]. However, considering that the data collection range of a smart bracelet is limited to a small portion of the wrist, it is necessary to simplify the acquired signal of the bracelet such that the data collection range of the bracelet can be expanded to the limbs and the signal processing accuracy can be improved. The movements of a human hand are complex and precise owing to the independent and combined motions of the fingers and wrist [3,4]. Therefore, studies on the independent and combined motions of the fingers and wrist can be utilized to simplify the signal expression and to improve the input efficiency of these bracelets.
There has been rapid development in technologies that deal with human skin surfaces, such as flexible electrodes, sensors, and high-density surface electromyograms (HD-sEMG) [5][6][7]. Because of these developments, Malesevic et al. were able to deduce that analyzing the electromyography (EMG) signals of human forearm muscles is the most effective method for studying human limb movements [8].
Previous studies have focused on the independent movements of the finger or wrist and have distinguished human gestures from signals without considering the spatial activation characteristics of the forearm muscles that depend on finger or wrist movements. Dai and Hu [9] acquired the spatial activation characteristics of the forearm muscles during the movements of a single finger. In their study, the linear discriminant analysis (LDA) classifier was used to distinguish different fingers. Jiang et al. [10] studied the HD-sEMG signal of the hand under different finger movements. Their results proved that HD-sEMG signals of the hand can be effective for biometric recognition. Gazzoni et al. [11] quantified the spatial activation characteristics of human forearm muscles for finger and wrist movements. However, there have been no studies on the correlated movements between fingers and wrist.
Accordingly, the objective of this study was to explore the performance of four types of classifiers commonly used in the study of hand movements for the classification and prediction of finger, wrist, and coordinated movements. The outline of our study is as follows: (1) For action superposition, the EMG signals of the forearm flexor and extensor muscles during hand movement were collected through 256 high-density acquisition channels. After quantifying the high-resolution muscle spatial activation features of these high-density EMG signals, the quantitative features of the finger and wrist movements were multichannel linearly fitted to obtain the linear superposition of the two component signals. (2) For classifier performance evaluation, the linearly superposed composite signals and the signals of the coordinated finger and wrist actions were put into four types of classifiers for classification and prediction. These four types of classifiers are commonly used for EMG signals. Most importantly, the performances of these four classifiers were compared in detail based on the obtained results.
To the best of our knowledge, this is the first study that has considered the superposition of finger and wrist movements through 20 movements of 14 subjects and evaluated the performance of the predictive classifiers. The results of this study can help simplify finger and wrist cooperative signals in the future.

Subjects.
A total of 14 healthy volunteers (8 men and 6 women, aged 22 to 34 years) with right arm dominance were recruited for this study. All participants provided written informed consent, and none had a history of arm or hand disease. The research protocol was reviewed and approved by the Ethics Committee of Fudan University (approval number: BE2035).

Experimental Setup and Protocols.
To reduce the electrical impedance of the skin and increase the conductivity between the test electrode and the skin, each subject was required to wash the right forearm with an exfoliator. After drying, the skin surface was repeatedly wiped with medical alcohol to ensure that there were no residual impurities on the skin surface. The acquisition electrode plates used in the experiment were ELSCH064NM1 (OT Bioelettronica, Turin, Italy), each having one matrix electrode with a gel. Each gelled electrode sheet contained a matrix array with an 8×8 electrode layout and a center-to-center distance of 10 mm between adjacent electrodes. Each electrode in contact with the skin was an elliptical gelled electrode with a long axis of 5 mm and a short axis of 2.8 mm. Two electrode plates were attached to the flexor surface and another two were attached to the extensor surface of the right forearm of the subject, as shown in Figure 1. In total, the four electrode plates formed a 256-channel data acquisition array that was used to collect HD-sEMG signals from the forearm.
To make the electrode plate stick to the wrist, we divided the measurement areas of the electrode plate according to the anatomical marks of the human body. The two ends of the radius and ulna were used as the upper and lower boundaries, and the ulnar and radial sides of the forearm were used as the left and right boundaries, respectively, to draw the fitting area of the skin surface of the flexor and extensor muscles.
During the experiment, each subject was seated upright in a chair in front of the experimental apparatus, with the right arm placed naturally horizontal on a table on their right. The electrode plate can shift from its position when the arm touches the table. Moreover, to ensure that only the EMG data of gestures are obtained, force should be exerted by only the parts suggested in Figure 2 and the other parts should remain relaxed; in particular, the arm should be maintained in a horizontal posture. Therefore, the elbow of the right arm and the forearm near the wrist of the subject were placed on a support mat. The fingers and wrist could move freely; however, the other forearm parts needed to be suspended in midair. The force generated during a gesture affects the hand signal.
Therefore, to ensure that every movement in the experiment generated the same force, each subject was required to undergo a force measurement experiment. The five fingers of each subject were fastened to a force-sensing device (Interface, SM-100N) to test the maximum voluntary contraction (MVC) of each finger. To obtain the maximum bending and stretching force of the fingers, the subjects were instructed to press each finger down and lift it up with maximum force. For each finger force measurement, the subjects were asked to maintain the force for 3 s, for a total of three tests, with a 2-minute rest between tests. A 5-minute rest was given after the force test and before the finger and wrist movement tests. The movements involved in the finger and wrist movement experiment are shown in Figure 2 [7]. For this experiment, a display screen was placed in front of each subject, and the screen displayed the first to the 20th action shown in Figure 2 successively. Each action was played twice for four seconds, with a two-second break between two identical movements and a five-second break between two different movements. The action sequence is shown in Figure 3.
When a certain action was played on the screen, the subject was asked to make the same gestures. To ensure that all the actions have a proper holding degree, the force for performing the action should be maintained at approximately 60% of the maximum force. All the gestures were recorded using an HD-sEMG, Quattrocento System (OT Bioelettronica in Turin, Italy), which was set to 10-900 Hz passband, 1000 gain, 2048 Hz sampling rate, and 16-bit resolution.

Mathematical Problems in Engineering

Data Preprocessing.
The acquired HD-sEMG signals were first passed through an eighth-order Butterworth band-pass filter of 10-500 Hz to filter out the clutter and interference signals outside the frequency range of the sEMG signals. A combination of notch filters was then employed to eliminate the power line interference and its harmonic components. After filtering, the HD-sEMG signals were segmented into subsections according to the different actions.
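The filtering stage described above can be sketched with SciPy as follows. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: the 50 Hz mains frequency, the notch quality factor, the zero-phase (forward-backward) filtering, and the helper names `preprocess` and `tone_amplitude` are all assumed here, since the text specifies only the filter order, the pass band, and the sampling rate.

```python
import numpy as np
from scipy import signal

FS = 2048.0       # sampling rate reported in the text (Hz)
MAINS_HZ = 50.0   # assumed mains frequency; the text does not state it


def preprocess(emg, fs=FS, band=(10.0, 500.0), mains=MAINS_HZ, q=30.0):
    """Band-pass then notch-filter sEMG; emg has shape (channels, samples)."""
    # butter(4, ...) with btype="bandpass" yields an 8th-order filter overall.
    sos = signal.butter(4, band, btype="bandpass", fs=fs, output="sos")
    out = signal.sosfiltfilt(sos, emg, axis=-1)
    # Notch the mains frequency and each harmonic below the upper band edge.
    for f0 in np.arange(mains, band[1], mains):
        b, a = signal.iirnotch(f0, q, fs=fs)
        out = signal.filtfilt(b, a, out, axis=-1)
    return out


def tone_amplitude(x, freq, fs=FS):
    """Amplitude of a sinusoid at `freq`, estimated by projection."""
    t = np.arange(x.shape[-1]) / fs
    return 2.0 * np.abs(np.mean(x * np.exp(-2j * np.pi * freq * t), axis=-1))


# Synthetic single-channel check: an 80 Hz tone plus 50 Hz hum.
t = np.arange(int(4 * FS)) / FS
raw = np.sin(2 * np.pi * 80 * t) + np.sin(2 * np.pi * 50 * t)
clean = preprocess(raw[np.newaxis, :])[0]
mid = slice(int(1 * FS), int(3 * FS))   # ignore edge transients
hum_left = tone_amplitude(clean[mid], 50.0)
tone_left = tone_amplitude(clean[mid], 80.0)
```

On this synthetic input, the 50 Hz component should be strongly attenuated while the 80 Hz component passes almost unchanged. A constant quality factor makes the notches wider at higher harmonics; a fixed bandwidth per notch would be an equally reasonable choice.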

Data Analysis.
The following steps were involved in the data processing: (1) signal feature extraction after preprocessing, (2) data fitting from the signal features, and (3) gesture discrimination using classifiers. Figure 4 shows a general framework of the HD-sEMG datasets used for distinguishing between hand gestures.

Feature Extraction.
To express the overall spatial activation characteristics of the forearm flexor and extensor muscles, we calculated the two-dimensional root-mean-square (2D-RMS) of the monopolar sEMG recordings acquired from 4 × 8 × 8 channels (four electrode arrays, each with eight rows and eight columns of electrodes). In the RMS calculation, to reduce the effect of potential interference, only data from a relatively stable action period in the experiment were considered: of the 4 s holding time, the data from the first and last 0.25 s were removed.
Relatively fixed outliers caused by equipment problems and random outliers caused by poor contact between electrodes and the skin (due to excessive hand movements during the experiment) were detected. These outliers were then replaced with the average values from the adjacent channels. This prevents errors in the fitting coefficient calculation caused by outliers.
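A minimal sketch of this feature extraction step is given below. The trimming of 0.25 s at each end and the 4 × 8 × 8 layout follow the text; the outlier detection rule (a robust z-score with a threshold of 3) and the function names `rms_map` and `repair_outliers` are illustrative assumptions, because the text does not say how outliers were detected.

```python
import numpy as np

FS = 2048       # sampling rate from the text (Hz)
TRIM_S = 0.25   # seconds dropped at each end of the 4 s hold


def rms_map(hold, fs=FS, trim_s=TRIM_S):
    """RMS per channel for one hold; hold has shape (grids, rows, cols, samples)."""
    k = int(round(trim_s * fs))
    steady = hold[..., k:hold.shape[-1] - k]
    return np.sqrt(np.mean(steady ** 2, axis=-1))


def repair_outliers(grid, z_thresh=3.0):
    """Replace outlier channels in one RMS grid by the mean of their neighbours.

    The robust z-score rule and its threshold are assumptions for illustration.
    """
    med = np.median(grid)
    mad = np.median(np.abs(grid - med)) + 1e-12
    bad = np.abs(grid - med) / (1.4826 * mad) > z_thresh
    fixed = grid.copy()
    rows, cols = grid.shape
    for r, c in zip(*np.nonzero(bad)):
        # Collect the valid (non-outlier) channels in the 3x3 neighbourhood.
        neigh = [grid[i, j]
                 for i in range(max(r - 1, 0), min(r + 2, rows))
                 for j in range(max(c - 1, 0), min(c + 2, cols))
                 if (i, j) != (r, c) and not bad[i, j]]
        if neigh:
            fixed[r, c] = np.mean(neigh)
    return fixed


# Synthetic example: unit-variance noise with one simulated bad-contact channel.
rng = np.random.default_rng(0)
hold = rng.normal(0.0, 1.0, size=(4, 8, 8, 4 * FS))
hold[0, 3, 3] *= 50.0
features = rms_map(hold)                      # shape (4, 8, 8)
features[0] = repair_outliers(features[0])
```

After the repair, the simulated bad channel takes a value close to that of its neighbours, so it no longer dominates a least-squares fit.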

Data Fitting.
The purpose of data fitting was to analyze the relationship between the movements of fingers and wrists [12]. In this study, the 20 gestures in the experiment were divided into three parts based on the following logical relationships: (1) finger movements, including hand close and hand open (No. 19 and 20 in Figure 2); (2) wrist movements, including six wrist movements (No. 1 to 6 in Figure 2); (3) finger and wrist cooperative movements, including two kinds of movements: closed hand and wrist cooperative movement (No. 7 to 12 in Figure 2) and open hand and wrist cooperative movement (No. 13 to 18 in Figure 2).
First, we assumed that each coordinated movement of the fingers and wrist could be seen as a superposition of the corresponding independent fingers and wrist movements.
To verify this superposition relation, linear fitting of the three types of actions was employed to obtain the fitting coefficients. Then, the fitting coefficients and the independent actions were used to reconstruct the corresponding composite actions for different numbers of fitting channels. Finally, the classifiers were used to analyze the compound and original real actions. If the compound action can be classified correctly by a classifier trained on the real action, the compound action can be regarded as equivalent to the real action. Three cases were considered for the number of fitting channels: 8, 32, and 64 channels. The different numbers of fitting channels correspond to the different areas of the skin sEMG signals that were selected as a superimposed whole. An increased number of fitting channels corresponds to the superposition of sEMG signals from a larger area. The superposition method is used so that the component signals of individual finger and wrist movements can replace finger and wrist cooperative signals. The fitting relations of the different actions are shown in Table 1.
In our work, 2D-RMS was used as the depth correlation feature vector. Therefore, to statistically determine the classification consistency of 2D-RMS, the feature was subjected to a one-way analysis of variance (ANOVA) test [13].
The objective of the one-way ANOVA test was to assess the distinction between the three types of gesture movements under the same feature. The results obtained from this test are provided in Section 3.2.
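One way to read the fitting procedure is sketched below: each block of `group_size` channels shares a single pair of coefficients (a, b) such that the cooperative feature is approximated by a × finger + b × wrist, solved by ordinary least squares. The exact model used by the authors is not given in the text, so the two-coefficient form, the contiguous channel grouping, and the name `fit_superposition` are assumptions.

```python
import numpy as np


def fit_superposition(finger, wrist, combined, group_size):
    """Fit combined ~ a*finger + b*wrist with one (a, b) pair per channel group.

    All inputs are flat RMS feature vectors of equal length (e.g. 256).
    Returns the coefficients, shape (n_groups, 2), and the reconstruction.
    """
    n = finger.size
    if n % group_size != 0:
        raise ValueError("group_size must divide the channel count")
    n_groups = n // group_size
    coefs = np.empty((n_groups, 2))
    fitted = np.empty(n)
    for g in range(n_groups):
        sl = slice(g * group_size, (g + 1) * group_size)
        design = np.column_stack([finger[sl], wrist[sl]])
        # Least-squares solution for this group's two coefficients.
        coefs[g] = np.linalg.lstsq(design, combined[sl], rcond=None)[0]
        fitted[sl] = design @ coefs[g]
    return coefs, fitted


# Synthetic check: an exactly linear superposition is recovered at each size.
rng = np.random.default_rng(1)
finger = rng.uniform(0.1, 1.0, 256)
wrist = rng.uniform(0.1, 1.0, 256)
combined = 0.8 * finger + 0.6 * wrist
coefs8, fitted8 = fit_superposition(finger, wrist, combined, 8)
coefs64, fitted64 = fit_superposition(finger, wrist, combined, 64)
```

With 256 channels, a group size of 8 gives 32 coefficient pairs and a group size of 64 gives 4, which illustrates why a larger number of fitting channels means fewer coefficients and a coarser reconstruction.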
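For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the sketch below. The function name `one_way_anova_f` and the toy numbers are illustrative assumptions and are not data from the study; obtaining a p-value additionally requires the F distribution, which is omitted here.

```python
import numpy as np


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    n_total = sum(g.size for g in groups)
    grand_mean = np.concatenate(groups).mean()
    # Between-group and within-group sums of squares.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy feature values standing in for finger, wrist, and cooperative movements.
f_stat, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
```

For these toy groups the between-group sum of squares is 26 and the within-group sum of squares is 6, so F = (26/2)/(6/6) = 13 with 2 and 6 degrees of freedom.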

Classification.
After data fitting, the composite data obtained after fitting and the original real cooperative action data were put into the classifiers for analysis. The classifiers can verify whether the composite actions after fitting could restore the corresponding real cooperative action. The classifiers used were as follows: (1) linear discriminant analysis (LDA) [9,14,15], (2) K-nearest neighbor (KNN) [16], (3) support vector machine (SVM) [17][18][19], and (4) random forest (RF).

Figure 3: Action sequence. Step 1: first holding period of action; Step 2: resting period between the same movements; Step 3: second holding period of the action; Step 4: resting period for the next movement; and Step 5: first holding period of the latter action.
In the design of parameters and kernel functions, we compared different parameters or kernel functions involved in each classifier, and those with the best performance were selected. For example, for the KNN classifier, we tried different K values and finally set K to 1 [22], which achieved the highest classification accuracy. For the SVM classifier, we tried the linear, RBF, and quadratic kernels and chose the RBF kernel. For random forests, we tried different numbers of trees and selected 64, which gave the highest precision. The training sets of the classifiers were the real collected cooperative action datasets, which contained 12 types of actions (No. 7 to 18 in Figure 2). As each action was performed twice, there were 24 datasets in total. The test sets were the composite action datasets obtained after fitting; they covered the same 12 actions and likewise comprised 24 datasets. For each number of fitting channels and each classifier type, the classification results of the 14 subjects were averaged to express the classification performance of the classifier. At the same time, to reduce the deviation of the experimental results, five-fold cross-validation was performed. The data analysis tool used for this processing was MATLAB 2020a, and the hardware platform was an Intel(R) Core(TM) i5-9400F CPU @ 2.90 GHz with 16.0 GB of RAM. The operating system was Windows 10 Professional, version 21H1. The graphics card was an NVIDIA GeForce GTX 1660.
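The train-on-measured, test-on-fitted protocol can be illustrated with the simplest of the four classifiers, KNN with K = 1. The sketch below uses synthetic feature vectors in place of the 2D-RMS maps; the class templates, the noise levels, and the name `predict_1nn` are assumptions made only for this example.

```python
import numpy as np


def predict_1nn(train_x, train_y, test_x):
    """Label each test vector with the class of its nearest training vector."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    return train_y[np.argmin(d2, axis=1)]


rng = np.random.default_rng(2)
n_classes, n_reps, n_feat = 12, 2, 256
# Hypothetical class templates standing in for measured 2D-RMS maps.
templates = rng.uniform(0.0, 1.0, size=(n_classes, n_feat))
labels = np.repeat(np.arange(n_classes), n_reps)   # 12 actions x 2 repetitions

# Training set: "measured" cooperative actions; test set: "fitted" composites.
measured = templates[labels] + rng.normal(0.0, 0.02, (labels.size, n_feat))
fitted = templates[labels] + rng.normal(0.0, 0.05, (labels.size, n_feat))

predicted = predict_1nn(measured, labels, fitted)
accuracy = float(np.mean(predicted == labels))
```

In this toy setting the fitted vectors stay close to their class templates, so every composite action is assigned to the correct class. Larger fitting errors move the fitted vectors away from the measured ones, which is the mechanism by which accuracy falls as the number of fitting channels grows.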

Error of Fitting.
Table 1: Fitting relations of the different actions (serial numbers refer to Figure 2).
Wrist movement | Finger movement | Coordination action
Wrist flexion (1) | Hand close (19) | Hand close with wrist flexion (7)
Wrist flexion (1) | Hand open (20) | Hand open with wrist flexion (13)
Wrist extension (2) | Hand close (19) | Hand close with wrist extension (8)
Wrist extension (2) | Hand open (20) | Hand open with wrist extension (14)
Wrist radial (3) | Hand close (19) | Hand close with wrist radial (9)
Wrist radial (3) | Hand open (20) | Hand open with wrist radial (15)
Wrist ulnar (4) | Hand close (19) | Hand close with wrist ulnar (10)
Wrist ulnar (4) | Hand open (20) | Hand open with wrist ulnar (16)
Wrist pronation (5) | Hand close (19) | Hand close with wrist pronation (11)
Wrist pronation (5) | Hand open (20) | Hand open with wrist pronation (17)
Wrist supination (6) | Hand close (19) | Hand close with wrist supination (12)
Wrist supination (6) | Hand open (20) | Hand open with wrist supination (18)

In this study, three numbers of fitting channels were compared: 8, 32, and 64 channels. For each number of fitting channels, there was an error between the reconstructed and real signals when the fitting coefficients were used. The smaller the error, the closer the reconstructed signal is to the real one. Experimental data showed that the data fitting error of the anterior muscle group was relatively large, while that of the posterior muscle group was relatively small during hand closing, as shown in Figure 5(a). In contrast, during hand opening, the data fitting error of the anterior muscle group was relatively small, while that of the posterior muscle group was relatively large, as shown in Figure 5(b). Generally, two-dimensional root-mean-squared (2D-RMS) [23] is often used to calculate the fitting error. However, in this case, the absolute error value was too small. Hence, the error relative to the mean value was used in this study to reflect the fitting error trend for different channels. The RMS error (RMSE) [24] was calculated using equation (1). The RMSE relative to the mean was represented by mean_err, and it was calculated using equation (2).
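Assuming the conventional definitions, equations (1) and (2) can be written as follows, where x_i denotes the measured 2D-RMS value of channel i, \hat{x}_i the value reconstructed from the fitting coefficients, and N the number of channels. These symbols are assumptions for illustration rather than the authors' own notation.

```latex
\begin{align}
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^{2}} \tag{1}\\
\mathrm{mean\_err} &= \frac{\mathrm{RMSE}}{\bar{x}},
\qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{2}
\end{align}
```

Under this reading, mean_err is dimensionless, so fitting errors can be compared across gestures and channel counts even when the absolute RMSE is very small.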
Furthermore, mean_err was quantified, as shown in Table 2. For the same gesture, mean_err gradually increased with an increase in the number of fitting channels. For the same number of fitting channels, mean_err of the closed hand was slightly larger than that of the open hand, as shown in Figure 6. The difference in mean_err was caused by the selection of different numbers of fitting channels. This is because an increase in the number of fitting channels leads to more channels sharing a fitting coefficient to express collaborative actions, which in turn causes data distortion. However, the greater the number of fitting channels, the fewer the fitting coefficients, and the lower the complexity of superimposing finger and wrist motions into their cooperative action.

Performance Evaluation of Different Classifiers.
To make a more intuitive comparison, the average classification accuracy of the four classifiers was considered. The average accuracy was obtained by averaging the discriminant accuracy over all actions of each subject and then over all subjects, as presented in Table 3. With an increase in the number of fitting channels, the overall classification accuracy of the four classifiers gradually decreased, as shown in Figure 7. When the number of fitting channels was eight, the classification accuracies of LDA, KNN, and SVM showed little difference, each maintaining a gesture discrimination accuracy of more than 99%; the accuracy of the RF classifier was approximately 7% lower than that of the other three classifiers, and LDA had the highest accuracy. When the number of fitting channels was 32, the classification accuracy of all four classifiers decreased; LDA again had the highest accuracy; the accuracy of KNN was slightly lower than that of LDA but higher than that of SVM; and the accuracy of RF was more than 10% lower than that of the other three classifiers. When the number of fitting channels increased from 8 to 64, the classification accuracy of LDA decreased by 3.87% but remained above 95%; that of KNN decreased by 7.73% and remained above 91%; that of SVM decreased by 12.3% and remained above 86%; and that of RF decreased by 10.45%, falling to 83.3%. Over the entire classification process, the SVM accuracy decreased the fastest, the LDA accuracy decreased the slowest, RF had the lowest accuracy, and LDA had the highest accuracy.
Therefore, the LDA classifier was found to be relatively suitable for the multitype data classification related to finger and wrist data fitting in our study, compared with the other classifiers.
This is because, in LDA, the projected distance between the data for different types of gestures is maximized and the projected intraclass distance for the same type of gesture is minimized. As predicted, for gesture data fitting, with an increase in the number of fitting channels, the fitting error continued to increase; the difference between the acquired datasets and the datasets fitted with a large number of channels was much larger than that between the acquired datasets and the datasets fitted with a small number of channels. Regardless of the gesture type, the sample size within each class was small. Therefore, with an increase in fitting error, the performance of the KNN classifier became worse. Compared with LDA and KNN, the SVM classifier made more errors. This was because the study was essentially a multitype classification problem, in which finding a separating hyperplane is difficult, so the corresponding support vectors have a large deviation. For the RF classifier, there was not enough in-class data to form reliable decision trees or to support data outside the nodes to improve the accuracy. Therefore, RF is not suitable for this class of problems.
One-way ANOVA was used for statistical analysis to determine whether there existed a significant difference between the characteristic values of 2D-RMS in the finger, wrist, and composite movements. The p-value obtained from the one-way ANOVA test, which involved 255 features, was 1.32E-5 (p-value < 0.05). Because the p-value was below the 0.05 threshold, the results indicated that the 2D-RMS differed significantly among the three types of gesture movements.
In addition, to evaluate the performance of different classifiers on different gestures, a confusion matrix was employed [25,26], as shown in Figure 8. The confusion matrices revealed that the LDA classifier fully recognized seven kinds of actions, but there were 10 erroneous decisions with percentage probabilities ranging from 1.19 to 5.95%. The KNN classifier made completely accurate judgments for two actions, but for the other actions, it made 21 erroneous judgments with percentage probabilities ranging from 1.19 to 4.76%. The SVM classifier made 31 erroneous decisions with probabilities ranging from 1.19 to 4.76%. The random forest classifier made 48 incorrect decisions, with probabilities ranging from 1.18 to 9.52%. Therefore, in terms of the classification and judgment of actions, the classifiers were ranked in the following order: LDA performed best, KNN took second place, SVM was third, and RF performed the worst.
To compare the overall performance of the four classifiers, the precision, recall, and F1 scores were considered, as shown in Table 4. As can be seen from the table, LDA had the best performance and RF had the worst performance.
For an efficient comparison of the four classifiers [27], we calculated the average execution time [28] of the four classifiers on the same platform, as listed in Table 5. The execution times revealed that the running efficiency of RF was far lower than that of the other three classifiers, LDA was slightly slower than KNN, and SVM had a very short running time and high execution efficiency.
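Precision, recall, and F1 scores of the kind reported in Table 4 can be derived from a confusion matrix as sketched below. The row and column convention (rows are true classes, columns are predicted classes), the function name `per_class_metrics`, and the toy matrix are assumptions for illustration and are not taken from Figure 8.

```python
import numpy as np


def per_class_metrics(cm):
    """Precision, recall, F1 per class; cm[i, j] counts true class i predicted as j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    predicted = cm.sum(axis=0)   # column sums: how often each class was predicted
    actual = cm.sum(axis=1)      # row sums: how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1


# Toy 3-class confusion matrix (not data from the study).
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 10]]
precision, recall, f1 = per_class_metrics(cm)
macro_f1 = float(f1.mean())
```

Averaging the per-class values without weighting (a macro average) is one common way to reduce these scores to a single number per classifier; whether the authors used macro or weighted averaging is not stated.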

Limitations.
The results of the classifiers show that 2D-RMS, as a muscle spatial activation evaluation feature, can provide accurate performance for EMG superposition. However, the low computational efficiency caused by the high-dimensional HD-sEMG feature space poses a challenge for online real-time estimation. Therefore, the results of this study were obtained offline. Despite our work on online EMG decomposition, significant research efforts are still required before these features can be used for accurate classification. To reduce the complexity, we used only the 2D-RMS associated with the EMG depth as the feature value for testing the classifiers. The results show that the commonly used classifiers can be evaluated under different channel choices. In future work, we plan to explore the impact of multiple features [29,30] for establishing real-time estimations.
In this study, although we reported three numbers of fitting channels, we performed classification calculations for 4, 8, 16, 32, 64, 128, and 256 channels. For 128 and 256 fitting channels, the spatial resolution was too low, so the fitting error was too large to reflect the superposition effect.
Finally, to ensure the stability and unity of the signal, we required all subjects to use 60% of the maximum voluntary contraction (MVC) when performing movements. However, it did not affect the results of our current research because different subjects exerted different forces. In future research, different forces should be included in the scope of the study, which would be more suitable for practical applications.

Conclusions
In this study, we systematically studied the relationship between independent and cooperative motions of human fingers and wrist. This study covered feature extraction and quantification of spatial activation features of HD-sEMG signals of human forearm extensor and flexor muscles in recognition of finger and wrist movements. These features were then fitted according to the corresponding relationship between the independent and cooperative actions. The focus of this study was to put the fitted compound and acquired action datasets into four types of classifiers for prediction to evaluate the performance of the classifiers in detail. We found that the collaborative motion datasets of the fingers and wrist can be obtained by linear superposition of the independent motion datasets of the fingers and wrist. Four types of common classifiers were used to predict the compound action datasets and acquired signals of different channels: LDA, KNN, SVM, and RF. If the number of fitting channels was 8 or 32, the classification accuracies of LDA, KNN, and SVM were similar, but SVM consumed the least time. However, when the number of fitting channels was 64, the accuracy of LDA was the highest, reaching 95.83%, that of KNN was 91.67%, and the accuracy of SVM dropped sharply to 86.9%. In all cases, RF had the lowest classification accuracy and consumed the most time. The results of the confusion matrix indicated that for gesture classification prediction, LDA had the least error, KNN was the next, SVM was the third, and RF had the most error. Therefore, for hand gesture classification, when the number of fitting channels is less than 32, the classifier can be selected comprehensively; the SVM classifier was highly accurate and less time-consuming. However, when the number of fitting channels is higher than 32, the LDA classifier is preferable because of its high classification accuracy and acceptable time consumption.
Finally, the results of this study can provide reliable guidance for future studies on the correlation between the finger and wrist and promote the further simplification of human forelimb bionics equipment [31].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
A preprint has already been submitted [31].

Conflicts of Interest
The authors declare that they have no conflicts of interest.