
We present a novel computational technique intended for the robust and adaptable control of a multifunctional prosthetic hand using multichannel surface electromyography. The initial processing of the input data was oriented towards extracting relevant time domain features of the EMG signal. Following the feature calculation, a piecewise modeling of the multidimensional EMG feature dynamics using vector autoregressive models was performed. The next step included the implementation of hierarchical hidden semi-Markov models to capture transitions between piecewise segments of movements and between different movements. Lastly, inversion of the model using an approximate Bayesian inference scheme served as the classifier. The effectiveness of the novel algorithms was assessed against methods commonly used for real-time classification of EMGs in a prosthesis control application. The obtained results show that using hidden semi-Markov models as the top layer, instead of hidden Markov models, ranks top in all the relevant metrics among the tested combinations. The choice of the presented methodology for the control of a prosthetic hand is also supported by the equal or lower computational complexity required, compared to other algorithms, which enables implementation on low-power microcontrollers, and by the ability to adapt to user preferences in executing individual movements during activities of daily living.

The methods for estimating the intention of an amputee to control the movement of a prosthetic hand often rely on the interpretation/decoding of the electrical activity of muscles (electromyograms, EMG) recorded on the skin surface of the residual limb [

With the ongoing development of dexterous prosthetic hands [

As one of the approaches for decoding finger movements using only multiple EMG signals, in this study we propose novel classification algorithms based on the combination of EMG feature extraction and piecewise modeling of the feature temporal dynamics, incorporated in hierarchical hidden semi-Markov models (HHSMM) [

Five able bodied subjects participated in the experiment. All the subjects provided informed consent, and the study was approved by the Regional Ethical Review Board in Lund, Sweden.

The methodology presented in this research relies on multielectrode EMG signals for movement classification. During the measurement procedure, a subject was comfortably seated with the right hand resting in a neutral position. The subject was asked to perform a movement to match a hand image shown on the screen in front of him/her. The requested movements were flexion/extension of all five fingers and adduction/abduction of the thumb (12 different movements). Movement and resting periods between movements were of equal length (5 seconds) and were timed by a LabVIEW (National Instruments, Austin, TX) custom application. The participant was visually prompted by the program, which synchronously acquired the EMG signal and the current cue annotation. Two different datasets (one for training and one for testing) each consisting of five repetitions of each movement, totaling 60 movements, and the rest states, were stored with the intended class information, for offline analysis.

The dataset used in this paper was previously recorded and used in the publication by Huang et al. [

Measurement setup as in [

(a) shows sample EMG channels (2, 7, 15, and 15). (b) shows only the channel 7 raw EMG signal (blue) with the visual cues/classes (red). This figure is an example of the latencies between the visual cue and the EMG onset. The same is noticeable for the movement endings, which are delayed with respect to the visual cue. Another important piece of information presented in this figure is the similarity between EMG waveforms for the same movements (i.e., class 12).

A set of commonly used EMG features in the time domain was selected as an additional comparison criterion in this study. The choice of time domain features, versus spectral or time-frequency domain features, is in line with the tendency of deriving a control chain that could be implemented in an embedded system with limited processing power/speed. Among the relatively high number of reported features, we chose three-amplitude-related and three-frequency-related time-based EMG features. We implemented functions for the following EMG features as reported in [

Mean Absolute Value (MAV) is the favored EMG feature in many myoelectric control applications. It is calculated within a moving average window of the rectified EMG signal:

Root Mean Square (RMS) is, similar to MAV, also a windowed function:

Variance (VAR) in the case of sEMG signal could be simplified enabling fast implementation in an embedded system. The resulting formula is the following:

Slope Sign Change (SSC) is a time domain method used to estimate a frequency feature of the EMG signal. The calculation of the SSC relies on counting changes between positive and negative slope among three consecutive samples. To limit the SSC calculation to periods with EMG activity, a threshold function is imposed in the feature extraction method:

Zero Crossing (ZC) is the function that counts the number of consecutive EMG samples that change sign within the sliding window. Similar to SSC calculation, the threshold function is imposed to remove calculation of the ZC during periods without pronounced EMG activity:

Willison amplitude (WA) is a measure related to the superimposed action potentials that make up the EMG signal. The WA is the number of differences between consecutive samples that exceed a set threshold:

In the case of SSC, ZC, and WA calculations, thresholds were set based on the estimated white noise level during recording. As each recording session started with a rest period, it was convenient to use the first 0.5 s to calculate thresholds that are used throughout the same recording. A sample of calculated features is shown in Figure

Example of the extracted features calculated from the EMG signal recorded on a single channel. The figure reveals that MAV and RMS features, in this case, almost overlap, but other features contain complementary information regarding EMG signal.

All feature extraction calculations were conducted using MATLAB, with a moving window size of 250 ms and a displacement of 25 ms.
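The six features and the 250 ms / 25 ms windowing described above can be sketched in Python as follows. The exact formulas used by the authors are not reproduced in this section, so the definitions below are the widely used textbook forms and should be read as assumptions; the function names, the single shared threshold `thr`, and the sampling rate `fs` are hypothetical and chosen only for illustration.

```python
import numpy as np

def emg_features(x, thr):
    """Time-domain features of one analysis window x (1-D array).

    Assumed textbook definitions; thr is a noise threshold in signal units.
    """
    x = np.asarray(x, dtype=float)
    d = np.diff(x)                            # x[i+1] - x[i]
    mav = np.mean(np.abs(x))                  # mean of the rectified signal
    rms = np.sqrt(np.mean(x ** 2))
    var = np.sum(x ** 2) / (len(x) - 1)       # variance assuming zero-mean EMG
    # SSC: the slope changes sign at a sample and at least one slope exceeds thr
    ssc = np.sum((d[:-1] * d[1:] < 0) &
                 ((np.abs(d[:-1]) >= thr) | (np.abs(d[1:]) >= thr)))
    # ZC: consecutive samples differ in sign and their difference exceeds thr
    zc = np.sum((x[:-1] * x[1:] < 0) & (np.abs(d) >= thr))
    # WA: consecutive differences larger than thr
    wa = np.sum(np.abs(d) > thr)
    return {"MAV": mav, "RMS": rms, "VAR": var,
            "SSC": int(ssc), "ZC": int(zc), "WA": int(wa)}

def sliding_features(x, fs, thr, win_s=0.25, step_s=0.025):
    """Apply emg_features on a 250 ms window moved in 25 ms steps."""
    win, step = int(round(win_s * fs)), int(round(step_s * fs))
    return [emg_features(x[i:i + win], thr)
            for i in range(0, len(x) - win + 1, step)]

demo = emg_features([1.0, -1.0, 1.0, -1.0], thr=0.5)
frames = sliding_features(np.zeros(1000), fs=1000, thr=0.5)
```

In this sketch one feature vector is produced every 25 ms, which is the observation rate assumed later for the per-sample accuracy calculation.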

Here, we will extend our previous work [

VARHHMM flowchart. At the hierarchically higher HMM layer, the two algorithms are split before the Bayesian inference layer. While one algorithm for HMM (orange path) is the same as the HMM for segments, the other algorithm is based on hidden semi-Markov model (HSMM) paradigm that incorporates movement duration as a free parameter.

From the basic flowchart, it can be noted that the main difference between the two algorithms is in the top layer of the hierarchy. In the case of the VARHHMM, the layer that identifies the current movement takes only the signal likelihood as its input. In the case of the VARHHSMM, the active movement label also depends on the expected duration of individual movements. In what follows, we will refer to these two classifiers as hierarchical hidden Markov model (HMM) and hierarchical hidden semi-Markov model (HSMM), where we dropped the VAR and hierarchical labels for readability.

We assumed here that the selected time domain EMG features could be represented by a number of segments. To accommodate signal dynamics in real time and enable creating a classifier suitable for embedded implementation, we defined individual segments as first-order vector autoregressive models, VAR(1). Using the VAR process for approximating the dynamics of a segment of an EMG time series allows for efficient prediction of feature dynamics and implicitly accounts for the noise present in the recorded EMG signals. Formally, the VAR(1) process is defined as

Based on predicted and measured sample vectors we can calculate which of the possible (fitted) VAR(1) models provides the best description of the recorded time series. The transition between individual segments (between different VAR(1) models) is considered as stochastic; thus, in our realization it is modeled as a HMM (as in [
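As a rough illustration of selecting the best-fitting segment model from predicted and measured sample vectors, the sketch below fits two VAR(1) models by least squares and scores a new sample under each. It assumes the generic form x_t = c + A x_{t-1} + e_t with Gaussian noise e_t; the parameterization, the fitting method, and all names (`fit_var1`, `var1_log_lik`, `simulate`) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def fit_var1(X):
    """Least-squares fit of x_t = c + A x_{t-1} + e_t to the rows of X (T x d)."""
    X = np.asarray(X, dtype=float)
    past = np.hstack([X[:-1], np.ones((len(X) - 1, 1))])   # regressors [x_{t-1}, 1]
    coef, *_ = np.linalg.lstsq(past, X[1:], rcond=None)
    resid = X[1:] - past @ coef
    Sigma = resid.T @ resid / len(resid) + 1e-9 * np.eye(X.shape[1])
    return coef[:-1].T, coef[-1], Sigma                    # A, c, noise covariance

def var1_log_lik(x_prev, x_now, model):
    """Gaussian log-likelihood of the one-step prediction error."""
    A, c, Sigma = model
    err = x_now - (c + A @ x_prev)
    _, logdet = np.linalg.slogdet(Sigma)
    maha = err @ np.linalg.solve(Sigma, err)
    return -0.5 * (len(err) * np.log(2 * np.pi) + logdet + maha)

def simulate(A, c, n, rng, noise=0.05):
    """Generate n samples from a VAR(1) process started at zero (toy data)."""
    x, out = np.zeros(len(c)), []
    for _ in range(n):
        x = c + A @ x + noise * rng.standard_normal(len(c))
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
A_true = 0.5 * np.eye(2)
seg_a = simulate(A_true, np.array([1.0, 1.0]), 300, rng)    # settles near +2
seg_b = simulate(A_true, np.array([-1.0, -1.0]), 300, rng)  # settles near -2
models = [fit_var1(seg_a), fit_var1(seg_b)]
scores = [var1_log_lik(seg_a[-2], seg_a[-1], m) for m in models]
best = int(np.argmax(scores))   # index of the model that best explains the sample
```

The per-model log-likelihoods computed this way are the kind of quantity that an HMM layer can use as emission evidence.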

By applying an approximate Bayesian inference procedure to the hierarchically arranged HMM and the emission distributions defined with VAR(1) models, we can classify each of the measured data points in a time series. The Bayesian inference for real-time signal classification relies on estimating the posterior probability for each movement. The posterior probability is calculated by combining expectations over movement probabilities with the evidence provided by the emission distributions (observation likelihoods). The expectations (predictions) are estimated using the posterior distribution at previous time sample and the transition matrices of movements and associated segments. Using the values of the posterior probability
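The prediction and update steps described in this paragraph correspond to a standard recursive Bayesian filter. The following is a minimal sketch of one such step over a flattened set of segment states, plus the marginalization from segments to movements. It is a generic HMM forward step, not necessarily the authors' approximate inference scheme, and the names `filter_step`, `movement_posterior`, and `seg_to_move` are hypothetical.

```python
import numpy as np

def filter_step(post_prev, trans, log_lik):
    """One filtering step.

    post_prev: posterior over states at the previous sample, shape (K,)
    trans:     transition matrix with trans[i, j] = P(state j | state i)
    log_lik:   emission log-likelihood of the new sample under each state
    """
    pred = trans.T @ post_prev                 # expectation (prediction)
    w = pred * np.exp(log_lik - np.max(log_lik))   # combine with evidence
    return w / w.sum()                         # normalized posterior

def movement_posterior(post, seg_to_move, n_moves):
    """Sum segment posteriors that belong to the same movement."""
    return np.bincount(seg_to_move, weights=post, minlength=n_moves)

trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
post = filter_step(np.array([0.5, 0.5]), trans, np.log(np.array([0.8, 0.2])))
moves = movement_posterior(np.array([0.2, 0.3, 0.5]), np.array([0, 0, 1]), 2)
label = int(np.argmax(moves))                  # most probable movement
```

Each step costs one matrix-vector product and a normalization, which is consistent with the paper's emphasis on low run-time complexity.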

The HSMM algorithm is similar to the HMM algorithm with the only difference being that each movement includes a sequence of durations associated with it. This approach increases the number of free parameters with an additional vector comprising prior duration probabilities of individual movements. In other words, the probability of switching between movements is also influenced by the model of movement duration. An illustration of a specific realization of the movement switching sequence of the HSMM is shown in Figure

HSMM movement models. M1–M3 denote different dynamic movements. The HSMM algorithm uses distributions of individual movements durations to constrain transitions between them.

The implementation of the HSMM algorithm relies on a timer, or a counter (Cnt) in the case of equidistant sampling. Upon entering a state/movement, the counter is set to a predefined value related to the duration of that particular state and is decremented with every new sample. When the counter reaches 0, the state/movement is free to change. Upon switching to a new state, the counter is reset to the value associated with that state. This additional condition is defined as

In the case of HSMM, the duration of a movement is provided in the form of a prior distribution. As the measurement protocol was executed in an automated manner, all performed movements are of the same duration (~5 s) with some variability related to the subject's reaction times to a visual cue. In order to simulate a more realistic end-user scenario, we defined the prior distribution to have a duration range with 2-second variability (
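The counter mechanism can be illustrated with a few lines of Python. This is only a sketch of the gating idea: it assumes the counter is loaded with a per-movement minimum dwell (in samples) on entry and that a proposed label from the likelihood layer is accepted only once the counter has reached zero. The names `hsmm_step`, `run_hsmm`, and `min_duration`, and the way the duration prior is reduced to a single number per movement, are assumptions for illustration.

```python
def hsmm_step(state, cnt, proposed, min_duration):
    """Advance the duration-gated state by one sample.

    state:        current movement label
    cnt:          samples remaining before a switch is allowed
    proposed:     label favored by the likelihood layer for this sample
    min_duration: dict mapping each label to its dwell length in samples
    """
    if cnt > 0:
        return state, cnt - 1                      # still dwelling: ignore proposal
    if proposed != state:
        return proposed, min_duration[proposed]    # switch and re-arm the counter
    return state, 0                                # free to change, no change proposed

def run_hsmm(proposals, start, min_duration):
    """Filter a sequence of proposed labels through the counter gate."""
    state, cnt, out = start, min_duration[start], []
    for p in proposals:
        state, cnt = hsmm_step(state, cnt, p, min_duration)
        out.append(state)
    return out

labels = run_hsmm([1, 1, 1, 0, 0, 0, 0], start=0, min_duration={0: 2, 1: 3})
```

In the toy run, the switch to movement 1 is delayed until the dwell of movement 0 has elapsed, and short reversals are suppressed in the same way.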

As the number of segments per movement is a free parameter of both algorithms, it was of specific interest to evaluate the performance of the classification with respect to the division of a movement by the automated optimization procedure. Specifically, the case of a single segment per movement was analyzed independently. The rationale for this lies in the dynamics of the extracted features in the time domain. When visualizing the recorded signals, a movement in the feature space resembles the step response of a dynamic system, with a relatively short transient compared to the constant part. When ignoring the noise part of the vector autoregressive model (see above) we can rewrite the update equation as

Example of the segment switching with AR(1) model (a one-dimensional VAR(1) model) with the forced one segment per movement. Blue line represents MAV feature dynamics when switching from a rest state to a movement, and the red line represents example of a single AR segment optimized for that movement.
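The step-response behavior of a single noise-free AR(1) segment can be reproduced numerically. The sketch below assumes a mean-reverting form x_t = mu + a (x_{t-1} - mu), which is one common way to write a noise-free AR(1) update with a segment mean; the exact equation used in the paper is not reproduced here, and `mu`, `a`, and the function name are illustrative.

```python
def ar1_step_response(mu, a, x0=0.0, n=50):
    """Iterate x_t = mu + a * (x_{t-1} - mu) starting from x0 (noise-free)."""
    x = [x0]
    for _ in range(n):
        x.append(mu + a * (x[-1] - mu))
    return x

# Switching from rest (feature value 0) to a movement whose segment mean is 1
trace = ar1_step_response(mu=1.0, a=0.8)
```

Under this assumption the deviation from the mean shrinks by the factor `a` at every sample, so x_t = mu + a^t (x0 - mu): a short transient followed by a constant level, which is the shape a single segment needs to capture.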

The estimation of the free parameters of all models presents the only computationally demanding procedure of the presented methodology. During the optimization, the free parameters of the VAR models (number of segments per movement, mean signal values of individual segments

Optimization procedure flowchart. The main load of the optimization procedure is division of movements to segments. With a higher number of segments per movement, the Viterbi algorithm needs a large number of initial conditions for the model to converge to the global minimum instead of converging to a local minimum. This movement dividing method is not deterministic, resulting in slightly different models with every optimization run.

The optimization procedure is an iterative process where from the initial values of VAR parameters the Viterbi algorithm determines the most likely sequence of segments and estimates transition matrices. The following step includes reestimation of VAR parameters based on a maximum likelihood calculation and the most likely separation of training data suggested by the Viterbi algorithm. The procedure shown in Figure
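For reference, the Viterbi step of such an iteration can be sketched as below. This is the standard log-domain Viterbi algorithm for a generic HMM, given per-sample emission log-likelihoods (for example from the current VAR models); it is not the authors' implementation, and the argument names are assumptions.

```python
import numpy as np

def viterbi(log_pi, log_T, log_B):
    """Most likely state sequence.

    log_pi: (K,) log initial state probabilities
    log_T:  (K, K) log transition matrix, log_T[i, j] = log P(j | i)
    log_B:  (N, K) emission log-likelihood of each sample under each state
    """
    N, K = log_B.shape
    delta = log_pi + log_B[0]                  # best log-score ending in each state
    back = np.zeros((N, K), dtype=int)         # best predecessor of each state
    for t in range(1, N):
        scores = delta[:, None] + log_T        # scores[i, j]: come from i, go to j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[t]
    path = [int(delta.argmax())]
    for t in range(N - 1, 0, -1):              # backtrack from the last sample
        path.append(int(back[t, path[-1]]))
    return path[::-1]

log_pi = np.log(np.array([0.5, 0.5]))
log_T = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
log_B = np.log(np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]))
path = viterbi(log_pi, log_T, log_B)
```

The decoded path partitions the training data into segments, after which the parameters of each segment model can be re-estimated from the samples assigned to it.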

Bearing in mind that the optimization method could converge towards a local optimum, we repeated the whole optimization procedure for 1000 different initial values of free parameters and only the solution with the lowest value of Bayesian Information Criterion (BIC) [
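Selecting among restarts by BIC can be sketched as follows, using the usual definition BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of samples, and L the maximized likelihood. The tuple layout of `runs` and the function names are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def select_best(runs, n_samples):
    """runs: list of (log_likelihood, n_params, model) from different initial values."""
    return min(runs, key=lambda r: bic(r[0], r[1], n_samples))

# Two hypothetical optimization runs on a 20000-sample training set
runs = [(-1000.0, 10, "model_a"), (-990.0, 40, "model_b")]
chosen = select_best(runs, n_samples=20000)
```

In this toy case the second run has a slightly higher likelihood but is penalized for its 30 extra parameters, so the simpler model is kept.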

To evaluate the effectiveness of our algorithm, we used some of the most commonly used classification methods including

Linear Discriminant Analysis (LDA),

Quadratic Discriminant Analysis (QDA),

Support Vector Machine with the Error-Correcting Output Codes (SVM ECOC),

Linear Classifier with the Error-Correcting Output Codes (LC ECOC),

Linear Discriminant Analysis with the Error-Correcting Output Codes (LDA ECOC),

Naive Bayes (NB),

Random Forest (RF),

Decision Tree (DT).

The implementation of selected classifiers was done using

The ground truth for classification in this paper is set to be the visual cue presented to the subject. Although there is a noticeable latency between the visual cue and the muscle activity in both movement onset and cessation as shown in Figure

The main metric for comparing features and performance of the classifiers was accuracy. For the accuracy calculation, each feature sample (one observation every 25 ms) was treated individually. Moreover, as the rest state is directly involved in active motor driving by prosthesis control, it was also considered equal to the other movement classes. With this evaluation paradigm, each classification point was categorized as a true positive (TP) if the predicted class matches the visual cue label, or as a false positive (FP) if there is mismatch between predicted class and the visual cue. This approach generates a large number of data points, which was in our case approximately 20000 per dataset that could be used to investigate the performances of various features and classifiers.
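The per-sample accuracy described above reduces to a simple count. A minimal sketch, assuming predictions and cue labels are aligned sequences with one entry per 25 ms observation and that the rest state is encoded as an ordinary class label (the function name is illustrative):

```python
def sample_accuracy(predicted, cue):
    """Percentage of observations whose predicted class equals the cue label."""
    if len(predicted) != len(cue):
        raise ValueError("predicted and cue must have the same length")
    tp = sum(p == c for p, c in zip(predicted, cue))   # matches, rest included
    return 100.0 * tp / len(cue)

# Class 0 denotes rest here; the second sample is a mismatch (cue latency)
acc = sample_accuracy([0, 0, 1, 1, 2], [0, 1, 1, 1, 2])
```

Because the cue leads the EMG onset and offset, samples in the latency interval count as errors under this metric even when the classifier follows the muscle activity correctly.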

The cumulative results for all subjects are presented in Table

Classifier accuracy in % for LDA, QDA, KNN (15), SVM ECOC, LC ECOC, LDA ECOC, KNN ECOC, NB, RF, DT, HHMM, HSMM, HHMM with only one segment per movement (HHMM S1), and HSMM with one segment per movement (HSMM S1). The top ranked classifiers are marked with

Feature | LDA | QDA | KNN | SVM ECOC | LC ECOC | LDA ECOC | KNN ECOC | NB | RF | DT | HHMM | HSMM | HHMM S1 | HSMM S1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MAV | 80.28 | 81.94 | 79.10 | 82.43 | 78.06 | 82.02 | 79.19 | 69.36 | 79.37 | 64.86 | | | 82.59 | |
RMS | 80.16 | 81.40 | 78.89 | | 76.92 | 81.37 | 79.74 | 69.81 | 79.44 | 63.68 | | | 69.95 | 81.16 |
VAR | 71.77 | 38.12 | 78.22 | | 70.22 | 76.23 | 78.40 | 51.91 | 79.44 | 64.61 | | | 42.08 | 70.85 |
SSC | 74.80 | 69.29 | 73.73 | | | | 73.99 | 62.56 | 73.83 | 62.70 | 71.41 | 75.56 | 71.41 | 75.56 |
ZC | 74.75 | 73.03 | 72.64 | | | 75.42 | 73.08 | 60.17 | 72.34 | 59.72 | 74.26 | | 74.26 | 75.44 |
WA | 79.20 | 75.54 | 76.73 | 79.50 | 77.96 | | 76.92 | 63.06 | 77.74 | 64.93 | 78.39 | | 78.39 | |
Median | 77.00 | 74.29 | 77.47 | | 76.30 | 78.30 | 77.66 | 62.81 | 78.59 | 64.15 | | | 72.83 | 78.36 |

As the average accuracy across all subjects and all classifiers was the greatest in the case of MAV feature, the results obtained using this feature were analyzed in more detail. Table

Classifier accuracy in % for the different subjects and MAV feature.

SUB | LDA | QDA | KNN | SVM ECOC | LC ECOC | LDA ECOC | KNN ECOC | NB | RF | DT | HMM | HSMM | HMM S1 | HSMM S1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 82.07 | 86.39 | 79.10 | 82.43 | 80.60 | 82.65 | 79.19 | 65.97 | 79.65 | 69.70 | 87.50 | 91.18 | 86.70 | 90.94 |
2 | 80.28 | 81.94 | 79.96 | 84.83 | 80.85 | 83.55 | 80.45 | 69.36 | 83.01 | 65.68 | 83.51 | 85.61 | 83.51 | 85.42 |
3 | 79.69 | 56.82 | 77.19 | 81.26 | 72.77 | 80.31 | 77.98 | 63.79 | 79.48 | 57.75 | 76.28 | 75.76 | 71.51 | 73.43 |
4 | 81.64 | 82.31 | 82.95 | 82.83 | 77.92 | 82.02 | 83.12 | 75.33 | 80.56 | 70.08 | 83.00 | 84.30 | 82.59 | 83.35 |
5 | 80.21 | 66.88 | 78.54 | 80.97 | 78.06 | 78.45 | 78.78 | 70.25 | 74.62 | 63.94 | 77.20 | 82.96 | 62.86 | 78.39 |
Median | 80.28 | 81.94 | 79.10 | 82.43 | 78.06 | 82.02 | 79.19 | 69.36 | 79.65 | 65.68 | 83.00 | 84.30 | 82.59 | 83.35 |
Rank | 8 | 7 | 11 | 5 | 12 | 6 | 10 | 13 | 9 | 14 | 3 | 1 | 4 | 2 |
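The Median and Rank rows of this table follow directly from the per-subject accuracies, which can be checked with a short script. The dictionary keys are shorthand labels introduced here for the fourteen columns ("E" for the ECOC variants, "S1" for the one-segment variants).

```python
from statistics import median

# Per-subject accuracies (subjects 1 to 5) for the MAV feature, copied from the table
acc = {
    "LDA":     [82.07, 80.28, 79.69, 81.64, 80.21],
    "QDA":     [86.39, 81.94, 56.82, 82.31, 66.88],
    "KNN":     [79.10, 79.96, 77.19, 82.95, 78.54],
    "SVM E":   [82.43, 84.83, 81.26, 82.83, 80.97],
    "LC E":    [80.60, 80.85, 72.77, 77.92, 78.06],
    "LDA E":   [82.65, 83.55, 80.31, 82.02, 78.45],
    "KNN E":   [79.19, 80.45, 77.98, 83.12, 78.78],
    "NB":      [65.97, 69.36, 63.79, 75.33, 70.25],
    "RF":      [79.65, 83.01, 79.48, 80.56, 74.62],
    "DT":      [69.70, 65.68, 57.75, 70.08, 63.94],
    "HMM":     [87.50, 83.51, 76.28, 83.00, 77.20],
    "HSMM":    [91.18, 85.61, 75.76, 84.30, 82.96],
    "HMM S1":  [86.70, 83.51, 71.51, 82.59, 62.86],
    "HSMM S1": [90.94, 85.42, 73.43, 83.35, 78.39],
}

medians = {name: median(vals) for name, vals in acc.items()}
# Rank 1 is the classifier with the highest median accuracy
order = sorted(medians, key=medians.get, reverse=True)
ranks = {name: i + 1 for i, name in enumerate(order)}
```

Using the median rather than the mean keeps single poor subjects, such as QDA on subject 3, from dominating the ranking.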

Focusing on the VARHHMM methodology, the confusion matrix illustrating its ability to separate individual finger movements and resting periods is shown in Figure

Confusion matrix, HSMM, MAV, Subject 4. There are around 800 observations per movement and 10100 in the rest class. The algorithm underperformed when discriminating classes 3, 8, 9, and 10, with class 9 being most problematic as the majority of observations were misclassified as class 3.

Figure

Sample classifications of the HSMM. MAV (feature with the highest accuracies across all classifiers) and Subject 4 (median results).

Other metrics implemented to characterize classifier behavior in real time are motion selection (MS) and motion completion (MC) times. The calculation of MS and MC is derived from the papers of Li et al. and Ortiz-Catalan et al. [

Motion selection (MS) and motion completion (MC) time. Mean values for MS time ranged from 0.25 s (HSMM-RMS) to 0.6 s (HHMM S1-ZC) and for MC time ranged from 1.3 s (HSMM-RMS) to 1.8 s (DT-SSC).
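The exact MS and MC definitions are not spelled out in this section. The sketch below uses one commonly cited formulation as an assumption: onset is the first non-rest decision, MS is the time from onset to the first correct decision, and MC is the time from onset to the n-th correct decision. The value `n_correct=10`, the 25 ms decision step, the rest label, and the function name are all illustrative choices.

```python
def ms_mc_times(pred, target, rest=0, step_s=0.025, n_correct=10):
    """Motion selection and completion times (in seconds) for one trial.

    pred:   sequence of predicted labels, one per decision step
    target: the cued movement label for this trial
    Returns (ms, mc); an entry is None if the event never occurs.
    """
    onset = next((i for i, p in enumerate(pred) if p != rest), None)
    if onset is None:
        return None, None                       # no movement was ever detected
    correct = [i for i in range(onset, len(pred)) if pred[i] == target]
    ms = (correct[0] - onset) * step_s if correct else None
    mc = ((correct[n_correct - 1] - onset) * step_s
          if len(correct) >= n_correct else None)
    return ms, mc

# Toy trial: two wrong decisions (class 3) before the target class 5 is selected
ms, mc = ms_mc_times([0, 0, 3, 3, 5, 5, 5, 5], target=5, n_correct=3)
```

With these assumptions, the toy trial gives an MS of two decision steps and an MC of four decision steps after onset.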

We also analyzed another parameter, namely, the optimal number of segments per movement that results from the optimization procedure. For both HHMM and HSMM algorithms, two segments per movement most frequently produced the highest accuracy score (Figure

Optimization of the parameters. The chart shows that there is no clear advantage in using a high number of segments per movement, as in many cases even one segment is enough to achieve the highest classification accuracy. The same conclusion could be made for the number of AR samples, as having only one previous sample available for making a prediction could lead to high accuracy.

As the performance of the HSMM algorithm depends on the predefined distribution of possible movement durations, analyzing this parameter was of great importance. During the measurements, the duration of movements was governed by the visual cues that were presented in an automated manner. As a result, the movements have roughly the same duration of 5 s. To artificially introduce variability in the expected movement duration, we expanded the uniform duration distribution around a central point of 5 s. With each extension of the expected movement duration range, we evaluated the classification accuracy. As presented in Figure

Dependence of the accuracy of the HSMM classifier on the range of the duration distribution. The graph shows that the accuracy decreases almost linearly as the movement duration range increases.

In this paper, we presented novel algorithms for classifying features from surface EMG signals. Both of the proposed algorithms (HHMM and HSMM) are variations of the VARHHMM algorithm which combines vector autoregression, hidden Markov models, and Bayesian inference. The main focus of the presented results is the comparison between different EMG classifiers, including some of the most commonly used ones and VARHHMM variants. The results presented in Table

The results also reveal that reducing the number of segments per movement to one does not result in a considerable drop in accuracy scores for the derived algorithms (with the exception of the VAR feature). This fact has a significant impact on the possible implementation of the algorithms, as the optimization, even in the case of long time series (for example, tens of minutes), could be executed in a matter of seconds. As an illustration, optimization of algorithm variants with the number of segments per movement as a free parameter takes up to 10 minutes per movement (Python 3.6 and Intel i7 processor). Depending on the embedded platform, our estimate is that optimization with one segment per movement could still be done in less than a minute. The other fact that contributes to the low processing load during optimization and real-time application is that the algorithm does not need high-order autoregressive models. As presented in Figure

As this study also aimed to find out which feature extraction method works best when coupled with the proposed algorithms and the commonly used ones, we carried out a systematic test that included all features and subjects. With the EMG signals that we acquired, using the MAV feature as the first step in the classification chain produced the highest accuracy scores among classifiers. The performance of MAV was closely matched by RMS, while WA produced the best results among the frequency-related time-based features.

When compared with the reported results of Li et al. [

The study performed on our set of multichannel surface EMG indicates that using the MAV feature coupled with the HSMM algorithm leads to higher movement decoding accuracy than other combinations of features and classifiers. This combination also guarantees the shortest MS and MC times, which influence the response of a prosthetic hand to user intent.

The main advantages inherent to our algorithms compared to the existing methods are the following. (1) Low computational complexity for the execution of the algorithm: following the optimization procedure, which is computationally demanding but executed only once per model, the implementation of our algorithm comes down to basic algebraic operations that could be implemented even on low-end microcontrollers. This feature also permits true real-time execution, with the delay related only to EMG feature extraction and AR depth. Additionally, we tested even faster variations of the developed algorithms that have only one segment per movement. With a small drop in performance, these algorithms, especially HSMM with one segment per movement, significantly decreased optimization time and further decreased execution load. (2) Easy expansion: once optimized, models could be stored, and in the case of introducing a new movement class, the optimization could be performed only on the newly added movement class. (3) Noise resilience: in comparison with some weak classifiers and threshold-centered rule-based algorithms, the VARHHMM approach implicitly takes into account possible sources of stochastic noise.

The method proposed in this paper is intended for decoding individual finger movements for directly controlling the actuated fingers of a prosthetic hand. Although desirable, this approach is not common, as prosthesis use during activities of daily living mostly consists of synergistic movements (grasps). Thus, as an alternative control paradigm, the proposed method could be implemented as a state machine classifier-controller. In this approach, the separable classes might be used to initiate fixed grasps (pinch/power/lateral/open), as this is what most commercial hands support.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This research is supported by the EU-Funded DeTOP Project (EIT-ICT-24-2015, GA no. 687905) and the Swedish Research Council (637-2013-444).