Neuronal spike sorting algorithms are designed to retrieve neuronal network activity on a single-cell level from extracellular multiunit recordings with Microelectrode Arrays (MEAs). In typical analysis of MEA data, one spike sorting algorithm is applied indiscriminately to all electrode signals. However, this approach neglects the dependency of each algorithm’s performance on the properties of the neuronal signal at each channel, which calls for data-centric methods. Moreover, sorting is commonly performed off-line, which is time- and memory-consuming and prevents researchers from getting an immediate view of ongoing experiments. The aim of this work is to provide a versatile framework to support the evaluation and comparison of different spike classification algorithms suitable for both off-line and on-line analysis. We incorporated different spike sorting “building blocks” into Matlab-based software, including 4 feature extraction methods, 3 feature clustering methods, and 1 template matching classifier. The framework was validated by applying different algorithms to simulated and real signals from neuronal cultures coupled to MEAs. Moreover, after selection of the most suitable sorting methods, the system proved effective in running on-line analysis on a standard desktop computer. This work provides a useful and versatile instrument for a supported comparison of different options for spike sorting towards more accurate off-line and on-line MEA data analysis.
Simultaneous multisite recordings using Microelectrode Arrays (MEAs) coupled to cultured neuronal networks are a widely applied approach in the field of
Many algorithms with different levels of complexity and automaticity have been proposed to sort neuronal spikes [
Despite many efforts to tackle the spike sorting problem, it is still difficult to identify the best algorithm with large generality and also to define which spike sorter is the most appropriate under specific circumstances [
Overview of spike sorting algorithms.
Reference | Feature extraction | Clustering |
---|---|---|
Letelier and Weber [ | Wavelet | Fuzzy- |
Harris et al. [ | PCA | Expectation maximization |
Zouridakis and Tam [ | Waveforms | Fuzzy- |
Hulata et al. [ | Wavelet | |
Egert et al. [ | PCA | Manual cluster cutting |
Shoham et al. [ | PCA | Expectation maximization |
Quiroga et al. [ | Wavelet packet coefficients | Superparamagnetic clustering |
Rutishauser et al. [ | — | Template matching |
Cho et al. [ | LDA | Fuzzy- |
Adamos et al. [ | PCA | Expectation maximization |
Awais and Andrew [ | Zero crossing | |
Biffi et al. [ | PCA | Hierarchical clustering |
Takekawa et al. [ | Wavelet | Bayes |
Gibson et al. [ | Discrete derivative | Fuzzy- |
Cheng et al. [ | PCA | Density-based clustering |
Liu et al. [ | PCA | Valley-seeking |
Lai et al. [ | Wavelet | Gray relation analysis |
Bestel et al. [ | PCA, wavelet, geometrical features | Expectation maximization |
Yuan et al. [ | Wavelet | |
Oliynyk et al. [ | PCA | Fuzzy- |
Kwon et al. [ | DWT | Expectation maximization, |
Englitz et al. [ | Geometrical features | 1D clustering |
Paraskevopoulou et al. [ | FSDE | |
Nick et al. [ | PCA, DWT, geometrical features | Expectation maximization |
MCRack (Multi Channel Systems GmbH) | — | Manual amplitude window |
Spike2 (Cambridge Electronic Design Ltd.) | PCA | Template matching |
Off-Line Sorter (Plexon Inc.) | PCA | Expectation maximization, |
Overview of the literature about spike sorting algorithms, including published papers about methods, custom toolboxes, and commercial software.
Neuronal spikes recorded with MEAs can be sorted with two different approaches: (i) off-line, which means that spikes are sorted after the acquisition and storage of raw voltage traces, or (ii) on-line, which means that spikes are sorted during data acquisition. In the first case, the information about the spikes collected throughout the recording is available to the algorithm; in the other case, only information available up to the current point in time can be exploited to sort a spike. Although in off-line modality spike waveforms are classified with a better accuracy [
To our knowledge, it is difficult to find a spike sorting framework that incorporates alternative methods for all the spike sorting processing steps and that allows on-line analysis with any selected method. To this end, in this work different spike sorting methods proposed in the literature and suitable for on-line analysis have been selected and integrated within a software environment familiar to MEA users, that is, Matlab. The tool lets users select a method according to the data set at hand, optionally a different one for each electrode of the same MEA. Here, the working principle of each algorithm is described together with the metrics and the evaluation flow used to assess the performance of each method. To validate the tool, the accuracy of the implemented methods on neuronal signals (both simulated and real recordings) is reported and discussed. Finally, the work reports a comparison of the runtimes of the implemented spike sorting algorithms during MEA data acquisition with a test-bed setup, showing that on-line operations are feasible.
An overview of the processing blocks and implemented methods is presented in Figure
Scheme of the spike sorting processing algorithms incorporated in this work. For each electrode the raw signal is preprocessed before the subsequent spike detection by a threshold-based algorithm (i.e., AdaBandFlt [
To attain accurate spike sorting, spike waveforms have to be properly detected and aligned. The most common methods to detect spikes apply a threshold to the voltage of the input signal, computed as a multiple of the standard deviation of the signal over a predefined window. In this work, spike waveforms data provided as inputs to spike classifiers were obtained by means of an adaptive threshold-based algorithm, that is, “AdaBandFlt,” fully described in [
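The generic threshold rule mentioned above (a threshold set to a multiple of the signal's standard deviation over a window) can be sketched as follows. This is a minimal illustration of the common fixed-threshold approach, not the adaptive "AdaBandFlt" algorithm used in this work; the function name `detect_spikes`, the multiplier `k`, and the `refractory` gap in samples are hypothetical choices made for the example.

```python
import statistics


def detect_spikes(signal, k=5.0, refractory=25):
    """Fixed-threshold spike detection over one window of samples.

    The threshold is k times the standard deviation of the window
    (k and refractory are illustrative values, not taken from the paper).
    Returns the indices of threshold crossings and the threshold itself.
    """
    sigma = statistics.pstdev(signal)      # standard deviation of the window
    threshold = k * sigma
    spikes = []
    last = -refractory                     # allows a detection at index 0
    for i, value in enumerate(signal):
        # Accept a crossing of either polarity, at most once per refractory gap.
        if abs(value) > threshold and i - last >= refractory:
            spikes.append(i)
            last = i
    return spikes, threshold


if __name__ == "__main__":
    trace = [0.1, -0.1] * 50               # low-amplitude background
    trace[20] = -5.0                       # two artificial deflections
    trace[70] = 4.0
    print(detect_spikes(trace))
```

Because large spikes inflate the standard deviation, practical detectors often estimate the noise level more robustly or adapt the threshold over time, which is the motivation for adaptive schemes.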
Four feature extraction algorithms, coupled to suitable dimensionality reduction methods and to three feature clustering methods, and one waveform clustering method (i.e., without a feature extraction phase) have been selected for implementation (Figure
Properties of the selected feature extraction methods.
Method | Domain | Percentage of publications | Need of training before on-line FE |
---|---|---|---|
Principal Components Analysis (PCA) | Time | 36% | Yes |
First and Second Derivative Extrema (FSDE) | Time | 3% | No |
Geometric features (GEO) | Time | 13% | Yes |
Discrete Wavelet Transform (DWT) | Time/scale | 26% | Yes |
Other methods | — | 22% | — |
The “domain” column refers to the analysis domain in which each method works, for example, time domain or time/scale domain. The “percentage of publications (with respect to Table
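As an illustration of why a method such as FSDE needs no training, the sketch below computes features from the extrema of the first and second discrete derivatives of a single waveform. It uses only the spike at hand. Which extrema the original FSDE method retains is not stated in this text, so returning both the maximum and the minimum of each derivative is an assumption, as is the function name `fsde_features`.

```python
def fsde_features(waveform):
    """Extrema of the first and second discrete derivatives of one spike.

    Assumption: both the maximum and the minimum of each derivative are
    returned; the exact subset used by FSDE is not specified here.
    """
    # First derivative: difference between consecutive samples.
    first = [b - a for a, b in zip(waveform, waveform[1:])]
    # Second derivative: difference of the first derivative.
    second = [b - a for a, b in zip(first, first[1:])]
    return (max(first), min(first), max(second), min(second))


if __name__ == "__main__":
    print(fsde_features([0, 1, 3, 2, 0]))
```

By contrast, PCA needs a set of example spikes to estimate the principal directions before new spikes can be projected, which is consistent with the training requirement listed in the table.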
Properties of the implemented and evaluated clustering algorithms.
Method | Input | Percentage of publications | Automaticity | Parametric | Need of training before on-line clustering |
---|---|---|---|---|---|
| Spike features | 30% | Yes | Yes | — |
Fuzzy- | Spike features | 25% | Yes | Yes | Yes |
Density-based (DBC) | Spike features | 4% | Yes | No | Yes |
O-sort | Spikes | 11% | Yes | Yes | No |
Other methods | — | 30% | — | — | — |
The “percentage of publications (with respect to Table
To validate the implemented code, the methods have been applied to simulated signals and real MEA recordings.
Features of the simulated data set. Spike waveforms were selected from a database of averaged spike waveforms obtained from spontaneous activity recorded in hippocampal and cortical
The algorithms were implemented and evaluated in Matlab (version R2008b, The Mathworks). Scripts are written in native Matlab code, apart from some parts written in C and run in Matlab as MEX-files. Source code for MEX-files was written using Microsoft® Visual C++ 2008 Express Edition. Graphical user interfaces (GUI) were designed using the graphical user interface development environment (GUIDE) of Matlab. To convert the file format generated by the acquisition software of our commercial acquisition platform (
The evaluation scheme adopted to assess the performance of the algorithms on simulated and real signals is depicted in Figure
Performance assessment flow. (a) Scheme of the performance assessment procedure employed to evaluate the simulated data set in presence of “ground truth,” obtaining the cluster validity index and the classification accuracy. (b) Scheme of the performance assessment procedure employed to evaluate the set of real signals without a “ground truth,” obtaining the intracluster variance and parameters judged by visual inspection.
To test the effectiveness of spike sorting methods on simulated data sets, two indexes were employed:
(
(
To compare algorithm performances on real signals, two measures were used:
(
(
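The exact definitions of the indexes listed above are not reproduced in this text, so the following is only a sketch of two of them under stated assumptions. Classification accuracy is taken as the percentage of spikes whose cluster's majority ground-truth label matches their own. Intracluster variance is taken as the mean squared Euclidean distance of each spike's feature vector from its cluster centroid. Both function names are hypothetical.

```python
from collections import Counter, defaultdict


def classification_accuracy(true_labels, predicted):
    """Percentage of spikes matching the majority true label of their cluster.

    Assumed definition (needs ground truth, i.e. simulated data).
    """
    by_cluster = defaultdict(list)
    for truth, cluster in zip(true_labels, predicted):
        by_cluster[cluster].append(truth)
    # Each cluster contributes the count of its most frequent true label.
    correct = sum(Counter(t).most_common(1)[0][1] for t in by_cluster.values())
    return 100.0 * correct / len(true_labels)


def intracluster_variance(points, labels):
    """Mean squared distance of each point from its cluster centroid.

    Assumed definition (needs no ground truth, usable on real data).
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    total = 0.0
    for members in groups.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
    return total / len(points)


if __name__ == "__main__":
    print(classification_accuracy([0, 0, 0, 1, 1, 1], [5, 5, 7, 7, 7, 7]))
    print(intracluster_variance([(0, 0), (2, 0), (10, 10), (10, 12)], [0, 0, 1, 1]))
```

Note that a majority-label accuracy rewards over-clustering (many small pure clusters score highly), which is one reason to pair it with a second index or with visual inspection.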
A statistical analysis was carried out with Statistica (StatSoft Inc.) to highlight relevant differences in the performance of the different methods on the data sets. Each group presented as input to the statistical analysis consisted of the values of a performance index (Section
Runtimes of the different spike sorting algorithms were compared in Matlab on the same dedicated desktop computer (quad-core 3.3 GHz CPU with 4 GB RAM running Windows 7 64-bit). The algorithms were launched from a custom script including code for the real-time communication with a MEA A/D device (USB-ME64, Multi Channel Systems GmbH), through a proprietary dynamic-link library distributed by Multi Channel Systems (“McsUsbNet.dll”). Thus, it was possible to evaluate the actual feasibility of an on-line implementation, taking into account the time required for raw data reading, filtering, spike detection, spike sorting, and storage of results. Runtimes were computed in a worst-case scenario, simulated by high-frequency spiking occurring simultaneously in all 60 channels (i.e., 250 Hz [
The algorithms described in Section
Structure and functionalities of the graphical user interface. Structure and functionalities of the GUI, which is composed of a “test data” section (intended for the analysis of simulated signals) and a “real data” section that can be used for either off-line or on-line analysis.
From a main menu (Figure
Besides manual selections of the methods, the GUI embeds an automatic routine which runs all the possible combinations of spike sorting blocks on a selected signal and displays the performance indexes (i.e., CV and CA for simulated data, ICV and CV for real data) (Figure
For a selected method, the GUI shows the performance indexes, the spikes projected and clustered in the feature space, the aligned spike waveforms, color-coded according to the clustering results, and the raster plots of each identified unit, as shown in Figure
Graphical user interface. (a) Screenshot of the GUI built for spike sorting on simulated data. (b) Example of graphical result of spike sorting on multichannel MEA signals, representing the clustered spike waveforms for each electrode of the matrix. (c) Example of graphical result of spike sorting on multichannel MEA signals, representing the spike trains collected by each electrode, with spikes colored according to the signal source.
The GUI and the scripts were written with Matlab R2008b, but they are compatible with all subsequent releases up to version R2014a. The framework is freely available upon request to
Comparison of the separability of simulated spikes in the feature space. (a) Example of projections of the spikes extracted from a simulated signal (i.e., signal #10 of Figure
The FE effectiveness assessed on each signal with the different feature extraction methods is shown in Figure
All the methods showed a similar trend of reduced performance when the noise level was increased (Figure
For PCA, DWT, and FSDE FE methods, the waveforms similarity index (see Section
Overall, the DWT and the GEO feature selection yielded a comparable CV (
Comparison of classification accuracy on the simulated data sets with the benchmark
As mentioned earlier, part of the results shown in Figure
Classification accuracy of all the tested methods.
|
FCM | DBC | O-sort | ||||
---|---|---|---|---|---|---|---|
|
|
|
|
| |||
PCA | 97.16 | 98.46 | 97.11 | 94.76 (12.15) | 91.04 (27.45) | 97.66 | — |
DWT | 96.42 | 77.18 (31.08) | 95.29 | 93.60 (12.68) | 76.13 (24.47) | 93.90 (13.39) | — |
FSDE | 68.42 (29.64) | 69.83 (14.07) | 63.66 (33.26) | 67.01 (25.22) | 36.97 (50.62) | 56.71 (21.24) | — |
GEO | 79.57 (17.82) | 72.25 (28.26) | 77.43 (28.63) | 82.35 (19.77) | 84.99 (18.29) | 88.10 | — |
— | — | — | — | — | — | 94.37 (4.75) |
Spike sorting classification accuracy, CA (%), on the simulated data sets for all the possible combinations of FE (rows) and clustering algorithms (columns) and for O-sort. CA is presented as median and (IQR) over the different signals (
The performances of FCM clustering are shown with respect to two different degrees of fuzziness (i.e.,
Results of the statistical comparison of clustering methods applied after a given FE method are presented in Table
The statistical analysis applied to all the possible combinations of methods confirmed the absence of a unique method outperforming the other when applied indiscriminately to all the signals. Indeed, PCA+
Classification accuracy on the simulated data sets. (a) Indication of which method yielded the highest classification accuracy (CA) for each data set (marked by the red box). (b) Box-plots (median and IQR with whiskers delimited by the maximum and minimum nonoutliers values) of classification accuracy provided by all the methods on all the data sets (
Spike sorting performances measured on real data by visual inspection and by quantitative assessment (i.e., intracluster variance, ICV) were proven to be in good agreement with results on simulated signals. For each combination of FE and clustering algorithms and O-sort, Figure
Performances of the methods on real data. (a) Outcome of the visual inspection on the results of the methods, where the percentage of nonclassified spikes and the ratio between the number of correctly identified clusters and the real number of clusters are reported. Each symbol represents a combination of algorithms, as indicated by the legend and annotations in the graph.
An evaluation aimed at comparing Matlab execution time relative to the feature extraction and clustering steps was performed. Parameter values set for this evaluation were the ones which allowed the best performance for each method (see Section
Computational effort and runtime to process a single spike.
Method | Number of additions | Number of multiplications | Number of if-operations | Time ( |
---|---|---|---|---|
PCA | | | 0 | |
DWT | | | 0 | |
FSDE | | 0 | | |
GEO | | 1 | | |
FCM | | | | |
DBC | 0 | 0 | | |
O-sort | | | 8 | |
Computational requirements to classify a single spike (3 ms waveform sampled at 25 kHz). Columns 2 to 4 report the number of operations per spike in the implemented Matlab code, showing their dependence on the algorithm parameters. The last column reports the resulting execution times (mean ± standard deviation over 100 repetitions). Times are for Matlab running on a desktop computer with a quad-core 3.3 GHz CPU, 4 GB RAM, and Windows 7 64-bit. Asterisks in the first column indicate methods for which an implementation in C language with MEX-files was performed.
Extraction of features from each input spike takes on average a comparable time for PCA, FSDE, and GEO methods (i.e., 5-6
FCM classification of one spike is fourfold slower than DBC classification (i.e., 27
The experimental test performed with the setup described in Section
Evaluation of runtimes of the spike sorting algorithms. Runtimes measured in the experimental setup, for different lengths of the input data block (ms) sent from the acquisition device to Matlab. Runtimes were measured in a worst-case scenario of high firing activity occurring simultaneously at all the 64 channels. Values related to raw data reading, filtering, spike detection, and classification with all the possible methods are reported. The runtime is expressed as a percentage of the input data block length, that is, of the time available for processing before the buffer update (e.g., a runtime percentage of 60% for a 1-second block means that there is a margin of 400 ms for further operations). Times are for Matlab running on a desktop computer with a quad-core 3.3 GHz CPU, 4 GB RAM, and Windows 7 64-bit.
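The runtime percentage used in this caption reduces to simple arithmetic, sketched below. The function names are hypothetical, and the only figures taken from the text are the 60% and 1-second example values.

```python
def runtime_percentage(runtime_ms, block_ms):
    """Runtime as a percentage of the data block length (the time budget)."""
    return 100.0 * runtime_ms / block_ms


def margin_ms(runtime_pct, block_ms):
    """Time left for further operations before the next buffer update."""
    return block_ms * (100.0 - runtime_pct) / 100.0


def fits_online(runtime_ms, block_ms):
    """True if processing completes before the buffer is overwritten."""
    return runtime_percentage(runtime_ms, block_ms) < 100.0


if __name__ == "__main__":
    pct = runtime_percentage(600.0, 1000.0)   # 600 ms of work on a 1 s block
    print(pct, margin_ms(pct, 1000.0), fits_online(600.0, 1000.0))
```

A percentage at or above 100 means the next block arrives before the current one has been processed, so on-line operation is not sustainable for that block length.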
In the simulated scenario of high-frequency spiking occurring simultaneously in all channels, all the feature extraction methods (apart from DWT), coupled to both clustering methods, can process the data before the buffer is overwritten, provided that the data block is longer than ~300 ms. With shorter data blocks, the operation of raw data reading from the acquisition device (black dashed line in Figure
The present work addressed two important issues in the field of spike sorting of neuronal signals collected by means of MEAs, namely, the very limited availability of data-centric spike sorting tools and of on-line ones. Our aim was to provide a framework for an easy comparison of different spike sorting algorithms on the same data, suitable for both off-line and on-line data analysis. Rather than proposing a new sophisticated algorithm, we exploited the modularity of existing spike sorting processes, that is, the presence of several steps and different techniques that can be mixed and matched to adjust the process to the data set. Therefore, the implemented toolbox integrates different spike sorting blocks (i.e., four feature extraction methods, three feature clustering methods, and one template matching clustering), which have been selected from the literature. Thus, it makes it possible to choose the algorithm that performs best on the data of a specific channel, in contrast to commonly used tools, which apply one predefined method to all electrodes. The pool of algorithms integrated in the framework presents features of automaticity and simplicity that are important requirements in spike sorting and facilitate on-line implementations [
Besides the modular software tool, the work has presented an extensive evaluation of the different combinations of feature extraction and clustering methods integrated in the framework. In order to help the users in the selection of their best methods for data processing and to guide the evaluation of multiple algorithms on other data sets, the algorithms were tested both on simulated data sets (as most commonly done to assess the performance of spike sorting methods [
Matlab was chosen primarily because MEA users often share algorithms written in Matlab, which can facilitate the use and extension of the framework. Despite the existence of open-source alternatives (e.g., Python), Matlab is still a very common environment among neurophysiologists and research institutions working with MEAs [
For the off-line mode, the GUI integrates the Neuroshare API, a community-supported, vendor-neutral library. Therefore, with minor modifications, it would be possible to import into Matlab neural data files acquired by platforms other than the one we used (Multi Channel Systems GmbH). For the on-line mode, two types of acquisition boards from Multi Channel Systems were tested (i.e., USB-ME64/128 and MEA2100). However, the framework is expected to be compatible with other acquisition boards provided that an
Taking into consideration that Matlab is slower in performing some operations compared to low level languages [
Further code optimization activity will be performed to reduce the time for spike sorting blocks currently requiring too long computational runtime. To this aim, a more careful optimization of codes, an implementation of MEX-files of all the spike processing steps, and the resorting to the Parallel Computing toolbox [
Moreover, an issue of the current implementation is that on-line parallel data visualization is not possible, due to speed limits of Matlab graphics [
Besides improvements centered on shortening algorithm runtimes, future work will focus on automating the training phase. To this end, we will evaluate the time needed for training (e.g., acquiring a data fragment until a minimum number of spikes has been detected) and the most reasonable frequency of training during acquisition. Repeating the training would be preferable especially for long-lasting experiments, since (i) only neurons that fire during the learning phase can be classified and (ii) the physical relations between neurons and electrodes may change due to cell growth (nonstationarity). A possible solution could be to perform a training step whenever a metric of clustering goodness detects that the features of the data have changed to a considerable extent, as suggested by a recent work [
Furthermore, the tool could be enriched by allowing heterogeneous features extracted from the spikes (e.g., DWT, PCA, and GEO) to be combined, so as to exploit the strengths of each feature extraction method and achieve better performance [
Emilia Biffi is currently working at the Bioengineering Laboratory, Scientific Institute IRCCS Eugenio Medea, Bosisio Parini, Italy.
The authors declare that they have no competing interests.
The authors thank the Alembic facility and Dr. Andrea Menegon for providing them with lab facilities. They are grateful to Dr. Ludovico Minati for his help in setting up the real-time communication between Matlab and the MEA acquisition device.