Abstract
Electrophysiology is entering the era of big data. Multiple probes, each with
hundreds to thousands of individual electrodes, are now capable of
simultaneously recording from many brain regions. The major challenge
confronting these new technologies is transforming the raw data into
physiologically meaningful signals, that is, single unit spikes. Sorting the
spike events of individual neurons from a spatiotemporally dense sampling of the
extracellular electric field is a problem that has attracted much attention
(Rey, Pedreira, & Quian Quiroga, 2015; Rossant et al., 2016) but remains far from solved. Current methods still rely on human
input and thus become infeasible as the size of the data sets grows
exponentially. Here we introduce the t-distributed stochastic neighbor embedding (t-SNE) dimensionality
reduction method (Van der Maaten & Hinton, 2008) as a visualization tool in the spike sorting process. t-SNE
embeds the n-dimensional extracellular spikes (n = number of features by which each spike is decomposed)
into a low- (usually two-) dimensional space. We show that such embeddings, even
starting from different feature spaces, form obvious clusters of spikes that can
be easily visualized and manually delineated with a high degree of precision. We
propose that these clusters represent single units and test this assertion by
applying our algorithm to labeled data sets from both hybrid (Rossant et al., 2016) and paired
juxtacellular/extracellular recordings (Neto et al., 2016). We have released a graphical user interface
(GUI) written in Python as a tool for the manual clustering of the t-SNE
embedded spikes and as a tool for an informed overview and fast manual curation
of results from different clustering algorithms. Furthermore, the generated
visualizations offer evidence in favor of the use of probes with higher density
and smaller electrodes. They also graphically demonstrate the diverse nature of
the sorting problem when spikes are recorded with different methods and arise
from regions with different background spiking statistics.
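
As a rough illustration of the embedding step described above, the following minimal sketch projects a matrix of per-spike feature vectors into two dimensions with t-SNE. It is not the implementation released with the GUI: it assumes scikit-learn's TSNE class as a stand-in, the helper name embed_spikes and its parameters are hypothetical, and the "spikes" are synthetic Gaussian clusters generated only to make the example self-contained.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_spikes(features, perplexity=30.0, seed=0):
    """Embed an (n_spikes, n_features) matrix into 2-D with t-SNE.

    Hypothetical helper for illustration; scikit-learn is an assumed
    stand-in, not the t-SNE implementation used by the authors.
    """
    features = np.asarray(features, dtype=float)
    tsne = TSNE(
        n_components=2,         # low-dimensional (two-dimensional) target space
        perplexity=perplexity,  # must be smaller than the number of spikes
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(features)


# Synthetic stand-in for real spike features: three "units", each a
# Gaussian cloud around its own center in a 12-dimensional feature space.
rng = np.random.default_rng(0)
n_units, spikes_per_unit, n_features = 3, 60, 12
centers = rng.normal(scale=10.0, size=(n_units, n_features))
labels = np.repeat(np.arange(n_units), spikes_per_unit)
features = centers[labels] + rng.normal(size=(labels.size, n_features))

embedding = embed_spikes(features, perplexity=20.0)
print(embedding.shape)  # one 2-D point per spike
```

In practice the rows of the feature matrix would come from whatever decomposition is applied to the detected spikes (for example, principal components per channel), and the resulting two-dimensional points would be plotted and delineated manually.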