Import data & preprocessing

Nabarb edited this page Nov 20, 2024 · 2 revisions

Import data

Data can be imported into the toolbox in several formats. The supported formats are:

Double

If you store your spiking data in a double (or sparse) matrix, you can pass it directly to the toolbox. This method is suitable for homogeneous trials that all have the same length. The spiking data matrix is expected to be three-dimensional, with dimensions (nUnits x TrialL x nTrials), where:

  • nUnits is the number of single units (or multi units) available in the dataset
  • TrialL is the length of the trials. Because the data is stored in a 3D matrix, all trials must have the same length
  • nTrials is the number of available trials in the dataset.

If this option is chosen, the toolbox will require additional arguments:

  • time is a (TrialL x 1) double vector of timestamps. It is mainly used in preprocessing and plotting, and it must have the same length as the second dimension of D.
  • condition is a (nTrials x 1) string vector of trial labels. These are used to mask trials when only some conditions need to be visualized or processed.
  • area is a (nUnits x 1) string vector of unit labels. It is used to aggregate units by the area where they were recorded. Like condition, it is used to visualize or process a subset of units.
D         = spikingData;       % input double data  (nUnits  x TrialL x nTrials)
T         = trialTime;         % Time vector        (TrialL  x 1)
C         = trialLabels;       % Condition labels   (nTrials x 1)
A         = areaLabels;        % Area labels        (nUnits  x 1)
fs        = samplingFrequency; % Sampling frequency (1 x 1)
NE = NeuralEmbedding(D,...
  "fs", fs,...
  "time",T,...
  "area",A,...
  "condition",C);
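
To make the expected shapes concrete, the following is a minimal sketch that builds a synthetic dataset and checks each argument against the dimensions of D. The sizes, labels, spike probability, and the choice of seconds for the timestamps are all assumptions made for illustration; the toolbox itself is not called here.

```matlab
% Hypothetical synthetic dataset: every size and label below is made up.
nUnits  = 4;      % number of recorded units
TrialL  = 1000;   % samples per trial
nTrials = 6;      % number of trials
fs      = 1000;   % sampling frequency in Hz (assumed)

rng(0);                                             % reproducible random spikes
D = double(rand(nUnits, TrialL, nTrials) < 0.02);   % binary spike trains (nUnits x TrialL x nTrials)
T = (0:TrialL-1)' / fs;                             % (TrialL x 1) timestamps, assumed to be in seconds
C = repmat(["left"; "right"], nTrials/2, 1);        % (nTrials x 1) condition labels
A = ["M1"; "M1"; "S1"; "S1"];                       % (nUnits x 1) area labels

% Check that every argument matches the dimensions of D
% before handing the data to the toolbox.
assert(ndims(D) == 3,           'D must be nUnits x TrialL x nTrials');
assert(numel(T) == size(D, 2),  'time must match the 2nd dimension of D');
assert(numel(C) == size(D, 3),  'condition needs one label per trial');
assert(numel(A) == size(D, 1),  'area needs one label per unit');
```

Variables built this way can then be passed to the constructor exactly as in the call above.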

Cell

If you store your spiking data in a cell-array, you can pass it directly to the toolbox. This method is suitable for inhomogeneous trials that may have different lengths. The cell-array is expected to have one cell per trial, hence being nTrials long. Each cell is expected to contain a double (or sparse) 2D spiking data matrix with dimensions (nUnits x TrialL), where TrialL can differ from trial to trial.

If this option is chosen, the toolbox will require additional arguments:

  • time is a (nTrials x 1) cell-array, with each cell containing a (TrialL x 1) double vector of timestamps. It is mainly used in preprocessing and plotting, and each vector must have the same length as the second dimension of the corresponding cell of D.
  • condition is a (nTrials x 1) string vector of trial labels. These are used to mask trials when only some conditions need to be visualized or processed.
  • area is a (nUnits x 1) string vector of unit labels. It is used to aggregate units by the area where they were recorded. Like condition, it is used to visualize or process a subset of units.
D         = spikingData;       % input data, cells  (nTrials x 1) of (nUnits  x TrialL)
T         = trialTimes;        % Time vector, cells (nTrials x 1) of (TrialL  x 1)
C         = trialLabels;       % Condition labels   (nTrials x 1)
A         = areaLabels;        % Area labels        (nUnits  x 1)
fs        = samplingFrequency; % Sampling frequency (1 x 1)
NE = NeuralEmbedding(D,...
  "fs",fs,...
  "time",T,...
  "area",A,...
  "condition",C);
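
As an illustration of the cell-array layout, here is a minimal sketch that builds trials of unequal length and verifies that each time vector matches its trial. The trial lengths, labels, and spike probability are hypothetical, and the timestamps are assumed to be in seconds.

```matlab
% Hypothetical example: trials of deliberately unequal length.
nUnits   = 3;
fs       = 1000;                      % sampling frequency in Hz (assumed)
trialLen = [800; 1000; 1200; 950];    % samples per trial
nTrials  = numel(trialLen);

rng(1);                               % reproducible random spikes
D = cell(nTrials, 1);                 % one (nUnits x TrialL) matrix per trial
T = cell(nTrials, 1);                 % one (TrialL x 1) time vector per trial
for k = 1:nTrials
    D{k} = sparse(double(rand(nUnits, trialLen(k)) < 0.02));
    T{k} = (0:trialLen(k)-1)' / fs;   % timestamps, assumed to be in seconds
end
C = ["reach"; "grasp"; "reach"; "grasp"];   % (nTrials x 1) condition labels
A = ["M1"; "M1"; "PMd"];                    % (nUnits x 1) area labels

% Every trial must have the same number of units, and each time
% vector must be as long as the second dimension of its trial.
assert(all(cellfun(@(d) size(d, 1), D) == nUnits), 'unit count differs across trials');
assert(all(cellfun(@(d, t) size(d, 2) == numel(t), D, T)), 'time/data length mismatch');
```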

Struct

If you store your spiking data in a field of a struct-array, you can pass it directly to the toolbox. This method is suitable for inhomogeneous trials that may have different lengths. The struct-array is expected to have one element per trial, hence being nTrials long, and can store additional information about the data in the struct itself. If everything is provided this way, you won't need additional arguments. If you forget something, the toolbox will remind you (and get angry). Each element of the array is expected to have the following fields:

  • data, (nUnits x TrialL) double or sparse spiking data
  • time, (TrialL x 1) double array of timestamps for the trial
  • area, (nUnits x 1) string array of unit labels
  • condition, (1 x 1) trial label string scalar

With everything properly configured, initializing the toolbox is as simple as:

NE = NeuralEmbedding(D,...
  "fs",fs);
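
The struct-array layout described above could be assembled as in the following sketch. The field names come from the list above; the trial lengths, labels, and spike probability are hypothetical, and the final check is only an illustration of the kind of validation you may want to run yourself, not the toolbox's own.

```matlab
% Hypothetical example: one struct element per trial.
nUnits   = 3;
fs       = 1000;                          % sampling frequency in Hz (assumed)
trialLen = [800; 1000; 1200];             % samples per trial
labels   = ["reach"; "grasp"; "reach"];   % one condition label per trial
areas    = ["M1"; "M1"; "PMd"];           % one area label per unit

rng(2);                                   % reproducible random spikes
D = struct('data', {}, 'time', {}, 'area', {}, 'condition', {});   % empty struct-array
for k = 1:numel(trialLen)
    D(k).data      = double(rand(nUnits, trialLen(k)) < 0.02);   % (nUnits x TrialL)
    D(k).time      = (0:trialLen(k)-1)' / fs;                    % (TrialL x 1), assumed seconds
    D(k).area      = areas;                                      % (nUnits x 1)
    D(k).condition = labels(k);                                  % string scalar
end

% Report any required field that is missing from the struct-array.
required = {'data', 'time', 'area', 'condition'};
missing  = required(~isfield(D, required));
assert(isempty(missing), 'struct-array is missing required fields');
```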

Preprocessing

The preprocessing step is crucial to obtain meaningful and comparable results. Preprocessing is performed automatically after object initialization and goes through the following steps:

  • Pruning of inactive neurons, removeInactiveNeurons(): its objective is to remove units that do not show an acceptable level of activity during most trials. The firing rate threshold and the percentage of trials that counts as "most trials" can be set in the object's parameters.
  • Binning of neural data, binData(): this step bins the sparse neural data. The bin width (in ms) can be set in the object's properties.
  • Smoothing of binned data, smoothData(): this step smooths the binned data with a Gaussian kernel, yielding a continuous instantaneous firing rate (ifr). The smoothing kernel width can be set in the object's properties.
  • Z-scoring of binned data, zscoreData(): the smoothed data is then z-scored to homogenize the wide range of firing rates across units. Z-scoring can be deactivated by setting the zscore property to false.
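
The four steps above can be sketched in plain MATLAB as follows. This is not the toolbox's implementation: the variable names (minRate, minTrialFrac, binMs, sigmaBins), their values, and the choice to z-score each unit across all bins and trials are assumptions made for illustration.

```matlab
% Synthetic data (hypothetical): nUnits x TrialL x nTrials sampled at 1 kHz.
rng(3);
fs = 1000;  nUnits = 4;  TrialL = 1000;  nTrials = 5;
rates = [20; 15; 0; 30];                            % target rates in Hz; unit 3 is silent
D = double(rand(nUnits, TrialL, nTrials) < rates / fs);

% 1) Prune inactive units: keep units firing at least minRate
%    in at least minTrialFrac of the trials (assumed parameters).
minRate = 1;  minTrialFrac = 0.5;
trialRate = reshape(sum(D, 2), nUnits, nTrials) / (TrialL / fs);   % spikes/s per trial
keep = mean(trialRate >= minRate, 2) >= minTrialFrac;
D = D(keep, :, :);
nKept = size(D, 1);

% 2) Bin: sum spikes in non-overlapping bins of binMs milliseconds.
binMs = 10;
binSamples = round(binMs / 1000 * fs);
nBins = floor(TrialL / binSamples);
D = D(:, 1:nBins * binSamples, :);                  % drop samples that do not fill a bin
counts = reshape(sum(reshape(D, nKept, binSamples, nBins, nTrials), 2), ...
                 nKept, nBins, nTrials);
rate = counts / (binMs / 1000);                     % spikes/s in each bin

% 3) Smooth: convolve along time with a normalized Gaussian kernel.
sigmaBins = 2;                                      % kernel std in bins (assumed)
x = -3 * sigmaBins : 3 * sigmaBins;
g = exp(-x.^2 / (2 * sigmaBins^2));
g = g / sum(g);                                     % unit area preserves the mean rate
ifr = convn(rate, g, 'same');                       % 1 x K kernel acts on dimension 2 only

% 4) Z-score each unit across all bins and trials.
flat = reshape(ifr, nKept, []);
z = (ifr - mean(flat, 2)) ./ std(flat, 0, 2);
```

Using 'same' in the convolution keeps the number of bins unchanged, at the cost of attenuated values near the trial edges where the kernel extends past the data.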
