Content from Introduction to EEG-BIDS


Last updated on 2024-02-19

Overview

Questions

  • Why are data standards important?
  • What information needs to be standardized for EEG?

Objectives

  • Understand the value of using data standards in research
  • Understand the complexities of establishing a standard for EEG data

The big picture of EEG data standards


EEG data are complicated, with many properties that can be stored in countless ways

To work with EEG data, the researcher needs to handle several properties of the data. These include the voltage signals sampled at a specific rate, the physical locations of the recording sites, and more idiosyncratic properties such as the experimental task event markers. When building a lab or planning a new study, decisions are made about all of these EEG properties, and those properties make up the interpretable material of the research project. Decisions are also made about how all of that information will be stored and managed so that the researcher and software tools can find and use it later.

When a researcher is making decisions about EEG parameters and a data management strategy for a lab or project, some of the questions they might ask are:

1- How should the information be stored so that it can be used efficiently now?

2- How should the information be stored so that it can be used efficiently later?

3- Could the data be efficiently shared with another research group?

4- Could another group or project’s data be used or integrated (pooled)?

5- Can the data be made open access and available long-term?

6- Are there existing analytic tools, procedures or platforms to which the data need to be compliant?

7- What are the costs now (and later) for designing a new unique data storage strategy?

Can a data standard provide answers to these questions?

1- There is always a concern that something important will be missed when designing a data management strategy. A clear, community-driven discussion about adaptive best practices is an ideal way to build confidence in decisions.

2- The best way to keep data valuable in the long term is to have sufficient documentation that is accessible (both findable and interpretable) to not only yourself, but also to others in the future. A published standard offers this documentation, not to mention various other resources (such as workshops like this one).

3- When sharing data that are compliant with a data standard, the recipient only needs to become familiar with the standard in order to access the relevant properties and parameters of the research data.

4- Independent datasets that share compliance with a data standard can easily be pooled in terms of their data management strategy and access methods. Variance in project parameters still needs to be addressed in the analytic process, but those parameters can be accessed in a common way.

5- Open-access data curation platforms are the ideal place to make a data project openly available over the long term. Compliance with a data standard greatly facilitates the ingestion of data into databases or platforms.

6- Software and platform development often requires substantial effort to account for input and output diversity (or it is constrained to a subset of data types). If a data standard has substantial uptake it is an ideal place for analytic tool developers to focus their attention.

7- People’s time in a research project is an important consideration and the time spent on designing a data management strategy can vary greatly. It is also important to note that a unique data management strategy can have unexpected costs in the future. An established data standard has had the opportunity to collect experience from across the research community over several generations of projects.

The face13 data set


The example data set used in this lesson is the sample used in the 2013 paper “Deconstructing the early visual electrocortical responses to face and house stimuli” by James Desjardins and Sid Segalowitz. The goal of the paper was to untangle the underlying cortical sources of the P100 and N170 ERP face effect complex using independent component analysis (ICA). Because ERP face effects have been reported at various times during the P100 and N170 ERP complex, the goal was to determine which underlying sources account for effects at the scalp over the ERP period.

face13 ICA decomposition

Because the reliability of some of the P100 and N170 complex ERP effects varied in the literature, the statistical analysis in this paper not only looked for ERP differences but also used robust bootstrapping measures to assess the replicability of those differences over the period of the P100 and N170 ERP complex.

face13 DAR decomposition

The EEG-IP-L (EEG Integrated Platform Lossless) processing pipeline


The preprocessing methods used in this paper to objectively produce robust ICA decompositions in an automated way eventually evolved into the “EEG Integrated Platform Lossless (EEG-IP-L) pre-processing pipeline for objective signal quality assessment incorporating data annotation and blind source separation” (Desjardins et al. 2021, Journal of Neuroscience Methods).

EEG-IP-L diagram

The method also includes an interactive quality control method in which the researcher assesses and potentially modifies various signal quality annotations.

EEG-IP-L dashboard

EEG-BIDS examples: face13 data and the EEG-IP-L derivative


BIDS home page and publications

The Brain Imaging Data Structure (BIDS) is a standard for organizing the content of various neuroimaging modalities (MRI, MEG, EEG). “The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments” was published in 2016 by Gorgolewski et al. “EEG-BIDS, an extension to the brain imaging data structure for electroencephalography” was published in 2019 by Pernet et al.

BIDS starter kit repository

The BIDS starter kit is available on GitHub.

BIDS examples repository

The standard is actively discussed and continues to evolve to adopt new features. The face13 data set is included among the bids-examples on GitHub.

Navigating the example data sets provides not only a demonstration of the data organization but also the diversity that the standard enables.

eeg_face13 example

The eeg_face13 example data set not only demonstrates the layout of the raw sample data in the standard but also contains an example of storing the output of the EEG-IP-L pipeline as a derivative state of the sample.

eeg_face13 example root folder

The root folder of the project’s BIDS structure contains high level project information and a directory for each participant.

BIDS examples root

eeg_face13 example sourcedata folder

Although the raw data are stored in specified formats, the original native files acquired from the lab hardware can always remain unmodified in the sourcedata folder.

BIDS examples source

eeg_face13 example participant folder

Each participant’s folder contains an EDF file for the EEG signals and then several formatted text files containing the details of specific session properties such as channel locations and event markers.

BIDS examples raw

eeg_face13 example code folder

The software tools and scripts used to transform the data are an important part of being able to work with (and interpret) the data long term. Storing the code with the data also supports replication. The code folder of the BIDS standard is where all of the relevant procedures are stored.

BIDS examples code

eeg_face13 example derivatives folder

The project’s root folder can also contain derivatives for subsequent states of the processed data.

BIDS examples derivative

eeg_face13 example derivative participant folder

A derivative folder matches the organization of the project’s root folder, but its contents are a transformation of the parent directory. Derivative folders themselves can contain derivatives folders recursively as data states are transformed into subsequent data states.

BIDS examples derivative
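
The folder layout described above can be sketched as a set of paths. This is only an illustration: the project name, the participant label and the derivative names (`myproject`, `sub-001`, `mypipeline`, `nextstep`) are hypothetical placeholders, not the names used in eeg_face13.

```matlab
root     = 'myproject';    % hypothetical project root
pipeline = 'mypipeline';   % hypothetical derivative (pipeline) name

rawEEG    = fullfile(root, 'sub-001', 'eeg');   % raw data for one participant
sourceDir = fullfile(root, 'sourcedata');       % untouched native files
codeDir   = fullfile(root, 'code');             % scripts and tools
derivEEG  = fullfile(root, 'derivatives', pipeline, 'sub-001', 'eeg');

% A derivative of a derivative nests in the same way
nestedEEG = fullfile(root, 'derivatives', pipeline, ...
                     'derivatives', 'nextstep', 'sub-001', 'eeg');

disp(derivEEG);
disp(nestedEEG);
```

Each derivative repeats the participant/eeg layout of its parent, which is why the same access code can be pointed at any data state.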

Key Points

  • What does it take to standardize an EEG project?
  • What is gained by standardizing an EEG project?

Content from Matlab: need to knows


Last updated on 2024-02-19

Overview

Questions

  • What does EEGLAB need to know about the folder structures?
  • How do I set up the paths in Matlab for EEGLAB?

Objectives

  • Understand what EEGLAB needs to be able to find (data and scripts) and how to tell it where to find them.

What does Matlab need to know about the folder structure in order to run EEGLAB?


EEGLAB is a set of Matlab functions (… which are text files).

Let’s now take a look at how the functions operate in the Matlab environment. Matlab is an interpreted language. This means that it will read some text, interpret it in a very specific way and then perform the operations that are defined by the text.

For our purposes we can think of Matlab doing two very important things:

  1. It stores information in memory as different kinds of variables
  2. It can perform operations (or functions) on those variables

Let’s open the Matlab Integrated Development Environment (IDE) and take a look at these capabilities.

Matlab Integrated Development Environment

Anatomy of the Matlab Integrated Development Environment (IDE)

The Matlab IDE is made up of several sections in the Graphical User Interface (GUI). Each of these is important to understand as we move towards interacting with EEG data via EEGLAB.

  1. Command Window: This is where we interact with Matlab using the Command Line Interface (CLI) creating variables and performing operations on them.
  2. Current Folder: This is the directory that Matlab is pointing to. This is the part of the file system that Matlab sees and is its starting point for relative paths.
  3. Workspace: This is a summary of the variables that are accessible to the Command Window.
  4. Command History: This is the interactive list of operations that have been called from the Command Window.
  5. Editor: This is a text editor with added features for modifying Matlab interpreted text files (*.m files).

Creating variables in the Matlab CLI

Making new variables (or modifying existing variables) is accomplished using the “=” character.

We can create a new variable named “x” and make it equal to a series of numbers:

x=[1,2,3,4,5];

There are several ways to store information in the workspace

There are several types of variables in Matlab, such as the numeric array above, but there are also strings, cell arrays and structures. EEGLAB stores its information in a structure named “EEG”. That is where we will find all of the EEG properties that EEGLAB has at its disposal later in this lesson.
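
As a minimal sketch of these variable types, the lines below build a string, a cell array and a structure. The structure is a toy stand-in named `toy`, not the real “EEG” structure; its field names (`srate`, `nbchan`, `data`) mimic a few fields that EEGLAB uses, and all values are made up.

```matlab
% A character string
label = 'Fz';

% A cell array can hold items of different types and lengths
chanLabels = {'Fz', 'Cz', 'Pz'};

% A structure groups named fields (toy values, not real data)
toy.srate  = 512;                       % sampling rate in Hz
toy.nbchan = numel(chanLabels);         % number of channels
toy.data   = zeros(toy.nbchan, 1000);   % channels x time samples

% Fields are read with dot notation
disp(toy.srate);
disp(fieldnames(toy));
```

Typing the name of a structure in the Command Window lists its fields, which is a quick way to explore the “EEG” structure once data are loaded.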

Performing operations on variables in the Matlab CLI

We can also perform operations on the variable and have the result saved to a new variable “y”:

y=mean(x);

Operations can include making new figure windows and plotting variables.

figure;plot(x);

With a few commands we can replicate the effect of calculating an Event Related Potential (ERP). First we create a three dimensional array of random numbers. Note that segmented EEG data takes this form in EEGLAB, where rows are channels, columns are time samples and pages are epochs.

y=rand(3,1000,600);

We can find the size of an array by calling “size” on it. The order of the dimensions in Matlab indexing is 1- rows, 2- columns and 3- pages.

size(y)
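
The dimension order can be made concrete with a short sketch. The variable names (`nChan`, `nPnts`, `nEpochs`, `oneEpoch`, `oneTrace`) are chosen here for illustration only.

```matlab
y = rand(3, 1000, 600);              % rows (channels) x columns (samples) x pages (epochs)

[nChan, nPnts, nEpochs] = size(y);   % one output per dimension
nPages = size(y, 3);                 % or ask for a single dimension

oneEpoch = y(:, :, 8);               % every channel and sample of epoch 8
oneTrace = y(2, :, 8);               % channel 2 of epoch 8

disp(size(oneEpoch));
disp(size(oneTrace));
```

A colon in an index position means "everything along this dimension", which is how the plotting commands in the next section select a whole time course.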

A random number ERP

Next we can plot portions of the three dimensional array by indexing into y. First we plot a single epoch, then we average over a specified range of values along the third dimension of y (pages, or trials in EEG) and overlay the result. Finally we average across the entire range of pages (epochs) and overlay that result on the figure as well. This example demonstrates that the more random numbers are averaged together, the smaller the fluctuations of the resulting mean become: each trace settles closer to the expected value of rand (0.5). This is a fundamental property of ERPs, where averaging across epochs attenuates activity that is not time-locked to the event.

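% a single epoch (page 8) of channel 2
figure;plot(y(2,:,8));
% mean of the first 16 epochs, overlaid in green
hold on;plot(mean(y(2,:,1:16),3),'g');
% mean of all 600 epochs, overlaid in red
hold on;plot(mean(y(2,:,:),3),'r');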
mean figure
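
What the overlay shows can also be put into numbers. The sketch below is an illustration rather than part of the lesson scripts: it measures how much the averaged trace of channel 2 fluctuates over time when 1, 16 or all 600 epochs are averaged. The names `nAveraged` and `spread` are assumptions made for this example.

```matlab
rng(1);                        % fix the random seed so repeated runs match
y = rand(3, 1000, 600);        % channels x time samples x epochs

nAveraged = [1 16 600];        % how many epochs go into each average
spread = zeros(size(nAveraged));
for k = 1:numel(nAveraged)
    avg = mean(y(2, :, 1:nAveraged(k)), 3);   % average channel 2 over the first n epochs
    spread(k) = std(avg);                     % fluctuation of that average across time
end

disp(spread);
```

The spread should shrink roughly in proportion to one over the square root of the number of epochs averaged, while the level of the trace stays near 0.5.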

Exploring .m files

Note that operations in Matlab are text files that we can examine. Matlab needs to be able to see the text file (*.m) that describes a function in its “path”. We can find out where Matlab is finding a function file by querying “which [funcname]”.

which mean;

Calling “edit” on a function name will open its *.m file in the Matlab editor (or, if Matlab cannot see the function, it will offer to create a new empty file with the requested name).

edit mean;
mean edit
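
A small sketch of checking from code whether Matlab can see a function, using the standard `which` and `exist` functions. The name `not_a_real_function_xyz` is a made-up placeholder that is expected to be absent.

```matlab
fname = 'mean';               % any function name can go here
location = which(fname);      % where Matlab finds it; empty if it is not visible

if isempty(location)
    fprintf('%s is not on the Matlab path.\n', fname);
else
    fprintf('%s found at: %s\n', fname, location);
end

% exist returns 0 when no file by that name is visible to Matlab
missing = exist('not_a_real_function_xyz', 'file');
disp(missing);
```

Swapping 'mean' for 'eeglab' gives a quick test of whether EEGLAB is visible before trying to start it.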

Starting EEGLAB

EEGLAB is a function that is executed by Matlab. If we call it before Matlab knows where it is, an error will be returned. Try calling “eeglab” in the Command Window from where you are now.

eeglab;

In Matlab, navigate to the Face13 folder that you created during the setup.

Now that Matlab is pointed to the Face13 folder, we need to tell it where to find the “eeglab.m” file that we want to use for this lesson. Navigating in the Current Folder window (by expanding folders WITHOUT selecting (double clicking) them), we find the “eeglab.m” file in ‘code/BIDS-Init-Face13-EEGLAB/eeglab’. We need to add this folder to Matlab’s path in order to run this version of EEGLAB. There are several ways to do this, including the “set path” button in the toolbar, but given that we are all in the same folder structure the following “addpath” calls in the Command Window should work.

eeglab path

Path for Face13 folder

It is important to ensure that Face13 is the last folder listed in the current folder path shown in Matlab, as this indicates that Face13 is your current working directory. You should see the code and sourcedata folders within the Current Folder window in Matlab.

>> addpath code/BIDS-Init-Face13-EEGLAB
>> addpath code/BIDS-Init-Face13-EEGLAB/eeglab

Now let’s try “eeglab” again from the Command Window.

eeglab

Warning: Name is nonexistent or not a directory

If, when adding the paths or trying to open EEGLAB, you receive a warning stating that the name is nonexistent or not a directory, ensure that you have navigated to the Face13 folder that you made during the setup. Also make sure that the code and sourcedata folders are located within the Face13 folder.
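
One way to avoid this warning is to check that the folders are visible before adding them. The following is a sketch rather than part of the lesson scripts; it uses the folder names given above and assumes the Current Folder is Face13.

```matlab
% Folder names from the lesson, relative to the Face13 folder
initDir   = fullfile('code', 'BIDS-Init-Face13-EEGLAB');
eeglabDir = fullfile(initDir, 'eeglab');

if exist(eeglabDir, 'dir') == 7          % 7 means "this is a folder"
    addpath(initDir);
    addpath(eeglabDir);
    fprintf('Added %s to the path.\n', eeglabDir);
else
    fprintf('Cannot see %s from %s.\n', eeglabDir, pwd);
    fprintf('Change the Current Folder to Face13 and try again.\n');
end
```

Because the paths are relative, the same lines succeed or fail depending only on where Matlab is currently pointed, which is the situation the warning describes.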

EEGLAB has a Graphical User Interface (GUI) so we will be able to do a lot of the processing by clicking in menus and interacting with figures, but we will also learn how to take advantage of EEGLAB’s integration with the command line interface to work more efficiently and reliably.

eeglab gui

Setting some EEGLAB options so it behaves similarly for everyone

Now that we all have EEGLAB running in Matlab on our computers, let’s set a couple of options to make sure that the software behaves similarly for all of us. Select “Preferences” from the EEGLAB “File” menu and adjust the settings as illustrated below.

memory options menu
memory options gui

Key Points

  • Working with EEGLAB requires that specific files can be found in specific locations

Content from Initializing Data into BIDS


Last updated on 2024-02-19

Overview

Questions

  • How do I get my data into a BIDS compliant folder structure?
  • What does a BIDS folder structure look like?

Objectives

  • Understand how to initialize your data into BIDS.

Creating a BIDS folder structure


Initializing data into a BIDS compliant folder structure will result in individual subject folders that contain an eeg folder. All of the EEG files for that participant will be stored within sub-*/eeg/ folders.

To make the data compliant with BIDS, run the bids_face13.m script in the Matlab Command Window:

MATLAB

>> bids_face13

A file chooser window will pop up when running the bids_face13.m script. This file chooser is asking where you want the sub-*/eeg folders to be created. Select the project directory (Face13 folder) with the file chooser.

Creating the sub-*/eeg/ folders in the root of your project directory will create a folder structure that looks like this:

BIDS Folder Structure

Several files will be produced within each sub-*/eeg/ folder. All of the file names contain the subject number as well as the task name and a suffix to denote the information that is saved within that file. For example, the EEG recording data files have a suffix of _eeg and are saved as .edf files.
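
The naming pattern can be sketched with a few lines of string handling. The task label `mytask` and the three-digit zero padding of the subject number are assumptions made for this illustration and may differ from the files that bids_face13 writes.

```matlab
subject = 1;           % participant number
task    = 'mytask';    % hypothetical task label

base = sprintf('sub-%03d_task-%s', subject, task);

eegFile      = [base '_eeg.edf'];        % EEG recording data
channelsFile = [base '_channels.tsv'];   % channel names, types and units
eventsFile   = [base '_events.tsv'];     % event markers

% Each file sits in that participant's eeg folder
fullName = fullfile(sprintf('sub-%03d', subject), 'eeg', eegFile);
disp(fullName);
```

Because every property is encoded in the file name, a tool can locate a given file type for any participant without opening the data.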

Once this procedure is completed, your initialized data will be in the BIDS standard (sub-*/eeg/) in the root of your project folder.

Exploring the formatted text files within the participant folders

Now that the data sessions are BIDS compliant, many of the data parameters are available within formatted text files. This makes it easy to explore the session parameters using a folder browser and a simple text editor. Note that this also makes it very efficient for tools and platforms to access relevant session parameters without having to read the large data files.

BIDS EEG tsv
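
The same files can be read from Matlab. The sketch below writes a tiny stand-in channels file with made-up rows to a temporary folder and reads it back with `readtable`, so that it runs without the lesson data; the file name and its contents are illustrative only.

```matlab
% Write a small stand-in channels file (made-up rows) to a temporary folder
tsvFile = fullfile(tempdir, 'sub-001_task-mytask_channels.tsv');
fid = fopen(tsvFile, 'w');
fprintf(fid, 'name\ttype\tunits\n');
fprintf(fid, 'Fz\tEEG\tuV\n');
fprintf(fid, 'Cz\tEEG\tuV\n');
fclose(fid);

% State the file type and delimiter explicitly for a .tsv file
T = readtable(tsvFile, 'FileType', 'text', 'Delimiter', '\t');

disp(T);
nChannels = height(T);      % one row per channel
firstName = T.name{1};      % text columns are read as cell arrays by default
```

Pointing `tsvFile` at a real file inside a sub-*/eeg/ folder should read the session parameters in the same way, without touching the .edf data.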

Digging down into the “BIDSification” process

The scripts used in this workshop use a forked version of the bids_matlab_tools EEGLAB plugin. We can explore the specifics of the process by examining the “bids_face13” script provided for this lesson.

MATLAB

>> edit bids_face13
Edit bids_face13

The bids_export function is the tool that does the BIDS compliant data writing of an EEGLAB EEG data structure. Looking through this function provides the information about what inputs are required to generate the BIDS compliant data set, as well as what options are available to modify the output.

MATLAB

>> edit bids_export
Edit bids_export

Next steps

Once the data are BIDS compliant, you are ready to begin working with them in the Processing data with EEGLAB tutorial.

Key Points

  • Now the data are initialized into the BIDS standard