Exploring Preprocessed fMRI Data from fMRIPrep
Last updated on 2024-02-17
Overview
Questions
- How does fMRIPrep store preprocessed neuroimaging data?
- How do I access preprocessed neuroimaging data?
Objectives
- Learn about fMRIPrep derivatives
- Understand how preprocessed data is stored and how you can access key files for analysis
Exploring Preprocessed fMRI Data from fMRIPrep
BIDS applications such as fMRIPrep output data into a full data structure with strong similarity to BIDS organization principles. In fact, there is a specification for derivatives (outputs derived from) BIDS datasets; although this is a current work in progress, details can be found in: BIDS Derivatives.
In this tutorial, we’ll explore the outputs generated by fMRIPrep and get a handle on how the data from this preprocessing pipeline is organized.
Luckily the semi-standardized output for fMRIPrep is organized in such a way that the data is easily accessible using pyBIDS! We’ll first show what the full data structure looks like, then we will provide you with methods on how you can pull specific types of outputs using pyBIDS.
The fMRIPrep Derivative Data Structure
First let’s take a quick look at the fMRIPrep data structure:
OUTPUT
../data/ds000030/derivatives/fmriprep/
├── sub-10171
├── sub-10292
├── sub-10365
├── sub-10438
├── sub-10565
├── sub-10788
├── sub-11106
├── sub-11108
├── sub-11122
├── sub-11131
├── sub-50010
├── sub-50035
├── sub-50047
├── sub-50048
├── sub-50052
├── sub-50067
├── sub-50075
├── sub-50077
├── sub-50081
└── sub-50083
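A listing like the one above can also be reproduced from Python. The following is a minimal sketch using only the standard library; the helper name `list_subject_dirs` is hypothetical, and the path in the usage comment is the one used throughout this lesson.

```python
from pathlib import Path


def list_subject_dirs(fmriprep_dir):
    """Return the sorted names of the sub-* folders inside an fMRIPrep output folder."""
    root = Path(fmriprep_dir)
    # Keep directories only: the output folder may also hold per-subject
    # files (for example HTML reports) whose names start with "sub-".
    return sorted(p.name for p in root.glob('sub-*') if p.is_dir())


# Usage, assuming the lesson data has been downloaded to this location:
# list_subject_dirs('../data/ds000030/derivatives/fmriprep/')
```

Filtering on `is_dir()` keeps the result limited to subject folders even when other `sub-*` files sit next to them.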
First note that inside the fMRIPrep folder, we have a folder per-subject. Let’s take a quick look at a single subject folder:
OUTPUT
../data/ds000030/derivatives/fmriprep/sub-10788/
├── anat
│ ├── sub-10788_desc-aparcaseg_dseg.nii.gz
│ ├── sub-10788_desc-aseg_dseg.nii.gz
│ ├── sub-10788_desc-brain_mask.json
│ ├── sub-10788_desc-brain_mask.nii.gz
│ ├── sub-10788_desc-preproc_T1w.json
│ ├── sub-10788_desc-preproc_T1w.nii.gz
│ ├── sub-10788_dseg.nii.gz
│ ├── sub-10788_from-fsnative_to-T1w_mode-image_xfm.txt
│ ├── sub-10788_from-T1w_to-fsnative_mode-image_xfm.txt
│ ├── sub-10788_label-CSF_probseg.nii.gz
│ ├── sub-10788_label-GM_probseg.nii.gz
│ ├── sub-10788_label-WM_probseg.nii.gz
│ ├── sub-10788_space-MNI152NLin2009cAsym_desc-brain_mask.json
│ ├── sub-10788_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz
│ ├── sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.json
│ ├── sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz
│ ├── sub-10788_space-MNI152NLin2009cAsym_dseg.nii.gz
│ ├── sub-10788_space-MNI152NLin2009cAsym_label-CSF_probseg.nii.gz
│ ├── sub-10788_space-MNI152NLin2009cAsym_label-GM_probseg.nii.gz
│ └── sub-10788_space-MNI152NLin2009cAsym_label-WM_probseg.nii.gz
├── func
│ ├── sub-10788_task-rest_desc-confounds_timeseries.json
│ ├── sub-10788_task-rest_desc-confounds_timeseries.tsv
│ ├── sub-10788_task-rest_from-scanner_to-T1w_mode-image_xfm.txt
│ ├── sub-10788_task-rest_from-T1w_to-scanner_mode-image_xfm.txt
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_boldref.nii.gz
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-aparcaseg_dseg.nii.gz
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-aseg_dseg.nii.gz
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-brain_mask.json
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-preproc_bold.json
│ ├── sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz
│ ├── sub-10788_task-rest_space-T1w_boldref.nii.gz
│ ├── sub-10788_task-rest_space-T1w_desc-aparcaseg_dseg.nii.gz
│ ├── sub-10788_task-rest_space-T1w_desc-aseg_dseg.nii.gz
│ ├── sub-10788_task-rest_space-T1w_desc-brain_mask.json
│ ├── sub-10788_task-rest_space-T1w_desc-brain_mask.nii.gz
│ ├── sub-10788_task-rest_space-T1w_desc-preproc_bold.json
│ └── sub-10788_task-rest_space-T1w_desc-preproc_bold.nii.gz
...
As you can see above, each subject folder is organized into an anat and func sub-folder.
Specifically:
- the anat folder contains the preprocessed anatomical data. If multiple T1 files are available (all T1s even across sessions), then these data are merged, so you will always have one anat folder under the subject folder
- the func folder contains the preprocessed functional data. All tasks are dumped into the same folder and, as in the BIDS convention, are indicated by their filenames (task-[task_here])
Callout
This data is single-session, so a session folder is missing here. With multiple sessions you will see anat and ses-[insert_session_here] folders, where each session folder contains a func folder.
Hopefully you’re now convinced that the outputs of fMRIPrep roughly follow BIDS organization principles. The filenames themselves give you a full description of what each file is (check the slides to get an idea of what each file means)!
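To see how much information a filename carries, here is a minimal sketch that splits a derivative filename into its key-value pairs using plain string operations. The function name `parse_entities` is hypothetical and this is not how pyBIDS parses files internally (pyBIDS also uses longer entity names such as `subject` rather than `sub`); it only illustrates the naming pattern.

```python
def parse_entities(filename):
    """Split a BIDS-style filename into key-value entities, suffix and extension."""
    # Everything after the first dot is treated as the extension (e.g. '.nii.gz').
    stem, _, ext = filename.partition('.')
    # The last underscore-separated chunk is the suffix (e.g. 'bold', 'T1w');
    # all earlier chunks are 'key-value' pairs.
    *pairs, suffix = stem.split('_')
    entities = dict(pair.split('-', 1) for pair in pairs)
    entities['suffix'] = suffix
    entities['extension'] = '.' + ext if ext else ''
    return entities


parse_entities('sub-10788_task-rest_space-MNI152NLin2009cAsym_desc-preproc_bold.nii.gz')
# {'sub': '10788', 'task': 'rest', 'space': 'MNI152NLin2009cAsym',
#  'desc': 'preproc', 'suffix': 'bold', 'extension': '.nii.gz'}
```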
Now let’s see how we can pull data in using pyBIDS!
Let’s import pyBIDS through the bids module first:
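In a notebook this is a single `import bids` line. The sketch below wraps it in an availability check, since a common stumbling block is that the package is installed under the name `pybids` but imported as `bids`. The variable name `have_pybids` is just for illustration.

```python
import importlib.util

# pyBIDS is installed under the package name `pybids` but imported as `bids`.
have_pybids = importlib.util.find_spec('bids') is not None

if have_pybids:
    import bids
    print('pyBIDS version:', bids.__version__)
else:
    print('pyBIDS not found; it can be installed with: pip install pybids')
```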
We can make a bids.BIDSLayout object as usual by just feeding in the fMRIPrep directory! However, there is one caveat: fMRIPrep doesn’t exactly adhere to the standard BIDS convention. It uses fields such as desc- which are not part of the original BIDS specification, e.g. sub-10788_desc-preproc_T1w.nii.gz.
In fact, BIDS allows for extensions which enable you to add additional fields to the standard BIDS convention (such as desc-!). fMRIPrep uses the derivatives extension of the BIDS standard. pyBIDS can handle standard extensions to the BIDS specification quite easily:
PYTHON
layout = bids.BIDSLayout('../data/ds000030/derivatives/fmriprep/', config=['bids','derivatives'])
Now that we have a layout object, we can pretend like we’re working with a BIDS dataset! Let’s try some common commands that you would’ve used with a BIDS dataset:
First, we’ll demonstrate that we can grab a list of preprocessed subjects, much as we would grab subjects from a raw BIDS dataset:
OUTPUT
['10171',
'10292',
'10365',
'10438',
'10565',
'10788',
'11106',
'11108',
'11122',
'11131',
'50010',
'50035',
'50047',
'50048',
'50052',
'50067',
'50075',
'50077',
'50081',
'50083']
We can also do the same for tasks:
OUTPUT
['rest']
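Lists like the two above are what pyBIDS returns from its `get_subjects()` and `get_tasks()` methods. A minimal sketch that collects both in one place; the helper name `summarize_layout` is hypothetical, and `layout` is assumed to be the BIDSLayout created earlier in this lesson.

```python
def summarize_layout(layout):
    """Collect the subject and task labels indexed by a pyBIDS layout."""
    return {
        'subjects': layout.get_subjects(),  # subject labels without the 'sub-' prefix
        'tasks': layout.get_tasks(),        # task labels without the 'task-' prefix
    }


# Usage, assuming `layout` is the BIDSLayout created earlier in this lesson:
# summarize_layout(layout)
```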
Now let’s try fetching specific files. Similar to how you would fetch BIDS data using pyBIDS, the exact same syntax will work for fMRIPrep derivatives. Let’s try pulling just the preprocessed anatomical data.
Recall that the anatomical folder is organized as follows:
OUTPUT
../data/ds000030/derivatives/fmriprep/sub-10788/anat
├── sub-10788_desc-aparcaseg_dseg.nii.gz
├── sub-10788_desc-aseg_dseg.nii.gz
├── sub-10788_desc-brain_mask.json
├── sub-10788_desc-brain_mask.nii.gz
├── sub-10788_desc-preproc_T1w.json
├── sub-10788_desc-preproc_T1w.nii.gz
├── sub-10788_dseg.nii.gz
├── sub-10788_from-fsnative_to-T1w_mode-image_xfm.txt
├── sub-10788_from-T1w_to-fsnative_mode-image_xfm.txt
├── sub-10788_label-CSF_probseg.nii.gz
├── sub-10788_label-GM_probseg.nii.gz
├── sub-10788_label-WM_probseg.nii.gz
├── sub-10788_space-MNI152NLin2009cAsym_desc-brain_mask.json
├── sub-10788_space-MNI152NLin2009cAsym_desc-brain_mask.nii.gz
├── sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.json
├── sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz
├── sub-10788_space-MNI152NLin2009cAsym_dseg.nii.gz
├── sub-10788_space-MNI152NLin2009cAsym_label-CSF_probseg.nii.gz
├── sub-10788_space-MNI152NLin2009cAsym_label-GM_probseg.nii.gz
└── sub-10788_space-MNI152NLin2009cAsym_label-WM_probseg.nii.gz
0 directories, 20 files
The file that we’re interested in is of the form sub-[subject]_desc-preproc_T1w.nii.gz. Now we can construct a pyBIDS call to pull these types of files specifically:
OUTPUT
[<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10438/anat/sub-10438_desc-preproc_T1w.nii.gz'>,
<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10438/anat/sub-10438_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz'>,
<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10788/anat/sub-10788_desc-preproc_T1w.nii.gz'>,
<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10788/anat/sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz'>,
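A listing like the one above is what a query on datatype, desc and extension returns. Here is a minimal sketch of such a query, assuming the same arguments as the MNI query used later in this lesson minus `space`. The helper name `get_preproc_t1` is hypothetical.

```python
def get_preproc_t1(layout):
    """Return every preprocessed anatomical image, in any space, from a pyBIDS layout."""
    # No `space` filter is given, so both the native-space and the
    # template-space files match.
    return layout.get(datatype='anat', desc='preproc', extension='.nii.gz')


# Usage, assuming `layout` was built with config=['bids', 'derivatives']:
# preproc_T1 = get_preproc_t1(layout)
# preproc_T1
```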
Callout
If we didn’t configure pyBIDS with config=['bids','derivatives'] then the desc keyword would not work!
Note that we also pulled in the space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz files. This is data that has been transformed into MNI152NLin2009cAsym template space. We can pull out just this data by further specifying our layout.get call using the space argument:
PYTHON
mni_preproc_T1 = layout.get(datatype='anat',desc='preproc',extension='.nii.gz',space='MNI152NLin2009cAsym')
mni_preproc_T1
OUTPUT
[<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10438/anat/sub-10438_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz'>,
<BIDSImageFile filename='/home/jerry/projects/workshops/SDC-BIDS-fMRI/data/ds000030/derivatives/fmriprep/sub-10788/anat/sub-10788_space-MNI152NLin2009cAsym_desc-preproc_T1w.nii.gz'>,
...
What if we wanted to pull out the data in T1 “native space” (it really is a template space, since it is merged T1s)? Unfortunately, this isn’t directly possible using layout.get. Instead we’ll use a bit of Python magic to pull the data that we want:
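One way to do this, sketched here under the assumption that native-space files are exactly those without a `space-` field in their filename, is a list comprehension over the files returned for all spaces. The helper name `keep_native_space` is hypothetical; `filename` is the attribute pyBIDS file objects use for the file's base name.

```python
def keep_native_space(bids_files):
    """Keep only the files whose name carries no `space-` entity."""
    return [f for f in bids_files if 'space-' not in f.filename]


# Usage, assuming `preproc_T1` holds the result of
# layout.get(datatype='anat', desc='preproc', extension='.nii.gz'):
# native_preproc_T1 = keep_native_space(preproc_T1)
# native_preproc_T1
```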
Similarly, fMRI data can be pulled by specifying datatype='func' and using the desc argument as appropriate:
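A minimal sketch of such a query, mirroring the anatomical call. The helper name `get_preproc_bold` is hypothetical; `return_type='filename'` asks pyBIDS for plain path strings instead of file objects, and any extra keyword filters are passed straight through to `layout.get`.

```python
def get_preproc_bold(layout, **filters):
    """Return the paths of preprocessed functional images from a pyBIDS layout.

    Extra keyword filters (for example subject='10788') are forwarded to layout.get.
    """
    return layout.get(datatype='func', desc='preproc', extension='.nii.gz',
                      return_type='filename', **filters)


# Usage, assuming `layout` was built with config=['bids', 'derivatives']:
# get_preproc_bold(layout)
# get_preproc_bold(layout, subject='10788')
```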
Exercise 1
- Get the list of all preprocessed functional data
- Get the list of functional data in MNI152NLin2009cAsym space
- Get the list of functional data in T1w space (native)
Note that T1 space fMRI data can be pulled using space="T1w" (this is unlike the T1w data which required you to do some filtering).
Now that we have a handle on how fMRIPrep preprocessed data is organized and how we can pull this data, let’s start working with the actual data itself!
Key Points
- fMRIPrep stores preprocessed data in a ‘BIDS-like’ fashion
- You can pull files using pyBIDS much like how you can navigate raw BIDS data