Data science tools for neuroimaging
Contents
11. Data science tools for neuroimaging¶
In previous chapters, we saw that there is plenty of things that you can do with general-purpose software and data science tools. Many of the examples used data from neuroimaging experiments, but the ideas that we introduced – arrays and computing with arrays, tables and how to manipulate tabular data, scientific computing, and data visualization – were not specific to neuroimaging. In this chapter, we will discuss data science approaches that are specifically tailored to neuroimaging data. First (in Section 11.1), we will present a survey of neuroimaging-specific software implemented in Python. Next (in Section 11.2), we will discuss how to organize data from neuroscience experiments for data analysis that best takes advantage of these software tools. In the following chapters, we will dive into some of the applications of these data science tools to fundamental problems in neuroimaging data analysis.
11.1. Neuroimaging in Python¶
Within the broader ecosystem of Python tools for science, there is a family of tools specifically focused on neuroimaging (we will refer to them collectively as “NiPy”, which stands for “Neuroimaging in Python”). These software tools, developed by and for neuroimaging researchers, cover a wide range of data analysis tasks on a variety of different kinds of experimental data. In the next few sections, we will see in detail how some of these tools are used. But first, we will provide a broad survey of the different kinds of tools that currently exist. It is important to emphasize that this is a very dynamically evolving ecosystem, and some of these tools may evolve into other tools over time, or even disappear. New tools will inevitably also emerge. So, this survey will be, by necessity, a bit superficial and a bit dated. That said, we’ll try to give you a sense of how an ecosystem like this one emerges and evolves so that you can keep an eye on these trends as they play out in the future.
11.1.1. Reading and writing neuroimaging data¶
All of the NiPy tools depend to some degree on the ability to read neuroimaging data from commonly used file formats into the memory of the computer. The tool of choice for doing that is called nibabel. The name hints at the plethora of different file formats that exist to represent and store neuroimaging data. Nibabel, which has been in continuous development since 2007, provides a relatively straightforward common route to read many of these formats. Much in the way that Numpy serves as a basis for all of the scientific computing in Python, nibabel serves as the basis for the NiPy ecosystem. Because it is so fundamental, we devote an entire later section (Section 12) to some basic operations that you can do with nibabel. In Section 13, we will go even deeper to see how nibabel enables the integration of different neuroimaging data types. So, we will not say much more about nibabel right now.
11.1.2. Tools for MRI data analysis¶
Tools for neuroimaging data analysis are usually focused on particular data types and on particular ways of analyzing the data. Functional MRI (fMRI) is a very popular data type in cognitive neuroscience, and several tools focus on functional MRI. For example, both the BrainIAK software library and the Nilearn software library focus on machine learning analysis of functional MRI data. We will return to talk about machine learning in more detail in chapter Section 17. Some tools fit more traditional generalized linear models (GLM) to fMRI data. One of these is fitlins. It relies heavily on the Brain Imaging Data Structure standard, which we will discuss below in Section 11.2. Another one is glmsingle, implemented in both Matlab and Python, which focuses on the estimation of single-trial responses.
Another popular MRI modality is diffusion MRI (dMRI). It is used to characterize the properties of brain white matter and to find white matter connections between different parts of the brain. DIPY (Diffusion Imaging in Python) is a software library that implements a variety of methods for analysis of this kind of data. We will use image processing algorithms that are implemented in DIPY in chapter Section 16. Building upon DIPY, there are several tools that focus on more specific dMRI analysis tasks, such as DMIPY, which focuses on various ways to model the microstructure of brain white matter tissue, pyAFQ and TractSeg, which focus on delineating the major white matter tracts, and NDMG, which focuses on quantifying the connectivity matrix of the brain, based on dMRI data. Another widely-used software library for analysis of dMRI is the Medical Imaging Toolkit for Diffusion.
Other tools focus on analysis of structural MRI data. The advanced normalization tools (ANTs) software library implements a variety of algorithms for structural data, such as image registration and image segmentation, which will be described in more detail in chapter Section 14. The ANTs library, originally developed in the C programming language, is now also available through a user-friendly Python interface in the antspy library.
11.1.3. Tools for other neuroimaging modalities¶
One of the largest and mature projects for neuroimaging data analysis focuses on electrophysiology data that comes from experiments that use Magnetoencephalography (MEG) Electroencephalography (EEG), Electrocorticography (ECoG), and other similar measurements. The MNE software library has been in development since 2011 and contains implementations of many algorithms that are common across these different types of data. For example, analysis of the frequency spectrums of these signals. It also includes algorithms that are specific for particular types of data. For example, algorithms to localize the brain sources of MEG signals that are measured outside the skull.
11.1.4. Nipype, the Python tool to unify them all¶
One of the main challenges for anyone doing neuroimaging data analysis is that different algorithms are implemented in different software libraries. Even if you decide that you would like to use NiPy tools for most of your data analysis, you might still find yourself needing to use non-Python tools for some steps in your analysis. For example, to use novel algorithms that are implemented in the popular FMRIB Software Library (FSL) or Analysis of Functional NeuroImages (AFNI) frameworks for fMRI, the popular MRTRIX library for dMRI, or in FreeSurfer for analysis of structural MRI. In theory, if you wanted to construct an analysis pipeline that glues together operations from different frameworks, you would have to familiarize yourself with the varying ways in which these different frameworks expect their input data to appear and the varying ways in which they produce their output. The complexity can quickly become daunting. Moreover, creating reproducible data analysis programs that put together such pipelines becomes cumbersome and can be very difficult. Fortunately, these challenges have been addressed through the Nipype Python software library. Nipype implements Python interfaces to many neuroimaging software tools (including those we mentioned above). One of Nipype’s appealing features is that it provides a very similar programming interface across these different tools. In addition, it provides the functionality to represent pipelines of operations, where the outputs of one processing step become the inputs to the next processing steps. Running such a pipeline as one program is not only a matter of convenience but also important for reproducibility. By tracking and recording the set of operations that a dataset undergoes through analysis, as well as the parameters that are used in each step, Nipype allows users of the software to report exactly what they did with their data to come to a particular conclusion. In addition, Nipype allows a processing pipeline to be used with other datasets. However, this raises another kind of challenge: to achieve this kind of extensibility, the input data has to be organized in a particular way that the software recognizes. This is where community-developed data standards come in. In the rest of this section, we will discuss the Brain Imaging Data Structure, or BIDS, which is the most commonly-used data standard for organizing neuroimaging data for analysis.
11.2. The Brain Imaging Data Structure (BIDS) standard¶
In the previous section, we pointed out that Nipype can be used to create reproducible analysis pipelines that can be applied across different datasets. This is true, in principle, but in practice, it also relies on one more idea: that of a data standard. This is because to be truly transferable, an analysis pipeline needs to know where to find the data and metadata that it uses for analysis. Thus, in this section, we will shift our focus to talking about how entire neuroimaging projects are (or should be) laid out. Until recently, datasets were usually organized in an idiosyncratic way. Each researcher had to decide on their own what data organization made sense to them. This made data sharing and reuse quite difficult because if you wanted to use someone else’s data, there was a very good chance you’d first have to spend a few days just figuring out what you were looking at and how you could go about reading the data into whatever environment you were comfortable with.
11.2.1. The BIDS specification¶
Fortunately, things have improved dramatically in recent years. Recognizing that working with neuroimaging data would be considerably easier if everyone adopted a common data representation standard, a group of (mostly) fMRI researchers convened in 2016 to create something now known as the Brain Imaging Data Standard, or BIDS. BIDS wasn’t the first data standard proposed in fMRI, but it has become by far the most widely adopted. Much of the success of BIDS can be traced to its simplicity: the standard deliberately insists not only on machine readability but also on human readability, which means that a machine can ingest a dataset and do all kinds of machine processing with it, but a human looking at the files can also make sense of the dataset, understanding what kinds of data were collected and what experiments were conducted. While there are some nuances and complexities to BIDS, the core of the specification consists of a relatively simple set of rules a human with some domain knowledge can readily understand and implement.
We won’t spend much time describing the details of the BIDS specification in this book, as there’s already excellent documentation for that on the project’s website. Instead, we’ll just touch on a couple of core principles. The easiest way to understand what BIDS is about is to dive right into an example. Here’s a sample directory structure we’ve borrowed from the BIDS documentation. It shows a valid BIDS dataset that contains just a single subject:
project/ sub-control01/ anat/ sub-control01_T1w.nii.gz sub-control01_T1w.json sub-control01_T2w.nii.gz sub-control01_T2w.json func/ sub-control01_task-nback_bold.nii.gz sub-control01_task-nback_bold.json sub-control01_task-nback_events.tsv sub-control01_task-nback_physio.tsv.gz sub-control01_task-nback_physio.json sub-control01_task-nback_sbref.nii.gz dwi/ sub-control01_dwi.nii.gz sub-control01_dwi.bval sub-control01_dwi.bvec fmap/ sub-control01_phasediff.nii.gz sub-control01_phasediff.json sub-control01_magnitude1.nii.gz sub-control01_scans.tsv code/ deface.py derivatives/ README participants.tsv dataset_description.json CHANGES
There are two important points to note here. First, the BIDS specification
imposes restrictions on how files are organized within a BIDS project directory.
For example, every subject’s data goes inside a sub-[id]
folder below the
project root —- where the sub-
prefix is required, and the [id]
is a
researcher-selected string uniquely identifying that subject within the project
("control01"
in the example). And similarly, inside each subject directory, we
find subdirectories containing data of different modalities: anat
for
anatomical images; func
for functional images; dwi
for diffusion-weighted
images; and so on. When there are multiple data collection sessions for each
subject, an extra level is introduced to the hierarchy, so that functional data
from the first session acquired from subject control01
would be stored inside
a folder like sub-control01/ses-01/func
.
Second, valid BIDS files must follow particular naming conventions. The precise
naming structure of each file depends on what kind of file it is, but the
central idea is that a BIDS filename is always made up of (1) a sequence of
key-value pairs, where each key is separated from its corresponding value by a
dash, and pairs are separated by underscores; (2) a “suffix” that directly
precedes the file extension and describes the type of data contained in the file
(this comes from a controlled vocabulary, meaning that it can only be one of a
few accepted values, such as "bold"
or "dwi"
); and (3) an extension that
defines the file format.
For example, if we take a file like
sub-control01/func/sub-control01_task-nback_bold.nii.gz
and examine its
constituent chunks, we can infer from the filename that the file is a Nifti
image (.nii.gz
extension) that contains BOLD fMRI data (bold
suffix) for
task nback
acquired from subject control01
.
Besides these conventions, there are several other key elements of the BIDS specification. We won’t discuss them in detail, but it’s good to at least be aware of them:
Every data file should be accompanied by a JSON “sidecar” containing metadata describing that file. For example, a BOLD data file might be accompanied by a side-car file that describes acquisition parameters, such as repetition time.
BIDS follows an “inheritance” principle —- meaning that JSON metadata files higher up in the hierarchy automatically apply to relevant files lower in the hierarchy unless explicitly overridden. For example, if all of the BOLD data in a single dataset was acquired using the same protocol, this metadata need not be replicated in each subject’s data folder.
Every project is required to have a
dataset_description.json
file at the root level that contains basic information about the project (e.g., the name of the dataset and a description of its constituents, as well as citation information).BIDS doesn’t actively prohibit you from including non-BIDS-compliant files in a BIDS project -— so you don’t have to just throw out files that you can’t easily shoehorn into the BIDS format. The downside of including non-compliant files is just that most BIDS tools and/or human users won’t know what to do with them, so your dataset might not be quite as useful as it otherwise would be.
11.2.1.1. BIDS Derivatives¶
The BIDS specification was originally created with static representations of neuroimaging datasets in mind. But it quickly became clear that it would also be beneficial for the standard to handle derivatives of datasets -— that is, new BIDS datasets generated by applying some transformation to one or more existing BIDS datasets. For example, suppose we have a BIDS dataset containing raw fMRI images. Typically, we’ll want to preprocess our images (for example, to remove artifacts, apply motion correction, temporally filter the signal, etc.) before submitting them to analysis. It’s great if our preprocessing pipeline can take BIDS datasets as inputs, but what should it then do with the output? A naive approach would be to just construct a new BIDS dataset that’s very similar to the original one, but replace the original (raw) fMRI images with new (preprocessed) ones. But that’s likely to confuse: a user could easily end up with many different versions of the same BIDS dataset, yet have no formal way to determine the relationship between them. To address this problem, the BIDS-Derivatives extension introduces some additional metadata and file naming conventions that make it easier to chain BIDS-aware tools (see the next section) without chaos taking hold.
11.2.2. The BIDS Ecosystem¶
At this point, you might be wondering: what is BIDS good for? Surely the point of introducing a new data standard isn’t just to inconvenience people by forcing them to spend their time organizing their data a certain way? There must be some benefits to individual researchers – and ideally, the community as a whole -— spending precious time making datasets and workflows BIDS-compliant, right? Well, yes, there are! The benefits of buying into the BIDS ecosystem are quite substantial. Let’s look at a few.
11.2.2.1. Easier data sharing and reuse¶
One obvious benefit we alluded to above is that sharing and re-using
neuroimaging data becomes much easier once many people agree to organize their
data the same way. As a trivial example, once you know that BIDS organizes data
according to a fixed hierarchy (i.e., subject –> session –> run), it’s easy to
understand other people’s datasets. There’s no chance of finding time-course
images belonging to subject 1 in, say,
/imaging/old/NbackTask/rawData/niftis/controlgroup/1/
. But the benefits of
BIDS for sharing and reuse come into full view once we consider the impact on
public data repositories. While neuroimaging repositories have been around for a
long time (for an early review, see [Van Horn et al., 2001]), their utility
was long hampered by the prevalence of idiosyncratic file formats and project
organizations. By supporting the BIDS standard, data repositories open the door
to a wide range of powerful capabilities.
To illustrate, consider OpenNeuro – currently the largest and most widely-used repository of brain MRI data. OpenNeuro requires uploaded datasets to be in BIDS format (though datasets do not have to be fully compliant). As a result, the platform can automatically extract, display, and index important metadata. For example, the number of subjects, sessions, and runs in each dataset; the data modalities and experimental tasks present; a standardized description of the dataset; and so on. Integration with free analysis platforms like BrainLife is possible, as is structured querying over datasets via OpenNeuro’s GraphQL API endpoint.
Perhaps most importantly, the incremental effort required by users to make their BIDS-compliant datasets publicly available and immediately usable by others is minimal: in most cases, users have only to click an Upload button and locate the project they wish to share (there is also a command-line interface, for users who prefer to interact with OpenNeuro programmatically).
11.2.2.2. BIDS-Apps¶
A second benefit to representing neuroimaging data in BIDS is that one immediately gains access to a large, and rapidly growing, ecosystem of BIDS-compatible tools. If you’ve used different neuroimaging tools in your research -— for example, perhaps you’ve tried out both FSL and SPM (the two most widely used fMRI data analysis suites) —- you’ll have probably had to do some work to get your data into a somewhat different format for each tool. In a world without standards, tool developers can’t be reasonably expected to know how to read your particular dataset, so the onus falls on you to get your data into a compatible format. In the worst case, this means that every time you want to use a new tool, you have to do some more work.
By contrast, for tools that natively support BIDS, life is simpler. Once we know that fMRIPrep -— a very popular preprocessing pipeline for fMRI data [Esteban et al., 2019] -— takes valid BIDS datasets as inputs, the odds are very high that we’ll be able to apply fMRIPrep to our own valid BIDS datasets with little or no additional work. To facilitate the development and use of these kinds of tools, BIDS developed a lightweight standard for “BIDS Apps” [Gorgolewski et al., 2017]. A BIDS App is an application that takes one or more BIDS datasets as input. There is no restriction on what a BIDS App can do, or what it’s allowed to output (though many BIDS Apps output BIDS-Derivatives datasets); the only requirement is that a BIDS App is containerized (using Docker or Singularity; see Section 4), and accept valid BIDS datasets as input. New BIDS Apps are continuously being developed, and as of this writing, the BIDS Apps website lists a few dozen apps.
What’s particularly nice about this approach is that it doesn’t necessarily require the developers of existing tools to do a lot of extra work themselves to support BIDS. In principle, anyone can write a BIDS-App “wrapper” that mediates between the BIDS format and whatever format a tool natively expects to receive data in. So, for example, the BIDS-Apps registry already contains BIDS-Apps for packages or pipelines like SPM, CPAC, Freesurfer, and the Human Connectome Project Pipelines. Some of these apps are still fairly rudimentary and don’t cover all of the functionality provided by the original tools, but others support much or most of the respective native tool’s functionality. And of course, many BIDS-Apps aren’t wrappers around other tools at all; they’re entirely new tools designed from the very beginning to support only BIDS datasets. We’ve already mentioned fMRIPrep, which has very quickly become arguably the de facto preprocessing pipeline in fMRI; another widely-used BIDS-App is MRIQC [Esteban et al., 2017], a tool for automated quality control and quality assessment of structural and functional MRI scans, which we will see in action in Section 12. Although the BIDS-Apps ecosystem is still in its infancy, the latter two tools already represent something close to killer applications for many researchers.
To demonstrate this statement, consider how easy it is to run fMRIPrep once your data is organized in the BIDS format. After installing the software and its dependencies, running the software is as simple as issue this command in the terminal:
fmriprep data/bids_root/ out/ participant -w work/
where, data/bids_root
points to a directory that contains a BIDS-organized
dataset that includes fMRI data, out
points to the directory into which the
outputs (the BIDS derivatives) will be saved and work
is a directory that will
store some of the intermediate products that will be generated along the way.
Looking at this, it might not be immediately apparent how important BIDS is for
this to be so simple, but consider what software would need to do to find all of
the fMRI data inside of a complex dataset of raw MRI data, that might contain
other data types, other files and so forth. Consider also the complexity that
arises from the fact that fMRI data can be collected using many different
acquisition protocols, and the fact that fMRI processing sometimes uses other
information (for example, measurements of the field map, or anatomical
T1-weighted or T2-weighted scans). The fact that the data complies with BIDS
allows fMRIPrep to locate everything that it needs with the dataset and to make
use of all the information to perform the preprocessing to the best of its
ability given the provided data.
11.2.2.3. Utility libraries¶
Lastly, the widespread adoption of BIDS has also spawned a large number of utility libraries designed to help developers (rather than end users) build their analysis pipelines and tools more efficiently. Suppose I’m writing a script to automate my lab’s typical fMRI analysis workflow. It’s a safe bet that, at multiple points in my script, I’ll need to interact with the input datasets in fairly stereotyped and repetitive ways. For instance, I might need to search through the project directory for all files containing information about event timing, but only for a particular experimental task. Or, I might need to extract some metadata containing key parameters for each time-series image I want to analyze (e.g., the repetition time, or TR). Such tasks are usually not very complicated, but they’re tedious and can slow down development considerably. Worse, at a community level, they introduce massive inefficiency, because each person working on their analysis script ends up writing their own code to solve what are usually very similar problems.
A good utility library abstracts away a lot of this kind of low-level work and allows researchers and analysts to focus most of their attention on high-level objectives. By standardizing the representation of neuroimaging data, BIDS makes it much easier to write good libraries of this sort. Probably the best example so far is a package called PyBIDS, which provides a convenient Python interface for basic querying and manipulation of BIDS datasets. To give you a sense of how much work PyBIDS can save you when you’re writing neuroimaging analysis code in Python, let’s take a look at some of the things the package can do.
We start by importing an object called BIDSLayout
, which we will use to manage
and query the layout of files on disk. We also import a function that knows how
to locate some test data that was installed on our computer together with the
PyBIDS software library.
from bids import BIDSLayout
from bids.tests import get_test_data_path
One of the datasets that we have within the test data path is data set number 5 from OpenNeuro. Note that the software has not actually installed into our hard-drive a bunch of neuroimaging data – that would be too large! Instead, the software installed a bunch of files that have the right names and are organized in the right way, but are mostly empty. This allows us to demonstrate the way that the software works, but don’t try reading the neuroimaging data from any of the files in that directory. We’ll work with a more manageable BIDS dataset, including the files in it in Section 12.
For now, we initialize a BIDSLayout
object, by pointing to the location of the
dataset in our disk. When we do that, the software scans through that part of
our file-system, validates that it is a properly organized BIDS dataset, and
finds all of the files that are arranged according to the specification. This
allows the object to already infer some things about the dataset. For example,
the dataset has 16 subjects and 48 total runs (here a “run” is an individual
fMRI scan). The person who organized this dataset decided not to include a
session folder for each subject. Presumably, because each subject participated in
just one session in this experiment, that information is not useful.
layout = BIDSLayout(get_test_data_path() + "/ds005")
print(layout)
BIDS Layout: ...packages/bids/tests/data/ds005 | Subjects: 16 | Sessions: 0 | Runs: 48
The layout
object now has a method called get()
, which we can use to gain
access to various parts of the dataset. For example, we can ask to give us a
list of the filenames of all of the anatomical ("T1w"
) scans that were
collected for subjects sub-01
and sub-02
layout.get(subject=['01', '02'], suffix="T1w", return_type='filename')
['/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-01/anat/sub-01_T1w.nii.gz',
'/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-02/anat/sub-02_T1w.nii.gz']
Or, using a slightly different logic, all of the functional ("bold"
) scans collected for subject sub-03
layout.get(subject='03', suffix="bold", return_type='filename')
['/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-03/func/sub-03_task-mixedgamblestask_run-01_bold.nii.gz',
'/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-03/func/sub-03_task-mixedgamblestask_run-02_bold.nii.gz',
'/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-03/func/sub-03_task-mixedgamblestask_run-03_bold.nii.gz']
In these examples, we asked the BIDSLayout
object to give us the 'filename'
return type. This is because if we don’t explicitly ask for a return type, we
will get back a list of BIDSImageFile
objects. For example, selecting the
first one of these for sub-03
’s fMRI scans:
bids_files = layout.get(subject="03", suffix="bold")
bids_image = bids_files[0]
This object is quite useful, of course. For example, it knows how to parse the
file name into meaningful entities, using the get_entities()
method, which
returns a dictionary with entities such as subject
and task
that can be used
to keep track of things in analysis scripts.
bids_image.get_entities()
{'datatype': 'func',
'extension': '.nii.gz',
'run': 1,
'subject': '03',
'suffix': 'bold',
'task': 'mixedgamblestask'}
In most cases, you can also get direct access to the imaging data using the
BIDSImageFile
object. This object has a get_image
method, which would
usually return a nibabel Nifti1Image
object. As you will see in
Section 12 this object lets you extract metadata, or even read the data
from a file into memory as a Numpy array. However, in this case, calling the
get_image
method would raise an error, because, as we mentioned above, the
files do not contain any data. So, let’s look at another kind of file that you
can read directly in this case. In addition to the neuroimaging data, BIDS
provides instructions on how to organize files that record the behavioral events
that occurred during an experiment. These are stored as tab-separated-values
(‘.tsv’) files, and there is one for each run in the experiment. For example,
for this dataset, we can query for the events that happened during subject
sub-03
’s 3rd run:
events = layout.get(subject='03', extension=".tsv", task="mixedgamblestask", run="03")
tsv_file = events[0]
print(tsv_file)
<BIDSDataFile filename='/home/runner/.local/lib/python3.10/site-packages/bids/tests/data/ds005/sub-03/func/sub-03_task-mixedgamblestask_run-03_events.tsv'>
Instead of a BIDSImageFile
, the variable tsv_file
is now a BIDSDataFile
object, and this kind of object has a get_df
method, which returns a Pandas
DataFrame
object
bids_df = tsv_file.get_df()
bids_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 85 entries, 0 to 84
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 onset 85 non-null float64
1 duration 85 non-null int64
2 trial_type 85 non-null object
3 distance from indifference 0 non-null float64
4 parametric gain 85 non-null float64
5 parametric loss 0 non-null float64
6 gain 85 non-null int64
7 loss 85 non-null int64
8 PTval 85 non-null float64
9 respnum 85 non-null int64
10 respcat 85 non-null int64
11 RT 85 non-null float64
dtypes: float64(6), int64(5), object(1)
memory usage: 8.1+ KB
This kind of functionality is useful if you are planning to automate your analysis over large datasets that can include heterogeneous acquisitions between subjects and within subjects. At the very least, we hope that the examples have conveyed to you the power inherent in organizing your data according to a standard, as a starting point to use, and maybe also develop, analysis pipelines that expect data in this format. We will see more examples of this in practice in the next section.
11.2.2.4. Exercises¶
BIDS has a set of example datasets available in a GitHub repository at https://github.com/bids-standard/bids-examples. Clone the repository and use pyBIDS to explore the dataset called “ds011”. Using only pyBIDS code, can you figure out how many subjects participated in this study? What are the values of TR that were used in fMRI acquisitions in this dataset?
11.3. Additional resources¶
There are many places to learn more about the NiPy community. The documentation of each of the software libraries that were mentioned here includes fully worked-out examples of data analysis, in addition to links to tutorials, videos, and so forth. For Nipype, in particular, we recommend Michael Notter’s Nipype tutorial as a good starting point.
BIDS is not only a data standard but also a community of developers and users that support the use and further development of the standard. For example, over the time since the standard was first proposed, the community has added instructions to support the organization and sharing of new data modalities (e.g., intracranial EEG) and derivatives of processing. The strength of such a community is that, like open-source software, it draws on the individual strengths of many individuals and many groups who are willing to spend time and effort evolving and improving it. One of the resources that the community has developed to help people who are new to the standard learn more about it is the BIDS Starter Kit. It is a website that includes materials (videos, tutorials, explanations, and examples) to help you get started learning about and eventually using the BIDS standard.
In addition to these relatively static resources, users of both NiPy software and BIDS can interact with other members of the community through a dedicated questions and answers website called Neurostars. On this website, anyone can ask questions about neuroscience software and get answers from experts who frequent the website. In particular, the developers of many of the projects that we mentioned in this chapter, and many of the people who work on the BIDS standard often answer questions about these projects through this website.