Frictionless Data and FAIR Data

Overview

Teaching: 0 min
Exercises: 0 min

Questions

How does Frictionless Data relate to FAIR Data?

Objectives

Understand how Frictionless Data relates to FAIR data.

What are the FAIR Data Principles?

FAIR stands for Findable, Accessible, Interoperable and Reusable. The FAIR Data Principles are an important set of guiding principles for creating reusable datasets, and they are being widely adopted across the research community to improve the re-use of datasets.

For more information on the FAIR Data Principles visit GO-FAIR.

FAIR has 15 principles, and using Frictionless we can meet 10 of them. This lesson looks at the following seven:

F1. (Meta)data are assigned a globally unique and persistent identifier
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2. (Meta)data use vocabularies that follow FAIR principles
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards

How does Frictionless help us meet the FAIR Data Principles?

The Frictionless Data Package Schema provides us with a metadata schema for describing the contents and structure of a data package. The schema uses named properties such as id, name, licenses, sources and contributors, each of which captures specific information about a dataset.

In the following section we will see how we can use the schema to meet specific FAIR data principles.

F1. (Meta)data are assigned a globally unique and persistent identifier

The Frictionless Data Package schema has a recommended property called id reserved for globally unique identifiers. An example of a globally unique identifier might be a DOI.
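
As a minimal sketch of how this might look, the Python snippet below builds a small data package descriptor with an id property and prints it as JSON. The DOI, the package name and the resource path are made-up placeholders, not real identifiers.

```python
import json

# Minimal data package descriptor. The DOI below is a made-up placeholder;
# in practice it would be the identifier issued by your repository.
descriptor = {
    "name": "example-dataset",
    "id": "https://doi.org/10.1234/example-dataset",
    "resources": [{"name": "data", "path": "data.csv"}],
}

# A descriptor is normally saved as datapackage.json next to the data files.
print(json.dumps(descriptor, indent=2))
```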

I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation

The Frictionless Data Package Schema uses the JSON format to store metadata and CSV to store data. Both JSON and CSV are open, widely supported, text-based formats, so the metadata and data can be read by people and by most software without specialist tools.
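
The split between JSON metadata and CSV data can be sketched using only the Python standard library. The example below writes a small CSV file and a matching datapackage.json into a temporary directory. The column names and values are invented purely for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path

# Invented example rows; any tabular data would do.
rows = [
    {"plot_id": "A1", "yield_t_ha": "3.2"},
    {"plot_id": "A2", "yield_t_ha": "2.9"},
]

# Metadata describing the CSV file and its columns.
descriptor = {
    "name": "example-yields",
    "resources": [
        {
            "name": "yields",
            "path": "yields.csv",
            "schema": {
                "fields": [
                    {"name": "plot_id", "type": "string"},
                    {"name": "yield_t_ha", "type": "number"},
                ]
            },
        }
    ],
}

package_dir = Path(tempfile.mkdtemp())

# Data goes into a plain CSV file.
with open(package_dir / "yields.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["plot_id", "yield_t_ha"])
    writer.writeheader()
    writer.writerows(rows)

# Metadata goes into a plain JSON file alongside it.
(package_dir / "datapackage.json").write_text(
    json.dumps(descriptor, indent=2), encoding="utf-8"
)
```

Because both files are plain text, they can be opened in a text editor, tracked in version control and read from almost any programming language.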

I2. (Meta)data use vocabularies that follow FAIR principles

Frictionless allows us to annotate data using controlled vocabularies that also follow FAIR principles.
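
One way to do this, sketched below, is the rdfType property that the Frictionless Table Schema allows on a field, whose value is the URI of a class in an external vocabulary. The field names are hypothetical, and schema.org is used only as an illustration; your community may prefer a different vocabulary.

```python
import json

# Table schema for a hypothetical CSV file. "rdfType" links a column to a
# class in an external vocabulary (schema.org is used here as an example).
schema = {
    "fields": [
        {
            "name": "country",
            "type": "string",
            "rdfType": "http://schema.org/Country",
        },
        {"name": "population", "type": "integer"},
    ]
}

# List which columns carry a vocabulary annotation.
annotated = {
    field["name"]: field["rdfType"]
    for field in schema["fields"]
    if "rdfType" in field
}
print(json.dumps(annotated, indent=2))
```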

R1. (Meta)data are richly described with a plurality of accurate and relevant attributes

It is much easier for researchers to re-use data if they can understand a dataset and decide whether or not it is useful to them. To help researchers make this decision you should provide metadata that richly describe the data. Here plurality means including as much information as possible so that a researcher can confidently re-use the data. The Frictionless Data Package Schema provides standard metadata properties which allow us to provide rich metadata for a dataset.
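
As a rough illustration, the sketch below checks a descriptor for a handful of descriptive properties from the Data Package Schema and reports which ones are still missing. The helper missing_properties and the choice of properties to check are our own assumptions, not part of Frictionless, and the descriptor content is invented.

```python
# Descriptive properties we choose to check for (an illustrative selection).
DESCRIPTIVE_PROPERTIES = [
    "title",
    "description",
    "keywords",
    "contributors",
    "licenses",
    "sources",
]


def missing_properties(descriptor):
    """Return the descriptive properties that are absent or empty."""
    return [name for name in DESCRIPTIVE_PROPERTIES if not descriptor.get(name)]


# Invented example descriptor with only some of the metadata filled in.
descriptor = {
    "name": "example-dataset",
    "title": "Example dataset",
    "description": "Wheat yields recorded on example field plots.",
    "keywords": ["wheat", "yield"],
}

print(missing_properties(descriptor))
# prints ['contributors', 'licenses', 'sources']
```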

R1.1. (Meta)data are released with a clear and accessible data usage license

The Frictionless Data Package Schema has a property called licenses, which allows us to include a licence outlining the conditions under which the dataset can be re-used. For example, a Creative Commons Attribution Licence could be applied to indicate that users must credit the creators of the dataset in any publications which use it.
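
A minimal sketch of a licenses entry is shown below. In the Data Package Schema, licenses is a list, and each entry gives a name, a path, or both, with an optional title. The package name is a placeholder.

```python
import json

descriptor = {
    "name": "example-dataset",
    # "licenses" is a list, so more than one licence can be given.
    "licenses": [
        {
            "name": "CC-BY-4.0",
            "path": "https://creativecommons.org/licenses/by/4.0/",
            "title": "Creative Commons Attribution 4.0",
        }
    ],
}

print(json.dumps(descriptor["licenses"], indent=2))
```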

R1.2. (Meta)data are associated with detailed provenance

The Frictionless Data Package Schema has a property called sources, which allows us to identify the sources that the data came from. This can be used to show the provenance of the data. We can also use other schema properties to provide text information about the provenance of the data, such as the location and timing of data collection.
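
The sketch below shows what a sources entry might look like. Each source has a title and can also give a path (such as a URL) and a contact email. The survey name, URL and email address are hypothetical placeholders.

```python
import json

descriptor = {
    "name": "example-dataset",
    # Free-text provenance can go in the description.
    "description": "Derived from the example farm survey, collected in 2019.",
    # Each source needs a title; "path" and "email" are optional.
    "sources": [
        {
            "title": "Example Farm Survey 2019",
            "path": "https://example.org/farm-survey-2019",
            "email": "data@example.org",
        }
    ],
}

print(json.dumps(descriptor["sources"], indent=2))
```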

R1.3. (Meta)data meet domain-relevant community standards

Although the Frictionless Data Package Schema defines several standard metadata properties, these are very general. Frictionless allows us to add other properties for describing the dataset. For example, we could use agreed community properties such as temporal and location to describe the time and place that a dataset was generated.
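
To illustrate, the sketch below adds temporal and location properties to a descriptor and then lists which top-level properties are custom additions. The structure of temporal and location is an assumption made for this example and should follow whatever your community has agreed. The set of standard property names reflects our reading of version 1 of the Data Package specification, and all values are placeholders.

```python
# Top-level properties defined by the Data Package specification
# (assumed list, based on version 1 of the specification).
STANDARD_PROPERTIES = {
    "name", "id", "licenses", "profile", "title", "description", "homepage",
    "version", "sources", "contributors", "keywords", "image", "created",
    "resources",
}

descriptor = {
    "name": "example-field-trial",
    "title": "Example field trial",
    "resources": [{"name": "data", "path": "data.csv"}],
    # Custom properties; their structure here is hypothetical.
    "temporal": {"start": "2019-04-01", "end": "2019-09-30"},
    "location": {
        "name": "Example research station",
        "latitude": 52.0,
        "longitude": 0.5,
    },
}

# Anything not in the standard set is a custom, community-level addition.
custom = sorted(set(descriptor) - STANDARD_PROPERTIES)
print(custom)
# prints ['location', 'temporal']
```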

In future lessons we will see where Frictionless data is used to support these FAIR data principles.

Key Points

Frictionless Datasets are described by metadata using a standard schema.

Providing good metadata to describe a dataset makes it easier for other researchers to understand and re-use the data.

The FAIR data principles are a set of guidelines for creating findable, accessible, interoperable and reusable datasets.

Using Frictionless Data can help you create datasets that meet many of the FAIR Data Principles.

Frictionless Data for Agricultural Research