This lesson is in the early stages of development (Alpha version)

Data Management: Data Management Planning

Overview

Teaching: 10 min
Exercises: 5 min
Questions
  • What is a data management plan and do I need one?

Objectives
  • Understand the idea of a data management plan

  • Know when to create one

Data Management Planning

Often with projects that generate data, the datasets produced are relatively small and straight-forwards. For example, a single experiment with a single datafile which contains ELISA plate data is very easy to handle. However, what if you’ve planned a project that is expected to generate 10,000 small files? Or 10 huge files? What if the data is known to be in an uncommon file-format (from, say, an old piece of equipment), or what if you’re expecting outside help to process the data? In cases such as these, planning the data aspects of an upcoming project can be beneficial.

Data management plans are typically short documents that outline the plans for the use of the data throughout the project’s life-cycle. Also note that data management plans are becoming a requirement for many funding organisations. This document should briefly answer the following questions,

The Data Management Planning Tool is a free, online platform that allows data management plans to be created quickly from a range of templates.

Data management plans should be reviewed as a project progressed, and modified accordingly.

As part of a data management plan, it may be useful to create a flow-diagram of the planned data-flow through the expected analysis pipeline, i.e. a visual overview starting with the raw data, through all analysis steps (and associate files used), to the final output.

Example of a good project setup

Now that we’ve covered all the main topics for setting up the files and folders of a project, let’s take a look at an example of a good project setup,

alt text

Exercise: Data problems and solutions

Think of a past project where you encountered some sort of data issue. Which of these ideas (file-naming, folder organisation, meta-data, data dictionaries, data management plans) would be useful and why?

Key Points

  • Projects that are expected to generate a large amount of data, and/or complex data, may benefit from such a plan