This lesson is still being designed and assembled (Pre-Alpha version)

Introduction to Data Science and AI for senior researchers

The Data Science for Biomedical Scientists project is funded by The Alan Turing Institute - AI for Science and Government (ASG) Research Programme. This training material provides an introduction to data science and Artificial Intelligence (AI). Drawing on contexts and examples from biomedical research, it discusses AI for automation, the processes of unsupervised and supervised machine learning, their practical applications, and the common pitfalls that researchers should be aware of in order to maintain scientific rigour and research ethics.

This project builds on The Turing Way, The Carpentries and Open Life Science practices. It is hosted by the Tools, practices and systems (TPS) research team, and all materials are shared under a CC-BY 4.0 License. Although the training course is tailored to the biomedical sciences community, the materials are intended to be transferable and directly relevant to data science projects in other domains. Anyone interested in collaborating on or improving this material is welcome to connect with the development team on GitHub (see the repository).


This resource is designed for experimental biologists and biomedical research communities, with a focus on two key professional/career groups:

  • Group leaders or lab managers without prior experience with Data Science or management of computational projects
  • Postdocs and lab scientists (next-generation senior leaders) interested in enabling the integration of computational science into biosciences

In defining the scope of this project for our target audience, we make some assumptions about the learner groups:

  • Our learners have a good understanding of designing or contributing to a scientific project throughout its lifecycle.
  • They have a computational project in mind for which funding and research ethics approval have been received.
  • We also assume that a research team, of any size, is already (either partially or fully) established.

This lesson is developed alongside the Managing Open and Reproducible Computational Projects lesson. Learners are encouraged to work through that lesson as well to learn about the practices and tools that senior researchers should adopt to manage and supervise data science and AI/ML projects in life science domains.


Schedule

Setup: Download files required for the lesson

00:00  1. Introduction to this course
  • What is the purpose of this training?
  • Who are the target audience?
  • What will they learn at the end of this training?

00:10  2. What is AI?
  • What is AI and how can we define it?
  • What types of AI exist and what doesn’t exist?

00:10  3. AI for Automation
  • How is AI used for automating tasks?

00:10  4. AI for Data Insights
  • How is AI used with scientific research to gain insight into data?

00:10  5. Problems with AI
  • What are the common pitfalls with using machine learning?

00:10  6. Practical Considerations for Researchers
  • What considerations do I need to be aware of when conducting research with AI?

00:10  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.