This lesson is in the early stages of development (Alpha version)

Introduction to Conda for (Data) Scientists

This lesson is an introduction to Conda for (data) scientists. Conda is an open source package and environment management system that runs on Windows, macOS, and Linux. Conda installs, runs, and updates packages and their dependencies, and it lets you create, save, load, and switch between environments on your local computer. Although Conda was created for Python programs, it can package and distribute software for any language, such as R, Ruby, Lua, Scala, Java, JavaScript, C/C++, and FORTRAN. This lesson motivates the use of Conda as a development tool for building and sharing project-specific software environments that facilitate reproducible (data) science workflows.
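
As a rough illustration of what a project-specific environment description can look like, here is a minimal sketch of a Conda environment file. The environment name, the channel choice, and the package list are assumptions made up for this example and are not taken from the lesson materials.

```yaml
# environment.yml (hypothetical example; the name and packages are illustrative only)
name: example-project-env   # assumed environment name
channels:
  - conda-forge             # assumed channel; list channels explicitly so others resolve the same packages
dependencies:
  - python=3.11             # pin the interpreter version the project was developed with
  - numpy
  - pandas
```

Under these assumptions, running `conda env create --file environment.yml` would build the environment, and `conda activate example-project-env` would switch to it. Keeping a file like this under version control alongside the project is one way to let collaborators rebuild the same environment.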


This is an intermediate lesson and assumes familiarity with the core materials covered in the Software Carpentry lessons. In particular, learners need to be familiar with the material covered in The Unix Shell, Version Control with Git, and either Plotting and Programming in Python or R for Reproducible Scientific Analysis.


Schedule

Setup  Download files required for the lesson

00:00  1. Getting Started with Conda
       What is Conda?
       Why should I use a package and environment management system as part of my research workflow?
       Why use Conda (+pip)?

00:20  2. Working with Environments
       What is a Conda environment?
       How do I create (delete) an environment?
       How do I activate (deactivate) an environment?
       How do I install packages into existing environments using Conda (+pip)?
       Where should I create my environments?
       How do I find out what packages have been installed in an environment?
       How do I find out which environments exist on my machine?
       How do I delete an environment that I no longer need?

01:35  3. Sharing Environments
       Why should I share my Conda environment with others?
       How do I share my Conda environment with others?
       How do I create a custom kernel for my Conda environments inside JupyterLab?

02:20  4. Using Packages and Channels
       What are Conda channels?
       What are Conda packages?
       Why should I be explicit about which channels my research project uses?

02:50  5. Managing GPU dependencies
       Which NVIDIA libraries are available via Conda?
       What do you do when you need the NVIDIA CUDA Compiler (NVCC) for your project?

03:50  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.