Content from Reproducible Research


Last updated on 2025-05-26

Estimated time: 18 minutes

Overview

Questions

  • What does it mean to be “reproducible”?
  • How is “reproducibility” different from “reuse”?

Objectives

  • Understand the concepts of reproducibility and reuse
  • Be able to describe what is needed for a computational environment to be reproducible.

Introduction


Modern scientific analyses are complex software and logistical workflows that may span multiple software environments and require heterogeneous software and computing infrastructure. Scientific researchers need to keep track of all of this to be able to do their research and to ensure the validity of their work, which can be difficult. Scientific software enables all of this work to happen, but software isn’t a static resource — software is continually developed, revised, and released, which can introduce large breaking changes or subtle computational differences in outputs and results. Having the software you’re using change unintentionally from day to day, run to run, or between machines is problematic when trying to do high quality research: it can cause software bugs, introduce errors in scientific results, and make findings unreproducible. None of these things are desirable!

Callout

When discussing “software” in this lesson we will primarily mean open source software that is openly developed. However, there are situations in which software might (for good reason) be:

  • Closed development with open source release artifacts
  • Closed development and closed source with public binary distributions
  • Closed development and closed source with proprietary licenses

Ask the participants to discuss the question in small groups of 2 to 4 people at their table for 3 minutes and then to share their group’s thoughts.

What are other challenges to reproducible research?

There are many! Here are some you might have thought of:

  • (Not having) Access to data
  • Required software packages being removed from mutable package indexes
  • Unreproducible builds of software that isn’t packaged and distributed on public package indexes
  • Analysis code not being under version control
  • Not having any environment definition configuration files

What did you come up with?

Computational reproducibility


“Reproducible” research can mean many things and is a multipronged problem. This lesson will focus primarily on computational reproducibility. Like all forms of reproducibility, there are multiple “levels” of reproducibility. For this lesson we will focus on “full” reproducibility, meaning that reproducible software environments will:

  • Be defined through high level user configuration files.
  • Have machine produced hash level lock files with a full definition of all software in the environment.
  • Specify target computer platforms for all environments solved.
  • Have the resolution and “solving” of a platform’s environments be machine agnostic.
  • Have the software packages defined in the environments exist on immutable public package indexes.

Hardware accelerated environments

Software that involves hardware acceleration on computing resources like GPUs requires additional information for full computational reproducibility. In addition to the computer platform, information about the hardware acceleration device, its supported drivers, and compatible hardware accelerated builds of the software in the environment (GPU enabled builds) is required. Traditionally this has been very difficult to do, but multiple recent technological advancements (made possible by social agreements and collaborations) in the scientific open source world now provide solutions to these problems.

What are possible challenges of reproducible hardware accelerated environments?

Here are some you might have thought of:

  • Installing hardware acceleration drivers and libraries on the machine with the GPU
  • Knowing what drivers are supported for the available GPUs
  • Providing instructions to install the same drivers and libraries on multiple computing platforms
  • Having the “deployment” machine’s resources and environment where the analysis is done match the “development” machine’s environment

What did you come up with?

Does computational reproducibility mean that the exact same numeric results should be achieved every time?

Not necessarily. Even though the computational software environment is identical there are things that can change between runs of software that could slightly change numerical results (e.g. random number generation seeds, file read order, machine entropy). This isn’t necessarily a problem, and in general one should be more concerned with getting answers that make sense within application uncertainties than matching down to machine precision.

What are additional reasons you thought of?

Computational reproducibility vs. scientific reuse


Aiming for computational reproducibility is the first step to making scientific research more broadly useful. For the purposes of a single analysis this should be the primary goal. However, just because a software environment is fully reproducible does not mean that the research is automatically reusable. Reuse requires the tools and components of the scientific workflow to be composable and able to interoperate to form new workflows. The steps of a workflow might exist in radically different computational environments and require different software, or different versions of the same software tools. Given these demands, reproducible computational software environments are a first step toward fully reusable scientific workflows.

This lesson will focus on computational reproducibility of hardware accelerated scientific workflows (e.g. machine learning). Scientifically reusable analysis workflows are a more extensive topic, but this lesson will link to references on the topic.

What are challenges to your own research practices to making them reproducible and reusable?

  • Technical expertise in reproducibility technologies
  • Time to learn new tools
  • Balancing reproducibility concerns with using tools the entire research team can understand

What did you come up with?

Key Points

  • Modern scientific research is complex and requires software environments.
  • Computational reproducibility helps to enable reproducible science, but is not sufficient by itself.
  • Reproducible computational software environments that use hardware acceleration require additional information.
  • New technologies make all of these processes easier.
  • Reproducible computational software environments are a first step toward fully reusable scientific workflows but are not sufficient by themselves.

Content from Introduction to Pixi


Last updated on 2025-06-17

Estimated time: 45 minutes

Overview

Questions

  • What is Pixi?
  • How does Pixi enable fully reproducible software environments?
  • What are Pixi’s semantics?

Objectives

  • Learn Pixi’s workflow design
  • Understand the relationship between a Pixi manifest and a lock file
  • Understand how to create a multi-platform and multi-environment Pixi workspace

Pixi


As described in the previous section on computational reproducibility, to have reproducible software environments we need tools that can take high level human writeable environment configuration files and produce machine readable hash level lock files that exactly specify every piece of software that exists in an environment.

Pixi is a cross-platform package and environment manager that can handle complex development workflows. Importantly, Pixi automatically and non-optionally produces or updates a lock file for the software environments defined by the user whenever any action mutates an environment. Pixi is written in Rust, and leverages the language’s speed and technologies to solve environments quickly.

Pixi addresses the concept of computational reproducibility by focusing on a set of main features

  1. Virtual environment management: Pixi can create environments that contain conda packages and Python packages and use or switch between environments easily.
  2. Package management: Pixi enables the user to install, update, and remove packages from these environments through the pixi command line.
  3. Task management: Pixi has a task runner system built-in, which allows for tasks with custom logic and dependencies on other tasks to be created.

combined with robust behaviors

  1. Automatic lock files: Any changes to a Pixi workspace that can mutate the environments defined in it will automatically and non-optionally result in the Pixi lock file for the workspace being updated. This ensures that any and every state of a Pixi project is trivially computationally reproducible.
  2. Solving environments for other platforms: Pixi allows the user to solve environments for platforms other than the current machine’s. This allows users to solve and share environments with any collaborator with confidence that all environments will work with no additional setup.
  3. Parity of conda and Python packages: Pixi allows conda packages and Python packages to be used together seamlessly, and is unique in its ability to handle overlap in dependencies between them. Pixi will first solve all conda package requirements for the target environment, lock the environment, then solve all the dependencies of the Python packages for the environment, determine if there are any overlaps with the existing conda environment, and only then install the missing Python dependencies. This allows for fully reproducible solves and for the two package ecosystems to complement each other rather than potentially cause conflicts (see the manifest sketch after this list).
  4. Efficient caching: Pixi uses an extremely efficient global caching scheme. This means that the first time a package is installed on a machine with Pixi is the slowest it will ever be to install for any future project on the machine while the cache is still active.
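To illustrate the conda and Python package parity, a Pixi manifest can declare both kinds of dependencies side by side. A minimal sketch (the rich PyPI package is an arbitrary example choice):

TOML

[dependencies]
# conda packages, solved first from the workspace channels
python = ">=3.13,<3.14"
numpy = "*"

[pypi-dependencies]
# Python packages, solved against the already locked conda environment
rich = "*"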

Project-based workflows


Pixi uses a “project-based” workflow which scopes a workspace and the installed tooling for a project to the project’s directory tree.

Pros

  • Environments in the workspace are isolated to the project and cannot cause conflicts with any tools or projects outside of the project.
  • A high level declarative syntax allows for users to state only what they need, making even complex environments easy to understand and share.
  • Environments can be treated as transient and be fully deleted and then rebuilt within seconds without worry about breaking other projects. This allows for much greater freedom of exploration and development without fear.

Cons

  • As each project has its own version of its packages installed, and does not share a copy with other projects, the total disk space used on a machine can be larger than with other forms of development workflows. This can be mitigated on disk limited machines by cleaning environments not in use (while keeping their lock files) and by cleaning the system cache periodically.
  • Each project needs to be set up by itself and does not reuse components of previous projects.

Pixi project files and the CLI API basics


Every Pixi project begins with creating a manifest file. A manifest file is a declarative configuration file that lists the high level requirements of a project. Pixi then takes those requirements and constraints and solves for the full dependency tree.

Let’s create our first Pixi project. First, to have a uniform directory tree experience, clone the pixi-lesson GitHub repository (which you made as part of the setup) under your home directory on your machine and navigate to it.

BASH

cd ~
git clone git@github.com:<username>/pixi-lesson.git
cd ~/pixi-lesson

Then use pixi init to create a new project directory and initialize a Pixi manifest with your machine’s configuration.

BASH

pixi init example

OUTPUT

Created /home/<username>/pixi-lesson/example/pixi.toml

Navigate to the example directory and check the directory structure
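For example (the exact Git configuration files created may vary with your Pixi version):

BASH

cd example
ls -a

OUTPUT

.  ..  .gitattributes  .gitignore  pixi.toml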

We see that Pixi has set up Git configuration files for the project as well as a Pixi manifest pixi.toml file. Checking the default manifest file, we see three TOML tables: workspace, tasks, and dependencies.

BASH

cat pixi.toml

TOML

[workspace]
authors = ["Your Name <your email from your global Git config>"]
channels = ["conda-forge"]
name = "example"
# This will be whatever your machine's platform is
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[dependencies]

  • workspace: Defines metadata and properties for the entire project.
  • tasks: Defines tasks for the task runner system to execute from the command line and their dependencies.
  • dependencies: Defines the conda package dependencies from the channels in your workspace table.

Callout

For the rest of the lesson we’ll ignore the authors list in our discussions as it is optional and will be specific to you.

At the moment there are no dependencies defined in the manifest, so let’s add Python using the pixi add CLI API.

BASH

pixi add python

OUTPUT

✔ Added python >=3.13.3,<3.14

What happened? We saw that python got added, and we can see that the pixi.toml manifest now contains python as a dependency

BASH

cat pixi.toml

TOML

[workspace]
channels = ["conda-forge"]
name = "example"
# This will be whatever your machine's platform is
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[dependencies]
python = ">=3.13.3,<3.14"

Further, we also now see that a pixi.lock lock file has been created in the project directory as well as a .pixi/ directory.

The .pixi/ directory contains the installed environments. We can see that at the moment there is just one environment named default
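We can check this directly:

BASH

ls .pixi/envs

OUTPUT

default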

Inside the .pixi/envs/default/ directory are all the libraries, header files, and executables that are needed by the environment.

The pixi.lock lock file contains YAML that defines all requested conda package dependencies in the manifest, as well as their dependencies, at the exact versions that were solved for. It provides their full URLs on the conda package index to download from as well as digest information for the exact package to ensure that it is exactly specified and that version, and only that version, will be downloaded and installed in the future. We can even test that now by deleting the installed environment fully with pixi clean and then getting it back (bit for bit) in a few seconds with pixi install.

BASH

pixi clean

OUTPUT

  removed /home/<username>/pixi-lesson/example/.pixi/envs

BASH

 pixi install

OUTPUT

✔ The default environment has been installed.

We can also see all the packages that were installed and are now available for us to use with pixi list
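BASH

pixi list

The output is a table listing every package in the environment along with its version, build, size, kind, and source channel; the exact contents will depend on your platform and when the environment was solved.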

Extending the manifest

Let’s extend this manifest to add the Python library numpy and the Jupyter tools notebook and jupyterlab as dependencies and add a task called lab that will launch Jupyter Lab in the current working directory.

Hint: Look at the Pixi manifest table structure to think how a task might be added. It is fine to read the docs too!

Let’s start at the command line and add the additional dependencies with pixi add

BASH

pixi add numpy notebook jupyterlab

We can manually edit the pixi.toml with a text editor to add a task named lab that when called executes jupyter lab. This is sometimes the easiest thing to do, but we can also use the pixi CLI.

BASH

pixi task add lab "jupyter lab" --description "Launch JupyterLab"

OUTPUT

✔ Added task `lab`: jupyter lab, description = "Launch JupyterLab"

The resulting pixi.toml manifest is
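At this point it should look something like the following (the exact solved version bounds will depend on when you ran pixi add):

TOML

[workspace]
channels = ["conda-forge"]
name = "example"
# This will be whatever your machine's platform is
platforms = ["linux-64"]
version = "0.1.0"

[tasks.lab]
description = "Launch JupyterLab"
cmd = "jupyter lab"

[dependencies]
python = ">=3.13.3,<3.14"
numpy = ">=2.3.0,<3"
notebook = ">=7.4.3,<8"
jupyterlab = ">=4.4.3,<5"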

With our new dependencies added to the project manifest and our lab task defined, let’s use all of them together by launching our task using pixi run

BASH

pixi run lab

and we see that Jupyter Lab launches!

Adding the canonical Pixi start task

For Pixi projects, it is canonical to have a start task so that for any Pixi project a user can run

BASH

pixi run start

and begin to explore the project. Add a task called start that depends-on the lab task.
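One way to do this is to edit pixi.toml directly and add a start task table that only depends on lab:

TOML

[tasks.start]
description = "Start exploring the Pixi project"
depends-on = ["lab"]

Alternatively, pixi task alias can create a task that just depends on other tasks (e.g. pixi task alias start lab), though you may then want to edit the description by hand.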

Task overview

A user can also run pixi task list to get a summary of the tasks that are available to them in the workspace.

BASH

pixi task list

OUTPUT

Tasks that can run on this machine:
-----------------------------------
lab, start
Task   Description
lab    Launch JupyterLab
start  Start exploring the Pixi project

Here we used pixi run to execute tasks in the workspace’s environments without ever explicitly activating the environment. This is a different behavior compared to tools like conda or Python virtual environments, where the assumption is that you have activated an environment before using it. With Pixi we can do the equivalent with pixi shell, which starts a subshell in the current working directory with the Pixi environment activated.

BASH

pixi shell

Notice how your shell prompt now has (example) (the workspace name) preceding it, signaling to you that you’re in the activated environment. You can now directly run commands that use the environment.

BASH

python

OUTPUT

Python 3.13.5 | packaged by conda-forge | (main, Jun 13 2025, 01:14:40) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

As we’re in a subshell, to exit the environment and move back to the shell that launched the subshell, just exit the shell.

BASH

exit

Multi platform projects

Extend your project to additionally support the linux-64, osx-arm64, and win-64 platforms.

Using the pixi workspace CLI API, one can add the platforms with

BASH

pixi workspace platform add linux-64 osx-arm64 win-64

OUTPUT

✔ Added linux-64
✔ Added osx-arm64
✔ Added win-64

This adds the platforms to the workspace platforms list, solves the environments for the new platforms, and updates the lock file!

One can also manually edit the pixi.toml with a text editor to add the desired platforms to the platforms list.

The resulting pixi.toml manifest is

TOML

[workspace]
channels = ["conda-forge"]
name = "example"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks.lab]
description = "Launch JupyterLab"
cmd = "jupyter lab"

[tasks.start]
description = "Start exploring the Pixi project"
depends-on = ["lab"]

[dependencies]
python = ">=3.13.5,<3.14"
numpy = ">=2.3.0,<3"
notebook = ">=7.4.3,<8"
jupyterlab = ">=4.4.3,<5"

So far the Pixi project has only had one environment defined in it. We can make the project multi-environment by first defining a new “feature” which provides all the fields necessary to define part of an environment to extend the default environment. We can create a new feature named dev and then create an environment also named dev which uses the dev feature to extend the default environment

TOML

[workspace]
channels = ["conda-forge"]
name = "example"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks.lab]
description = "Launch JupyterLab"
cmd = "jupyter lab"

[tasks.start]
description = "Start exploring the Pixi project"
depends-on = ["lab"]

[dependencies]
python = ">=3.13.5,<3.14"
numpy = ">=2.3.0,<3"
notebook = ">=7.4.3,<8"
jupyterlab = ">=4.4.3,<5"

[feature.dev.dependencies]

[environments]
dev = ["dev"]

We can now add pre-commit to the dev feature’s dependencies and have it be accessible in the dev environment.

BASH

pixi add --feature dev pre-commit

OUTPUT

✔ Added pre-commit >=4.2.0,<5
Added these only for feature: dev

TOML

[workspace]
channels = ["conda-forge"]
name = "example"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks.lab]
description = "Launch JupyterLab"
cmd = "jupyter lab"

[tasks.start]
description = "Start exploring the Pixi project"
depends-on = ["lab"]

[dependencies]
python = ">=3.13.5,<3.14"
numpy = ">=2.3.0,<3"
notebook = ">=7.4.3,<8"
jupyterlab = ">=4.4.3,<5"

[feature.dev.dependencies]
pre-commit = ">=4.2.0,<5"

[environments]
dev = ["dev"]

This now allows us to specify the environment we want tasks to run in with the --environment flag

BASH

pixi run --environment dev pre-commit --help

BASH

pixi shell --environment dev

Caution

The pixi workspace CLI can also be used to add existing features to environments, but a feature needs to be defined before it can be added to an environment in the manifest

BASH

pixi add --feature dev pre-commit
pixi workspace environment add --feature dev dev
pixi upgrade --feature dev pre-commit

Global Tools


With the pixi global CLI API, users can manage globally installed tools in a way that makes them available from any directory on their machine.

As an example, we can install the bat program — a cat clone with syntax highlighting and Git integration — as a global utility from conda-forge using pixi global.

BASH

pixi global install bat

OUTPUT

└── bat: 0.25.0 (installed)
    └─ exposes: bat

Pixi has now installed bat for us in a custom environment under ~/.pixi/envs/bat/ and then exposed the bat command globally by placing bat on our shell’s PATH at ~/.pixi/bin/bat. This now means that for any new terminal shell we open, bat will be available to use.
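We can verify where the exposed command lives (the path will contain your username):

BASH

which bat

OUTPUT

/home/<username>/.pixi/bin/bat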

Using pixi global has also created a ~/.pixi/manifests/pixi-global.toml file that tracks all of the software that is globally installed by Pixi

TOML

version = 1

[envs.bat]
channels = ["conda-forge"]
dependencies = { bat = "*" }
exposed = { bat = "bat" }

As new software is added to the system with pixi global this global manifest is updated. If the global manifest is updated manually, the next time pixi global update is run, the environments defined in the global manifest will be installed on the system. This means that by sharing a Pixi global manifest, a new machine can be provisioned with an entire suite of command line tools in seconds.
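A minimal provisioning sketch, assuming the shared manifest is kept in a hypothetical dotfiles repository:

BASH

mkdir -p ~/.pixi/manifests
cp ~/dotfiles/pixi-global.toml ~/.pixi/manifests/pixi-global.toml
pixi global update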

Key Points

  • Pixi uses a project based workflow and a declarative project manifest file to define project operations.
  • Pixi automatically creates or updates a hash level lock file anytime the project manifest or dependencies are mutated.
  • Pixi allows for multi-platform and multi-environment projects to be defined in a single project manifest and be fully described in a single lock file.

Content from Backwards compatibility with conda


Last updated on 2025-06-15

Estimated time: 15 minutes

Overview

Questions

  • Can Pixi environments be backported to conda formats?

Objectives

  • Learn how to export Pixi workspace environments as conda environment definition files
  • Learn how to export Pixi workspace environments as conda explicit spec files

Backporting to conda environments


While Pixi is currently unique in its abilities, there may be situations in which, given technical debt, migration effort in large collaborations, or collaborator preferences, switching all infrastructure to use Pixi might not yet be feasible. It would still be useful to take advantage of Pixi’s technology and features as an individual while being able to export Pixi workspace environments and lock files to the “legacy system” of conda. 1 Luckily, we can do this with the pixi workspace export commands.

Exporting workspace environments to conda environment definition files


If you want to export a Pixi workspace environment’s high level dependencies to a conda environment definition file (environment.yaml) you can use the pixi workspace export conda-environment subcommand

BASH

pixi workspace export conda-environment --environment <environment> --platform <platform> environment.yaml

where if no environment or platform options are given the default environment and the system’s platform will be used.
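For example, to export the dev environment from the earlier workspace solved for osx-arm64:

BASH

pixi workspace export conda-environment --environment dev --platform osx-arm64 dev-environment.yaml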

Export one of your Pixi workspace environments to a conda environment

BASH

pixi workspace export conda-environment environment.yaml

YAML

name: default
channels:
- conda-forge
- nodefaults
dependencies:
- python >=3.13.5,<3.14
- numpy >=2.3.0,<3
- notebook >=7.4.3,<8
- jupyterlab >=4.4.3,<5

Exporting workspace environments to conda explicit spec files


Ideally we’d like to go further than the high level conda environment definition file and aim for computational reproducibility with a conda explicit spec file. Conda explicit spec files are a form of platform specific lock file that consists of a text file with an @EXPLICIT header followed by a list of conda package URLs, each optionally followed by its MD5 or SHA256 digest (aka “hash”).

Example:

TXT

@EXPLICIT
https://conda.anaconda.org/conda-forge/noarch/python_abi-3.13-7_cp313.conda#e84b44e6300f1703cb25d29120c5b1d8

Explicit spec files can be created from locked Pixi workspace environments with the pixi workspace export conda-explicit-spec subcommand

BASH

pixi workspace export conda-explicit-spec --environment <environment> --platform <platform> .

where if no environment or platform options are given the default environment and the system’s platform will be used. The explicit spec file will be automatically named with the form <environment>_<platform>_conda_spec.txt. So if you are on a linux-64 machine and didn’t specify an environment name, your generated explicit spec file will be named default_linux-64_conda_spec.txt.

Caution

Conda spec files only support conda packages and do not support Python packages or source packages.

Export one of your Pixi workspace environment lock files as a conda explicit spec file

Hint: Check the --help output.

BASH

pixi workspace export conda-explicit-spec --platform linux-64 --platform osx-arm64 .

BASH

head -n 20 default_linux-64_conda_spec.txt

OUTPUT

# Generated by `pixi workspace export`
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/noarch/python_abi-3.13-7_cp313.conda#e84b44e6300f1703cb25d29120c5b1d8
https://conda.anaconda.org/conda-forge/noarch/tzdata-2025b-h78e105d_0.conda#4222072737ccff51314b5ece9c7d6f5a
https://conda.anaconda.org/conda-forge/linux-64/libgomp-15.1.0-h767d61c_2.conda#fbe7d535ff9d3a168c148e07358cd5b1
https://conda.anaconda.org/conda-forge/linux-64/_libgcc_mutex-0.1-conda_forge.tar.bz2#d7c89558ba9fa0495403155b64376d81
https://conda.anaconda.org/conda-forge/linux-64/_openmp_mutex-4.5-2_gnu.tar.bz2#73aaf86a425cc6e73fcf236a5a46396d
https://conda.anaconda.org/conda-forge/linux-64/libgcc-15.1.0-h767d61c_2.conda#ea8ac52380885ed41c1baa8f1d6d2b93
https://conda.anaconda.org/conda-forge/linux-64/libzlib-1.3.1-hb9d3cd8_2.conda#edb0dca6bc32e4f4789199455a1dbeb8
https://conda.anaconda.org/conda-forge/linux-64/tk-8.6.13-noxft_hd72426e_102.conda#a0116df4f4ed05c303811a837d5b39d8
https://conda.anaconda.org/conda-forge/linux-64/ncurses-6.5-h2d0b736_3.conda#47e340acb35de30501a76c7c799c41d7
https://conda.anaconda.org/conda-forge/linux-64/readline-8.2-h8c095d6_2.conda#283b96675859b20a825f8fa30f311446
https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.4.26-hbd8a1cb_0.conda#95db94f75ba080a22eb623590993167b
https://conda.anaconda.org/conda-forge/linux-64/openssl-3.5.0-h7b32b05_1.conda#de356753cfdbffcde5bb1e86e3aa6cd0
https://conda.anaconda.org/conda-forge/linux-64/libgcc-ng-15.1.0-h69a702a_2.conda#ddca86c7040dd0e73b2b69bd7833d225
https://conda.anaconda.org/conda-forge/linux-64/libuuid-2.38.1-h0b41bf4_0.conda#40b61aab5c7ba9ff276c41cfffe6b80b
https://conda.anaconda.org/conda-forge/linux-64/libsqlite-3.50.0-hee588c1_0.conda#71888e92098d0f8c41b09a671ad289bc
https://conda.anaconda.org/conda-forge/linux-64/libmpdec-4.0.0-hb9d3cd8_0.conda#c7e925f37e3b40d893459e625f6a53f1
https://conda.anaconda.org/conda-forge/linux-64/liblzma-5.8.1-hb9d3cd8_1.conda#a76fd702c93cd2dfd89eff30a5fd45a8

BASH

head -n 20 default_osx-arm64_conda_spec.txt

OUTPUT

# Generated by `pixi workspace export`
# platform: osx-arm64
@EXPLICIT
https://conda.anaconda.org/conda-forge/noarch/python_abi-3.13-7_cp313.conda#e84b44e6300f1703cb25d29120c5b1d8
https://conda.anaconda.org/conda-forge/noarch/tzdata-2025b-h78e105d_0.conda#4222072737ccff51314b5ece9c7d6f5a
https://conda.anaconda.org/conda-forge/osx-arm64/libzlib-1.3.1-h8359307_2.conda#369964e85dc26bfe78f41399b366c435
https://conda.anaconda.org/conda-forge/osx-arm64/tk-8.6.13-h892fb3f_2.conda#7362396c170252e7b7b0c8fb37fe9c78
https://conda.anaconda.org/conda-forge/osx-arm64/ncurses-6.5-h5e97a16_3.conda#068d497125e4bf8a66bf707254fff5ae
https://conda.anaconda.org/conda-forge/osx-arm64/readline-8.2-h1d1bf99_2.conda#63ef3f6e6d6d5c589e64f11263dc5676
https://conda.anaconda.org/conda-forge/noarch/ca-certificates-2025.4.26-hbd8a1cb_0.conda#95db94f75ba080a22eb623590993167b
https://conda.anaconda.org/conda-forge/osx-arm64/openssl-3.5.0-h81ee809_1.conda#5c7aef00ef60738a14e0e612cfc5bcde
https://conda.anaconda.org/conda-forge/osx-arm64/libsqlite-3.50.0-h3f77e49_0.conda#cda0ec640bc4698d0813a8fb459aee58
https://conda.anaconda.org/conda-forge/osx-arm64/libmpdec-4.0.0-h5505292_0.conda#85ccccb47823dd9f7a99d2c7f530342f
https://conda.anaconda.org/conda-forge/osx-arm64/liblzma-5.8.1-h39f12f2_1.conda#4e8ef3d79c97c9021b34d682c24c2044
https://conda.anaconda.org/conda-forge/osx-arm64/libffi-3.4.6-h1da3d7d_1.conda#c215a60c2935b517dcda8cad4705734d
https://conda.anaconda.org/conda-forge/osx-arm64/libexpat-2.7.0-h286801f_0.conda#6934bbb74380e045741eb8637641a65b
https://conda.anaconda.org/conda-forge/osx-arm64/bzip2-1.0.8-h99b78c6_7.conda#fc6948412dbbbe9a4c9ddbbcfe0a79ab
https://conda.anaconda.org/conda-forge/osx-arm64/python-3.13.3-h81fe080_101_cp313.conda#b3240ae8c42a3230e0b7f831e1c72e9f
https://conda.anaconda.org/conda-forge/osx-arm64/llvm-openmp-20.1.6-hdb05f8b_0.conda#7a3b28d59940a28e761e0a623241a832
https://conda.anaconda.org/conda-forge/osx-arm64/libgfortran5-14.2.0-h2c44a93_105.conda#06f35a3b1479ec55036e1c9872f97f2c

Caution

While conda spec files meet our criteria for computational reproducibility, they are essentially package list snapshots and lack the metadata to provide robust dependency graph inspection and targeted updates. They can be a useful tool, but are not robust lock file formats like those from Pixi and conda-lock.

Creating conda environments from the exports


To create a conda environment from the exported environment.yaml conda environment definition file, you use the normal conda environment creation command

BASH

conda env create --file environment.yaml

but to create a conda environment from the exported conda explicit spec file, use the command

BASH

conda create --name <name> --file <spec file>

or to install the packages given in the explicit spec file into an existing conda environment, use

BASH

conda install --name <name> --file <spec file>

So by using Pixi, you can fully export your workspace environments to conda environments and then use them, even to get the exact hash level locked environment from your Pixi workspace installed on another machine!
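Putting that together, a sketch of recreating the locked default environment on another linux-64 machine (the environment name pixi-default is an arbitrary choice):

BASH

conda create --name pixi-default --file default_linux-64_conda_spec.txt
conda activate pixi-default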

Caution

Conda does not check that the platform or the dependencies are correct for the machine when installing from an explicit spec file. Only use a spec file when you are certain your machine has the same platform as the one the spec file was created for.

Key Points

  • If you need to use conda, you can export Pixi workspace environments to formats conda can use.
  • Exporting conda explicit spec files from Pixi locked environments provides the ability to create the same hash level locked environment with conda that Pixi solved.

  1. Conda is still a very well supported tool and the dominant conda package environment manager by number of users.

Content from Conda packages


Last updated on 2025-06-15

Estimated time: 45 minutes

Overview

Questions

  • What is a conda package?

Objectives

  • Learn about conda package structure

Conda packages


In a previous episode we learned that Pixi can control conda packages, but what is a conda package? Conda packages (.conda files) are language agnostic file archives that contain built code distributions. This is quite powerful, as it allows for arbitrary code to be built for any target platform and then packaged with its metadata. When a conda package is downloaded and then unpacked with a conda package management tool (e.g. Pixi, conda, mamba), the only thing that needs to be done to “install” the package is to copy the package’s directory tree to the base of the environment’s directory tree. Package contents are also simple: they can only contain files and symbolic links.

Exploring package structure

To better understand conda packages and the environment directory tree structure they exist in, let’s make a new Pixi project and look at the project environment directory tree structure.

BASH

pixi init ~/pixi-lesson/dir-structure
cd ~/pixi-lesson/dir-structure

OUTPUT

✔ Created /home/<username>/pixi-lesson/dir-structure/pixi.toml

To help visualize this on the command line we’ll use the tree program (Linux and macOS), which we’ll install as a global utility from conda-forge using pixi global.
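Following the same pattern as the bat example from the previous episode:

BASH

pixi global install tree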

At the moment our Pixi project manifest is empty

TOML

[workspace]
channels = ["conda-forge"]
name = "dir-structure"
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[dependencies]

and so is our directory tree

Let’s add a dependency to our project to change that

BASH

pixi add python

OUTPUT

✔ Added python >=3.13.3,<3.14

which now gives us an updated Pixi manifest

TOML

[workspace]
channels = ["conda-forge"]
name = "dir-structure"
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[dependencies]
python = ">=3.13.3,<3.14"

and the start of a directory tree with the .pixi/ directory

Let’s now use tree to look at the directory structure of the Pixi project starting at the same directory where the pixi.toml manifest file is.
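Limiting tree to a depth of one to keep the output readable, we should see something like:

BASH

tree -L 1 .pixi/envs/default

OUTPUT

.pixi/envs/default
├── bin
├── conda-meta
├── include
├── lib
├── man
├── sbin
├── share
├── ssl
├── x86_64-conda_cos6-linux-gnu
└── x86_64-conda-linux-gnu

10 directories, 0 files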

We see that the default environment that Pixi created has the standard directory tree layout for operating systems following the Filesystem Hierarchy Standard (FHS) (e.g. Unix machines)

  • bin: for binary executables
  • include: for include files (e.g. header files in C/C++)
  • lib: for binary libraries
  • share: for files and data that other libraries or applications might need from the installed programs

as well as some less common ones related to system administration

  • man: for manual pages
  • sbin: for system binaries
  • ssl: for SSL (Secure Sockets Layer) certificates to provide secure encryption when connecting to websites

as well as other directories that are specific to conda packages

  • conda-meta: for metadata for all installed conda packages
  • x86_64-conda_cos6-linux-gnu and x86_64-conda-linux-gnu: for platform specific tools (like linkers) — this will vary depending on your operating system

How did this directory tree get here? It is a result of all the files that were in the conda packages we downloaded and installed as dependencies of Python.

We can download an individual conda package manually using a tool like curl. Let’s download the particular python conda package from where it is hosted on conda-forge’s Anaconda.org organization
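For example, using the URL of a python 3.13.3 conda package (this particular build is for osx-arm64; substitute the package URL for your own platform):

BASH

curl -sLO https://conda.anaconda.org/conda-forge/osx-arm64/python-3.13.3-h81fe080_101_cp313.conda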

.conda is probably not a file extension that you’ve seen before, but you are probably very familiar with the actual archive compression format. .conda files are .zip files that have been renamed, but we can use the same utilities to interact with them as we would with .zip files.
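So we can extract the archive with unzip, here into an output/ directory to match the steps below:

BASH

unzip python-3.13.3-*.conda -d output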

We see that the .conda archive contained package format metadata (metadata.json) as well as two other tar archives compressed with the Zstandard compression algorithm (.tar.zst). We can uncompress them manually with tar

BASH

cd output
mkdir -p pkg
tar --zstd -xvf pkg-python-*.tar.zst --directory pkg

and then look at the uncompressed directory tree with tree
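Again limiting the depth for readability:

BASH

tree -L 2 pkg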

So we can see that the directory structure of the conda package

is reflected in the directory tree of the Pixi environment with the package installed

Discussion

Hopefully this seems straightforward and unmagical — because it is! Conda package structure is simple and easy to understand because it builds off of basic file system structures and doesn’t try to invent new systems. It is important to demystify what is happening with the directory tree structure though so that we keep in our minds that our tools are just manipulating files.

Exploring conda-forge

As of 2025 conda-forge has over 28,500 packages on it. Go to the conda-forge package list website (https://conda-forge.org/packages/) and try to find three packages that you use in your research, and three packages from your scientific field that are more niche.

Research packages:

Niche particle physics packages:

Key Points

  • Conda packages are specially named .zip files that contain files and symbolic links structured in a directory tree.

Content from CUDA conda packages


Last updated on 2025-06-15

Estimated time: 45 minutes

Overview

Questions

  • What is CUDA?
  • How can I use CUDA enabled conda packages?

Objectives

  • Understand how CUDA can be used with conda packages
  • Create a hardware accelerated environment

CUDA


CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). The CUDA ecosystem provides software development kits (SDKs) with APIs to CUDA that allow software developers to write hardware accelerated programs for NVIDIA GPUs in various languages. CUDA supports a number of languages including C, C++, Fortran, Python, and Julia. While there are other types of hardware acceleration development platforms, as of 2025 CUDA is the most abundant platform for scientific computing that uses GPUs and is effectively the default choice for major machine learning libraries and applications.

CUDA is closed source and proprietary to NVIDIA, which means that NVIDIA has historically limited download access to the CUDA toolkits and drivers to registered NVIDIA developers (while keeping the software free of monetary charge to use). CUDA then required a multi-step installation process with manual steps and decisions based on the target platform and the particular CUDA version. This meant that when CUDA enabled environments were set up on a particular machine they were powerful and optimized, but brittle to change and could easily be broken if system wide updates (like security fixes) occurred. CUDA software environments were bespoke and not many scientists understood how to construct and curate them.

CUDA on conda-forge


In late 2018 to better support the scientific developer community, NVIDIA started to release components of the CUDA toolkits on the nvidia conda channel. This provided the first access to start to create conda environments where the versions of different CUDA tools could be directly specified and downloaded. However, all of this work was being done internally in NVIDIA and as it was on a separate channel it was less visible and it still required additional knowledge to work with. In 2023, NVIDIA’s open source team began to move the release of CUDA conda packages from the nvidia channel to conda-forge, making it easier to discover and allowing for community support. With significant advancements in system driver specification support, CUDA 12 became the first version of CUDA to be released as conda packages through conda-forge and included all CUDA libraries from the CUDA compiler nvcc to the CUDA development libraries. They also released CUDA metapackages that allowed users to easily describe the version of CUDA they required (e.g. cuda-version=12.5) and the CUDA conda packages they wanted (e.g. cuda). This significantly improved the ability for researchers to easily create CUDA accelerated computing environments.

This is all possible via use of the __cuda virtual conda package, which is determined automatically by conda package managers from the hardware information associated with the machine the package manager is installed on.

With Pixi, a user can get this information with pixi info, which could have output that looks something like

BASH

pixi info

OUTPUT

System
------------
       Pixi version: 0.48.0
           Platform: linux-64
   Virtual packages: __unix=0=0
                   : __linux=6.8.0=0
                   : __glibc=2.35=0
                   : __cuda=12.4=0
                   : __archspec=1=skylake
          Cache dir: /home/<username>/.cache/rattler/cache
       Auth storage: /home/<username>/.rattler/credentials.json
   Config locations: No config files found

Global
------------
            Bin dir: /home/<username>/.pixi/bin
    Environment dir: /home/<username>/.pixi/envs
       Manifest dir: /home/<username>/.pixi/manifests/pixi-global.toml

CUDA use with Pixi


To be able to effectively use CUDA conda packages with Pixi, we make use of Pixi’s system requirement workspace table, which specifies the minimum system specifications needed to install and run a Pixi workspace’s environments.

To do this for CUDA, we just add the minimum CUDA version we want to support (based on the host machine’s NVIDIA driver API) to the table.

Example:

TOML

[system-requirements]
cuda = "12"  # Replace "12" with the specific CUDA version you intend to use

This ensures that packages depending on __cuda >= {version} are resolved correctly.

To demonstrate this a bit more explicitly, we can create a minimal project

BASH

pixi init ~/pixi-lesson/cuda-example
cd ~/pixi-lesson/cuda-example

OUTPUT

✔ Created /home/<username>/pixi-lesson/cuda-example/pixi.toml

where we specify a cuda system requirement

BASH

pixi workspace system-requirements add cuda 12

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-example"
platforms = ["linux-64"]
version = "0.1.0"

[system-requirements]
cuda = "12"

[tasks]

[dependencies]

system-requirements table can’t be target specific

As of Pixi v0.48.0, the system-requirements table can’t be target specific. To work around this, if you’re on a platform that doesn’t support the system-requirements, Pixi will ignore them without erroring unless they are required for the platform specific packages or actions you have. So, for example, you can have osx-arm64 as a platform and a system-requirements of cuda = "12" defined

TOML

[workspace]
...
platforms = ["linux-64"]
...

[system-requirements]
cuda = "12"

Pixi will ignore that requirement unless you try to use CUDA packages in osx-arm64 environments.

and then install the cuda-version metapackage

BASH

pixi add "cuda-version 12.9.*"

OUTPUT

✔ Added cuda-version 12.9.*

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-example"
platforms = ["linux-64"]
version = "0.1.0"

[system-requirements]
cuda = "12"

[tasks]

[dependencies]
cuda-version = "12.9.*"

If we look at the metadata installed by the cuda-version package (the only thing it does)

BASH

cat .pixi/envs/default/conda-meta/cuda-version-*.json

JSON

{
  "build": "h4f385c5_3",
  "build_number": 3,
  "constrains": [
    "cudatoolkit 12.9|12.9.*",
    "__cuda >=12"
  ],
  "depends": [],
  "license": "LicenseRef-NVIDIA-End-User-License-Agreement",
  "md5": "b6d5d7f1c171cbd228ea06b556cfa859",
  "name": "cuda-version",
  "noarch": "generic",
  "sha256": "5f5f428031933f117ff9f7fcc650e6ea1b3fef5936cf84aa24af79167513b656",
  "size": 21578,
  "subdir": "noarch",
  "timestamp": 1746134436166,
  "version": "12.9",
  "fn": "cuda-version-12.9-h4f385c5_3.conda",
  "url": "https://conda.anaconda.org/conda-forge/noarch/cuda-version-12.9-h4f385c5_3.conda",
  "channel": "https://conda.anaconda.org/conda-forge/",
  "extracted_package_dir": "/home/<username>/.cache/rattler/cache/pkgs/cuda-version-12.9-h4f385c5_3",
  "files": [],
  "paths_data": {
    "paths_version": 1,
    "paths": []
  },
  "link": {
    "source": "/home/<username>/.cache/rattler/cache/pkgs/cuda-version-12.9-h4f385c5_3",
    "type": 1
  }
}

we see that it now enforces constraints on the versions of cudatoolkit that can be installed as well as the required __cuda virtual package provided by the system

JSON

{
  ...
  "constrains": [
    "cudatoolkit 12.9|12.9.*",
    "__cuda >=12"
  ],
  ...
}

Use the feature table to solve environments that your platform doesn’t support

CUDA is supported only by NVIDIA GPUs, which means that macOS operating system platforms (osx-64, osx-arm64) can’t support it. Similarly, if your machine doesn’t have an NVIDIA GPU, then the __cuda virtual package won’t exist and installs of CUDA packages will fail. However, there are many situations in which you want to solve an environment for a platform that you don’t have, and we can do this for CUDA as well.

If we make the Pixi workspace multiplatform

BASH

pixi workspace platform add linux-64 osx-arm64 win-64

OUTPUT

✔ Added linux-64
✔ Added osx-arm64
✔ Added win-64

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-example"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

We can then use Pixi’s platform specific target tables to add dependencies for an environment to only a specific platform. So, if we know that a dependency only exists for a given platform, we can have Pixi add it for only that platform with

BASH

pixi add --platform <platform> <dependency>

This now means that if we ask for any CUDA enabled packages, we will get ones that are built to support cudatoolkit v12.9.*

BASH

pixi add --platform linux-64 cuda

OUTPUT

✔ Added cuda >=12.9.1,<13
Added these only for platform(s): linux-64

BASH

pixi list --platform linux-64 cuda

OUTPUT

Package                      Version  Build       Size       Kind   Source
cuda                         12.9.1   ha804496_0  26.7 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cccl_linux-64           12.9.27  ha770c72_0  1.1 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-command-line-tools      12.9.1   ha770c72_0  20 KiB     conda  https://conda.anaconda.org/conda-forge/
cuda-compiler                12.9.1   hbad6d8a_0  20.2 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-crt-dev_linux-64        12.9.86  ha770c72_1  92.2 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-crt-tools               12.9.86  ha770c72_1  28.2 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart                  12.9.79  h5888daf_0  22.7 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart-dev              12.9.79  h5888daf_0  23.1 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart-dev_linux-64     12.9.79  h3f2d84a_0  380 KiB    conda  https://conda.anaconda.org/conda-forge/
cuda-cudart-static           12.9.79  h5888daf_0  22.7 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart-static_linux-64  12.9.79  h3f2d84a_0  1.1 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-cudart_linux-64         12.9.79  h3f2d84a_0  192.6 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-cuobjdump               12.9.82  hbd13f7d_0  237.5 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-cupti                   12.9.79  h9ab20c4_0  1.8 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-cupti-dev               12.9.79  h9ab20c4_0  4.4 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-cuxxfilt                12.9.82  hbd13f7d_0  211.4 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-driver-dev              12.9.79  h5888daf_0  22.5 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-driver-dev_linux-64     12.9.79  h3f2d84a_0  36.8 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-gdb                     12.9.79  ha677faa_0  378.2 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-libraries               12.9.1   ha770c72_0  20 KiB     conda  https://conda.anaconda.org/conda-forge/
cuda-libraries-dev           12.9.1   ha770c72_0  20.1 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nsight                  12.9.79  h7938cbb_0  113.2 MiB  conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc                    12.9.86  hcdd1206_1  24.3 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc-dev_linux-64       12.9.86  he91c749_1  13.8 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc-impl               12.9.86  h85509e4_1  26.6 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc-tools              12.9.86  he02047a_1  26.2 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc_linux-64           12.9.86  he0b4e1d_1  26.2 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvdisasm                12.9.88  hbd13f7d_0  5.3 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-nvml-dev                12.9.79  hbd13f7d_0  139.1 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-nvprof                  12.9.79  hcf8d014_0  2.5 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-nvprune                 12.9.82  hbd13f7d_0  69.3 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvrtc                   12.9.86  h5888daf_0  64.1 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvrtc-dev               12.9.86  h5888daf_0  35.7 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvtx                    12.9.79  h5888daf_0  28.6 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvvm-dev_linux-64       12.9.86  ha770c72_1  26.3 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvvm-impl               12.9.86  he02047a_1  20.4 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvvm-tools              12.9.86  he02047a_1  23.1 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvvp                    12.9.79  hbd13f7d_0  104.3 MiB  conda  https://conda.anaconda.org/conda-forge/
cuda-opencl                  12.9.19  h5888daf_0  30 KiB     conda  https://conda.anaconda.org/conda-forge/
cuda-opencl-dev              12.9.19  h5888daf_0  95.1 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-profiler-api            12.9.79  h7938cbb_0  23 KiB     conda  https://conda.anaconda.org/conda-forge/
cuda-runtime                 12.9.1   ha804496_0  19.9 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-sanitizer-api           12.9.79  hcf8d014_0  8.6 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-toolkit                 12.9.1   ha804496_0  20 KiB     conda  https://conda.anaconda.org/conda-forge/
cuda-tools                   12.9.1   ha770c72_0  19.9 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-version                 12.9     h4f385c5_3  21.1 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-visual-tools            12.9.1   ha770c72_0  19.9 KiB   conda  https://conda.anaconda.org/conda-forge/

To “prove” that this works, we can ask for the CUDA enabled version of PyTorch

BASH

pixi add --platform linux-64 pytorch-gpu

OUTPUT

✔ Added pytorch-gpu >=2.7.0,<3
Added these only for platform(s): linux-64

BASH

pixi list --platform linux-64 torch

OUTPUT

Package      Version  Build                           Size       Kind   Source
libtorch     2.7.0    cuda126_mkl_h99b69db_300        566.9 MiB  conda  https://conda.anaconda.org/conda-forge/
pytorch      2.7.0    cuda126_mkl_py312_h30b5a27_300  27.8 MiB   conda  https://conda.anaconda.org/conda-forge/
pytorch-gpu  2.7.0    cuda126_mkl_ha999a5f_300        46.1 KiB   conda  https://conda.anaconda.org/conda-forge/

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-example"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[system-requirements]
cuda = "12"

[tasks]

[dependencies]
cuda-version = "12.9.*"

[target.linux-64.dependencies]
cuda = ">=12.9.1,<13"
pytorch-gpu = ">=2.7.0,<3"

Redundancy in example

Note that we added the cuda package here for demonstration purposes, but we didn’t need to, as it would already be installed as a dependency of pytorch-gpu.

BASH

cat .pixi/envs/default/conda-meta/pytorch-gpu-*.json

JSON

{
  "build": "cuda126_mkl_ha999a5f_300",
  "build_number": 300,
  "depends": [
    "pytorch 2.7.0 cuda*_mkl*300"
  ],
  "license": "BSD-3-Clause",
  "license_family": "BSD",
  "md5": "84ecafc34c6f8933c2c9b00204832e38",
  "name": "pytorch-gpu",
  "sha256": "e1162a51e77491abae15f6b651ba8f064870181d57d40f9168747652d0f70cb0",
  "size": 47219,
  "subdir": "linux-64",
  "timestamp": 1746288556375,
  "version": "2.7.0",
  "fn": "pytorch-gpu-2.7.0-cuda126_mkl_ha999a5f_300.conda",
  "url": "https://conda.anaconda.org/conda-forge/linux-64/pytorch-gpu-2.7.0-cuda126_mkl_ha999a5f_300.conda",
  "channel": "https://conda.anaconda.org/conda-forge/",
  "extracted_package_dir": "/home/<username>/.cache/rattler/cache/pkgs/pytorch-gpu-2.7.0-cuda126_mkl_ha999a5f_300",
  "files": [],
  "paths_data": {
    "paths_version": 1,
    "paths": []
  },
  "link": {
    "source": "/home/<username>/.cache/rattler/cache/pkgs/pytorch-gpu-2.7.0-cuda126_mkl_ha999a5f_300",
    "type": 1
  }
}

and, if on the supported linux-64 platform with a valid __cuda virtual package, we can check that PyTorch can see and find GPUs

PYTHON

# torch_detect_GPU.py
import torch
from torch import cuda

if __name__ == "__main__":
    if torch.backends.cuda.is_built():
        print(f"PyTorch build CUDA version: {torch.version.cuda}")
        print(f"PyTorch build cuDNN version: {torch.backends.cudnn.version()}")
        print(f"PyTorch build NCCL version: {torch.cuda.nccl.version()}")

        print(f"\nNumber of GPUs found on system: {cuda.device_count()}")

    if cuda.is_available():
        print(f"\nActive GPU index: {cuda.current_device()}")
        print(f"Active GPU name: {cuda.get_device_name(cuda.current_device())}")
    elif torch.backends.mps.is_available():
        mps_device = torch.device("mps")
        print(f"PyTorch has active GPU: {mps_device}")
    else:
        print(f"PyTorch has no active GPU")

BASH

pixi run python torch_detect_GPU.py

OUTPUT

PyTorch build CUDA version: 12.6
PyTorch build cuDNN version: 91001
PyTorch build NCCL version: (2, 26, 5)

Number of GPUs found on system: 1

Active GPU index: 0
Active GPU name: NVIDIA GeForce RTX 4060 Laptop GPU

Multi-environment Pixi workspaces

Create a new Pixi workspace that:

  • Contains an environment for linux-64, osx-arm64, and win-64 that supports the CPU version of PyTorch
  • Contains an environment for linux-64 that supports the GPU version of PyTorch
  • Supports CUDA v12.9

Create a new workspace

BASH

pixi init ~/pixi-lesson/cuda-exercise
cd ~/pixi-lesson/cuda-exercise

OUTPUT

✔ Created /home/<username>/pixi-lesson/cuda-exercise/pixi.toml

Add support for all the target platforms

BASH

pixi workspace platform add linux-64 osx-arm64 win-64

OUTPUT

✔ Added linux-64
✔ Added osx-arm64
✔ Added win-64

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-exercise"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

Add pytorch-cpu to a cpu feature

BASH

pixi add --feature cpu pytorch-cpu

OUTPUT

✔ Added pytorch-cpu
Added these only for feature: cpu

and then create a cpu environment that contains the cpu feature

BASH

pixi workspace environment add --feature cpu cpu

OUTPUT

✔ Added environment cpu

and then re-run the add so that the pytorch-cpu dependency is solved and pinned to a particular version range now that the cpu environment exists

BASH

pixi add --feature cpu pytorch-cpu

OUTPUT

✔ Added pytorch-cpu >=1.1.0,<3
Added these only for feature: cpu

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-exercise"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

[feature.cpu.dependencies]
pytorch-cpu = ">=1.1.0,<3"

[environments]
cpu = ["cpu"]

Now, for the GPU environment, add CUDA system-requirements for linux-64 for the gpu feature

BASH

pixi workspace system-requirements add --feature gpu cuda 12

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-exercise"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

[feature.cpu.dependencies]
pytorch-cpu = ">=1.1.0,<3"

[feature.gpu.system-requirements]
cuda = "12"

[environments]
cpu = ["cpu"]

and create a gpu environment with the gpu feature

BASH

pixi workspace environment add --feature gpu gpu

OUTPUT

✔ Added environment gpu

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-exercise"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

[feature.cpu.dependencies]
pytorch-cpu = ">=1.1.0,<3"

[feature.gpu.system-requirements]
cuda = "12"

[environments]
cpu = ["cpu"]
gpu = ["gpu"]

then add the cuda-version metapackage and the pytorch-gpu package for linux-64 to the gpu feature

BASH

pixi add --platform linux-64 --feature gpu 'cuda-version 12.9.*' pytorch-gpu

OUTPUT

✔ Added cuda-version 12.9.*
✔ Added pytorch-gpu >=2.7.0,<3
Added these only for platform(s): linux-64
Added these only for feature: gpu

TOML

[workspace]
channels = ["conda-forge"]
name = "cuda-exercise"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]

[feature.cpu.dependencies]
pytorch-cpu = ">=1.1.0,<3"

[feature.gpu.system-requirements]
cuda = "12"

[feature.gpu.target.linux-64.dependencies]
cuda-version = "12.9.*"
pytorch-gpu = ">=2.7.0,<3"

[environments]
cpu = ["cpu"]
gpu = ["gpu"]

You can compare the contents of the two environments

BASH

pixi list --environment cpu
pixi list --environment gpu

and activate shells with different environments loaded

BASH

pixi shell --environment cpu

So in 23 lines of TOML

BASH

wc -l pixi.toml

OUTPUT

23 pixi.toml

we created separate CPU and GPU computational environments that are now fully reproducible with the associated pixi.lock!
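
As a quick check of that claim, a collaborator with the repository checked out could recreate an environment exactly from the lock file (a sketch; the --locked flag makes Pixi error out instead of re-solving if pixi.lock has fallen out of sync with pixi.toml):

BASH

pixi install --locked --environment cpu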

Key Points

  • The cuda-version metapackage can be used to specify constraints on the versions of the __cuda virtual package and cudatoolkit.
  • Pixi can specify a minimum required CUDA version with the [system-requirements] table.
  • Pixi can solve environments for platforms that are not the system platform.
  • NVIDIA’s open source team and the conda-forge community support the CUDA conda packages on conda-forge.
  • The cuda metapackage is the primary place to go for user documentation on the CUDA conda packages.

Content from Deploying Pixi environments with Linux containers


Last updated on 2025-06-17 | Edit this page

Estimated time: 45 minutes

Overview

Questions

  • How can Pixi environments be deployed to production compute facilities?
  • What tools can be used to achieve this?

Objectives

  • Version control Pixi environments with Git.
  • Create a Linux container that has a production environment.
  • Create an automated GitHub Actions workflow to build and deploy environments.

Deploying Pixi environments


We now know how to create Pixi workspaces that contain environments that can support CUDA enabled code. However, unless your production machine learning environment is a lab desktop with GPUs and lots of disk¹ that you can install Pixi on and run your code from, we still need a way to get our Pixi environments onto our production machines.

There is one very straightforward solution:

  1. Version control your Pixi manifest and Pixi lock files with your analysis code with a version control system (e.g. Git).
  2. Clone your repository to the machine that you want to run on.
  3. Install Pixi onto that machine.
  4. Install the locked Pixi environment that you want to use.
  5. Execute your code in the installed environment.

That’s a nice and simple story, and it can work! However, in most realistic scenarios the worker compute nodes that execute the code share resource pools of storage and memory and are regulated to smaller allotments of both. CUDA binaries are relatively large files, and the memory and storage needed just to unpack them can easily exceed the standard 2 GB memory limit on most high throughput computing (HTC) facility worker nodes. This approach also requires direct access to the public internet, or for you to set up an S3 object store behind your compute facility’s firewall with all of your conda packages mirrored into it. In many scenarios, public internet access at HTC and high performance computing (HPC) facilities is limited to a select “allow list” of websites, or it might be fully restricted for users.

Building Linux containers with Pixi environments


A more standard and robust way of distributing computing environments is the use of Linux container technology — like Docker or Apptainer.

Conceptualizing the role of Linux containers

Linux containers are powerful technologies that allow for arbitrary software environments to be distributed as a single binary. However, it is important to not think of Linux containers as packaging technologies (like conda packages) but as distribution technologies. When you build a Linux container you provide a set of imperative commands as a build script that constructs different layers of the container. When the build is finished, all layers of the build are compressed together to form a container image binary that can be distributed through Linux container image registries.

Packaging technologies allow for defining requirements and constraints on a unit of software that we call a “package”. Packages can be installed together and their metadata allows them to be composed programmatically into software environments.

Linux containers take defined software environments and instantiate them by installing them into the container image during the build and then distribute that entire computing environment for a single platform.

Resources on Linux containers

Linux containers are a full topic unto themselves and we won’t cover them in depth in this lesson. If you’re not familiar with Linux containers, it is worth working through an introductory tutorial before continuing.

If you don’t have a Linux container runtime on your machine don’t worry — for the first part of this episode you can follow along reading and then we’ll transition to automation.

Building Docker containers with Pixi environments


Docker is a very common Linux container runtime technology and Linux container builder. We can use docker build to build a Linux container from a Dockerfile instruction file. Luckily, to install Pixi environments into Docker container images there is effectively only one Dockerfile recipe that needs to be written, and it can then be reused across projects.

Moving files

To use it later, move torch_detect_GPU.py from the end of the CUDA conda packages episode to ./app/torch_detect_GPU.py.

BASH

mkdir -p app
mv torch_detect_GPU.py app/

DOCKERFILE

FROM ghcr.io/prefix-dev/pixi:noble AS build

WORKDIR /app
COPY . .
ENV CONDA_OVERRIDE_CUDA=<cuda version>
RUN pixi install --locked --environment <environment>
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
    pixi shell-hook --environment <environment> -s bash >> /app/entrypoint.sh && \
    echo 'exec "$@"' >> /app/entrypoint.sh

FROM ghcr.io/prefix-dev/pixi:noble AS production

WORKDIR /app
COPY --from=build /app/.pixi/envs/<environment> /app/.pixi/envs/<environment>
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh
COPY ./app /app/src

EXPOSE <PORT>
ENTRYPOINT [ "/app/entrypoint.sh" ]

Let’s step through this to understand what’s happening. Dockerfiles (intentionally) look very much like shell scripts, so we can read most of this one as if we were typing the commands directly into a shell (e.g. Bash).

  • The Dockerfile assumes it is being built from a version control repository where any code that it will need to execute later exists under the repository’s app/ directory (copied into the image as /app/src) and the Pixi workspace’s pixi.toml manifest file and pixi.lock lock file exist at the top level of the repository.
  • The entire repository contents are COPYed from the container build context into the /app directory of the container build.

DOCKERFILE

WORKDIR /app
COPY . .
  • It is not reasonable to expect that the container image build machine contains GPUs. To have Pixi still be able to install an environment that uses CUDA when no __cuda virtual package is detected, set the override environment variable CONDA_OVERRIDE_CUDA.

DOCKERFILE

ENV CONDA_OVERRIDE_CUDA=<cuda version>
  • The Dockerfile uses a multi-stage build where it first installs the target environment <environment> and then creates an ENTRYPOINT script using pixi shell-hook to automatically activate the environment when the container image is run.

DOCKERFILE

RUN pixi install --locked --environment <environment>
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
    pixi shell-hook --environment <environment> -s bash >> /app/entrypoint.sh && \
    echo 'exec "$@"' >> /app/entrypoint.sh
  • The next stage of the build starts from a new container instance and then COPYs the installed environment and files from the build container image into the production container image. This can reduce the total size of the final container image if there were additional build tools that needed to get installed in the build phase that aren’t required for runtime in production.

DOCKERFILE

FROM ghcr.io/prefix-dev/pixi:noble AS production

WORKDIR /app
COPY --from=build /app/.pixi/envs/<environment> /app/.pixi/envs/<environment>
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh
  • Application-specific code (e.g. environment diagnostics) from the repository is COPYed into the final container image as well

DOCKERFILE

COPY ./app /app/src

Knowing what code to copy

Generally you do not want to bake your in-development source code into the container image, as you’d like to be able to iterate on it quickly and transfer it into a Linux container for evaluation (e.g. by mounting it at runtime).

You do want to containerize your source code if you’d like to archive it as an executable artifact for the future.

  • Any ports that need to be exposed for I/O are EXPOSEd

DOCKERFILE

EXPOSE <PORT>
  • Finally, the generated entrypoint script is set as the image’s ENTRYPOINT so that the Pixi environment is activated for whatever command the container runs

DOCKERFILE

ENTRYPOINT [ "/app/entrypoint.sh" ]

With this Dockerfile the container image can then be built with docker build.
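
For example (a sketch; the image name and tag are illustrative), from the top level of the workspace:

BASH

docker build --file Dockerfile --tag pixi-gpu-example:latest .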

Challenge

Write a Dockerfile that will create a Linux container with the gpu environment from the previous exercise’s Pixi workspace.

DOCKERFILE

FROM ghcr.io/prefix-dev/pixi:noble AS build

WORKDIR /app
COPY . .
ENV CONDA_OVERRIDE_CUDA=12
RUN pixi install --locked --environment gpu
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
    pixi shell-hook --environment gpu -s bash >> /app/entrypoint.sh && \
    echo 'exec "$@"' >> /app/entrypoint.sh

FROM ghcr.io/prefix-dev/pixi:noble AS production

WORKDIR /app
COPY --from=build /app/.pixi/envs/gpu /app/.pixi/envs/gpu
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh

EXPOSE 8000
ENTRYPOINT [ "/app/entrypoint.sh" ]

Automation with GitHub Actions workflows

In the personal GitHub repository that we’ve been working in, create a GitHub Actions workflow directory

BASH

mkdir -p .github/workflows

and then add the following workflow file as .github/workflows/docker.yaml

YAML

name: Docker Images

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/Dockerfile'
      - 'cuda-exercise/.dockerignore'
      - 'cuda-exercise/app/**'
  pull_request:
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/Dockerfile'
      - 'cuda-exercise/.dockerignore'
      - 'cuda-exercise/app/**'
  release:
    types: [published]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

permissions: {}

jobs:
  docker:
    name: Build and publish images
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
            ghcr.io/${{ github.repository }}
          # generate Docker tags based on the following events/attributes
          tags: |
            type=raw,value=noble-cuda-12.9
            type=raw,value=latest
            type=sha

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Test build
        id: docker_build_test
        uses: docker/build-push-action@v6
        with:
          context: cuda-exercise
          file: cuda-exercise/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true

      - name: Deploy build
        id: docker_build_deploy
        uses: docker/build-push-action@v6
        with:
          context: cuda-exercise
          file: cuda-exercise/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true
          push: ${{ github.event_name != 'pull_request' }}

This will build your Dockerfile in GitHub Actions CI into a linux/amd64 platform Docker container image and then deploy it to the GitHub Container Registry (ghcr) associated with your repository.
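
Once published, the image can be pulled onto any machine that can reach ghcr.io (a sketch; substitute your own GitHub username and repository name):

BASH

docker pull ghcr.io/<your GitHub username>/pixi-lesson:latest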

Building Apptainer containers with Pixi environments


Most HTC and HPC systems do not allow users to use Docker, given security risks, and instead use Apptainer. In most situations, Apptainer is able to automatically convert a Docker image, or other Open Container Initiative (OCI) container image format, to Apptainer’s Singularity Image Format (.sif), and so no additional work is required. However, the overlay system of Apptainer is different from Docker’s, which means that the ENTRYPOINT of a Docker container image might not get correctly translated into an Apptainer runscript and startscript. It might be advantageous, depending on your situation, to instead write an Apptainer .def definition file, giving full control over the commands, and then build that .def file into a .sif Apptainer container image.
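
For example, pulling an OCI image from a registry and converting it to .sif happens in one step (a sketch; the image reference is illustrative, and apptainer pull names the output file after the image name and tag by default):

BASH

apptainer pull docker://ghcr.io/<your GitHub username>/pixi-lesson:latest
apptainer run pixi-lesson_latest.sif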

We can write an Apptainer container image definition file that is very similar to the Dockerfile we wrote

Bootstrap: docker
From: ghcr.io/prefix-dev/pixi:noble
Stage: build

%files
./pixi.toml /app/
./pixi.lock /app/
./.gitignore /app/

%post
#!/bin/bash
export CONDA_OVERRIDE_CUDA=12
cd /app/
pixi info
pixi install --locked --environment prod
echo "#!/bin/bash" > /app/entrypoint.sh && \
pixi shell-hook --environment prod -s bash >> /app/entrypoint.sh && \
echo 'exec "$@"' >> /app/entrypoint.sh


Bootstrap: docker
From: ghcr.io/prefix-dev/pixi:noble
Stage: final

%files from build
/app/.pixi/envs/prod /app/.pixi/envs/prod
/app/pixi.toml /app/pixi.toml
/app/pixi.lock /app/pixi.lock
/app/.gitignore /app/.gitignore
# The ignore files are needed for 'pixi run' to work in the container
/app/.pixi/.gitignore /app/.pixi/.gitignore
/app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
/app/entrypoint.sh /app/entrypoint.sh

%files
./app /app/src

%post
#!/bin/bash
cd /app/
pixi info
chmod +x /app/entrypoint.sh

%runscript
#!/bin/bash
/app/entrypoint.sh "$@"

%startscript
#!/bin/bash
/app/entrypoint.sh "$@"

%test
#!/bin/bash -e
. /app/entrypoint.sh
pixi info
pixi list

Let’s break this down too.

  • The Apptainer definition file is broken out into specific operation sections prefixed by % (e.g. files, post).
  • The Apptainer definition file assumes it is being built from a version control repository where any code that it will need to execute later exists under the repository’s app/ directory (copied into the image as /app/src) and the Pixi workspace’s pixi.toml manifest file and pixi.lock lock file exist at the top level of the repository.
  • The files section allows for a mapping of what files should be copied from a build context (e.g. the local file system) to the container file system
%files
./pixi.toml /app/
./pixi.lock /app/
./.gitignore /app/
  • The post section runs the commands listed in it as a shell script executed in a clean shell environment that does not have any pre-existing build environment context. It is not reasonable to expect that the container image build machine contains GPUs, so to have Pixi still be able to install an environment that uses CUDA when no __cuda virtual package is detected, set the override environment variable CONDA_OVERRIDE_CUDA.
%post
#!/bin/bash
export CONDA_OVERRIDE_CUDA=12
...
  • The definition file uses a multi-stage build where it first installs the target environment (prod in this example) and then creates an entrypoint.sh script, later used as the runscript, that uses pixi shell-hook to automatically activate the environment when the container image is run.
...
cd /app/
pixi info
pixi install --locked --environment prod
echo "#!/bin/bash" > /app/entrypoint.sh && \
pixi shell-hook --environment prod -s bash >> /app/entrypoint.sh && \
echo 'exec "$@"' >> /app/entrypoint.sh
  • The next stage of the build starts from a new container instance and then copies the installed environment and files from the build stage into the final container image. This can reduce the total size of the final container image if there were additional build tools that needed to get installed in the build phase that aren’t required for runtime in production.
Bootstrap: docker
From: ghcr.io/prefix-dev/pixi:noble
Stage: final

%files from build
/app/.pixi/envs/prod /app/.pixi/envs/prod
/app/pixi.toml /app/pixi.toml
/app/pixi.lock /app/pixi.lock
/app/.gitignore /app/.gitignore
# The ignore files are needed for 'pixi run' to work in the container
/app/.pixi/.gitignore /app/.pixi/.gitignore
/app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
/app/entrypoint.sh /app/entrypoint.sh
  • By repeating the files section we can also copy in the source code
%files
./app /app/src
  • The post section then verifies that the Pixi workspace is valid and makes /app/entrypoint.sh executable
%post
#!/bin/bash
cd /app/
pixi info
chmod +x /app/entrypoint.sh
  • The runscript section defines what is executed when the container is run (e.g. with apptainer run); it passes all arguments through the entrypoint script so that commands execute inside the activated environment
%runscript
#!/bin/bash
/app/entrypoint.sh "$@"
  • We also define a startscript section, identical in contents to the runscript, that is executed when the instance start command is run (which creates a container instance that runs in the background).
%startscript
#!/bin/bash
/app/entrypoint.sh "$@"
  • Finally, the test section defines a script that will be executed in the built container at the end of the build process. This allows for validation of the container functionality before it is distributed.
%test
#!/bin/bash -e
. /app/entrypoint.sh
pixi info
pixi list

With this Apptainer definition file the container image can then be built with apptainer build

BASH

apptainer build <container image name>.sif <definition file name>.def
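
and then validated and run (a sketch; apptainer test executes the %test section we defined, and any command passed to apptainer run executes through the runscript inside the activated environment):

BASH

apptainer test <container image name>.sif
apptainer run <container image name>.sif python --version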

Automation with GitHub Actions workflows

In the personal GitHub repository that we’ve been working in, create a GitHub Actions workflow directory

BASH

mkdir -p .github/workflows

and then add the following workflow file as .github/workflows/apptainer.yaml

YAML

name: Apptainer Images

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/apptainer.def'
      - 'cuda-exercise/app/**'
  pull_request:
    paths:
      - 'cuda-exercise/pixi.toml'
      - 'cuda-exercise/pixi.lock'
      - 'cuda-exercise/apptainer.def'
      - 'cuda-exercise/app/**'
  release:
    types: [published]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

permissions: {}

jobs:
  apptainer:
    name: Build and publish images
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Free disk space
        uses: AdityaGarg8/remove-unwanted-software@v5
        with:
          remove-android: 'true'
          remove-dotnet: 'true'
          remove-haskell: 'true'

      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Install Apptainer
        uses: eWaterCycle/setup-apptainer@v2

      - name: Build container from definition file
        working-directory: ./cuda-exercise
        run: apptainer build pixi-docker-chtc.sif apptainer.def

      - name: Test container
        working-directory: ./cuda-exercise
        run: apptainer test pixi-docker-chtc.sif

      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Deploy built container
        if: github.event_name != 'pull_request'
        working-directory: ./cuda-exercise
        run: apptainer push pixi-docker-chtc.sif oras://ghcr.io/${{ github.repository }}:hello-pytorch-noble-cuda-12.9-apptainer

This will build your Apptainer definition file in GitHub Actions CI into a .sif container image and then deploy it to the GitHub Container Registry (ghcr) associated with your repository.
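
The pushed image can later be retrieved from the registry over the same ORAS protocol (a sketch; the tag matches the one used in the workflow above):

BASH

apptainer pull oras://ghcr.io/<your GitHub username>/pixi-lesson:hello-pytorch-noble-cuda-12.9-apptainer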

Key Points

  • Pixi environments can be easily installed into Linux containers.
  • As Pixi environments contain the entire software environment, the Linux container build script can simply install the Pixi environment.
  • Using GitHub Actions workflows allows for the build process to happen automatically through CI/CD.

  1. Which is a valid and effective solution.

Content from Using Pixi environments on HTC Systems


Last updated on 2025-06-16 | Edit this page

Estimated time: 90 minutes

Overview

Questions

  • How can you run workflows that use GPUs with Pixi CUDA environments?
  • What solutions exist for the resources you have?

Objectives

  • Learn how to submit containerized workflows to HTC systems.

High Throughput Computing (HTC)


One of the most common forms of production computing is high-throughput computing (HTC), where computational problems are distributed across multiple computing resources to parallelize computations and reduce total compute time. HTC resources are quite dynamic, but usually focus on smaller memory and disk allotments on each individual worker compute node. This is in contrast to high-performance computing (HPC), where there are comparatively fewer compute nodes but the capabilities and associated memory, disk, and bandwidth resources are much higher.

Two of the most common HTC workflow management systems are HTCondor and SLURM.

Setting up a problem


First let’s create a computing problem to apply these compute systems to.

Let’s first create a new project in our Git repository

BASH

pixi init ~/pixi-lesson/htcondor
cd ~/pixi-lesson/htcondor

OUTPUT

✔ Created /home/<username>/pixi-lesson/htcondor/pixi.toml

Training a PyTorch model on the MNIST dataset

Let’s write a very standard tutorial example of training a deep neural network on the MNIST dataset with PyTorch and then run it on GPUs in an HTCondor worker pool.

Mea culpa, more interesting examples exist

More exciting examples will be used in the future, but MNIST is one of the simplest examples that still illustrates the point.

The neural network code

We’ll download Python code that uses a convolutional neural network written in PyTorch to learn to identify the handwritten digits of the MNIST dataset and place it under a src/ directory. This is a modified example from the PyTorch documentation (https://github.com/pytorch/examples/blob/main/mnist/main.py), which is licensed under the BSD 3-Clause license.

BASH

curl -sLO https://raw.githubusercontent.com/matthewfeickert/nvidia-gpu-ml-library-test/c7889222544928fb6f9fdeb1145767272b5cfec8/torch_MNIST.py
mkdir -p src
mv torch_MNIST.py src/

The Pixi environment

Now let’s think about what we need to use this code. Looking at the imports of src/torch_MNIST.py we can see that torch and torchvision are the only imported libraries that aren’t part of the Python standard library, so we will need to depend on PyTorch and torchvision. We also know that we’d like to run CUDA accelerated code, so we’ll need CUDA libraries and versions of PyTorch that support CUDA.
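
One quick way to check this yourself (a sketch) is to list the script’s import statements:

BASH

grep -E '^(import|from)' src/torch_MNIST.py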

Create the environment

Create a Pixi workspace that:

  • Has PyTorch and torchvision in it.
  • Has the ability to support CUDA v12.
  • Has an environment that has the CPU version of PyTorch and torchvision that can be installed on linux-64, osx-arm64, and win-64.
  • Has an environment that has the GPU version of PyTorch and torchvision.

This is just expanding the exercises from the CUDA conda packages episode.

Let’s first add all the platforms we want to work with to the workspace

BASH

pixi workspace platform add linux-64 osx-arm64 win-64

OUTPUT

✔ Added linux-64
✔ Added osx-arm64
✔ Added win-64

We know that in both environments we’ll want to use Python, so we can install it in the default environment and have it be used in both the cpu and gpu environments.

BASH

pixi add python

OUTPUT

✔ Added python >=3.13.5,<3.14

Let’s now add the CPU requirements to a feature named cpu

BASH

pixi add --feature cpu pytorch-cpu torchvision

OUTPUT

✔ Added pytorch-cpu
✔ Added torchvision
Added these only for feature: cpu

and then create an environment named cpu with that feature

BASH

pixi workspace environment add --feature cpu cpu

OUTPUT

✔ Added environment cpu

and then pin the feature’s dependencies to particular versions

BASH

pixi upgrade --feature cpu

TOML

[workspace]
channels = ["conda-forge"]
name = "htcondor"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]
python = ">=3.13.5,<3.14"

[feature.cpu.dependencies]
pytorch-cpu = ">=2.7.0,<3"
torchvision = ">=0.22.0,<0.23"

[environments]
cpu = ["cpu"]

Now let’s add the GPU environment and dependencies. Let’s start with the CUDA system requirements

BASH

pixi workspace system-requirements add --feature gpu cuda 12

Override the __cuda virtual package

Remember that if you’re on a platform that doesn’t satisfy the system-requirements you’ll need to override the checks to solve the environment.

BASH

export CONDA_OVERRIDE_CUDA=12

and create an environment from the feature

BASH

pixi workspace environment add --feature gpu gpu

OUTPUT

✔ Added environment gpu

and then add the GPU dependencies for the target platform of linux-64 (where we’ll run in production).

BASH

pixi add --platform linux-64 --feature gpu pytorch-gpu torchvision

OUTPUT

✔ Added pytorch-gpu >=2.7.0,<3
✔ Added torchvision >=0.22.0,<0.23
Added these only for platform(s): linux-64
Added these only for feature: gpu

TOML

[workspace]
channels = ["conda-forge"]
name = "htcondor"
platforms = ["linux-64", "osx-arm64", "win-64"]
version = "0.1.0"

[tasks]

[dependencies]
python = ">=3.13.5,<3.14"

[feature.cpu.dependencies]
pytorch-cpu = ">=2.7.0,<3"
torchvision = ">=0.22.0,<0.23"

[feature.gpu.system-requirements]
cuda = "12"

[feature.gpu.target.linux-64.dependencies]
pytorch-gpu = ">=2.7.0,<3"
torchvision = ">=0.22.0,<0.23"

[environments]
cpu = ["cpu"]
gpu = ["gpu"]

To validate that things are working with the CPU code, let’s do a short training run for only 2 epochs in the cpu environment.

BASH

pixi run --environment cpu python src/torch_MNIST.py --epochs 2 --save-model --data-dir data

OUTPUT

100.0%
100.0%
100.0%
100.0%
Train Epoch: 1 [0/60000 (0%)]	Loss: 2.329474
Train Epoch: 1 [640/60000 (1%)]	Loss: 1.425185
Train Epoch: 1 [1280/60000 (2%)]	Loss: 0.826808
Train Epoch: 1 [1920/60000 (3%)]	Loss: 0.556883
Train Epoch: 1 [2560/60000 (4%)]	Loss: 0.483756
...
Train Epoch: 2 [57600/60000 (96%)]	Loss: 0.146226
Train Epoch: 2 [58240/60000 (97%)]	Loss: 0.016065
Train Epoch: 2 [58880/60000 (98%)]	Loss: 0.003342
Train Epoch: 2 [59520/60000 (99%)]	Loss: 0.001542

Test set: Average loss: 0.0351, Accuracy: 9874/10000 (99%)

Running multiple ways

What’s another way we could have run this other than with pixi run?

You can enter a shell environment first

BASH

pixi shell --environment cpu
python src/torch_MNIST.py --epochs 2 --save-model --data-dir data
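
A third option (a sketch; the task name train-mnist is illustrative) is to define a Pixi task on the cpu feature in pixi.toml so the full command doesn’t need to be retyped:

TOML

[feature.cpu.tasks]
train-mnist = "python src/torch_MNIST.py --epochs 2 --save-model --data-dir data"

and then run it with

BASH

pixi run --environment cpu train-mnist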

The Linux container

Let’s write a Dockerfile that installs the gpu environment into the container image when built.

Write the Dockerfile

Write a Dockerfile that will install the gpu environment and only the gpu environment into the container image.

DOCKERFILE

FROM ghcr.io/prefix-dev/pixi:noble AS build

WORKDIR /app
COPY . .
ENV CONDA_OVERRIDE_CUDA=12
RUN pixi install --locked --environment gpu
RUN echo "#!/bin/bash" > /app/entrypoint.sh && \
    pixi shell-hook --environment gpu -s bash >> /app/entrypoint.sh && \
    echo 'exec "$@"' >> /app/entrypoint.sh

FROM ghcr.io/prefix-dev/pixi:noble AS production

WORKDIR /app
COPY --from=build /app/.pixi/envs/gpu /app/.pixi/envs/gpu
COPY --from=build /app/pixi.toml /app/pixi.toml
COPY --from=build /app/pixi.lock /app/pixi.lock
# The ignore files are needed for 'pixi run' to work in the container
COPY --from=build /app/.pixi/.gitignore /app/.pixi/.gitignore
COPY --from=build /app/.pixi/.condapackageignore /app/.pixi/.condapackageignore
COPY --from=build --chmod=0755 /app/entrypoint.sh /app/entrypoint.sh

ENTRYPOINT [ "/app/entrypoint.sh" ]

Building and deploying the container image

Now let’s add a GitHub Actions pipeline to build this Dockerfile and deploy it to a Linux container registry.

Build and deploy Linux container image to registry

Add a GitHub Actions pipeline that will build the Dockerfile and deploy it to GitHub Container Registry (ghcr).

Create the GitHub Actions workflow directory tree

BASH

mkdir -p .github/workflows

and then write a YAML file at .github/workflows/ci.yaml that contains the following:

YAML

name: Build and publish Docker images

on:
  push:
    branches:
      - main
    tags:
      - 'v*'
    paths:
      - 'htcondor/pixi.toml'
      - 'htcondor/pixi.lock'
      - 'htcondor/Dockerfile'
      - 'htcondor/.dockerignore'
  pull_request:
    paths:
      - 'htcondor/pixi.toml'
      - 'htcondor/pixi.lock'
      - 'htcondor/Dockerfile'
      - 'htcondor/.dockerignore'
  release:
    types: [published]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

permissions: {}

jobs:
  docker:
    name: Build and publish images
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
            ghcr.io/${{ github.repository }}
          # generate Docker tags based on the following events/attributes
          tags: |
            type=raw,value=hello-pytorch-noble-cuda-12.9
            type=raw,value=latest
            type=sha

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Test build
        id: docker_build_test
        uses: docker/build-push-action@v6
        with:
          context: htcondor
          file: htcondor/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true

      - name: Deploy build
        id: docker_build_deploy
        uses: docker/build-push-action@v6
        with:
          context: htcondor
          file: htcondor/Dockerfile
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          pull: true
          push: ${{ github.event_name != 'pull_request' }}

To verify that things are visible to other computers, install the Linux container utility crane

BASH

pixi global install crane

OUTPUT

└── crane: 0.20.5 (installed)
    └─ exposes: crane

and then use crane ls to list all of the tags for the container image in your registry

BASH

crane ls ghcr.io/<your GitHub username>/pixi-lesson
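
If the CI workflow has already run on the default branch, the listed tags should include those generated by docker/metadata-action (a sketch; the sha- tag will correspond to a commit):

OUTPUT

hello-pytorch-noble-cuda-12.9
latest
sha-<sha>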

HTCondor


This episode will be on a remote system

All the computation in the rest of this episode will take place on a remote system with an HTC workflow manager.

To provide a very high level overview of HTCondor in this episode we’ll focus on only a few of its many resources and capabilities.

  1. Writing HTCondor execution scripts to define what the HTCondor worker nodes will actually do.
  2. Writing HTCondor submit description files to send our jobs to the HTCondor worker pool.
  3. Submitting those jobs with condor_submit and monitoring them with condor_q.

Connection between execution scripts and submit description files

As HTCondor execution scripts are given as the executable field in HTCondor submit description files, they are tightly linked and cannot be written fully independently. Though they are presented as separate steps above, in practice you will write these together.

Write the HTCondor execution script

Let’s first start to write the execution script mnist_gpu_docker.sh, as we can think about how that relates to our code.

  • We’ll be running in the gpu environment that we defined with Pixi and built into our Docker container image.
  • For security reasons the HTCondor worker nodes don’t have unrestricted access to the internet, so we’ll need to transfer our input data and source code rather than download them on demand.
  • We’ll need to activate the environment using the /app/entrypoint.sh script we built into the Docker container image.

BASH

#!/bin/bash

# detailed logging to stderr
set -x

echo -e "# Hello CHTC from Job ${1} running on $(hostname)\n"
echo -e "# GPUs assigned: ${CUDA_VISIBLE_DEVICES}\n"

echo -e "# Activate Pixi environment\n"
# The last line of the entrypoint.sh file is 'exec "$@"'. If this shell script
# receives arguments, exec will interpret them as arguments to it, which is not
# intended. To avoid this, strip the last line of entrypoint.sh and source that
# instead.
. <(sed '$d' /app/entrypoint.sh)

echo -e "# Check to see if the NVIDIA drivers can correctly detect the GPU:\n"
nvidia-smi

echo -e "\n# Check if PyTorch can detect the GPU:\n"
python ./src/torch_detect_GPU.py

echo -e "\n# Extract the training data:\n"
if [ -f "MNIST_data.tar.gz" ]; then
    tar -vxzf MNIST_data.tar.gz
else
    echo "The training data archive, MNIST_data.tar.gz, is not found."
    echo "Please transfer it to the worker node in the HTCondor jobs submission file."
    exit 1
fi

echo -e "\n# Check that the training code exists:\n"
ls -1ap ./src/

echo -e "\n# Train MNIST with PyTorch:\n"
time python ./src/torch_MNIST.py --data-dir ./data --epochs 14 --save-model

Write the HTCondor submit description file

This is pretty standard boilerplate taken from the HTCondor documentation

# mnist_gpu_docker.sub
# Submit file to access the GPU via docker

# Set the "universe" to 'container' to use Docker
universe = container
# the container images are cached, and so if a container image tag is
# overwritten it will not be pulled again
container_image = docker://ghcr.io/<github user name>/pixi-lesson:sha-<sha>

# set the log, error and output files
log = mnist_gpu_docker.log.txt
error = mnist_gpu_docker.err.txt
output = mnist_gpu_docker.out.txt

# set the executable to run
executable = mnist_gpu_docker.sh
arguments = $(Process)

# transfer training data files to the compute node
transfer_input_files = MNIST_data.tar.gz,src

# transfer the serialized trained model back
transfer_output_files = mnist_cnn.pt

should_transfer_files = YES
when_to_transfer_output = ON_EXIT

# We require a machine with a modern version of the CUDA driver
Requirements = (Target.CUDADriverVersion >= 12.0)

# We must request 1 CPU in addition to 1 GPU
request_cpus = 1
request_gpus = 1

# select some memory and disk space
request_memory = 2GB
request_disk = 2GB

# Opt in to using CHTC GPU Lab resources
+WantGPULab = true
# Specify short job type to run more GPUs in parallel
# Can also request "medium" or "long"
+GPUJobLength = "short"

# Tell HTCondor to run 1 instance of our job:
queue 1
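
Since the execution script receives $(Process) as its first argument, scaling out is just a matter of increasing the queue count (a sketch); each job receives a distinct process number starting from 0:

# Tell HTCondor to run 3 instances of our job:
queue 3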

Write the submission script

To make it easy on ourselves, we can write a small job submission script submit.sh that will prepare the data and then submit the submit description file to HTCondor with condor_submit.

BASH

#!/bin/bash

# Download the training data locally to transfer to the worker node
if [ ! -f "MNIST_data.tar.gz" ]; then
    # c.f. https://github.com/CHTC/templates-GPUs/blob/450081144c6ae0657123be2a9a357cb432d9d394/shared/pytorch/MNIST_data.tar.gz
    curl -sLO https://raw.githubusercontent.com/CHTC/templates-GPUs/450081144c6ae0657123be2a9a357cb432d9d394/shared/pytorch/MNIST_data.tar.gz
fi

# Ensure existing models are backed up
if [ -f "mnist_cnn.pt" ]; then
    mv mnist_cnn.pt mnist_cnn_"$(date '+%Y-%m-%d-%H-%M')".pt.bak
fi

condor_submit mnist_gpu_docker.sub

Submitting the job

Before we actually submit code to run, we can submit an interactive job from the HTCondor system’s login nodes to check that things work as expected. The following interact.sh script does this by passing the -interactive flag to condor_submit.

BASH

#!/bin/bash

# Download the training data locally to transfer to the worker node
if [ ! -f "MNIST_data.tar.gz" ]; then
    # c.f. https://github.com/CHTC/templates-GPUs/blob/450081144c6ae0657123be2a9a357cb432d9d394/shared/pytorch/MNIST_data.tar.gz
    curl -sLO https://raw.githubusercontent.com/CHTC/templates-GPUs/450081144c6ae0657123be2a9a357cb432d9d394/shared/pytorch/MNIST_data.tar.gz
fi

# Ensure existing models are backed up
if [ -f "mnist_cnn.pt" ]; then
    mv mnist_cnn.pt mnist_cnn_"$(date '+%Y-%m-%d-%H-%M')".pt.bak
fi

condor_submit -interactive mnist_gpu_docker.sub

Submitting the job for the first time will take a bit as it needs to pull down the container image, so be patient. The container image will be cached in the future and so this will be faster.

BASH

bash interact.sh

OUTPUT

Submitting job(s).
1 job(s) submitted to cluster 2127828.
Waiting for job to start...
...
Welcome to interactive3_1@vetsigian0001.chtc.wisc.edu!
Your condor job is running with pid(s) 2368233.
groups: cannot find name for group ID 24433
groups: cannot find name for group ID 40092
I have no name!@vetsigian0001:/var/lib/condor/execute/slot3/dir_2367762$

We can now activate our environment manually and look around

BASH

. /app/entrypoint.sh

OUTPUT

(htcondor:gpu) I have no name!@vetsigian0001:/var/lib/condor/execute/slot3/dir_2367762$

BASH

command -v python

OUTPUT

/app/.pixi/envs/gpu/bin/python

BASH

python --version

OUTPUT

Python 3.13.5

BASH

pixi list pytorch

OUTPUT

Environment: gpu
Package      Version  Build                           Size      Kind   Source
pytorch      2.7.0    cuda126_mkl_py313_he20fe19_300  27.8 MiB  conda  https://conda.anaconda.org/conda-forge/
pytorch-gpu  2.7.0    cuda126_mkl_ha999a5f_300        46.1 KiB  conda  https://conda.anaconda.org/conda-forge/

BASH

pixi list cuda

OUTPUT

Environment: gpu
Package               Version  Build       Size       Kind   Source
cuda-crt-tools        12.9.86  ha770c72_1  28.2 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart           12.9.79  h5888daf_0  22.7 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-cudart_linux-64  12.9.79  h3f2d84a_0  192.6 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-cuobjdump        12.9.82  hbd13f7d_0  237.5 KiB  conda  https://conda.anaconda.org/conda-forge/
cuda-cupti            12.9.79  h9ab20c4_0  1.8 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-nvcc-tools       12.9.86  he02047a_1  26.2 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvdisasm         12.9.88  hbd13f7d_0  5.3 MiB    conda  https://conda.anaconda.org/conda-forge/
cuda-nvrtc            12.9.86  h5888daf_0  64.1 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvtx             12.9.79  h5888daf_0  28.6 KiB   conda  https://conda.anaconda.org/conda-forge/
cuda-nvvm-tools       12.9.86  he02047a_1  23.1 MiB   conda  https://conda.anaconda.org/conda-forge/
cuda-version          12.9     h4f385c5_3  21.1 KiB   conda  https://conda.anaconda.org/conda-forge/

BASH

nvidia-smi

OUTPUT

Mon Jun 16 00:07:33 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14              Driver Version: 550.54.14      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti     On  |   00000000:B2:00.0 Off |                  N/A |
| 29%   26C    P8             23W /  250W |       3MiB /  11264MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

We can interactively run our code as well

BASH

tar -vxzf MNIST_data.tar.gz
time python ./src/torch_MNIST.py --data-dir ./data --epochs 2 --save-model

To return to the login node we just exit the interactive session

BASH

exit

Now to submit our job normally, we run the submit.sh script

BASH

bash submit.sh

OUTPUT

Submitting job(s).
1 job(s) submitted to cluster 2127879.

and its submission and state can be monitored with condor_q.

BASH

condor_q

OUTPUT



-- Schedd: ap2001.chtc.wisc.edu : <128.105.68.112:9618?... @ 06/15/25 19:16:17
OWNER     BATCH_NAME     SUBMITTED   DONE   RUN    IDLE  TOTAL JOB_IDS
mfeickert ID: 2127879   6/15 19:13      _      1      _      1 2127879.0

1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended

When the job finishes we see that HTCondor has returned to us the following files:

  • mnist_gpu_docker.log.txt: the HTCondor log file for the job
  • mnist_gpu_docker.out.txt: the stdout of all actions executed in the job
  • mnist_gpu_docker.err.txt: the stderr of all actions executed in the job
  • mnist_cnn.pt: the serialized trained PyTorch model

Key Points

  • You can use containerized Pixi environments on HTC systems to run the CUDA accelerated code that you defined.