This lesson is in the early stages of development (Alpha version)

Introduction to Geospatial Raster and Vector Data with Python: Setup

Set-up Instructions:

Prerequisite software:

  1. Anaconda
  2. Python 3.x (<=3.10)
  3. Jupyter Lab

On Windows, this setup uses the Anaconda Prompt to install the prerequisites for the course. Experienced users may opt for other options such as Git Bash or Windows Subsystem for Linux.

Installing Python Using Anaconda

Python is a popular language for scientific computing, and great for general-purpose programming as well. Installing all of its scientific packages individually can be a bit difficult, however, so we recommend the all-in-one installer Anaconda.

Regardless of how you choose to install it, please make sure you install Python version 3.x (e.g., 3.7 is fine). Also, please set up your Python environment at least a day in advance of the workshop. If you encounter problems with the installation procedure, ask your workshop organizers via e-mail for assistance so you are ready to go as soon as the workshop begins.

Windows - Video tutorial

  1. Open https://www.anaconda.com/distribution/ with your web browser.

  2. Download the Python 3 installer for Windows.

  3. Double-click the executable and install Python 3 using the recommended settings. Make sure that the “Register Anaconda as my default Python 3.x” option is checked (it should be checked by default in the latest version of Anaconda). We also recommend that you make sure “Add Anaconda to my PATH environment variable” is selected.

Mac OS X - Video tutorial

  1. Visit https://www.anaconda.com/distribution/ with your web browser.

  2. Download the Python 3 installer for OS X. These instructions assume that you use the graphical installer .pkg file.

  3. Follow the Python 3 installation instructions. Make sure that the install location is set to “Install only for me” so Anaconda will install its files locally, relative to your home directory. Installing the software for all users tends to create problems in the long run and should be avoided.

Linux

Note that the following installation steps require you to work from the shell. If you run into any difficulties, please request help before the workshop begins.

  1. Open https://www.anaconda.com/distribution/ with your web browser.

  2. Download the Python 3 installer for Linux.

  3. Install Python 3 using all of the defaults for installation.

    a. Open a terminal window.

    b. Navigate to the folder where you downloaded the installer.

    c. Type

    $ bash Anaconda3-
    

    and press tab. The name of the file you just downloaded should appear.

    d. Press enter.

    e. Follow the text-only prompts. When the license agreement appears (a colon will be present at the bottom of the screen) press the space bar until you see the bottom of the text. Type yes and press enter to approve the license. Press enter again to approve the default location for the files. Type yes and press enter to prepend Anaconda to your PATH (this makes the Anaconda distribution your user’s default Python).
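
Once the installer has finished, on any platform, it can help to confirm that the PATH options chosen above took effect. The following is a minimal sketch, not part of the original lesson: the helper name find_tools is hypothetical, and it only reports what the current terminal session can see, so run it from a newly opened terminal or Anaconda Prompt.

```python
import shutil
import sys


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Report which interpreter is running this script and its version.
    print("Python", sys.version.split()[0], "at", sys.executable)
    # Check the two commands the rest of this setup relies on.
    for name, path in find_tools(["conda", "python"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If conda is reported as not found, the troubleshooting section at the end of this page describes one way to make it available.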

Setting up your Lesson Directory and Getting the Data

  1. Open the terminal/shell:
    • On Windows, open Anaconda prompt from the Start Menu.
    • On Mac OS or Linux, open the Terminal app.
  2. Change your working directory to your Desktop:

     cd ~/Desktop
    
  3. Create a new directory on your Desktop called geospatial-python and change into it:

     mkdir geospatial-python
     cd geospatial-python
    
  4. Create a subdirectory within geospatial-python called data and change into it:

     mkdir data
     cd data
    
  5. Download the data that will be used in this lesson. There are two ways you can do this:

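    • Web browser: Download the following three files, then move them into the data directory we created above:
        • brpgewaspercelen_definitief_2020_small.gpkg: https://figshare.com/ndownloader/files/37729413
        • brogmwvolledigeset.zip: https://figshare.com/ndownloader/files/37729416
        • status_vaarweg.zip: https://figshare.com/ndownloader/files/37729419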
    • Terminal: Running the following command will download three files (use the ls command to confirm):
       curl -L --progress-bar \
       --output brpgewaspercelen_definitief_2020_small.gpkg "https://figshare.com/ndownloader/files/37729413" \
       --output brogmwvolledigeset.zip "https://figshare.com/ndownloader/files/37729416" \
       --output status_vaarweg.zip "https://figshare.com/ndownloader/files/37729419"
      

      Do not unzip the files, since we will read from them directly.

  6. Change directories from data back into geospatial-python:

     cd ..
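
Whichever download method you used, you can confirm that the three files arrived before moving on. The sketch below is an illustration rather than part of the lesson: the helper name missing_files is hypothetical, the file names are taken from the curl command above, and it assumes you run it from the geospatial-python directory.

```python
from pathlib import Path

# File names taken from the curl command in the download step.
EXPECTED_FILES = [
    "brpgewaspercelen_definitief_2020_small.gpkg",
    "brogmwvolledigeset.zip",
    "status_vaarweg.zip",
]


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty in data_dir."""
    data_dir = Path(data_dir)
    missing = []
    for name in expected:
        path = data_dir / name
        # A zero-byte file usually means the download was interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes the current directory is geospatial-python.
    absent = missing_files("data")
    print("All data files present." if not absent else f"Missing: {absent}")
```

The check only looks at names and sizes; it does not open the files, so the zip archives stay untouched.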
    

Setting up the workshop environment with conda

If Anaconda was properly installed, you should have access to the conda command in your terminal/anaconda prompt.

  1. Test that it works by running the conda command in the terminal. You should get an output that looks like this:

     $ conda
     usage: conda [-h] [-V] command ...
    
     conda is a tool for managing and deploying applications, environments and packages.
    
     Options:
    
     positional arguments:
       command
         clean        Remove unused packages and caches.
         compare      Compare packages between conda environments.
         config       Modify configuration values in .condarc. This is modeled
                     after the git config command. Writes to the user .condarc
                     file (/home/rave/.condarc) by default.
         create       Create a new conda environment from a list of specified
                     packages.
         help         Displays a list of available conda commands and their help
                     strings.
         info         Display information about current conda install.
         init         Initialize conda for shell interaction. [Experimental]
         install      Installs a list of packages into a specified conda
                     environment.
         list         List linked packages in a conda environment.
         package      Low-level conda package utility. (EXPERIMENTAL)
         remove       Remove a list of packages from a specified conda environment.
         uninstall    Alias for conda remove.
         run          Run an executable in a conda environment. [Experimental]
         search       Search for packages and display associated information. The
                     input is a MatchSpec, a query language for conda packages.
                     See examples below.
         update       Updates conda packages to the latest compatible version.
         upgrade      Alias for conda update.
    
     optional arguments:
       -h, --help     Show this help message and exit.
       -V, --version  Show the conda version number and exit.
    
     conda commands available from other packages:
       env
    
  2. Create the environment using the conda create command. You can paste the following command into the terminal/Anaconda Prompt:

     conda create -n geospatial -c conda-forge -y \
       python=3.10 jupyterlab numpy matplotlib \
       xarray rasterio geopandas rioxarray earthpy descartes xarray-spatial pystac-client python-graphviz
    
    

    Please note that this step may take several minutes to complete. If it appears to be stuck for much longer than that, see below for other options.

    In this command, the -n argument specifies the environment name, the -c argument specifies the Conda channel where the libraries are hosted, and the -y argument spares the need for confirmation. The following arguments are the names of the libraries we are going to use. As you can see, geospatial analysis requires many libraries! Luckily, package managers like conda facilitate the process of installing and managing them.

    If the above method does not work, please try one of the two methods provided below:

    Alternative Method 1: Faster Environment Install With One Extra Step

    If you see a spinning / for more than a few minutes, you may want to try the following to speed up the environment installation.

    1. Cancel the currently running conda create process with CTRL+C
    2. Run conda install -c conda-forge mamba
    3. Run the following command:
    mamba create -n geospatial -c conda-forge -y \
    python=3.10 jupyterlab numpy matplotlib \
    xarray rasterio geopandas rioxarray earthpy descartes xarray-spatial pystac-client python-graphviz
    

    Alternative Method 2: Using environment.yaml file

    If the above methods do not work, it’s also possible to create the environment from a file:

    1. Right-click and “Save Link As…” on this link: https://carpentries-incubator.github.io/geospatial-python/files/environment.yaml

    2. Name it environment.yaml and save it to your geospatial-python folder. The environment.yaml contains the names of Python libraries that are required to run the lesson:

    name: geospatial
    channels:
      - conda-forge
    dependencies:
    # Python
      - python=3.10
    # JupyterLab
      - jupyterlab
    # Python scientific libraries
      - numpy
      - matplotlib
      - xarray
    # Geospatial libraries
      - rasterio
      - geopandas
      - rioxarray
      - xarray-spatial
      - earthpy
      - descartes # necessary for geopandas plotting
      - pystac-client
      - python-graphviz
    
    3. In the terminal, navigate to the directory where you saved the environment.yaml file using the cd command.
    4. Run the following command to create the environment from the file:
    conda env create -f environment.yaml
    

    conda should begin to locate, download, and install the Python libraries listed in the environment.yaml file.

    When installation has finished you should see the following message in the terminal:

     # To activate this environment, use
     #    $ conda activate geospatial
     #
     # To deactivate an active environment, use
     #    $ conda deactivate
    

    IMPORTANT

    If your terminal responds to the above command with conda: command not found, see the “Troubleshooting conda: command not found” section.

  3. Activate the geospatial virtual environment:

     conda activate geospatial
    

    If successful, the text (base) in your terminal prompt will now read (geospatial) indicating that you are now in the Anaconda virtual environment named geospatial. The command which python should confirm that we’re using the Python installation in the geospatial virtual environment. For example:

     % which python
     > /Users/your-username/anaconda3/envs/geospatial/bin/python
                                           ^^^^^^^^^^
    

    IMPORTANT

    If you close the terminal, you will need to reactivate this environment with conda activate geospatial to use the Python libraries required for the lesson and to start JupyterLab, which is also installed in the geospatial environment.
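
As a cross-platform complement to which python, the activation can also be checked from inside Python. This is a minimal sketch under stated assumptions: the helper names active_env_name and python_matches are hypothetical, and it relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated.

```python
import os
import sys


def active_env_name(environ=os.environ):
    """Return the active conda environment name, or None if none is active.

    conda sets CONDA_DEFAULT_ENV when an environment is activated.
    """
    return environ.get("CONDA_DEFAULT_ENV")


def python_matches(version_info=sys.version_info, major=3, minor=10):
    """True if the interpreter matches the version pinned in conda create."""
    return (version_info[0], version_info[1]) == (major, minor)


if __name__ == "__main__":
    print("Active environment:", active_env_name())
    print("Interpreter:", sys.executable)
    print("Python 3.10:", python_matches())
```

In the geospatial environment, the first line should report geospatial and the interpreter path should sit inside an envs/geospatial directory.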

Starting JupyterLab

To follow the lessons on using Python (episode 5 and onward), launch JupyterLab from the working directory that contains the data you downloaded, after activating the geospatial conda environment. See Starting JupyterLab for guidance, or enter the command below:

   jupyter lab

Once you have opened a new JupyterLab notebook, confirm that the environment is working by importing one of the installed modules:

  import rioxarray
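
Importing a single module is a quick spot check. To look for every library named in the conda create command, something like the following sketch could be run in a notebook cell. It is an illustration only: the helper name unavailable is hypothetical, and the import names are listed explicitly because a few differ from the conda package names (xarray-spatial is imported as xrspatial, pystac-client as pystac_client, python-graphviz as graphviz).

```python
from importlib.util import find_spec

# Import names for the packages in the conda create command. Some differ from
# the conda package name: xarray-spatial -> xrspatial,
# pystac-client -> pystac_client, python-graphviz -> graphviz.
MODULES = [
    "jupyterlab", "numpy", "matplotlib", "xarray", "rasterio", "geopandas",
    "rioxarray", "earthpy", "descartes", "xrspatial", "pystac_client",
    "graphviz",
]


def unavailable(modules=MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = unavailable()
    print("All modules found." if not missing else f"Not found: {missing}")
```

Note that find_spec only locates a module without importing it, so this will not catch a library that is installed but fails at import time; an actual import, as shown above for rioxarray, remains the stronger test.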

If all of the steps above completed successfully you are ready to follow along with the lesson!

Troubleshooting conda: command not found

  1. First, find out where Anaconda is installed.

    The typical install location is in your $HOME directory (i.e., /Users/your-username/) so use ls ~ to check whether an anaconda3 directory is present in your home directory:

     % ls ~
     > Applications      Downloads       Pictures
       anaconda3         Library         Public
       Desktop           Movies
       Documents         Music
    

    If, as in the example above, you see a directory called anaconda3 in the output, we’re in good shape. If not, contact the instructor for help.

  2. Activate the conda command-line program by entering the following command:

     source ~/anaconda3/bin/activate
    

    If all goes well, nothing will print to the terminal and your prompt will now have (base) floating around somewhere on the left. This is an indication that you are in the base Anaconda environment.

    Then return to step 2 of “Setting up the workshop environment with conda” to create the geospatial virtual environment.
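
If you are unsure where Anaconda was installed, the search in step 1 can be sketched in Python. This is a hedged illustration for macOS and Linux only: the helper name find_activate_scripts is hypothetical, anaconda3 is the directory name used above, and miniconda3 is included purely as an assumed alternative.

```python
from pathlib import Path

# anaconda3 is the directory name used above; miniconda3 is an assumed extra.
CANDIDATES = ["anaconda3", "miniconda3"]


def find_activate_scripts(home, candidates=CANDIDATES):
    """Return paths to bin/activate under each candidate directory in home."""
    home = Path(home)
    return [
        home / name / "bin" / "activate"
        for name in candidates
        if (home / name / "bin" / "activate").is_file()
    ]


if __name__ == "__main__":
    found = find_activate_scripts(Path.home())
    # Print the command from step 2 for each install that was found.
    for script in found:
        print(f"source {script}")
    if not found:
        print("No Anaconda directory found in", Path.home())
```

If nothing is found, Anaconda may be installed elsewhere; in that case, contact the instructor as suggested above.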