Summary and Setup

This workshop lesson introduces interactive data visualization in Python. Learners will create a new environment using conda, wrangle data into the proper format using the pandas library, create visualizations using the Plotly Python library, and display these visualizations and create widgets using Streamlit.

Preview the app

For a preview of what learners will be creating in this lesson (including the exercises), click the “Open in Streamlit” button below.

Open in Streamlit

The repository that contains this example Streamlit app can be found here.

Prerequisites

  1. Learners should have completed the Plotting and Programming in Python workshop lesson, or have some experience with Python and the pandas library.
  2. Learners should have Anaconda installed on their machines, as specified in the setup for Plotting and Programming in Python.
  3. Learners should be comfortable with using the command line and with using git, either on the command line or through an application like GitHub Desktop.
  4. Learners should have a web browser installed that is compatible with Jupyter Lab and Streamlit (Google Chrome, Firefox, or Safari).
  5. Learners should have downloaded the required files (data_viz_workshop.zip) as specified in Getting the Files below.
  6. Learners should have a GitHub account if they wish to deploy (share) their app.

Getting the Files


The dataset we will be using is taken from the gapminder dataset, the same one used in the Plotting and Programming in Python workshop. We will also be using a special environment, which can be recreated on your machine using Anaconda.

To obtain the dataset and environment, download the file data_viz_workshop.zip. If given the option, choose to Save File in your Desktop folder. If you are not given the option to choose where to save the file, move the zipped file to your Desktop after it downloads. Finally, double-click the zipped file to unzip it. You should now have a folder called data_viz_workshop. If you open this folder, you will see a file called environment.yml and a folder called Data.
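
If you would like to confirm that the files unzipped correctly, a short Python check along these lines may help. It is only a sketch: the helper name check_workshop_folder is hypothetical (it is not part of the workshop materials), and the script assumes the folder sits at Desktop/data_viz_workshop as described above.

```python
from pathlib import Path


def check_workshop_folder(folder):
    """Return a list of expected items that are missing from `folder`.

    This is a hypothetical helper written for this check only;
    it is not part of the workshop materials.
    """
    folder = Path(folder)
    missing = []
    # environment.yml should be a regular file inside the folder.
    if not (folder / "environment.yml").is_file():
        missing.append("environment.yml")
    # Data should be a subfolder.
    if not (folder / "Data").is_dir():
        missing.append("Data")
    return missing


if __name__ == "__main__":
    # Assumes the zip file was unpacked on the Desktop; adjust the path if not.
    workshop = Path.home() / "Desktop" / "data_viz_workshop"
    missing = check_workshop_folder(workshop)
    if missing:
        print("Missing from", workshop, ":", ", ".join(missing))
    else:
        print("Found environment.yml and the Data folder in", workshop)
```

If the script reports missing items, the most likely causes are that the zip file was unpacked somewhere other than the Desktop or has not been unzipped yet.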

Optional: Create the virtual environment


Learners who already have experience working with virtual environments can create the environment as part of setup. This leaves more time during the workshop itself for other activities.

If your instructor tells you to create the dataviz environment in advance, follow the directions in Episode 2 (Create a new environment). Then you can open Jupyter Lab in the project root directory (e.g., Desktop/data_viz_workshop).

Create a GitHub account (if you don’t already have one)


You can sign up for a GitHub account at github.com/signup.

Make sure to choose a general-purpose email address that you are likely to still have access to in 5 years; that is, not one tied to a specific workplace, university, or Internet Service Provider.

Make sure to also choose an appropriate username that you are comfortable putting on your resume or sharing with colleagues - some variation of your name is a good idea.

After you have a GitHub account, you should also download GitHub Desktop, so that you can clone, pull, and push without having to use the command line. You can download GitHub Desktop here.

Installing Python Using Anaconda


Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer.

Regardless of how you choose to install it, please make sure you install Python version 3.x (e.g., 3.6 is fine).
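
To see which Python version is currently active, one option is a quick check like the sketch below. The function name is_python3 is a hypothetical helper introduced here for illustration; the check itself relies only on the standard sys module.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the major version is 3 (the 3.x requirement above).

    `is_python3` is a hypothetical helper name used for this sketch.
    """
    return version_info[0] == 3


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.x.y ...".
    print("Running Python", sys.version.split()[0])
    if not is_python3():
        print("Please install a Python 3.x release before the workshop.")
```

Run it with the same Python interpreter you plan to use in the workshop, since a machine can have more than one Python installed.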

We will teach Python using the Jupyter Notebook, a programming environment that runs in a web browser (Jupyter Notebook will be installed by Anaconda). For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).