Instructor Notes
Install the required workshop packages
Please use the instructions in the Setup document to perform installs. If you encounter setup issues, please file an issue with the tag ‘High-priority’.
Checking installations.
In the episodes/files/scripts
directory, you will find a script called check_env.py. This checks the
functionality of the Anaconda install.
By default, Data Carpentry does not have people pull the whole
repository with all the scripts and addenda. Therefore, you, as the
instructor, get to decide how you’d like to provide this script to
learners, if at all. To use it, learners can navigate into
_includes/scripts in the terminal and run the check_env.py
script with Python.
If learners receive an AssertionError, its message will tell
you how to help them correct the installation. Otherwise, the script will
report that the system is good to go and ready for Data Carpentry!
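The contents of check_env.py are not reproduced in these notes. As a rough sketch of how an assertion-based check of this kind can work, the example below looks for a list of packages and raises an AssertionError naming any that are missing. The function names and the package list are hypothetical and are not taken from the actual script.

```python
import importlib.util


def find_missing_packages(package_names):
    """Return the names in package_names that Python cannot locate."""
    return [
        name for name in package_names
        if importlib.util.find_spec(name) is None
    ]


def check_environment(package_names):
    """Raise AssertionError listing missing packages, else print a success message."""
    missing = find_missing_packages(package_names)
    assert not missing, (
        "Missing packages: " + ", ".join(missing)
        + ". Please revisit the Setup instructions."
    )
    print("All requested packages were found.")


# Hypothetical package list, chosen only for illustration.
try:
    check_environment(["pandas", "matplotlib"])
except AssertionError as error:
    print(f"Environment problem: {error}")
```

Catching the AssertionError, as in the last lines, lets an instructor read the message aloud or paste it into a chat without the full traceback.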
07-visualization-ggplot-python
IPython notebooks for plotting can be viewed in the
learners folder.
08-putting-it-all-together
Answers are embedded with the challenges in this lesson, except for the random distribution, which is left to the learner to choose, and the final plot, for which the learner should investigate the Matplotlib gallery.
Scientists often work with mathematical equations, and being able to use them in graphics adds a lot of value. Luckily, Matplotlib provides powerful tools for text control. One of them is the ability to use LaTeX mathematical notation wherever text is used (you can learn more about LaTeX math notation here: https://en.wikibooks.org/wiki/LaTeX/Mathematics). To use mathematical notation, surround your text with dollar signs (“$”).
LaTeX uses the backslash character (“\”) a lot. Since the backslash has a special meaning in Python strings, you should replace each LaTeX-related backslash with two backslashes.
PYTHON
import matplotlib.pyplot as plt
import numpy as np

# Sample x values, assumed here so that the snippet runs on its own.
t = np.linspace(0, 10, 20)

plt.plot(t, t, 'r--', label='$y=x$')
plt.plot(t, t**2, 'bs-', label='$y=x^2$')
# Note the double backslash in the label below.
plt.plot(t, (t - 5)**2 + 5 * t - 0.5, 'g^:', label='$y=(x - 5)^2 + 5 x - \\frac{1}{2}$')
plt.legend(loc='upper left', shadow=True, fontsize='x-large')
# Note the double backslashes in the line below.
plt.xlabel('This is the x axis. It can also contain math such as $\\bar{x}=\\frac{\\sum_{i=1}^{n} x_i}{n}$')
plt.ylabel('This is the y axis')
plt.title('This is the figure title')
plt.show()
This page contains more information.
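As an optional aside that is not part of the lesson itself, Python raw strings (prefix `r`) are another way to keep LaTeX backslashes intact. The short sketch below compares a doubled-backslash label with its raw-string equivalent and shows what goes wrong when a single backslash is left in an ordinary string. The variable names are illustrative only.

```python
# The same label written two ways: doubled backslash vs. raw string.
doubled = '$y=(x - 5)^2 + 5 x - \\frac{1}{2}$'
raw = r'$y=(x - 5)^2 + 5 x - \frac{1}{2}$'
print(doubled == raw)  # True

# Without doubling or the r prefix, Python interprets some sequences itself:
# '\f' becomes a form feed and '\b' a backspace, so the LaTeX command is lost.
print(len('\frac'))    # 4, not 5: '\f' collapsed into one character
print(len(r'\frac'))   # 5: backslash preserved
print(len('\bar'))     # 3, not 4: '\b' collapsed into one character
```

Either form produces the same string, so learners can use whichever they find easier to read.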
Data visualization with Pandas and Matplotlib
Instructor Note
Warning: the line of code
pd.read_csv("../data/raw/surveys_complete_77_89.csv") is
prone to raising relative path errors for learners, for two main
reasons:
- They may not have set up the directory structure as described in the previous episode, for example if they used capital letters, added whitespace, or just typed it differently. This would change their relative path to the file.
- They didn’t create the notebook inside their scripts directory. JupyterLab seems to set the working directory to the place where the notebook was created, so if a learner created the notebook at the root of their project folder, the relative path would change.
In addition to handling the errors learners may get, you will also need to
spend some time talking about relative paths and why we need to use
'../'. Plan for 10 to 15 minutes to get everyone in your group
reading the CSV file successfully.
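When a learner hits a path error, it can help to show where Python is actually looking. The sketch below, using only the standard library, reports the working directory, the absolute location a relative path resolves to, and whether a file exists there. The helper name explain_relative_path is hypothetical and introduced only for this illustration.

```python
from pathlib import Path


def explain_relative_path(relative_path, start=None):
    """Describe how relative_path resolves from start (default: current working directory)."""
    start = Path(start) if start is not None else Path.cwd()
    target = (start / relative_path).resolve()
    return {
        "working_directory": str(start),
        "resolves_to": str(target),
        "exists": target.exists(),
    }


# Run this in the learner's notebook to see what '../' means from there.
report = explain_relative_path("../data/raw/surveys_complete_77_89.csv")
for key, value in report.items():
    print(f"{key}: {value}")
```

If "exists" is False, comparing "resolves_to" with the real location of the file usually reveals whether the notebook lives in the wrong folder or a directory name was typed differently.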
Exploring and understanding data
Choose how to teach this section
The section on generative AI is intended to be concise, but Instructors may choose to devote more time to the topic in a workshop. Depending on your own level of experience and comfort with talking about and using these tools, you could choose to do any of the following:
- Explain how large language models work and are trained, and/or the difference between generative AI, other forms of AI that currently exist, and the limits of what LLMs can do (e.g., they can’t “reason”).
- Demonstrate how you recommend that learners use generative AI.
- Discuss the ethical concerns listed below, as well as others that you are aware of, to help learners make an informed choice about whether or not to use generative AI tools.
This is a fast-moving technology. If you are preparing to teach this section and you feel it has become outdated, please open an issue on the lesson repository to let the Maintainers know and/or a pull request to suggest updates and improvements.