Instructor Notes

Schedule


We will put information about the schedule here once we have it! We are aiming for a 1-day lesson. If we end up with much more content, we may develop a longer version, but it would be good to have a 1-day version that can be taught in conjunction with the existing ecology curriculum. The goal is for this to be a supplemental lesson to the existing Data Carpentry ecology lesson, though it will also be able to stand alone.

  • This is also why we chose to use the {ratdat} dataset.
  • The skills that learners get from the DC Ecology lesson are assumed as a baseline level of R experience in this lesson, though this lesson is also useful for more advanced R users.

Motivation and philosophy


This lesson was conceptualized by Kaija Gahm as part of a Teaching As Research project through CIRTL at UCLA, in Spring 2024.

Debugging code is a challenging task and not one that can be comprehensively covered in a single lesson. This lesson was designed with the understanding that learners will be better equipped to proceed with confidence in their coding if they have a toolbox of general skills that they can apply when they get stuck (or a general workflow to follow).

Traditional debugging lessons aimed at computer science students often cover very technical debugging concepts that are more applicable to programmers than to end users of code. We observed that most of our colleagues (graduate students in ecology and evolutionary biology at UCLA) have little computer science background. Their interest in learning to code (usually in R) is pragmatic: they want to use R as a tool to solve their research problems, rather than as an end in itself.

Instead of trying to teach a lesson on debugging per se, we chose to focus narrowly on the practical skill of building a minimal reproducible example [because reasons–KG to expand here].

  • Emotions are important: acknowledge the stress and anxiety that come with getting stuck.
  • This has been done before but not in an interactive format.

What is a reprex and why is it useful?


Instructor Note

Loading the entire {tidyverse} here, rather than a few component packages, is an intentional over-complication so that we can teach learners to simplify their packages later. Learners should have {tidyverse} installed, as per the setup instructions.



Instructor Note

The following exercises are optional, but they are useful for getting learners settled in.



Identify the problem and make a plan


Minimal reproducible code: making a reprex


Minimal reproducible data


Instructor Note

Development note: Previous episodes should already have introduced the concept of a minimal reproducible example, explained why it is important, and discussed making the code minimal and reproducible. This data episode is part of being reproducible and should perhaps be followed up with more details on reproducibility if not previously mentioned (e.g., a reprex needs minimal code [check]; dependencies, which include minimal data [check]; and other basic information such as your system and R version, as well as contextual information on your data?).



Instructor Note

Idea for future edits: the above callout could be rephrased using Mickey and a narrative example (e.g., Mickey tries to just send their data to Remy but it doesn’t work out).



Instructor Note

Section 4.1, Exercise 1, and the extra practice exercise that follows are all intended to address the LO “Describe a minimal reproducible dataset.” The extra practice exercise better targets the learner’s ability to identify the appropriate reprex dataset.



Instructor Note

The lesson developers recommend you skip the exercise below to stay within the lesson’s time estimate. You can provide it as extra practice for quicker learners or during breaks.



Instructor Note

It would be nice if we could toggle the callout below, or at least the table, so that it doesn’t take up so much room, since it is not an essential part of the lesson.



Instructor Note

You can skip the challenge below if short on time. The exercise assesses the LO: “Create a dataset from scratch.” There will not be another exercise to practice this, but you will walk through creating the main reprex dataset together, so that may be enough. Do not spend too much time on this. Students can work in pairs, or you can work all together. Don’t wait for everyone to be done before going through the solutions. Instead, consider having learners alert you when they finish each one; they can then keep going while they wait. You can skip giving the solution for C: quicker learners will likely have tried it for practice, while slower learners may not have gotten to it, and that’s OK!



Instructor Note

Make sure participants understand the distinction we are trying to make and why it matters. It may not be straightforward. While it does not address a specific LO, it is relevant to being able to create a dataset for a given reprex.



Instructor Note

Give them some time to think it through, maybe in pairs or small groups, but focus on working through the answers together, even if not everyone has finished. Go one question at a time (time to answer the first -> walk through solution -> time to answer the second -> etc.).



Instructor Note

Begin creating the dataset with the information gained from each question, then you can put it all together.



Instructor Note

You can give them time to answer those questions on their own first, like in Exercise 5, or just go through them together.



Instructor Note

Note: if the new dataset had an odd number of rows, R would issue a warning that hints at the source of the problem.
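
As a rough illustration of the kind of warning this note describes, here is a minimal base R sketch. It assumes the underlying problem is a comparison made with `==` against a vector of two values, which R recycles along the longer vector; that is an assumption, since the exercise itself is not reproduced in these notes. The vectors `species_even`, `species_odd`, and `targets` are hypothetical toy data, not the lesson's dataset.

```r
# Hypothetical toy vectors (assumed for illustration; not the lesson's data).
species_even <- c("DM", "DO", "DO", "DM")        # 4 values (even length)
species_odd  <- c("DM", "DO", "DO", "DM", "DM")  # 5 values (odd length)
targets      <- c("DM", "DO")

# Even length: `targets` is recycled as DM, DO, DM, DO with no warning,
# so the 3rd and 4th elements are compared against the "wrong" target.
even_result <- species_even == targets
print(even_result)

# Odd length: the longer length is not a multiple of the shorter one,
# so R issues a warning. Here we capture its message instead of printing it.
odd_warning <- tryCatch(
  {
    species_odd == targets
    NULL  # only reached if no warning was raised
  },
  warning = function(w) conditionMessage(w)
)
print(odd_warning)

# %in% tests membership element by element, without recycling `targets`.
in_result <- species_odd %in% targets
print(in_result)
```

In the even-length case the comparison silently returns `FALSE` for values that do appear in `targets`, which is why the warning in the odd-length case is a useful hint.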



Instructor Note

I love that the {dplyr} filter() documentation is thoroughly unhelpful on this.



Asking your question