Data Carpentry is an open source project, and we welcome contributions of all kinds: new lessons, fixes to existing material, bug reports, and reviews of proposed changes.
By contributing, you agree that we may redistribute your work under our license. In exchange, we will address your issues and/or assess your change proposal as promptly as we can, and help you become a member of our community. Everyone involved in Software Carpentry and Data Carpentry agrees to abide by our code of conduct.
If you have an idea about how to improve the lesson, you can submit it as an issue on GitHub. If you have multiple unrelated suggestions, it is best to open a separate issue for each of them. This makes it easier for the project maintainers to discuss and resolve them.
Submitting an issue can count as a contribution for your instructor training checkout. If your contribution is for instructor training, send an email with a link to the issue to checkout@carpentries.org. Please note that it is not necessary to point out in the issue’s title or text that it is a contribution for the instructor training checkout.
You can also suggest changes by modifying the lesson code directly and submitting your changes as a pull request.
Fork the alisonrclarke/R-archaeology-lesson repository on GitHub. See the “Fork” button in the top-right corner of the screen on the GitHub website.
Clone that repository to your own machine. (It is also possible to make minor edits right on GitHub.) At your terminal:
Create a branch from main for your changes. Give your branch a meaningful name, such as fix-typos-dplyr-lesson or add-tutorial-on-visualization. At your terminal:
Make your changes to the Rmd file. If you’d like to check the rendered version of your changes, you can do one of the following:
If you have GNU Make installed on your system, type make at your shell terminal.
Type rmarkdown::render_site("01-intro-to-r.Rmd") in your R terminal (make sure your working directory is at the root of the lesson) to generate the corresponding html file.
Commit the Rmd file you edited (git add file-you-changed.Rmd, followed by git commit -m "fix typos in dplyr lesson"), and push your changes to your repository on GitHub (git push origin fix-typos-dplyr-lesson). If your change affects a lesson, please only commit and push the Rmd files. The rendered versions will be generated by the lesson maintainers to avoid merge conflicts.
Send a pull request (PR) to the main branch of the alisonrclarke/R-archaeology-lesson repository for this lesson at https://github.com/alisonrclarke/R-archaeology-lesson
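The Git side of the steps above can be rehearsed offline before you touch your fork. The sketch below is only an illustration: a throwaway local repository stands in for your fork on GitHub, and the scratch paths, the author identity, and the file contents are assumptions made up for the example. In a real contribution you would clone your fork from GitHub instead and skip the setup lines.

```shell
set -eu

# Assumption: a scratch directory stands in for GitHub and your own machine.
scratch="$(mktemp -d)"

# Stand-in for your fork. In practice you would run something like
#   git clone https://github.com/<your-username>/R-archaeology-lesson.git
git init -q --bare "$scratch/fork.git"
git clone -q "$scratch/fork.git" "$scratch/R-archaeology-lesson" 2>/dev/null
cd "$scratch/R-archaeology-lesson"

# Illustrative identity and starting commit, only needed in the scratch clone.
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git symbolic-ref HEAD refs/heads/main
echo "Teh first lesson." > 01-intro-to-r.Rmd
git add 01-intro-to-r.Rmd
git commit -q -m "add lesson file"

# Create a branch from main with a meaningful name.
git checkout -q -b fix-typos-dplyr-lesson main

# Edit the Rmd file, then commit only that file.
echo "The first lesson." > 01-intro-to-r.Rmd
git add 01-intro-to-r.Rmd
git commit -q -m "fix typos in dplyr lesson"

# Push the branch to your fork, ready for a pull request against main.
git push -q origin fix-typos-dplyr-lesson
```

After the push, the branch exists on the fork and a pull request against main can be opened from the GitHub website.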
If you are new to Git or GitHub, software like GitHub Desktop can make this process easier for you.
If it is easier for you to send edits to us some other way, please mail us at checkout@carpentries.org. Given a choice between you creating content or wrestling with Git, we’d rather have you doing the former.
For the R material, lessons are written in RMarkdown (files ending in Rmd). Filenames follow the pattern 00-before-we-start.Rmd, 01-intro-to-r.Rmd and so on. That is, we use two digits followed by a topic key to ensure files appear in the right order when listed.
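If you want to check a new filename against this pattern before committing, a small shell helper along these lines may do. It is a sketch, not part of the lesson tooling: the function name is_lesson_filename is hypothetical, and the assumption that topic keys consist of lowercase letters and digits separated by hyphens is inferred only from the two example names above.

```shell
# Hypothetical helper: succeeds when a name looks like NN-topic-key.Rmd.
# Assumption: topic keys are lowercase letters/digits separated by hyphens.
is_lesson_filename() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{2}-[a-z0-9]+(-[a-z0-9]+)*\.Rmd$'
}

# Try the helper on the two names from the text and two that break the pattern.
for name in 00-before-we-start.Rmd 01-intro-to-r.Rmd intro-to-r.Rmd 1-intro.Rmd; do
  if is_lesson_filename "$name"; then
    echo "ok:   $name"
  else
    echo "skip: $name"
  fi
done
```

The two-digit prefix is what keeps the files in lesson order in a plain alphabetical listing, which is why a single-digit prefix is rejected here.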
A Makefile converts the Rmd files into HTML files that are processed by Jekyll (the tool GitHub uses to create websites) as explained in the README file.
To ensure a consistent formatting of the lessons, we recommend the following formatting guidelines for RMarkdown files:
Functions are written as function() while variables are written as variable, and package names as package.
## Use this format for headers
And not this format
-------------------
Most R code within .Rmd files is written inside of code chunks. Code chunks can have a name and a number of options, but neither is required. Options are added to a code chunk like this:
```{r, chunk_name, option1 = value, option2 = value, ...}
```
Throughout the lesson, we use different code chunk options, mostly to change when and how the code in the chunks is executed. Below you will find a list of the most common options we use and information on how we use them. More information on RMarkdown code chunk options can be found in the knitr documentation. When in doubt, consult the Rmd files for examples.
answer = [FALSE | TRUE]
The answer option is used in challenges to hide the content of the chunk so that the reader needs to interact with the website to reveal it. The default value is FALSE.
echo = [FALSE | TRUE]
If echo = FALSE, the code will be executed and its output will be visible on the lesson website (unless specified otherwise by the eval, message, or results options), but the code itself will not be visible. This is useful when writing code for the code handout, because it lets you include redundant headings and comments that are not needed in the lesson itself but help to structure and clarify the code handout. The default value is TRUE.
eval = [FALSE | TRUE]
If eval = FALSE, the code in the chunk will not be executed by R when the file is processed to create the lesson website. Accordingly, no output will be created. This is useful, for example, when seeing the result of the code is not required for the lesson, or when the code chunk contains code that installs or loads packages, downloads files, or opens the R help window. The default value is TRUE.
message = [FALSE | TRUE]
If message = FALSE, messages produced by the code will not be shown. This is useful, for example, when loading packages like tidyverse that print messages when loaded: with message = FALSE, such output is hidden. The default value is TRUE.
purl = [FALSE | TRUE]
Code chunks that have the option purl = TRUE will be included in the code handout (see below). The default value is FALSE.
results = ['markup' | 'hide' | 'asis' | 'hold']
Determines if and how the text output of a code chunk is formatted. Useful values are markup (to format text output using markup, usually formatting it as a code block), asis (to write raw output directly into the document without any markup), and hide (to hide the output, for example when loading data sets).
The code handout code-handout.R contains code that can be distributed to learners. This is particularly useful for error-prone code such as long URLs for downloading files. The code handout is created automatically from the lesson’s .Rmd files by make_code_handout.R, which uses the purl() function from knitr. Code that should be included in the code handout must be enclosed in an R code chunk with the chunk option purl = TRUE (see above). To make the handout more useful, consider including explanatory comments.
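Before opening a pull request, it can help to see which chunks you have marked for the handout. The sketch below is a rough text search, not how the handout is actually built (that is done by make_code_handout.R with purl()): it writes a made-up demo.Rmd whose chunk names are purely illustrative, then lists the chunk headers that set purl = TRUE. The helper name list_handout_chunks is hypothetical.

```shell
set -eu

# Assumption: demo.Rmd is a made-up file; the chunk names are illustrative.
fence='```'
demo_dir="$(mktemp -d)"
demo="$demo_dir/demo.Rmd"
{
  printf '%s{r load-packages, message = FALSE, purl = TRUE}\n' "$fence"
  printf 'library(tidyverse)\n'
  printf '%s\n\n' "$fence"
  printf '%s{r install-packages, eval = FALSE}\n' "$fence"
  printf 'install.packages("tidyverse")\n'
  printf '%s\n\n' "$fence"
  printf '%s{r challenge-solution, answer = TRUE, purl = FALSE}\n' "$fence"
  printf 'nrow(mtcars)\n'
  printf '%s\n' "$fence"
} > "$demo"

# Hypothetical helper: print "line:header" for every R chunk header
# that sets purl = TRUE in the given files.
list_handout_chunks() {
  grep -nE '^```\{r.*purl *= *TRUE' "$@" || true
}

list_handout_chunks "$demo"
```

Only the load-packages chunk is listed here: the second chunk relies on the default (purl = FALSE) and the third sets purl = FALSE explicitly. In a real lesson you would pass the lesson's own Rmd files to the helper instead of the demo file.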
We don’t store data for lessons inside the lesson repositories. For completed lessons, the data should be publicly available in a data repository appropriate to the data type. For lesson development, the data may be provided in any way that is convenient, including posting to a website, on figshare, a public Dropbox link, a GitHub gist, or even included in the pull request (PR). Once the PR is ready to merge, the data should be placed in the official data repository and all links to the data updated.
Raw data go into data_raw/. However, at this stage, this folder is created programmatically and only contains datasets downloaded directly from the Zenodo repository. In other words, it can safely be deleted (e.g. using make clean-data or make clean).
The data/ folder only contains data generated/exported by R code.
Images (e.g., screenshots) are stored in the img/ folder. Graphics generated by some R code also go into this folder and get the prefix R-archaeology-. This latter case is handled automatically with some knitr options in the setup.R file.
The site_libs folder is generated by the rmarkdown package and holds the JavaScript, CSS, and fonts used by the website.
We aim to have our lessons be as self-contained as possible. Images and other external resources should be included in the repository whenever possible.
Page built on: 📆 2022-05-04 ‒ 🕢 16:01:47
Data Carpentry, 2014-2021.