This lesson is in the early stages of development (Alpha version)

Adding Code-Generated Plots and Figures

Overview

Teaching: 50 min
Exercises: 20 min
Questions
  • What is Knitr?

  • What are code chunks and how they are structured?

  • How can you run code from your rmd document?

  • What are global knitr options?

  • What are global chunk options?

Objectives
  • Understand the syntax of a code chunk.

  • Learn how to insert run-able blocks of code to integrate into your report

  • Learn how to source external scripts to run within an rmd document.

  • Learn about using global knitr options and global chunk options

Utilizing the Code Features of R Markdown

We’ve learned about the text-formatting options of R Markdown, now let’s dive into the code portion of R Markdown documents. R Markdown flips around the defaults of code and text in the documents. Instead of priortizing the code and making you comment out (#) text, they priortize text and force you to specially signal the code portions. How do you signal to R the difference between code and text when you’re not using code commments (#)? That’s where “Code Chunks” come into play (Yes that’s RStudio’s technical name for them). Instead of R Markdown’s rendering system processing the markdown styling into the final output, Code chunks are sent to a preceding stage of processing by Knitr, which “knits” the code output and text together. Secondly, rmarkdown processes the code output and displays it in the document format of our choice - i.e. Knitr runs the lines of code for a plot in a code chunk, joins it to the markdown text portions, and rmarkdown outputs that as an html document.

What is Knitr?

But what is Knitr? Knitr is the engine in RStudio which creates the “dynamic” part of R Markdown reports. It’s specifically a package that allows the integration of R code into the html, word, pdf, or LaTex document you have specified as your output for R Markdown. It utilizes Literate Programming to make research more reproducible. There are two main ways to process code with Knitr in R Markdown documents:

  1. Code Chunks
  2. Inline Code

First, we’re going to talk about code chunks more substantial portions of code into our narrative such as figures and plots. There are a plethora of options that become available to us when using code chunks so this tends to be the more complex part of R Markdown documents. Now, sometimes you just need to do a quick calculation - like a count of total observations in your data or the mean of one of your variables. In those cases, it may not be worth setting up a code chunk to calculate those values, so after code chunks we will see how to add inline code - which allows one to add a quick line of code or single function to be executing within the text portion of the document. But let’s start with code chunks.

Inserting Code Chunks

Code chunks are better when you need to do something more sophisticated with your code than inline code, such as building plots or tables. They also incorporate syntax which allows modifications to how that code is rendered and styled in your final output. We’ll learn more about that as we walk through the “anatomy” of a code chunk.

Start a new .Rmd File

First, though, let’s open a new .rmd document to get a look at how code chunks work before integrating them into our paper.

Again, open a new document by navigating to File > New File > R Markdown. Add the title Code-Chunk-Test.

Let’s first delete the generic text because we don’t need it at this point (all except the first code chunk that is - we’ll get back to that in a second).

Default RMD just setup chunk

Basic Anatomy of the Code Chunk

You can quickly insert chunks like these into your file with:

The most basic (and empty) code chunk looks like so:

blank Rmd code chunk

Other than our backticks ``` for code chunks that surround the code top and bottom, the only required piece is the specified language (r) placed between the curly brackets. This indicates that the language to read the code is R.

Let’s all start a new code chunk by typing our our starting backticks & r between curly brackets. (in your own workflow you may want to add the ending three backticks as well so you don’t forget after adding your code - it’s a common mistake):

Fun fact: Other Programming Languages

Although we will (mostly) be using R in this workshop, it’s possible to use other programming or markup languages. For example, we have seen that we can use LaTeX code for equations. You can also use python and a handful of other languages, so if R is not your preferred programming, but you like working in the RStudio environment, don’t despair! Other options for languages include: sql, julia, bash, and c, etc. It should be noted however, that some languages (like python) will require installing and loading additional packages.

Add a Code Chunk

Ok, let’s add some code! There are already some plots included in our code but as static images. This time, we are going to opt to add these plots as code chunks - which are also more reproducible and easier to update. This is because, as with our inline code, this assures that if there are any changes to the data, the plots update automatically. This also makes our life easier because when there’s a change we don’t have to re-generate plots, save them as images and then add them back in to our paper. This will potentially help prevent version errors as well! So we’re actually going to go ahead and add a few plots with code chunks.

Now, let’s open our 03_HR_analysis.R script in our code folder. We will insert the code of this script into our current working file. To do this, copy the code and paste it in-between the two lines with backticks and {r}.

heartrate code in chunk

Tip:

There’s actually a button you can use in the RStudio menu to generate the code chunks automatically. Automatic code chunk generation is available for several other languages as well. Also, you can use the keyboard shortcut ctrl+alt+i for Windows and command+option+i for Mac. auto create code chunk

Run the code in a code chunk

Now, to check to make sure our code renders, we could click the “knit” button as we have been doing to check on the output of our R Mardown file. However, with the code chunks we have options for running and debugging code that don’t require us to wait for the file to render.

1) Run from code chunk (green play button on the right top corner). This allows us to run one specific code chunk.

run from code chunk

2) Run menu - this gives more options for running code chunks including the current one, the next one, all chunks, etc.

run code menu

3) Keyboard shortcuts:

Task Windows & Linux macOS
Create a code chunk Ctrl + Alt + I Cmd + Option + I
Run all chunks above Ctrl+Alt+P Command+Option+P
Run current chunk Ctrl+Alt+C Command+Option+C
Run current chunk Ctrl+Shift+Enter Command+Shift+Enter
Run next chunk Ctrl+Alt+N Command+Option+N
Run all chunks Ctrl+Alt+R Command+Option+R
Go to next chunk/title Ctrl+PgDown Command+PgDown
Go to previous chunk/title Ctrl+PgUp Command+PgUp

Run your code with one of the given methods.

Did it work? Look under the code chunk. You should now see a plot preview displayed beneath the code chunk if all went well.

Code Chunk Plot Preview

Knitting with Code Chunks

We just saw how to run our code in our code chunks to see a preview of the code output that will render in our html document but to actually render it we need to use the Knit button. Using the knit button with code chunks is a two step process - first the code is run (all code chunks will run automatically). Second, (if there are no code errors) the document of choice will render for our whole R Markdown document.

Time to Knit!

Now, let’s knit the R Markdown file and see how our code output looks in the final html page.

code chunk with plot1 code

Wait… what’s all that output in our document? We don’t want that in our paper!

Heart rate code no options for code chunk

This happens because the output from running code (messages, results, warnings, etc.) get’s added to the R Markdown document instead of being printed to the console. Let’s see about adjusting the output to make it look better with code chunk rendering options.

Code Chunk Naming and Options

Name Your Code Chunk

Before we get to fixing how our code output looks, let’s pause a second and give our code chunk a name (also called a label). While not necessary for running your code, it is good practice is to give a name to each code chunk because it gives the chunk a unique identifier which allows for more advanced options (such as cross-referencing) to work with your rmd files later on:

{r chunk-name}

Some things to keep in mind

We’ll see in a bit where this code chunk label comes in handy. But, for now let’s go back and give our first code chunk a name:

{r fig3-heartrate}

Tip: Don’t use spaces, periods or underscores in code chunk labels

Try to avoid spaces, periods (.), and underscores (_) in chunk labels and paths. If you need separators, you are recommended to use hyphens (-) instead. For example, setup-options is a good label, whereas setup.options and chunk 1 are bad; fig.path = ‘figures/mcmc-‘ is a good path for figure output, and fig.path = ‘markov chain/monte carlo’ is bad. See more at: https://yihui.org/knitr/options/

Code Chunk Options

There are over 50 different code chunk options!!! Obviously we will not go over all of them, but they fall into several larger categories including: code evaluation, text output, code style, cache options, plot output and animation. We’ll talk about a few options for code evaluation, text output and plot output specifically.

Tip: Learn more about code chunk options

Find a complete list of code chunk options on Knitr developer, Yihui Xie’s, online guide to knitr. Or, you can find a brief list of all options on the R Markdown Reference guide on page 3 accesible through the RStudio Interface by navigating to the main menu bar Help > Cheat Sheets > R Markdown Reference Guide.

The chunk name is the only value other than r in the code chunk options that doesn’t require a tag (i.e. the “= VALUE” part of option = VALUE). So chunk options will always require a tag, and the syntax will be in the form:

{r chunk-label, option = VALUE}

the option always follows the code chunk label (don’t forget to add a , after the label either).

Common text output & code evaluation options:

Code evaluation option

include = (logical) whether to include the chunk output in the output document (defaults to TRUE).

Text output options

eval = (logical or numeric) TRUE/FALSE to evaluate (or not) or a numeric value like c(1,3) (only evaluate expressions 1 and 3). echo = (logical or numeric - following the same rules as above) whether to display source code or not.
results = (logical or character) text output of the code can be hidden (hide or FALSE), or delineated in a certain way (default ‘markup’).
warning = (logical) whether to display the warnings in the output (default TRUE). FALSE will output warnings to the console only.
message = (logical) whether or not to display messages that appear when running the code (default TRUE).

CHALLENGE 7.1 - Rendering Codes (optional)

How will some hypothetical code render given the following options? {r global-chunk-challenge, eval = TRUE, include = FALSE}

SOLUTION

The expressions in the code chunk will be evaluated, but the outputed figures/plots will not be included in the knit document.
When might you want to use this?
If you need to calculate some value or do something on your dataset for a further calucation or plot, but the output is not important to be included in your paper narrative.

CHALLENGE 7.2 - add options to your code

Add the following options to your code:
echo = FALSE, message = FALSE, warning = FALSE, results = FALSE

What will this do?

SOLUTION

solution to 7.2

These options mean the source code will not be printed in the knit html document, messages from the code will not be printed in the knit html document, and warnings will not be printed in the knit html document (but will still output to the console). Plots, figures or whatever is printed by the code WILL show up in the final html document.

Caption your code chunk output:

The options we just looked at focus on code evaluation and text output. However, we have another set of options that deal with how plot or figure outputs look at act. Many of the options start with fig. The one we will use today allows us to add a caption to our figure. Again, this is an optional feature, but if you need (or want) to add captions to your publication, it is straightforward to do in code chunks.

The caption information also resides between your brackets at the beginning of the chunk: {r}

the tag is fig.cap followed by a = and the captions within quotes "caption for figure".

CHALLENGE 7.3: Add a caption to Figure 3

Let’s add a caption to our heartrate figure. Add the caption:

“Fig 3: Mean heart rate of stress and control groups at baseline and during intervention.”

SOLUTION

so, you should end up with the following in your code chunk:

{r fig3-heartrate, echo = FALSE, message = FALSE, warning = FALSE, results = FALSE, 
fig.cap = "Fig 3: Mean heart rate of stress and control groups at baseline and during intervention."}

Set the option fig.cap to equal the text in double quotes.

More plot/figure options

Other options that change how a plot or figure appears often use the sytax fig.xxx similar to fig.cap Some other useful plot/figure code options include (From Yihui Xie’s page ):

  • fig.width, fig.height: (both are 7; numeric) Width and height of the plot (in inches), to be used in the graphics device.
  • out.width, out.height: (NULL; character) Width and height of the plot in the output document, which can be different with its physical fig.width and fig.height, i.e., plots can be scaled in the output document.
  • fig.align: (‘default’; character) Alignment of figures in the output document. Possible values are default, left, right, and center. The default is not to make any alignment adjustments.
  • fig.link: (NULL; character) A link to be added onto the figure.
  • fig.alt: (NULL; character) The alternative text to be used in the alt attribute of the tags of figures in HTML output. By default, the chunk option fig.cap will be used as the alternative text if provided.

Let’s knit one more time to see if our figure outputs how we’d like and has a caption.

Time to Knit!

Let’s try that again

Global Code Chunk Options:

Now we’ve learned how to create a code chunk and learned about options for adjusting how that code renders in our output document. However, let’s direct our attention back to the first code chunk in this document that I asked you not to delete.

The code looks like:

Code Chunk Option Setup

This is an option to globally set options for the entire R Markdown document.

With our first plot we set the four options that adjust how that one chunk renders. However, we may end up with quite a few code chunks in our paper. For example, if we have 10 code chunks in the final paper, can you imagine how much work it would be to add the options in manually each time? and if we need different options for different figures, it could be a lot of work to keep track of what options we’re using throughout the paper. We can automate setting options by adding this special code chunk at the beginning of the document. Then, each code chunk we add will refer to those “global” options when it runs.

To test this, let’s add the options we put in our code chunk (and make sure to delete them from the code chunk with the heart rate code)

In the () after the knitr::opts_chunk$set() add the options:

knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE, results = FALSE)

Tip: Overiding global options

What if you want most of your code chunks to render with the same options (i.e. echo = FALSE), but you just have one or two chunks that you want to tweak the options on (i.e. display code with echo = TRUE)? Good news! The global options can be overwritten on a case by case basis in each individual code chunk. Test this by adding echo = TRUE to your code chunk in your document and knitting. Did you override the global settings successfully?

Add global options to our paper

Now, let’s navigate back to our paper `` and add global options there.

To set global options that apply to every chunk in your file, we will call knitr::opts_chunk$set() in a new code chunk right after our yaml header (name the new code chunk setup.

Knitr will treat each option that we add to this call as default settings for all code chunks. However, we will need to set the options for this code chunk in the first place! so make sure to use include = FALSE as in the generic R Markdown document.

In the () after the knitr::opts_chunk$set() add the options:

knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE, results = FALSE)

Alright! That sets us up well for adding code chunks into our paper (which we will do next)

CHALLENGE 7.4 (optional) global & individual code chunk options

How would appear in our html document if we knit a code chunk with the following options?
{r challenge-5, warning = TRUE, echo = TRUE}

…considering the global chunk settings were as listed: knitr::opts_chunk$set(echo = FALSE, include = FALSE)

SOLUTION

In this case, the global settings are set so neither the code nor the output will display. However, the individual chunk reverses the echo setting so the code will display, and it also indicates that any warnings the code renders should output too. The outputs of the code would still not be displayed (include = FALSE) The hypothetical situation for this configuration may be for debugging while writing the rmd document.

Key Points

  • Knitr will render your code and R markdown-formatted text and output your document format of choice

  • Code chunks are runable piece of R code. Each time you knit the document, calculations and plots will be run and displayed

  • Options for code chunks can be set at the individual level or at the global level