Using knitr and R to make instructor/student handout versions

I teach some of my lab sections using R, and so I need to create lab handouts that include nicely formatted R commands and R output as an example for the students. These handouts will also include exercises where the students will be writing their own R code, or interpreting the results, or generating figures. For these exercises, it is useful to also have an instructor version of the handout so that I can recall what I was hoping to have the students do, and so that other instructors in the course have some clue as to what I might have been thinking when I wrote the exercises. Using the knitr package in R, I create these handouts as TeX documents that are turned into nicely formatted PDF files, and I use the flexibility of knitr to put both the code for the “student” and “instructor” versions of the handout in a single .Rnw file. By changing a single variable value in my .Rnw file, I can then have knitr and pdflatex produce either the student version of the handout, or the longer instructor version with all of the additional code and answers included in the PDF file. The output PDF file always has the same name as the .Rnw file, so I do have to be careful to rename the PDF so that I can tell whether it’s the student or instructor version. Please note that knitr is also used to produce Markdown and HTML output, but what I cover below is mostly specific to the TeX and PDF output.
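Because both versions compile to the same file name, a small helper can copy the finished PDF to a version-specific name. The following is a minimal sketch in base R and is not part of the workflow described in this post; the function names versioned_pdf and save_versioned_pdf and the _student/_instructor suffixes are assumptions made for illustration.

```r
# Hypothetical helper: build an output name for one version of the handout,
# e.g. "lab_handout_test.Rnw" -> "lab_handout_test_instructor.pdf"
versioned_pdf <- function(rnw_file, show_instruct) {
  base <- tools::file_path_sans_ext(basename(rnw_file))
  suffix <- if (show_instruct) "instructor" else "student"
  paste0(base, "_", suffix, ".pdf")
}

# Hypothetical helper: copy the freshly compiled PDF to its versioned name.
# Assumes the PDF sits next to the .Rnw file and shares its base name.
save_versioned_pdf <- function(rnw_file, show_instruct) {
  pdf_in <- paste0(tools::file_path_sans_ext(rnw_file), ".pdf")
  pdf_out <- file.path(dirname(rnw_file),
                       versioned_pdf(rnw_file, show_instruct))
  ok <- file.copy(pdf_in, pdf_out, overwrite = TRUE)
  if (!ok) stop("Could not copy ", pdf_in)
  invisible(pdf_out)
}
```

After compiling, a call such as save_versioned_pdf("lab_handout_test.Rnw", show_instruct = TRUE) would leave the original PDF in place and add a copy with the instructor suffix, provided the second argument matches the showInstruct value used for that compile.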

Users of the knitr package will be familiar with the concept of “chunks”, sections in a .Rnw file that contain R code that knitr will interpret, run, and provide output for in the final output document (a PDF in my case). In between those chunks there will be TeX code, which produces the formatted paragraphs and equations that constitute much of the body of a lab handout. The chunks of R code can include arguments (“chunk options”) to show the R code (echo=), to actually run the R code (eval=), and to include the R output (include=). By using a combination of these chunk arguments, I can insert or hide sections of R code, or output, or figures, or even formatted TeX code that appears as normal text.

Here are several of the different situations where I might want to selectively run, not run, or hide a section of R code or TeX text output.

Case 1: Silent code

Code that you want to silently run, but neither the instructor nor student needs to see the code or results.

This case commonly includes things like setting knitr chunk options, defining variables that will be used in chunk options, or loading datasets that will be used later in the document. For these cases, the chunk option echo is set FALSE so that the R code isn’t shown in the handout, and the option include=FALSE means that no output from the R code is included in the handout. This code will always silently execute when the knitr document is processed into a pdf.

The usual knitr chunk options are an example:

<<setup, echo=FALSE, eval=TRUE, include = FALSE, cache = FALSE>>=
opts_chunk$set(fig.path='figs/', cache.path='cache/graphics-',
        fig.align='center', fig.width = 5, fig.height = 5, fig.show = 'hold',
        cache = TRUE, par = TRUE)
@

In the code above I set some knitr options including a maximum of 60 character width for code printed out on the handout. The relative location of figures is declared (in a subfolder called ‘figs/’), and the default width and height of the figures is declared (the default figure type is a pdf).

The important bit that I make use of throughout the generation of my handouts is declaring an additional variable, showInstruct, that I can use to switch the document output between the student and instructor versions. I declare it in a code chunk:

<<echo=FALSE, eval=TRUE>>=
# Set to TRUE to produce instructor output, FALSE for the student version
showInstruct = TRUE
@

Case 2: Dummy code

In this case, I include a chunk of code that should be displayed for students (and instructors), but won’t necessarily produce viable output.

<<echo=TRUE, eval=FALSE>>=
# This would cause an error if actually executed
model = lm( your dummy ~ formula here, data = your data frame)
@

I normally use the settings above if I want to show a malformed R command, or want to show a properly formed R command but not actually execute it or show the results. Because eval=FALSE, knitr will not run the R code in this chunk. If you purposely include malformed R code, you may see warnings when knitr attempts to process the .Rnw file and make a pdf, but the malformed code should still appear in the output document.
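To see why eval=FALSE matters for a chunk like this, the placeholder line can be handed to R's parser directly. This is a small base R check, separate from anything knitr does; the variable names are only for illustration.

```r
# The placeholder command from the Case 2 chunk, stored as plain text
dummy_code <- "model = lm( your dummy ~ formula here, data = your data frame)"

# parse() only checks syntax; it does not run anything.
# try() keeps the syntax error from stopping the script.
result <- try(parse(text = dummy_code), silent = TRUE)

# TRUE when R could not parse the text as valid code
is_malformed <- inherits(result, "try-error")
print(is_malformed)
```

Since the text is not valid R, evaluating it inside a chunk would fail in the same way, which is why this kind of chunk is displayed but never run.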

Case 3: “Normal” code

In this case, you want to display R commands, and show their output for students (and instructors). This might include plotting commands where you will also need to insert a figure into the document.

<<echo=TRUE, eval=TRUE>>=
# A real bit of R code, assuming data frame df has columns Y and X
model = lm(Y~X, data = df)
summary(model)
@

The above example would display both the R commands, and also the R output of the summary(model) function.
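The Case 3 chunk assumes a data frame df already exists. For trying out the chunk options, a stand-in data set can be simulated first; the values below are made up purely for illustration and do not come from any real lab data.

```r
# Simulated stand-in data (hypothetical values, for illustration only)
set.seed(42)
df <- data.frame(X = seq(1, 20))
df$Y <- 3 + 0.5 * df$X + rnorm(nrow(df), sd = 1)

# Same model call as in the Case 3 chunk
model <- lm(Y ~ X, data = df)

# With echo=TRUE and eval=TRUE, both this call and the printed
# coefficient table would appear in the handout
print(summary(model))
```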

Case 4: Instructor-only evaluated code

For these chunks, you want to show the correct R code to execute (perhaps instead of the dummy code from Case 2), and show the associated R output, only when showInstruct=TRUE.

<<echo=showInstruct, eval=showInstruct>>=
# A real bit of R code, assuming data frame df has columns Y and X
# This would only appear and run if showInstruct=TRUE
model = lm(Y~X, data = df)
@

The R code and results would not appear in the student version of the handout, and it’s worth noting that the underlying R code also wouldn’t be evaluated in the student version. This could cause problems if the R steps in this chunk are needed for later chunks that do appear in the student version. That situation gives rise to Case 5.

Case 5: Always-evaluated instructor code

This is a case where you don’t want to reveal R code in the student version, but it is necessary to execute some R code at this juncture of the .Rnw file so that later R chunks in the student version work properly (so it must evaluate in both instructor and student versions).

<<echo=showInstruct, eval=TRUE, include=showInstruct>>=
# A real bit of R code, assuming data frame df has columns Y and X
model = lm(Y~X, data = df)
@

Notice above that echo=showInstruct, so this R code would only be printed in the instructor version of the handout, but eval=TRUE, so this code chunk will always evaluate in both the student and instructor versions. There is now the extra chunk option, include=showInstruct that will take care of any R output (or a figure) that might be produced by the R code. The include option tells knitr whether to show the R output or not, so in this case it will only show output when showInstruct is true. This extra option can be left out if you’re not running R code that produces output on the console or produces a figure.
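The option combinations used in Cases 1 through 5 can be collected in one place. The function below is a hypothetical lookup written for this summary, not a knitr function; it returns the echo, eval, and include values each case ends up with for a given showInstruct. Where a case does not set include, knitr's default of TRUE is filled in.

```r
# Hypothetical lookup: effective chunk options for Cases 1 to 5
case_options <- function(case, showInstruct) {
  switch(as.character(case),
    "1" = list(echo = FALSE, eval = TRUE, include = FALSE),  # silent code
    "2" = list(echo = TRUE, eval = FALSE, include = TRUE),   # dummy code
    "3" = list(echo = TRUE, eval = TRUE, include = TRUE),    # normal code
    "4" = list(echo = showInstruct, eval = showInstruct,
               include = TRUE),                              # instructor-only
    "5" = list(echo = showInstruct, eval = TRUE,
               include = showInstruct),                      # always evaluated
    stop("unknown case: ", case)
  )
}

# Print what the student version (showInstruct = FALSE) would do
for (k in 1:5) {
  opts <- case_options(k, showInstruct = FALSE)
  cat("Case", k, ": echo =", opts$echo, " eval =", opts$eval,
      " include =", opts$include, "\n")
}
```

Comparing Cases 4 and 5 in this form makes the difference plain: only Case 5 keeps eval at TRUE when showInstruct is FALSE.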

Case 6: Instructor-only TeX code

In this case, you can have formatted text (not R code or R output, but just normal paragraph-style text like that you write outside your R chunks) that only shows up in the Instructor version of the handout, when showInstruct=TRUE.

This kind of chunk requires some specialized formatting, because the goal is to have knitr output TeX code that is secondarily read by your pdflatex interpreter to create nicely formatted text (or equations). You do not normally want to echo the raw R code from these chunks, you only want to have the TeX text inserted into the TeX document that knitr creates before generating a pdf. To accomplish this, I use three chunk options: echo=FALSE (since I never want to see the R code that produced this output), eval=showInstruct (since I only want the TeX text to appear in the document when I have set showInstruct to be true), and results='asis', which tells knitr to return the raw TeX results from the R code chunk, which in this case will be properly formatted TeX code that the pdflatex engine can operate on to produce nicely formatted text.

<<echo=FALSE,eval=showInstruct, results = 'asis'>>=
# Print the instructor answers as TeX; this only runs if showInstruct is TRUE
cat("
\\textbf{Instructor answers:}
\\begin{enumerate}
\\item The mean difference between groups was 15.1 cm.
\\item The standard deviation was 5.6 cm.
\\item We're going to need a bigger boat.
\\end{enumerate}
")
@

In the example chunk above, I call the R function cat(), and have it print out a text string enclosed by the double quotes. Inside those quotes, I write normal text and TeX commands, and this text string ends up becoming TeX code when knitr processes it. Notice that unlike normal TeX/LaTeX commands that have one backslash, the commands inside my cat() function are preceded with two backslashes. This is so that R does not try to interpret the single backslash as the start of an escape sequence (like the newline \n). The double backslash escapes the TeX command so that it prints out of the chunk as properly formatted TeX code:

\textbf{Instructor answers:}
\begin{enumerate}
\item The mean difference between groups was 15.1 cm.
\item The standard deviation was 5.6 cm.
\item We're going to need a bigger boat.
\end{enumerate}

The above is what will show up in the TeX document that knitr creates from the .Rnw file, and those lines will be interpreted by pdflatex as normal TeX commands. The output in the instructor’s pdf document would then look something like:

Instructor answers:

  1. The mean difference between groups was 15.1 cm.
  2. The standard deviation was 5.6 cm.
  3. We’re going to need a bigger boat.
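The double-backslash behavior described for Case 6 can be checked directly at the R console, outside of knitr. This is a short sketch in base R; the raw-string line at the end is an alternative that only works in R 4.0.0 or later and is not something the handouts rely on.

```r
# In R source, "\\" inside a string literal is ONE backslash character
tex_line <- "\\item The standard deviation was 5.6 cm."

# cat() writes the string's actual characters, so the text that reaches
# the .tex file starts with a single backslash
cat(tex_line, "\n", sep = "")

# The literal "\\item" holds 5 characters: \ i t e m
print(nchar("\\item"))

# R >= 4.0.0 only: a raw string needs no doubling and is the same string
print(identical(r"(\item)", "\\item"))
```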

Figures in the student and instructor version

Including a figure in the student version (which will also appear in the instructor version) is as simple as using the Case 3 “Normal” chunk options. However, I usually include the extra argument fig.show='hide' to avoid inserting the image right where the R code is executed. The figure that is created and saved to disk is given the same name as the chunk name (in this case it is called cholBoxPlot):

<<cholBoxPlot, echo=TRUE, eval=TRUE, fig.show='hide'>>=
boxplot(CholChange~Exercise+Treatment, data = chol, las = 1,
        ylab = 'Cholesterol Change')
@

I usually follow that R chunk with some TeX code to insert the figure that was created and stored in my figs/ directory. The figure is centered on the page, scaled to fit the width of the page (or smaller), and a caption is added beneath it. The TeX code, which is written outside of an R chunk in the .Rnw file, looks like this:

\caption{A boxplot of the various treatment combinations. (You may need to
stretch out the width of your plot to see all the x-axis labels)}

Figures in the instructor version only

If you want to generate a figure and show it only in the instructor version, you can use a combination of the Case 4 and Case 6 chunk options.

First, I write a code chunk to generate the figure, but not insert it yet. The fig.show='hide' chunk option keeps the figure hidden (but it is still written to a file on disk).

<<interplotFishInstruct,echo=showInstruct, eval=showInstruct, fig.show='hide'>>=
# Instructor code - make an interaction plot for the fish data
interaction.plot(x.factor = pc$care, trace.factor = pc$o2f,
        response = pc$Hatching, type = 'b', pch = c(19,21),
        xlab = 'Care Treatment', ylab = 'Mean Hatching Success, %',
        las = 1)
@

Second, I use the Case 6 chunk options to generate TeX code that will only show up in the instructor version of the .tex and .pdf output files. Notice once again that all of the TeX commands are escaped with a double backslash so that knitr passes them through properly. They are all contained inside the cat(" ") R function that is spread across several lines. The knitr processor will just return the formatted TeX code.

\\caption{Instructor output. Interaction plot.}

That produces TeX code for inserting a figure from the figs/ directory. In the .tex file that knitr produces, you would find the normal-looking TeX code:

\caption{Instructor output. Interaction plot.}

and the PDF would have the figure and associated caption inserted.
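Most of the TeX inside such a cat() call is the same for every figure, so it can be produced by a small function. This is a sketch under assumptions rather than the code used in the handouts: the function name figure_tex, the [h] placement, and the \textwidth scaling are all illustrative choices. The file name is an assumption too, so check the figs/ directory for the real name, since some knitr versions append a suffix such as -1 to the chunk label.

```r
# Hypothetical helper: write a LaTeX figure environment to the output.
# Intended for a chunk with echo=FALSE, eval=showInstruct, results='asis'.
figure_tex <- function(file, caption, width = "\\textwidth") {
  cat("\\begin{figure}[h]\n",
      "\\centering\n",
      "\\includegraphics[width=", width, "]{", file, "}\n",
      "\\caption{", caption, "}\n",
      "\\end{figure}\n",
      sep = "")
}

# Assumed file name, based on fig.path='figs/' and the chunk label
figure_tex("figs/interplotFishInstruct",
           "Instructor output. Interaction plot.")
```

Because the strings are ordinary R string literals, every TeX command still needs the doubled backslash, and each one is written to the .tex file with a single backslash.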

Cache issues

One thing to watch out for is caching. knitr does not cache by default, but the setup chunk in the Case 1 example sets cache = TRUE for the whole document, so knitr stores the results of every R chunk to speed up later compiles. After your first compile of the file, any existing R chunks will not be re-run unless you make a change inside the chunk, or delete the cache. This could cause problems if any of your R chunks act differently depending on whether showInstruct is true or false, but the chunks themselves don’t change when you switch from true to false. In these cases, you have two options. First is to delete the knitr cache and re-compile the .Rnw file, which will force all of the R chunks to be re-evaluated. The other option is to include the cache=FALSE option in any individual chunks that should always be re-evaluated. If you look back at the Case 1 example at the top of this page, you’ll see that the first chunk is set to cache=FALSE just in case.
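Deleting the cache can be done from R as well as by hand. The sketch below assumes the cache lives in a cache/ folder next to the .Rnw file, which follows from cache.path='cache/graphics-' in the Case 1 chunk; the function name clear_knitr_cache is hypothetical.

```r
# Hypothetical helper: remove a knitr cache directory so every chunk re-runs.
# dir defaults to "cache", matching cache.path='cache/graphics-' in Case 1.
clear_knitr_cache <- function(dir = "cache") {
  if (!dir.exists(dir)) {
    return(invisible(FALSE))   # nothing to delete
  }
  unlink(dir, recursive = TRUE)
  invisible(!dir.exists(dir))  # TRUE when the directory is gone
}
```

Calling clear_knitr_cache() from the folder that holds the .Rnw file, just before switching showInstruct and recompiling, would force every chunk to be evaluated again.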

Example code and output

To see an example .Rnw file with some of these techniques used, along with the output student and instructor PDF files, go to my GitHub repository linked here. In that directory you will find a .Rnw file that knitr uses to produce the .tex file, which is then fed to pdflatex to produce the final PDF file. I have saved two versions of the PDF file, Student and Instructor. They were both generated from the lab_handout_test.Rnw file by changing the value of showInstruct on line 63 of the file.