Purpose

This document introduces R Markdown and shows examples of how to combine code and text in one document that you can share.

Global code and functions

# Load packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(knitr)
library(here)
## here() starts at C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022

Import Data

Let’s import some example fish count data collected by YBFMP.

df_monthly <- read_csv(here("materials/Monthly_Catch_CPUE.csv"))
## Rows: 557 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (1): Gear
## dbl  (5): year, month, effort, mo.count, CPUE
## date (1): Date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

These data contain monthly fish counts and catch per unit effort (CPUE) for three gear types (FKTR, RSTR, BSEIN) from 1998 to 2020.


A word about working directories and .Rmd files:
When an .Rmd file is knitted, the default working directory is the directory that contains the .Rmd file. This can be confusing inside an R project, where the default working directory is the root directory of the project. As a result, code can run under one working directory when you execute chunks interactively and under another when you knit the file. The here R package helps with working directories and file paths. We used it in the code chunk above to define the file path to the fish data. The here::here() function points to the root directory of an R project regardless of the current working directory. For example:

# Current working directory
getwd()
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022/materials"
# Working directory defined by here::here()
here()
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022"

You can continue the file path within the here::here() function as a character string, which is appended to the root directory of the R project. For example, here is the file path to the fish data imported above:

# Define file path to fish data relative to the root directory using here::here()
here("materials/Monthly_Catch_CPUE.csv")
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022/materials/Monthly_Catch_CPUE.csv"
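
As a small variation on the call above, here::here() also accepts the path as separate components and joins them under the project root, which avoids typing separators by hand. The sketch below assumes it is run inside the tutorial’s R project; the variable name fish_path is only illustrative.

```r
library(here)

# Pass each folder and file name as its own argument;
# here() joins them onto the project root
fish_path <- here("materials", "Monthly_Catch_CPUE.csv")

# Confirm the file exists before trying to read it
file.exists(fish_path)
```

Because the path is anchored to the project root, it points to the same location whether the code runs interactively or while knitting.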

Using Code Chunks

Let’s practice using code chunks to create and print a plot displaying monthly CPUE by gear type.

plt_cpue_month <- df_monthly %>% 
  ggplot(aes(x = Date, y = CPUE)) +
  geom_col() +
  facet_wrap(vars(Gear), ncol = 1, scales = "free_y") + 
  theme_bw()

plt_cpue_month

Code Chunk Options

Now that we’ve seen how to use code chunks in an .Rmd document, let’s take a look at some commonly used code chunk options that control whether and how a code chunk and its results are displayed.

message, warning

Notice that when we imported the fish count data above, a message was printed in the output. If we don’t want this message to display, we can set message = FALSE for the code chunk.

df_monthly <- read_csv(here("materials/Monthly_Catch_CPUE.csv"))

Now the message doesn’t display, but the code was still executed. The same applies to warnings: set warning = FALSE to hide them.
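
In the .Rmd source, the option goes in the chunk header, for example {r, message = FALSE}. The sketch below is one way to see the effect without knitting the whole tutorial: it knits two tiny in-memory documents with knitr::knit(text = ...) and checks whether the message reaches the output. The helper knit_snippet() and the message text are made up for this illustration, and the code is meant to be run from the console rather than inside a document that is itself being knitted.

```r
library(knitr)

# Hypothetical helper: knit a single in-memory chunk with the given
# header options and return the rendered markdown
knit_snippet <- function(options = "") {
  rmd <- c(
    paste0("```{r", options, "}"),
    "message('Rows: 557 Columns: 7')",
    "```"
  )
  knit(text = rmd, quiet = TRUE, envir = new.env())
}

out_default <- knit_snippet()
out_quiet   <- knit_snippet(", message = FALSE")

# Messages appear in knitted output prefixed with "## ", which separates
# them from the echoed source code
has_message <- function(out) any(grepl("## Rows: 557", out, fixed = TRUE))

has_message(out_default)  # default options: message is part of the output
has_message(out_quiet)    # message = FALSE: code still runs, message is dropped
```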

eval

If you want the code to appear in the output file but NOT run when the Rmd file is knitted, set eval = FALSE as a code chunk option. For example, you may want to show the code that summarizes the count and CPUE data as annual totals in a new data frame, df_annual, and plots the annual CPUE as plt_cpue_yr, without actually running that code when you knit the file.

# Summarize count and CPUE as annual totals
df_annual <- df_monthly %>% 
  group_by(year, Gear) %>% 
  summarize(across(c(mo.count, CPUE), sum), .groups = "drop") %>%
  rename(
    Year = year,
    Count = mo.count
  )

# Plot annual CPUE
plt_cpue_yr <- df_annual %>% 
  ggplot(aes(x = Year, y = CPUE)) +
  geom_col() +
  facet_wrap(vars(Gear), ncol = 1, scales = "free_y") + 
  theme_bw()

plt_cpue_yr

Now if you try to print df_annual or plt_cpue_yr, you’ll get an error because the code wasn’t executed when the document was knitted.

# Show glimpse of df_annual
glimpse(df_annual)
## Error in glimpse(df_annual): object 'df_annual' not found
# Print plot of annual CPUE
plt_cpue_yr
## Error in eval(expr, envir, enclos): object 'plt_cpue_yr' not found

include

If you want the code to be executed when the Rmd file is knitted, but neither the code nor its results to appear in the output file, set include = FALSE as a code chunk option. For example, let’s summarize and plot the CPUE data using the same code shown above, but this time we’ll use include = FALSE so the code and results won’t appear in the output document.

Even though the code and results aren’t displayed in the document, the df_annual data frame and plt_cpue_yr plot object are available to be used within the Rmd file.

# Show glimpse of df_annual
glimpse(df_annual)
## Rows: 67
## Columns: 4
## $ Year  <dbl> 1998, 1998, 1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002…
## $ Gear  <chr> "BSEIN", "RSTR", "BSEIN", "FKTR", "BSEIN", "FKTR", "RSTR", "BSEI…
## $ Count <dbl> 982, 32724, 13071, 260, 8916, 1830, 16595, 6138, 3579, 19090, 53…
## $ CPUE  <dbl> 5.3224932, 187.6711599, 12.7715195, 0.3697516, 11.8979680, 3.299…
# Print plot of annual CPUE
plt_cpue_yr

echo

If you want the code to be executed and its results displayed in the output file when the Rmd is knitted, but the code itself hidden, set echo = FALSE as a code chunk option. For example, maybe you want glimpse(df_annual) and the plot of annual CPUE to display without showing the code that produced them.

## Rows: 67
## Columns: 4
## $ Year  <dbl> 1998, 1998, 1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002…
## $ Gear  <chr> "BSEIN", "RSTR", "BSEIN", "FKTR", "BSEIN", "FKTR", "RSTR", "BSEI…
## $ Count <dbl> 982, 32724, 13071, 260, 8916, 1830, 16595, 6138, 3579, 19090, 53…
## $ CPUE  <dbl> 5.3224932, 187.6711599, 12.7715195, 0.3697516, 11.8979680, 3.299…
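
The three options above differ only in what runs and what is shown. In the .Rmd source each one goes in the chunk header, for example {r, eval = FALSE}. As a rough side-by-side check, the sketch below knits the same two-line chunk under each option with knitr::knit(text = ...) and records whether the code and its printed result appear in the rendered output. The helper render_chunk() and the toy chunk contents are hypothetical, and the code is intended for the console, not for a document that is already being knitted.

```r
library(knitr)

# Hypothetical helper: knit one in-memory chunk with the given header option
# and return the rendered markdown
render_chunk <- function(option) {
  rmd <- c(
    paste0("```{r, ", option, "}"),
    "demo_value <- 1 + 1",
    "demo_value",
    "```"
  )
  knit(text = rmd, quiet = TRUE, envir = new.env())
}

options_to_compare <- c("eval = FALSE", "include = FALSE", "echo = FALSE")
rendered <- lapply(options_to_compare, render_chunk)

# For each option, record whether the code and its printed result
# made it into the rendered output
comparison <- data.frame(
  option = options_to_compare,
  code_shown = sapply(rendered, function(out) {
    any(grepl("demo_value <- 1 + 1", out, fixed = TRUE))
  }),
  result_shown = sapply(rendered, function(out) {
    any(grepl("## [1] 2", out, fixed = TRUE))
  })
)
comparison
```

The resulting table should match the descriptions above: eval = FALSE shows the code only, include = FALSE shows neither, and echo = FALSE shows the result only.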

Figure output options

The default figure dimensions in a rendered R Markdown document are 7 inches wide and 5 inches tall. Sometimes you’ll want to change these, which can be done using the fig.height and fig.width code chunk options. Let’s change the dimensions of the annual CPUE figure to 6 by 6 inches.

plt_cpue_yr

Now, maybe we want the figure to be aligned on the right side of the knitted document. In this case, we’ll use the fig.align code chunk option.

plt_cpue_yr

If we want to add a caption to our figure, we’ll use the fig.cap code chunk option.

plt_cpue_yr
Figure 1: This is our annual CPUE figure.


Finally, if we want to add alt text to our figure, we’ll use the fig.alt option.

plt_cpue_yr

This figure shows the annual CPUE for each gear type used by the YBFMP from 1998-2020.
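
In the .Rmd source, each of these options goes in the header of the chunk that draws the figure, for example {r, fig.width = 6, fig.height = 6, fig.align = "right"}. When most figures in a document should share the same settings, one common alternative is to set them once near the top with knitr::opts_chunk$set(), as sketched below. The specific values are only examples, and captions and alt text are usually left to the individual chunks because they differ per figure.

```r
library(knitr)

# Set document-wide defaults for all later chunks (usually placed in a setup chunk)
opts_chunk$set(
  fig.width = 6,       # width in inches
  fig.height = 6,      # height in inches
  fig.align = "right"  # one of "default", "left", "center", "right"
)

# Inspect the defaults now in effect
opts_chunk$get(c("fig.width", "fig.height", "fig.align"))
```

An option written in an individual chunk header still overrides these defaults for that chunk.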

Tables

This is what it looks like when you print a data frame itself.

df_annual
## # A tibble: 67 × 4
##     Year Gear  Count    CPUE
##    <dbl> <chr> <dbl>   <dbl>
##  1  1998 BSEIN   982   5.32 
##  2  1998 RSTR  32724 188.   
##  3  1999 BSEIN 13071  12.8  
##  4  1999 FKTR    260   0.370
##  5  2000 BSEIN  8916  11.9  
##  6  2000 FKTR   1830   3.30 
##  7  2000 RSTR  16595  30.5  
##  8  2001 BSEIN  6138  24.7  
##  9  2001 FKTR   3579   4.98 
## 10  2001 RSTR  19090  28.1  
## # … with 57 more rows

knitr::kable() offers a simple format for displaying tables in a rendered R Markdown document. For example, here is the table of annual count and CPUE data printed as a kable.

kable(df_annual, caption = "Annual Counts and CPUE of YBFMP survey data")
Annual Counts and CPUE of YBFMP survey data
Year Gear Count CPUE
1998 BSEIN 982 5.3224932
1998 RSTR 32724 187.6711599
1999 BSEIN 13071 12.7715195
1999 FKTR 260 0.3697516
2000 BSEIN 8916 11.8979680
2000 FKTR 1830 3.2997186
2000 RSTR 16595 30.4987486
2001 BSEIN 6138 24.7352013
2001 FKTR 3579 4.9793030
2001 RSTR 19090 28.0687376
2002 BSEIN 5386 26.4395986
2002 FKTR 2223 3.2344627
2002 RSTR 42362 63.1653017
2003 BSEIN 455 4.6880205
2003 FKTR 3311 4.9792709
2003 RSTR 13383 20.3652883
2004 BSEIN 1261 6.7650673
2004 FKTR 4369 6.5649613
2004 RSTR 19179 28.6243865
2005 BSEIN 611 4.1804482
2005 FKTR 2029 3.0151784
2005 RSTR 12181 20.4830891
2006 BSEIN 4988 14.9033394
2006 FKTR 1374 3.0068944
2006 RSTR 27915 65.6020561
2007 BSEIN 6205 6.7106441
2007 FKTR 5174 7.6289034
2007 RSTR 34692 55.4470753
2008 BSEIN 13144 11.9853387
2008 FKTR 2788 4.4703717
2008 RSTR 12329 46.9071342
2009 BSEIN 8144 19.9055054
2009 FKTR 3754 6.9650768
2009 RSTR 24944 58.3251513
2010 BSEIN 8889 6.8781580
2010 FKTR 1990 6.4066673
2010 RSTR 17814 73.4114551
2011 BSEIN 30131 11.0662723
2011 FKTR 1726 5.8182464
2011 RSTR 17761 59.5595626
2012 BSEIN 38990 16.7569638
2012 FKTR 2438 7.2846733
2012 RSTR 24999 90.8836115
2013 BSEIN 19977 10.3254081
2013 FKTR 1394 4.2841587
2013 RSTR 21361 60.8719818
2014 BSEIN 13941 8.2754013
2014 FKTR 1756 7.2168245
2014 RSTR 25016 71.7300402
2015 BSEIN 11075 6.3747209
2015 FKTR 2079 6.0665883
2015 RSTR 12612 33.1259513
2016 BSEIN 19600 10.5877517
2016 FKTR 2996 11.5283032
2016 RSTR 18675 64.6274886
2017 BSEIN 26777 13.6441580
2017 FKTR 4436 18.6973393
2017 RSTR 10283 68.1595985
2018 BSEIN 11688 7.3778899
2018 FKTR 6776 20.9096576
2018 RSTR 3212 9.5898324
2019 BSEIN 18541 14.2690841
2019 FKTR 5284 17.1477588
2019 RSTR 33746 119.1001649
2020 BSEIN 932 0.5414329
2020 FKTR 1851 6.4073520
2020 RSTR 2188 7.3031628

This is pretty nice compared to just printing the data frame itself. However, the kable formatting can be improved. The knitr::kable() function allows for some minor formatting such as rounding digits and column alignment.

kable(
  df_annual, 
  caption = "Annual Counts and CPUE of YBFMP survey data",
  # Round numeric columns to at most 1 decimal place (only CPUE has decimals)
  digits = 1,
  # Align all columns to the left
  align = "l"
)
Annual Counts and CPUE of YBFMP survey data
Year Gear Count CPUE
1998 BSEIN 982 5.3
1998 RSTR 32724 187.7
1999 BSEIN 13071 12.8
1999 FKTR 260 0.4
2000 BSEIN 8916 11.9
2000 FKTR 1830 3.3
2000 RSTR 16595 30.5
2001 BSEIN 6138 24.7
2001 FKTR 3579 5.0
2001 RSTR 19090 28.1
2002 BSEIN 5386 26.4
2002 FKTR 2223 3.2
2002 RSTR 42362 63.2
2003 BSEIN 455 4.7
2003 FKTR 3311 5.0
2003 RSTR 13383 20.4
2004 BSEIN 1261 6.8
2004 FKTR 4369 6.6
2004 RSTR 19179 28.6
2005 BSEIN 611 4.2
2005 FKTR 2029 3.0
2005 RSTR 12181 20.5
2006 BSEIN 4988 14.9
2006 FKTR 1374 3.0
2006 RSTR 27915 65.6
2007 BSEIN 6205 6.7
2007 FKTR 5174 7.6
2007 RSTR 34692 55.4
2008 BSEIN 13144 12.0
2008 FKTR 2788 4.5
2008 RSTR 12329 46.9
2009 BSEIN 8144 19.9
2009 FKTR 3754 7.0
2009 RSTR 24944 58.3
2010 BSEIN 8889 6.9
2010 FKTR 1990 6.4
2010 RSTR 17814 73.4
2011 BSEIN 30131 11.1
2011 FKTR 1726 5.8
2011 RSTR 17761 59.6
2012 BSEIN 38990 16.8
2012 FKTR 2438 7.3
2012 RSTR 24999 90.9
2013 BSEIN 19977 10.3
2013 FKTR 1394 4.3
2013 RSTR 21361 60.9
2014 BSEIN 13941 8.3
2014 FKTR 1756 7.2
2014 RSTR 25016 71.7
2015 BSEIN 11075 6.4
2015 FKTR 2079 6.1
2015 RSTR 12612 33.1
2016 BSEIN 19600 10.6
2016 FKTR 2996 11.5
2016 RSTR 18675 64.6
2017 BSEIN 26777 13.6
2017 FKTR 4436 18.7
2017 RSTR 10283 68.2
2018 BSEIN 11688 7.4
2018 FKTR 6776 20.9
2018 RSTR 3212 9.6
2019 BSEIN 18541 14.3
2019 FKTR 5284 17.1
2019 RSTR 33746 119.1
2020 BSEIN 932 0.5
2020 FKTR 1851 6.4
2020 RSTR 2188 7.3
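
The digits and align arguments of knitr::kable() can also take one value per column, which helps when columns need different treatment. The sketch below uses a two-row stand-in, df_demo, copied from the first rows of the annual table so that it runs on its own; the choice of right-aligning the numeric columns is only an example.

```r
library(knitr)

# Two rows copied from the annual table above, so the sketch runs on its own
df_demo <- data.frame(
  Year = c(1998, 1998),
  Gear = c("BSEIN", "RSTR"),
  Count = c(982, 32724),
  CPUE = c(5.3224932, 187.6711599)
)

tbl_demo <- kable(
  df_demo,
  format = "pipe",
  # One value per column: only CPUE keeps a decimal place
  digits = c(0, 0, 0, 1),
  # Text-like columns left-aligned, count and CPUE right-aligned
  align = c("l", "l", "r", "r")
)
tbl_demo
```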

The kableExtra R package allows for many more formatting options for kables. As of version 1.2, the author of kableExtra recommends using the kbl() function to create the kable.

library(kableExtra)
## 
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
## 
##     group_rows
df_annual %>%
  kbl(
    caption = "Annual Counts and CPUE of YBFMP survey data",
    # Round numeric columns to at most 1 decimal place (only CPUE has decimals)
    digits = 1,
    # Align all columns to the left
    align = "l"
  ) %>% 
  kable_styling(
    # appearance option
    bootstrap_options = c("striped", "hover"),
    # change width
    full_width = FALSE,
    # change position
    position = "left",
    # fix the header row at the top when scrolling, useful for longer tables
    fixed_thead = TRUE
  ) %>%
  # Format the header row (0) to be aligned in the center
  row_spec(0, align = "center") %>%
  # Make the Year column bold
  column_spec(1, bold = TRUE) %>%
  # add a scroll box for easier viewing
  scroll_box(width = "40%", height = "550px")
Annual Counts and CPUE of YBFMP survey data
Year Gear Count CPUE
1998 BSEIN 982 5.3
1998 RSTR 32724 187.7
1999 BSEIN 13071 12.8
1999 FKTR 260 0.4
2000 BSEIN 8916 11.9
2000 FKTR 1830 3.3
2000 RSTR 16595 30.5
2001 BSEIN 6138 24.7
2001 FKTR 3579 5.0
2001 RSTR 19090 28.1
2002 BSEIN 5386 26.4
2002 FKTR 2223 3.2
2002 RSTR 42362 63.2
2003 BSEIN 455 4.7
2003 FKTR 3311 5.0
2003 RSTR 13383 20.4
2004 BSEIN 1261 6.8
2004 FKTR 4369 6.6
2004 RSTR 19179 28.6
2005 BSEIN 611 4.2
2005 FKTR 2029 3.0
2005 RSTR 12181 20.5
2006 BSEIN 4988 14.9
2006 FKTR 1374 3.0
2006 RSTR 27915 65.6
2007 BSEIN 6205 6.7
2007 FKTR 5174 7.6
2007 RSTR 34692 55.4
2008 BSEIN 13144 12.0
2008 FKTR 2788 4.5
2008 RSTR 12329 46.9
2009 BSEIN 8144 19.9
2009 FKTR 3754 7.0
2009 RSTR 24944 58.3
2010 BSEIN 8889 6.9
2010 FKTR 1990 6.4
2010 RSTR 17814 73.4
2011 BSEIN 30131 11.1
2011 FKTR 1726 5.8
2011 RSTR 17761 59.6
2012 BSEIN 38990 16.8
2012 FKTR 2438 7.3
2012 RSTR 24999 90.9
2013 BSEIN 19977 10.3
2013 FKTR 1394 4.3
2013 RSTR 21361 60.9
2014 BSEIN 13941 8.3
2014 FKTR 1756 7.2
2014 RSTR 25016 71.7
2015 BSEIN 11075 6.4
2015 FKTR 2079 6.1
2015 RSTR 12612 33.1
2016 BSEIN 19600 10.6
2016 FKTR 2996 11.5
2016 RSTR 18675 64.6
2017 BSEIN 26777 13.6
2017 FKTR 4436 18.7
2017 RSTR 10283 68.2
2018 BSEIN 11688 7.4
2018 FKTR 6776 20.9
2018 RSTR 3212 9.6
2019 BSEIN 18541 14.3
2019 FKTR 5284 17.1
2019 RSTR 33746 119.1
2020 BSEIN 932 0.5
2020 FKTR 1851 6.4
2020 RSTR 2188 7.3


You can do so much more with the kableExtra R package. Check out the package documentation for more details and options.

Interactive Elements

There are a whole lot of R packages that allow for some really nice interactive widgets in R Markdown. We’ll cover three of these that I’ve used and found helpful.

datatable

You can convert a data frame to an interactive table using the DT::datatable() function.

library(DT)
datatable(df_annual)

There are also some options that you can set within the DT::datatable() function.

df_annual %>% 
  # Convert Year and Gear variables to factors to allow for select lists in the
  # datatable filters
  mutate(across(c(Year, Gear), factor)) %>% 
  datatable(
    # remove rownames
    rownames = FALSE,
    # add column filters
    filter = "top",
    options = list(autoWidth = TRUE)
  )

plotly

You can create interactive plots using the plotly R package. The plotly::ggplotly() function converts ggplot2 objects into interactive plots. I found that it works best with less complicated plots.

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
ggplotly(plt_cpue_yr)

leaflet

You can make interactive maps using the leaflet R package. First we’ll download the fish monitoring stations for YBFMP from the EDI data package.

df_stations <- read_csv("https://portal.edirepository.org/nis/dataviewer?packageid=edi.233.3&entityid=89146f1382d7dfa3bbf3e4b1554eb5cc")
## Rows: 26 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): StationCode, StationName, StationNumber, PeriodOfRecordFrom, Period...
## dbl (2): Latitude, Longitude
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Convert to sf object
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
sf_stations <- df_stations %>% st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)

Next, we’ll create an interactive map of the sampling locations.

library(leaflet)
sf_stations %>% 
  leaflet() %>% 
  addTiles() %>%
  addCircleMarkers(
    radius = 5,
    fillOpacity = 0.8,
    weight = 0.5,
    color = "black",
    opacity = 1
  )

Now, we’ll add labels for each station and color code the markers by the MethodCode (BSEIN, FKTR, RSTR).

# Define color palette for MethodCode
color_pal <- colorFactor(palette = "viridis", domain = sf_stations$MethodCode)

sf_stations %>% 
  leaflet() %>% 
  addTiles() %>%
  addCircleMarkers(
    radius = 5,
    fillColor = ~color_pal(MethodCode),
    fillOpacity = 0.8,
    weight = 0.5,
    color = "black",
    opacity = 1,
    label = paste0("Station Code: ", sf_stations$StationCode)
  ) %>% 
  addLegend(
    position = "topright",
    pal = color_pal,
    values = sf_stations$MethodCode,
    title = "Gear Type:"
  )

Adding Tabs

A feature that I’ve found to be useful is using tabs in a rendered R Markdown document. Here is how you would set up the document to display the two CPUE plots and the stations map in tabs for easier navigation. Apply the {.tabset} class attribute to the ## YBFMP Data Summary header below like this: ## YBFMP Data Summary {.tabset}. Then, each of the sub-headers of this header will appear as tabs instead of standalone sections. Use another header of the same level as ## YBFMP Data Summary (level 2) to stop using tab navigation.

YBFMP Data Summary

Monthly CPUE

Annual CPUE

Stations Map

Tab formatting

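Add .tabset-fade and/or .tabset-pills to the {.tabset} class attribute to change the appearance and behavior of the tabs. For example, ## YBFMP Data Summary {.tabset .tabset-fade .tabset-pills} makes the tabs fade in and out when you switch between them and displays them as pills.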

