Introduce and display examples of how to use R Markdown to bring together code and text in one document to share your work.
# Load packages
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.5
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(knitr)
library(here)
## here() starts at C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022
Let’s import some example fish count data collected by YBFMP.
df_monthly <- read_csv(here("materials/Monthly_Catch_CPUE.csv"))
## Rows: 557 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Gear
## dbl (5): year, month, effort, mo.count, CPUE
## date (1): Date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
This dataset contains monthly fish counts and catch per unit effort (CPUE) for three different gear types (FKTR, RSTR, BSEIN) from 1998 to 2020.
A word about working directories and .Rmd files:
The default working directory when an .Rmd file is knitted is the .Rmd
document directory. This can get confusing when you are using an R
project, in which the default working directory is the root directory of
the R project. So it’s possible to have different working directories
when you are running code from code blocks in an .Rmd file versus when
you knit the .Rmd file. A useful R package to help with working
directories and file paths is the here R package. You can
see that we’re using it in the code chunk above to clearly define the
file path for the fish data. The here::here()
function points to the root directory of an R project regardless of what
the current working directory is. For example:
# Current working directory
getwd()
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022/materials"
# Working directory defined by here::here()
here()
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022"
You can continue the file path within the here::here()
function as a character string, which is appended to the
root directory of the R project. For example, here is the file path to
the fish data imported above:
# Define file path to fish data relative to the root directory using here::here()
here("materials/Monthly_Catch_CPUE.csv")
## [1] "C:/Repositories/05_EMRR_Org/RMarkdown-GitHub-tutorial2022/materials/Monthly_Catch_CPUE.csv"
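The path can also be passed to here::here() as separate components rather than one string. The short sketch below compares the two forms; the object names path_single and path_parts are made up for this illustration.

```r
library(here)

# Path given as a single string, as in the code above
path_single <- here("materials/Monthly_Catch_CPUE.csv")

# Same path built from separate folder and file name components
path_parts <- here("materials", "Monthly_Catch_CPUE.csv")

# Compare the two results
identical(path_single, path_parts)
```

Passing components separately avoids typing path separators by hand, which can be convenient when folder names are stored in variables.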
Let’s practice using code chunks to create and print a plot displaying monthly CPUE by gear type.
plt_cpue_month <- df_monthly %>%
  ggplot(aes(x = Date, y = CPUE)) +
  geom_col() +
  facet_wrap(vars(Gear), ncol = 1, scales = "free_y") +
  theme_bw()
plt_cpue_month
Now that we’ve seen how to use code chunks in an .Rmd document, let’s take a look at some commonly used code chunk options that configure if and how the code chunk and its results are displayed.
Notice when we imported the fish count data above, there is a message
printed in the output. If we don’t want this message to display, we can
set message = FALSE for the code chunk.
df_monthly <- read_csv(here("materials/Monthly_Catch_CPUE.csv"))
Now the message doesn’t display, but the code was still executed. The same applies to warnings, which can be hidden by setting warning = FALSE.
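In the .Rmd source, chunk options go inside the braces of the chunk header, separated by commas. As a sketch, the chunk above could be written like this (warning = FALSE is included only to show how a second option is listed):

````markdown
```{r, message = FALSE, warning = FALSE}
# Runs as usual, but messages and warnings are left out of the knitted output
df_monthly <- read_csv(here("materials/Monthly_Catch_CPUE.csv"))
```
````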
If you want the code to appear in the output file, but for the
code to NOT run when the Rmd file is knitted, set
eval = FALSE as a code chunk option. For example, you may
want to show the code you used to summarize the count and CPUE data as
annual totals in a new data frame df_annual and plot the
CPUE results saving it as plt_cpue_yr, but not actually run
the code when you knit the file.
# Summarize count and CPUE as annual totals
df_annual <- df_monthly %>%
  group_by(year, Gear) %>%
  summarize(across(c(mo.count, CPUE), sum), .groups = "drop") %>%
  rename(
    Year = year,
    Count = mo.count
  )
# Plot annual CPUE
plt_cpue_yr <- df_annual %>%
  ggplot(aes(x = Year, y = CPUE)) +
  geom_col() +
  facet_wrap(vars(Gear), ncol = 1, scales = "free_y") +
  theme_bw()
plt_cpue_yr
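As a sketch of how this looks in the .Rmd source, here is a chunk header with eval = FALSE, using a shortened version of the code above as the chunk body:

````markdown
```{r, eval = FALSE}
# Shown in the knitted output, but not run
df_annual <- df_monthly %>%
  group_by(year, Gear) %>%
  summarize(across(c(mo.count, CPUE), sum), .groups = "drop")
```
````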
If you now try to print df_annual and
plt_cpue_yr, you’ll get an error because the code that creates them wasn’t
executed when the document was knitted.
# Show glimpse of df_annual
glimpse(df_annual)
## Error in glimpse(df_annual): object 'df_annual' not found
# Print plot of annual CPUE
plt_cpue_yr
## Error in eval(expr, envir, enclos): object 'plt_cpue_yr' not found
If you want the code to be executed when the Rmd file is knitted, but
for the code and results to NOT appear in the output
file, set include = FALSE as a code chunk option. For
example, let’s summarize and plot the CPUE data using the same code
shown above, but this time we’ll use include = FALSE so the
code and results won’t appear in the output document.
Even though the code and results aren’t displayed in the document,
the df_annual data frame and plt_cpue_yr plot
object are available to be used within the Rmd file.
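As a sketch, the hidden chunk in the .Rmd source might look like the following. It reuses the summarizing and plotting code shown earlier; only the chunk header differs.

````markdown
```{r, include = FALSE}
# Run when knitting, but neither the code nor its results are shown
df_annual <- df_monthly %>%
  group_by(year, Gear) %>%
  summarize(across(c(mo.count, CPUE), sum), .groups = "drop") %>%
  rename(Year = year, Count = mo.count)

plt_cpue_yr <- df_annual %>%
  ggplot(aes(x = Year, y = CPUE)) +
  geom_col() +
  facet_wrap(vars(Gear), ncol = 1, scales = "free_y") +
  theme_bw()
```
````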
# Show glimpse of df_annual
glimpse(df_annual)
## Rows: 67
## Columns: 4
## $ Year <dbl> 1998, 1998, 1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002…
## $ Gear <chr> "BSEIN", "RSTR", "BSEIN", "FKTR", "BSEIN", "FKTR", "RSTR", "BSEI…
## $ Count <dbl> 982, 32724, 13071, 260, 8916, 1830, 16595, 6138, 3579, 19090, 53…
## $ CPUE <dbl> 5.3224932, 187.6711599, 12.7715195, 0.3697516, 11.8979680, 3.299…
# Print plot of annual CPUE
plt_cpue_yr
If you want the code to be executed and results to be displayed in
the output file when the Rmd is knitted, but for the code to NOT
appear in the output file, set echo = FALSE as a
code chunk option. For example, maybe you want
glimpse(df_annual) and the plot of annual CPUE to display,
but not show the code used to execute this.
## Rows: 67
## Columns: 4
## $ Year <dbl> 1998, 1998, 1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002…
## $ Gear <chr> "BSEIN", "RSTR", "BSEIN", "FKTR", "BSEIN", "FKTR", "RSTR", "BSEI…
## $ Count <dbl> 982, 32724, 13071, 260, 8916, 1830, 16595, 6138, 3579, 19090, 53…
## $ CPUE <dbl> 5.3224932, 187.6711599, 12.7715195, 0.3697516, 11.8979680, 3.299…
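Output like the above could come from a chunk along these lines in the .Rmd source (a sketch, assuming df_annual and plt_cpue_yr already exist):

````markdown
```{r, echo = FALSE}
# Only the results appear in the knitted output, not this code
glimpse(df_annual)
plt_cpue_yr
```
````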
The default figure dimensions in a rendered R Markdown document are 7
inches wide and 5 inches tall. Sometimes you’ll want to change these,
which can be done using the fig.height and
fig.width code chunk options. Let’s change the dimensions
of the annual CPUE figure to 6 by 6 inches.
plt_cpue_yr
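In the .Rmd source, a chunk that sets both dimensions might look like this sketch (the values are in inches):

````markdown
```{r, fig.height = 6, fig.width = 6}
plt_cpue_yr
```
````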
Now, maybe we want the figure to be aligned on the right side of the
knitted document. In this case, we’ll use the fig.align
code chunk option.
plt_cpue_yr
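A sketch of the corresponding chunk header is shown here. The fig.align option takes a character string; "left" and "center" are other accepted values.

````markdown
```{r, fig.align = "right"}
plt_cpue_yr
```
````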
If we want to add a caption to our figure, we’ll use the
fig.cap code chunk option.
plt_cpue_yr
Figure 1: This is our annual CPUE figure.
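The caption text is passed to fig.cap as a character string. A sketch of a chunk that could produce the caption above:

````markdown
```{r, fig.cap = "This is our annual CPUE figure."}
plt_cpue_yr
```
````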
Finally, if we want to add alt text to our figure, we’ll use the
fig.alt option.
plt_cpue_yr
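Alt text is not visible in the rendered figure itself; in HTML output it is attached to the image for screen readers. A sketch of the chunk header, where the alt text string is a made-up placeholder rather than the wording used in the original document:

````markdown
```{r, fig.alt = "Bar charts of annual CPUE, one panel per gear type"}
plt_cpue_yr
```
````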
This is what it looks like when you print a data frame itself.
df_annual
## # A tibble: 67 × 4
## Year Gear Count CPUE
## <dbl> <chr> <dbl> <dbl>
## 1 1998 BSEIN 982 5.32
## 2 1998 RSTR 32724 188.
## 3 1999 BSEIN 13071 12.8
## 4 1999 FKTR 260 0.370
## 5 2000 BSEIN 8916 11.9
## 6 2000 FKTR 1830 3.30
## 7 2000 RSTR 16595 30.5
## 8 2001 BSEIN 6138 24.7
## 9 2001 FKTR 3579 4.98
## 10 2001 RSTR 19090 28.1
## # … with 57 more rows
knitr::kable() offers a simple format for displaying
tables in a rendered R Markdown document. For example, here is the table
of annual count and CPUE data printed as a kable.
kable(df_annual, caption = "Annual Counts and CPUE of YBFMP survey data")
| Year | Gear | Count | CPUE |
|---|---|---|---|
| 1998 | BSEIN | 982 | 5.3224932 |
| 1998 | RSTR | 32724 | 187.6711599 |
| 1999 | BSEIN | 13071 | 12.7715195 |
| 1999 | FKTR | 260 | 0.3697516 |
| 2000 | BSEIN | 8916 | 11.8979680 |
| 2000 | FKTR | 1830 | 3.2997186 |
| 2000 | RSTR | 16595 | 30.4987486 |
| 2001 | BSEIN | 6138 | 24.7352013 |
| 2001 | FKTR | 3579 | 4.9793030 |
| 2001 | RSTR | 19090 | 28.0687376 |
| 2002 | BSEIN | 5386 | 26.4395986 |
| 2002 | FKTR | 2223 | 3.2344627 |
| 2002 | RSTR | 42362 | 63.1653017 |
| 2003 | BSEIN | 455 | 4.6880205 |
| 2003 | FKTR | 3311 | 4.9792709 |
| 2003 | RSTR | 13383 | 20.3652883 |
| 2004 | BSEIN | 1261 | 6.7650673 |
| 2004 | FKTR | 4369 | 6.5649613 |
| 2004 | RSTR | 19179 | 28.6243865 |
| 2005 | BSEIN | 611 | 4.1804482 |
| 2005 | FKTR | 2029 | 3.0151784 |
| 2005 | RSTR | 12181 | 20.4830891 |
| 2006 | BSEIN | 4988 | 14.9033394 |
| 2006 | FKTR | 1374 | 3.0068944 |
| 2006 | RSTR | 27915 | 65.6020561 |
| 2007 | BSEIN | 6205 | 6.7106441 |
| 2007 | FKTR | 5174 | 7.6289034 |
| 2007 | RSTR | 34692 | 55.4470753 |
| 2008 | BSEIN | 13144 | 11.9853387 |
| 2008 | FKTR | 2788 | 4.4703717 |
| 2008 | RSTR | 12329 | 46.9071342 |
| 2009 | BSEIN | 8144 | 19.9055054 |
| 2009 | FKTR | 3754 | 6.9650768 |
| 2009 | RSTR | 24944 | 58.3251513 |
| 2010 | BSEIN | 8889 | 6.8781580 |
| 2010 | FKTR | 1990 | 6.4066673 |
| 2010 | RSTR | 17814 | 73.4114551 |
| 2011 | BSEIN | 30131 | 11.0662723 |
| 2011 | FKTR | 1726 | 5.8182464 |
| 2011 | RSTR | 17761 | 59.5595626 |
| 2012 | BSEIN | 38990 | 16.7569638 |
| 2012 | FKTR | 2438 | 7.2846733 |
| 2012 | RSTR | 24999 | 90.8836115 |
| 2013 | BSEIN | 19977 | 10.3254081 |
| 2013 | FKTR | 1394 | 4.2841587 |
| 2013 | RSTR | 21361 | 60.8719818 |
| 2014 | BSEIN | 13941 | 8.2754013 |
| 2014 | FKTR | 1756 | 7.2168245 |
| 2014 | RSTR | 25016 | 71.7300402 |
| 2015 | BSEIN | 11075 | 6.3747209 |
| 2015 | FKTR | 2079 | 6.0665883 |
| 2015 | RSTR | 12612 | 33.1259513 |
| 2016 | BSEIN | 19600 | 10.5877517 |
| 2016 | FKTR | 2996 | 11.5283032 |
| 2016 | RSTR | 18675 | 64.6274886 |
| 2017 | BSEIN | 26777 | 13.6441580 |
| 2017 | FKTR | 4436 | 18.6973393 |
| 2017 | RSTR | 10283 | 68.1595985 |
| 2018 | BSEIN | 11688 | 7.3778899 |
| 2018 | FKTR | 6776 | 20.9096576 |
| 2018 | RSTR | 3212 | 9.5898324 |
| 2019 | BSEIN | 18541 | 14.2690841 |
| 2019 | FKTR | 5284 | 17.1477588 |
| 2019 | RSTR | 33746 | 119.1001649 |
| 2020 | BSEIN | 932 | 0.5414329 |
| 2020 | FKTR | 1851 | 6.4073520 |
| 2020 | RSTR | 2188 | 7.3031628 |
This is pretty nice compared to just printing the data frame itself.
However, the kable formatting can be improved. The
knitr::kable() function allows for some minor formatting
such as rounding digits and column alignment.
kable(
  df_annual,
  caption = "Annual Counts and CPUE of YBFMP survey data",
  # Round numeric columns to 1 decimal place (only CPUE has decimals to round)
  digits = 1,
  # Align all columns to the left
  align = "l"
)
| Year | Gear | Count | CPUE |
|---|---|---|---|
| 1998 | BSEIN | 982 | 5.3 |
| 1998 | RSTR | 32724 | 187.7 |
| 1999 | BSEIN | 13071 | 12.8 |
| 1999 | FKTR | 260 | 0.4 |
| 2000 | BSEIN | 8916 | 11.9 |
| 2000 | FKTR | 1830 | 3.3 |
| 2000 | RSTR | 16595 | 30.5 |
| 2001 | BSEIN | 6138 | 24.7 |
| 2001 | FKTR | 3579 | 5.0 |
| 2001 | RSTR | 19090 | 28.1 |
| 2002 | BSEIN | 5386 | 26.4 |
| 2002 | FKTR | 2223 | 3.2 |
| 2002 | RSTR | 42362 | 63.2 |
| 2003 | BSEIN | 455 | 4.7 |
| 2003 | FKTR | 3311 | 5.0 |
| 2003 | RSTR | 13383 | 20.4 |
| 2004 | BSEIN | 1261 | 6.8 |
| 2004 | FKTR | 4369 | 6.6 |
| 2004 | RSTR | 19179 | 28.6 |
| 2005 | BSEIN | 611 | 4.2 |
| 2005 | FKTR | 2029 | 3.0 |
| 2005 | RSTR | 12181 | 20.5 |
| 2006 | BSEIN | 4988 | 14.9 |
| 2006 | FKTR | 1374 | 3.0 |
| 2006 | RSTR | 27915 | 65.6 |
| 2007 | BSEIN | 6205 | 6.7 |
| 2007 | FKTR | 5174 | 7.6 |
| 2007 | RSTR | 34692 | 55.4 |
| 2008 | BSEIN | 13144 | 12.0 |
| 2008 | FKTR | 2788 | 4.5 |
| 2008 | RSTR | 12329 | 46.9 |
| 2009 | BSEIN | 8144 | 19.9 |
| 2009 | FKTR | 3754 | 7.0 |
| 2009 | RSTR | 24944 | 58.3 |
| 2010 | BSEIN | 8889 | 6.9 |
| 2010 | FKTR | 1990 | 6.4 |
| 2010 | RSTR | 17814 | 73.4 |
| 2011 | BSEIN | 30131 | 11.1 |
| 2011 | FKTR | 1726 | 5.8 |
| 2011 | RSTR | 17761 | 59.6 |
| 2012 | BSEIN | 38990 | 16.8 |
| 2012 | FKTR | 2438 | 7.3 |
| 2012 | RSTR | 24999 | 90.9 |
| 2013 | BSEIN | 19977 | 10.3 |
| 2013 | FKTR | 1394 | 4.3 |
| 2013 | RSTR | 21361 | 60.9 |
| 2014 | BSEIN | 13941 | 8.3 |
| 2014 | FKTR | 1756 | 7.2 |
| 2014 | RSTR | 25016 | 71.7 |
| 2015 | BSEIN | 11075 | 6.4 |
| 2015 | FKTR | 2079 | 6.1 |
| 2015 | RSTR | 12612 | 33.1 |
| 2016 | BSEIN | 19600 | 10.6 |
| 2016 | FKTR | 2996 | 11.5 |
| 2016 | RSTR | 18675 | 64.6 |
| 2017 | BSEIN | 26777 | 13.6 |
| 2017 | FKTR | 4436 | 18.7 |
| 2017 | RSTR | 10283 | 68.2 |
| 2018 | BSEIN | 11688 | 7.4 |
| 2018 | FKTR | 6776 | 20.9 |
| 2018 | RSTR | 3212 | 9.6 |
| 2019 | BSEIN | 18541 | 14.3 |
| 2019 | FKTR | 5284 | 17.1 |
| 2019 | RSTR | 33746 | 119.1 |
| 2020 | BSEIN | 932 | 0.5 |
| 2020 | FKTR | 1851 | 6.4 |
| 2020 | RSTR | 2188 | 7.3 |
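The digits and align arguments of knitr::kable() can also take one value per column instead of a single value for the whole table. The sketch below uses the first three rows of the annual table, typed in by hand; df_example and tbl_example are names made up for this illustration.

```r
library(knitr)

# First three rows of the annual table above, entered by hand for illustration
df_example <- data.frame(
  Year = c(1998, 1998, 1999),
  Gear = c("BSEIN", "RSTR", "BSEIN"),
  Count = c(982, 32724, 13071),
  CPUE = c(5.3224932, 187.6711599, 12.7715195)
)

tbl_example <- kable(
  df_example,
  # One digits value per column; the value for the character column (Gear) is unused
  digits = c(0, 0, 0, 1),
  # One alignment per column: left for Year and Gear, right for Count and CPUE
  align = c("l", "l", "r", "r")
)
tbl_example
```

Right-aligning the numeric columns tends to make values of different magnitudes easier to compare down a column.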
The kableExtra R package allows for many more formatting
options for kables. As of version 1.2+, the author of
kableExtra recommends using the kbl() function
to create the kable.
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
df_annual %>%
  kbl(
    caption = "Annual Counts and CPUE of YBFMP survey data",
    # Round numeric columns to 1 decimal place (only CPUE has decimals to round)
    digits = 1,
    # Align all columns to the left
    align = "l"
  ) %>%
  kable_styling(
    # appearance option
    bootstrap_options = c("striped", "hover"),
    # change width
    full_width = FALSE,
    # change position
    position = "left",
    # fix the header row at the top when scrolling, useful for longer tables
    fixed_thead = TRUE
  ) %>%
  # Format the header row (0) to be aligned in the center
  row_spec(0, align = "center") %>%
  # Make the Year column bold
  column_spec(1, bold = TRUE) %>%
  # add a scroll box for easier viewing
  scroll_box(width = "40%", height = "550px")
| Year | Gear | Count | CPUE |
|---|---|---|---|
| 1998 | BSEIN | 982 | 5.3 |
| 1998 | RSTR | 32724 | 187.7 |
| 1999 | BSEIN | 13071 | 12.8 |
| 1999 | FKTR | 260 | 0.4 |
| 2000 | BSEIN | 8916 | 11.9 |
| 2000 | FKTR | 1830 | 3.3 |
| 2000 | RSTR | 16595 | 30.5 |
| 2001 | BSEIN | 6138 | 24.7 |
| 2001 | FKTR | 3579 | 5.0 |
| 2001 | RSTR | 19090 | 28.1 |
| 2002 | BSEIN | 5386 | 26.4 |
| 2002 | FKTR | 2223 | 3.2 |
| 2002 | RSTR | 42362 | 63.2 |
| 2003 | BSEIN | 455 | 4.7 |
| 2003 | FKTR | 3311 | 5.0 |
| 2003 | RSTR | 13383 | 20.4 |
| 2004 | BSEIN | 1261 | 6.8 |
| 2004 | FKTR | 4369 | 6.6 |
| 2004 | RSTR | 19179 | 28.6 |
| 2005 | BSEIN | 611 | 4.2 |
| 2005 | FKTR | 2029 | 3.0 |
| 2005 | RSTR | 12181 | 20.5 |
| 2006 | BSEIN | 4988 | 14.9 |
| 2006 | FKTR | 1374 | 3.0 |
| 2006 | RSTR | 27915 | 65.6 |
| 2007 | BSEIN | 6205 | 6.7 |
| 2007 | FKTR | 5174 | 7.6 |
| 2007 | RSTR | 34692 | 55.4 |
| 2008 | BSEIN | 13144 | 12.0 |
| 2008 | FKTR | 2788 | 4.5 |
| 2008 | RSTR | 12329 | 46.9 |
| 2009 | BSEIN | 8144 | 19.9 |
| 2009 | FKTR | 3754 | 7.0 |
| 2009 | RSTR | 24944 | 58.3 |
| 2010 | BSEIN | 8889 | 6.9 |
| 2010 | FKTR | 1990 | 6.4 |
| 2010 | RSTR | 17814 | 73.4 |
| 2011 | BSEIN | 30131 | 11.1 |
| 2011 | FKTR | 1726 | 5.8 |
| 2011 | RSTR | 17761 | 59.6 |
| 2012 | BSEIN | 38990 | 16.8 |
| 2012 | FKTR | 2438 | 7.3 |
| 2012 | RSTR | 24999 | 90.9 |
| 2013 | BSEIN | 19977 | 10.3 |
| 2013 | FKTR | 1394 | 4.3 |
| 2013 | RSTR | 21361 | 60.9 |
| 2014 | BSEIN | 13941 | 8.3 |
| 2014 | FKTR | 1756 | 7.2 |
| 2014 | RSTR | 25016 | 71.7 |
| 2015 | BSEIN | 11075 | 6.4 |
| 2015 | FKTR | 2079 | 6.1 |
| 2015 | RSTR | 12612 | 33.1 |
| 2016 | BSEIN | 19600 | 10.6 |
| 2016 | FKTR | 2996 | 11.5 |
| 2016 | RSTR | 18675 | 64.6 |
| 2017 | BSEIN | 26777 | 13.6 |
| 2017 | FKTR | 4436 | 18.7 |
| 2017 | RSTR | 10283 | 68.2 |
| 2018 | BSEIN | 11688 | 7.4 |
| 2018 | FKTR | 6776 | 20.9 |
| 2018 | RSTR | 3212 | 9.6 |
| 2019 | BSEIN | 18541 | 14.3 |
| 2019 | FKTR | 5284 | 17.1 |
| 2019 | RSTR | 33746 | 119.1 |
| 2020 | BSEIN | 932 | 0.5 |
| 2020 | FKTR | 1851 | 6.4 |
| 2020 | RSTR | 2188 | 7.3 |
You can do so much more with the kableExtra R package.
Check out the package documentation for
more details and options.
There are a whole lot of R packages that allow for some really nice interactive widgets in R Markdown. We’ll cover three of these that I’ve used and found helpful.
You can convert a data frame to an interactive table using the
DT::datatable() function.
library(DT)
datatable(df_annual)
There are also some options that you can set within the
DT::datatable() function.
df_annual %>%
  # Convert Year and Gear variables to factors to allow for select lists in the
  # datatable filters
  mutate(across(c(Year, Gear), factor)) %>%
  datatable(
    # remove rownames
    rownames = FALSE,
    # add column filters
    filter = "top",
    options = list(autoWidth = TRUE)
  )
You can create interactive plots using the plotly R
package. The plotly::ggplotly() function converts
ggplot2 objects into interactive plots. I found that it
works best with less complicated plots.
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
ggplotly(plt_cpue_yr)
You can make interactive maps using the leaflet R
package. First we’ll download the fish monitoring stations for YBFMP
from the EDI
data package.
df_stations <- read_csv("https://portal.edirepository.org/nis/dataviewer?packageid=edi.233.3&entityid=89146f1382d7dfa3bbf3e4b1554eb5cc")
## Rows: 26 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): StationCode, StationName, StationNumber, PeriodOfRecordFrom, Period...
## dbl (2): Latitude, Longitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Convert to sf object
library(sf)
## Linking to GEOS 3.9.1, GDAL 3.4.3, PROJ 7.2.1; sf_use_s2() is TRUE
sf_stations <- df_stations %>% st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326)
Next, we’ll create an interactive map of the sampling locations.
library(leaflet)
sf_stations %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(
    radius = 5,
    fillOpacity = 0.8,
    weight = 0.5,
    color = "black",
    opacity = 1
  )
Now, we’ll add labels for each station and color code the markers by
the MethodCode (BSEIN, FKTR, RSTR).
# Define color palette for MethodCode
color_pal <- colorFactor(palette = "viridis", domain = sf_stations$MethodCode)
sf_stations %>%
  leaflet() %>%
  addTiles() %>%
  addCircleMarkers(
    radius = 5,
    fillColor = ~color_pal(MethodCode),
    fillOpacity = 0.8,
    weight = 0.5,
    color = "black",
    opacity = 1,
    label = paste0("Station Code: ", sf_stations$StationCode)
  ) %>%
  addLegend(
    position = "topright",
    pal = color_pal,
    values = sf_stations$MethodCode,
    title = "Gear Type:"
  )
A feature that I’ve found to be useful is using tabs in a rendered R
Markdown document. Here is how you would set up the document to display
the two CPUE plots and the stations map in tabs for easier navigation.
Apply the {.tabset} class attribute to the
## YBFMP Data Summary header below like this:
## YBFMP Data Summary {.tabset}. Then, each of the
sub-headers of this header will appear as tabs instead of standalone
sections. Use another header of the same level as
## YBFMP Data Summary (level 2) to stop using tab
navigation.
Add .tabset-fade and/or .tabset-pills to
the {.tabset} class attribute to change the appearance and
behavior of the tabs.
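Put together, the tab layout described above could be sketched in the .Rmd source as follows. The sub-header titles, the final "Next Section" header, and the map_stations object are assumptions made for this illustration; the map code earlier in this document is not saved to an object.

````markdown
## YBFMP Data Summary {.tabset .tabset-pills}

### Monthly CPUE

```{r}
plt_cpue_month
```

### Annual CPUE

```{r}
plt_cpue_yr
```

### Stations Map

```{r}
# Hypothetical object name: assumes the leaflet map was saved as map_stations
map_stations
```

## Next Section

Content under this level 2 header is shown outside the tabs.
````

Each level 3 header becomes one tab, and the next level 2 header ends the tabbed area.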