This recent manifesto from Nature Human Behaviour describes the risks to reproducible science at every step of the process. I urge you to read it.
But today I want us to think more parochially about our own workflows. How can using R make our own data collection, cleaning, visualization, and analysis workflows more reproducible? Ask yourself this: Can you pick up where you left off on a project you were working on yesterday? Last week? Last month? Six months ago? Put it this way: If you were hit by a truck tomorrow, could your adviser and collaborators pick up where you left off?
Reproducible workflows are scripted. They minimize human contact with your data files. They are well-documented. And it turns out that workflows that are transparent to you and your colleagues are transparent to others. This makes them easy to share.
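As a rough illustration of "scripted" and "minimal human contact," here is a small sketch in which every change to the data happens in code and the raw file is never edited. The file names and the toy survey data are hypothetical and are created in a temporary directory only so the example runs anywhere.

```r
# Hypothetical raw file, created here only so the example is self-contained.
raw_file <- file.path(tempdir(), "raw_survey.csv")
writeLines(c("id,gender", "1,Female", "2,MALE", "3,female"), raw_file)
raw_checksum <- unname(tools::md5sum(raw_file))

# Every change to the data happens in code, never by hand in the raw file.
survey <- read.csv(raw_file, stringsAsFactors = FALSE)
survey$gender <- tolower(survey$gender)

# Write the cleaned data to a separate file; the raw file stays untouched.
clean_file <- file.path(tempdir(), "clean_survey.csv")
write.csv(survey, clean_file, row.names = FALSE)

# Check that the raw file is unchanged after cleaning.
stopifnot(identical(raw_checksum, unname(tools::md5sum(raw_file))))
```

Because the cleaning steps live in a script, rerunning it on the same raw file should give the same cleaned file, whether you run it tomorrow or a collaborator runs it in six months.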
project_analysis.R
We’ve already shown you in this bootcamp how writing R scripts and functions lets you import, clean, munge, reorganize, plot, and analyze data. We’ve also seen how commenting code fragments makes them easier to read and understand. An extension to R called R Markdown lets us mix R code, analyses, text, tables, and other formatting to make all sorts of products. R Markdown files are just text files. But with this one text file, it’s easy to produce multiple output types: PDF or Word formatted documents; HTML for blogs or web sites; or even slide presentations.
# Import data
# Clean data
# Visualize data
# Analyze data
# Report findings
# Import data
my_data <- read.csv("path/2/data_file.csv")
# Clean data
my_data$gender <- tolower(my_data$gender) # make lower case
...
# Import data
source("R/Import_data.R") # source() runs scripts, loads functions
# Clean data
source("R/Clean_data.R")
# Visualize data
source("R/Visualize_data.R")
...
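A minimal sketch of how source() ties the pieces together, assuming a hypothetical helper script that defines a single cleaning function. The clean_gender() function and the toy data frame are illustrations only, not part of the bootcamp materials, and the script is written to a temporary directory so the example runs anywhere.

```r
# Hypothetical helper script, written to a temp directory for illustration.
script_dir <- tempfile("R_")
dir.create(script_dir)
clean_script <- file.path(script_dir, "Clean_data.R")

writeLines(c(
  "# Clean data: standardize the gender column (hypothetical helper)",
  "clean_gender <- function(df) {",
  "  df$gender <- tolower(trimws(df$gender))",
  "  df",
  "}"
), clean_script)

# source() evaluates the script, so clean_gender() is now defined here.
source(clean_script)

# Toy data standing in for an imported file.
my_data <- data.frame(id = 1:3,
                      gender = c("Female", " MALE", "female"),
                      stringsAsFactors = FALSE)
my_data <- clean_gender(my_data)
print(my_data$gender)
```

Keeping each step in its own file means the master script reads like the outline of comments it started from, and each step can be rerun or revised on its own.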
Just to show you how easy this is, let’s look at the R syntax James used yesterday. I’m going to show you how adding just a tiny bit of text to that file transforms it. Here is the original R script. Here is the transformed file with a .Rmd extension.
Add text, code, bulleted or numbered lists, web links, etc.
Change title: "R Notebook" to something else, like title: "Rick's R Notebook"
Save the file (Untitled) with an .Rmd extension.
Edit the *.Rmd code, then view the *.nb.html file in a browser.
# Big idea
## Smaller idea in service of bigger
- Supporting point
- Another supporting point
1. an enumerated **bold** point
1. an enumerated *italicized* point
- a [link](http://psu-psychology.github.io/r-bootcamp) to this bootcamp
- an image: ![rawr](https://www.insidehighered.com/sites/default/server_files/media/PennState2.PNG)
- an equation: $e = mc^2$
rmarkdown::render("talks/bootcamp-survey.Rmd")
rmarkdown::render('talks/bootcamp-survey.Rmd', output_format = "pdf_document")
rmarkdown::render('talks/bootcamp-survey.Rmd', output_format = "word_document")
rmarkdown::render('talks/bootcamp-survey.Rmd', output_format = "ioslides_presentation")
rmarkdown::render('talks/bootcamp-survey.Rmd', output_format = c("pdf_document", "word_document", "github_document", "ioslides_presentation"))
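To try the render() calls without the bootcamp files, here is a self-contained sketch. It assumes a tiny, hypothetical R Markdown file (render-demo.Rmd) written to a temporary directory, and it uses only formats that need pandoc, since pdf_document also requires a LaTeX installation. Rendering is skipped when the rmarkdown package or pandoc is not available.

```r
# Build a tiny, hypothetical R Markdown file so the example is self-contained.
fence <- strrep("`", 3)  # a code fence, built here rather than typed literally
rmd_lines <- c(
  "---",
  "title: \"Render demo\"",
  "---",
  "",
  "# Big idea",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)
rmd_file <- file.path(tempdir(), "render-demo.Rmd")
writeLines(rmd_lines, rmd_file)

# Formats that need only pandoc (no LaTeX).
formats <- c("html_document", "word_document")

# Render only if rmarkdown and pandoc are installed; otherwise skip quietly.
rendered <- character(0)
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # render() returns the path(s) of the output file(s) it wrote.
  rendered <- rmarkdown::render(rmd_file, output_format = formats, quiet = TRUE)
}
print(basename(rendered))
```

The same text file produces every output type; only the output_format argument changes.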
papaja
Make_site.R
File/New Project.../
File/Open Project...