2020-03-31 09:42:04

Preliminaries

Check-in

Announcements

Today's topics

  • Managing resources from afar via APIs

Managing resources from afar via APIs

What are APIs

  • Application Programming Interface (API)
  • A way for your code to talk to computer services

Why use APIs

  • Reduce human manipulation of data files
  • Reduce errors
  • Reduce security/privacy breaches
  • Improve reproducibility

Some useful APIs

  • Box
  • Google Drive
  • OSF
  • Databrary

Box.com

Google Drive

Open Science Framework (OSF)

Databrary

What can you do with these APIs?

  • Move data to/from cloud storage
  • Leave data on cloud storage; clean, visualize locally
  • Produce reproducible workflows from the get-go
  • Reduce the likelihood that data can “leak”
  • Access & visualize data shared by others

What's involved?

  • Downloading package
  • Configuring authentication
  • Testing the connection
  • Writing code to do what you want
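
The first two steps above can be sketched in base R. This is only an illustration: `get_credential()` and `has_package()` are hypothetical helpers, and `MY_API_USER` is a made-up environment variable name, not something any of the packages discussed here requires. The idea is to keep credentials out of scripts and to fail early if the setup is incomplete. Testing the connection is omitted because it depends on the specific service.

```r
# Hypothetical helper: read a credential from an environment variable
# so that it never has to be typed into a script that gets shared.
get_credential <- function(var_name) {
  value <- Sys.getenv(var_name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", var_name, "' is not set.", call. = FALSE)
  }
  value
}

# Hypothetical helper: check that a package is installed without attaching it.
has_package <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Example use with a made-up variable name and value.
Sys.setenv(MY_API_USER = "me@example.com")
api_user <- get_credential("MY_API_USER")

# "stats" ships with R, so this check should succeed on any installation.
has_stats <- has_package("stats")
```

In practice the variable would be set outside the script (for example in `.Renviron`) rather than with `Sys.setenv()`.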

Example 1: databraryapi

  • Summarize demographics of specific study

Tamis-LeMonda, C. (2013). Language, cognitive, and socio-emotional skills from 9 months until their transition to first grade in U.S. children from African-American, Dominican, Mexican, and Chinese backgrounds. Databrary. Retrieved March 29, 2020 from http://doi.org/10.17910/B7CC74.

  • Login
databraryapi::login_db()
## Please enter your Databrary user ID (email).
## Email:
## [1] FALSE

  • Download demographic data
demog <- databraryapi::download_session_csv(8)

str(demog)
## 'data.frame':    1351 obs. of  36 variables:
##  $ session_id           : int  190 191 192 193 194 195 196 197 198 199 ...
##  $ session_name         : logi  NA NA NA NA NA NA ...
##  $ session_date         : chr  "2006-XX-XX" "2006-XX-XX" "2006-XX-XX" "2006-XX-XX" ...
##  $ session_release      : chr  "EXCERPTS" "PRIVATE" "EXCERPTS" "PRIVATE" ...
##  $ participant.ID       : int  3814 3811 3808 3806 3804 3803 3801 3800 3797 3796 ...
##  $ participant.birthdate: logi  NA NA NA NA NA NA ...
##  $ participant.gender   : chr  "Male" "Female" "Female" "Male" ...
##  $ participant.race     : chr  "Black or African American" "Black or African American" "Black or African American" "Black or African American" ...
##  $ participant.ethnicity: chr  "Dominican" "Dominican" "Dominican" "Unknown or not reported" ...
##  $ participant.language : chr  "English" "English" "English" "English" ...
##  $ group.name           : chr  "14 month" "14 month" "14 month" "14 month" ...
##  $ task1.name           : chr  "Novel toy: string with beads" "Novel toy: string with beads" "Novel toy: string with beads" "Novel toy: string with beads" ...
##  $ task1.description    : chr  "Mother-child play with beads and string\nMother-child dyads played with beads and string." "Mother-child play with beads and string\nMother-child dyads played with beads and string." "Mother-child play with beads and string\nMother-child dyads played with beads and string." "Mother-child play with beads and string\nMother-child dyads played with beads and string." ...
##  $ task2.name           : chr  "Mother-child free play" "Mother-child free play" "Mother-child free play" "Mother-child free play" ...
##  $ task2.description    : chr  "Child alone play with standard set of toys\nMother-child dyads played with a set of toys for 8 minutes. Toys pr"| __truncated__ "Child alone play with standard set of toys\nMother-child dyads played with a set of toys for 8 minutes. Toys pr"| __truncated__ "Child alone play with standard set of toys\nMother-child dyads played with a set of toys for 8 minutes. Toys pr"| __truncated__ "Child alone play with standard set of toys\nMother-child dyads played with a set of toys for 8 minutes. Toys pr"| __truncated__ ...
##  $ task3.name           : chr  "Numeracy book" "Numeracy book" "Numeracy book" "Numeracy book" ...
##  $ task3.description    : chr  "Mother-child booksharing wordless number book\nMothers shared a wordless number book with their children." "Mother-child booksharing wordless number book\nMothers shared a wordless number book with their children." "Mother-child booksharing wordless number book\nMothers shared a wordless number book with their children." "Mother-child booksharing wordless number book\nMothers shared a wordless number book with their children." ...
##  $ task4.name           : chr  "Emotion book" "Emotion book" "Emotion book" "Emotion book" ...
##  $ task4.description    : chr  "Mother-child booksharing wordless book of baby faces expressing emotions\nMother-child dyads shared a wordless "| __truncated__ "Mother-child booksharing wordless book of baby faces expressing emotions\nMother-child dyads shared a wordless "| __truncated__ "Mother-child booksharing wordless book of baby faces expressing emotions\nMother-child dyads shared a wordless "| __truncated__ "Mother-child booksharing wordless book of baby faces expressing emotions\nMother-child dyads shared a wordless "| __truncated__ ...
##  $ task5.name           : chr  "Familiar toy" "Familiar toy" "Familiar toy" "Familiar toy" ...
##  $ task5.description    : chr  "Mother-child play interaction with child favorite toy\nMother-child dyads played with the child's favorite toy." "Mother-child play interaction with child favorite toy\nMother-child dyads played with the child's favorite toy." "Mother-child play interaction with child favorite toy\nMother-child dyads played with the child's favorite toy." "Mother-child play interaction with child favorite toy\nMother-child dyads played with the child's favorite toy." ...
##  $ task6.name           : chr  "" "" "" "" ...
##  $ task6.description    : chr  "" "" "" "" ...
##  $ task7.name           : chr  "" "" "" "" ...
##  $ task7.description    : chr  "" "" "" "" ...
##  $ task8.name           : chr  "" "" "" "" ...
##  $ task8.description    : chr  "" "" "" "" ...
##  $ task9.name           : chr  "" "" "" "" ...
##  $ task9.description    : chr  "" "" "" "" ...
##  $ task10.name          : chr  "" "" "" "" ...
##  $ task10.description   : chr  "" "" "" "" ...
##  $ task11.name          : chr  "" "" "" "" ...
##  $ task11.description   : chr  "" "" "" "" ...
##  $ context.setting      : chr  "Home" "Home" "Home" "Home" ...
##  $ context.state        : chr  "NY" "NY" "NY" "NY" ...
##  $ vol_id               : num  8 8 8 8 8 8 8 8 8 8 ...

  • Clean and visualize
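# The pipe (%>%) is available because the tidyverse is attached (see session info)
# Keep only sex and race, renaming the Databrary column names
sex_race <- demog %>%
  dplyr::select(sex = participant.gender,
                race = participant.race)

# Cross-tabulate sex by race
xtabs(formula = ~ sex + race, data = sex_race)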
##         race
## sex          Asian Black or African American Unknown or not reported White
##            7     0                         0                       0     0
##   Female   0   112                       341                       4   209
##   Male     0   111                       410                       2   155
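
The table above has a row and a column with empty labels: 7 sessions where both sex and race are empty strings. One way to make those cases explicit is to recode empty strings as `NA` before tabulating. The sketch below uses a small made-up data frame (`toy_demog`) rather than the Databrary download, and it uses `table()` with `useNA = "ifany"` instead of `xtabs()`, so treat it as one possible approach rather than the method used in the slides.

```r
# Made-up data that mimics the structure of the downloaded demographics.
toy_demog <- data.frame(
  participant.gender = c("Female", "Male", "", "Female", ""),
  participant.race   = c("Asian", "White", "", "White", ""),
  stringsAsFactors = FALSE
)

# Keep and rename the two columns of interest.
sex_race_toy <- data.frame(
  sex  = toy_demog$participant.gender,
  race = toy_demog$participant.race,
  stringsAsFactors = FALSE
)

# Recode empty strings as missing values.
sex_race_toy[sex_race_toy == ""] <- NA

# Tabulate, showing missing values as their own category when present.
sex_race_tab <- table(sex = sex_race_toy$sex,
                      race = sex_race_toy$race,
                      useNA = "ifany")
sex_race_tab
```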

Other working examples

Example 2: NY Times data on COVID-19

# Note the URL uses raw.githubusercontent.com
cv19 <- readr::read_csv("https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv")
## Parsed with column specification:
## cols(
##   date = col_date(format = ""),
##   state = col_character(),
##   fips = col_character(),
##   cases = col_double(),
##   deaths = col_double()
## )
str(cv19)
## tibble [1,499 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ date  : Date[1:1499], format: "2020-01-21" "2020-01-22" ...
##  $ state : chr [1:1499] "Washington" "Washington" "Washington" "Illinois" ...
##  $ fips  : chr [1:1499] "53" "53" "53" "17" ...
##  $ cases : num [1:1499] 1 1 1 1 1 1 1 1 1 2 ...
##  $ deaths: num [1:1499] 0 0 0 0 0 0 0 0 0 0 ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   date = col_date(format = ""),
##   ..   state = col_character(),
##   ..   fips = col_character(),
##   ..   cases = col_double(),
##   ..   deaths = col_double()
##   .. )
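
The note about `raw.githubusercontent.com` matters because the address shown in the browser for a file on GitHub returns a web page, not the CSV itself. The sketch below converts a browser-style address into the raw form. `to_raw_url()` is a hypothetical helper, and the `blob_url` value assumes GitHub's usual `github.com/<owner>/<repo>/blob/<branch>/<path>` layout.

```r
# Hypothetical helper: turn a GitHub "blob" page URL into a raw-file URL.
to_raw_url <- function(blob_url) {
  # Swap the host for the raw-content host.
  raw_url <- sub("^https://github\\.com/",
                 "https://raw.githubusercontent.com/",
                 blob_url)
  # Drop the first "/blob/" path segment.
  sub("/blob/", "/", raw_url, fixed = TRUE)
}

# Assumed browser address for the same file used in this example.
blob_url <- "https://github.com/nytimes/covid-19-data/blob/master/us-states.csv"
raw_url <- to_raw_url(blob_url)
raw_url
```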

# Plot cumulative cases over time for Pennsylvania and nearby states
nearby_states <- c("Pennsylvania", "New York", "New Jersey", "Maryland",
                   "Ohio", "West Virginia", "Delaware")

cv19 %>%
  dplyr::filter(state %in% nearby_states) %>%
  ggplot() +
  aes(date, cases, color = state) +
  geom_point() +
  geom_line()
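
The `cases` column in the NYT file is a running total, so the plot shows cumulative counts. If daily new cases are of interest, they can be derived by differencing within each state. The sketch below uses a small made-up data frame (`toy`) and base R only; the column name `new_cases` is an illustrative choice, not part of the NYT data.

```r
# Made-up cumulative counts for two hypothetical states.
toy <- data.frame(
  date  = as.Date(c("2020-03-01", "2020-03-02", "2020-03-03",
                    "2020-03-01", "2020-03-02", "2020-03-03")),
  state = c("A", "A", "A", "B", "B", "B"),
  cases = c(1, 3, 6, 0, 2, 2),
  stringsAsFactors = FALSE
)

# Differencing only makes sense if rows are ordered by date within state.
toy <- toy[order(toy$state, toy$date), ]

# Within each state, new cases = today's total minus yesterday's total.
# The first day keeps its own total because there is no earlier value.
toy$new_cases <- ave(toy$cases, toy$state,
                     FUN = function(x) c(x[1], diff(x)))
toy
```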

Your turn

Connect to Qualtrics

Connect to Box

Connect to Google Drive

Next time…

  • Where to share
  • Your open science portfolio

Resources

Software

This talk was produced on 2020-03-31 in RStudio using R Markdown. The code and materials used to generate the slides may be found at https://github.com/psu-psychology/psy-525-reproducible-research-2020. Information about the R Session that produced the code is as follows:

## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] forcats_0.5.0   stringr_1.4.0   dplyr_0.8.5     purrr_0.3.3    
## [5] readr_1.3.1     tidyr_1.0.2     tibble_3.0.0    ggplot2_3.3.0  
## [9] tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.0.0   xfun_0.12          haven_2.2.0        lattice_0.20-40   
##  [5] colorspace_1.4-1   vctrs_0.2.4        generics_0.0.2     htmltools_0.4.0   
##  [9] yaml_2.2.1         rlang_0.4.5        pillar_1.4.3       withr_2.1.2       
## [13] glue_1.3.2         DBI_1.1.0          dbplyr_1.4.2       modelr_0.1.6      
## [17] readxl_1.3.1       lifecycle_0.2.0    munsell_0.5.0      gtable_0.3.0      
## [21] cellranger_1.1.0   rvest_0.3.5        evaluate_0.14      labeling_0.3      
## [25] knitr_1.28         curl_4.3           fansi_0.4.1        highr_0.8         
## [29] broom_0.5.5        Rcpp_1.0.4         backports_1.1.5    scales_1.1.0      
## [33] jsonlite_1.6.1     farver_2.0.3       databraryapi_0.1.9 fs_1.3.2          
## [37] hms_0.5.3          digest_0.6.25      stringi_1.4.6      keyring_1.1.0     
## [41] grid_3.6.2         cli_2.0.2          tools_3.6.2        magrittr_1.5      
## [45] crayon_1.3.4       pkgconfig_2.0.3    ellipsis_0.3.0     xml2_1.2.5        
## [49] reprex_0.3.0       lubridate_1.7.4    assertthat_0.2.1   rmarkdown_2.1     
## [53] httr_1.4.1         rstudioapi_0.11    R6_2.4.1           nlme_3.1-145      
## [57] compiler_3.6.2

References