Findings: Data viz outside psych sci

Published

January 22, 2025

Modified

January 22, 2025

About

This page extracts information about the data visualizations we explored in Exercise-02 from the shared Google Sheet.

Survey

Direct link: <https://docs.google.com/spreadsheets/d/1rLiLBRbDQfInauOUBPNwGYq-0VOjskGNY7Wq2QRTTbc/edit?gid=0#gid=0>

Preparation

First, we load the external packages (groups of R commands) that we will be using.

Important

The code uses the quietly() function from the purrr package to suppress most of the feedback.

Code
library('ggplot2')
library('dplyr')

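r_functions <- list.files(file.path(here::here(), "src", "R"), "\\.R$", full.names = TRUE)

# quietly() wraps a function rather than a result, so we wrap source() first
# and then apply the wrapped version to each file. walk() is used because we
# only care about the side effect of sourcing, not the return values.
quiet_source <- purrr::quietly(source)
purrr::walk(r_functions, quiet_source)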

Gathering

Next, we download the data from the Google Sheet where it is collected. Dr. Gilmore has stored his Google account credentials in a special environment file that can be accessed by the R command Sys.getenv("GMAIL_SURVEY").

Tip

It’s vital to be very careful when creating and sharing code like this that involves sensitive information like login credentials.

Gilmore likes to put credentials in an .Renviron file that lives in his home directory. This is a recommended practice. On macOS and Linux, that’s ~/.Renviron. You can use the usethis::edit_r_environ() command at the R console (not the Terminal) to open your own .Renviron file. In Gilmore’s case, he has added the following line to that file:

GMAIL_SURVEY="<my-google-account>"

Here, <my-google-account> is a placeholder: in the real file, it is replaced by the Google account that has access to the required files. Then, when the R code below calls Sys.getenv("GMAIL_SURVEY"), that value is returned as a text string.

Make sure to save and close the .Renviron file and then restart your R session before testing this yourself.
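
One way to confirm that the setup worked is to check that the variable comes back non-empty. The following is a minimal sketch, not part of the original workflow: the helper name check_env_var() is hypothetical, and the Sys.setenv() call is there only so the example runs anywhere. In real use, the value would come from .Renviron and would never be written into code.

```r
# Hypothetical helper: stop early with a clear message if an environment
# variable is missing, rather than failing later during authentication.
check_env_var <- function(var_name) {
  value <- Sys.getenv(var_name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable `", var_name, "` is not set. ",
         "Add it to ~/.Renviron and restart R.", call. = FALSE)
  }
  invisible(value)
}

# For illustration only: set a placeholder value for this session.
# In real use, the value comes from ~/.Renviron, not from code.
Sys.setenv(GMAIL_SURVEY = "<my-google-account>")
account <- check_env_var("GMAIL_SURVEY")
```

Failing early like this makes a missing or misspelled variable name easier to diagnose than an authentication error further down.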

Code
if (!dir.exists('csv')) {
  message("Creating missing `csv/`.")
  dir.create("csv")
}

if (params$update_data) {
  options(gargle_oauth_email = Sys.getenv("GMAIL_SURVEY"))
  googledrive::drive_auth()

  googledrive::drive_download(
    "PSYCH-490.003-Spr-2025-Biz-Govt",
    path = file.path("csv", params$fn),
    type = "csv",
    overwrite = TRUE
  )
  message("Data updated.")
} else {
  message("Using stored data.")
}

The data have been saved as a comma-separated values (CSV) file in a dedicated directory called csv/.

Note

Because these data might contain sensitive or identifiable information, we only keep a local copy and do not share it publicly via GitHub. This is achieved by adding the name of the data directory to a special .gitignore file.
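
As a rough illustration of that step, the sketch below appends an entry to a .gitignore file only if it is not already listed. The helper name add_to_gitignore() and its repo_dir argument are assumptions made for this example, and the demonstration runs in a temporary directory so that no real project is modified.

```r
# Hypothetical helper: append an entry to a .gitignore file unless it is
# already listed. `repo_dir` is assumed to be the project root.
add_to_gitignore <- function(entry, repo_dir = ".") {
  ignore_file <- file.path(repo_dir, ".gitignore")
  existing <- if (file.exists(ignore_file)) {
    readLines(ignore_file, warn = FALSE)
  } else {
    character(0)
  }
  if (!(entry %in% existing)) {
    writeLines(c(existing, entry), ignore_file)
  }
  invisible(ignore_file)
}

# Demonstrate in a temporary directory so no real project is touched.
demo_dir <- tempfile("demo-repo-")
dir.create(demo_dir)
add_to_gitignore("csv/", repo_dir = demo_dir)
add_to_gitignore("csv/", repo_dir = demo_dir)  # second call adds nothing
readLines(file.path(demo_dir, ".gitignore"))
```

If the usethis package is installed, usethis::use_git_ignore("csv/") does a similar job inside a project.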

Cleaning

Next we load the saved data file, and then proceed to clean it.

Code
ex02 <-
  readr::read_csv(file.path("csv", params$fn), show_col_types = FALSE)

There are 7 responses.

These are the column/variable names.

Code
# Google Forms puts the full question in the top row of the data file.
# We use the names() function to extract and print the original questions.
ex02_qs <- names(ex02)
ex02_qs
[1] "identifier"    "source_type"   "url_to_src"    "url_to_figure"
[5] "why_selected"  "comment"      

For simplicity, we visualize below only those responses that include a non-empty URL to the specific figure.

Summary data

Code
figs_w_urls <- ex02 |>
  filter(!is.na(url_to_figure))

There were n=5 unique respondents.

Of the 7 responses from these individuals or teams, n=2 had URLs we could link to directly.
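
Counts like these can be computed directly from the data frame. The sketch below uses base R on a small invented data frame that mimics the column names of ex02. The values are made up for illustration and are not the class's responses.

```r
# Toy data with the same column names as `ex02`; values are invented.
toy <- data.frame(
  identifier = c("a1", "a1", "b2", "c3"),
  url_to_figure = c("https://example.com/fig1.png", NA, NA,
                    "https://example.com/fig2.png"),
  stringsAsFactors = FALSE
)

n_responses   <- nrow(toy)                       # total rows (responses)
n_respondents <- length(unique(toy$identifier))  # distinct identifiers
n_with_urls   <- sum(!is.na(toy$url_to_figure))  # rows with a figure URL
```

In a Quarto document, values like n_respondents can be inserted into the text with inline R code, so the prose stays in sync with the data.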

Figures found

Code
these_figs <- ex02 |>
  filter(!is.na(url_to_figure))

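# seq_len(nrow(...)) yields an empty sequence when there are no rows,
# whereas 1:0 would iterate over c(1, 0).
res <- lapply(seq_len(nrow(these_figs)), return_img_chunk, df = these_figs)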
cat(unlist(res), sep = "\n")
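
The helper return_img_chunk() is loaded from src/R and is not shown on this page. The sketch below is a guess at the general pattern, not the author's implementation: a hypothetical function, sketch_img_chunk(), that takes a row index and a data frame and returns lines of Markdown. It assumes the chunk's output is passed through as-is so that the Markdown is rendered. The toy URLs are placeholders.

```r
# Hypothetical stand-in for a helper like return_img_chunk(); the real
# function lives in src/R and may differ.
sketch_img_chunk <- function(i, df) {
  row <- df[i, ]
  c(
    paste0("### Figure ", i),
    "",
    paste0("![](", row$url_to_figure, ")"),  # Markdown image syntax
    "",
    paste0("Source: <", row$url_to_src, ">"),
    ""
  )
}

# Toy data with placeholder URLs, for illustration only.
toy_figs <- data.frame(
  url_to_src = "https://example.com/article",
  url_to_figure = "https://example.com/fig.png",
  stringsAsFactors = FALSE
)

res <- lapply(seq_len(nrow(toy_figs)), sketch_img_chunk, df = toy_figs)
cat(unlist(res), sep = "\n")
```

Returning character vectors and printing them with cat() keeps the generation of Markdown separate from its output, which makes the helper easier to test.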

Figure 1

Source: https://wpdatatables.com/sports-data-visualization/#:~:text=What%20is%20Sports%20Data%20Visualization,drawn%20from%20sports%2Drelated%20data.

Analyst: Kmm
Source Type: Report
Why Selected: Basketball Fan
Comments: NA

Figure 2

Source:

Analyst: ayc1
Source Type: Poster
Why Selected: Seen in person
Comments: Number of physicians by resident year in a hospital department