
Acknowledgements

  • Thank you to NICHD, NIMH, NIDA, NIH OD, NSF, the Alfred P. Sloan Foundation, the James S. McDonnell Foundation, the LEGO Foundation, and the John Templeton Foundation


Agenda

  • Tools for reproducible science
  • An open science future

Prelude

Questions to ponder

What proportion of findings in the published scientific literature (in the fields you care about) are actually true?

  • 100%
  • 90%
  • 70%
  • 50%
  • 30%

How do we define what “actually true” means?

Is there a reproducibility crisis in science?

  • Yes, a significant crisis
  • Yes, a slight crisis
  • No crisis
  • Don’t know

Have you failed to reproduce an analysis from your lab or someone else’s?

Does this surprise you? Why or why not?

What factors contribute to irreproducible research?

What is reproducibility, anyway?

Methods reproducibility: the same data, analyzed with the same methods and code, yields the same results

Results reproducibility (replicability): a new study with newly collected data yields results consistent with the original

Inferential reproducibility: independent replication or reanalysis leads to qualitatively similar conclusions

(Goodman et al., 2016)

How is scientific research different from other (flawed) human endeavors?

Robert Merton


  • universalism: scientific validity is independent of sociopolitical status/personal attributes of its participants
  • communalism: common ownership of scientific goods (intellectual property)
  • disinterestedness: scientific institutions benefit a common scientific enterprise, not specific individuals
  • organized skepticism: claims should be exposed to critical scrutiny before being accepted

Are these norms at risk?

…psychologists tend to treat other people’s theories like toothbrushes; no self-respecting individual wants to use anyone else’s.

(Mischel, 2011)

The toothbrush culture undermines the building of a genuinely cumulative science, encouraging more parallel play and solo game playing, rather than building on each other’s directly relevant best work.

(Mischel, 2011)

Do you agree or disagree with Mischel?

Discussion of Errington et al., 2021

We conducted the Reproducibility Project: Cancer Biology to investigate the replicability of preclinical research in cancer biology. The initial aim of the project was to repeat 193 experiments from 53 high-impact papers, using an approach in which the experimental protocols and plans for data analysis had to be peer reviewed and accepted for publication before experimental work could begin. However, the various barriers and challenges we encountered while designing and conducting the experiments meant that we were only able to repeat 50 experiments from 23 papers…

(Errington et al., 2021)

…First,…the data needed to compute effect sizes and conduct power analyses was publicly accessible for just 4 of 193 experiments. Moreover, despite contacting the authors of the original papers, we were unable to obtain these data for 68% of the experiments…

(Errington et al., 2021)

Second, none of the 193 experiments were described in sufficient detail in the original paper to enable us to design protocols to repeat the experiments, so we had to seek clarifications from the original authors. While authors were extremely or very helpful for 41% of experiments, they were minimally helpful for 9% of experiments, and not at all helpful (or did not respond to us) for 32% of experiments…

(Errington et al., 2021)

Third, once experimental work started, 67% of the peer-reviewed protocols required modifications to complete the research and just 41% of those modifications could be implemented.

(Errington et al., 2021)

…Cumulatively, these three factors limited the number of experiments that could be repeated. This experience draws attention to a basic and fundamental concern about replication – it is hard to assess whether reported findings are credible.

(Errington et al., 2021)

Is this satisfactory? Why or why not?

Discussion of Munafò et al., 2017

How do these issues affect your research?

Do the solutions seem reasonable and appropriate? Are you convinced?

Do we have the power we need?

Assuming a realistic range of prior probabilities for null hypotheses, false report probability is likely to exceed 50% for the whole literature.

(Szucs & Ioannidis, 2017)
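
A back-of-the-envelope illustration in base R (the effect size, power, and prior-odds values below are my own assumptions for the example, not figures from the paper):

# Participants per group needed to detect a medium effect (d = 0.5)
# with 80% power in a two-sample t-test (stats::power.t.test)
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)
# n is approximately 64 per group

# False report probability implied by alpha, power, and the prior odds R
# that a tested hypothesis is true (cf. Szucs & Ioannidis, 2017)
frp <- function(alpha = 0.05, power = 0.80, R = 0.25) {
  alpha / (alpha + power * R)
}
frp(power = 0.80, R = 0.25)  # 0.20: 1 in 5 positive findings false even at 80% power
frp(power = 0.20, R = 0.25)  # 0.50: half false at the low power typical of many studies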

Silberzahn et al., 2018

Twenty-nine teams involving 61 analysts used the same data set to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players.

(Silberzahn et al., 2018)

How much did results vary between different teams using the same data to test the same hypothesis?

What were the consequences of this variability in analytic approaches?

Did the analysts’ beliefs regarding the hypothesis change over time?

Here, we have demonstrated that as a result of researchers’ choices and assumptions during analysis, variation in estimated effect sizes can emerge even when analyses use the same data.

(Silberzahn et al., 2018)

These findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions.

(Silberzahn et al., 2018)

Transparency in data, methods, and process gives the rest of the community opportunity to see the decisions, question them, offer alternatives, and test these alternatives in further research.

(Silberzahn et al., 2018)

How could a ‘many analysts’ approach be helpful/harmful?

Practical Solutions (Gilmore et al., 2018)

Some recommendations

  • Build upon existing (secondary) data and collect (much) larger samples
  • Share protocols and complete methods (consider video)
  • Share data (FAIRly)
  • Script analyses and share analysis code
  • Preregister data analyses
  • Use version control

What to share

How to share

  • With ethics board/IRB approval
  • With participant permission

Where to share data?

  • Lab website vs.
  • Supplemental information with journal article vs.
  • Data repository

When to share

  • Paper goes out for review or is published
    • Some journals in some fields require post-acceptance “verification”
  • Grant ends
  • Never

How do these suggestions impact your research?

Tools for reproducible science

What is version control and why use it?

  • thesis_new.docx
  • thesis_new.new.docx
  • thesis_new.new.final.docx

vs.

  • thesis_2019-01-15v01.docx
  • thesis_2019-01-15v02.docx
  • thesis_2019-01-16v01.docx

Version control systems

  • Used in large-scale software engineering
  • Systems: svn (Subversion), git
  • Hosting services: GitHub, Bitbucket

How I use GitHub

  • Every project gets a repository
  • Work locally, commit (save & increment version), push to GitHub
  • Talks, classes, software, analyses, web sites
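
A minimal sketch of that commit-and-push cycle from the command line (the file name and commit message here are hypothetical):

git status                           # see which files have changed locally
git add analysis.R                   # stage the changed file
git commit -m "Fix outlier filter"   # save (and version) the change locally
git push                             # send the new commit to GitHub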

FAIR data principles

  • FAIR = Findable, Accessible, Interoperable, Reusable (Wilkinson et al., 2016)
  • Data in interoperable formats (.txt or .csv); see the sketch after this list
  • Scripted, automated = minimize human-dependent steps
  • Well-documented
  • Kind to your future (forgetful) self
  • Transparent to me & colleagues == transparent to others
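
As a minimal illustration (the file names and variables are hypothetical), plain CSV files plus a small codebook keep data readable by any tool:

# Write the data in a plain, interoperable format
write.csv(my_data, "data/my_data.csv", row.names = FALSE)

# A minimal data dictionary, also as CSV
codebook <- data.frame(
  variable    = c("sub_id", "gender", "rt_ms"),
  description = c("Participant ID",
                  "Self-reported gender, lower case",
                  "Response time in milliseconds")
)
write.csv(codebook, "data/my_data_codebook.csv", row.names = FALSE)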

Scripted analyses

# Import/gather data

# Clean data

# Visualize data

# Analyze data

# Report findings

# Import data
my_data <- read.csv("path/2/data_file.csv")

# Clean data
my_data$gender <- tolower(my_data$gender) # make lower case
...

# Import data
source("R/Import_data.R") # source() runs scripts, loads functions

# Clean data
source("R/Clean_data.R")

# Visualize data
source("R/Visualize_data.R")
...
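
One of those sourced scripts might look roughly like this (a sketch only; the variables beyond gender are hypothetical):

# R/Clean_data.R: tidy the raw data before visualization and analysis
my_data$gender <- tolower(my_data$gender)       # harmonize case
my_data <- my_data[!is.na(my_data$gender), ]    # drop rows with missing gender
my_data$rt_ms <- as.numeric(my_data$rt_ms)      # ensure response times are numeric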

But my SPSS syntax file already does this

  • Great! How are you sharing these files?
  • (And how much would SPSS cost you if you had to buy it yourself?)

But I prefer {Python, Julia, Ruby, MATLAB, …}

Reproducible research with R Markdown

  • Add-on package to R, developed by the RStudio team
  • Combine text, code, images, video, equations into one document
  • Render into PDF, MS Word, HTML (web page or site, slides, a blog, or even a book)

x <- rnorm(n = 100, mean = 0, sd = 1)
hist(x)

The mean is -0.0480519, the range is [-2.8958845, 1.9151122].
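
The values in that sentence are computed, not typed: R Markdown lets you embed inline R code in prose, so the source likely looked something like this (a reconstruction, not the actual source):

The mean is `r mean(x)`, the range is [`r min(x)`, `r max(x)`].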

Ways to use R Markdown

An alternative to R Markdown

Registered reports and pre-registration

Why preregister?

  • Nosek: “Don’t fool yourself” (“…and you are the easiest person to fool” – R. Feynman)
  • Separate confirmatory from exploratory analyses
  • Confirmatory (hypothesis-driven): p-values are only trustworthy if the analysis was specified in advance; p-hacking matters here
  • Exploratory: p-values are hard(er) to interpret

How/where

Skeptics and converts

To investigate whether, in psychology, preregistration lives up to that potential, we focused on all articles published in Psychological Science with a preregistered badge between February 2015 and November 2017, and assessed the adherence to their corresponding preregistration plans. We observed deviations from the plan in all studies, and, more importantly, in all but one study, at least one of these deviations was not fully disclosed.

(Claesen et al., 2019)

Large-scale replication studies

Many Labs

Reproducibility Project: Psychology (RPP)

…The mean effect size (r) of the replication effects…was half the magnitude of the mean effect size of the original effects…

(Open Science Collaboration, 2015)

…39% of effects were subjectively rated to have replicated the original result…

(Open Science Collaboration, 2015)

If it’s too good to be true, it probably isn’t

An open science future…

The advancement of detailed and diverse knowledge about the development of the world’s children is essential for improving the health and well-being of humanity…

(SRCD, 2019)

We regard scientific integrity, transparency, and openness as essential for the conduct of research and its application to practice and policy…

(SRCD, 2019)

…the principles of human subject research require an analysis of both risks and benefits…such an analysis suggests that researchers may have a positive duty to share data in order to maximize the contribution that individual participants have made.

(Brakewood & Poldrack, 2013)

Resources

This talk was produced on 2022-04-05 in RStudio using R Markdown and the ioslides framework. The code and materials used to generate the slides may be found at https://psu-psychology.github.io/psy-543-clinical-research-methods-2022/. Information about the R Session that produced the code is as follows:

## R version 4.1.2 (2021-11-01)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Monterey 12.3
## 
## Matrix products: default
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.29   mime_0.12       R6_2.5.1        jsonlite_1.7.2 
##  [5] magrittr_2.0.1  evaluate_0.14   highr_0.9       rlang_0.4.12   
##  [9] stringi_1.7.6   jquerylib_0.1.4 bslib_0.3.1     rmarkdown_2.11 
## [13] tools_4.1.2     stringr_1.4.0   xfun_0.29       yaml_2.2.1     
## [17] fastmap_1.1.0   compiler_4.1.2  htmltools_0.5.2 knitr_1.37     
## [21] sass_0.4.0

References

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature News, 533(7604), 452. https://doi.org/10.1038/533452a

Botvinik-Nezer, R., Holzmeister, F., Camerer, C. F., Dreber, A., Huber, J., Johannesson, M., … Schonberg, T. (2020). Variability in the analysis of a single neuroimaging dataset by many teams. Nature, 582(7810), 84–88. https://doi.org/10.1038/s41586-020-2314-9

Brakewood, B., & Poldrack, R. A. (2013). The ethics of secondary data analysis: Considering the application of belmont principles to the sharing of neuroimaging data. NeuroImage, 82, 671–676. https://doi.org/10.1016/j.neuroimage.2013.02.040

Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., … Wu, H. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 1. https://doi.org/10.1038/s41562-018-0399-z

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716

Errington, T. M., Denis, A., Perfito, N., Iorns, E., & Nosek, B. A. (2021). Challenges for assessing replicability in preclinical cancer biology. eLife, 10, e67995. https://doi.org/10.7554/eLife.67995

Gilmore, R. O., Kennedy, J. L., & Adolph, K. E. (2018). Practical solutions for sharing data and materials from psychological research. Advances in Methods and Practices in Psychological Science, 1(1), 121–130. https://doi.org/10.1177/2515245917746500

Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12–341ps12. https://doi.org/10.1126/scitranslmed.aaf5027

Marek, S., Tervo-Clemmens, B., Calabro, F. J., Montez, D. F., Kay, B. P., Hatoum, A. S., … Dosenbach, N. U. F. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature, 603(7902), 654–660. https://doi.org/10.1038/s41586-022-04492-9

Merton, R. K. (1979). The sociology of science: Theoretical and empirical investigations. (N. W. Storer, Ed.). University of Chicago Press. Retrieved from https://www.amazon.com/Sociology-Science-Theoretical-Empirical-Investigations/dp/0226520927

Mischel, W. (2011). Becoming a cumulative science. APS Observer, 22(1). Retrieved from https://www.psychologicalscience.org/observer/becoming-a-cumulative-science

Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Sert, N. P. du, … Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. https://doi.org/10.1038/s41562-016-0021

NYU Health Sciences Library. (2013, November). Data sharing and management snafu in 3 short acts (higher quality). Youtube. Retrieved from https://www.youtube.com/watch?v=66oNv_DJuPc

Sah, R., Rodriguez-Morales, A. J., Jha, R., Chu, D. K. W., Gu, H., Peiris, M., … Poon, L. L. M. (2020). Complete genome sequence of a 2019 novel coronavirus (SARS-CoV-2) strain isolated in Nepal. Microbiology Resource Announcements, 9(11). https://doi.org/10.1128/MRA.00169-20

Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., … Nosek, B. A. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1(3), 337–356. https://doi.org/10.1177/2515245917747646

SRCD. (2019). Policy on scientific integrity, transparency, and openness. Society for Research in Child Development. Retrieved from https://www.srcd.org/policy-scientific-integrity-transparency-and-openness

Szucs, D., & Ioannidis, J. P. A. (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLoS Biology, 15(3), e2000797. https://doi.org/10.1371/journal.pbio.2000797

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G., Axton, M., Baak, A., … Mons, B. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18