Prevalence of QRPs

2023-10-09 Mon

Rick Gilmore




Prevalence of QRPs

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532.

Cases of clear scientific misconduct have received significant media attention recently, but less flagrantly questionable research practices may be more prevalent and, ultimately, more damaging to the academic enterprise. Using an anonymous elicitation format supplemented by incentives for honest reporting, we surveyed over 2,000 psychologists about their involvement in questionable research practices.

(John et al., 2012)

The impact of truth-telling incentives on self-admissions of questionable research practices was positive, and this impact was greater for practices that respondents judged to be less defensible. Combining three different estimation methods, we found that the percentage of respondents who have engaged in questionable practices was surprisingly high. This finding suggests that some questionable practices may constitute the prevailing research norm.

(John et al., 2012)

Accessibility/openness notes

  • Paper not openly accessible.
  • Paper was accessible via authenticated access to PSU library.

Table 1 from (John et al., 2012)

The two versions of the survey differed in the incentives they offered to respondents. In the Bayesian-truth-serum (BTS) condition, a scoring algorithm developed by one of the authors (Prelec, 2004) was used to provide incentives for truth telling. This algorithm uses respondents’ answers about their own behavior and their estimates of the sample distribution of answers as inputs in a truth-rewarding scoring formula. Because the survey was anonymous, compensation could not be directly linked to individual scores.

Instead, respondents were told that we would make a donation to a charity of their choice, selected from five options, and that the size of this donation would depend on the truthfulness of their responses, as determined by the BTS scoring system. By inducing a (correct) belief that dishonesty would reduce donations, we hoped to amplify the moral stakes riding on each answer (for details on the donations, see Supplementary Results in the Supplemental Material).

Respondents were not given the details of the scoring system but were told that it was based on an algorithm published in Science and were given a link to the article. There was no deception: Respondents’ BTS scores determined our contributions to the five charities.

Respondents in the control condition were simply told that a charitable donation would be made on behalf of each respondent. (For details on the effect of the size of the incentive on response rates, see Participation Incentive Survey in the Supplemental Material.)

(John et al., 2012)
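
To make the "truth-rewarding scoring formula" less mysterious, here is a minimal sketch of the scoring rule as it is usually stated from Prelec (2004). It is not the authors' code: the survey paper does not give implementation details, and the function name `bts_scores`, the `alpha` weight, the `eps` clamp, and the toy data are all assumptions made for illustration. Each respondent gets an information score (did they pick an answer that turned out to be more common than the group predicted?) plus a prediction score (how close was their forecast to the actual distribution of answers?).

```python
import math


def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Sketch of Bayesian-truth-serum scores for one multiple-choice question.

    answers:     list of chosen option indices, one per respondent
    predictions: list of lists; predictions[r][k] is respondent r's estimate
                 of the fraction of the sample choosing option k
    alpha:       weight on the prediction score (hypothetical default)
    eps:         clamp to avoid log(0) (an implementation assumption)
    """
    n = len(answers)
    m = len(predictions[0])

    # Actual fraction of respondents endorsing each option
    x_bar = [sum(1 for a in answers if a == k) / n for k in range(m)]

    # Log of the geometric mean of the predicted fractions for each option
    log_y_bar = [
        sum(math.log(max(p[k], eps)) for p in predictions) / n
        for k in range(m)
    ]

    scores = []
    for a, p in zip(answers, predictions):
        # Information score: reward answers that are "surprisingly common"
        info = math.log(max(x_bar[a], eps)) - log_y_bar[a]
        # Prediction score: penalize forecasts far from the actual fractions
        pred = sum(
            x_bar[k] * math.log(max(p[k], eps) / x_bar[k])
            for k in range(m)
            if x_bar[k] > 0
        )
        scores.append(info + alpha * pred)
    return scores


# Toy data: option 1 = "yes, I have done this", option 0 = "never".
# One respondent admits the practice; everyone expected admissions to be rare.
answers = [1, 0, 0, 0]
predictions = [[0.6, 0.4], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
scores = bts_scores(answers, predictions)
```

In this toy example the respondent who admits the practice scores highest, because admissions turned out to be more common than the group predicted. How the scores were turned into charity donations is not modeled here.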

Figure 1 from John et al. (2012)

Figure 2 from John et al. (2012)


What do you think about the “truth serum” manipulation?

Do the data persuade you that it made respondents more honest?

Reproducibility notes for (John et al., 2012)

Supplemental Material Additional supporting information may be found at

(John et al., 2012)

  • I went to the journal page and searched for the article title.
  • Since the article is behind a paywall, I wasn’t able to access the supplemental materials that way.
  • After authenticating to the PSU library, I was able to find a PDF of the supplementary material. Both it and the original paper are on Canvas.
  • I was unable to find the raw data, but I found the questions on p. 5 of the supplementary material.

Supplementary material for (John et al., 2012)

More evidence about prevalence of QRPs

Survey of \(n = 3{,}247\) NIH-funded scientists

Table 1 from Martinson, Anderson, & de Vries (2005)

Meta-analysis (Xie, Wang, & Kong, 2021)

Self-reported prevalence; Figure 2 from Xie et al. (2021)

Observed prevalence; Figure 3 from Xie et al. (2021)

Evaluating p-hacking

Our results

  • Who got a “significant” result?
  • How many different analyses did you try?
  • Who changed their analysis after finding a significant result?
  • Did anyone try another analysis after getting a significant result, and keep the non-significant result?
  • Visualizations of the data
  • Do we have good evidence about which party harms or helps the economy?
  • Why or why not?
  • What is p-hacking?
  • Did you p-hack? Did another classmate? Why do you think so?
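
One way to see why the number of analyses tried matters: if there is truly no effect, each test still has a 5% chance of coming out "significant", and those chances add up across attempts. The sketch below is not part of the class exercise. It assumes the analyses are independent (real analyses of one dataset are correlated, so the inflation is usually smaller but still present), and the function names are hypothetical.

```python
import random


def familywise_error_rate(n_analyses, alpha=0.05):
    """Chance that at least one of n independent null tests has p < alpha."""
    return 1 - (1 - alpha) ** n_analyses


def simulate_best_of_k(n_analyses, n_researchers=10_000, alpha=0.05, seed=1):
    """Simulate researchers who each run n_analyses tests on null data
    and report a finding if any single test is significant.

    Under a true null with a continuous test statistic, a p-value is
    uniformly distributed on [0, 1], so each test is drawn with rng.random().
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_researchers):
        p_values = [rng.random() for _ in range(n_analyses)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_researchers


analytic = familywise_error_rate(20)   # 1 - 0.95**20, roughly 0.64
simulated = simulate_best_of_k(20)     # should land close to the analytic value
```

Under these assumptions, a researcher who tries 20 analyses on pure noise has roughly a 64% chance of finding at least one "significant" result.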

Daniel Patrick Moynihan from Wikipedia

You are entitled to your own opinion, but you are not entitled to your own facts.

Next time

File drawer effect



Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532.
Martinson, B. C., Anderson, M. S., & de Vries, R. (2005). Scientists behaving badly. Nature, 435(7043), 737–738.
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641.
Xie, Y., Wang, K., & Kong, Y. (2021). Prevalence of research misconduct and questionable research practices: A systematic review and meta-analysis. Science and Engineering Ethics, 27(4), 41.