2024-10-14 Mon

Work session on p-hacking

Rick Gilmore

Overview

On this day

  • Columbus Day or Indigenous Peoples’ Day or Día de la Raza or…

Announcements

Last time…

Questionable research practices

Defensible choice or questionable practice?

  • In a paper, failing to report all of a study’s dependent measures
  • Deciding whether to collect more data after looking to see whether the results were significant
  • In a paper, failing to report all of a study’s conditions
  • Stopping collecting data earlier than planned because one found the result that one had been looking for
  • In a paper, “rounding off” a p value (e.g., reporting that a p value of .054 is less than .05)
  • In a paper, selectively reporting studies that “worked”
  • Deciding whether to exclude data after looking at the impact of doing so on the results
  • In a paper, reporting an unexpected finding as having been predicted from the start
  • In a paper, claiming that results are unaffected by demographic variables (e.g., gender) when one is actually unsure (or knows that they do)
  • Falsifying data

More on Simmons, Nelson, & Simonsohn (2011)

  • Researcher choices can inflate false positive rates
  • Analyzing a small initial sample and then collecting more data only if the result is not yet significant inflates the false positive rate
  • A test that meets the significance threshold (\(\alpha\)) in a small sample will not necessarily do so in a larger one
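
The effect of peeking at the data can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reproduction of the simulations in Simmons et al. (2011): it uses a one-sample z-test with a known standard deviation of 1 rather than a t-test, and the batch sizes (start at 10, add 10 at a time, stop at 50), the function names, and the seed are all arbitrary choices made for illustration. The null hypothesis is true in every simulated study, so every "significant" result is a false positive.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
# Two-sided critical value for the z-test (about 1.96 when ALPHA is 0.05).
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)


def is_significant(sample):
    """Two-sided z-test of mean == 0, assuming a known SD of 1."""
    z = fmean(sample) * len(sample) ** 0.5
    return abs(z) > Z_CRIT


def run_study(rng, n_start, n_step, n_max, peek):
    """Simulate one study in which the null hypothesis is true.

    With peek=True the researcher tests after every batch and stops at
    the first significant result. With peek=False only the final sample
    of size n_max is tested.
    """
    sample = [rng.gauss(0, 1) for _ in range(n_start)]
    while True:
        if peek and is_significant(sample):
            return True
        if len(sample) >= n_max:
            return is_significant(sample)
        sample.extend(rng.gauss(0, 1) for _ in range(n_step))


def false_positive_rate(peek, n_studies=5000, seed=1):
    """Share of simulated null studies that end with a significant result."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, 10, 10, 50, peek) for _ in range(n_studies))
    return hits / n_studies


if __name__ == "__main__":
    print("fixed n :", false_positive_rate(peek=False))
    print("peeking :", false_positive_rate(peek=True))
```

With a fixed sample size the rate should land near \(\alpha\) = .05; with five looks at the data it should come out well above that, even though no real effect exists.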

Recommendations

Table 2 from Simmons et al. (2011)


Today

Work Session: P-hacking and Final Project Proposals

Set-up for group discussion…

  • Enter (anonymized) data into a spreadsheet

https://docs.google.com/spreadsheets/d/1NXcBrI_bMP_wFi1BurCS5WGppr9HWBiF5ulh7ch61MU/edit?usp=sharing

  • Choose a student_id (integer) for yourself (not your PSU ID or phone number)

Evaluating p-hacking

Our results

  • Who got a “significant” result?
  • How many different analyses did you try?
  • Who changed their analysis after finding a significant result?
  • After getting a significant result, did anyone try another analysis and keep the non-significant result?
  • Visualizations of the data
  • Do we have good evidence about which party harms or helps the economy?
  • Why or why not?
  • What is p-hacking?
  • Did you p-hack? Did another classmate? Why do you think so?
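
A rough guide for the "how many analyses" question, offered as a simplified sketch rather than a description of our class data: if \(k\) analyses were statistically independent, each tested at level \(\alpha\), and no true effect existed, the chance that at least one comes out "significant" would be

\[
P(\text{at least one } p < \alpha) = 1 - (1 - \alpha)^k
\]

For \(\alpha = .05\) this gives about .23 when \(k = 5\) and about .40 when \(k = 10\). Independence is an assumption made here for simplicity. Analyses run on the same dataset are typically correlated, so the actual inflation is usually smaller than the formula suggests, but it still exceeds \(\alpha\).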

Daniel Patrick Moynihan from Wikipedia

You are entitled to your own opinion, but you are not entitled to your own facts.

Next time

Prevalence of QRPs

  • Discuss
    • John, Loewenstein, & Prelec (2012)

Resources

References

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. https://doi.org/10.1177/0956797611430953
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632