A p-hacking guide
- Manousos A. Klados
Imagine you’re a scientist passionately working on a groundbreaking experiment. You’ve spent months collecting data, but the initial analysis shows no significant results. Frustration kicks in — after all, significant findings often mean recognition, publication, or even funding. What if you could slightly tweak the analysis, exclude a problematic data point, or keep testing until something “interesting” emerges? That’s exactly what p-hacking entails.
What Exactly Is P-Hacking?
P-hacking, often called “data dredging” or “selective reporting,” refers to manipulating statistical analyses or selectively reporting results until a statistically significant outcome (usually indicated by a p-value below the conventional threshold of 0.05) emerges. While sometimes unintentional, p-hacking seriously undermines scientific integrity, casting doubt on research reliability.
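To see why that 0.05 threshold matters, here is a minimal Python sketch (the sample sizes and random seed are arbitrary choices, not from any real study) showing that even when there is no effect at all, roughly 5% of tests come out "significant" by chance alone:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate 10,000 experiments in which the null hypothesis is TRUE:
# both groups are drawn from the same normal distribution.
n_experiments = 10_000
false_positives = 0
for _ in range(n_experiments):
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

# Expect roughly 0.05: that is the false-positive rate the threshold
# is designed to cap for a SINGLE pre-planned test.
print(f"False-positive rate: {false_positives / n_experiments:.3f}")
```

That 5% guarantee holds only for a single, pre-specified test; every p-hacking trick below quietly breaks that assumption.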
Common Methods of P-Hacking
Let’s explore some common ways researchers might unintentionally (or intentionally) engage in p-hacking:
Fishing Expeditions:
Imagine testing numerous hypotheses and cherry-picking only those results that appear significant. It’s akin to throwing darts until you finally hit the target, then presenting only that final success. A short simulation after this list shows how often pure noise alone produces such a “hit.”
Flexible Data Collection:
Continuously peeking at the data and halting data collection the moment the results appear favorable. It’s like ending a game only once you’re in the lead.
Conveniently Ignoring Data:
Arbitrarily excluding outliers or problematic data points simply because they shift the results towards significance.
Switching Analytical Methods:
Trying various statistical tests or models until one of them produces a significant result, abandoning the originally planned analysis along the way.
Selective Reporting:
Publishing only the experiments or outcomes that “worked,” leaving out unsuccessful or null results from the narrative.
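To make the first of these tricks concrete (the simulation promised above), here is a hedged sketch of a fishing expedition: twenty unrelated outcomes measured on two groups of pure noise, with only the first "significant" comparison reported. The function name fishing_trip and every parameter here are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def fishing_trip(n_outcomes=20, n_per_group=30):
    """Test 20 unrelated outcomes on pure noise; 'report' the first hit."""
    for _ in range(n_outcomes):
        a = rng.normal(size=n_per_group)
        b = rng.normal(size=n_per_group)
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            return True   # cherry-pick this result, hide the rest
    return False

n_studies = 2_000
hits = sum(fishing_trip() for _ in range(n_studies))
# With 20 independent tests, P(at least one p < 0.05) = 1 - 0.95**20 ≈ 0.64
print(f"Studies with a cherry-picked 'discovery': {hits / n_studies:.2f}")
```

Even though every dataset here is pure noise, roughly two thirds of these simulated studies can claim a "finding," which is exactly why corrections for multiple comparisons exist.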
How Can We Spot P-Hacking?
Detecting p-hacking isn’t always straightforward, but there are telltale signs:
Suspiciously Clustered P-Values:
An unusual abundance of p-values just below the significance threshold (like 0.049) suggests manipulation.
Comparing to Original Plans:
Comparing published results against pre-registered protocols, which outline study methods in advance, can expose deviations and selective reporting.
Inconsistent Method Descriptions:
Vague or unclear methodology sections may signal that researchers selectively reported or adjusted methods post-hoc.
P-Curve Analysis:
A statistical method used to evaluate the distribution of p-values across multiple studies, highlighting potential manipulation when there is clustering around the significance boundary.
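As a toy illustration of that last idea, the sketch below implements only the binning step behind a p-curve (the published method also tests the curve's shape formally). Both lists of p-values are invented for illustration:

```python
import numpy as np

def p_curve_bins(p_values):
    """Bin reported significant p-values into five 0.01-wide bins.

    A genuine effect typically yields a right-skewed curve with most
    mass near zero; a pile-up in the 0.04-0.05 bin is a warning sign.
    """
    p = np.asarray(p_values)
    p = p[p < 0.05]  # p-curve analysis looks only at significant results
    counts, _ = np.histogram(p, bins=np.arange(0, 0.06, 0.01))
    labels = ["<.01", ".01-.02", ".02-.03", ".03-.04", ".04-.05"]
    return dict(zip(labels, counts))

# Two hypothetical sets of reported p-values:
genuine = [0.001, 0.003, 0.004, 0.008, 0.012, 0.020, 0.031, 0.046]
suspect = [0.038, 0.041, 0.043, 0.044, 0.046, 0.047, 0.048, 0.049]

print("genuine effect:", p_curve_bins(genuine))
print("suspect set:   ", p_curve_bins(suspect))
```

Counts that fall away as p approaches 0.05 indicate evidential value; the pile-up just under 0.05 in the second set is precisely the clustering pattern described above.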
Protecting Against P-Hacking
To maintain scientific integrity, researchers can adopt several practices:
Pre-Registration: Clearly stating hypotheses, methods, and intended analyses in advance makes post-hoc changes to the analysis easy to spot.
Transparent Reporting: Publishing all tested hypotheses and findings — including null or negative results — encourages openness and accountability.
Replication Studies: Encouraging independent research groups to replicate results ensures that findings hold under scrutiny.
By recognizing, detecting, and actively preventing p-hacking, the scientific community safeguards trust, credibility, and ultimately, the advancement of genuine knowledge.