Open Access, Open Data – Closed Understanding? Why Open Science Alone Can’t Fix the Reproducibility Crisis
When science becomes untrustworthy, the consequences ripple far beyond academia. The ongoing reproducibility crisis—also called the replication crisis—reveals deep flaws in how scientific knowledge is produced and published. Despite the rise of Open Science, with initiatives like data sharing and reproducibility projects, critical problems such as p-hacking, publication bias, and weak statistical practices persist.
Leading voices like John Ioannidis have long warned that a large proportion of published research may be unreliable. It is becoming increasingly clear that Open Science, while valuable for transparency, cannot on its own solve the structural and epistemological weaknesses embedded in modern scientific publishing.
The Replication Crisis: More Than Just P-Hacking
The replication crisis is no longer just a whispered concern in academic hallways — it has become a recognized challenge across psychology, medicine, economics, and beyond. Initial shocks came from large-scale replication projects, such as the Open Science Collaboration (2015), which found that only 36% of the psychology studies it repeated produced statistically significant results on replication. In biomedicine, Ioannidis's now-classic 2005 paper, provocatively titled "Why Most Published Research Findings Are False," laid bare how flawed statistical reasoning and methodological shortcuts produce misleading results.
Much of the public discourse has focused on p-hacking — the practice of manipulating data or analyses until statistically significant results emerge (e.g., through selective reporting or testing multiple hypotheses without correction). While p-hacking certainly exemplifies the problem, it is not the crisis itself — it is a symptom. Others include:
- Low statistical power: Studies are often underpowered due to small sample sizes, which lowers the chance of detecting true effects and reduces the probability that a statistically significant finding reflects a genuine effect (Button et al., 2013).
- Publication bias: Journals favor novel, positive findings, incentivizing researchers to chase significance.
- Lack of transparency: Without open data and code, replication is difficult and errors go undetected.
- Questionable research practices: From HARKing (hypothesizing after results are known) to selective exclusion of data, the list is long.
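The multiple-testing problem behind p-hacking can be made concrete with a short simulation. The sketch below is purely illustrative: it assumes a hypothetical "study" that runs 20 independent tests on pure noise, exploiting the fact that under a true null hypothesis p-values are uniformly distributed.

```python
import random

random.seed(42)

ALPHA = 0.05
N_TESTS = 20        # hypotheses tested per "study", none with a real effect
N_STUDIES = 100_000

fp_uncorrected = 0  # studies reporting a "finding" without correction
fp_corrected = 0    # studies reporting a "finding" after Bonferroni correction

for _ in range(N_STUDIES):
    # Under a true null hypothesis, each p-value is uniform on [0, 1].
    p_values = [random.random() for _ in range(N_TESTS)]
    if min(p_values) < ALPHA:            # "significant" on any uncorrected test
        fp_uncorrected += 1
    if min(p_values) < ALPHA / N_TESTS:  # Bonferroni-corrected threshold
        fp_corrected += 1

observed = fp_uncorrected / N_STUDIES
expected = 1 - (1 - ALPHA) ** N_TESTS    # analytic family-wise error rate
corrected = fp_corrected / N_STUDIES

print(f"Uncorrected: {observed:.3f} (expected {expected:.3f})")
print(f"Bonferroni-corrected: {corrected:.3f}")
```

With 20 uncorrected tests, roughly two thirds of these pure-noise studies produce at least one "significant" result, while the corrected threshold keeps the rate near the nominal 5% — which is exactly why testing multiple hypotheses without correction is a symptom of the crisis.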
These are not isolated lapses in judgment. They emerge systematically from a research culture oriented more toward publishability than robustness.
In response to these systemic issues, the scientific community has turned to transparency-focused reforms, most prominently Open Science. The idea is intuitive: if researchers share their data, preregister hypotheses, and adopt open peer review, errors and questionable practices should be easier to detect and correct.
Open Science promises not only to make research more transparent but also to realign incentives, encouraging rigor over novelty. Yet the effectiveness of these reforms depends crucially on widespread adoption and proper implementation—conditions that are still far from guaranteed.
Reform Efforts: The Rise of Open Science
The Open Science movement — encompassing pre-registration, registered reports, open data, and open peer review — has aimed to restore credibility and replicability. Institutions have begun incentivizing transparency, and some journals now require or strongly encourage pre-registration (Nosek et al., 2018).
Statistical education has also seen reform. Researchers are now urged to report effect sizes and confidence intervals, not just p-values. Bayesian methods are increasingly suggested as alternatives to frequentist tools.
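As a minimal illustration of reporting beyond p-values, the following sketch computes Cohen's d and a confidence interval for a simulated two-group comparison. The data and group sizes are hypothetical, and the interval uses a normal approximation (with samples this large, the z critical value 1.96 is close to the exact t-based value).

```python
import math
import random
import statistics

random.seed(1)

# Two hypothetical groups with a true mean difference of 0.5 standard deviations.
group_a = [random.gauss(0.0, 1.0) for _ in range(200)]
group_b = [random.gauss(0.5, 1.0) for _ in range(200)]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
n_a, n_b = len(group_a), len(group_b)

# Cohen's d: the mean difference scaled by the pooled standard deviation.
pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
cohens_d = (mean_b - mean_a) / pooled_sd

# 95% CI for the raw mean difference (normal approximation).
diff = mean_b - mean_a
se_diff = math.sqrt(var_a / n_a + var_b / n_b)
ci_low, ci_high = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"d = {cohens_d:.2f}, 95% CI for the difference: [{ci_low:.2f}, {ci_high:.2f}]")
```

Unlike a bare p-value, this output tells a reader how large the effect is and how precisely it was estimated — the information the reform efforts above ask researchers to report.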
These are valuable interventions. They make cheating harder and good science easier. But even these systemic reforms may not address deeper epistemological issues.
Reforms and Their Limits: Why Open Science (Still) Isn’t Enough
Before we consider the full potential of these practices, it is important to recognize that adoption is far from universal. But imagine, for a moment, a scenario in which Open Science practices were fully applied: would all reproducibility issues then disappear?
Intrinsic limitations of empirical research—such as context dependence, variability in populations, and the inherent complexity of social phenomena—mean that two identical studies can yield different outcomes.
Moreover, the clarity and validity of theoretical constructs themselves can constrain how informative replication attempts are. In other words, transparency and preregistration can significantly reduce bias, but they cannot eliminate the inherent uncertainty and variability that characterize empirical investigation.
Open Science is not a failure. On the contrary, preregistration, open data, open peer review and transparent analyses have the potential to dramatically increase the credibility of research. Yet two major hurdles prevent these reforms from truly overcoming the replication crisis.
1. The Implementation Gap.
Open Science practices are time-consuming, bring little immediate career payoff and clash with existing incentive structures. Journals still reward spectacular “positive” findings; funding schemes still prize rapid publication.
Many researchers, especially at early career stages, shy away from the extra workload and perceived risk of preregistration and data sharing. Added to this are legal and technical barriers such as data protection, anonymization requirements, and a lack of platform know-how. The result is that Open Science is practiced sporadically rather than at scale.
2. The Residual Problem Even at 100% Adoption.
Even if tomorrow every researcher preregistered hypotheses and shared data openly, not all reproducibility problems would disappear. Three issues would remain:
Theoretical vagueness. Transparent methods are of little use if the constructs under study are themselves vague or ambiguous.
Measurement and context effects. Results depend on instruments, populations and timing. The same well-designed study may yield different outcomes in different contexts.
Randomness and complexity. Especially in the social and life sciences, variability is unavoidable. Even correctly designed studies will not always produce identical results.
Importantly, these residual limitations are not failures of the Open Science tools themselves; they reflect the inherent boundaries of empirical methods.
These two layers – lack of adoption and inherent limits – show that Open Science is a necessary but not sufficient condition for robust science.
Transparency reduces bias but does not eliminate fuzzy theories, faulty applied methods or context-bound differences. The replication crisis is therefore not just a technical problem but also a structural and conceptual one. To minimize bias, the next section collects where biases arise and how they can be avoided.
Ensuring Research Quality – Guidance, Guidelines, and Consequences
Both conscious and unintentional mistakes can introduce bias and reduce reproducibility. By systematically identifying where bias can creep in and implementing strategies to minimize it, research becomes not only more reproducible but also more trustworthy. Below, we outline where research can go wrong and practical ways to mitigate these issues.
Where Bias Can Arise
Even well-intentioned studies can suffer from bias at multiple stages:
Research design: Poorly phrased or leading questions, unclear operationalization of variables, or excessive simplification of complex phenomena can skew findings. Sample selection also matters: lack of representativeness, inadequate randomization, or neglecting specific populations (e.g., patients with mental health conditions) can bias results. Ethical and transparent reporting is essential: participants must be informed, and all methods and hypotheses documented.
Execution: Inconsistent procedures, insufficient training of researchers, or failure to accommodate participant needs (motivation, stress, attention) can all introduce variability. Social desirability and other response biases must be monitored.
Data analysis: Flexibility in statistical approaches, selective reporting, and improper handling of non-significant results can distort conclusions. Overly complex instruments may overwhelm participants, while too few measures can miss important dimensions.
Peer review and scientific community: Lack of critical, independent review and homogeneous perspectives may reinforce errors and groupthink.
The consequences of poor research quality are far-reaching: distorted or non-replicable results, loss of trust in science, flawed decisions in practice and policy, and reputational damage to researchers and institutions.
How Bias Can Be Minimized
Ensuring research quality requires deliberate actions at each stage:
Design phase: Use validated, clearly operationalized measures. Ensure representative, randomized samples, and include relevant subgroups. Document all procedures, hypotheses, and methods transparently.
Execution phase: Standardize procedures to reduce variability, provide researcher training on potential biases, and account for participant-specific needs. Monitor and mitigate social desirability and other response distortions.
Analysis phase: Pre-specify analyses, ideally via preregistration. Avoid p-hacking and selective reporting, and publish non-significant results. Balance detail and practicality in measurement instruments.
Community and peer review: Engage diverse, independent reviewers and be open to alternative methods or theories. Encourage a culture that values methodological rigor over flashy results.
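A concrete design-phase check related to the sampling guidance above is an a priori power analysis. The sketch below estimates power by simulation; the helper name `simulated_power`, the assumed effect size, and the normal-approximation two-group test are all illustrative choices, not a prescribed method.

```python
import math
import random
import statistics

random.seed(7)

def simulated_power(n_per_group, effect_size, n_sims=2000, alpha_z=1.96):
    """Estimate the fraction of simulated two-group studies whose two-sided
    test reaches significance, given a true standardized effect size."""
    hits = 0
    for _ in range(n_sims):
        a = [random.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [random.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        se = math.sqrt(statistics.variance(a) / n_per_group +
                       statistics.variance(b) / n_per_group)
        z = (statistics.mean(b) - statistics.mean(a)) / se
        if abs(z) > alpha_z:  # normal approximation to the two-sample test
            hits += 1
    return hits / n_sims

# For a "medium" effect (d = 0.5), 20 participants per group is badly
# underpowered, while about 64 per group reaches the conventional 80% target.
print(simulated_power(20, 0.5))
print(simulated_power(64, 0.5))
```

Running such a check before data collection makes the low-power problem discussed earlier visible at the planning stage, when it can still be fixed by recruiting a larger sample.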
By systematically addressing these areas, researchers can substantially reduce bias, improve reproducibility, and strengthen trust in science.
Objectivity – An Approach, Not a Destination
Science aspires to objectivity – but what does that actually mean?
Imagine that every person wears a pair of glasses with a unique color filter. Everyone sees the same world, but in different hues. One person claims to see “the truth.” Yet for the others, that is just another color. Even if the first person lends their glasses to someone else, the second now sees that version – but the original diversity remains.
This is how empirical research works, too: we can approach objectivity, we can make our filters visible, we can improve our methods. But we can never see entirely without filters.
Perhaps the future of science lies not only in technical fixes, but also in openly acknowledging these limits – and thinking about whether we need other ways of knowing as well.
This text does not end with a ready-made answer. It ends with a question:
“How can we – as researchers and as a society – find ways to create more trustworthy knowledge together, despite the glasses we each wear?”
References
Begley, C. G., & Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature, 483(7391), 531–533.
Button, K. S., et al. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376.
Eronen, M. I., & Bringmann, L. F. (2021). The theory crisis in psychology: How to move forward. Perspectives on Psychological Science, 16(4), 779–788. https://doi.org/10.1177/1745691620970586
Flake, J. K., & Fried, E. I. (2020). Measurement schmeasurement: Questionable measurement practices and how to avoid them. Advances in Methods and Practices in Psychological Science, 3(4), 456–465. https://doi.org/10.1177/2515245920952393
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. https://doi.org/10.1371/journal.pmed.0020124
Nosek, B. A., et al. (2018). The preregistration revolution. PNAS, 115(11), 2600–2606. https://doi.org/10.1073/pnas.1708274114
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology. Psychological Science, 22(11), 1359–1366.
Yarkoni, T. (2020). The generalizability crisis. Behavioral and Brain Sciences, 44, e1. https://doi.org/10.1017/S0140525X20001685
Authored by Rebekka Brandt
