The Dead Salmon in the MRI: What brain imaging errors reveal about our growing trust in machines, algorithms, and data-driven science
At first glance, the story sounds like a bad joke. In fact, it is one of the most instructive episodes in modern science. A dead Atlantic salmon is placed inside a functional magnetic resonance imaging scanner. The fish is shown photographs of humans expressing different emotions. The analysis reports statistically significant brain activity. The salmon, apparently, “responds.”
Of course, the fish does not respond. It is dead. And that is precisely the point.
The so-called Dead Salmon Study was not conducted to mock neuroscience or to provoke for provocation’s sake. Its purpose was to expose a fundamental methodological problem: even highly advanced, well-established, and widely trusted measurement technologies can produce systematically false results—without fraud, manipulation, or malicious intent. The errors arise purely from the way data are generated, processed, and interpreted.
This insight extends far beyond fMRI. It strikes at the heart of a deeply ingrained assumption in modern research: that machine-generated measurements are inherently objective and reliable. The study forces an uncomfortable question. If even fMRI, a heavyweight technology used in medicine and research for decades, can generate false positives, how much more vulnerable must younger research technologies be?
MRI as the Gold Standard of Objectivity
Functional MRI is widely regarded as a paradigmatic example of objective measurement in medicine and neuroscience. It is expensive, technologically complex, grounded in sophisticated physics, and firmly embedded in clinical and research practice. Presenting fMRI data carries authority. Machines, we assume, see what humans cannot.
This combination of technical complexity and visual output generates trust. Color-coded brain images convey clarity: something is happening here; the brain is responding; activity has been detected. Rarely acknowledged is the long chain of statistical assumptions, filters, thresholds, and models that stand between raw signals and the final activation map.
In everyday scientific discourse, fMRI is often treated not as an interpretive instrument but as a window into truth. That assumption is precisely what makes it dangerous.
What fMRI Actually Measures—and What It Does Not
A central misconception is that fMRI measures “brain activity.” In reality, it measures changes in blood oxygenation (the BOLD signal), an indirect proxy for neural processes. Even at this basic level, interpretation begins: blood flow is taken as a stand-in for neural activity, and the signal arrives seconds after the neural event, is spatially smoothed during preprocessing, and is statistically modeled before any “activation” appears.
*Photo by Dirk Adams on Unsplash*
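To make the proxy nature concrete, here is a minimal sketch of how an instantaneous burst of neural activity turns into the delayed, smeared signal that fMRI actually records. The double-gamma response function and its parameters below are illustrative only, not the exact model of any particular analysis package:

```python
import numpy as np
from scipy.stats import gamma

# Illustrative double-gamma hemodynamic response function (HRF);
# parameters chosen for demonstration, not taken from any specific pipeline.
t = np.arange(0, 30, 0.5)                       # seconds
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 12)
hrf /= hrf.max()

# A brief burst of "neural activity" at t = 0 ...
neural = np.zeros(120)                          # 60 s at 0.5 s resolution
neural[0] = 1.0

# ... shows up in the measured signal seconds later, stretched and smoothed.
bold = np.convolve(neural, hrf)[: len(neural)]
peak_s = 0.5 * np.argmax(bold)
print(f"Neural event at t = 0.0 s; BOLD peak at t = {peak_s:.1f} s")
```

Everything the scanner reports sits on the far side of this convolution: a slow, blurred echo of whatever the neurons actually did.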
These processing steps are not optional. Without them, the data would be unusable noise. But each step introduces opportunities for systematic error. Particularly problematic is the multiple-comparisons problem. Tens of thousands of voxels are tested simultaneously. Without proper correction, statistically significant results are almost guaranteed to appear somewhere—even when no real signal exists.
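The scale of the problem is easy to underestimate: at a 5% significance threshold, testing 60,000 pure-noise voxels yields roughly 3,000 “significant” ones by chance alone. A minimal simulation, with voxel and scan counts chosen purely for illustration, shows both the problem and the effect of a family-wise correction:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_voxels = 60_000   # roughly the order of magnitude of a whole-brain scan
n_scans = 30        # hypothetical time points per condition

# Pure noise: by construction, there is no real signal anywhere.
condition_a = rng.normal(size=(n_voxels, n_scans))
condition_b = rng.normal(size=(n_voxels, n_scans))

# One t-test per voxel, as a naive mass-univariate analysis would run.
t_vals, p_vals = stats.ttest_ind(condition_a, condition_b, axis=1)

alpha = 0.05
uncorrected_hits = np.sum(p_vals < alpha)
bonferroni_hits = np.sum(p_vals < alpha / n_voxels)  # family-wise correction

print(f"'Significant' voxels, uncorrected: {uncorrected_hits}")  # ~3,000
print(f"'Significant' voxels, Bonferroni:  {bonferroni_hits}")   # ~0
```

Bonferroni is the bluntest possible correction; real pipelines use subtler methods such as false discovery rate or cluster-level inference. The lesson, however, is the same: without some correction, “significant” findings in noise are a statistical certainty.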
This is exactly what the dead salmon demonstrated. Using standard analysis pipelines, apparently meaningful activation patterns emerged in an organism incapable of neural activity. Not because anyone cheated, but because the statistics were misapplied.
Not an Outlier: Systemic Errors in MRI Research
The dead salmon is not an amusing anomaly; it is a symbol. Subsequent research revealed that widely used fMRI software packages relied for years on flawed assumptions about cluster-based statistics. Meta-analyses later suggested that a substantial proportion of published fMRI studies may suffer from inflated false-positive rates.
Crucially, these studies were not retracted due to misconduct. They reflected the best methodological understanding available at the time. Only retrospectively did it become clear that entire research domains were built on unstable foundations.
This distinction matters. The problem lies not in deception, but in the interaction between technology, statistics, and interpretation. fMRI produces signals, not truths. Meaning emerges through interpretation—and so do errors.
Technological Maturity Does Not Guarantee Reliability
This is the core issue. fMRI is not a new or experimental gadget. It is regulated, standardized, and deeply embedded in medical infrastructure. Yet it still took decades for fundamental methodological weaknesses to be widely recognized.
That should be alarming. It challenges a comforting narrative of technological progress: that older technologies are inherently more reliable. The history of fMRI suggests the opposite. Maturity can conceal assumptions rather than eliminate errors; problems do not disappear with age, they merely become harder to see.
If this applies to a technology as established as fMRI, the implications for newer tools are obvious.
Emerging Research Technologies: Bold Claims, Unknown Errors
Eye-tracking glasses, wearable sensors, biosignal devices, AI-driven classification systems—all promise objective insight into attention, emotion, cognition, or behavior. They measure gaze direction, skin conductance, heart rate variability, or statistical patterns in massive datasets. Their outputs appear precise, quantified, and machine-neutral.
But as with fMRI, these systems do not measure the phenomena themselves. They measure proxies. Gaze is not attention. Skin conductance is not emotion. A model’s output is not understanding.
What makes these technologies particularly risky is their youth. Their error profiles are poorly mapped. There are few long-term meta-analyses, limited replication, and weak standardization of statistical corrections. Yet their outputs are increasingly used to justify conclusions and decisions—in research, industry, and policy.
The difference from fMRI is not conceptual, but temporal. With fMRI, we now know where it fails. With many modern tools, we simply do not know yet.
Artificial Intelligence as an Error Multiplier
Artificial intelligence amplifies these dynamics. AI systems excel at detecting patterns—including spurious ones. Overfitting, bias, and illusory correlations are not fringe issues; they are structural properties of data-driven models.
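A toy example makes the point. The dataset below contains no signal whatsoever, yet a standard model appears to “learn” it perfectly; the sample sizes, feature count, and model choice are arbitrary, picked only to illustrate overfitting:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# 100 samples, 5,000 random features, random binary labels:
# by construction, there is nothing real to learn here.
X = rng.normal(size=(100, 5000))
y = rng.integers(0, 2, size=100)

X_train, X_test = X[:70], X[70:]
y_train, y_test = y[:70], y[70:]

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print(f"Training accuracy: {model.score(X_train, y_train):.2f}")  # ~1.00
print(f"Test accuracy:     {model.score(X_test, y_test):.2f}")    # ~0.50 (chance)
```

With far more features than samples, the model memorizes noise and reports near-perfect fit, while held-out data exposes the result as chance. A confident number is not the same as a real finding.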
The key problem is not that AI makes mistakes, but that it lends speed and authority to them. Statistical artifacts that once required painstaking analysis now appear as clean model outputs. The machine delivers a result—so it must be correct. This is the same epistemic trap that surrounded fMRI for decades.
If decades were required to uncover fundamental problems in fMRI, it is naive to assume AI systems will self-correct more rapidly. It is far more likely that their errors will only become visible after they are deeply embedded in decision-making processes.
Why We Trust Machines More Than Humans
At the heart of this issue lies an epistemological mistake. Machines are seen as objective; humans as biased. In reality, machines faithfully execute our assumptions, simplifications, and blind spots.
Technology does not replace theory. It does not replace understanding. It does not replace critique. It does not eliminate human error—it scales it.
The dead salmon is therefore not a joke, but a warning. Measurement without reflection is meaningless. Trust in technology without methodological skepticism is dangerous.
Conclusion: Skepticism Is Not Technophobia
The lesson here is not to reject fMRI, AI, or modern sensing technologies. They are powerful tools. But tools remain tools. They generate signals, not truths.
If even a highly regulated, decades-old technology like fMRI can produce systematic false positives, caution is not optional—it is mandatory. Especially for newer technologies whose errors we have not yet learned to detect.
The dead salmon does not think. But it taught us something essential: not every colorful visualization is knowledge, and not every machine knows what it is measuring.
