By: Oviya Gowder
Experimental design and execution play a crucial role in research. When researchers plan an experiment to test a hypothesis, they must fulfill three design requirements: control, randomization, and replication. Replication, in the context of research, is the ability to reproduce the original findings using a different sample. This essential third pillar bolsters the confidence society has in published results. It signals to scientists which findings are worth building on, preventing time and resources from being wasted on faulty conclusions.
In 2005, a paper called “Why Most Published Research Findings Are False” ignited discussion about how reliable published experiments actually are. In 2010, another paper claimed to have discovered that people possess precognition, the ability to predict future events. It was later found, however, that unorthodox experimental and statistical methods had been used to shape the data so that it supported the hypothesis. This launched what is now called the replication crisis, a decade-long period in which scientists repeatedly found that they could not reproduce results previously published by other researchers. The pattern appeared most prominently in psychology. For instance, a 2015 effort to reproduce 100 psychology studies replicated only 39 of them, and a 2018 effort replicated just 14 of the 28 studies it examined. With numbers this striking, it is clear that many published papers contained critical methodological flaws or relied on statistical maneuvers to make their data look more attractive to the scientific community. In the past few years, however, the storm around the crisis has begun to die down as researchers find that more papers are replicable.
There are many reasons why published findings during the replication crisis were doomed to fail from the start. In a 2008 psychology experiment, Lawrence Williams and John Bargh reported that people who were physically experiencing warmth tended to judge and treat others more warmly. For example, they found that holding a hot cup of coffee led participants to rate someone else’s personality more positively. In 2018, another team of researchers attempted to replicate this methodology and failed to reproduce the original findings. Williams and Bargh’s experiment is the epitome of one cause of irreplicability: a small sample size. The original experiment used only 41 subjects. With so few participants, chance variation between groups can easily produce results that look statistically significant even when no real effect exists, which appears to be what happened here. The replication used a much larger sample of 128 people, yet found no significant results.
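To make the sample-size point concrete, here is a minimal, purely illustrative Python simulation; it is not drawn from the original studies, and the group sizes and thresholds are assumptions chosen only for demonstration. It draws two groups from the same distribution (so the true effect is zero) and shows how often small samples still produce a sizable apparent difference by chance.

```python
# Illustrative sketch: with NO true effect, small samples still produce
# large apparent group differences by chance far more often than large ones.
import random
import statistics

random.seed(1)

def apparent_effect(n_per_group):
    """Draw two groups from the SAME distribution and return the
    standardized difference between their means (a rough Cohen's d)."""
    a = [random.gauss(0, 1) for _ in range(n_per_group)]
    b = [random.gauss(0, 1) for _ in range(n_per_group)]
    pooled_sd = statistics.pstdev(a + b)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

for n in (20, 200):
    effects = [abs(apparent_effect(n)) for _ in range(2000)]
    big = sum(e > 0.5 for e in effects) / len(effects)
    print(f"n={n:3d} per group: {big:.0%} of runs show a medium-or-larger effect by chance")
```

With roughly 20 people per group, a noticeable “effect” appears by luck in about one run in ten; with 200 per group, it almost never does.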
This situation points to another cause of irreplicability. Researchers and journals can fall into the trap of publication bias, the tendency to favor intriguing, positive results over data that shows no significant effect. Williams and Bargh’s results met the threshold to be called “significant,” but only narrowly. Even so, the theory behind this particular “warmth” phenomenon was interesting and had potential, which likely helped their audience accept the results. Although the experiment’s marginal significance raised some skeptical eyebrows, it was still widely embraced by the community because the discovery seemed so novel and unique. This reveals how, in the effort to capture public interest, papers with seemingly unique results were continually prioritized over others.
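A short, hypothetical simulation can show why this selection matters. The sketch below assumes a true effect of zero and “publishes” only the studies that happen to cross a rough significance threshold in the expected direction; the sample size and cutoff are illustrative assumptions, not values from any real journal.

```python
# Illustrative sketch: if only "significant" results get published, the
# published literature overstates the true effect (here, a true effect of zero).
import random
import statistics

random.seed(2)

def run_study(true_effect=0.0, n=41):
    """Simulate one small two-group study; return the observed mean
    difference and whether it crosses a rough p < .05 threshold."""
    treat = [random.gauss(true_effect, 1) for _ in range(n)]
    control = [random.gauss(0.0, 1) for _ in range(n)]
    diff = statistics.mean(treat) - statistics.mean(control)
    se = (statistics.pvariance(treat) / n + statistics.pvariance(control) / n) ** 0.5
    return diff, abs(diff) > 1.96 * se

results = [run_study() for _ in range(5000)]
published = [d for d, significant in results if significant and d > 0]

print("true effect:                             0.00")
print(f"average effect across all studies:       {statistics.mean(d for d, _ in results):+.2f}")
print(f"average effect in 'published' studies:   {statistics.mean(published):+.2f}")
```

The full set of studies averages out to roughly zero, but the subset that clears the bar for publication reports a substantial effect, which is exactly the distortion publication bias creates.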
The replication crisis was a troubling time for the scientific community, but in recent years the pillar of replicability has begun to strengthen again. For one, journals have started to retract questionable papers. Ivan Oransky, a journalist, claims, “Journals now retract about 1,500 articles annually — a nearly 40-fold increase over 2000, and a dramatic change even if you account for the roughly doubling or tripling of papers published per year.” Journals have also made it their duty to help contain the crisis by evaluating papers on the merits of their methodology rather than on their final results. Not basing selection criteria on the results makes researchers more comfortable reporting data that shows no statistical significance; indeed, null results are just as important as significant ones. Andrew Gelman, a Columbia University researcher, states, “This crisis has influenced my own research practices, and I assume it’s influenced many others as well.” The replication crisis has pushed researchers to become more aware of their methodology and statistical techniques, and this acknowledgment of its gravity has improved the quality of research currently being done. Now, according to Nature Human Behaviour, “The results show that the original findings could be replicated 86% of the time — significantly better than the 50% success rate reported by some systematic replication efforts.”
As the gradual rise in replication rates shows, the scientific community has clearly improved since the replication crisis. All in all, the scientific community is one of society’s bedrocks; without it, we could not move forward or uncover the intricate details of the world we live in. Because research is the everlasting propeller of advancement, it is imperative that future generations strive to sustain its reliability. The crisis has left a few crevices in the growing wall of science, but these cracks shouldn’t go unnoticed. The scientific community must continue to acknowledge and repair them, ensuring they remain only a memory of our growth.
This article was edited by Grace Hur.