It Must Be True, There Was a Study!

You take your first morning sip of coffee and point your browser at your favorite news site. You’re immediately drawn to two interesting stories. A study by medical researchers has found that taking a particular nutritional supplement can increase your life span. A second article boldly proclaims that people with a particular political ideology have a higher IQ. Both findings, if true, would have significant implications. Should you believe the studies? Both were published in prestigious, peer-reviewed journals. Both involved researchers from well-regarded universities. Heck, in the supplement study, the authors were from Harvard! You conclude that the research is accurate. You make a note to order the supplement and plan to confront that jerk in your office from the “dumb” side of the political spectrum.

Many of the posts in this blog are based on a single concept: professionals would be more effective if they applied a rational, scientific approach to the practice of their disciplines. In numerous posts, I’ve shown how scientific research has demonstrated that people have certain unconscious biases, tendencies and behaviors. I often point to the research as a demonstration of truth, validating my points. While science is a fabulous tool for establishing knowledge, peer-reviewed studies, even in prestigious journals, are not without potentially serious flaws. Understanding the limitations of peer-reviewed studies can provide a healthy sense of skepticism, allowing for a more balanced application of scientific findings.

Before reflecting on the limitations of studies or experiments, let’s take a quick look at the evolution of scientific research. Throughout most of recorded history, the acquisition of knowledge about the natural world was driven by superstition, folklore and mysticism. There was little concern for causal relationships or appreciation for the importance of underlying physical laws. Ritual dances or prayer could end droughts. Ground animal parts could enhance fertility. Bloodletting could cure numerous diseases.

It took thousands of years of civilization for more scientific thinking to become commonplace. While numerous great thinkers contributed to this process, I’d like to highlight the work of two notable men. First, Francis Bacon, a Renaissance-era British philosopher, is generally credited with the establishment of the scientific method. Bacon recognized that people had inherent tendencies that led to the inaccurate acquisition of knowledge. He described four “Idols of the Mind” which blocked the path of sound scientific reasoning. These “Idols” are similar to the cognitive biases described by modern scientists. Bacon’s remedy was to create a methodology in which theories are tested through the collection of evidence and the objective analysis of experimentation. To this day, his idea remains a cornerstone of the scientific way of thinking.

A second influential contributor to the methods of scientific research was James Lind, an 18th century Scottish physician. At that time, a major health issue was the mortality rate of sailors during long journeys. One particular problem was the high rate of death from the disease scurvy. Lind’s grand idea was to divide a group of sailors into smaller groups, giving each a different experimental supplement. In the course of the experiment, only the group that received oranges and lemons remained free of scurvy. Although unknown at the time, scurvy is the result of a vitamin C deficiency. Lind’s experiment was a precursor to the idea of randomized control groups, a staple of modern-day clinical trials.

Today the ideas of Bacon and Lind, along with the contributions of numerous others, are incorporated into a rigorous process which starts with a hypothesis and ends with the publication of findings in a peer-reviewed journal. A small percentage of these findings make their way into the public consciousness by making headlines in mainstream news sources. Along the way, a sophisticated set of techniques is used to ensure that the findings are accurate. These techniques include randomized control groups, double blinding and sophisticated statistical analysis. A peer review process is in place to limit publication to the most valid and profound studies. The modern scientific research process is a huge step forward from the superstitious and pseudo-scientific practices of the past. Unfortunately though, it has led to a cultural mindset where published studies are taken as gospel. There is reason, however, to critically question many findings.

To fully appreciate the limitations of scientific studies, one needs to start with an understanding of the research publication process. Academic journals tend to favor novel and profound research. They are more inclined to publish a study showing a life-extending diet regimen than a study invalidating a previously interesting finding. The academics performing the research are under enormous pressure to achieve publication. This pressure is driven in part by the tenure process, which demands that an academic show a significant amount of published research. This strongly motivates academics to deliver “interesting” results as part of their research.

Let’s look at a few of the ways in which these pressures result in false, inaccurate or unsound findings. While rare, fraud can be a factor in academic research. That was the situation involving a prominent psychology researcher, Diederik Stapel. Stapel falsified findings in several dozen published papers, many accepted in noted journals and reported through media outlets. One of his fabricated studies claimed that white people were more likely to discriminate against black people in a messy rather than an organized environment. Many of his studies had provocative findings, the type that both research journals and media outlets prefer.

Another way in which research can be tainted is through conflicts of interest. Although not well known by the general public, there is a large body of medical research that is conducted in conjunction with pharmaceutical firms. In a practice known as ghostwriting, the pharmaceutical companies hire firms to produce articles that are submitted to academic journals. The papers are frequently review articles which summarize existing research and offer an opinion about the use of a drug to treat a particular condition. The ghostwriting firms recruit prominent medical researchers from top-name institutions. The findings are then associated with these prestigious individuals and institutions, creating an air of credibility and independence. Finally, the research is read by doctors and patients, who use it to make decisions about prescribing or taking the drug.

It is this very practice of ghostwriting that appeared to play a part in a disastrous turn of events for Wyeth Labs. Wyeth had promoted hormone replacement therapy for menopausal women as a way to protect against aging skin, heart disease and dementia. They sponsored numerous ghostwritten articles that appeared in a variety of prestigious medical journals. The articles emphasized the positive values of hormone replacement therapy while downplaying the risks. The house of cards collapsed when a large federal study showed an increased risk of breast cancer, heart disease and stroke. Pfizer, who acquired Wyeth, ultimately agreed to settle a lawsuit filed by women who had taken the hormone replacement therapy and developed breast cancer.

Even when fraud and conflicts of interest are not a factor, many well-intentioned researchers can consciously or subconsciously produce inaccurate results. One simple way that this occurs is by limiting or extending the amount of data collected in order to achieve a profound result. In a survey of over 2,000 research psychologists, more than 50% admitted that they had not finalized their sample size until after checking whether initial results were significant. This common practice allows researchers to continue performing iterations of an experiment until a desired result is achieved.
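To see why “peeking” at the data before fixing the sample size is a problem, consider a minimal simulation (the design here is my own illustration, not from any of the studies mentioned): we test a supplement with no real effect, checking for significance after every batch of ten subjects and stopping as soon as the test looks significant. Even though each individual test uses the conventional 5% threshold, the stop-when-significant strategy produces far more than 5% false positives.

```python
import math
import random

def z_p_value(xs):
    # Two-sided p-value for "mean differs from 0", assuming known sd = 1.
    n = len(xs)
    z = abs(sum(xs) / n) * math.sqrt(n)
    return math.erfc(z / math.sqrt(2))

def run_experiment(peek, max_n=100, step=10):
    # Simulate a treatment with NO true effect (samples drawn around 0).
    xs = []
    for _ in range(max_n // step):
        xs.extend(random.gauss(0.0, 1.0) for _ in range(step))
        if peek and z_p_value(xs) < 0.05:
            return True  # "significant" -- stop collecting and publish
    return z_p_value(xs) < 0.05  # honest test at the predetermined n

random.seed(1)
trials = 2000
fixed = sum(run_experiment(peek=False) for _ in range(trials)) / trials
peeking = sum(run_experiment(peek=True) for _ in range(trials)) / trials
print(f"false-positive rate, fixed sample size: {fixed:.1%}")   # near 5%
print(f"false-positive rate, peeking:           {peeking:.1%}")  # well above 5%
```

With a predetermined sample size the false-positive rate stays near the advertised 5%; with repeated peeking it roughly triples, even though no single step involves anything that feels like cheating.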

An appropriate countermeasure for the sample size issue is for researchers to predetermine their sample sizes. This removes that source of researcher bias from the experiment. It doesn’t, however, eliminate another pernicious problem affecting research accuracy. A recent review of 513 papers in prominent neuroscience journals found a pervasive issue with statistical methods. Specifically, the reviewers found that researchers improperly used statistical techniques to determine whether they had achieved a significant finding. The reviewers found the error in nearly half the studies where the error was possible (further explanation here). Other studies have shown simple mathematical errors in published research.
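The error at issue in that neuroscience review was comparing significances instead of testing the difference directly: “group A was significant and group B wasn’t, therefore the groups differ.” A small numerical sketch (the group sizes and means below are invented for illustration) shows why that reasoning fails:

```python
import math

def two_sided_p(z):
    # Two-sided p-value for a standard normal test statistic.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical summary data: two groups measured the same way,
# known sd = 1, n = 20 per group.
n = 20
mean_a, mean_b = 0.50, 0.35

p_a = two_sided_p(mean_a * math.sqrt(n))  # about 0.03 -> "significant"
p_b = two_sided_p(mean_b * math.sqrt(n))  # about 0.12 -> "not significant"

# The correct question: is the DIFFERENCE between groups significant?
se_diff = math.sqrt(1 / n + 1 / n)
p_diff = two_sided_p((mean_a - mean_b) / se_diff)  # about 0.6 -> no evidence of a difference
print(p_a, p_b, p_diff)
```

One group clears the 5% bar and the other doesn’t, yet a direct test of the difference between them comes nowhere near significance. Concluding “the groups differ” from the pair of p-values is exactly the mistake the reviewers kept finding.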

Another issue that can limit the validity of scientific studies is a phenomenon known as the file-drawer effect. Many studies never make it into academic journals because they are rejected by the publisher or because the researcher chooses not to submit the findings. Typically this occurs because either the publisher or the researcher determined that the results were not novel or interesting. Let’s consider this issue through a hypothetical research topic: the effectiveness of a particular diet for weight loss. The two studies that demonstrate its effectiveness are likely to be published due to their interesting results. The five studies that show limited effectiveness might be rejected for publication or simply never submitted. Taken together, the seven studies would lead one toward rejecting the value of the diet. However, the two published studies, and the subsequent news stories, will lead the public to believe that the findings are scientifically strong.
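The file-drawer effect is easy to demonstrate in miniature. In this sketch (again my own toy setup, not from any cited study), we run many small studies of a diet that truly does nothing, and imagine that journals only print the ones that happened to reach significance. The published record then shows sizable “effects” even though the full set of studies averages out to nothing:

```python
import math
import random

random.seed(7)

def study(n=30, true_effect=0.0):
    # One small study of a diet with NO real effect (true_effect = 0).
    xs = [random.gauss(true_effect, 1.0) for _ in range(n)]
    mean = sum(xs) / n
    z = mean * math.sqrt(n)               # known sd = 1
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return mean, p

results = [study() for _ in range(500)]
published = [(m, p) for m, p in results if p < 0.05]  # only "hits" get printed

avg_abs_all = sum(abs(m) for m, _ in results) / len(results)
avg_abs_pub = sum(abs(m) for m, _ in published) / len(published)

print(f"studies run: {len(results)}, published: {len(published)}")
print(f"average |effect|, all studies:       {avg_abs_all:.2f}")
print(f"average |effect|, published studies: {avg_abs_pub:.2f}")
```

Roughly 5% of the null studies come out “significant” by chance, and those are precisely the ones with the most extreme effect sizes. A reader who sees only the published studies sees a diet that appears to work.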

So, where does this leave us? As rational professionals, we want to leverage scientific findings to increase our personal knowledge and improve our decision-making skills. We want to avoid returning to our primitive instincts, resorting to superstition or pseudoscience. But the findings highlighted in this blog post leave us wondering: what can I take as truth? There is no easy answer to this question. The current scientific research process is imperfect and will likely remain so, even with additional process improvements.

There are, however, a number of ways that one can evaluate research to determine its “strength”. One simple way to look at research is through its novelty. Those studies presenting groundbreaking findings (exactly the studies that grab the headlines) should be viewed most skeptically. First, truly novel results are rare. Second, a new finding typically reflects a limited body of research. The most solid findings are those that have been repeatedly replicated: demonstrated through numerous studies, by different researchers, across time. Additionally, the strongest findings have the fewest counter-findings. Meta-studies that analyze the results of previous studies are considered a more reliable measure of accurate findings.

Studies in “softer sciences” such as psychology, economics and politics should be viewed more skeptically. Despite decades of research, these areas are notoriously complex and have few consistent laws. In physics, one can conduct a straightforward experiment to see how fast an object falls. Determining the impact of government spending on the economy, or the relationship between political ideology and intelligence, is a far more nebulous problem.

Consider whether the research appears to be agenda driven. What is the research showing? Is it advocating the use of a particular drug or product? Is it making a point that would support a particular political issue? While it is not always possible to identify, are there potential conflicts of interest in the authorship or sponsorship of the research?

Finally, be careful when reading accounts of research in mainstream media outlets. Many articles are an attempt to “grab eyeballs”, getting viewers to click on a provocative-sounding headline. Frequently, upon a detailed reading of the article (or the underlying research) it becomes clear that the headline was a gross overstatement of the findings. Anything unequivocally proclaiming a shocking new finding should be taken with the proverbial grain of salt.

This entry was posted in Probability/Statistics, Psychology.