Yesterday’s ‘troubles with fMRI’ article has caused lots of debate, so I thought I’d post the original answers, from which I quoted, that were given to me by neuroimagers Russ Poldrack and Tal Yarkoni.
Poldrack and Yarkoni have been at the forefront of finding, fixing and fine-tuning fMRI and its difficulties. I asked them about current challenges but could only include small quotes in The Observer article. Their full answers, included below with their permission, are important and revealing, so well worth checking out.
First, however, a quick note about the reactions the piece has received from the neuroimaging community. They tend to be split into “well said” and “why are you saying fMRI is flawed?”
Because of this, it’s worth saying that I don’t think fMRI or other imaging methods are flawed in themselves. However, we have discovered that a significant proportion of past research was based on potentially misleading methods.
Although these methods have largely been abandoned, some important uncertainties remain about how we should interpret neuroimaging data.
Because of these issues, and, genuinely, because brain scans are often enchantingly beautiful, I think neuroimaging results are currently given too much weight as we try to understand the brain, but that doesn’t mean we should undervalue neuroimaging as a science.
Although our confidence in some past studies has been shaken, neuroimaging will clearly come out better and stronger as a result of the current debates about problems with analysis and interpretation.
At the moment, the science is at a fascinating point of transition, so it’s a great time to be interested in cognitive neuroscience, and I think Russ and Tal’s answers below make this crystal clear.
Russ Poldrack from the University of Texas at Austin
What’s the most pressing problem fMRI research needs to address at the moment?
I think the biggest fundamental problem is the great flexibility of analytic methods that one can bring to bear on any particular dataset; the ironic thing is that this is also one of fMRI’s greatest strengths, i.e., that it allows us to ask so many different questions in many different ways. The problem comes about when researchers search across many different analysis approaches for a result, without realizing that this increases the ultimate likelihood of finding a false positive. I think that another problem that interacts with this is the prevalence of relatively underpowered studies, which are often analyzed using methods that are not stringent enough to control the level of false positives. The flexibility that I mentioned above also includes methods that are known by experts to be invalid, but unfortunately these still get into top journals, which only helps perpetuate them further.
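To give a rough feel for the analytic-flexibility problem Russ describes, here’s a small toy simulation in Python (my own sketch, not part of his answer; the pipeline count and numbers are invented). It analyses the same null data under several hypothetical ‘pipelines’ and counts how often at least one of them looks significant, which happens far more often than the nominal 5% of the time.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies = 2000    # simulated 'studies', none with any true effect
n_subjects = 20     # per-study sample size
n_pipelines = 10    # alternative analysis choices (smoothing, ROI, model, ...)

false_positive_studies = 0
for _ in range(n_studies):
    # Each pipeline gives a slightly different summary of the same null data;
    # mimic that with noise that is strongly correlated across pipelines.
    shared = rng.normal(0, 1, n_subjects)
    for _ in range(n_pipelines):
        data = 0.7 * shared + 0.3 * rng.normal(0, 1, n_subjects)
        t, p = stats.ttest_1samp(data, 0.0)
        if p < 0.05:
            false_positive_studies += 1
            break  # the researcher stops once something 'works'

print("Nominal false positive rate per test: 0.05")
print(f"Studies reporting at least one 'significant' result: "
      f"{false_positive_studies / n_studies:.2f}")

The exact figure depends on how correlated the pipelines are, but the qualitative point stands: the more analyses you try, the more likely something crosses the threshold by chance alone.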
Someone online asked the question “How Much of the Neuroimaging Literature Should We Discard?” How do you think we should consider past fMRI studies that used problematic methodology?
I think that replication is the ultimate answer. For example, the methods that we used in our 1999 Neuroimage paper that examined semantic versus phonological processing seem pretty abominable by today’s standards, but the general finding of that paper has been replicated many times since then. There are many other findings from the early days that have stood the test of time, while others have failed to replicate. So I would say that if a published study used problematic methods, then one really wants to see some kind of replication before buying the result.
Tal Yarkoni from the University of Colorado at Boulder
What’s the most pressing problem fMRI research needs to address at the moment?
My own feeling (which I’m sure many people would disagree with) is that the biggest problem isn’t methodological laxness so much as skewed incentives. As in most areas of science, researchers have a big incentive to come up with exciting new findings that make a splash. What’s particularly problematic about fMRI research–as opposed to, say, cognitive psychology–is the amount of flexibility researchers have when performing their analyses. There simply isn’t any single standard way of analyzing fMRI data (and it’s not clear that there should be); as a result, it’s virtually impossible to assess the plausibility of many if not most fMRI findings simply because you have no idea how many things the researchers tried before they got something to work.
The other very serious and closely related problem is what I’ve talked about in my critique of Friston’s paper [on methods in fMRI analysis] as well as other papers (e.g., I wrote a commentary on the Vul et al “voodoo correlations” paper to the same effect): in the real world, most effects are weak and diffuse. In other words, we expect complicated psychological states or processes–e.g., decoding speech, experiencing love, or maintaining multiple pieces of information in mind–to depend on neural circuitry widely distributed throughout the brain, most of which is probably going to play a relatively minor role. The problem is that when we conduct fMRI studies with small samples at very stringent statistical thresholds, we’re strongly biased to detect only a small fraction of the ‘true’ effects, and because of the bias, the effects we do detect will seem much stronger than they actually are in the real world. The result is that fMRI studies will paradoxically tend to produce *less* interesting results as the sample size gets bigger. Which means your odds of getting a paper into a journal like Science or Nature are, in many cases, much higher if you only collect data from 20 subjects than if you collect data from 200.
The net result is that we have hundreds of very small studies in the literature that report very exciting results but are unlikely to ever be directly replicated, because researchers don’t have much of an incentive to collect the large samples needed to get a really good picture of what’s going on.
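As a rough, made-up illustration of the effect Tal describes (my sketch, not his; the effect size, threshold, and region count are all invented), the simulation below plants the same weak true effect in every region. A small sample with a stringent threshold detects only a tiny fraction of those effects, and the ones it does detect look several times larger than they really are; a larger sample detects most of them at close to their true size.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_regions = 2000    # think of these as voxels or regions
true_d = 0.3        # a weak, diffuse true effect present everywhere
alpha = 0.001       # a 'stringent' statistical threshold

def run_study(n_subjects):
    detected = []
    for _ in range(n_regions):
        data = rng.normal(true_d, 1.0, n_subjects)
        t, p = stats.ttest_1samp(data, 0.0)
        if p < alpha and t > 0:
            # record the observed effect size (Cohen's d) for detected regions
            detected.append(data.mean() / data.std(ddof=1))
    frac = len(detected) / n_regions
    mean_d = float(np.mean(detected)) if detected else float("nan")
    return frac, mean_d

for n in (20, 200):
    frac, mean_d = run_study(n)
    print(f"n={n:3d}: detected {frac:.1%} of true effects, "
          f"mean observed d among detections = {mean_d:.2f} (true d = {true_d})")

The effects detected with the small sample are real, but their apparent size is largely a product of the selection imposed by the threshold.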
Someone online asked the question “How Much of the Neuroimaging Literature Should We Discard?” How do you think we should consider past fMRI studies that used problematic methodology?
This is a very difficult question to answer in a paragraph or two. I guess my most general feeling is that our default attitude to any new and interesting fMRI finding should be skepticism–instead of accepting findings at face value until we discover a good reason to discount them, we should incline toward disbelief until a finding has been replicated and extended. Personally I’d say I don’t really believe about 95% of what gets published. That’s not to say I think 95% of the literature is flat-out wrong; I think there’s probably a kernel of truth to most findings that get published. But the real problem in my view is a disconnect between what we should really conclude from any given finding and what researchers take license to say in their papers. To take just one example, I think claims of “selective” activation are almost without exception completely baseless (because very few studies really have the statistical power to confidently claim that absence of evidence is evidence of absence).
For example, suppose someone publishes a paper reporting that romantic love selectively activates region X, and that activation in that region explains a very large proportion of the variance in some behavior (this kind of thing happens all the time). My view is that the appropriate response is to say, “well, look, there probably is a real effect in region X, but if you had had a much larger sample, you would realize that the effect in region X is much smaller than you think it is, and moreover, there are literally dozens of other regions that show similarly-sized effects.” The argument is basically that much of the novelty of fMRI findings stems directly from the fact that most studies are grossly underpowered. So really I think the root problem is not that researchers aren’t careful to guard against methodological problems X, Y, and Z when doing their analyses; it’s that our mental model of what most fMRI studies can tell us is fundamentally wrong in most cases. A statistical map of brain activity is *not* in any sense an accurate window into how the brain supports cognition; it’s more like a funhouse mirror that heavily distorts the true image, and to understand the underlying reality, you also have to take into account the distortion introduced by the measurement. The latter part is where I think we have a systemic problem in fMRI research.
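Here’s a hypothetical numerical version of the ‘region X’ scenario (again my own sketch, with invented parameters): every one of 50 regions carries the same modest true brain-behaviour correlation, yet in a small study the best-looking region appears to stand alone with a strikingly large correlation, while a large study shows that many regions have similar, modest effects.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_regions = 50      # candidate regions, all with the same true effect
true_r = 0.25       # a modest true brain-behaviour correlation everywhere

def run_study(n_subjects):
    behaviour = rng.normal(0, 1, n_subjects)
    observed_r = []
    for _ in range(n_regions):
        # Each region's activation has the same modest relationship to behaviour
        activation = (true_r * behaviour
                      + np.sqrt(1 - true_r ** 2) * rng.normal(0, 1, n_subjects))
        observed_r.append(stats.pearsonr(activation, behaviour)[0])
    return np.array(observed_r)

for n in (20, 500):
    rs = run_study(n)
    near_best = int((rs > rs.max() - 0.05).sum())
    print(f"n={n:3d}: best region's observed r = {rs.max():.2f}, "
          f"regions within 0.05 of the best = {near_best} "
          f"(true r = {true_r} in every region)")

In the small study the winning region looks ‘selective’ and impressively strong purely because sampling noise is doing most of the work; with a large sample the distortion largely disappears.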