The Positive Result Bias

This is a pretty well-known non-secret among just about anyone who does academic research, but Arnold Kling provides some confirmation that there seems to be a tremendous bias towards positive results. In short, most of these published positive results can't be replicated.

A former researcher at Amgen Inc has found that many basic studies on cancer -- a high proportion of them from university labs -- are unreliable, with grim consequences for producing new medicines in the future.

During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 "landmark" publications -- papers in top journals, from reputable labs -- for his team to reproduce. Begley sought to double-check the findings before trying to build on them for drug development.

Result: 47 of the 53 could not be replicated. He described his findings in a commentary piece published on Wednesday in the journal Nature.

"It was shocking," said Begley, now senior vice president of privately held biotechnology company TetraLogic, which develops cancer drugs. "These are the studies the pharmaceutical industry relies on to identify new targets for drug development. But if you're going to place a $1 million or $2 million or $5 million bet on an observation, you need to be sure it's true. As we tried to reproduce these papers we became convinced you can't take anything at face value."...

Part way through his project to reproduce promising studies, Begley met for breakfast at a cancer conference with the lead scientist of one of the problematic studies.

"We went through the paper line by line, figure by figure," said Begley. "I explained that we re-did their experiment 50 times and never got their result. He said they'd done it six times and got this result once, but put it in the paper because it made the best story. It's very disillusioning."

This is not really wildly surprising. Consider 20 causal relationships that don't exist. Now consider 20 experiments, one testing for each relationship. On average, 1 in 20 will show a false positive at the 95% confidence level -- that's what 95% confidence means. All those 1 in 20 false positives get published, and the other studies get forgotten.
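The 1-in-20 argument can be checked with a small simulation. This is only a sketch under stated assumptions: each "experiment" compares two groups drawn from the same normal distribution (so no real effect exists) and uses a simple z-test with known variance. The function names, sample size, and number of trials are all hypothetical choices for illustration.

```python
import math
import random


def null_experiment_p_value(rng, n=20):
    """Two-sided z-test p-value for two groups drawn from the SAME N(0, 1).

    Any 'significant' difference found here is a false positive by construction.
    """
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    se = math.sqrt(2.0 / n)  # standard error of the difference, unit variance assumed
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability


def false_positive_rate(trials=5000, alpha=0.05, seed=0):
    """Fraction of no-effect experiments that still come out 'significant'."""
    rng = random.Random(seed)
    hits = sum(null_experiment_p_value(rng) < alpha for _ in range(trials))
    return hits / trials


def chance_of_at_least_one(k=20, alpha=0.05):
    """Probability that at least one of k independent null experiments is 'significant'."""
    return 1 - (1 - alpha) ** k


if __name__ == "__main__":
    print("simulated false positive rate:", false_positive_rate())
    print("chance of >= 1 false positive in 20 tries:", chance_of_at_least_one())
```

The simulated rate should land near 0.05, and the chance that at least one of 20 no-effect experiments looks significant works out to about 64%. If only the "hits" get written up, the published record is made of noise.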

To some extent, this should be fixable now that we are not tied to page-limited journals. Simply requiring as a grant condition that all findings be published online, positive or negative, would be a good start.



  1. W. C. Taqiyya:

    So, when 'they' say butter, coffee, bacon, milk and umpteen other things are either 'bad' or 'good' for us, we should just drink more red wine instead? Good, is what I do anyway.

  2. aretae:

    Well said.

    The Bayesian line would be even better here as an example.

    If there is a real 1 in 500 chance of finding a result, then 1000 studies will get about 52 positives.
    Roughly 2 will find real results, and roughly 50 will find fake results.
    On retesting the 52, on average 2 will find replicated true results, 2.5 will find replicated false results, and 47.5 will find unreplicated false results.

    On review of these arbitrary numbers, it appears that the real chance of finding a useful "landmark" cancer treatment is about 1 in 500... and that therefore the p-value necessary to publish a study should be p = 0.001, not p = 0.05.
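The arithmetic in the comment above can be reproduced in a few lines. This is a rough sketch, not a model of Begley's data: it assumes every real effect is always detected (power of 1.0), which is what the comment's round numbers imply, and the function names are made up for illustration.

```python
def screen(n_studies, prior, alpha, power=1.0):
    """Expected true and false positives from one round of studies.

    prior : fraction of tested hypotheses that are actually true
    alpha : significance threshold (false positive rate per null study)
    power : chance a real effect is detected (assumed 1.0 here)
    """
    real = n_studies * prior
    true_pos = real * power
    false_pos = (n_studies - real) * alpha
    return true_pos, false_pos


def share_real(true_pos, false_pos):
    """Fraction of positive results that reflect a real effect."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # First pass: 1000 studies, 1-in-500 prior, p < 0.05.
    tp, fp = screen(1000, 1 / 500, 0.05)
    print(f"first pass: {tp:.1f} real, {fp:.1f} false, {share_real(tp, fp):.1%} real")

    # Retest every positive once at the same threshold.
    print(f"retest: {tp:.1f} real replicate, {fp * 0.05:.1f} false replicate, "
          f"{fp * 0.95:.1f} false fail to replicate")

    # Same prior with the stricter threshold suggested in the comment.
    tp2, fp2 = screen(1000, 1 / 500, 0.001)
    print(f"at p < 0.001: {share_real(tp2, fp2):.1%} of positives are real")
```

Under these assumptions fewer than 4% of first-pass positives are real at p < 0.05, while at p < 0.001 about two thirds are.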

  3. Andrew_M_Garland:

    "He said they'd done it six times and got this result once, but put it in the paper because it made the best story."

    Did the lead scientist publish the 5 failures? Did he attempt to reproduce his own result? If not, he should be charged with scientific fraud, prosecuted, and fired. He reported an error that he could not replicate.

    Was he running a gossip column or a scientific experiment, with serious government and private funding?

    Keep this in mind when government appointees claim that the government is central to advancing our society through scientific research. It was a private sector company which discovered the truth.

  4. sean2829:

    Remember when the Journal of Irreproducible Results was a tongue-in-cheek look at science? I guess now it's a mainstream publication... just the name has been changed.

  5. a leap at the wheel:

    :( It is not just the reporters.

  6. ElamBend:

    See there you go not believing in science! again.

  7. Chris:

    It would be impractical to "force" scientists and organizations to publish the negative results. Instead, there should be a simple process in which scientists must register their proposed studies prior to conducting them, and only studies that are pre-registered in this way are allowed to be published. Then, when positive results are published, there will be a record of all the similar studies that were not published.

    It is really a scientific necessity that this kind of process be put in place - preferably as a self-regulating effort by the scientific disciplines, but if not, then by law.

  8. Iain Clarke:

    My wife is a clinical psychologist, and also a researcher. Not just a pretty face by any means.

    I showed the article to her, and said "ouch, that's pretty bad" at the one in six quote above. She then told me about the Cochrane Library.

    To be considered a serious researcher, you should register your study's aim before starting, so it's harder to time travel and conveniently find what you were looking for... with hindsight. And so on.

    So, what you're suggesting in your last paragraph has already started.


  9. EscapedWestOfTheBigMuddy:

    Particle physics had a bad case of this in the 1970s.

    So many bins on so many plots from so many experiments: literally scores of "particles" were "discovered" (at 2-3 sigma significance) that later turned out to be just noise. In response, the community adopted a convention of requiring five sigma to use the word "discovery".

    Four sigma is usually considered enough to say "evidence for" and three sigma sufficient for "indications of".
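To put the sigma thresholds in the comment above on the same footing as p-values, here is a small sketch. It assumes a one-sided Gaussian tail, which is the usual convention when quoting sigma levels, and the 1000-bin figure is a made-up number chosen only to illustrate the "so many bins" point.

```python
import math


def sigma_to_p(n_sigma):
    """One-sided Gaussian tail probability for a fluctuation of n_sigma or more."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


def chance_of_fluke(n_bins, n_sigma):
    """Chance that at least one of n_bins independent bins fluctuates past n_sigma."""
    return 1 - (1 - sigma_to_p(n_sigma)) ** n_bins


if __name__ == "__main__":
    for s in (2, 3, 4, 5):
        print(f"{s} sigma -> p = {sigma_to_p(s):.3g}")

    # Hypothetical: 1000 independent bins examined across many plots.
    for s in (3, 5):
        print(f"chance of a {s}-sigma fluke somewhere in 1000 bins: "
              f"{chance_of_fluke(1000, s):.3g}")
```

Three sigma corresponds to a p-value of roughly 0.00135 and five sigma to roughly 3 in 10 million, so a three-sigma bump is likely to show up somewhere by chance once enough bins are examined, while a five-sigma one is not.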

  10. tomw:

    So, who is going to point the finger at the "Second Hand Smoke" studies that have 1/10th the coefficient necessary to be considered 'science'? A causal relationship was not found, but the studies are still cited...
    No, I don't sell or use tobacco {any more..}, and do not recommend it to anyone, but I DO believe in the right of innkeepers and restaurateurs to make rational choices in operating THEIR businesses.
    Big Brother and Sister Smoke Nazi, go home.


  11. IGotBupkis, Legally Defined Cyberbully in 57 States:

    Another key point, Warren, is that the testing needs to be double-blind. This is tricky and anything but cheap, but it's the only way to avoid the various biases when studying soft sciences.

    In this general subject arena, I'd call attention to a much deeper treatment of the wider concept by your "canine blogging compatriot":

    Wolf Howling: The Scientific Method & Its Limits - The Decline Effect

  12. Gaunilo:

    Anyone who has done real research by the scientific method knows what a royal pain in the backside it is to follow. The only reason that we would put up with such a limitation is that it is the only way we have devised yet to prevent us from bullshitting ourselves blind.

    I think the tendency to find relationships that don't exist is an artifact of our background as hunters. Hunting (in the real sense, not sitting in a blind over a feeder) requires looking for nebulous patterns, most of which don't mean anything. But if only a small percentage pay off, it makes a real difference in the probability of having something to eat tonight. So we have to build systems to protect ourselves from neural defects in our brain.

  13. MJ:

    I think this is called "confirmation bias".