The Credibility Crisis in Science


Albert Einstein was chosen by Time magazine as the Person of the Twentieth Century.  It was a good choice (and now is a good time to read Einstein more than seventy years after his death in April 1955, here and here).  As noted by Thomas Plümper and Eric Neumayer in their newly published book, The Credibility Crisis in Science: Tweakers, Fraudsters, and the Manipulation of Empirical Results, Einstein finished ahead of the runners-up Franklin Delano Roosevelt, who saved capitalism from itself during the Great Depression, and Mohandas Gandhi, who showed the world that, despite the iron will of Winston Churchill, the sun was setting rapidly on the British Empire.  Plümper and Neumayer write “Those days are over (for science). And they may never come back.”  This is the theme of their book, and they might be right.

We have discussed the problems of modern science here often, going back to my first contribution here. The goal of The Credibility Crisis in Science is to show how science has gone wrong.  It is difficult to disagree with Plümper and Neumayer:

The credibility of science is not, as such, about scientific ideas, theories, or models that turn out to be false.  Rather, the crisis has been caused by scientists who deliberately publish overconfident, misleading, and often simply false empirical results based on research designs or model specifications they have intentionally specified to give the desired results.  We call this practice “tweaking.”  In extreme cases, published results rely on manipulated or outright fabricated data.  Whether tweaked, manipulated, or fabricated, the results often cannot be replicated – not even if replication analysts use identical research designs.

Tweaking is potentially more damaging to science in the long run than data manipulation and fabrication.  Any particular tweaked empirical result is likely to have a smaller effect on the fabric of science than cases of data fabrication and manipulation, but the cumulative effect can still be larger than the cumulative effect of data fabrication and manipulation because these strategies are rare, while tweaking is common.

While they discuss data fabrication and manipulation, which are subsets of the same thing, their view is that tweaking is the larger threat to the integrity of science.  Once again they are correct, but tweaking is “useful” in scientific questions of the social sciences, e.g., psychology, sociology, economics, whose experimental approaches are qualitatively and quantitatively different from the natural sciences.  My view is that fabrication, manipulation, and tweaking do equal damage to science, while affecting distinct sciences differently.  In the basic and clinical natural sciences, fabrication/manipulation will be found out, eventually, and the consequences for the perpetrators will be severe, even if this takes too long most of the time.

Plümper and Neumayer cover various explanations of the behavior of dishonest scientists.  In some cases the scientist has the need to be “first” in the pathological winner-take-all environment of “publish and still perish anyway.”  In other cases the scientist is (probably) simply inattentive to the actions of other laboratory or team members, although nothing is simple about such lassitude, as the case of the former president of Stanford, Marc Tessier-Lavigne, shows.  Can such behavior be predicted?  Plümper and Neumayer spend a lot of time on this.  The answer is, “usually not.”  But in the case of Sylvain Lesné (here and here), an early mentor is reported to have dismissed him because of perceived dishonesty in his laboratory in France.  This did not stop the young scientist from advancing in his career, however.

Just “making stuff up” as Sylvain Lesné did by manipulating images in his work on the amyloid hypothesis of Alzheimer’s disease (AD) probably did keep AD scientists on the wrong path for nearly twenty years.  The opportunity costs remain unknown.  Francesca Gino’s apparently fraudulent research (she disputes the findings of her former employer, Harvard University) did less damage but was more audacious, and embarrassing for her credulous collaborators.  Diederik Stapel was truly sui generis in his production of fictitious research.  Plus, he freely admits to his dishonesty while using several excuses to justify it.  Each of these two cases is covered at some length in The Credibility Crisis in Science.  It is difficult to treat them as much more than examples of academics gone around the bend, however. [1]

Tweaking is more difficult to detect.  One need only consider modern eugenics (not discussed in The Credibility Crisis in Science), which is essentially the same as that of the nineteenth century, except for its supposed sophisticated grounding in genetics.  The Genetic Lottery by Kathryn Paige Harden of the University of Texas has been discussed previously here in Prolegomena to an Understanding of the Replication Crisis in Science and later in Hayek’s Bastards and the Rise of Neoliberalism.  It is not difficult to argue for the lesser intelligence of “the Other” when tweaking the questions and choosing the proper data used to answer them.  Detection of these tweaks requires a strong background in genetics along with the knowledge that general intelligence (g) is not a unitary explanation of human intelligence, especially across different groups that differ primarily in skin pigmentation and/or socioeconomic status. [2]

Plümper and Neumayer cover the so-called “replication crisis of science” well, and in doing so they consider the work of John Ioannidis, who originated this trope with a paper in PLOS Medicine in 2005 entitled Why most published research findings are false:

In a widely read essay…the enfant terrible of methodology in medicine…argued that “most published research findings are false,” giving the result away in the title of his essay.  He offers no empirical evidence for this claim, which is entirely based on logical (and statistical) reasoning, and his calculations hold only under certain restrictive and unrealistic assumptions.  This is fortunate for science because Ioannidis’s argument does not at all rely on data fraud in any way shape or form.  He claims that the majority of research findings are false even if they are based on honest research. [3]  If his assertion is correct, and if many researchers have indeed additionally fabricated or manipulated data or tweaked their results, then the proportion of false published research is significantly higher than what Ioannidis claims.

…Apparently, Ioannidis has learned one or two lessons from YouTube clickbait titles: Make very bold, headline-grabbing claims, whether they stand up to scrutiny or not.  Ioannidis’s logic goes like this: Many empirical studies are under-powered, their sample size is too small, or the effect, if it exists at all, is too small to recover “true positives” defined as effects that truly exist in reality.  Combined with testing hypotheses that are rather unlikely to be true, this can easily lead the majority of published articles with statistically significant findings to misleadingly suggest that an effect exists where there is none.

We find Ioannidis’s core assumption unrealistic.  Although research practices vary from field to field and from area to area, researchers generally tend to test hypotheses that have a fairly high probability of actually being true.  Researchers do not, as Ioannidis implicitly assumes, test random hypotheses that have a uniform distribution of actually being true.  This is simply a misrepresentation of how scientists work and how they select their projects.  Based on well supported theories and existing evidence, they often test hypotheses that have a high probability of being true.  As a consequence, the probability of small p-values (and greater statistical significance) is larger, arguably much larger, than Ioannidis assumes.
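The disagreement in the passage above comes down to one number: the prior probability that a tested hypothesis is true.  A minimal sketch of the arithmetic follows, using the standard positive predictive value calculation that underlies the argument in the 2005 essay.  The function name, the default power and significance level, and the two example scenarios are illustrative assumptions, not figures taken from Plümper and Neumayer or from Ioannidis.

```python
def positive_predictive_value(prior, power=0.8, alpha=0.05):
    """Share of statistically significant findings that reflect a real effect.

    prior: assumed probability that a tested hypothesis is true
    power: probability of detecting a real effect (1 - beta)
    alpha: probability of a false positive when no effect exists
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must lie between 0 and 1")
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios chosen only to contrast the two positions.
scenarios = {
    "long-shot hypotheses, under-powered studies": (0.05, 0.2),
    "well-grounded hypotheses, adequate power": (0.50, 0.8),
}

for label, (prior, power) in scenarios.items():
    ppv = positive_predictive_value(prior, power=power)
    print(f"{label}: PPV = {ppv:.2f}")
```

Under the first set of assumptions only about 17 percent of significant findings would be real; under the second, about 94 percent.  Neither scenario includes bias or tweaking, which would lower both numbers.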

Ioannidis is called an “enfant terrible” by Plümper and Neumayer.  This fits, and he is definitely a gadfly who from 2005 through 2025 averaged 56 publications a year, or about one a week.  This is remarkable, and frankly unbelievable if each author on a scientific paper is responsible for the entire content of that paper (that this requirement is often unmet explains much scientific misconduct).  His most famous paper from 2005 was also corrected in 2022, which leads one to believe that few readers got past its very useful clickbait title.

Still, a common recommendation for increasing the credibility of science is that published research must be replicated before it is accepted as true or, more correctly, useful.  However, without repeating myself too much, the goal of science is not to produce truth.  Truth is for theologians and philosophers (and our modern politicians).  The goal of science is to produce useful, factual information that allows us to understand the natural world better. [4]  Very few scientific experiments that are modeled on complex systems are precisely replicable. [5]  But results that do not allow two questions to grow where there was only one before (Thorstein Veblen) are not useful, whether they are correct or not.  Sometimes it is the mistakes in theory or interpretation that lead to deeper understanding of a scientific question.

So, what are the conclusions reached in The Credibility Crisis in Science?  For those who feel compelled to cheat for whatever reason, deterrence is unlikely to work.  But when a scientist is caught fabricating results, a career will end.  Prevention is not as difficult as thought by Plümper and Neumayer, especially when the dishonesty is not perpetrated at the top.  A good mentor and good scientist checks and understands every piece of primary data that eventually produces a result that is published.  The rule with my students is “once is an anecdote, twice is data, and three times is a result.”  And then we do it all over again from a slightly different perspective, rather than tweaking our conditions to get our hoped-for answer.  When the principal investigator is the perpetrator, things get dicey because whistleblowers are not a beloved species. [6]  Detection strategies will require access to all primary data by editors and reviewers.  Anything less facilitates dishonesty.  Plümper and Neumayer conclude that old-fashioned peer review is virtually the only strategy that will work in the long term.  They are correct.  But for this to happen, the nature of peer review and the current business of scientific publication must change.

The Credibility Crisis in Science is well worth the read, but it fails in its goal to properly diagnose the deeper problems of the “scientific enterprise.”  As Plümper and Neumayer properly note:

Science has lost some (one might say much) of its standing with the public.  While skepticism about scientific findings can be healthy (and it is essential) and is an inherent part of the scientific process, a general disbelief and distrust of scientific findings pose significant challenges.  Scientists have a vested interest in regaining some (most) of that lost trust…But much would be gained if scientists were honest about the uncertainties associated with scientific results – honest with other scientists in scientific publications and honest in public statements.  Scientists must learn to distinguish between scientific results and their private opinions, and they should promote brutal transparence in scientific research, not hide potential conflicts of interest, and find ways to improve communications between themselves and the public in order to rebuild trust.

Where to begin?  The first place would be to define “science” (i.e., scientific research) as the disinterested search for new knowledge about the natural world, from the social psychology of political and religious belief to the structure and function of individual cells in the organism. [7]  While this may be understood to be the case by the authors, they do not take note that it is the scientistic Merchants of Doubt who first sowed distrust in science because it interfered with their interests.  They are still with us and they are seldom called out as scientists on a specific mission to “prove something.”

No disinterested scientist, which is the only kind of scientist, objects to skepticism about scientific results until their utility has been demonstrated because they form the foundation for further advances.  And no disinterested scientist disputes his or her vested interest in regaining trust that was lost because of the improper use of scientistic gestures on the part of Merchants of Doubt.  A disinterested scientist is honest with himself or herself first, last, and always, and not one during my long career has conflated personal opinions with scientific results.  Scientists who are not willing and able to share their data with other scientists are not scientists.  Several professors of my early acquaintance who were in the Monsanto orbit stopped being disinterested scientists when they began to accept industrial support for their research and parroted the Monsanto line that gave us Roundup Ready commodity crops and herbicide-resistant weeds.  This is recognized by Plümper and Neumayer, who nevertheless seem to need to be reminded at times that research and scientific research are not the same thing:

While it is often impossible to demonstrate that single studies suffer from vested interests, considerable evidence exists for bias at the aggregate level.  For example, research financed by corporate sponsors is many times more likely to find supportive evidence than research not sponsored by corporations.  This holds even when other “sponsorships” are present (Fabbri et al. 2018)

Yes, it does.  And this is why the reader should always read the acknowledgments of a scientific paper first.  Who pays can influence what is published and some of this research goes by the name of Evidence-Based Medicine.  As Matthew G. Saroff commented nearly four years ago:

Much of the skepticism about science is not because people think themselves smarter than the scientist, though some do, but because people think the scientists are corrupt.

In my long experience, “corrupt” is seldom the exact description, but since the Bayh-Dole Act of 1980 the undercurrents of American biomedical science have militated strongly against “disinterested” as the ideal, default description of the typical scientist.  One can reasonably say that scientists have lost the plot.  Over the past five years the behavior of many scientists during the pandemic did not meet reasonable expectations, usually because the protagonists on multiple sides of the argument about how to respond to COVID-19 were not disinterested in the proper, if provisional, path to take in a very difficult situation.  Many of them arrived at the argument fully formed, like Athena.

But this is in no way limited to science.  Our politics and politicians failed, too.  And they have been failing since the Neoliberal Dispensation turned our world into a winner-take-all society.  This allows, or impels, both wings of the Uniparty to spend their time dialing for dollars and kowtowing to their masters on K Street instead of tending to the business of the republic.  As for business and industry, in the 1950s the CEO of General Motors certainly had no love for Walter and Victor Reuther of the United Auto Workers, but he was proud to lead a company that employed several hundred thousand men (and a few women) at more than a living wage.  The same is undoubtedly true of the CEO of General Electric.  There can be no doubt that both considered themselves rich.  Now, our Tech Bros look forward to abolishing those few such jobs that remain and taking it all.  It is passing strange that so many of our compatriots are fine with this, which will not end well for them or anyone else.

Thomas Plümper and Eric Neumayer have provided in The Credibility Crisis in Science a useful, if somewhat didactic, overview of what is wrong with American science in the twenty-first century.  However, they never really defined what science is, and they mostly left out the social and cultural context that has damaged scientists and science, which is the same context that has damaged society as a whole.  There are no easy solutions to this problem.  Maybe there are no solutions.  But as members of a culture and society gone bad, scientists as a group are, in the end, no different from any other group.  Until we scientists in the aggregate realize this, nothing can change.

Notes

[1] The case of Gino is covered here and in the linked articles in the piece, while that of Stapel is covered in an extensive Wikipedia entry that summarizes his remarkable case very well.  Lesné is no longer a professor at the University of Minnesota.

[2] The Genetic Lottery was reviewed here and the inevitable follow-up discussion can be found here.  The popular and improper misuse of science to support modern eugenics is the stock in trade of Charles Murray in The Bell Curve and other works.

[3] This seems to be a distant and unconvincing echo of Against Method by Paul Feyerabend, who was correct that while there is no one scientific method, there is a scientific method.

[4] From Prolegomena to an Understanding of the Replication Crisis in Science: Nancy Cartwright has the much better view, one that is more congenial to the practicing scientist who is paying attention.  In her view, “theory and experiment do not a science make.”  Yes, science can and has produced remarkable outputs that can be very reliable (the goal of science), “not primarily by ingenious experiments and brilliant theory…(but)…rather by learning, painstakingly on each occasion how to discover or create and then deploy…different kinds of highly specific scientific products to get the job done.  Every product of science – whether a piece of technology, a theory in physics, a model of the economy, or a method for field research – depends on huge networks of other products to make sense of it and support it.  Each takes imagination, finesse and attention to detail, and each must be done with care, to the very highest scientific standards…because so much else in science depends on it.  There is no hierarchy of significance here.  All of these matter; each labour is indeed worthy of its hire…Contrary to the conceit of too many scientists, the goal of science is not to produce truth.  The goal of science is to produce reliable products that can be used to interpret the natural world and react to it as needed, for example, during a worldwide pandemic.  This can be done only by appreciating the granularity of the natural world.”

[5] A simple example from my research.  My first exercise as a postdoc was to replicate an experiment on the pH-dependence of the binding of my favorite protein to a binding partner in a multicomponent complex.  I could not get the experiment to work, so we pivoted with little angst to something else that turned out to be much more useful.  As it happened, I had purified my protein from smooth muscle.  The previous paper had used the protein purified from human platelets.  I later discovered the proteins were not the same.  Although they were the same size and behaved similarly as far as we could tell, they were only 72% identical at the amino acid level because they were the products of a gene duplication in our vertebrate ancestors going back 440 million years to fish.  What was thought to be a simple system was not.  Most biological systems are similarly complex.

[6] For example, in this case, with which I became familiar after the fact, the Principal Investigator (PI) leading his research group manipulated images and lied about materials required to support several grant applications to NIH.  He was not found out until he made the mistake of hiring a Research Associate with a PhD.  His previous lab members were technicians, never graduate students or research fellows, who did the experiments after which the PI manipulated their data without their knowledge or consent.  It is unlikely they had anything to do with the publications or grant applications.  It is also absurd that NIH did not seek recompense (~$7M) for the wasted grant money, but that is another issue altogether.

[7] I leave out the physical sciences here because, in general, the absolutism of the atom does not grant much leeway.  While chemists can be just as duplicitous as any other human being, their results are too close to well-understood theory to stray far from fact.  Drs. Pons and Fleischmann and the University of Utah found this out rather quickly regarding cold fusion.  A similar example from biology is the published, and finally retracted, paper by Felisa Wolfe-Simon and others on bacteria that (do not and cannot for chemical reasons) substitute arsenic for phosphorus in the structure of DNA [to go deeper into the weeds, a sugar-arsenate backbone would not be stable in water as is the sugar-phosphate backbone of nucleic acids].  How this paper passed peer review remains a mystery to all who have considered the question.


22 comments

  1. tyaresun

    My econometrics textbooks in the eighties had a couple of paragraphs on “data mining”. Reviewers of academic papers would accuse the authors of “data mining”. “Data mining” was a pejorative in the eighties. From there, in the late nineties we came to a point where universities were teaching courses on “data mining” and academics were coming up with more and more powerful algorithms for “data mining” using more and more powerful compute and memory resources. Gone were any attempts at causal models or explanations for the results, everything was “pattern recognition”. LLMs are the culmination of that trend.

    When it is impossible to publish papers with negative results, “scientists” whose rewards depend upon number of publications, are bound to “produce” positive results. These incentives completely negate the principles of science.

  2. Robert Hahl

    Overheard at a hiring committee meeting at Mass General in 1984: “He publishes one paper a week. You are impressed, I am appalled.” I think the problem became acute when the ratio of a candidate’s publications to age became the deciding factor. It is just like making GDP the measure of all things economic.

  3. Carolinian

    Some of us would question whether social “science” is even a thing. Did Freud have any scientific basis for all those theories of his? And yet the 20th century was obsessed with him. Clearly science, the brand, has come to have as much importance in our society as science the real thing. Attaching the “s” word conveys a halo of credibility to ideas being promoted for commercial reasons or career promotion reasons. And it’s all well and good to talk about idealism but in medicine in particular some very big bucks are at stake. For a while my mother had a clerical job in a local college chemistry dept and would ask all the premeds why they wanted to be doctors. She said they invariably answered “to become rich.”

    Capitalism corrupts, absolute capitalism corrupts absolutely? It has to be faced.

  4. DJG, Reality Czar

    Many thanks, KLG. As ever, your essay is enlightening.

    A few points. One is linguistic. Here in Italy, scienza still means a body of knowledge or expertise. I was recently at a program, a dramatic monologue about a remarkable woman, Ada Gobetti, highly influential here in Torino during the twentieth century. The author / actress thanked her “consulente scientifica,” who was not a scientist in the U.S. sense, but instead, a woman who is a researcher in history with expertise on local history. So when I read the word science in your essay, I tend to think of what might be called the scientific endeavor, rather than the structure of research and how it is funded – the other science.

    You diagnose the problem in these sentences: “Scientists who are not willing and able to share their data with other scientists are not scientists. Several professors of my early acquaintance who were in the Monsanto orbit stopped being disinterested scientists when they began to accept industrial support for their research and parroted the Monsanto line that gave us Roundup Ready commodity crops and herbicide-resistant weeds.”

    We live in a time when many people and institutions have lost moral authority. It isn’t just Hillary Clinton who has lost whatever scraps of moral authority she may once have had!

    As a writer and editor, I am seeing the same decline into whimpering, in-fighting, and insignificance in the literary world. You mention Einstein. One wonders who in U.S. writing these days has the stature of Walt Whitman, Edith Wharton, Marianne Moore, John Dos Passos, Arthur Miller, or James Baldwin.

    Here in Torino, I have a pile of paper copies of Harper’s Magazine. The issue with this essay, “Speaking Reassurance to Power,” by Pankaj Mishra, is marked:

    https://harpers.org/archive/2025/08/speaking-reassurance-to-power-pankaj-mishra-easy-chair/

    Mishra makes this observation, which is the companion piece to your sentences above: ‘For all its claims to superior virtue, the liberal American intelligentsia manifests very little of the courage and dignity it has expected from artists and thinkers in less fortunate societies, as hooded and masked officials disappear students for the crime of writing school-newspaper op-eds and liking social-media posts. Dissenters from far-right orthodoxies in the United States did not face such a concerted onslaught even in the early Fifties, when, threatened by the House Un-American Activities Committee, pursued by the FBI, and canceled by the Library of Congress, Thomas Mann departed the arsenal of democracy for Switzerland. Today, the “disgusting exhibition,” as Mann saw the witch hunts of McCarthyism, “of primitive Puritanism, hatred, fear, corruption and self-righteousness” is much more extensive.’

    Yes, it is corruption.

  5. John Merryman

    Wouldn’t it be nice to be able to just write in a figure and call it Dark Money, when the account comes up seriously short?
    Our current expanding universe cosmology is not something that can stand serious scrutiny, but too many reputations and the culture is wrapped up in it.
    In the most basic terms, if intergalactic space were expanding, the speed of the light crossing it should increase proportionally, in order to remain Constant!
    It fails its own premises.
    One way light does redshift over distance is as multi spectrum packets, as the higher frequencies dissipate faster. Yet that would mean we are sampling a wave front and so the quantification of light is an artifact of its detection and measurement. A loading or threshold theory of quantization.
    Which opens another large can of worms.

    1. hereweare

      “In the most basic terms, if intergalactic space were expanding, the speed of the light crossing it should increase proportionally, in order to remain Constant!”
      What???
      Why not light takes longer to get from A to B than it did, because the distance has increased but the speed of light is constant?

    2. Samuel Conner

      An ant crawling along the surface of an inflating balloon at a fixed speed relative to the local surface of the balloon it is in contact with does not experience a change in its speed, regardless of how rapidly or slowly the balloon inflates. This is how to think about light propagation in an expanding cosmos.

      There are problems in current consensus cosmology (the Hubble Tension, the “sigma-8” tension, the identity of the particle(s) or field(s) responsible for the effects of what is called “dark energy”, the nature of “dark matter”, etc.), but light propagation in an expanding cosmos is not one of them.

  6. The Rev Kev

    I wonder about the long term effects of tweaking. So you may have a paper that has tweaked its results. But then subsequent papers go off in the direction of those tweaked results and maybe do some tweaking of their own. The net result is that scientific research may be then diverted to follow this particular path when better more promising paths have been now bypassed.

    1. hereweare

      “It is difficult to disagree with Plümper and Neumayer:

      Tweaking is potentially more damaging to science in the long run than data manipulation and fabrication.”

  7. Patrick Donnelly

    What stops planets falling into the Sun over the last 4.5 billion years?

    Science has no sensible answer! Satellites are kept aloft by rockets which require fuel. They all fail eventually.

    Then ask why gravity only accelerates as the square of the velocity and not the cube.

    Galileo at Pisa proved weight has nothing to do with gravity.

    1. Mel

      Douglas Adams explained it in one of the later volumes of The Hitchhiker’s Guide… as the secret of flight.

      You throw yourself at the ground and miss.

    2. hereweare

      Science has no sensible answer? Atmospheric drag makes our artificial earth satellites fall.

    3. BiilS

      What stops planets falling into the Sun over the last 4.5 billion years?

      Isaac Newton figured that out 350 years ago. Where have you been?

    4. Samuel Conner

      I assume that this is intended in jest and, as such, it is nicely done.

      Regarding your allusion to the fact that the centripetal acceleration of a particle in circular orbit is v^2/r, the answer to your in-jest question can be deduced in a number of ways. A simple one is to note that the units of acceleration are “length per time-squared”. Third power of velocity screws that up.

  8. Bun

    “The Credibility Crisis in Medical Science”.
    Fixed it for them.

    Where there are vast sums to be made, people will lie, cheat, and/or steal to get a piece of the action.

    Whocoodanode?

    Sincerely,
    Annoyed physicist

  9. CanCyn

    I had a high school science teacher who told us not to ‘cook’ our results when writing our lab reports. She meant ‘tweak’ and she was very clear that experimenting and truthfully reporting our results were far more important than proving or disproving our hypotheses. Said teacher must be spinning in her grave these days.
    I firmly believe that the commercial sponsorship of science and the for profit publishing world really are responsible for the current state of affairs. As with politics, taking the money out of it would go a long way to righting the ship.

  10. Darthbobber

    Where it was a rarity before the mid-80s, it’s now more likely than not that a leading light in a cutting edge medical or biological field comes with a corporation that proposes to monetize their work.
