By Alexander Krauss, Post-Doctoral Research Fellow, London School of Economics. Originally published at the Institute for New Economic Thinking website
Randomised controlled trials (RCTs) are generally viewed as the foundational experimental method of the social and medical sciences. Economists depend on them, for certain questions, as their most valued method. Yet RCTs are not flawless. In my study, Why all randomised controlled trials produce biased results,* I argue that RCTs are not able to establish precise causal effects of an intervention.
Many of us, however, have likely used some medication, own some technology or support some public policy tested in a trial. To assess how effective these may be before supporting them (whether as patients, consumers or voters), RCTs are commonly conducted by splitting a sample of people into a treatment group and a control group. Contrary to common belief, I argue in my study that some degree of bias inevitably arises in any trial. This is because some share of recruited people refuse to participate in any trial (which leads to sample bias), some degree of partial blinding or unblinding of the various people involved in a trial generally arises (which leads to selection bias), and participants generally take the treatment for different lengths of time and in different dosages (which leads to measurement bias), among other issues.
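The refusal mechanism can be sketched in a toy simulation (the severity scale and the refusal model are invented for illustration): if the probability of refusing enrolment rises with illness severity, the enrolled sample under-represents the sickest people before any randomisation even happens.

```python
import random

random.seed(0)

# Hypothetical population: each person has a severity score in [0, 1].
population = [random.random() for _ in range(100_000)]

# Assumed refusal model: probability of refusing enrolment is 0.5 * severity,
# so the sickest people are the most likely to stay out of the trial.
enrolled = [s for s in population if random.random() > 0.5 * s]

pop_mean = sum(population) / len(population)
enr_mean = sum(enrolled) / len(enrolled)

print(f"population mean severity: {pop_mean:.3f}")
print(f"enrolled sample mean severity: {enr_mean:.3f}")
```

However carefully the enrolled sample is then randomised between arms, both arms inherit this distortion, so the estimated effect describes a healthier group than the target population.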
The ten most-cited RCTs worldwide, which I assess in the study, suffer from such general issues. But they also suffer from other methodological issues that affect their estimated results as well: participants’ background characteristics (like age, health status, level of need for the treatment etc.) are often poorly allocated across trial groups, participants at times switch between trial groups, and trials often neglect alternative factors contributing to their main reported outcome, among others. Some of these issues cannot be avoided in trials—but they affect the robustness and validity of their results and conclusions.
This is important as the level of validity of a trial’s causal claims is at times a life-or-death matter—for example in public health. The study itself is about the RCT method and not any individual RCTs, and the insights outlined in this study are useful and important for researchers using RCTs in economics, psychology, agriculture and the like (though the ten most-cited RCTs worldwide that are assessed happen to be medical trials).
Assumptions and Biases Generally Increase at Each Step When Carrying Out Trials
That is, from how we create our variables, select our initial sample and randomise participants into trial groups, to how we analyse the data for participants with different lengths of time and amounts of treatment, and how we try to ensure everyone involved is fully blinded before the trial begins and throughout its entire implementation—among many other steps before, in between and after these.
I thus argue that the reproducibility crisis is, to a large extent, the result of the scientific process always being a complex human process that involves many actors (study designers, all participants, data collectors, implementing practitioners, study statisticians etc.) who must make many unique decisions at many different steps over time when designing, implementing and analysing any given study—and some degree of bias unavoidably arises during this process. Variation between study outcomes is thus the norm, and one-to-one replication is not possible.
Researchers should thus not assume that the RCT method inevitably produces valid causal results—in fact, that all trials face some degree of bias is simply the trade-off for studies to actually be conducted in the real world. A number of things inevitably do not go as planned or designed given the multiple complex processes over time involved in carrying out trials. Once a study is conducted and completed some biases will have arisen and nothing can be done about a number of them. The study, at the same time, aims to improve how RCTs are carried out by outlining how researchers can reduce some of the biases.
Are Biased Results in Trials Still Good Enough To Inform Decisions in Public Health and Social Policy?
In many cases they are. But that judgement generally depends on how useful the results are in practice and their level of robustness relative to other studies that use the same method or at times other methods. Yet no single study should be the sole and authoritative source used to inform policy and our decisions.
Some may respond, “are RCTs not still more credible than other methods even if they may have biases?” For most questions we are interested in, RCTs cannot be more credible because they cannot be applied—e.g. for most complex phenomena we study, such as effective government institutions, long life expectancy, democracy, inequality, education systems, psychological states etc. Other methods (such as observational studies) are needed for the many questions generally not amenable to randomisation, but also at times to help design trials, interpret and validate their results, and provide further insight on the broader conditions under which treatments may work, among other reasons discussed in the study. Different methods are thus complements (not rivals) in improving understanding.
Taken together, researchers, practitioners and policymakers need to become better aware of the broader range of biases facing trials. Journals need to begin, as I illustrate in the study,* requiring researchers to outline in detail the assumptions, biases and limitations in their studies. If researchers do not report this crucial information in their studies, practitioners and citizens will have to just rely on information and warning labels provided by policymakers, biopharmaceutical companies and the like implementing the tested policies and selling the tested treatments.
* Krauss, Alexander. Why all randomised controlled trials produce biased results. Annals of Medicine, 50:4, 312-322 (2018).
This is undergraduate level stuff, although it probably comes as news to some economists.
It’s also quite dangerous to generalise from some cases of RCTs having problems (often well known, described and characterised, if the author read the literature) to all of them being invalid.
If more economists relied on RCTs, or any empirical methodology for that matter, they would have more credibility, if less utility to their owners.
I would hate to see them discouraged from empirical research.
Interesting. Too much empirical research in economics is part of the problem. In my reading of mainstream academia, economic research fits into two categories:
-Econometrics, creating statistical relationships with no thought to actual complex causalities
-Heavy mathematical models attempting to model simple things in a complex way that has so many assumptions and unknowns that it is useless in the real world
If science ran like this we could throw away 500 years of technology….
I am most troubled by the use of relatively simple mathematical models to describe complex phenomena, then use those models for policy prescriptions. They are oversimplified to the point of being little more than ideological props, but are given outsized importance because they produce policy that benefits their sponsors.
Exhibit A: “The Laffer Curve”
A study I read estimated the peak of the curve (where reducing tax rates increases revenue) at around 68% (while also casting shade on the whole concept).
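As a toy illustration (the functional form and the elasticity value are my own assumptions, not the estimate from the study the comment recalls), a revenue curve in which the taxable base shrinks as rates rise has an interior peak:

```python
# Toy Laffer-style revenue curve: revenue = rate * taxable base, where
# the base shrinks as rates rise. The elasticity is an illustrative choice.
def revenue(rate: float, elasticity: float = 0.5) -> float:
    base = (1.0 - rate) ** elasticity  # assumed behavioural response
    return rate * base

rates = [r / 100 for r in range(100)]  # 0.00, 0.01, ..., 0.99
peak = max(rates, key=revenue)
print(f"revenue-maximising rate in this toy model: {peak:.2f}")
```

With this particular elasticity the analytic peak sits at 2/3, coincidentally near the 68% figure; the real point is that the peak's location depends entirely on the assumed behavioural response, which is exactly what partisans dispute.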
But if you believe Republicans there is literally no tax rate too low where cutting taxes further would not unleash a flood of benefits (for them).
Maybe the article is, insofar as the idea of the observer bringing bias to the observed is not new. I didn’t read the study, and depending on the examples and analysis it could be quite interesting.
Part of the point is that “empirical” may not mean what you want it to.
FWIW: The author of the following piece argues that it is a general problem that could be solved by jettisoning current statistics and embracing Bayesian analysis:
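For a sense of what the Bayesian alternative looks like, here is a minimal beta-binomial sketch (all trial numbers are invented): rather than a yes/no significance verdict, a uniform Beta(1, 1) prior on a treatment's success rate is updated with the observed outcomes, and the whole posterior is reported.

```python
from math import gamma

# Invented trial outcome: 14 successes, 6 failures.
successes, failures = 14, 6
a, b = 1 + successes, 1 + failures  # posterior is Beta(15, 7)

posterior_mean = a / (a + b)

def beta_pdf(x: float) -> float:
    """Density of the Beta(a, b) posterior."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * x ** (a - 1) * (1 - x) ** (b - 1)

# P(success rate > 0.5) by left-endpoint numerical integration over (0.5, 1).
step = 1e-4
p_gt_half = sum(beta_pdf(0.5 + i * step) * step for i in range(5000))

print(f"posterior mean success rate: {posterior_mean:.3f}")
print(f"P(success rate > 0.5): {p_gt_half:.3f}")
```

The output is a full probability statement about the effect size, not a binary pass/fail at an arbitrary threshold.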
Still, aren’t RCTs as good as it gets, at least if you assume a lack of widespread outright fraud? From what I’ve read, things like epidemiology with statistical adjustment etc. seem to be an even more questionable methodology than RCTs.
It’s undergraduate level stuff in some disciplines, like statistics. In others it’s never acknowledged at all.
When teaching research methods, we always tell students that, even when testing a hypothesis with the gold standard test of causality (the controlled experiment with random assignment to the two groups), there are nevertheless various significant threats to internal validity. Even when the researcher carefully attempts to guard against these, no study is infallible.
The controlled experiment (with random assignment) is the best tool out there.
“Are Biased Results in Trials Still Good Enough To Inform Decisions in Public Health and Social Policy?
In many cases they are. But that judgement generally depends on how useful the results are in practice and their level of robustness relative to other studies that use the same method or at times other methods. Yet no single study should be the sole and authoritative source used to inform policy and our decisions”
yes, common sense
Well said. A straw-man study, debunking the idea that RCTs are perfect, when in reality everyone knows perfectly well what the problems are, not least those in the field.
Some of the people on the lower end of the totem pole have been lodging OFFICIAL COMPLAINTS and unofficial complaints for years. Respected men and women in their fields have doctored results to fit their hypotheses in front of eyes close to mine. Maybe not 100% fraud, but any amount makes the entire enterprise moot. Medical research needs to be nationalized or changed, I don’t know, but it is in large part broken.
The theoretical conditions that have to be fulfilled for regression analysis and econometrics to really work are nowhere even closely met in reality. Making outlandish statistical assumptions does not provide a solid ground for doing relevant social science and economics. Although regression analysis and econometrics have become the most used quantitative methods in social sciences and economics today, it’s still a fact that the inferences made from them are — strictly seen — invalid. – LARS P. SYLL
Given the greater money angle, things are brought to market prematurely and with less testing …
Given the Martha Stewart angle, things are brought to market prematurely and with less testing …
Theranos is a good recent example. Even if you take management as sincere, they are sincere about making oodles of money, and they are fearful of being late to market, being late to patent, having a dud on their hands.
I’ve addressed this issue in my own work on “race differences.” It’s hard to get across that any and every randomly constituted group of people will differ in an infinity of ways. As such, any post-treatment difference detected between the groups may, logically, be due to any or all of the infinity of differences between the groups at the outset of the study. If the study is repeated, these pre-treatment differences will theoretically “wash out” at infinity. If the post-treatment difference is still detected, then the logic is that the result must be attributable to the treatment.
The problem is, first, nobody does the same study an infinite number of times (nobody likes to be second in the race for a result), and second, journals don’t publish pure replications. A third problem, identified by David Bakan 50+ years ago, is that if 20 independent labs do “the same” study, one of them will detect a “significant” difference and publish. The other labs will assume they did something wrong, and a chance result will persist in the literature indefinitely.
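Bakan's point can be checked with a quick simulation (the sample size and the crude z-test are illustrative assumptions): even when no true effect exists anywhere, a group of 20 labs will produce at least one "significant" result most of the time.

```python
import random

random.seed(1)

def one_lab(n: int = 30) -> bool:
    """One lab's two-group study with NO true effect, using a crude z-test."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / (n - 1)
    vb = sum((x - mb) ** 2 for x in b) / (n - 1)
    z = (ma - mb) / ((va + vb) / n) ** 0.5
    return abs(z) > 1.96  # "significant" at roughly alpha = 0.05

rounds = 500
hits = sum(any(one_lab() for _ in range(20)) for _ in range(rounds))
share = hits / rounds
print(f"share of 20-lab rounds with at least one 'significant' result: {share:.2f}")
```

With a 5% false-positive rate per lab, the chance that at least one of 20 labs crosses the threshold is about 1 - 0.95^20, roughly 64%, so the scenario Bakan described is the expected outcome, not a fluke.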
Nevertheless, indefinite replication is the only sensible ground for arguing that a particular treatment “causes” a particular result. The pre-treatment differences can be minimized with very large samples, but they can never be eliminated. “Matching,” another strategy that’s proposed, logically doesn’t work either, because (1) you’ve violated randomization by selecting for characteristics, (2) you can’t match on every potential confound (because there’s an infinity of them), and (3) you can’t model how matched, pre-treatment differences may be differentially related to other important factors. For example, in racism research, if you match a group of Native Peoples with a group of non-Natives with respect to income, you will have, either or both, an unusually well-off group of Native Peoples and an unusually poor group of non-Natives.
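The claim that large samples shrink, but never eliminate, pre-treatment differences is easy to demonstrate (the background trait and its prevalence here are invented):

```python
import random

random.seed(2)

def imbalance(n: int, prevalence: float = 0.3) -> float:
    """Randomise n people into two arms and return the absolute difference
    between arms in the share carrying a background trait (e.g. prior smoking)."""
    people = [random.random() < prevalence for _ in range(n)]
    random.shuffle(people)
    arm_a, arm_b = people[: n // 2], people[n // 2 :]
    return abs(sum(arm_a) / len(arm_a) - sum(arm_b) / len(arm_b))

results = {}
for n in (50, 500, 5000):
    results[n] = sum(imbalance(n) for _ in range(500)) / 500
    print(f"n={n:5d}  average between-arm imbalance: {results[n]:.3f}")
```

The imbalance shrinks roughly with the square root of the sample size but never reaches zero; and this is just one trait out of the infinity of potential differences described above.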
David A. Freedman’s books and articles are the clearest discussions of these difficulties, and articles like the one under discussion seem necessary to inject a more cautious attitude about randomized experiments.
there is a very large element of scientism behind the enthusiasm for RCTs: the notion that the scientist is “objective” because he puts on a white lab coat during the workday and sometimes talks in an impenetrable and stilted jargon.
research design involving controlled trials in a laboratory and interpreted in light of a mathematical model of analysis of variance are what science is sometimes supposed to aspire to and RCTs gain credibility by promising to be an analogue to those kinds of procedures, but it is a false promise.
the reported research reinforces this scientism to some extent by pushing the idea that a major problem with RCTs can be traced to the introduction of “bias”. the implication is that the premise behind RCTs — a certain view of the role of probability and statistics in representing social or medical life — is epistemologically valid.
the focus on sources of bias will be familiar to those involved in doing RCTs and enthusiastic about the methods and may aid in getting a hearing for this methodological research. a frontal assault on the philosophy of science behind the use of probability concepts in RCTs may be fully justified, but simply not likely to succeed in getting a hearing, for reasons of the sociology of science in the fields, like economics and medical research, where RCTs are favored.
Krauss is tactful. He gently suggests that not all sources of bias are subject to remedy. And, he allows that RCTs are not the be-all-and-end-all standard method for research. I am doubtful that full implications of that message will be gratefully received and absorbed any time soon in either medical science or economics.
the weakness of theory in the disciplines of economics and some precincts of clinical medical science plays, i suspect, a large part in the enthusiasm for RCTs. Statistical technique will allow us to penetrate the statistical fog and find the shadows of unseen structure in correlation — if you are a really good story-teller, you might be able to spin something, but it seems to me to be designed for an astronomy of constellations, or not even that: naming patterns that are ephemeral, transitory or nearly meaningless.
weakness in our common intuitive understanding of statistics and probability also plays a large part. even knowing that the reality of the economy or the biological processes of a living being are “complex”, are we at all justified in trying to circumvent that complexity by treating it as if it is a dust cloud merely obscuring a simple correlation or three?
i will not speak to medical science, but mainstream economics uses statistical methodology generally to avoid empiricism. RCTs are just one manifestation of that pathology. having to impose causative transitivity or linearity to fit data to Procrustes’ bed comes very naturally to a profession dedicated to keeping an analytic theory isolated from contact with factual reality. A statistical “empiricism” in economics is really no empiricism at all — it becomes another way of continuing to do theoretical analysis in isolation from reality.
Indeed! Follow Dr. Prasad, MD on Twitter to see what happens when he makes singularly plain statements on the same topic. He’s viciously attacked. Medicine, as with many professional endeavors, is a field built on the efforts of many strong individuals whose egos are very fragile and whose positions (salaries, gratuities) are dependent on the receipt of funding from interested parties whose positions are likewise very brittle.
Bruce, there are too many basic truths in your comments to second them all. I fully share the perspective you elaborate here, in all particulars.
The theoretical impoverishment of much of the sciences and the entirety of the social sciences has been striking to me over thirty years of close engagement with them. In part, to my observation, the large majority of human beings are simply bad at doing theory, and so compensate for their weaknesses. Statistical probability is a handy crutch, and one not without worth: you have no way to be sure that you are completely right, but you are far less likely to be completely wrong. The entire philosophy of science in the modern era derives from this logical formula, as you allude. Modern science includes a very large majority of theory-weak individuals, so even beyond inherent methodological conservatism there is a pronounced bias in these fields to keep the crutch at all costs, even rejecting ‘physical therapy for the theoretical tissue’ if necessary. Bias of this kind isn’t just an issue in economics, where it is corrosive of all endeavor, yes, or in medical science, but it permeates the hard sciences too. Quantum theory: it’s all probability, nobody has a clue how it works—or even if it works. There have been no real theoretical advances in physics in some eighty years, just fiddling with probability configurations at the margins, with ever more empirically strained, stained, and counterfactual hypotheses in consequence. The statistical coal gas of quantum theory has asphyxiated theory, which happens only by accident and as a last resort.
All of these concerns with statistical inference confound one even before addressing the potential for complex interactions to escape common statistical methods altogether. A re-thinking of the epistemology of causation in light of what is even so far understood regarding complexity and self-organization is desperately needed, but undertaken by few and isolated individuals when pursued at all. Not least because complex interactions threaten to decimate the usefulness of statistical inference as now employed.
Theoretically, reliance on statistical probability by itself alone is worse than useless, exactly because it is substituted for empirical engagement with the subject matter. I quite agree with this crucial point you raise. Which empirical engagement is often never undertaken at all. Modern research is largely a stupendous temple mound of piled-up useful truthiness with few empirical girders, and fewer and poorly placed logical inferences as the rivets therein. No one on the top of the mound, or climbing the steps thence, has any interest in digging deeper. Myself, I tend to equate the value of an observation with the willingness of the observer to get their hands dirty digging in ‘the facts,’ or even with the spiny question of what constitutes a ‘fact’ and its context. Time and money don’t wait for that effort, generally. Hence the ‘fact poor’ or even ‘fact free’ studies too prevalent in the research disciplines of our day.
Rather than a RCT with an effect group compared against a null with covariates equal between the groups, why not a factorial analysis by statistical DOE?
However, I’ve seen experimenters collapse matrices non-orthogonally to create significance out of noise.
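For readers unfamiliar with the proposal, a minimal 2×2 full factorial sketch (the response function is invented) shows how orthogonal contrasts recover main effects and the interaction separately, which is exactly what a non-orthogonal collapse destroys:

```python
from itertools import product

# Invented ground truth: 'drug' helps (+3), 'diet' helps (+1),
# and they interact (+2), on top of a baseline of 10.
def response(drug: int, diet: int) -> float:
    return 10 + 3 * drug + 1 * diet + 2 * drug * diet

runs = list(product((-1, +1), repeat=2))  # the 4 coded design points
y = [response(d, t) for d, t in runs]

# Orthogonal contrast columns; the estimates below are regression-style
# coefficients (half the classical DOE "effect").
drug_col = [d for d, _ in runs]
diet_col = [t for _, t in runs]
inter_col = [d * t for d, t in runs]

coef = lambda col: sum(c * yi for c, yi in zip(col, y)) / len(y)

print("drug coefficient:       ", coef(drug_col))   # 3.0
print("diet coefficient:       ", coef(diet_col))   # 1.0
print("interaction coefficient:", coef(inter_col))  # 2.0
```

Because the columns are orthogonal, each contrast recovers its own coefficient independently; collapsing the matrix non-orthogonally correlates the columns, the contrasts bleed into one another, and "significance" can then be conjured out of noise.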
That would be very interesting to see in practice, especially utilizing both methods and comparing. I don’t think Pharma would be on board though. Possibly scared of the results??? The RCT draws simple conclusions they are prepared to deal with. And I believe now they are required to report all studies (even failed ones). Reporting findings would require them to be responsible for more information (additional warnings and contraindications).
Indeed. That’s why the real gold standard in research is not so much formal rigor, as reproducibility across studies.
There is a push for studies to have larger numbers of subjects: good to a certain extent, but one mega study is a lot riskier than a bunch of smaller studies, because if the single large study has a major methodological flaw, more subjects would not help…
I do point out that, in the long run, no formal procedure can replace the role of honor in science. If scientists are under such pressure to produce ‘results,’ if getting funded gets harder and harder, if the alternative to getting funded is to be fired and stay poor for the rest of your life, sooner or later the system will be corrupted.
We say that competition is good, and indeed it is. But when competition reaches such frenzied levels that it is virtually impossible for even the best scientists to succeed without cheating, and when alternative career paths are also increasingly bad, the pressure to fudge results – or more likely, the pressure to not question your results when they confirm what you want – will become inescapable.
There is a reason that overpopulated places like India and Bangladesh contribute so little to science. Much of that of course is the lack of free resources. But also there is the level of competition that makes cronyism and corruption so endemic. And as the United States increasingly moves in that direction, sooner or later it will happen here.
From my experience in academic research it is not just the financial pressure to get results, but rather the perverse incentives to not pursue experiments that could end in negative results for their pet speciality. If you spent decades building and owning a narrative that technology X might be the perfect solution for problem Y in the future you are better off doing endless fiddling at the edge of the problem, rather than doing any experiments that might clearly show that it will never solve problem Y. Academic researchers are rewarded for peddling hope, and punished for showing when something will not work.
Interesting, thank you for this. Another piece of the puzzle.
This critique is about the hidden gaps between theoretical and actual properties of sampling, selection, and treatment, which I agree should be borne in mind. There is a deeper problem, however, that concerns the underlying theoretical model that informs most existing causal inference work.
Causal functions are viewed as separable: a “treatment” can be linked causally to an effect. This is sometimes the case, but rarely in the social sciences and is even questionable in many health contexts. Rather, causal factors are nonseparable; a treatment is mediated by a set of coincident characteristics or events. Thus there is an immense tradeoff between experimental control (isolating/identifying the treatment) and external validity.
An example is education. As someone who has worked in education-related fields (and been an educator), I have read endless studies where education (years of schooling, particular classroom-level treatments, etc.) is examined as a causal factor against various outcome metrics. But the causal significance of education is not direct; it is mediated by family or some other living environment, economic circumstances, cultural norms and practices and the like, which differ across individuals, groups and places. Trying to control for these things misses the point — they interact with education “treatments” to produce the results we see out in the world.
RCTs draw their validity from a conception of causation that doesn’t apply to most of the problems social scientists deal with.
Yes. The old slogan, “correlation is not causation”, doesn’t cover half of it. Fishing for patterns in an n-dimensional cloud bank often practically requires that the researcher rather uncritically presume that candidate causative factors are additive and transitive and linear in their relation to outcomes: statistical methods need strong causal models and if the researchers cum fisher folk do not have a strong theory, the methods impose one, willy nilly.
If all the researchers are doing is fishing for clues, to be investigated more deeply and fully, then it seems to me the activity may be worthwhile. But, the idea that RCT’s provide “proof” on a level with rigorously controlled experimental methods and research design is just delusional.
I think Krauss is being naive.
RCTs are used to get FDA to let Big Pharma sell a drug. Yes, they are NOT ecologically valid. More people might be helped or hindered by a drug than results indicate from samples and tight controls. That is in part why there are Stage IV RCTs – post-release studies of a drug once everyone is exposed. There are researchers who specialize in debunking bad medicine. We also have meta-analysis to increase ecological validity.
So where is the rub? In the rest of science we require multiple conceptual replications to catch the ecological factors that initial studies missed. But Pharma paid Congress to lower the bar to a single stage III RCT with a large N arguing that people’s lives are at stake (which sometimes they are, but not most of the time). It is hard to see what whining about RCTs is likely to achieve as most folks have no clue what you are talking about.
PS: The Association for Psychological Science has made the topic of research design and analysis a cornerstone issue and publishes monthly on the topic.
Is that true? I don’t know – genuine question. I would like some clarification/expansion as to what “Stage IV RCTs” and “single stage III RCT” mean in the cycle of regulatory approval and review. You seem to be saying something interesting and important about how RCTs are being used to assess “ecological validity”, but I am surely in the group that has “no clue what you are talking about”. More detail, please.
For periodic updates as well as discussion on the fundamental issues pertaining to RCT’s follow Vinay Prasad MD on Twitter
The full paper is available free at the link. I urge people to get it and at least skim thru it. While this article does not give examples of the issues at stake, the paper details:
1. initial sample selection bias – “[No data on selection] means that we do not have details about the representativeness of the data used for these RCTs. Moreover, the trial on cholesterol by Shepherd et al. was for example conducted in one district in the UK and the trial on insulin therapy by Van Den Berghe et al. in one intensive care unit in Belgium”
2. randomization of allocation of sample to treatment/control groups – “those receiving the main treatment (compared to those with the placebo) were 3% less likely to have had congestive heart failure, 8% less likely to have been smoking before the stroke, 14% more likely to have taken aspirin therapy, 3% more likely to be of white ethnicity relative to black, and 3% more likely to have had and survived a previous stroke”
This was news to me especially about clinical studies that were cited 10,000 times.
Personally I always read the papers when my doc switches my medicine. For example, when she switched me to Brilinta from Plavix I was really lucky and found a head to head trial and free paper https://www.nejm.org/doi/full/10.1056/NEJMoa0904327
Sadly, I didn’t think I was as ill as the patients in the clinical trial, so I told the doc that the trial sample wasn’t representative of me, but she was still insistent that Brilinta was superior for me.
That’s the other angle of drug delivery to the patient. Are the docs well-versed in data science/statistical inference ?
In the last analysis it STILL depends on the patient – but the educated ones should put in the effort and get acquainted with the data and then exercise a choice.
But what about those who are not well-versed in science /stats/ STEM in general ? Maybe, not-for-profit, independent, STEM-educated volunteer organizations are a path ?
Bruce, Peter, Richard, TG, Roland and others- Phenomenally thoughtful, analytical, logical comments!!!
For the wonkish:
See Deaton and Ziliak.
The underlying issue is gross misunderstanding of how statistical analysis should be done, and the situations and conditions under which results of such analysis can be deemed useful or valid.
Ziliak’s Evergreen Haiku:
What are you trying to say
Nice to see my favorite kind of article here, thanks.
Must say, though, these criticisms have been leveled at my science of choice, research psychology, for damn near 40 years now.
All those people, all these years, going through all those motions, publishing all those papers, without a scintilla of science in the vast majority, just gettin paid. I’d wear my grad-school reject status as a badge of honor, but now really, where would a Zen poet pin such a frivolous thing?
Meanwhile, field experiments with embodiment continue apace.
This is just the conclusion. It’s available for purchase or rental here.
Kinget, G. W. (1979). Objective psychology: a case of epistemological sleight-of-hand. Journal of Phenomenological Psychology, 11, 83-96.
A CASE OF EPISTEMOLOGICAL SLEIGHT-OF-HAND
G. M. KINGET
The first, most deeply hidden and most fatal of objective psychology’s epistemological feats of legerdemain here reported lies in introducing an ideology under the cover of a methodology. A second consists in homologizing events phenomenally as different as consciousness and physiology. A third manages to ignore the absolutely peculiar, oft-mentioned, seldom heeded fact of the identical nature of subject and experimenter in human research (Riegel, 1975), a peculiarity which clamors for an epistemological substructure sui generis. A fourth equates the unique case of human experience with the single case of physics, while a fifth shifts the meaning of the term “fact” from physical phenomena to conscious phenomena–oblivious of the fact that in both cases, true replication, precondition of factual status, is impossible where specifically human matters are concerned. Finally, under the guise of dealing with humans qua organisms pure and simple, objective psychology deals with irretrievably encultured systems which tolerate no separation of culture and biology. This fabric of operations both bold and subtle, which reduces the irreducible and homologizes the inherently disparate, is further shot through by a form of sleight-of-hand so solidly established that it can rightly be termed an institution: substituting quantity for specificity. Operations such as these result in a mere shift of psychology’s earlier dependency upon philosophy to a new dependency, this time on the physical sciences, a shift which robs psychology of its very chance to acquire both identity and autonomy. For psychology to have a chance to achieve genuine autonomy it must, in Merleau-Ponty’s […]
The latter words are apt to evoke the specter of a “subjective science” in the thinking of the phenomenologically uninitiated–a notion which must, of course, be rejected as self-contradictory. Granting the validity of Hebb’s dictum according to which there is no such thing as a subjective science (1974, p. 73), it remains nevertheless that “objective” psychology was never objective in the sense of natural and undistorted, but featured a merely physicalistic objectivity. So also a psychology of lived experience is not necessarily “subjective” in the sense of biased, “ineffable,” and unreliable. Like any scientific endeavor, such a psychology would aim for rigor. Rigor, not necessarily in terms of mathematical precision but in the sense of adherence to and respect for the specificity of the phenomena as they present themselves.
As for the “how to” of this new approach, while it is clearly outside the compass of this paper, it must nevertheless be stated that it is only in the beginning phases of being worked out (Giorgi, 1970; Romanyshyn, 1978). But this does not detract from the validity of the criticisms here made. Abandonment of an approach is not dictated by the availability of a fully completed alternative but by the gravity of its deficiencies. As for the alternative approaches, for which Leona Tyler made so insistent a plea in her 1973 presidential address, they should be given not only “the chance to fail gracefully rather than always being dismissed as merely propaedeutic” (Giorgi, Note 2). They should be encouraged–more precisely: supported.
In conclusion, it is admittedly easy to gain insight from hindsight, and to speak with the support of a rising concert of voices. The point is, however, that there is a time to look back and take stock, and that the present is inescapably such a time. The further point is that while mistakes, even serious ones, do not detract from the merit of pioneers, this same indulgence cannot be extended to generations of followers who cannot claim the innocence of the neophyte. What began a hundred years ago as a most original–if epistemologically mistaken–pursuit has become fossilized into a routinized enterprise of “empty research” which blocks the emergence of fresh attempts to correct what is, conceptually, so undeniable a failure.
Sampling bias has been especially pronounced in psychology. Let’s just say that college students, rats, and mice are particularly well understood. I say this as having participated, from both sides, when I was in college.
So insofar as college students differ from the general population (say, age, or level of education, or…), the studies don’t apply beyond the study group. Prisoners and mental patients are other well-studied groups. Granted, this is so obvious that there must be attempts to correct for it; I’ve no idea how successful those might be.
None of us here expect economics to actually be science, but medical studies could be very important.