Some Pointers on How to Catch the Dubious Use of Statistics

A long-standing pet peeve of mine is how the use of figures has been fetishized in political discourse and in our society generally, to the point where many people are too easily swayed by arguments that invoke data (I discussed this phenomenon at length in the business context in a 2006 article for the Conference Board Review, Management’s Great Addiction). And now that what used to be called statistical analysis has been given mystical powers by calling it “Big Data,” the layperson is going to be subjected to even more torturing of information by people who have incentives to find stories in it, whether those stories are valid or not.

Some basic questions are helpful to keep in the back of your mind when looking at studies: Was the sample size large enough? Is it representative? And one always needs to remember that correlation is not causation.
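To make the sample-size point concrete, here is a minimal Python sketch (my illustration, not from any of the linked material; the correlation threshold and sample sizes are arbitrary) of how easily small samples produce impressive-looking correlations out of pure noise:

    import random
    import statistics

    def pearson(xs, ys):
        # Plain Pearson correlation coefficient.
        mx, my = statistics.mean(xs), statistics.mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    random.seed(1)
    trials = 1000
    for n in (10, 1000):
        # Two completely unrelated variables, drawn afresh for each trial.
        strong = sum(
            abs(pearson([random.gauss(0, 1) for _ in range(n)],
                        [random.gauss(0, 1) for _ in range(n)])) > 0.5
            for _ in range(trials)
        )
        print(f"n = {n}: |r| > 0.5 in {strong / trials:.1%} of trials")
    # Small samples throw up "strong" correlations from noise fairly often;
    # large samples almost never do. And even a real correlation says nothing
    # about which way causation runs.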

This short video (hat tip Lars Syll) provides some additional insight into the sound use of statistics, which should help in thinking critically about research findings. The video dates from 2013 but has lively examples.

If you are interested in more on this topic, I strongly urge you to read the classic and widely-cited 2005 paper, Why Most Published Research Findings Are False. Note that this paper’s warnings apply most strongly to research where the investigator creates his own data set for study, such as medical research. Here is the abstract:

There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.
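The arithmetic behind the paper’s argument is simple enough to sketch. Below is my own rough restatement of its framework in Python (the parameter values are purely illustrative): the chance that a “significant” finding is actually true depends on the pre-study odds of a real relationship, the study’s power, and bias.

    def ppv(R, power=0.8, alpha=0.05, bias=0.0):
        # Post-study probability that a claimed finding is actually true.
        # R     : pre-study odds that a probed relationship is real
        # power : chance of detecting a real effect (1 - beta)
        # alpha : significance threshold, i.e. the false positive rate
        # bias  : fraction of would-be "non-findings" reported as findings anyway
        true_positives = power * R + bias * (1 - power) * R
        false_positives = alpha + bias * (1 - alpha)
        return true_positives / (true_positives + false_positives)

    # Well-powered study of a plausible (1:1 odds) hypothesis, no bias:
    print(ppv(R=1.0, power=0.8, alpha=0.05))              # roughly 0.94
    # Long-shot hypotheses, modest power, a little bias -- typical of exploratory work:
    print(ppv(R=0.1, power=0.4, alpha=0.05, bias=0.1))    # roughly 0.24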

As we know, economics takes a different approach, in that the discipline relies on mathed-up statements of its findings, which serves to exclude non-economists from being taken seriously as critics of their work (and as skeptics like Deirdre McCloskey have pointed out, the parts written up in mathematical form are often the trivial parts of the argument). And economists generally prefer abstract models with Panglossian assumptions embedded in them (that economies have a propensity toward an equilibrium state at full employment) over empirical work.


12 comments

  1. Clive

    Another red flag is where, perhaps for perfectly laudable reasons (then again, perhaps not…), a complex dataset is rendered graphically to help “understandability”. Either by accident or by intentional sleight of hand, errors can often be introduced in an attempt to simplify the underlying data.

    Take this example (the coloured map of the world showing murder rates by country, expanding or shrinking each country’s landmass to show where its rate is disproportionately higher or lower than the average; it is the last picture in the article), which is an attempt to convey visually what is quite a lot of data. When I saw it, I knew there was something not quite right about it, because I knew that Japan had an infinitesimally low murder rate but was shown on the “map” as being at best average and at worst seemingly above average. Yet Australia, which has a pretty average (maybe slightly below average) murder rate, is rendered almost invisible on the map, implying a significantly lower murder rate. Something was wrong, and it took me a while to figure it out.

    What the map didn’t do was account for population density. Australia has an exceptionally low population density considering its huge landmass whereas Japan has an exceptionally high population density. Whoever generated this map didn’t take that into account.

    Now, presenting data visually isn’t a bad thing in and of itself, just so long as it is done correctly. But this isn’t the only example I’ve come across where visual presentations of data end up being completely wrong. Whenever I see a graph and don’t get the raw data, I start to get wary, especially if the graphical representation is being used by the writer to hard-sell a particular point.
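    A toy calculation (hypothetical round figures, added only for illustration) makes the population point: a map scaled by raw murder counts, or by landmass, tells a very different story from one scaled by murders per 100,000 residents.

        # Made-up but roughly realistic round numbers, for illustration only.
        countries = {
            # name: (population, murders per year)
            "Japan":     (126_000_000, 300),
            "Australia": (25_000_000, 230),
        }
        for name, (pop, murders) in countries.items():
            rate = murders / pop * 100_000
            print(f"{name}: {murders} murders in total, {rate:.2f} per 100,000 people")
        # The two raw counts are comparable, yet Japan's per-capita rate works out
        # to roughly a quarter of Australia's. A visualisation that ignores
        # population will misrepresent one country or the other.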

  2. IsabelPS

    I remember a small book on aspirin, the wonder drug. I was quite surprised to read that, for its latest, widespread use for coronary protection, they had to run 10 studies before they finally could prove what, theoretically, made sense but reality refused to confirm…
    And, still on medical studies, there is a fascinating book about the way Linus Pauling never managed to have his theory on high doses of vitamin C to treat cancer properly tested, in spite of 2 studies conducted, I think, by the Mayo Clinic, but never exactly as he recommended. Unfortunately, I have lost that book (which is more about how science works than about the case proper, which, BTW, is very well documented because there are 10 years of letters exchanged between Pauling and a Scottish surgeon). In my quick search just now I can’t find the title, which is strange, but I will try more seriously.

  3. Steve H.

    Corrupt Techniques in Evidence Presentations: Tufte

    //www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0002bA

  4. Mary Wehrheim

    Another problem is the actual suppression of the gathering of data; for example, NRA-minded lobbyists “encouraging” congressmen to defund compilation of data by the CDC on gun deaths, or the paucity of information out there on the details regarding police shootings, etc. Then there is that whole cottage industry of conservative think tanks. If the study is in any way linked to the Heritage Foundation or another think tank whose web site has “freedom,” “liberty,” or “market solutions” anywhere in its mission statement… here there be dragons.

    1. Ivy

      The defunding issue is a national disgrace. I support gun ownership, but not fact obfuscation. More disinfecting sunlight is needed.

  5. Min

    A good talk, but let me reiterate the point I made at Syll’s site. Blitzstein shows how “regression towards the mean” is necessary for stability, but does not point out that it does not mean that the population as a whole approaches the mean, which is what the term suggests. That was Galton’s original interpretation; he thought that he had discovered something about evolution. But just as tall fathers tend to have sons who are tall, but not as tall, so tall sons tend to have fathers who are tall, but not as tall. Similarly, a student whose second test score is good is likely to have a first test score that is not as good.
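    A quick simulation of the test-score example (my own sketch, with arbitrary parameters) shows both halves of the point: the top scorers on test 1 come back toward the average on test 2, yet the population as a whole stays just as spread out as before.

        import random
        import statistics

        random.seed(0)
        ability = [random.gauss(0, 1) for _ in range(10_000)]
        test1 = [a + random.gauss(0, 1) for a in ability]   # skill plus luck
        test2 = [a + random.gauss(0, 1) for a in ability]   # same skill, fresh luck

        # The 500 students who did best on the first test...
        top = sorted(range(len(test1)), key=lambda i: test1[i], reverse=True)[:500]
        print("top 500, mean on test 1:", statistics.mean(test1[i] for i in top))
        print("same students, test 2:  ", statistics.mean(test2[i] for i in top))
        # ...score closer to zero (the population mean) on the second test,
        # but the overall spread does not shrink:
        print("stdev of test 1:", statistics.stdev(test1))
        print("stdev of test 2:", statistics.stdev(test2))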

    1. MaroonBulldog

      “”regression toward the mean” is necessary for stability.’= loose concept of what “necessary” means. Stability is an inference we draw from something we observe. We need “regression toward the mean” in order to construct an explanation of why we observe the class of phenomena we classify as stability. We need it because it is the best, simplest mathematical idea we have come across the helps us understand why we observe what we observe.

      “regression to the mean” is not a cause; it is not a fact; it is an inference from an observation. It provides us a theory of how the world works, but it does not tell us why the world works as we observe it. Since “everything is conditional”–I love Bayes’s theorem–so is the necessity of “regression to the mean” conditional–on the necessity of stability: It is necessary “for stability,” but why is stability itself necessary? Because we are in the habit of observing the world as stable, as Hume seems to have thought? –I love epistemology as much as I love Bayes’s theorem. In fact, of all the things that I don’t understand, I love these two the best.

  6. LAS

    Perhaps most important is recognizing the difference between experimental methods and observational studies. Was there a well-run randomized trial? Was there any randomization at all? Most economic data is not randomized, and that’s the source of its limitations right there (see the sketch at the end of this comment).

    The strength of a paper is in the methods used, quality of the argumentation and sources. Data modeling should be done for minor adjustment of findings. If you think data modeling is the heart and soul of your thesis, you’re probably way removed from real science.
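    To illustrate the randomization point (a toy simulation with made-up numbers, not real economic data): when a hidden factor drives both who gets “treated” and the outcome, a naive observational comparison finds an effect that randomization correctly shows is not there.

        import random
        import statistics

        random.seed(42)
        N = 20_000
        confounder = [random.gauss(0, 1) for _ in range(N)]
        # The treatment itself does nothing; only the confounder moves the outcome.
        outcome = [c + random.gauss(0, 1) for c in confounder]

        # Observational study: people with a high confounder value self-select into treatment.
        treated_obs = [c > 0 for c in confounder]
        # Randomized trial: a coin flip decides, breaking the link to the confounder.
        treated_rct = [random.random() < 0.5 for _ in range(N)]

        def effect(treated):
            yes = statistics.mean(o for o, t in zip(outcome, treated) if t)
            no = statistics.mean(o for o, t in zip(outcome, treated) if not t)
            return yes - no

        print(f"observational 'effect': {effect(treated_obs):.2f}")   # large but spurious
        print(f"randomized 'effect':    {effect(treated_rct):.2f}")   # near zero, the truth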

  7. downunderer

    What I recall most vividly from my years of summarizing research reports for publication in a periodical of a handmaiden of Big Pharma is the dichotomous thinking enforced by regulation, which required that the difference between a test group and controls be “significant”. Meaning that the odds of the difference arising by chance were smaller than 1/20, or 19:1 against. Meaning also that something like several percent of approved drugs are just lucky placebos that never get well tested again.

    This led to many a report of differences that only “approached significance” even after all the invisible data tweaking was done.

    But the worst feature was that nobody I recall ever calculated the significance of the significance, or the significance of the adverse effects.

    So many a drug has been lauded and launched because it demonstrably helped, say, 3 or 4 patients out of a hundred who received it, and nobody in the whole process looked too hard at how much harm (or at least avoidable discomfort and expense) was suffered by those not helped. The profit for the company comes from advertising-related sales, much more than from results.
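    The arithmetic of that last point is worth spelling out (hypothetical trial numbers, chosen only to illustrate): a drug can post an impressive relative improvement while helping only a few recipients in absolute terms.

        # Hypothetical two-arm trial, 100 patients per arm.
        treated_n, treated_helped = 100, 14
        placebo_n, placebo_helped = 100, 10

        arr = treated_helped / treated_n - placebo_helped / placebo_n   # absolute benefit
        rrr = arr / (placebo_helped / placebo_n)                        # relative benefit
        nnt = 1 / arr                                                   # number needed to treat

        print(f"absolute benefit: {arr:.0%}")          # 4 extra patients helped per 100 treated
        print(f"relative benefit: {rrr:.0%}")          # 40% -- the figure the marketing quotes
        print(f"number needed to treat: {nnt:.0f}")    # 25 treated for each one helped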
