Yves here. It’s frustrating to see how often medical experts have to debunk fact-free assertions that Covid-19 is no big deal and health officials are overreacting. The more accurate view is that there are many people who have not seen much in the way of hospitalizations and deaths in their personal networks. That in turn may be due to luck, to people they know having reclusive tendencies, or employers that are letting them work at home, and/or taking good precautions when they do go out and about. As PlutoniumKun and others have pointed out, another factor is that the combination of more testing (leading to more infections being identified) and serious hospitalized cases being kept alive 1-2 weeks longer means the delay between case count increases and death increases has risen.
This Indiana data is sobering. It isn’t just the confirmation that Covid-19 is plenty lethal. It’s also that very few residents have been infected. I’m opposed to the “herd immunity” thesis, since there is yet no evidence supporting the notion that Covid-19 is different than all other major coronaviruses and confers lasting immunity. But even if it did, we are a very very long way from getting there.
By Nir Menachemi, Professor of Health Policy and Management, IUPUI. Originally published at The Conversation
Since day one of the coronavirus pandemic, the U.S. has not had enough tests. Faced with this shortage, medical professionals used what tests they had on people with the worst symptoms or whose occupations put them at high risk for infection. People who were less sick or asymptomatic did not get tested. Because of this, many infected people in the U.S. have not been tested, and much of the information public health officials have about the spread and deadliness of the virus does not provide a complete picture.
Short of testing every person in the U.S., the best way to get accurate data on who and how many people have been infected with the coronavirus is to test randomly.
I am a professor of health policy and management at Indiana University, and random testing is exactly what we did in my state. From April 25 to May 1, our team randomly selected and tested thousands of Indiana residents, no matter if they’d been sick or not. From this testing we were able to get some of the first truly representative data on coronavirus infection rates at a state level.
We found that 2.8% of the state’s population had been infected with SARS–CoV–2. We also found that minority communities – especially Hispanic communities – have been hit much harder by the virus. With this representative data, we were also able to calculate out just how deadly the virus really is.
The Process of Random Testing
The goal of our study was to learn how many Indiana residents, in total, were currently or had previously been infected by the coronavirus. To do this, the people our team tested needed to be an accurate representation of Indiana’s population as a whole and we needed to use two tests on every person.
With the help of the Indiana State Department of Health, numerous state agencies and community leaders, we set up 70 testing stations in cities and towns across Indiana. We then randomly selected people from a list created using state tax records and invited them to get tested, free of charge. Some groups showed up more readily than others and we adjusted the numbers to represent the demographics of the state accordingly.
Once a person showed up to our mobile testing sites, they were given both a PCR swab test that looks for current infections and an antibody blood test that looks for evidence of past infection.
By testing randomly and looking for both current and past infections, we could extrapolate our results to the entire state of Indiana and get information about real infection rates of this virus.
The research team also worked with civic leaders from vulnerable communities to conduct open, nonrandom testing as well to see how the results of these two testing approaches would differ.
How Widespread and How Deadly
We tested more than 4,600 Indiana residents as part of the first wave of testing in the study. This included more than 3,600 randomly selected people and more than 900 volunteers who participated in open testing.
During the last week in April, we estimate that 1.7% of the population had active viral infections. An additional 1.1% had antibodies, showing evidence of previous infection. In total, we estimate that 2.8% of the population currently were or had previously been infected with the coronavirus with 95% confidence that the actual infection rate is between 2% and 3.7%.
Because our random sample was designed to be representative of the population of the state, we can assume with almost certainty that the entire state numbers are the same. That would mean that approximately 188,000 Indiana residents had been infected by late April. At that point, the official confirmed cases – not including deaths – were about 17,000.
Focusing the tests on severe or high-risk people underestimated the true infection rate by a factor of 11.
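The arithmetic behind the "factor of 11" can be reproduced from the figures quoted above. This is only a minimal sketch: the variable names are illustrative, and the state population is backed out from the article's own numbers rather than taken from census data.

```python
# Figures quoted in the article (late April, Indiana).
prevalence = 0.028            # estimated share ever infected
estimated_infected = 188_000  # article's extrapolated count
confirmed_cases = 17_000      # official confirmed cases (approximate)

# Population implied by the two figures above (an inference, not a census number).
implied_population = estimated_infected / prevalence

# How many estimated infections there were per confirmed case.
undercount_factor = estimated_infected / confirmed_cases

print(f"Implied population: {implied_population:,.0f}")
print(f"Undercount factor: {undercount_factor:.1f}")
```

Because both inputs are rounded, the result is only approximate, but it lands close to the factor of 11 stated in the article.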
Having a reliable estimate of the true number of people who have been infected also allowed us to calculate the infection fatality rate – the percentage of people infected with SARS-CoV-2 who die. In Indiana, we calculated the rate to be 0.58%. For this calculation, we divided the number of COVID-19 deaths in Indiana – 1,099 at the time – by the total number of people that were determined to have been cumulatively infected at 2.8% of the population – 188,000.
Early estimates suggested that 5% to 6% of cases in the U.S. were fatal, which is similar to the 6.3% that you would get by dividing the deaths – 1,099 – by confirmed cases in Indiana – 17,000. The infection–fatality rate of 0.58% is thankfully far lower, but is nearly six times higher than the seasonal flu which has a death rate of 0.1%.
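The difference between the two rates comes down to the denominator. The sketch below recomputes both from the rounded figures in the text; with the rounded 17,000 case count the case-based rate comes out a little above the article's 6.3%, which presumably used an unrounded count.

```python
deaths = 1_099                # COVID-19 deaths in Indiana at the time
estimated_infected = 188_000  # from the random-sample extrapolation
confirmed_cases = 17_000      # official confirmed cases (rounded)
flu_death_rate = 0.001        # the 0.1% seasonal flu figure cited in the article

# Infection fatality rate: deaths over everyone estimated to have been infected.
ifr = deaths / estimated_infected

# Case-based rate: deaths over confirmed cases only.
cfr = deaths / confirmed_cases

print(f"IFR: {ifr:.2%}")
print(f"Case-based rate: {cfr:.1%}")
print(f"IFR relative to flu: {ifr / flu_death_rate:.1f}x")
```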
This random testing also allowed us to make accurate estimates about what percent of infected people are asymptomatic. In our study, about 44% of those who tested positive for active viral infection reported no symptoms. While this was already suspected by experts, our estimate is likely the most accurate to date.
Race, Job and Living Situation Matter
The general trends and information about the virus are incredibly important, but just as important are the ways in which human actions influenced which people are most affected.
We asked every person we tested about their race, ethnicity and whether they lived with someone who was previously diagnosed with COVID-19.
Our analysis of the random sample suggests that COVID-19 rates are much higher in minority communities, especially in Hispanic communities, where approximately 8% were currently or previously infected. While we do not definitively know why, it is possible that members of the Hispanic community in Indiana are more likely to be essential workers, live in extended family structures that include relatives beyond the nuclear family or both.
We further found that people who lived with a person who was COVID-19 positive were approximately 12 times more likely to have the virus themselves than people living in a home with no infections. Living with extended family and being more exposed due to one’s job may make it easier for the virus to spread within some communities.
These findings, along with the relatively low 2.8% prevalence, suggest that social distancing slowed the spread of the virus in the larger population. However, the hardest-hit communities were those who, on average, are not able to practice social distancing as consistently as others.
Now that we have this information and have established a baseline, we will continue periodically testing a random sample of people in the state. Doing so will tell us how far the virus has infiltrated our population so that policy decisions can be tailored to the situation.
This is the first statewide random sample study in the U.S. and the numbers offer both points of hope and concern.
The good news is that social distancing worked. Efforts to slow the virus contained it to only 2.8% of the population and by slowing the spread of the virus in the community, Indiana bought some time to determine the best way forward. This provides more time for researchers to both determine the degree to which infection results in immunity and to accelerate the development of a vaccine.
But there is bad news as well. If only 2.8% of the population have been infected with SARS-CoV-2, 97.2% of the population have not been infected and could still get the virus. The risk for a large outbreak that could dwarf the initial wave is still very real.
The demographic distribution of infections, while disturbing, offers important information that can help public health officials direct testing, education and contact tracing resources that are language and culturally sensitive. The research team and the state health department are working with leaders from these communities to figure out how to best contain the spread of the virus in the areas most affected.
As businesses slowly reopen, we need to be vigilant with any and all safety precautions so that we do not lose the ground we gained by hunkering down. Hopefully numbers will go down, but regardless of what happens in the future, we now better know the foe we fight.
Since it usually takes several weeks to die from covid, doesn’t it seem that they have underestimated the infection fatality rate by using the state death count from the week of the testing?
Correct, and good catch, since Covid-19 cases have been rising, but not as dramatically as in the South.
Given another month of data, we can still put an upper bound on the CFR by using the current death count of 2846. That gives a CFR of 1.5%, still far lower than the 5% initially thought.
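A quick check of that bound, as a sketch. It divides the later death count by the infection estimate from the late-April survey; the comment calls the result a CFR, although the denominator here is estimated infections rather than confirmed cases. It reads as an upper bound on the assumption that some of the later deaths were people infected after the survey, who are missing from the denominator.

```python
later_deaths = 2_846          # death count quoted in the comment
estimated_infected = 188_000  # late-April estimate from the study

# Later deaths over the earlier infection estimate.
upper_bound = later_deaths / estimated_infected

print(f"Upper bound: {upper_bound:.1%}")
```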
Clearly, in different places one is going to get different rates for everything. In terms of people dying, the year-over-year increase in excess deaths must be looked at and examined as to what these deaths are about as well.
Good point. We will have to wait a year or two to truly measure it.
I am not sure anyone believed the 5%+ from the official case count numbers. I heard a lot of 1-2% guesses from knowledgeable people. It is a less bad thing that the Indiana study suggests the lower end of that range or <1% is more likely. (It is still a lot of death and not a good thing)
China had a CFR of 3.4% on >150,000 cases. But it defined a case as having symptoms + a positive test. I think Italy also had a high CFR due in part to having an aged population + the hospitals being so overloaded that a lot of people didn’t get adequate treatment.
I have read that a lot of Chinese men are heavy smokers. Could many of China’s COVID casualties have made themselves more death-liable by pre-injuring their lungs from smoking?
Also, has anyone studied to see if the parts of China with the highest death rates among people infected with New Corona are the parts with highest levels of persistent air pollution?
The mortality rate is not constant. Depends on several factors and will not be the same in summer and in winter, it will not be the same if masks are widely used and if hospital protocols improve. It won’t be the same if the distribution in age cohorts changes.
And very likely it will not be the same if the number of severe cases exceed medical capacities like ICU beds, ventilators, drugs, and specialized staff.
I’m sure you already know this, Ignacio, but some of the people here may not…
fwd https://covid19-projections.com/us-in Interesting that Dr Menachemi’s May1 Total Infected of 2.7%, based on antibody testing, is similar to Mr Gu’s Estimated Total Infected of 3.7% for Indiana (confidence interval 2.2%-5.3%). Gu’s current Jul22 Estimated Total Infected is now at 7.4%.
If I recall correctly, Dr Chris Martenson (PhD Pathologist), a weekly COVID podcaster (Peak Prosperity on youtube), reported on a research paper (he reads/reviews research papers for part of his podcast & explains them for a layman non-biomedical pro audience) that tested some previously confirmed COVID patients for antibodies 3 months later, & a minority had lost antibodies, at least at a strength level that could be registered by the antibody test. If that is correct, it implies the actual Total Infected May1 IN number could be higher than Menachemi’s 2.7%.
I am a layman & just brainstorming, but I’d like to see Menachemi revisit the subset of his patient group who were amongst the 2.7% who had antibodies on May1, every 6 months for the next 3 years.
1 Readminister the antibody test, & tabulate the number who still have antibodies
2 Test (perhaps including lung MRI scan) or at least verbally interview patient for long-term health conditions
3 Report descriptive statistics by gender/decade-age cohort (e.g. 40s as of May2020 female, etc), or gender/decade-age/race cohort the probability of long-term issues. There are anecdotes about people without preexisting conditions, such as the 32-year old female avid runner who “recovered” but now is incredibly tired after standing taking a shower for 3 minutes, or the 26-year old male who cannot smell & thus finds it hard to eat. It would be helpful to see the probability by age cohort of these impacts, & follow up every 6 months to see if they persist as lifelong impacts or are temporary/reversible.
btw, to those in the NC community that ARE biomedical pros like Ignacio, do you recommend any COVID experts’ podcast/editorials/etc? Do you have any take on Pathologist Dr Chris Martenson from Peak Prosperity, epidemiologist Dr Michael Osterholm from U of Minnesota CIDRAP podcast, data scientist Youyang Gu from covid19-projections, & biologist mikethemadbiologist editorial/blogger?
I perceive these 4 to be earnest skilled experts & peruse their podcast/site content occasionally – especially Gu’s projections, but that is my mere guesstimation.
The best reporting is here @NC.
I like Dr Seheult at Medcram. Explains the latest papers, and presents hypotheses for why the virus behaves as it does. A lot of it is at the Dr technical level but he is very good at explanations for the layman.
His latest one explains why not very sensitive but quick and inexpensive tests might be a really good thing.
“Because our random sample was designed to be representative of the population of the state, we can assume with almost certainty that the entire state numbers are the same.”
I don’t think we can assume anything. The onus is on them to show that their sample derived from tax returns is representative. I downloaded the pdf from the CDC webpage on the study and it only said “among 15495 randomly selected persons, 3658 participated (23.6%), 3629 (99.2%) of whom had at least one test result.” How were the participants of this random pool selected?
You select people randomly, typically using a computer algorithm with a random number generator. You select the participants so that the age, sex, etc. composition of the group reflects the census data. So for example computer algorithm will be tasked with selecting X number of people from each of a dozen demographic categories. Then you call them up and ask them to participate.
If the group that agrees to participate is imbalanced, you do statistical weighting to correct, for example, if there are too many old people, you weight their infection rates less than those of young people. 3600 people is a large enough group to get representative samples from different demographic categories. When the full paper gets published they will include breakdown of different demographic categories and statistical weights assigned to each.
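The weighting step described above can be sketched as a simple post-stratification. Everything in this example is hypothetical: the strata, population shares and test counts are made up for illustration and are not the study's data, and the study's actual weighting scheme is not described here.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    population_share: float  # share of the state population (e.g. from census data)
    tested: int              # participants in this group
    positive: int            # participants who tested positive


def raw_prevalence(strata: list[Stratum]) -> float:
    """Unweighted rate: treats the participants as if they mirrored the population."""
    return sum(s.positive for s in strata) / sum(s.tested for s in strata)


def weighted_prevalence(strata: list[Stratum]) -> float:
    """Each group's rate counts in proportion to its share of the population."""
    total_share = sum(s.population_share for s in strata)
    return sum(
        (s.population_share / total_share) * (s.positive / s.tested) for s in strata
    )


# Hypothetical sample in which older people are over-represented.
strata = [
    Stratum("under 40", 0.50, 1000, 30),
    Stratum("40 to 64", 0.30, 1500, 30),
    Stratum("65 and over", 0.20, 1500, 15),
]

raw = raw_prevalence(strata)
weighted = weighted_prevalence(strata)
print(f"Raw: {raw:.2%}, weighted: {weighted:.2%}")
```

In this made-up sample the under-40 group has the highest infection rate but is under-represented among participants, so the weighted estimate is higher than the raw one.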
More importantly, they would have benefited greatly from a larger sample size.
In random sampling, a sample of 3658 is large enough. If the sample was indeed taken randomly, and if it is over about 1,000 subjects, the sample is representative. This has been shown in statistics repeatedly over time–like the last 100 years.
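The sample-size point can be illustrated with the textbook margin of error for a proportion. This sketch assumes a simple random sample and a normal approximation; it says nothing about nonresponse or weighting, which may be part of why the study's reported interval (2% to 3.7%) is wider than this formula alone would give.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Worst case (p = 0.5) for the two sample sizes mentioned in the comment.
moe_1000 = margin_of_error(1000)
moe_3658 = margin_of_error(3658)

# At a prevalence near the study's 2.8% estimate.
moe_low_prevalence = margin_of_error(3658, p=0.028)

print(f"n=1000: +/-{moe_1000:.1%}")
print(f"n=3658: +/-{moe_3658:.1%}")
print(f"n=3658 at p=2.8%: +/-{moe_low_prevalence:.2%}")
```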
One of the best things about the article by Prof. Menachemi is the exacting explanation of the method.
Unless you are a statistician, you’re quibbling.
Excellent points. Would like to point out some issues with the analysis though.
It is important to note that this analysis is only representative of the sample that was collected and not necessarily of the population in Indiana. (As already noted). Why is the survey flawed?
1. The survey uses a random sample.
2. The researchers are stating the sample is representative of the population.
Both assumptions are incorrect if you look into the statistical literature.
It is well known in statistics that random sampling is biased for population size estimates (see the literature on capture recapture or multiple systems estimation). See Chapman 1951 for how to correct early work of Petersen, 1896 and Lincoln, 1930. See Bird and King 2018 for a review of multiple systems estimation.
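For readers unfamiliar with the estimators named above, here is a minimal sketch of the classic two-sample capture-recapture formulas. The counts are hypothetical. These formulas estimate the total size of a population from two overlapping samples; how they would apply to the Indiana design is not spelled out in the comment.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Lincoln-Petersen estimate: n1 marked in the first sample, n2 caught in the second, m recaptured."""
    if m == 0:
        raise ValueError("No recaptures: the estimate is undefined.")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected version of the Lincoln-Petersen estimate."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts for illustration only.
lp_estimate = lincoln_petersen(200, 150, 30)
chapman_estimate = chapman(200, 150, 30)
print(f"Lincoln-Petersen: {lp_estimate:.0f}, Chapman: {chapman_estimate:.0f}")
```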
In order to refine this, one would need to do adaptive sampling. In addition, one would need to correct for anyone that was not captured in the sample that would be in the population. This is very difficult to do in practice as this requires a bias correction regarding the observed data.
Finally, we should not be quibbling over the sample size but rather on the issues with the study itself as the conclusions are flawed (at least from someone that is a statistician).
The Lake County Council learned Tuesday two of the department’s five nurses recently resigned, and two more are off work, leaving just a single nurse on-duty to field more than 200 calls and emails a day relating to COVID-19.
Lake County (pop. 485,000) includes the cities of Hammond and Gary, Indiana.
(Sorry about the all-bold/italics, I screwed up the formatting)
I did a seven-state driving tour in June, and the suburbs of Indianapolis were my first stop. The difference in activity between there and Michigan was frightening. Of the seven states I visited, Indiana had the fewest people in masks, the most that wanted to shake my hand, etc.
Northern Indiana has been flooded by out-of-staters looking to take a break from covid restrictions, to the dismay of many locals, especially along the beaches of Lake Michigan and surrounding areas. Lake County has finally imposed mask requirements, while in neighboring Porter County masking is “advised” even though several (higher-end) restaurants have recently closed for “deep cleaning” following exposures.
Like elsewhere, most of the politicians in Indiana are too busy making the buck and/or passing the buck to risk taking an actual stand to protect and promote public health, the health of their constituents.
The political misleadership in this country is now responsible for 50-times the US deaths via Covid-19 vs. the 9/11 terror attacks on US soil.
Perhaps it’s another example of “Osama’s revenge”. US preoccupation with and vast “investment” in defeating “global terrorism” may have helped to hollow out domestic infrastructure and capabilities of all kinds.
feels closer to a fifty state version of the 9-11 response.
after 9-11 the strong federal response(including rhetoric) was relatively easy to rally against. all eyes were on lil george and dick and don.
this, the feds are absent/incompetent/chasing butterflies.
and the chaos…especially the information chaos…is more extreme.
makes it even harder than back then to differentiate LIHOP from stupidity/smoking their own stash.
I also must admit(to myself) to being more cynical now than i ever have been…and that’s really saying something. This could bias my assessment.
i’m about burned out on bad news,lol.
Thanks for this. I’d heard through local news that IU was doing this, but I’d yet to see any follow up on it.
Perhaps it’s just coincidence, but the “proxy mortality rate” died/(died + recovered) for “China ex Hubei province” converged to a number close to that reported here, about 0.8%, using the Johns Hopkins CSSE data. If these numbers are not cooked (and many think they are cooked), it suggests that China did a better job of detecting low symptom cases than US has done. Or perhaps they are missing the low symptom cases but have a lower mortality rate than US among the symptomatic cases. Either way, not a great look for US.
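The "proxy mortality rate" in this comment is a ratio over resolved cases only. A minimal sketch, with hypothetical counts chosen to give the 0.8% mentioned (these are not the Johns Hopkins figures):

```python
def proxy_mortality_rate(died: int, recovered: int) -> float:
    """Deaths as a share of resolved cases (died + recovered); active cases are ignored."""
    resolved = died + recovered
    if resolved == 0:
        raise ValueError("No resolved cases yet.")
    return died / resolved


# Hypothetical counts for illustration only.
example_rate = proxy_mortality_rate(died=8, recovered=992)
print(f"Proxy mortality rate: {example_rate:.1%}")
```

Because still-active cases are excluded, this measure tends to settle only once most cases in the data have resolved.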
For comparison, the province of British Columbia is the best-case scenario in Canada (if not North America) and is a large province with major urban areas, and they published some statistics today:
” The latest modelling data also shed new light on the devastating impact COVID-19 has had in the province’s care homes. Outbreaks at those facilities resulted in a case fatality rate of roughly 20 per cent, meaning one in every five people infected succumbed to the virus.
“This is one of the reasons we have put so much time and energy trying to protect our long-term care homes and our elders and seniors in these situations,” Henry said.
The fatality rate was even higher – at 22.4 per cent – in infections associated with hospital outbreaks in the province.
By comparison, the province’s overall case fatality rate was 6.1 per cent, which is actually slightly lower than the national average of about eight per cent.
The 1,028 cases associated with outbreaks, including outbreaks in the community that spread through social events and workplaces, had an overall fatality rate of 12.9 per cent.
Officials said cases that weren’t associated with an outbreak were significantly less lethal, accounting for 50 deaths out of 1,950 infections.
“So the case fatality rate is about 2.6 per cent – still much higher than we see with influenza, for example, but much lower than certain scenarios,” Henry said.”
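The British Columbia figures quoted above can be cross-checked against each other. This sketch assumes the outbreak and non-outbreak groups together account for all cases in the province, which the excerpt does not state; outbreak deaths are backed out from the rounded 12.9% rate, so they are approximate.

```python
outbreak_cases = 1_028
outbreak_cfr = 0.129        # quoted rate for outbreak-associated cases
non_outbreak_cases = 1_950
non_outbreak_deaths = 50

# Rate for cases not tied to an outbreak.
non_outbreak_cfr = non_outbreak_deaths / non_outbreak_cases

# Approximate outbreak deaths implied by the rounded rate.
outbreak_deaths = outbreak_cases * outbreak_cfr

# Combined rate, assuming the two groups cover all cases.
overall_cfr = (outbreak_deaths + non_outbreak_deaths) / (
    outbreak_cases + non_outbreak_cases
)

print(f"Non-outbreak: {non_outbreak_cfr:.1%}")
print(f"Overall: {overall_cfr:.1%}")
```

Under that assumption the combined figure lines up with the 6.1 per cent overall rate quoted in the excerpt.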
Yves Smith: Yes, thanks for this article. The good professor takes pains to describe the sampling techniques and how the researchers drew their conclusions. Just the explanation of how one does a statistical example, let alone the insight into Covid infection.
As the writer Alessandro Carrera recently noted, all science is wrong. He says that to his students to shock them. The study is a snapshot, as Ignacio and others note above. The infection rate and the numbers will change. Yet the professor and the other researchers show that Covid is dangerous and insidious–we all suspected that, but the study indicates that greater care with public-health measures is required. To understand what Carrera means–this study will soon be superseded by others and will form the basis of others, even as the data go obsolete.
In spite of the disturbing findings, it is good to have this information. I also agree with Yves Smith’s assessment that “herd immunity” is a fantasy–that I am seeing among liberals and among the Trumpish among us. It is the revenge of anti-vaxxers and others who think that public health is something for other people to attend to.
I have to apologize if I missed this in other comment threads (didn’t see any discussion on the 7/18 daily links that included an FT article on the subject), but I would like to hear what Ignacio and others think about the role T-cells might play in immunity. Folkhälsomyndigheten has updated their guidance to say that if you had a positive PCR test in the past 6 months and recovered, “the risk of reinfection is low” even if you don’t have measurable antibodies.
Anecdotally, a friend of mine in her mid 30s was sick with a lot of the symptoms a few months ago, including a total loss of taste and smell, but never received a PCR test while sick and an antibody test came back negative. Recent randomized antibody testing in Stockholm had something like a 15% positive rate but I get the impression that the public health authority is confident that a higher percentage have been sick than that.
For the benefit of us bog peasants here – that antibody blood test that they did on all those people. It has been established now that a lot of people quickly lose the antibodies that they had to fight off their bout with Coronavirus. Would that test still pick up on infected people if those antibodies were already gone? If not, then that would skew the numbers and it would be a larger percentage of people that had been infected.
Antibodies disappear from the blood in about 6 months after exposure. But T and B cells retain the immune response and generate antibodies upon re-exposure.
The oft repeated 0.1% CFR for flu is not based on observation, but modeled, and hugely overstated. The actual death toll from influenza is much lower.
There has been past criticism that CDC flu death estimates were inflated (see for example April SciAmer article by Jeremy Faust). I’m not qualified to judge this and don’t want to spread bad info, but am wondering if the 0.1% flu death comparator has been adjusted downward since February, when the comparison to flu gave me and I think many people a false sense of complacency.
A small point and then a bigger one.
Small: The writeup doesn’t discuss false negatives and positives of the two tests — estimates of these rates and their translation into confidence intervals. I assume that’s in the full paper.
Big: Thus far discussion and analysis of outcomes have included only these categories: infected or not, asymptomatic or not, hospitalized or not, treated in ICU or not, and fatal or not. There is one more outcome that needs to be monitored, however, the extent to which the infected person exhibits potentially permanent tissue damage. This will be a big concern going forward, since it appears to be widespread and affects future health risks. Of course, it is also the most difficult to gather data on because it requires a clinical assessment, and damage to lungs, kidneys etc. can emerge and/or dissipate over time. Still, not addressing this dimension, even if to explain why it isn’t being included in a study, leaves out a significant aspect of this pandemic.
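On the small point above about false negatives and positives: one standard way to adjust an observed positive rate for an imperfect test is the Rogan-Gladen estimator. The sketch below uses hypothetical sensitivity and specificity values; it is not the study's method, and the tests' actual error rates are not given in the writeup.

```python
def rogan_gladen(apparent: float, sensitivity: float, specificity: float) -> float:
    """Adjust an observed positive rate for test error (Rogan-Gladen estimator)."""
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("Test is no better than chance; cannot adjust.")
    adjusted = (apparent + specificity - 1) / denominator
    # Keep the estimate within the valid range for a proportion.
    return min(1.0, max(0.0, adjusted))


# Hypothetical values: 2% observed positives, 90% sensitivity, 99.5% specificity.
adjusted_rate = rogan_gladen(apparent=0.02, sensitivity=0.90, specificity=0.995)
print(f"Adjusted prevalence: {adjusted_rate:.2%}")
```

At low prevalence the adjustment is sensitive to specificity, since even a small false positive rate can account for a noticeable share of the observed positives.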
probably also want to consider, how many times do those tested show negative once, but later show as positive, and of course vice versa? and how many who are positive end up in ICU, then recover, but later are positive, and either end up back in the ICU, or only have the ‘ mild ‘ case (its not fatal….but they have one or more permanent conditions…lung,heart, and other damage). (course some of these may be impossible in many cases, because of how health is handled in a state, or county). and it gets worse.