By Cathy O’Neil, a data scientist who lives in New York City and writes at mathbabe.org
Yesterday I caught a lecture at Columbia given by statistics professor David Madigan, who explained to us the story of Vioxx and Merck. It’s fascinating and I was lucky to get permission to retell it here.
Madigan has been a paid consultant to work on litigation against Merck. He doesn’t consider Merck to be an evil company by any means, and says it does lots of good by producing medicines for people. According to him, the following Vioxx story is “a line of work where they went astray”.
Yet Madigan’s own data strongly suggests that Merck was well aware of the fatalities resulting from Vioxx, a blockbuster drug that earned the company $2.4b in 2003, the year before Merck “voluntarily” pulled it from the market in September 2004. What you will read below shows that the company set up standard data protection and analysis plans which it later either revoked or didn’t follow through with, gave the FDA misleading statistics to trick it into thinking the drug was safe, and set up a biased filter on an Alzheimer’s patient study to make the results look better. Merck hoodwinked the FDA and the New England Journal of Medicine and took advantage of the public trust, which ultimately caused the deaths of thousands of people.
The data for this talk came from published papers, internal Merck documents that he saw through the litigation process, FDA documents, and SAS files with primary data coming from Merck’s clinical trials. Unfortunately, not all of the numbers I state below can be corroborated, because not all of this data is publicly available. This is particularly outrageous considering how much this data matters to the public.
The process for getting a drug approved is lengthy: it requires three phases of clinical trials before FDA approval and often takes well over a decade. Before the FDA approved Vioxx, fewer than 20,000 people had tried the drug, versus 20,000,000 people after it was approved. So it’s natural that rare side effects are harder to see beforehand. It should also be kept in mind that clinical trials enroll only people who are healthy apart from the one disease the drug is meant to treat, and those people take only that one drug, in carefully monitored doses. Compare this to after the drug is on the market, when people could be unhealthy in various ways and could be taking other drugs or too much of this drug.
Vioxx was supposed to be a new “NSAID” drug without the bad side effects. NSAIDs are painkillers like Aleve, ibuprofen, and aspirin, which have the unfortunate side effect of gastrointestinal (GI) problems, though only among a subset of long-term users, such as people with advanced arthritis who take painkillers daily to treat chronic pain. The goal was to find a painkiller without the GI side effects. The underlying scientific goal was to find a COX-2 inhibitor without the COX-1 inhibition, since scientists had realized in 1991 that COX-2 suppression corresponded to pain relief whereas COX-1 suppression corresponded to GI problems.
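To make the sample-size point concrete, here is a minimal sketch in Python. Only the 20,000 and 20,000,000 exposure counts come from the text above; the excess event rate of 1 in 10,000 and the 2,000-person trial size are hypothetical numbers chosen purely for illustration.

```python
def expected_events(rate, n_people):
    """Expected number of people experiencing the side effect."""
    return rate * n_people


def prob_at_least_one(rate, n_people):
    """Chance that at least one of n_people has the event, assuming
    people are affected independently and at the same rate."""
    return 1.0 - (1.0 - rate) ** n_people


# ASSUMPTION (not from the talk): 1 excess event per 10,000 users.
HYPOTHETICAL_RATE = 1e-4

groups = [
    ("single trial (assumed size)", 2_000),   # hypothetical trial size
    ("all pre-approval users", 20_000),       # from the text
    ("post-approval users", 20_000_000),      # from the text
]

for label, n in groups:
    print(f"{label:28s} n={n:>10,d}  "
          f"expected events={expected_events(HYPOTHETICAL_RATE, n):8.1f}  "
          f"P(at least one)={prob_at_least_one(HYPOTHETICAL_RATE, n):.3f}")
```

Under that assumed rate, the whole pre-approval population would produce only a couple of extra events, which are easy to lose against background illness, while the post-approval population would produce thousands.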
Vioxx Introduced and Withdrawn From the Market
The timeline for Vioxx’s introduction to the market was accelerated: they started work in 1991 and got approval in 1999. They pulled Vioxx from the market in 2004 in the “best interest of the patient”. It turned out that it caused heart attacks and strokes. The stock price of Merck plummeted and $30 billion of its market cap was lost. There was also an avalanche of lawsuits, one of the largest resulting in a $5 billion settlement which was essentially a victory for Merck, considering they made a profit of $10 billion on the drug while it was being sold.
The story Merck will tell you is that they “voluntarily withdrew” the drug on September 30, 2004. In a placebo-controlled study of colon polyps in 2004, it was revealed that over a time period of 1200 days, 4% of the Vioxx users suffered a “cardiac, vascular, or thoracic event” (CVT event), which basically means something like a heart attack or stroke, whereas only 2% of the placebo group suffered such an event. In a group of about 2400 people, this was statistically significant, and Merck had no choice but to pull their drug from the market.
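As a rough check on why a 4% versus 2% split in about 2400 people counts as statistically significant, here is a sketch of a two-proportion z-test in Python. The even split into two arms of 1200 is my assumption, since the arm sizes are not given above, and comparing crude proportions ignores the time-to-event structure that a real trial analysis would use.

```python
from math import erfc, sqrt


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Return (z, two-sided p-value) for the difference of two proportions,
    using the pooled-variance normal approximation."""
    rate_a, rate_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    p_value = erfc(abs(z) / sqrt(2))  # same as 2 * (1 - Phi(|z|))
    return z, p_value


# ASSUMPTION: about 2400 people split evenly between Vioxx and placebo.
N_PER_ARM = 1200
vioxx_events = round(0.04 * N_PER_ARM)    # 4% of the Vioxx arm
placebo_events = round(0.02 * N_PER_ARM)  # 2% of the placebo arm

z, p = two_proportion_z_test(vioxx_events, N_PER_ARM, placebo_events, N_PER_ARM)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With these assumed arm sizes the p-value lands well below the usual 0.05 cutoff.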
It should be noted that, on the one hand, Merck should be applauded for checking for CVT events in a colon polyps study. On the other hand, in 1997, at the International Consensus Meeting on COX-2 Inhibition, a group of leading scientists had issued a warning in their Executive Summary that it was “… important to monitor cardiac side effects with selective COX-2 inhibitors”. Moreover, an internal Merck email from as early as 1996 stated there was a “… substantial chance that CVT will be observed.” In other words, Merck knew to look out for such things. Importantly, however, no insert was subsequently added to the medicine’s packaging to warn of possible CVT side effects.
What the CEO of Merck Said
What did Merck say to the world at that point in 2004? You can look for yourself at the four-and-a-half-hour Congressional hearing (seen on C-SPAN) which took place on November 18, 2004. Starting at 3:27:10, the then-CEO of Merck, Raymond Gilmartin, testifies that Merck “puts patients first” and “acted quickly” when there was reason to believe that Vioxx was causing CVT events. Gilmartin also went on the Charlie Rose show and repeated these claims, even going so far as to state that the 2004 study was the first study they had which showed evidence of such side effects.
How quickly did they really act though? Were there warning signs before September 30, 2004?
Let’s go back to the time in 1999 when Vioxx was FDA approved. In spite of the fact that it was approved for a rather narrow use, mainly for arthritis sufferers who needed chronic pain management and were having GI problems on other meds (keeping in mind that Vioxx was way more expensive than ibuprofen or aspirin, so why would you use it unless you needed to), Merck nevertheless launched an ad campaign with Dorothy Hamill and spent $160m (compare that with Budweiser which spent $146m or Pepsi which spent $125m in the same time period).
As I mentioned, Vioxx was approved faster than usual. At the time of its approval, the completed clinical studies had only been 6- or 12-week studies; no longer term studies had been completed. However, there was one underway at the time of approval, namely a study which compared Aleve with Vioxx for people suffering from osteoarthritis and rheumatoid arthritis.
What did the arthritis studies show? These results, which were available in late 2003, showed that CVT events were more than twice as likely with Vioxx as with Aleve (CVT event rates of 32/1304 = 0.0245 with Vioxx and 6/692 = 0.0087 with Aleve, with a p-value of 0.01). As we see, this directly refutes CEO Gilmartin’s statement that they didn’t have evidence until 2004 and acted quickly when they did.
In fact they had evidence even before this, if they had bothered to put it together. (They stated a plan to do such statistical analyses, but so far there is no evidence that they actually carried out these promised analyses.)
In a previous study (“Table 13”), available in February of 2002, they could have seen that, comparing Vioxx to placebo, the CVT event rate was 27/1087 = 0.0248 with Vioxx versus 5/633 = 0.0079 with placebo, with a p-value of 0.01. So, three times as likely.
In fact, there was an even earlier study (“1999 plan”), results of which were available in July of 2000, where the Vioxx CVT event rate was 10/427 = 0.0234 versus a placebo event rate of 1/252 = 0.0040, with a p-value of 0.05 (so more than 5 times as likely). A p-value of 0.05 is the conventional threshold for statistical significance. So actually they knew to be very worried as early as 2000, but maybe they… forgot to do the analysis?
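The event counts quoted for these three comparisons are enough to recompute the rate ratios and a p-value. The sketch below uses a two-sided Fisher exact test written with only the Python standard library. That choice of test is mine: the talk did not say which test produced the quoted p-values, so the numbers this prints need not match them exactly.

```python
from math import comb


def relative_risk(events_t, n_t, events_c, n_c):
    """Ratio of event rates: treated group over comparison group."""
    return (events_t / n_t) / (events_c / n_c)


def fisher_exact_two_sided(events_t, n_t, events_c, n_c):
    """Two-sided Fisher exact p-value for a 2x2 table of events vs non-events."""
    total_events = events_t + events_c
    denom = comb(n_t + n_c, total_events)

    def prob(k):
        # Hypergeometric probability that k of the events land in the treated group.
        return comb(n_t, k) * comb(n_c, total_events - k) / denom

    observed = prob(events_t)
    lo = max(0, total_events - n_c)
    hi = min(n_t, total_events)
    # Add up every possible table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Counts as quoted in the text: (label, Vioxx events, Vioxx n, comparator events, comparator n)
studies = [
    ("arthritis studies, Vioxx vs Aleve (late 2003)", 32, 1304, 6, 692),
    ("Table 13, Vioxx vs placebo (Feb 2002)", 27, 1087, 5, 633),
    ("1999 plan, Vioxx vs placebo (July 2000)", 10, 427, 1, 252),
]

for label, et, nt, ec, nc in studies:
    rr = relative_risk(et, nt, ec, nc)
    p = fisher_exact_two_sided(et, nt, ec, nc)
    print(f"{label}: rate ratio = {rr:.2f}, Fisher two-sided p = {p:.3f}")
```

A one-sided test, or a chi-squared test, would give somewhat different p-values from the same counts, which matters most for the smallest of the three studies.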
The FDA and Pooled Data
Where was the FDA in all of this?
They showed the FDA some of these numbers. But they did something really tricky. Namely, they kept the “osteoarthritis study” results separate from the “rheumatoid arthritis study” results. Each alone was not quite statistically significant, but together they were amply statistically significant. Moreover, they introduced a third category of study, namely the “Alzheimer’s study” results, which looked pretty insignificant (more on that below though). When you pooled all three of these study types together, the overall significance was just barely not there.
It should be mentioned that there was no apparent reason to separate the different arthritic studies, and there is evidence that they did pool such study data in other places as a standard method. That they didn’t pool those studies for the sake of their FDA report is incredibly suspicious. That the FDA didn’t pick up on this is probably due to the fact that they are overworked lawyers, and too trusting on top of that. That’s unfortunately not the only mistake the FDA made (more below).
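Here is a small sketch of how splitting studies can hide a signal that pooling reveals. The counts are entirely hypothetical, invented for illustration, and are not Merck’s numbers; the test is a plain two-proportion z-test, and pooling by adding counts is the crudest method (a stratified analysis would be more careful).

```python
from math import erfc, sqrt


def two_sided_p(events_t, n_t, events_c, n_c):
    """Two-sided p-value from a pooled-variance two-proportion z-test."""
    pooled = (events_t + events_c) / (n_t + n_c)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    z = (events_t / n_t - events_c / n_c) / std_err
    return erfc(abs(z) / sqrt(2))


# HYPOTHETICAL counts, invented for illustration only.
study_a = dict(events_t=12, n_t=600, events_c=5, n_c=600)
study_b = dict(events_t=12, n_t=600, events_c=5, n_c=600)

# Crude pooling: add events and patients across the two studies.
pooled = {key: study_a[key] + study_b[key] for key in study_a}

for name, counts in [("study A", study_a), ("study B", study_b), ("pooled", pooled)]:
    print(f"{name:8s} p = {two_sided_p(**counts):.3f}")
```

Each hypothetical study on its own misses the 0.05 cutoff, while the pooled data clears it comfortably, even though the event rates are identical in all three rows.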
So the Alzheimer’s study kind of “saved the day” here. But let’s look into this more. First, note that the average age of the 3,000 patients in the Alzheimer’s study was 75, it was a 48-month study, and that the total number of deaths for those on Vioxx was 41 versus 24 on placebo. So actually on the face of it it sounds pretty bad for Vioxx.
There were a few contributing reasons why the numbers got so mild by the time the study’s result was pooled with the two arthritis studies. First, when really old people die, there isn’t always an autopsy. Second, although there was supposed to be a DSMB (data and safety monitoring board) as part of the study, and one was part of the original proposal submitted to the FDA, this was dropped surreptitiously in a later FDA update. This meant there was no third party keeping an eye on the data, which is not standard operating procedure for a massive drug study and was a major mistake, possibly the biggest one, by the FDA.
Third, and perhaps most importantly, Merck researchers added a “filter” to the reported CVT events: the doctors who reported a CVT event had to send their information to Merck-paid people (“investigators”), who looked over the documents to decide whether it was a bona fide CVT event or not. The default was to assume it wasn’t, even though standard operating procedure would have the default be that there was such an event. In all, this filter removed about half the initially reported CVT events, and Vioxx patients had their CVT event status revoked about twice as often as placebo patients. Note that the “investigator” in charge of checking the documents from the reporting doctors was paid $10,000 per patient. So presumably they wanted to continue to work for Merck in the future.
The effect of this “filter” was that, instead of it seeming 1.5 times as likely that you would have a CVT event if you were taking Vioxx, it seemed only 1.03 times as likely, with a high p-value.
If you remove the ridiculous filter from the Alzheimer’s study, then you see that as of November 2000 there was statistically significant evidence that Vioxx caused CVT events in Alzheimer patients.
By the way, one extra note. Many of the 41 deaths in the Vioxx group were dismissed as “bizarre” and therefore unrelated to Vioxx: namely, car accidents, falling off ladders, accidentally eating bromide pills. But at this point there’s evidence that Vioxx actually accelerates Alzheimer’s disease itself, which could explain those so-called bizarre deaths. This is not to say that Merck knew that, but rather that one should not immediately dismiss a statistically significant result just because it doesn’t make intuitive sense.
VIGOR and the New England Journal of Medicine
One last chapter in this sad story. There was a large-scale study, called the VIGOR study, with 8,000 patients. It was published in the New England Journal of Medicine on November 23, 2000. See also this NPR timeline for details. The authors admitted, in a deceptively roundabout way, that Vioxx had 4 times as many CVT events as Aleve, but they didn’t show the graphs which would have emphasized this point. They hinted that this was either because Aleve is protective against CVT events or because Vioxx causes them, but left it open.
But Bayer, which owns Aleve, issued a press release saying something like, “if Aleve is protective for CVT events then it’s news to us.” Bayer, it should be noted, has every reason to want people to think that Aleve is protective against CVT events. This problem, and the dubious reasoning explaining it away, was completely missed by the peer review system; if it had been spotted, Vioxx would have been forced off the market then and there. Instead, Merck purchased 900,000 reprints of this article from the New England Journal of Medicine, which is more than the number of practicing doctors in the U.S. In other words, the Journal was used as a PR vehicle for Merck.
The paper emphasized that Aleve had twice the rate of ulcers and bleeding, at 4%, whereas Vioxx had a rate of only 2% among chronic users. When you compare that to the elevated rate of heart attack and death with Vioxx relative to Aleve (1.2% versus 0.4%), though, the reduced ulcer rate doesn’t seem all that impressive.
A bit more color on this paper. It was written internally by Merck, after which non-Merck authors were found. One of them is Loren Laine. Laine helped Merck develop a 30-second sound bite interview which was sent to the news media and run like a press interview, even though it actually happened in Merck’s New Jersey office (with a backdrop made to look like a library) with a Merck employee posing as a neutral interviewer. Some smart lawyer got the outtakes of this video made available as part of the litigation against Merck. Check out this YouTube video, where Laine and the fake interviewer scheme about spin and Laine admits they were being “cagey” about the renal failure issues that were poorly addressed in the article.
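The trade-off in those percentages is easier to see per 1,000 chronic users. The sketch below is plain arithmetic on the four rates quoted above; it assumes the 0.4% figure belongs to Aleve and the 1.2% figure to Vioxx, and it says nothing about how much worse a heart attack or death is than an ulcer.

```python
def difference_per_1000(higher_rate, lower_rate):
    """Extra events per 1,000 people at the higher rate versus the lower one."""
    return (higher_rate - lower_rate) * 1000


# Rates quoted in the text (ulcers and bleeding among chronic users).
ALEVE_GI, VIOXX_GI = 0.04, 0.02
# ASSUMED reading of "0.4% to 1.2%": Aleve 0.4%, Vioxx 1.2% (heart attack and death).
ALEVE_CV, VIOXX_CV = 0.004, 0.012

gi_avoided = difference_per_1000(ALEVE_GI, VIOXX_GI)  # GI events avoided on Vioxx
cv_added = difference_per_1000(VIOXX_CV, ALEVE_CV)    # heart attacks/deaths added on Vioxx

print(f"Per 1,000 users on Vioxx instead of Aleve: "
      f"about {gi_avoided:.0f} fewer ulcers or bleeds, "
      f"about {cv_added:.0f} more heart attacks or deaths")
print(f"Relative rates: GI {VIOXX_GI / ALEVE_GI:.1f}x, "
      f"heart attack and death {VIOXX_CV / ALEVE_CV:.1f}x")
```

How to weigh the two counts against each other depends on severity, which this arithmetic does not capture.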
The Damage Done
Also in the Congressional testimony I mentioned above is Dr. David Graham, who speaks passionately from minute 41:11 to minute 53:37 about Vioxx and how it is a symptom of a broken regulatory system. Please take 10 minutes to listen if you can.
He claims a conservative estimate is that 100,000 people have had heart attacks as a result of using Vioxx, leading to between 30,000 and 40,000 deaths (again conservatively estimated). He points out that this 100,000 is 5% of Iowa, and in terms people may understand better, this is like 4 aircraft falling out of the sky every week for 5 years.
According to this blog, the noticeable downwards blip in overall death count nationwide in 2004 is probably due to the fact that Vioxx was taken off the market that year.
Let’s face it, nobody comes out looking good in this story. The peer review system failed, the FDA failed, Merck scientists failed, and the CEO of Merck misled Congress and the people who had lost their husbands and wives to this damaging drug. The truth is, we’ve come to expect this kind of behavior from traders and bankers, but here we’re talking about issues of death and quality of life on a massive scale, and we have people playing games with statistics, with academic journals, and with the regulators.
Just as the financial system has to be changed to serve the needs of the people before the needs of the bankers, the drug trial system has to be changed to lower the incentives for cheating (and massive death tolls) just for a quick buck. As I mentioned before, it’s still not clear that they would have made less money, even including the penalties, if they had come clean in 2000. They made a bet that the fines they’d need to eventually pay would be smaller than the profits they’d make in the meantime. That sounds familiar to anyone who has been following the fallout from the credit crisis.
One thing that should be changed immediately: the clinical trials for drugs should not be run or reported on by the drug companies themselves. There has to be a third party which is in charge of testing the drugs and has the power to take the drugs off the market immediately if adverse effects (like CVT events) are found. Hopefully they will be given more power than risk firms are currently given in finance (which is none). In other words, it needs to be more than reporting: it needs to be an active regulatory power, with smart people who understand statistics and do their own state-of-the-art analyses, although as we’ve seen above even just Stats 101 would sometimes do the trick.