By Darius Tahir, a correspondent based in Washington, D.C. who reports on health technology. Originally published at KFF Health News.
What use could health care have for someone who makes things up, can’t keep a secret, doesn’t really know anything, and, when speaking, simply fills in the next word based on what’s come before? Lots, if that individual is the newest form of artificial intelligence, according to some of the biggest companies out there.
Companies pushing the latest AI technology — known as “generative AI” — are piling on: Google and Microsoft want to bring types of so-called large language models to health care. Big firms that are familiar to folks in white coats — but maybe less so to your average Joe and Jane — are equally enthusiastic: Electronic medical records giants Epic and Oracle Cerner aren’t far behind. The space is crowded with startups, too.
The companies want their AI to take notes for physicians and give them second opinions — assuming they can keep the intelligence from “hallucinating” or, for that matter, divulging patients’ private information.
“There’s something afoot that’s pretty exciting,” said Eric Topol, director of the Scripps Research Translational Institute in San Diego. “Its capabilities will ultimately have a big impact.” Topol, like many other observers, wonders how many problems it might cause — like leaking patient data — and how often. “We’re going to find out.”
The specter of such problems inspired more than 1,000 technology leaders to sign an open letter in March urging that companies pause development on advanced AI systems until “we are confident that their effects will be positive and their risks will be manageable.” Even so, some of them are sinking more money into AI ventures.
The underlying technology relies on synthesizing huge chunks of text or other data — for example, some medical models rely on 2 million intensive care unit notes from Beth Israel Deaconess Medical Center in Boston — to predict text that would follow a given query. The idea has been around for years, but the gold rush, and the marketing and media mania surrounding it, are more recent.
The frenzy was kicked off in December 2022 by Microsoft-backed OpenAI and its flagship product, ChatGPT, which answers questions with authority and style. It can explain genetics in a sonnet, for example.
OpenAI, started as a research venture seeded by Silicon Valley elites like Sam Altman, Elon Musk, and Reid Hoffman, has ridden the enthusiasm to investors’ pockets. The venture has a complex, hybrid for- and nonprofit structure. But a new $10 billion round of funding from Microsoft has pushed the value of OpenAI to $29 billion, The Wall Street Journal reported. Right now, the company is licensing its technology to companies like Microsoft and selling subscriptions to consumers. Other startups are considering selling AI transcription or other products to hospital systems or directly to patients.
Hyperbolic quotes are everywhere. Former Treasury Secretary Larry Summers tweeted recently: “It’s going to replace what doctors do — hearing symptoms and making diagnoses — before it changes what nurses do — helping patients get up and handle themselves in the hospital.”
But just weeks after OpenAI took another huge cash infusion, even Altman, its CEO, is wary of the fanfare. “The hype over these systems — even if everything we hope for is right long term — is totally out of control for the short term,” he said for a March article in The New York Times.
Few in health care believe this latest form of AI is about to take their jobs (though some companies are experimenting — controversially — with chatbots that act as therapists or guides to care). Still, those who are bullish on the tech think it’ll make some parts of their work much easier.
Eric Arzubi, a psychiatrist in Billings, Montana, used to manage fellow psychiatrists for a hospital system. Time and again, he’d get a list of providers who hadn’t yet finished their notes — their summaries of a patient’s condition and a plan for treatment.
Writing these notes is one of the big stressors in the health system: In the aggregate, it’s an administrative burden. But it’s necessary to develop a record for future providers and, of course, insurers.
“When people are way behind in documentation, that creates problems,” Arzubi said. “What happens if the patient comes into the hospital and there’s a note that hasn’t been completed and we don’t know what’s been going on?”
The new technology might help lighten those burdens. Arzubi is testing a service, called Nabla Copilot, that sits in on his part of virtual patient visits and then automatically summarizes them, organizing into a standard note format the complaint, the history of illness, and a treatment plan.
Results are solid after about 50 patients, he said: “It’s 90% of the way there.” Copilot produces serviceable summaries that Arzubi typically edits. The summaries don’t necessarily pick up on nonverbal cues or thoughts Arzubi might not want to vocalize. Still, he said, the gains are significant: He doesn’t have to worry about taking notes and can instead focus on speaking with patients. And he saves time.
“If I have a full patient day, where I might see 15 patients, I would say this saves me a good hour at the end of the day,” he said. (If the technology is adopted widely, he hopes hospitals won’t take advantage of the saved time by simply scheduling more patients. “That’s not fair,” he said.)
Nabla Copilot isn’t the only such service; Microsoft is trying out the same concept. At April’s conference of the Healthcare Information and Management Systems Society — an industry confab where health techies swap ideas, make announcements, and sell their wares — investment analysts from Evercore highlighted reducing administrative burden as a top possibility for the new technologies.
But overall? They heard mixed reviews. And that view is common: Many technologists and doctors are ambivalent.
For example, if you’re stumped about a diagnosis, feeding patient data into one of these programs “can provide a second opinion, no question,” Topol said. “I’m sure clinicians are doing it.” However, that runs into the current limitations of the technology.
Joshua Tamayo-Sarver, a clinician and executive with the startup Inflect Health, fed fictionalized patient scenarios based on his own practice in an emergency department into one system to see how it would perform. It missed life-threatening conditions, he said. “That seems problematic.”
The technology also tends to “hallucinate” — that is, make up information that sounds convincing. Formal studies have found a wide range of performance. One preliminary research paper examining ChatGPT and Google products using open-ended board examination questions from neurosurgery found a hallucination rate of 2%. A study by Stanford researchers, examining the quality of AI responses to 64 clinical scenarios, found fabricated or hallucinated citations 6% of the time, co-author Nigam Shah told KFF Health News. Another preliminary paper found, in complex cardiology cases, ChatGPT agreed with expert opinion half the time.
Privacy is another concern. It’s unclear whether the information fed into this type of AI-based system will stay inside. Enterprising users of ChatGPT, for example, have managed to get the technology to tell them the recipe for napalm, an incendiary weapon.
In theory, the system has guardrails preventing private information from escaping. For example, when KFF Health News asked ChatGPT for the email address of the author of this article, the system refused to divulge that private information. But when told to role-play as a character and then asked for the author’s email address, it happily gave up the information. (It was indeed the author’s correct email address in 2021, where ChatGPT’s training data ends.)
“I would not put patient data in,” said Shah, chief data scientist at Stanford Health Care. “We don’t understand what happens with these data once they hit OpenAI servers.”
Tina Sui, a spokesperson for OpenAI, told KFF Health News that one “should never use our models to provide diagnostic or treatment services for serious medical conditions.” They are “not fine-tuned to provide medical information,” she said.
With the explosion of new research, Topol said, “I don’t think the medical community has a really good clue about what’s about to happen.”
As a neurologist myself, I can confidently assert that seeing 15 patients in a day means your value as a psychiatrist is next to nothing. Don’t tell me he works 15-16 hours a day if he talks like that (though others do work that hard and never complain or boast about it). Spending less than one hour per psychiatric patient is very suspect. You can, of course, write a prescription for bupropion or olanzapine in less than a minute. This is where our medicine is really sick.
And oh, about the not-so-important nonverbal signs: from the pattern of walking to the tearful eyes, they are in fact, most of the time, much more telling than the words spoken. I would not refer any patient to this Arzubi and his like. But we are marching towards the most dystopian society ever…
Once AI is widespread, will it become almost solely depended upon for answers? If it does become depended on for answers and establishes its role as fact producer, will it become ossified and rigid, incapable of advancement or new ideas?
Will it tend to linearize and standardize the legal system in ways that may limit many and free a few? Stop human and cultural diversity and progression? Give reason not to care for maximum biodiversity and harmony with our spacecraft earth?
Who directs and discerns (simplistically) good from bad, positive and negative? — for instance, what economic system is good and bad and for whom is it good and for whom bad?
With all the money in politics…I suppose it is being used extensively… and to what end ? What legislation is being roughed out and to what purpose?
Most human communication is non-verbal… what happens to the place emotion plays in life? (After all, the root of “emotion” is to put into motion.)
ISTM when you get to the point where all training data is just earlier output from other AI you have a closed feedback loop.
But I studied AI as a grad student in an earlier generation where understanding natural language was the holy grail. There were great hopes then that never panned out.
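The closed-feedback-loop worry above can be illustrated with a toy sketch (my own illustration, not anything from the article): if each “generation” of training data is just a resample of the previous generation’s output, anything not sampled in a round vanishes forever, so diversity can only shrink.

```python
import random

random.seed(42)

# Toy "closed loop": each generation's training pool is sampled (with
# replacement) from the previous generation's pool. Items never drawn in
# a generation vanish forever, so the number of distinct items can only
# fall -- a crude analogue of models trained solely on earlier models'
# output.

pool = list(range(1000))            # generation 0: 1000 distinct "documents"
diversity = [len(set(pool))]
for _ in range(20):
    pool = [random.choice(pool) for _ in range(len(pool))]
    diversity.append(len(set(pool)))

print(diversity[0], "->", diversity[-1])  # distinct items: start vs. end
```

Because each generation’s set of items is a subset of the previous one, the diversity curve is monotonically non-increasing, whatever the random seed.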
Rule number one in training AI is not to feed it with any identifying information. Ever. Because you don’t want it to “learn” about, say, Chuck – you want it to learn about people that can be classified belonging to the same group as Chuck.
And that’s precisely the big problem with processing metric crap-loads of “free-form” text: it’s downright impossible to remove identifying information from a wall of text. The best efforts at the moment can catch about 80% of the well-defined identifiers (name, email), but will certainly fail with sentences like “orange man vs. 46th president,” even though most humans can identify the individuals from that.
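A minimal sketch of why this is hard (the patterns and sample note are purely illustrative): regex-based redaction handles well-defined identifiers like emails and phone numbers, but indirect references to identifiable people pass straight through.

```python
import re

# Patterns for "well-defined" identifiers: emails and US-style phone numbers.
PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
]

def redact(text: str) -> str:
    """Replace matches of each identifier pattern with a placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

note = "Contact Chuck at chuck@example.com or 555-123-4567. Orange man vs. 46th president."
print(redact(note))
# The email and phone number are replaced, but "Chuck" and the two
# indirect references to identifiable individuals survive untouched.
```

Real de-identification pipelines use trained NER models rather than regexes, but they hit the same wall: context-dependent references have no fixed surface form to match.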
Programmed by people.
People paid for a goal of power.
A vast storage Warehouse (laughably called the Cloud)
The Huckster’s Game of getting you to reveal all your stuff, while he calculates how to take advantage of you.
It does make you think that the Luddites had more humanity than their Overlords.
I’ve recently wondered, though, whether some n-letter agency, probably now ending in AI, has been feeding all the live streams of the communication data they hoover up to use as feeding material for their own AI research projects. So associating all names, numbers, topics, locations, etc. together to see what AI can make of it. That’s on the one hand.
On the other hand, I’ve also always wondered how on earth they can have a storage backup system that can be relied on. How many times have you gone for a backup tape or an Apple Time Machine restore only to find that your backup scheme stopped working six months ago? D’oh! 600 petabytes of my Chicago data gone! I’ll have to rebuild my spreadsheet from scratch…
As we used to say: you don’t really have backups until you have managed to restore the thing you need. Thus we did run some random restoring tests from the tape library for the most critical data way back when we still had the resources and a tape library.
Now it’s all block level snapshots and a policy of not being responsible if you didn’t have copies of your data…
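The “you don’t have backups until you’ve restored them” rule sketched above amounts to a round-trip check: back the file up, restore it somewhere else, and compare checksums. Here is a minimal self-contained sketch (file names and directory layout are hypothetical, and `shutil.copy2` stands in for real backup/restore tooling):

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    """Checksum a file so the original and restored copies can be compared."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_restore(original: Path, backup_dir: Path, restore_dir: Path) -> bool:
    """Back up a file, restore it elsewhere, and confirm it matches bit-for-bit."""
    backup = backup_dir / original.name
    restored = restore_dir / original.name
    shutil.copy2(original, backup)     # "backup" step
    shutil.copy2(backup, restored)     # "restore" step -- the part people skip
    return sha256(original) == sha256(restored)

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "backups").mkdir()
    (tmp / "restore").mkdir()
    data = tmp / "critical.db"
    data.write_bytes(b"600 petabytes of Chicago data (abridged)")
    print(verify_restore(data, tmp / "backups", tmp / "restore"))  # prints True
```

In practice the same idea drives the random restore tests the commenter describes: periodically pull items from the tape library or snapshot store, restore them to scratch space, and verify checksums against the source.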
I imagine the AMA will take a stand against certain uses of AI in medicine, as it has the potential to totally cheapen physician sovereignty (and wealth). Retail blood tests didn’t work under Theranos, but I can see retail diagnosis becoming a popular, sought-after alternative for many potential patients. Many patients will prefer this cheaper option over privacy because medicine in the USA is so overpriced, and/or they may not have health insurance or the copay/co-insurance funds to cover the difference to be extracted. Privacy already has many breaches, so what the heck. Then, too, many typically poorer or rural people don’t even have access to timely, high-quality health care, and/or are already experiencing skipped or missed diagnoses anyway, or medically acquired harms. It’s not like the USA has the best health care in the world for all its people. The trade will want to control the technology to preserve its sovereign right as corporate rentiers, and that’s the bigger problem IMO.
Watching members of my family in primary care struggle with a combination of the electronic medical record, declining reimbursement rates, shorter allotted patient time slots, and increasingly stressed out patients, I have gotten a glimpse of the harsh balancing act of AI.
Members have appreciated the assistance with typing, but the sheer number of patients to be seen poses a problem for accurate notes: by the time the doctor finds time to finish the note, the AI unit may have fibbed in composing it.
When more and more doctors have been driven out of, or not enlisted in primary care, due to insane working conditions, relative low pay, poorly trained subordinates, and huge educational debt, I’m certain this situation will get much worse.
When lesser-trained medical workers rely on AI to make diagnoses without having the depth of knowledge of a trained doctor, things will get very interesting, as we will have multiple sources of confounding inaccuracies.
I will say one thing about this idea. The way that the medical establishment lined up with their political masters in so many countries during the Pandemic and gave advice that they knew was wrong and even harmful seriously eroded trust between people and their doctors. If they bring in these AIs on a widespread scale, this will serve to finish whatever trust is left.
I’d put it the other way around. The dangers of getting a ‘second opinion’ or even some initial guidance in making a diagnosis or treatment regime may seem anathema to the ‘experts’ in their ivory towers but to those of us outside those exclusive universities or marble-floored hospitals, dealing with practitioners we know are doing their level best despite unbearable time pressures and lack of resources, knowing they have this AI condensing the experience of thousands of others to work with will restore some trust and confidence.
AI is just a tool, and like every tool it produces results the quality of which depends upon the skill with which you use it.
Good to know I can save money by using the cheapest, crappiest tools available, and my skillz will ensure a fine job when the tools inevitably fail.
One of my favorite delights as a teacher of young medical students is discussing great works of literature with them that have meaning for our profession.
One book I never dreamed would be on the list is one I just added to the rotation in the past 6 months. It is a book from my youth that I have reread a few times in my life.
It is a science fiction book about a post-Jackpot world of deprivation and hunger, where the citizens are challenged every day to just survive. And then they find an ancient alien technology that allows them to traverse the void between stars. Some of these desperate travelers find riches Mammon could not even dream of, some of them never come back, and some return forever changed in not so good ways. Only a very few get rich, but it is worth the risk, or so they think. Those economic allegorical issues had always been the hook for me, and then the past year with AI rolled out.
You see, the main character has a guilt problem about what may or may not have happened to his friend. And a large part of the book which I had minimized as a young man was his constant conversations with his AI psychologist. This part of the book was an add-on lark when I was young. Now, with our incipient race to AI, it has taken on a very scary dimension in my mind. The dulcet tones of the AI in this book and indeed even its communication style are hauntingly familiar to what I have seen from ChatGPT. There are parts of this book now, which as a physician of decades, make my soul hurt. The discussions with my young students have been illuminating.
We seem to be barreling right into this whole concept of AI, without the slightest thought or discussion of the consequences. We should be used to this now. The vaccines, the Ukraine situation, the border fiasco – all brought to us by our betters without a shred of contemplation or common sense.
It has been amazing to me how many of our sci-fi writers of the 60s-70s had their fingers on the pulse of the future. Downright scary at times.
The book is called Gateway. Frederik Pohl is the author.
“I’m Afraid I Can’t Do That, Dave.”
Life just keeps getting more and more complicated. In the event of medical malpractice, whom will I sue? My doctor or his AI assistant?
Oh, I forgot. I can just ask ChatGPT! It will surely provide an answer. And tell me, of course, that the doctor is the culpable party.
Poor little fellow. This surely deserves an “Oh honey…” reply, spoken to him in a condescending, motherly tone. When have improvements in efficiency ever gone toward reducing our workload?
I feel like Topol is a good and well meaning person, for example he is willing to look at the evidence in front of his eyes about the SARS-CoV-2 pandemic. But on the issue of technology in medicine he has what I see as a huge blind spot, even before AI entered the scene. His 2016 book touted technology that would get around having to talk to a person which I believe is a major part of medicine.
That would have taken all of the fun out of The Bob Newhart Show.
US internal medicine hospitalist of 20 yrs.
1. Critical thinking in medicine has been declining over the last 30+ years. College and medical school select for rule-follower personality types. And the “quality”, “accreditation”, and “insurance” rule-sets use $$$ as a hammer to crush providers into following cookie-cutter diagnostic and treatment protocols. The last few years with maltreatment of Covid showed how many amiable followers there are rather than courageous critical thinkers.
2. Most of what is communicated by patients – in their physical signs, tone of language, unspoken words – is NOT simple verbal language. Patient and family expectations, emotional and psychological reactions to illness and disability and death are a much bigger part of the care experience than simple differential diagnoses and UpToDate expert opinion treatment recommendations.
3. AI is just an elaborate human-created rule set for a computer to actuate quickly. Even today there are many humans in the background having to label and prioritize data for the AI to make use of.
==> I don’t believe AI will be able to take care of anyone soon.
==> The current state of US medical care which is driven by bureaucracy and punishes free-thinkers, is close to already delivering arbitrary, rule-based care which is essentially just a bad version of AI.
I’m hoping the whole system collapses so we can get back to face-to-face interpersonal caring.
The collapse cannot happen fast enough.
Do not know about you, but the collapse is well under way where I am. Critical shortages in staff and nursing in almost every department. A substantial number of doctors have “retired” in the past few years, not that they were afraid of catching Covid, but because they could no longer in good conscience be a part of the system that inflicted this disaster on the world. A huge maw is opening now that the baby boomer docs are retiring or dying in droves. There are two issues here. 1). No one in charge decades ago saw this coming, amazingly. The number of docs in the pipeline is just minuscule compared to those retiring. All the while the baby boomer generation itself is set to overwhelm the system we have. 2) As you say, the critical thinking skills have been in an extreme decline in our universities for about a decade now. The trainees know all about the care of transgendered people, etc. However, they melt like butter when placed in front of acutely ill patients. They have also been indoctrinated with the idea that 8-10 patients a day is hideously busy.
The past few years have brought the first inkling of the true disaster that is on the way. No one in charge at any of the boards seems to notice or care. They are too interested in first class flights and multimillion dollar condos.
I have a very bad feeling about all this. I sense another Flexner Report is on the way. The brick wall must be close at hand.
Thanks, Doc. Anecdote about the supply of doctors: a family friend graduated from college two years ago. Good school, great grades in a challenging scientific major, and an overall good guy who has been focused the whole time on becoming a physician. Applied to a wide variety of med schools (an arduous and expensive process), and was rejected by all. Some didn’t reject him outright, but said “we’re full, try again in a year.” Now, I admit I’m looking at this with a bias in favor of my young friend, but are med school slots really in such short supply? Is that the bottleneck? Certainly the cost is jaw-dropping, and entering the field with astronomical debt seems to be standard procedure. I’m not sure what to make of this.