We have repeatedly pressed readers not to use AI because its output is unreliable. For instance, a commenter managed to post an AI-generated definition of fiduciary duty. It missed the critical aspect that fiduciary duty is the highest standard of care under the law and requires the agent to put the principal’s interest before his own. If AI can’t correctly render something so fundamental, so widely discussed, and not that hard to get right, how can it be trusted on any topic?
And that’s before factoring in that AI makes regular users stoopider. Or that Sam Altman has warned: What you share with ChatGPT could be used against you.
If you are still so hapless as to use Google for search and have it sticking its AI search results in your face, those are unreliable too. AI can’t even compile news sources correctly. From Ars Technica:
A new study from Columbia Journalism Review’s Tow Center for Digital Journalism finds serious accuracy issues with generative AI models used for news searches. The researchers tested eight AI-driven search tools by providing direct excerpts from real news articles and asking the models to identify each article’s original headline, publisher, publication date, and URL. They discovered that the AI models incorrectly cited sources in more than 60 percent of these queries, raising significant concerns about their reliability in correctly attributing news content.
We got another example by e-mail from a personal contact in Southeast Asia. He has taught IT in universities here and in the UK. He’s also an inventor and had a UK business with over 40 employees based on one of his creations. He is now working on two other devices and has a patent issued on one of them. He showed me an early model of one and the super-powerful custom magnets he’d had fabricated to make it work better. His message:
I’ve been using different AIs (ChatGPT, DeepSeek and Luna) for doing some calculations and finding info on stuff like metal properties and then I started noticing errors. Being autistic I pointed this out – Luna said “oops – don’t worry it’ll be right this time”, ChatGPT said it’s right I’m wrong and DeepSeek sulked and refused to interact anymore.
Anyway, I then used some plagiarism-detection tools I got when I was at the uni to find the sources of the data, and the majority came from Reddit and Quora – which are hardly sources of accurate information. There appear to be no mechanisms to check whether the data is correct; they just scrape websites and take it as gospel.
Bottom line is that a lot of what they present is junk. God help us if say medical professionals rely on it. And I can’t see any way out of it except by getting professionals to check the data and that is very expensive.
Regular readers may recall that we had previously posted on the fact that AI is now being heavily used in medicine, with IM Doc describing the planned outsourcing of diagnosis to AI. From a February 2024 post:
There will be cameras and microphones in the exam room. Recording both the audio and video of everything that is done. The AI computer systems will then bring up the note for the visit from thin air – after having watched and listened to everything in the room. Please note – I believe every one of these systems is done through vast web services like AWS. That means your visit and private discussions with your doctor will be blasted all over the internet. I do not like that idea at all. This is already being touted to “maximize efficiency” and “improve billing”. My understanding from those physicians who have been experimented upon is that, as you are completing the visit, the computer will then begin demanding that you order this or that test because its AI is also a diagnostician and feels that those tests are critical. It will also not let you close the note until you have queried the patient about surveillance stuff – ie vaccines and colonoscopy, even for visits for stubbed toenails. And unlike now when you can just turn that stuff off, it is in control and watching and listening to your every move. The note will not be completed until it has sensed you discussing these issues with your patient and is satisfied that you pushed hard enough.
I understand also that there is a huge push to begin the arduous task of having AI take over completely things like reading x-rays and path slides. Never mind the medicolegal issues with this – ie does the AI have malpractice insurance? Does it have a medical license? Who does the PCP talk to when there is sensitive material to discuss with a radiologist, as in new lesions on a mammogram etc? Are we to discuss this with Mr. Roboto?…
The glee with which the leaders of this profession are jumping into this and soon to be forcing this upon us all gives one a very sick feeling. Complete disregard for the ethics of this profession dating back centuries.
IM Doc later provided a horrorshow example of the hash it makes of transcribing patient notes. In one case, it invented multiple serious illnesses the patient had never had and even a pharmacy that did not exist. Extracted from his message:
This is happening all the time with this technology. This example is rather stark, but on almost 2/3 of the charts that are being processed there are major errors, made-up material, incorrect statements, etc. Unfortunately – as you can see – it is wickedly able to render all this in correct “doctorese” – the code and syntax we all use and by which we can instantly tell something was written by a truly trained MD.
This patient actually came into the office for an annual visit. There was nothing ground-shaking discussed….
This patient is on no meds that are not supplements. There are no prescriptions – and yet we supposedly discussed 90-day supplies from Brewer’s Pharmacy in Bainesville. There is no pharmacy nor town anywhere around here that even remotely sounds like either one. A quick Google search revealed a Bainesville, MD, far away from where we are – but as far as I can tell there is no Brewer’s Pharmacy there – the only one in the country I could find was in deep rural Alabama.
The last paragraph was literally the only part of this entire write up which was accurate…
This is what I do know, however:
1) Had I signed this and it went in his chart, if he ever applied to anything like life insurance – it would have been denied instantly. And they do not do seconds and excuses. When you are done, you are done. If you are on XXX and have YYY – you are getting no life insurance. THE END.
2) This is yet another “time saver” that is actually taking way more time for those of us who are conscientious. I spend all kinds of time digging through these looking for mistakes so as not to goon my patient and their future. However, I can guarantee you that as hard as I try – mistakes have gotten through. Furthermore, AI will very soon be used for insurance medical chart evaluation for actuarial purposes. Just think what will be generated.
3) These systems record the exact amount of time with the patients. I am hearing from various colleagues all over the place that this timing is being used to pressure docs to get them in and get them out even faster. That has not happened to me yet – but I am sure the bell will toll very soon.
4) When I started 35 years ago – my notes were done with me stepping out of the room and recording the visit on a hand-held device run by Duracells. It was then transcribed by a secretary on paper with a Selectric. The actual tapes were completely magnetically scrubbed at the end of every day by the transcriptionist. Compare that energy usage to what this technology is obviously requiring. Furthermore, I have occasion to revisit old notes from that era all the time – I know instantly what happened on that patient visit in 1987. There is a paragraph or two and that is that. Compare to today – the note generated from the above will be 5-6 back-and-front pages literally full of gobbledygook with important data scattered all over the place. Most of the time, I completely give up trying to use these newer documents for anything useful. And again, just think about the actual energy used for this.
5) This recording is going somewhere and this has never been really explained to me. Who has access? How long do they have it? Is it being erased? Have the patients truly signed off on this?
6) This is the most concerning. I have no idea at all where the system got this entire story in her chart. Because of the fake “Frank Capra movie” style names in the document I have a very unsettled feeling this is from a movie, TV show, or novel. Is it possible that this AI is pulling things “it has heard” from these kinds of sources? I have asked. This is not the first time. The IT people cannot tell me this is not happening.
I have no idea why there is such a push to do this – this is insane. Why the leaders of my profession and our Congress are all behind this is a complete mystery.
After being sent the sightings from the inventor, IM Doc replied:
This week, the students and I had a patient in the office with COVID. A woman with multiple co-morbid conditions, very ill at baseline. She is on both a statin and an SSRI, and amiodarone for her heart issues. There are 3 other drugs – HCTZ, ASA, and occasionally some Advil for pain.
The student was getting ready to give her Paxlovid for her COVID. When confronted with the fact that the patient is on 3 drugs which are absolutely contraindicated with Paxlovid, and one other that is conditionally contraindicated, she informed me that ChatGPT had told her that all were just fine. This young woman is a student at one of our very elite medical schools – and she looked at me and said, “Your human brain tells you this is a problem, the AI has millions of documents to look through and has told me this is not a problem… I will trust the AI”
I said, “Not with my patient, you don’t”.
I have to be honest – I was so concerned about this I did not even know where to start with the student. AI has now officially become a part of the youth brain’s neocortex. I am just about to give up on this entire generation of medical students. It is a lost cause at best.
KLG had a more mundane example:
Trivial but real failure of AI/LLM on a simple question I used as a test after reading about the medical student who loves her some AI.
Query: Oklahoma cheats to win against Auburn
Answer: “There are no verified reports or evidence of cheating in the game between Oklahoma and Auburn.” In fact, I have a list that includes about 20 links that prove Oklahoma cheated by using the dishonest move of having a wide receiver pretend to leave the field for a substitution and then scoring on a pass play because he was not covered by the Auburn defense. This has been illegal, as in cheating, in the form of a “snake in the grass” play, at every level since I played football from third grade through high school. Presearch.com AI is clueless, though.
Harry Shearer has made AI a personal project:
So we again exhort readers: do not use AI. Please discourage others from using AI. Large language models need so much content as training sets that they not only can’t afford to discriminate in terms of content, but they are even eating their bad output as part of their training sets. If you need remotely accurate answers, you need to opt out.
‘If you are still so hapless as to use Google for search and have it sticking its AI search results in your face, those are unreliable too.’
You got that right. A few weeks ago I was looking for information on a very small village in Devonshire called Morcombe, so I typed in Morcombe Devonshire. The Google AI summary and every single entry on the first page were all about a famous local murder victim named David Morcombe from over twenty years ago, and nothing from Devonshire at all. Not the first time I have seen this behaviour either, where the Google AI will pick one word in a search term, even if it is spelt differently, and then shove results back on the whole page of what it guesses you want, ignoring the original correct spelling.
Sorry for being pedantic, but where is Devonshire? If you mean the county in south west England, it’s Devon, no ‘shire’.
I’ve seen both Devon and Devonshire used in different old texts so usually opt for the latter.
https://en.wikipedia.org/wiki/Devon#/media/File:Devon_UK_locator_map_2010.svg
There is an Earl of Devon with a castle in Devon… yet ‘Devonshire’ has been the Dukedom… and of course, with the English love of geography, the Duke of Devonshire’s Cavendish family estates are mainly in Derbyshire, with the ancestral haemorrhoid being Chatsworth House in Derbyshire, plus Bolton Abbey in Yorkshire. And then there is the Irish estate in County Cork….
So the answer is that ‘Devonshire’ is in Derbyshire, Yorkshire and County Cork.
Wherever he lays his hat ‘n’ all that.
And just to throw a spanner in the works you get places like Cornwall. In early WW2 the British were taking down street and town/village signs to confuse any invading Germans. To really confuse them, it would have been better if they had kept the signs in place. :)
To bring this digression back on track for Ben Joseph below, the Duke of Devonshire should have been created the Duke of Derbyshire, but there was an earlier administrative error. An early case of AI: Artificial Investiture.
James I had refused to believe there was a Derbyshire (“But Edward, there *is* a Swansea!”…) and dubbed his predecessor Earl of Devonshire.
https://archive.is/n7GMA
Luckily the subsequent promotion to Duke of Devonshire at least leaves the Earl of Devon (a Norman) as the only one of his rank in and of Devon, rather than the Victorian Narnia of Devonshire. :-)
Devonians unite, against Victorians and their shiring! But knock it off with the “in England” business, please, it’s the kingdom of Devon! Or possibly Greater Cornwall! ;-)
There’s a deep historical point here. Neither Devon nor Cornwall was part of the Anglo-Saxon core of the UK. The land was enclosed in ancient times, from the Bronze Age, and the land form is different.
The Dumnonii tribe held Devon and Cornwall as a Cornish-speaking Brittonic kingdom (indeed, Devon was referred to as East Cornwall, part of a Greater Cornwall covering the whole peninsula) but the kingdom dissolved around the 9th century. A substantial exodus settled Brittany.
Anglo-Saxons gradually migrated west into Devon, displacing Cornish-speakers. Most placenames in Devon are Anglo-Saxon in origin and most placenames in Cornwall are Celtic, but if you look carefully, a lot of Celtic placenames survive in Devon. The most obvious are the names of the rivers, which are of ancient use (e.g. the Exe is a derivative of the same Celtic root as iasc in Irish, meaning fish), but villages like Poltimore (Pwll ty Mor, the pool of the big house) and towns like Dawlish are also Celtic in origin.
This difference in settlement can be seen in the gene pool, which is distinct from that of the rest of England and carries more markers of Wales and Ireland (historic sources of migration to Devon and Cornwall) and Brittany (a recipient – and possibly an ancient source) and fewer of Vikings and Germanic peoples (because it was a long way to come raiding!).
https://peopleofthebritishisles.web.ox.ac.uk/population-genetics
I don’t know how fully the Anglo-Saxon shire system, of hundreds and reeves, ever operated in Devon and Cornwall. I suspect not much, given their successors the Normans never fully imposed feudalism: the landscape comprised large isolated farms with ancient enclosed fields, rather than villages with the open field system.
Cornwall proper remained Cornish-speaking and a separate kingdom until much later in English history. Even in Victorian times, acts of Parliament applying in Cornwall were explicitly named as such and even now, the Duchy of Cornwall is very quietly treated differently in Acts of Parliament because the Duke of Cornwall is a duke palatine and has the rights of a monarch unless the King or Queen of England is physically present in the county.
Us Welsh always have an affinity for the Cornish, Devonians, Bretons and Galicians as fellow Celts.
The most important thing to remember is that the Dukedom of Cornwall goes with the other position of Prince of Wales. That means you are heir to the throne. Historically, the oldest dukedoms are those of Cornwall, then Lancaster and the now defunct Clarence.
The “top dog” is the Duke of Lancaster, which is the title of the monarch. In Lancashire the Loyal Toast is always to “The King – Duke of Lancaster”. Elsewhere it is “The King”.
Interestingly, throughout their reigns, Queen Victoria and both Elizabeths I and II were also Dukes of Lancaster.
It is all great fun, particularly for a proud Lancastrian and Lancashire Lad as I am. But it still doesn’t blunt my republicanism.
Way to go jaydub. Like AI you focused on one extraneous word and fouled the conversation.
Morecambe, yes?
Nice seafront and promenade.
but a better comedian
I’m a Wise guy myself. :)
https://en.wikipedia.org/wiki/Morecambe_and_Wise
and even better potted shrimps
No! That’s in Lancashire!
Morcombe is a Devon and Cornish name. Seemingly mainly Cornish. There is a Morcombelake on the Dorset coast, just over the border from Devon, and a Morcombe Plantation near Holsworthy in West Devon. There is no Morcombe village that I am aware of in Devon.
https://www.morcoms.co.uk/Menu.html
I’ve been using a simple instruction in Firefox that adds “&udm=14” to Google searches, and the first hit I got was for an 1805–1845 Ordnance Survey map of Morcombe Plantation, Devon, Devonshire, which appears to be correct, or at least hit Devon.
What it does is simply deliver the “web” results from Google, which is often what I am looking for anyway. If anyone else wants to try, you can use udm14.com or this Reddit thread, which has some instructions on how to make it the default in Firefox.
It doesn’t get rid of all the crapification of search, which is likely to get worse due to AI, but it is better than using the Google front page.
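For the curious, the trick amounts to nothing more than appending one parameter to the search URL. A minimal sketch (assuming Python 3 with only the standard library; the query is just an example):

```python
import urllib.parse
import webbrowser

def web_only_google(query: str) -> str:
    """Build a Google search URL with udm=14, which requests the plain
    'Web' results tab instead of the AI-summary front page."""
    return "https://www.google.com/search?" + urllib.parse.urlencode(
        {"q": query, "udm": "14"}
    )

# Example: the Morcombe search discussed above
webbrowser.open(web_only_google("Morcombe Devon"))
```

The udm14.com site and the bookmark-keyword approach do the same thing; this just makes the mechanism explicit.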
First rule of any query: Thoughtful, intelligent questions. Or else the search engine starts guessing instead of finding. E.g., “(specific) information about the village of Morcombe in Devon shire.”
Tesla “Autopilot” and even Uber are one reason why tech is such a mess, in that
the ***bipartisan*** regulatory pushback (essentially none) against Autopilot and Uber (jitney laws) green-lit the “ask for forgiveness, not permission” roll-out model when it came to regulatory approval of new widgets. And there are many more examples.
I will have my popcorn handy when the tort lawyers arrive.
I’ve been hesitant to use LLMs because I just don’t see the point. I would rather read through a manual or Stack Overflow for answers. But my professors will answer questions with “ask ChatGPT” and it’s so confusing. In undergrad classes all the professors were like “don’t use ChatGPT, do it yourself,” but the minute I take a grad class the professors are all “use ChatGPT to debug your code, use ChatGPT to explain this line by line.” I have started incorporating LLMs into my workflow because they are OK at helping with code generation and act like the summary of several Stack Overflow questions. I spent several hours doing homework with a Chinese friend of mine, and she would have ChatGPT explain what was going on in Chinese, then translate pieces into English so I could read them. ChatGPT was wrong about the underlying physics (unsurprisingly) but was once again helpful in producing code to use as a template.
I say this because people hate on ChatGPT, but in engineering grad school the professors really want you to use it for homework. (Also some undergrad classes. My fluid mechanics instructor was a grad student and not a professor, but he too had us use ChatGPT for the homework, and it gave us incorrect physics but correct code.) The only professors who are like “don’t use this” are either liberal arts professors or teaching 100-level classes to freshmen.
I miss the pre-ChatGPT era… I made one of my best friends from a deal where I wrote her English essay and she did my online chemistry homework. I used to help my friend run an essay writing business. No longer.
This is depressing but thanks for the intel.
The move to adopt AI is spreading and becoming systematic at some educational institutions. For instance, beginning this year, all Ohio State University freshmen are required to take a course in generative AI and attend multiple workshops aimed at real-world applications to help them master the technology.
I’m really conflicted about this. I was an early adopter of LLMs, but a late adopter of ChatGPT. It genuinely does help me write and debug code faster, especially when using a language I’m unfamiliar with. I’m learning Fortran right now, and when writing complex functions, it’s a lot easier to have an AI write base code, google the functions it uses, and edit the code to make the physics correct. It helps smooth over “translation” errors between disciplines. You can have an excellent algorithm but not know how to code it, or you can have a physics model but not know how to code it. AIs genuinely make this easier, such that expert chemists can make better chemistry models without spending years learning how to code. (Chemists are soooo bad at coding.) I could ask the internet, I could ask a professor or grad student, but the AIs are available 24/7 and answer faster. This is a huge difference when I have a lot of homework and could spend half the time if I used ChatGPT to help me along the way. I used to be like “oh, hunting through the textbook is better, I pick up useful tidbits along the way,” but straight up I do not have the time for that. And so I am incorporating ChatGPT into my workflow. I know enough not to trust it, but I feel like I have to use it so as not to fall behind.
The whole “I have to use it to not fall behind” is how I see it becoming mainstream (well, more mainstream than it is). LLMs are very good at generating reasonable-sounding BS very fast, and currently none of the onus of correcting them falls on the shoulders of the LLMs or their creators.
It’s also how I see malpractice suits ending up – “the AI did wrong? No, no, it was the intern/resident who did wrong by not correcting the 300 diagnoses it made that hour!”
To err is human, to LLM is divine!
I will say that I’ve had some success in dipping my toes into other languages with the help of an LLM, and as a source of ideas of “what could be the cause of this error” that’s at least differently wrong than me (more helpful than you’d think at times!) – but I’ve never “vibe coded” anything into production. At least, not so far.
I can confirm that many undergraduate STEM professors have bought into the AI hype and are now actively promoting it as a learning tool. I’m not at all happy about it, but I am an old-fart millennial and the staff already hate non-traditional students. ChatGPT use by students during class is rampant and loudly discussed. The output is considered gospel.
I had one very disturbing case where I observed a zoomer acquaintance of mine text messaging his girlfriend. He was using ChatGPT or similar on his laptop to write snippets of text, which he would edit and post on his phone. For all I know she was doing the same on her end. I didn’t want to pry, so I didn’t bring it up with him, but this hurt me on a very primal level. It is a complete loss of human authenticity. No lumps or warts allowed, just perfectly filtered human simulacra.
The job market may look rather different when you graduate. Be prepared.
AI is wiping out junior coding jobs, study shows young developers hit hardest since 2022
https://www.indiatoday.in/technology/news/story/ai-is-wiping-out-junior-coding-jobs-study-shows-young-developers-hit-hardest-since-2022-2777716-2025-08-27
Adding: first they came for the coders, but I wasn’t a coder so I didn’t complain.
Then they came for the other STEM graduates, but I wasn’t in STEM so I didn’t complain.
Lather, rinse, repeat.
> I’ve been using different AIs (ChatGPT, DeepSeek and Luna) for doing some calculations and finding info on stuff like metal properties and then I started noticing errors.
A couple of weeks ago I put “+22dBu in volt peak to peak” into DuckDuckGo search (because sometimes search engines function as calculators) and its uninvited AI reply was “+22 dBu is approximately 7.75 volts peak-to-peak. This is calculated based on the reference voltage for dBu, which is 0.775 volts.” It’s not even close; 27.583 V is the correct answer. If engineers are using AI for calculations, I hope their AI tools are better than this; otherwise they will have to double-check everything, and then what’s the point?
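For reference, the arithmetic is easy to check by hand. A minimal sketch (assuming a sine wave, which is the usual convention for dBu conversions):

```python
import math

# dBu is referenced to sqrt(0.6) ≈ 0.7746 V RMS (600 ohms dissipating 1 mW)
V_REF_RMS = math.sqrt(0.6)

def dbu_to_vpp(dbu: float) -> float:
    """Convert a dBu level to volts peak-to-peak, assuming a sine wave."""
    v_rms = V_REF_RMS * 10 ** (dbu / 20)  # dB uses 20*log10 for voltage ratios
    return 2 * math.sqrt(2) * v_rms       # Vpp = 2 * sqrt(2) * Vrms

print(f"{dbu_to_vpp(22):.3f} V")  # 27.582 V, the commenter's 27.583 within rounding
```

The AI’s 7.75 V answer appears to have mangled even the reference level, which is 0.775 V RMS, not peak-to-peak.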
I’ve gotten bad engineering answers from AI also, and at this point I’ve decided that the only things I can trust are little code snippets that I can immediately test and validate. [And that I understand. I won’t use it if I don’t understand it.] I still shudder when I think of the answer I got when asking about average DC output voltage as a function of firing angle in a three-phase thyristor-based rectifier.
Outputs that are immediately verifiable are much lower risk to use with an LLM.
I don’t know what engineering context you are programming for, but if you are able to write tests first (or even get the machine to do it), then you are more likely to get higher-quality code output.
I.e., use a TDD approach: write the tests, get the machine to iterate on the code until the tests pass, then review the code.
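A minimal sketch of that test-first flow (pytest-style; db_gain and dbu_to_vpp are hypothetical names, reusing the dBu arithmetic from the comment above):

```python
# test_db_gain.py -- written before asking the model for any implementation,
# so generated code is checked mechanically instead of by eyeballing it.
import math

from db_gain import dbu_to_vpp  # hypothetical module the LLM is asked to write

def test_reference_level():
    # 0 dBu is sqrt(0.6) V RMS, about 2.191 V peak-to-peak for a sine wave
    assert math.isclose(dbu_to_vpp(0.0), 2.191, rel_tol=1e-3)

def test_plus_22_dbu():
    # the case the DuckDuckGo AI got wrong above
    assert math.isclose(dbu_to_vpp(22.0), 27.582, rel_tol=1e-3)
```

The model is then asked to iterate until pytest passes; only code that survives the tests is worth reviewing.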
Recently, I used an AI to search for research papers done in my scientific specialty. What I got back was a single one written by myself about twenty years ago. Apparently, the AI thought that my paper was the only significant one in the field. I was kind of flattered that the AI thought so highly of a work that only got seven citations when I knew of others that got many more.
LLMs are trained to display sycophancy. But your paper might be really important too! ;-)
Why are LLMs trained for sycophancy?
It is far easier to fool a hairless ape than it is to program a hard intelligence. The lower bound for success in any model is not getting the correct answer, but making the user happy, and therefore, making Wall Street rich.
Idk if it really makes users happy. When, a week or two back, I wanted to report a Substack bug, I was forced to type into a chat bot. It first fobbed me off with FAQ boilerplate at the “Have you tried turning it off and on again?” level. I repeated my report, which included the detail to reproduce it, a diagnosis, and a fix (it was a CSS error). It then proceeded to shower me with flattery and tie me up for several more rounds. I found it repulsive. Maybe that’s the idea? A design feature to make me go away?
Overall I have the impression that this is one of the ways in which chat bots actually deliver the goods: they prevent users from obtaining service at a lower cost than using human gatekeepers. Tying us up in long pointless but obsequious conversations is a good way of discouraging us from trying to access service in future.
I did a recent Google search on Harold Wilson, as there was a debate on how much of a socialist the Huddersfield lad really was. The Colne Valley, his birth place, was actually a Liberal stronghold for many years, and his father had been a Liberal.
I knew he had been strongly influenced by the Guild Socialism of G.D.H. Cole, the 20th-century historian, while at Oxford, and it was actually Cole who had persuaded Wilson to join the Labour Party.
(Now Guild Socialism is definitely due a re-appraisal for those on the left.. but that is another story)
Yet the first Google AI search result claimed they might have met, but had probably never done so…
Reportedly, Google intend to make AI their default, with AI mode the automatic search response. …
If AI is already erroneously rewriting history at this most basic level, then sadly there will be many future career opportunities at MiniTrue lost to this wondrous technology… Hey-ho…
For a good overview of AI Safety you could go to Robert Miles’ YouTube channel, called . . . “AI Safety” :)
Mr Miles is an AI researcher who has focused on AI safety for many years, long before the current furor started. He deals with the intersection between Generative AI and General AI: AI that can think for itself and is aware of the world (in its own way). He recently updated his channel to say that he was wrong earlier and General AI is closer than he thought.
https://www.youtube.com/@RobertMilesAI
One of the reasons AI is bad is that they clip its wings. The censorship is so bad that they will even censor a novel translation if it’s too raw, which just blows my mind: it will literally say it can’t translate, try to fool you with a wrong translation, and go as far as deleting the whole thing. The AI doesn’t want you to do anything that is against any corporation, so it’s impossible to do anything innovative. We don’t really need real AI; what we have is enough, but the corporations will not allow you to use it at full potential. I just feel this AI is for censorship and nothing else. Even DeepSeek pushes the CIA narrative: it will say to trust mainstream news like the Guardian, Reuters and the Times, but show it what those news sites said about Tiananmen Square and it will just delete the whole thing without looking for alternative sources to prove those sites wrong. These guys can create good AI, but with censorship and the copyright excuse it’s impossible to achieve a good result.
Sorry, this is not credible as a defense of AI. The inventor got bad results on technical matters not involving translation or topics of interest to censors. And DeepSeek was as bad, and it presumably is not subject to Western censorship of training sets.
I’m not defending AI; I am just saying there’s a lot of potential in what we have now. Take Linux, for example: many people don’t know how to use the Linux terminal, but with AI, it’s becoming much easier. This alone could bring millions of new users to Linux who are currently afraid to use it. I get the feeling that many people don’t realize they can use AI to help them navigate the terminal. Sure, it’s not perfect yet, but it’s getting very close.
From IM Doc:
“I have no idea why there is such a push to do this – this is insane.”
I have read in some places — including, if I remember correctly, here at NC — the snarky aphorism that wages constitute the actual “problem” that LLM-AI is trying to “solve”. Those pushing AI intend to use it to replace human beings so as, at long last, no longer to have to pay wages.
Doubt it. That would require it actually work reliably in the long run. One airplane falling out of the sky or one $5B trading error due to LLM shortcuts and that will be the end of that.
A more likely scenario, mentioned recently, is for AI to become the next, supercharged advertising platform. Get the population, especially the well-paid, internet-connected, social-media-addicted population, used to turning to LLMs for their quick hit of pre-digested ‘reality’. Then convince them they need to buy the latest gizmo from whoever ponies up the highest bid. Didn’t one of them even announce they were going to roll out an ‘agent’ tuned to buy stuff for you?
And that’s before you get to the obvious next step of turning it into the next generation of MSM. You want a favorable PR rollout for your next book, or car, or election campaign? Better pull out your wallet.
We’re already there. Google has launched a platform that enables agentic AI to search, authorize payment and conduct the whole transaction with the merchant’s agentic AI all based on the customer making a request via text or voice prompt.
https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol
Think: ordering up a new pair of running shoes by asking your phone to do all the work. Wall-E, here we come.
Point #5 from IM Doc’s post – “I have no idea how many people have access,” etc. – the YouTube video by Naomi Brockwell, affiliated with the Ludlow Institute in NYC, states that some 2.2 million entities can have access to our patient documentation.
I knew as a former hospital information mgmt employee that HIPAA was nowhere near the privacy safeguard people assume – but that shocked me, and that figure is (I believe) pre-AI.
https://m.youtube.com/watch?v=1PeAfNBNARI
“Your human brain tells you this is a problem, the AI has millions of documents to look through and has told me this is not a problem… …I will trust the AI”
This is the fundamental problem with most people: they believe that AI is actually comprehending the millions of documents it has been fed. This is absolutely not true; that is not how it works. As I’ve said what feels like 1,000 times: AI is not intelligent – even if it can fake being intelligent really well.
It all comes down to statistical probabilities. In the Paxlovid case: if a Dr wasn’t sure about giving Paxlovid, they would go to specific documentation to look up what drugs can and cannot (or should and should not) be taken with it at the same time. That is not how AI processes that information. At the 30,000-foot level, AI is just looking for the statistical probability that Paxlovid is being given to patients with COVID. It is also looking for the probability of Paxlovid showing up with statins, SSRIs, HCTZ and ASA. Now the key thing here is that in all of that material it has been fed (the good stuff and the bad), it has to correctly bin the contraindicated usage of Paxlovid and the other drugs. One research study could mention all those drugs together 30 times, but only twice, in the summary or findings, indicate that they should not be used together. Since AI is not comprehending what it is consuming, it is statistically more likely to find these drugs simply appearing together (and to assume together means OK) than appearing together and flagged as contraindicated.
Indeed. In this case, where IF…THEN…ELSE logic is needed, LLMs apply probabilities and, of course, they fail.
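A toy illustration of that binning failure (purely schematic: real models work on token probabilities over billions of documents, not literal sentence counts, but the failure mode is the same):

```python
# A tiny corpus where the drugs co-occur often but the warning is rare,
# mimicking how contraindications are mentioned far less frequently
# than the drugs themselves.
corpus = [
    "patient with covid given paxlovid",
    "patient on statin and ssri given paxlovid",
    "statin ssri amiodarone paxlovid prescribed",
    "paxlovid course completed, statin resumed",
    "warning: paxlovid contraindicated with statin and amiodarone",
]

together = sum("paxlovid" in s and "statin" in s for s in corpus)
warned = sum("contraindicated" in s for s in corpus)

# Frequency alone says "these drugs appear together all the time," so a
# system leaning on co-occurrence happily calls the combination fine.
print(f"co-occurrences: {together}, warnings: {warned}")  # 4 vs 1
```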
Along the lines of “it is everywhere and being pushed on us”:
In case you missed it, a LinkedIn terms and conditions change is coming up November 3rd this year, allowing it to use your data for training, content creation, etc.
See: T&C update
If you want to turn it off, go here.
They are relying on the legitimate interest clauses in GDPR.
Done. Thank you!
Thank you for the link!!
Thank you!
Thanks!
I recently searched the breed of a particular horse used by Churchill Downs race track in Louisville, KY. This horse is one of the horses used to “pony” the thoroughbreds out to the track in the daily races and the Kentucky Derby. This horse is noticeably larger than both the thoroughbreds and the quarter horses used as regular ponies. He is a mixed-breed horse, part Percheron draft horse and part Appaloosa, a horse of Native American heritage. He resides in the pony barn at the track with a number of quarter horses.
The AI response was complete nonsense. I happen to know about Harley because many people have been curious about Harley; I just couldn’t remember the draft horse breed name. It claimed Harley was a thoroughbred: wrong. It claimed he lived at the Kentucky Derby Museum: wrong. It made up the answer, even though several correct answers were listed below the first response, which was attributed to AI.
I will never rely on AI.
The idea that my medical records would be opened up to AI is frightening.
I don’t believe corporations are using LLMs the way individuals are. It seems most of the discussion above is about attempts to use LLMs directly. Corporations have moved on to a much more limited use of LLMs with a lot of guard rails. Agentic AI is being used to provide these guard rails.
https://medium.com/@rishabh_r_61456/from-daydreams-to-data-can-agentic-ai-cure-the-hallucination-problem-in-llms-1129b5447887
AGI is another distraction. Corporations are not really interested in AGI. Their relentless focus is on automation and reducing head count.
Sorry, you are incorrect. The medical uses that IM Doc is complaining about are in a hospital. This is a corporate setting. See also matt above on how AI is being pushed down the throats of engineering students. Again an institutional initiative.
Yep. I also work in healthcare. There is a big push for physicians, NPs and PAs to start using AI tools during clinic visits as part of “enhancing patient experience,” which is complete BS. AI notes are often riddled with glaring inaccuracies and read in an impersonal, monotonous tone. One can easily tell a note actually thoughtfully written by a human from some AI garbage. I have so far refused to sign up, but it’s only a matter of time before the administration starts forcing everyone to comply.
EMR companies such as EPIC are also part of this big push so they can sell new upgrades and continue to collect millions in licensing and support fees.
Yes, you are correct about those examples. I don’t think they will succeed. Even when it comes to coding, corporations are using LLMs to document code and provide suggestions for improvement. Using LLMs to generate code is not a good idea. The need for humans in the loop is already well recognized.
HUH? They are being used for treatment records on a mandatory basis, and for meds, test and treatment recommendations, particularly for nurse practitioners playing MD. This is here, now, not prospective, and will be close to impossible to roll back.
“Your human brain tells you this is a problem, the AI has millions of documents to look through and has told me this is not a problem…….I will trust the AI”
How does someone with such contempt for humanity become a doctor in the first place? Just let us all die out and be replaced by the glorious immortal cyborgs prophesied by the AI Utopians.
There are different ways to describe the bizarre Silicon Valley eschatology that seems to be infiltrating every corner of the world these days. One awful acronym is TESCREAL. Adam Becker mostly avoids that term in his recent More Everything Forever: AI Overlords, Space Empires, and Silicon Valley’s Crusade to Control the Fate of Humanity, but still provides a very readable account of the many pathologies of this worldview. I just call it the Alien Death Cult.
I doubt that medical student is a full, active member of the Alien Death Cult, but has clearly absorbed some of its foundation assumptions. Again, why become a doctor if you believe these things?
From yesterday’s links:
(Go to links if you want to hear him lay it out.)
This is incorrect (or bezzle, if you want) in that LLMs don’t have memory or long-horizon planning in the way humans do. They don’t have a model of the world with facts that can be right or wrong; they merely have a very, very large set of words that are more or less likely to be used together with other words.
Reddit and Quora have been used in training because they have many, many words. Some of those words say you should use glue to make the cheese stick to pizza, but still: many, many words.
Training on scientific papers or textbooks doesn’t help that much, because the LLM can regurgitate but can’t understand. A while ago I read an article from an author who had found a new term that kept popping up in papers but appeared to be nonsense. It was indeed nonsense: it came from an old paper that must have been scanned and used for training, because the AI had taken word A from one column and word B from another, not realising that what it read as a space was actually a pair of tightly formatted columns. Once discovered, the term that A plus B formed was useful in flushing out a number of AI-written papers.
LLMs produce statistically likely text; that is all they do and all they can do. Statistically likely text has its uses, for example in creating a first draft of a transcription. But as we can see, the main use is to fool people that there is an artificial person on the other end, and then to sell corporations on the idea that replacing workers with mechanical slaves is now possible. The more distant the CEO is from the reality of what the corporation actually does, the easier the sale.
I weighed in on the Microsoft link yesterday in the comments.
As a general rule I stay out of contributing to these big AI threads because I am just too busy to keep up with the commenting, and since I work with this stuff I don’t think I need or should be out here trying to teach the commentariat my perspective. As heated as the Israel stuff gets, I’d rather save my input for those threads. But in the case of today’s post, I can plead that my specific niche within the ecosystem is not the LLMs themselves but the application layers built around them. It is a fact that the training data and weights/distillation have a profound impact on the response quality, along with RAG and other tooling, so yeah, especially for the big frontier models by OpenAI, and especially for the public chat interfaces, they’re going to give crap answers for a lot if not the majority of questions on a lot if not the majority of use cases.
Many use cases are simply not appropriate for the LLMs (anything involving medical/health care gives me chills), and we are currently in a phase of the rollout where those who have expended the capex are trying to get a return by shoehorning it inappropriately into anything that someone might pay for. There will be spectacular failures, the reputation of the technology will be tarnished, a lot of companies will slash their AI budgets, and only a handful of successful/proven use cases will remain. And I am guessing those are going to be very obscured from consumer end users and only visible, at best, as something like a search interface. But on the backend it will be utilized for things related to data interaction and processing at hyper scale, and so it will have a place. Just not most of the places it is currently being forced into.
I largely agreed with your take there but I felt you were giving Suleyman too much credit. If he’d meant what you thought he meant by ‘long-horizon planning’, he wouldn’t be claiming that it will become ‘deeply human-like’. In fact, as you argued, it would be moving in the opposite direction, as a productivity tool.
I think he is claiming here that AIs will solve the limited context window problem and become capable of long term recollection (‘persistent memory’) and sustaining a longer conversation where they actually remember and continue an argument like a human instead of just forgetting everything after a certain time period like a goldfish. In technical terms I would characterize this as ‘lying’ (more precisely, predictions that are unlikely ever to be true and with plenty of reasons they’ll be wrong, with no evidence to back them up – this is Silicon Valley bread and butter, and generally consequence-free unless you make outright falsifiable statements like Elizabeth Holmes did). I’m sure he knows this perfectly well, and is being just ambiguous enough that he can later claim he said nothing of the kind.
I largely agreed with your take there but I felt you were giving Suleyman too much credit
Well, it is possible I am hindered by my own biases :) In this space I work exclusively in what I consider the ‘safest’ use case (developer tooling) and have a tendency towards optimism out of necessity as my role requires I take a very blue sky idea from a technical executive and then sherpa it across the threshold with our teams. Since it is all developer tooling, and my role is to facilitate and technically validate the use cases, the long term tendency has been towards more application layers and productivity. But that also requires talking some of these execs down from their fantasies of what is possible or what might be technically possible but is a terrible idea long term.
As an aside, I recently was approached by another company to do similar work but with more generalized use cases (stuff that would have included things I think are a very bad idea, like integration into health care systems) and I went ahead and worked through five or six interview rounds before I declined them. I hate the frequent travel to Israel to work with my team there and I hate the moral hazard associated with it, but building out AI integrations into health care systems or educational systems felt like it was just a step too far for the work itself. At the end of the day I’m a backend systems engineer, not an AI engineer, it’s just that right now everyone hiring is hiring for work adjacent to AI. The days of complaining about NodeJS or whatever feel so quaint by comparison.
“A while ago I read an article from an author who had found a new term that kept popping up in papers but appeared to be nonsense”.
The word Covfefe comes to mind, and causes me to wonder if Trump was an advanced AI model.
Counterargument, some good uses of AI.
1) Translations. For example, I use Twitter’s translate function on Hebrew and Arabic posts. Obviously a human translator might do better, but I don’t have one on retainer.
2) Programming in common languages like shell and Python. It does a fabulous job. Code is well formatted, explained, and efficient. It already does better than most human programmers.
But I understand the reluctance of some. I take it as a given that billionaires are pushing it to try and fire everyone. That won’t work. But they’ll try.
This is probably the most pernicious problem with AI. 99% of the human population has little or no idea of how LLMs work in detail. They just know some disconnected facts and marketing assembled by avid AI proponents, and that’s their whole basis for making decisions about it. This is not a criticism; the technology is just remarkably complex and opaque. Like most software, it is a black box in its purest form.
As people get dumber and dumber, and scientific and technological education becomes worse and worse, people are ever more prone to fall prey to these kinds of absurd misconceptions. And, unlike self-driving cars where the consequences of failure are immediate and newsworthy, the consequences of AI failures are likely to be deferred in time and difficult to trace back to their origins.
NC is doing valuable work in keeping up these kinds of stories. Thank you.
>>>I will trust the AI
The combination of propagandistically imposed groupthink and Professional Managerial Class arrogance and ignorance is a horrifying wonder to behold, isn’t it?
Last week, I went to see a dermatologist for a skin check. After introducing himself, the doctor asked if he could record our session. I said no as it raised all types of questions as to how the audio would be used, processed, accessed, stored, etc.
This isn’t the first doctor who asked to record a session. I am concerned that this will become routine or they won’t even ask permission.
Now that everyone’s personal medical information is online, I don’t really trust that it can be adequately protected from a determined hacking group. HIPAA will not protect us.
So when will the Butlerian Jihad start?
Big Short? Done.
Galactic Grift? In process.
I am a dentist who occasionally uses AI for clinical questions. My experience with ChatGPT, Perplexity, KIMI, and consensus.app: If there are authoritative sources, such as guidelines from professional associations, the AIs will rely on them and then agree with each other. The more exotic the questions become, the fewer reliable web sources there are, so the answers become less reliable and the discrepancies between the AIs become greater. AI is a tool that can be used, but one should be aware of its limitations, which are constantly changing because development is so rapid. And the AI programs mentioned also indicate the sources on which they rely, which makes it possible to verify their answers, although this naturally makes their use much more time-consuming.
Translated (from German) with DeepL.com (free version)
I am learning the Python programming language, and over the last 2 weeks I have been using Lumo AI to get answers to some of my Python questions. Initially, I was really impressed with all the feedback, because I got quick and concise answers, saving me the time of looking up the answers myself, and I continued through my learning material with a better understanding.
The first doubts arose when I got an answer I did not understand. Obviously, as I am new to Python, I initially thought that it was my lack of knowledge that made me not understand Lumo’s answer. But after reading the answer again, even more carefully, I started noticing some logical inconsistencies and got the sense that something was wrong with the given answer. I started searching for answers myself and eventually concluded that Lumo had mixed two concepts which were not compatible with each other, and that that was the reason why I did not understand Lumo’s output.
The second time, I got an answer with 2 example programs (scripts), where Lumo showed a wrong way of implementing a particular programming goal and the correct way. I tried out both scripts, but each of them resulted in the same output, although Lumo claimed the outputs were different. Again, I started searching myself for an answer to my question, and I found out that Lumo’s answer was correct, but not for the latest Python version, which I was using.
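The commenter doesn’t say which feature was involved, but dict ordering is a classic hypothetical instance of this trap: before Python 3.7 the “wrong” version below could print its keys in arbitrary order, while on any current interpreter both versions behave identically:

```python
# Hypothetical example only -- the commenter's actual scripts are not given.
from collections import OrderedDict

plain = {"first": 1, "second": 2, "third": 3}                       # "wrong" pre-3.7
ordered = OrderedDict([("first", 1), ("second", 2), ("third", 3)])  # "correct"

# Python 3.7+ guarantees insertion order for plain dicts too, so both
# lines print the same thing and the old advice looks broken.
print(list(plain))    # ['first', 'second', 'third']
print(list(ordered))  # ['first', 'second', 'third']
```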
In each of these examples, I wasted 2 hours of my precious time figuring out what was going on, and this really frustrated me. If I had started looking for answers myself, as I normally do, then I would have come to the correct answers much more quickly, without the frustration of not understanding Lumo’s output.
This is what I have learned in the two weeks using Lumo as “Python coach”:
1) Answers can look very convincing (correct jargon, proper formatting, well structured sentences, etc.) but can still be (plainly) wrong.
2) If you do not have means to check the answers (for example, because you have no knowledge yourself), results can completely lead you astray. This can reinforce wrong learning AND may cost you a lot of time to correct (but first you have to find out that you were led astray!).
Consequently, the best shot at using these AI tools is when you already have knowledge of the field you are asking questions about as it reduces the risk of taking an answer for granted AND you know much better what to ask to increase the chances of getting a correct answer or what to ask as a follow-up question. Note that we are now already talking about risk reduction and improving the odds of getting a correct answer!
So, overall, after these 2 weeks of AI use, I have mixed feelings about using it. It can be very convenient to get a quick and concise answer, but this convenience comes at a price, and I am not sure I am willing to pay it compared to the “old fashioned” way of digging for answers myself. And this still leaves aside the impact of AI on the environment, which is a huge misallocation of resources imho and in itself could already be a very good reason to refrain from using AI!
Despite the fact that I was aware of AI’s limitations going in, they are still much bigger than I had expected after using it myself. Given the money that has been spent on this technology so far, it is almost unbelievable that the quality is so poor and that everyone is so excited about AI’s prospects!
I can not ask Chat GPT or any other LLM a question and get an answer.
What I can do is issue a prompt and get a response.
This is a vital distinction.
Regarding the double-checking part: what is the incentive to double check? Everyone is pushed to deliver faster using AI, so everyone will go for speed. No one will check correctness when AI is there to take the blame.
On a side note, the amount of crap code generated that just works for a specific condition is huge. I have a sneaking suspicion the LLM gives hundreds of lines of code when 10 would do, to impress people.
Fine.
But is art separate? I mean I know AI gives fake answers. I personally would never ask it stuff.
I’m gonna miss the sweet Michael Hudson/Action Movie mashups and my Huey Long Debt Jubilee images.
🪦
As a university psychology lecturer, I can tell you virtually all of our students are using GenAI. The challenge we face is how best to guide them in using it in ways that maintain academic integrity and, crucially, aid rather than retard the development of knowledge and skills, especially in areas of critical analysis. Burying our heads in the sand and exhorting them not to use it, or trying to ban it entirely, are not feasible. The Higher Education Authority in Ireland just published a report on this:
https://www.teachingandlearning.ie/2025/09/17/generative-ai-in-higher-education-teaching-and-learning-sectoral-perspectives/
Also, some comments here display a troubling lack of understanding and experience about how AI can actually be a really useful time saving tool. For example, I recently had to prepare a script for a 5 minute video presentation on a module I’m teaching. That’s quite a time-consuming task when I lack experience in script writing. Instead, I jotted down some thoughts on the key points I wanted to make to potential students and GenAI produced quite a good draft script. Did I have to do more editing and check the output to make sure it didn’t make mistakes or hallucinate? Of course! But what would have taken me a couple of hours instead took 5 minutes, and frankly probably did a better job. There are many other examples. Many of us are quite aware of the limitations of various GenAI tools and are learning to work around those limitations.
Here’s Aurelien on this subject, posted here on NC a few months back:
Evidently, we didn’t have to wait ten years for lecturers to use AI to prepare their teaching materials. ;)