As long as enough people don't, we're golden. The glue-on-pizza and mushroom taste-testing incidents are the exploding pressure cookers of decades past: a risk that existed once but no longer does going forward. They have all been corrected, and yet they'll be talked about for years. Right now, LLMs lack training on enough legal data to provide specialized legal advice (only broad, general descriptions), and that's where they struggle -- for now.
While those particular instances may have been addressed, they're just the tip of the iceberg. There are certainly less egregious examples out there, unreported, and similarly dangerous incidents yet to come.
And more data isn’t necessarily the solution. There’s such a mass of legal cases and complexity in law that many cases never see publication, or even the inside of a courtroom.
I mentioned the ticking time bomb of decades of as-yet unlitigated clauses in oil & gas contracts. There's nothing to train an AI on because there's ZERO case law, just the unpublished opinions of O&G teachers & analysts. There are similar clauses lurking in other industries as well.
Several states still have laws preventing atheists from holding public office, despite those laws being presumptively unconstitutional. But they are so far untested. What would an AI tell an atheist considering running for mayor in such a state?
Hell, the last case I had in probate court involved a situation so rare that the judge had never seen it…but his clerk had. Not in a published case, but in her 30+ years of employment. There was no published case law.
She had to tell him how a previous judge had handled it.
This is quite easy to test, given that there is ample computing power available to measure the error rate by submitting synthetic questions. Not that anyone would be interested in running such a trial, unfortunately. There are also corrective measures: simply ask a second LLM to analyze and check the first LLM's output. It will catch most hallucinations (while possibly introducing others that the first could catch in turn).
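A minimal sketch of what that kind of trial could look like, assuming a placeholder `ask_model()` function standing in for whatever LLM API is actually used (the model names and prompts here are illustrative, not anything from this thread):

```python
# Sketch of the cross-check idea: generate synthetic questions with known
# answers, have model A answer them, then have model B look for errors in
# A's answer and grade it against the reference.

def ask_model(model: str, prompt: str) -> str:
    """Placeholder: send `prompt` to `model` and return its text reply."""
    raise NotImplementedError("wire this up to your LLM provider of choice")

def cross_check(question: str) -> tuple[str, str]:
    """Answer with model A, then ask model B to flag likely hallucinations."""
    answer = ask_model("model-a", f"Answer this question:\n{question}")
    critique = ask_model(
        "model-b",
        "Review the following answer for factual errors or invented "
        f"citations.\nQuestion: {question}\nAnswer: {answer}",
    )
    return answer, critique

def error_rate(questions_with_answers: list[tuple[str, str]]) -> float:
    """Crude error rate over synthetic questions whose answers are known."""
    wrong = 0
    for question, expected in questions_with_answers:
        answer, _ = cross_check(question)
        verdict = ask_model(
            "model-b",
            f"Does this answer agree with the reference?\n"
            f"Reference: {expected}\nAnswer: {answer}\nReply YES or NO.",
        )
        wrong += 0 if verdict.strip().upper().startswith("YES") else 1
    return wrong / len(questions_with_answers)
```

The catch, of course, is that the grader is itself an LLM, which is exactly the objection raised below.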
I will note that past AIs have had difficulty detecting and reporting their own errors, like miscounting the number of “r”s in “strawberries”. While that has been corrected, there’s a certain level of insanity in trusting technology known to be error-prone and to hallucinate to run diagnostics that detect errors and hallucinations.
With regard to non-specialized AI dispensing medical advice, do we have a rate of accidents linked to people misunderstanding medical advice found on Google? Accidents resulting from misunderstanding what an LLM dispenses are bad, but if the rate is the same as or lower than the alternative (people googling for health advice on random boards, which apparently two-thirds of Internet users do), then maybe it's a public health improvement over the current situation.
I don’t know that there’s been systematic research on the harm that “Doctor Google” does, just anecdotes.
But one anecdote I know of from discussing CME with my father was that many doctors blamed part of the overprescribing of antibiotics on patients demanding them based on “their research” and threatening to walk out if they didn’t get them. So (some) doctors would prescribe a short course of antibiotics along with whatever their affliction ACTUALLY demanded.
(Overprescribing antibiotics reduces the effective life of that particular antibiotic as resistance increases, as well as contributing to the rise of other antibiotic-resistant bacteria over time. The more we use them, the faster we lose them.)
What is true of an LLM from 2023 isn't true of an LLM from late 2024, and what was true in late 2024 might not be true by mid-2025. Forbidding LLMs from talking about a topic will disincentivize improvement in the field (a lawmaker will be loath to take the political risk of allowing something that was previously disallowed...), so there will be less research to improve the products. Acknowledging that they are imperfect chatbots and nothing more, at least until they can be proven fit for more serious use, is certainly the best way to go.
Not “talk about”, “advise”. Completely different standards.