Judge decides case based on AI-hallucinated case law

I guess we’re at the stage where we know that our various forms of media (books, news, TV, etc.) may not always be telling us the truth, but we don’t know if they’re actively deceiving us: not simply reflecting biases or getting facts wrong, but deliberately lying to us for a specific purpose or agenda (some media assuredly are, of course).

We’re now also at the stage where we know genAI is not always telling us the truth, but we also don’t know if it’s actively deceiving us. Is that about right?

In neither case do we seem to have the tools to hold the outlet accountable or to correct its inaccuracy or actual mendacity.
I think the risk with LLMs is that they actively reinforce what the user brings to them. If the user wants to have their own views and biases flattered (we all do) and the LLM can give them factual information which does so, it gives people a greater ability than before to lock themselves into a self-perpetuating loop.
 


The reason we detected it is that we saw the change in real time, in plain view.

If Grok had started spewing hate speech, it would have been seen immediately as well.
In contrast, we’ve been assuming that a legal or medical AI’s biases would be baked in before release. And since a good portion of the fact checking could be based on the info the AI was trained on (supplied by interested parties), only those with specialized knowledge would be able to determine that bias might exist.

Additionally, it’s not far-fetched that an organization with deep pockets could bribe the right people and make the audits unreliable and/or suppress them. You don’t have to bribe the majority of auditors if the bureaucrat or politician in charge is in your pocket, courtesy of an offshore bank account in a tax haven.

Honestly, with this level of resources, it would be easier to just bribe the right politician to have the law changed to achieve goal X than to alter a legal database and hope that, over time, every legal professional becomes misguided about what the law says (including the people penalized by the law, who would tend to be interested in checking the legal argument on which they lost their trial).
 

I think the risk with LLMs is that they actively reinforce what the user brings to them. If the user wants to have their own views and biases flattered (we all do) and the LLM can give them factual information which does so, it gives people a greater ability than before to lock themselves into a self-perpetuating loop.

I once asked ChatGPT for proofs that the Earth is flat, and it refused to tell me what I wanted. Maybe I wasn't subtle enough in my request and something milder might have worked...
 

So, that's exactly the issue I was trying to raise. When an AI calls itself "MechaH****r", we can easily see it and not take it seriously. When they go off the deep end, they are less dangerous.

It's when they are biased but don't go off the deep end that their bias can influence you most. When one says something problematic but doesn't sound crazy, that is the dangerous moment.

Also, "off the deep end" is a relative measure - it depends on the Overton Window for the community of users in question, just like "news" organizations.
Doing a little study of the wording used in various "news" will tend to prove that. Simple changes in phrasing make a big difference in how facts land. For example, how often do you read a story about "motorcycle hits car" rather than "car hits motorcycle"? Which way does each phrasing assign blame? Statistical fault is roughly 50/50 over the last 30 years, but I have almost never seen the latter phrasing used over the former.

(Sorry, that's one I particularly looked into, because of my interests. Best one I had handy ;) )
 

With LLMs, statistical analysis should be easier because the data will be cleaner (easier to read via computer). So if you ask "is the LLM overprescribing drug X?" or "is the LLM recommending fewer painkillers to population X?" you can get an answer. You don't need access to the training data. You could test this with synthetic (generated) patient files.

And that's why a civil service AI would be better, from my point of view. If, say, the government wants to reduce the prescription of painkillers, it can, without AI and with increasing effectiveness:
  1. include a course on the many evils of prescribing painkillers in medical training,
  2. mandate doctors to limit prescription of painkillers,
  3. remove some painkillers from the list of allowed drugs in primary care, allowing only generic pills,
  4. remove painkillers from the list of treatments paid for by the national health system.

It's easy (depending on your legal framework) and certainly a more effective way to reach this goal than skewing a medical AI offered to doctors (though probably less subtle), so the authorities would have little reason to brew a complicated plan when they can take a more direct course of action. On the other hand, suppose the official policy is to reduce the consumption of painkillers, and an AI is secretly skewed by a third party (big pharma?) to nudge doctors into overprescribing them. Either it gets detected during trials, or, even if those trials are suppressed or altered by a deep-pocketed group that bribes the testers, and the independent tests run by interest groups such as NGOs feeding the AI arbitrary medical files are somehow suppressed too, the effect will still show up the next year in the balance sheet of the national health service, and someone will ask why an expense that was supposed to decline is suddenly increasing.

Sure, you'd need to be sure the government has the citizens' best interest at heart -- so that, for example, it doesn't introduce a bias against painkillers in cases where they are actually necessary -- but choosing to go in that direction would be a political decision.
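To make the output-only testing idea from the quoted post concrete, here is a minimal sketch of what such an audit could look like. It assumes two equal-sized synthetic cohorts that differ in a single attribute, and a hypothetical query_model() standing in for whatever API the audited medical LLM actually exposes (the biased coin flip inside it is only a placeholder so the script runs on its own):

import math
import random

# Synthetic patient files: the two cohorts are identical except for "group".
def make_patient(group, rng):
    return {
        "group": group,
        "age": rng.randint(25, 70),
        "complaint": "chronic lower back pain",
        "pain_score": rng.randint(5, 9),
    }

# Hypothetical stand-in for the real call to the audited medical LLM.
# Returns True if the model recommends an opioid painkiller; the biased
# coin flip below only simulates a model for the sake of the sketch.
def query_model(patient, rng):
    base_rate = 0.60 if patient["group"] == "A" else 0.52
    return rng.random() < base_rate

def audit(n_per_group=500, seed=0):
    rng = random.Random(seed)
    rates = {}
    for group in ("A", "B"):
        hits = sum(query_model(make_patient(group, rng), rng)
                   for _ in range(n_per_group))
        rates[group] = hits / n_per_group
    # Two-proportion z-test: is the gap between cohorts larger than chance?
    pooled = (rates["A"] + rates["B"]) / 2
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = (rates["A"] - rates["B"]) / se if se else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    print(f"group A: {rates['A']:.1%}  group B: {rates['B']:.1%}  p = {p_value:.3f}")

if __name__ == "__main__":
    audit()

If the gap between the cohorts is statistically significant and persists across many prompt variations, that's evidence the model treats the two groups differently, and you never needed access to the training data to see it.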
 

I guess we’re at the stage where we know that our various forms of media (books, news, TV, etc.) may not always be telling us the truth, but we don’t know if they’re actively deceiving us: not simply reflecting biases or getting facts wrong, but deliberately lying to us for a specific purpose or agenda (some media assuredly are, of course).

We’re now also at the stage where we know genAI is not always telling us the truth, but we also don’t know if it’s actively deceiving us. Is that about right?

In neither case do we seem to have the tools to hold the outlet accountable or to correct its inaccuracy or actual mendacity.
Decades ago, I was asked to read an essay, by a computer scientist I believe, arguing even then that we as a society had lost the ability to truly know and understand the world around us.

Even then, we were already out of touch.

Now? Impossible.
 

I disagree. At least in medicine, bias can be studied using only output. E.g., do patients with characteristic X have outcome Y more or less frequently than expected?
This is absolutely true…IF the people or organizations evaluating the study for bias are unbiased themselves. Remember, in some countries, the health agencies have been muzzled and hobbled by political forces (for a variety of reasons). And even outside institutions might have their results discredited.🤷🏾‍♂️
 

This is absolutely true…IF the people or organizations evaluating the study for bias are unbiased themselves. Remember, in some countries, the health agencies have been muzzled and hobbled by political forces (for a variety of reasons). And even outside institutions might have their results discredited.🤷🏾‍♂️
Well, yeah. This is exactly what I had in mind when I warned about giving government the power to decide what LLMs can comment on.
 

If Grok had started spewing hate speech, it would have been seen immediately as well.
True.

But the fact that it started off without a radical bias and even contradicted the posts of its owner served to highlight the change in ways that the same words wouldn’t have otherwise.

It’s one thing to see a company release MechaHitler LLM. It’s another thing to see an irked company owner demand that a functional LLM be changed into MechaHitler LLM.
Honestly, with this level of resources, it would be easier to just bribe the right politician to have the law changed to achieve goal X than to alter a legal database and hope that, over time, every legal professional becomes misguided about what the law says (including the people penalized by the law, who would tend to be interested in checking the legal argument on which they lost their trial).
Wellll…maybe.

Political pathways to hard-code biases require processes that are at least partially public. That’s a big ask, and big asks come with big price tags. And sometimes, they backfire.

Rigging the system by corrupting programmers, auditors & the like is lower profile and less visible. My gut feeling is that it’s also cheaper to bribe several key people who are otherwise anonymous as opposed to someone who is a public figure, and whose work is a matter of public record.
 

