AI/LLMs Judge decides case based on AI-hallucinated case law

The Firebird · Jul 10, 2025

BookTenTiger said:
I'm confused. Isn't this thread about a judge making a legal decision based on the work of a lawyer who used LLMs? So not everyone is pointing and laughing.

"Recent case" refers to the Grok meltdown, not the legal case.

Ruin Explorer · Jul 10, 2025

The Firebird said:
And this is too cute. It relies on the assumption that people just kind of blindly follow what the LLMs tell them. But the recent case shows people don't. They just point and laugh, the same dynamic that has occurred with print and news media for years.

I don't think you're being very realistic, and your own comment re: "news and print media" should point that out to you.

Print and news media have absolutely lied to people's faces and presented insane warpings of the facts, but the reality is, in any given country, a huge number of people believe the insane warpings of the facts.

Sure, a lot of intelligent people do point and laugh but they're clearly a minority of the electorate.

Some smug people like us on the internet "detecting bias" isn't good enough when the vast majority of people using these things are pretty uncritically seeing them as "one source of the truth", and consulting them on stuff they really shouldn't. Countless people absolutely blindly follow what LLMs tell them - just not snarky people on BlueSky/Twitter.

You can say "Well it's no different to biased news!", but yeah, it is. Because the difference is that biased news is a one-way vector - it's not very responsive. It often fails because it's not flexing to what people actually want to know. Whereas if you have a system for answering people's questions, and just answer with lying or biased information, that's insanely more powerful for manipulating people. And it would be much less of a problem if there were, say, six major LLMs with different backgrounds and ownership giving completely different answers but... that's not the case. All the major Western LLMs are based in one part of one country and all owned by people who have a spectrum of beliefs that are technically varied but still pretty narrow (and at odds with society in general).

With Grok it was a first and very clumsy attempt. The black box nature of LLMs does cut both ways a bit. But this was a big upgrade in intentional bias from the previous simple prompt-injection attempts at biasing them. It just went too far in the Hitlerian direction (very literally) so became obvious. That's because Elon Musk is an absolute dim bulb when it comes to code who demands things be done immediately, rather than waiting until the tech is ready. I don't expect intentional biasing of other LLMs to be remotely as clumsy, and we know from oblique comments that it's being worked on.

Now, I will say, this isn't some "ALL IS LOST - FLEE TO THE FORESTS" deal. LLMs may well lose popularity. Competing ones may emerge. The whole technology may get superseded by better approaches to AI. The bubble may well burst (indeed, things are not looking great for OpenAI), because the whole thing is so expensive. Climate change may just make them unworkable, as things get worse and governments cut them off from the power grid. And the attempts at biasing them may never work properly because of the way LLMs work.

But the reality remains, no matter how "cute" you think it is, a major reason LLMs are being pushed so hard is that they represent a way to control information being fed to people. And people who do rely on them (a percentage of the population which is only going to increase given how many young people do) are going to be extremely easy to control, because they tend not to seek out their own information, or even news or the like, just wanting the LLM to explain stuff to them. Even if you only get to control what, say, 10-15% of people think, that's a crazy amount of people. And again, this isn't theory/conspiracy theory - not only has it already been done, but a number of wealthy techbros have expressed public approval of the idea of controlling what LLMs think.

BookTenTiger said:
I'm confused. Isn't this thread about a judge making a legal decision based on the work of a lawyer who used LLMs? So not everyone is pointing and laughing.

Exactly. It's very easy to say "Elon Musk said he'd improved Grok and then Grok said Hitler was cool and also it was MechaHitler, look, he biased it!", but when we're seeing actual judges taken in by entirely fictional arrays of cases (honestly the judge in question should be disbarred or at least er... disjudged?), we clearly have a problem beyond flippant "Omg its ez 2 tell" responses.

BookTenTiger · Jul 10, 2025

The Firebird said:
"Recent case" refers to the Grok meltdown, not the legal case.

The legal case is also a recent case, and is literally a case.

It's silly to point to two recent situations, one in which a judge was tricked and one in which a famously petty man messed with his own LLM, and then say "nobody takes LLMs seriously!"

The Firebird · Jul 10, 2025

BookTenTiger said:
The legal case is also a recent case, and is literally a case.

It's silly to point to two recent situations, one in which a judge was tricked and one in which a famously petty man messed with his own LLM, and then say "nobody takes LLMs seriously!"

That was not what I said.

Ruin Explorer · Jul 10, 2025

The Firebird said:
That was not what I said.

It kind of is though.

The Firebird said:
It relies on the assumption that people just kind of blindly follow what the LLMs tell them. But the recent experience with Grok shows people don't.

People clearly do blindly believe information that has come from LLMs. Directly or indirectly. Maybe it's not a huge percentage of people yet, maybe it never will be more than low double-digits. Probably techbros who think they can mind-control everyone are idiots. But that is part of the reason LLMs are being pushed, and this case with the judge shows that it can cause real-world problems, and that people knowingly use it for wrongdoing (don't believe for a second the lawyer didn't know these cases were made up - even as a legal researcher I got to the point where I knew certain cases would come up with certain legal topics).

The Firebird · Jul 10, 2025

I don't think I have much to add that hasn't been stated already. Just in brief:

Ruin Explorer said:
I don't think you're being very realistic, and your own comment re: "news and print media" should point that out to you.

Agree w/rt your comments on news media. I do not believe LLMs are categorically different.

Ruin Explorer said:
Countless people absolutely blindly follow what LLMs tell them - just not snarky people on BlueSky/Twitter.

Disagree. There were some good posts previously about adoption of, and skepticism toward, new technology. I'll bet MechaHitler helps in the sense that it makes accuracy concerns more salient.

Ruin Explorer said:
It kind of is though.

The point was "if LLMs go off the deep end w/rt bias, no one will take them seriously". Prestigious news site A can be 1) taken seriously 2) factually accurate 3) biased in what it chooses to report on and 4) required to maintain (2) to get (1). LLMs are confronted with the same, but the challenge is greater because their scope is broader. Anti-semitic rants or not, if you can't recommend a good dishwasher people will stop using your product.

Jfdlsjfd · Jul 10, 2025

I am pretty sure the judge wasn't "tricked", as in "he checked and still thought the bogus cases were real". I won't comment on his job, since maybe it's OK to roll a die to determine who wins in his jurisdiction, but barring that, if his job was actually to decide the case based on what the parties provide, he obviously didn't do it. He wasn't tricked by AI. He neglected to do the job of weighing the arguments at the trial. The defendant's lawyer wasn't tricked at all: he saw the bogus case and appealed. The appeal judge wasn't tricked either.

I have trouble thinking anyone was tricked. The lawyer who made the initial filing was probably aware of it (admittedly, maybe he was tricked, but it would be moronic to cite cases whose content you don't know and that you have never heard of), but neither the judge nor the defendant's lawyer was. And the US judicial system worked: the problem was detected in the working of the judge, and the bad lawyer was fined. It's a political decision whether a fine is sufficient or harsher measures are necessary (disbarment, dismemberment...) to prevent lawyers from inventing cases, whether by using AI or their own imagination, but yes, the end result was that the bad lawyer is pointed at and laughed at (and 5,000 USD poorer).

Some have said that maybe some cases fly under the radar and are decided without anyone noticing anything wrong (which would mean that none of the involved professionals bothered to check, but OK, let's assume that the system itself is broken). That's quite easy to verify: take a statistically significant number of cases in a jurisdiction and check all the precedents cited for existence against a reputable database. And then see if increased safety measures should be implemented (or don't have judges rely on precedent for rulings, but that's a larger-scale solution to implement).

And if you're in the unfortunate situation where the other party's lawyer, the judge and your own lawyer all conspire against your interests, you wouldn't have gotten justice anyway.

Ruin Explorer · Jul 10, 2025

The Firebird said:
The point was "if LLMs go off the deep end w/rt bias, no one will take them seriously". Prestigious news site A can be 1) taken seriously 2) factually accurate 3) biased in what it chooses to report on and 4) required to maintain (2) to get (1). LLMs are confronted with the same. Anti-semitic rants or not, if you can't recommend a good dishwasher people will stop using your product.

That's exactly the issue.

With enough practice and effort, they believe (BELIEVE, note, it's not yet proven, but it is proven that they are trying) that they can bias LLMs to specific viewpoints without impairing their ability to recommend a dishwasher. Already LLMs are significantly biased just based on the sources they've been fed, but it's sufficiently similar to media bias that we don't talk about it much (and also, because they just took everything that wasn't nailed down, it's broad-based). Also, we've seen that, for a lot of media sources, you can achieve and (for now) maintain (3) without (2). The NYT has proven this repeatedly. The last three-to-five years have seen the NYT involved in wildly inaccurate and obviously biased reporting countless times (far more than, say, the entire thirty years before that), including repeating outright lies and fictions as if they were researched facts. Maybe people will stop believing them eventually but it doesn't seem to have happened. BBC News has been even worse (and bad for nearly twenty years now), but continues to be treated as "trustworthy", more because it's sort of a meme than a fact.

And a distrust of media, which is, I admit, setting in (and has been for a couple of decades) just leaves a vacuum - a vacuum that podcasters, TikTokers, and LLMs are absolutely filling - and audiences are showing they aren't particularly more interested in truth than nonsense (Joe Rogan would not be the most popular podcaster in the US if people didn't love absolute and total nonsense - he's the direct podcast equivalent of your friend-of-a-friend's pothead older brother who told you Mayans invented cellphones).

jian · Jul 10, 2025

The whole distrust of media thing is important for this conversation because we’re becoming more aware of how biased most news media are. They may be biases we can live with - the BBC is pro-UK-establishment, Al-Jazeera is pro-Qatari-establishment - but they’re there, and in some cases there’s more bias than news. So there’s no real objective trusted source for truth or what happened, and we should be even more wary of genAI, which draws from those sources and can be even more warped and biased depending on the designer’s intent.

It also made me think about a recent minor scandal in autobiographies - it turns out The Salt Path, a series of life-affirming autobiographies about a couple walking about England because they don’t have any money, is written by a pair of fraudsters - and about the response from publishers, which was that they basically don’t fact-check books and don’t consider that to be their job, caveat emptor. Or about the estimate that about 50% of artworks for sale on the international market are fake (as in, not by the creator they purport to be by) and the common practice of museums and galleries of unofficially displaying copies of major artworks to preserve the originals. What can we trust to be real? What sort of authenticity are we paying for?

Paul Farquhar · Jul 10, 2025

The Firebird said:
This doesn't follow. Grok being a black box did not stop us from detecting its bias

That’s because it was given its orders by someone with the subtlety of a ten megaton nuke. But the next version will be better.

Not that there won’t be plenty of people who consider MechaHitler a true prophet.