And this is too cute. It relies on the assumption that people just kind of blindly follow what the LLMs tell them. But the recent case shows people don't. They just point and laugh, the same dynamic that has occurred with print and news media for years.
I don't think you're being very realistic, and your own comment re: "news and print media" should point that out to you.
Print and news media have absolutely lied to people's faces and presented insane warpings of the facts, but the reality is, in any given country, a huge number of people believe the insane warpings of the facts.
Sure, a lot of intelligent people do point and laugh but they're clearly a minority of the electorate.
Some smug people like us on the internet "detecting bias" isn't good enough when the vast majority of people using these things are pretty uncritically seeing them as "one source of the truth", and consulting them on stuff they really shouldn't. Countless people absolutely blindly follow what LLMs tell them - just not snarky people on BlueSky/Twitter.
You can say "Well it's no different to biased news!", but yeah, it is. Because the difference is that biased news is a one-way vector - it's not very responsive. It often fails because it's not flexing to what people actually want to know. Whereas if you have a system for answering people's questions, and just answer with lying or biased information, that's insanely more powerful for manipulating people. And it would be much less of a problem if there were, say, six major LLMs with different backgrounds and ownership giving completely different answers but... that's not the case. All the major Western LLMs are based in one part of one country and all owned by people who have a spectrum of beliefs that are technically varied but still pretty narrow (and at odds with society in general).
With Grok it was a first and very clumsy attempt. The black box nature of LLMs does cut both ways a bit. But this was a big upgrade in intentional bias from the previous simple prompt-injection attempts at biasing them. It just went too far in the Hitlerian direction (very literally) so it became obvious. That's because Elon Musk is an absolute dim bulb when it comes to code who demands things be done immediately, rather than waiting until the tech is ready. I don't expect intentional biasing of other LLMs to be remotely as clumsy, and we know from oblique comments that it's being worked on.
Now, I will say, this isn't some "ALL IS LOST - FLEE TO THE FORESTS" deal. LLMs may well lose popularity. Competing ones may emerge. The whole technology may get superseded by better approaches to AI. The bubble may well burst (indeed, things are not looking great for OpenAI), because the whole thing is so expensive. Climate change may just make them unworkable, as things get worse and governments cut them off from the power grid. And the attempts at biasing them may never work properly because of the way LLMs work.
But the reality remains, no matter how "cute" you think it is, a major reason LLMs are being pushed so hard is that they represent a way to control information being fed to people. And people who do rely on them (a percentage of the population which is only going to increase given how many young people do) are going to be extremely easy to control, because they tend not to seek out their own information, or even news or the like, just wanting the LLM to explain stuff to them. Even if you only get to control what, say, 10-15% of people think, that's a crazy amount of people. And again, this isn't theory/conspiracy theory - not only has it already been done, but a number of wealthy techbros have expressed public approval of the idea of controlling what LLMs think.
I'm confused. Isn't this thread about a judge making a legal decision based on the work of a lawyer who used LLMs? So not everyone is pointing and laughing.
Exactly. It's very easy to say "Elon Musk said he'd improved Grok and then Grok said Hitler was cool and also it was MechaHitler, look, he biased it!", but when we're seeing actual judges taken in by entirely fictional arrays of cases (honestly the judge in question should be disbarred or at least er... disjudged?), we clearly have a problem beyond flippant "Omg its ez 2 tell" responses.