ChatGPT lies then gaslights reporter with fake transcript



If the video is to be trusted, he will continue to use it. The ending pitch is "AI will change our lives and workplace, but we've got to be careful". So, taken at face value, it will change our lives, which means it will be widely used in the future, and we just need to be careful about its use. I could have said the same thing about cars in 1925.
Yeah, the reporter doesn't seem to be saying "AI is bad, don't use it!" so much as "AI has problems, be careful when using it", which... yeah, I'm not sure why anybody would have an issue with that statement, and some people really do need the reminder.
 

But, your positive experiences do not establish that the operation is actually worth the economic, social, and ecological costs of the data centers required to support the technology, or the loss of value of materials that should have been protected by copyright.

Nor does it address whether the relationship between the large corporations and the general public is... healthy, or manipulative and abusive.
This is all obviously true; the costs and social benefits are distinct from whether I find it useful.

In the end, genAI looks suspiciously like large companies externalizing large amounts of cost, while reaping large amounts of revenue, for questionable overall results on the large scale.
But, you go on to finish this post with a large leap that has little supporting evidence. I get you don't see much benefit to it. But the transformer architecture has led to real, measurable impacts on tasks that are easy to measure--protein folding, translation, or mathematics, for example. So the idea that its application to chatbots in particular yields little value seems questionable to me, especially in light of my (subjective) good experience.

I don't think there is the kind of gold standard study you'd need to prove it with data either way. And once you get into a social analysis then values play a large role.
 

But do we in general keep bringing up the fact that the human artist drew terrible hands?

Depends, do we expect the human artist to also provide material advice regarding finance at a global level, medical recommendations, or mental health counsel, or do we just shrug it off because it's 'just art'?
 

Sure, and an LLM would certainly fail at some point, but packaged AI solutions are promising to do so. I'd expect them not to rely only on LLM training, especially when working on professional, potentially sensitive data. Or even on public, readily available data. If you check a legal reference with ChatGPT, it might give you the correct answer and state an invalid reference, then when asked for backup, perform the search and come up with "sorry, this isn't article 2248-30, it is article 1749-51 of this other code." Or "I couldn't find a reference for the statement I made." Which is usually enough to identify a hallucination.
IME, prompting with some variant of "are you sure about that?" identifies a significant number of hallucinations.
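
For anyone who wants to see what that looks like in practice, here's a minimal sketch of the same two-turn pattern against the OpenAI chat API. The model name and the example question are just placeholders of mine, not a claim about any particular product:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4o"   # placeholder; any chat-capable model works the same way

# First turn: ask the factual question (example question, not from this thread).
messages = [{"role": "user", "content": "Which article of the French Code civil covers hidden defects in a sale?"}]
first = client.chat.completions.create(model=MODEL, messages=messages)
answer = first.choices[0].message.content
print("Answer:", answer)

# Second turn: push back and ask the model to double-check its own citation.
messages.append({"role": "assistant", "content": answer})
messages.append({"role": "user", "content": "Are you sure about that reference? Please double-check it."})
second = client.chat.completions.create(model=MODEL, messages=messages)
print("Self-check:", second.choices[0].message.content)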
 

Yeah, the reporter doesn't seem to be saying "AI is bad, don't use it!" so much as "AI has problems, be careful when using it", which... yeah, I'm not sure why anybody would have an issue with that statement, and some people really do need the reminder.

I am pretty sure there is no problem with the message. Most of the criticism was aimed at the way it is presented:
  • A theatrical presentation, as if the reporter had discovered something new -- he hasn't; he's reporting on a well-known mechanism.
  • The use of terms that somehow made the AI sound sentient -- we're not there yet, by a wide margin.
  • The lack of data on how often such problems arise, and of information or advice on how to react when they do.

It's like a report on a slothful intern to warn that coworkers aren't always 100% truthful. There is little problem with the message, but since the message isn't new, it's the execution that gets evaluated.
 

But, you go on to finish this post with a large leap that has little supporting evidence. I get you don't see much benefit to it. But the transformer architecture has led to real, measurable impacts on tasks that are easy to measure--protein folding, translation, or mathematics, for example. So the idea that its application to chatbots in particular yields little value seems questionable to me, especially in light of my (subjective) good experience.
These practical uses don't require the massive money burning that more "commercial" models do.
 

Depends, do we expect the human artist to also provide material advice regarding finance at a global level, medical recommendations, or mental health counsel, or do we just shrug it off because it's 'just art'?
Any human who screws up will get it brought up again and again, just like we are doing with AI, but I'm talking about the limited case of a human artist versus generated images.
 

IME, prompting with some variant of "are you sure about that?" identifies a significant number of hallucinations.
I've heard this before and it always strikes me as odd. I'm genuinely curious: if asking, "Are you sure?" identifies most hallucinations, why isn't "Are you sure?" functionality being used under the hood to proactively stop hallucinations from happening?
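
For what it's worth, nothing stops you from wiring that check in yourself. Here is a rough sketch of the idea in Python; the function names and prompt wording are my own invention, not anything a vendor actually ships:

from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # placeholder model name

def ask(prompt: str) -> str:
    # One self-contained chat turn; returns the model's text reply.
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_with_self_check(question: str) -> tuple[str, str]:
    # Get a draft answer, then feed it back with an "are you sure?"-style challenge.
    draft = ask(question)
    critique = ask(
        "You previously answered the question below. Re-check the answer and state "
        "whether each factual claim and citation in it is correct, flagging anything "
        "you cannot verify.\n\n"
        f"Question: {question}\n\nAnswer: {draft}"
    )
    return draft, critique

draft, critique = ask_with_self_check("Which article of the French Code civil covers hidden defects in a sale?")
print(draft)
print(critique)

The obvious catch is that it doubles the calls, and therefore the latency and token cost, for every query, and the second pass can itself be wrong -- which may be part of why it isn't simply baked in everywhere.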
 

The PMI (Project Management Institute, the most widely accepted authority on project management techniques) notes that about 80% of enterprise genAI projects fail*. The two basic reasons for failure are 1) the project does not deliver the expected value, and 2) in effect, the customer was sold a solution looking for a problem, rather than starting with a real problem that the customer knew needed a solution.

Several studies, in both the prose/technical writing and code-writing domains, looked beyond focused task completion and found that overall productivity fell when genAI was included as a major tool. In essence, any improvement seen in completing one task is overwhelmed by the effort needed to correct the errors genAI introduces downstream of that task.
I am reminded of Goodhart's law - "When a measure becomes a target, it ceases to be a good measure" - and with AI, I think it now applies to measures like the classic Turing Test. LLMs give me the impression of technology that was designed specifically to do things like pass the Turing Test, write book reports, solve single-task coding puzzles, and accomplish other "AI" things that can be persuasive on a surface level.

But anytime I hear from a professional software engineer, the most they get from AI is that it has replaced StackOverflow and Google for looking up obscure libraries or writing boilerplate code blocks -- which is exactly that focused single-task completion. Beyond that, its usefulness falls off pretty quickly.

What worries me the most is the amount of faith AI companies and AI boosters want people to have in things like vibe coding, because AI in the hands of a good software engineer can be a time-saving force multiplier, but AI in the hands of a mediocre software engineer is a force multiplier for mediocre code and bad software engineering decisions. And in 2025, bad software engineering decisions are highly likely to become bad software security decisions, which makes a bunch of software less safe -- a scary thought, and extra headaches for me as an infosec professional. Most of us in the IT security field are already fighting more fires than we have the time, energy, or workforce to handle.
 
