I've heard this before and it always strikes me as odd. I'm genuinely curious: if asking, "Are you sure?" identifies most hallucinations, why isn't "Are you sure?" functionality being used under the hood to proactively stop hallucinations from happening?
Maybe that's just the difference between a commercial AI application sold at €300/user/month and the $20 solution for the general public.

Also, it might be that answering the second question consumes as many resources as answering the first, under the hood. That would push the entry price to a level many casual users wouldn't be ready to pay just to try it. I feel there are more people who can spend 20 USD a month on a tool that works 95 times out of 100 for their use case, fails 4 times in 100, and fumbles critically 1 in 100, than people who would pay 100 USD a month for a tool that works 99 times out of 100, fails critically 0.2 times in 100, and fails in a minor way 0.8 times.
Honestly, while asking can work, it's not always enough: I have seen ChatGPT double down on a wrong answer about the ir'Wynarn family line in the Eberron setting. There is no official answer, fan sites give conflicting information, and it hallucinated an answer and kept repeating it until asked, "OK, show your sources."