I think his error was using a general LLM (trained on a wide variety of sources so it sounds statistically plausible on most topics) rather than a model dedicated to law. Diagnostic AIs do perform better than unaided humans, as evidenced earlier, but that doesn't mean you can chat with ChatGPT about how you feel and get a diagnosis.
Also (and this is a big one): medical-assistance AIs are generally not just LLMs trained on medical texts; they're typically purpose-built models, e.g. classifiers trained and validated on clinical imaging or lab data.
"A.I." is a broad category. Generative AI is only corner of the overall space.