Judge decides case based on AI-hallucinated case law

My trust in AI, such as it is, erodes further:




You love to see it.

I had this conversation with a coworker who has not yet realized that it's all a scam. Will it be there one day? Maybe. Is it there today?

No.
 







Purely anecdotal, so not provided as evidence of anything, but without prompting of any sort I was prescribed Penicillin for Chicken Pox.

Chicken Pox.
Basic medical note - it’s reasonably common for chickenpox spots to become secondarily infected with bacteria, and it’s uncommon (but deadly serious) to get chickenpox-associated pneumonia which would benefit from being treated with antibiotics (certainly in the absence of decent antivirals in the 1970s and before, say) and so under certain circumstances, even now, it’s reasonable to give antibiotics for chickenpox. It all depends on the clinical presentation.

And this is a basic point about the use of machine learning, especially LLMs, in diagnosis of illnesses. Your best tool as a doctor is pattern recognition intuition based on experience of verbal and nonverbal cues and information. A system that just manipulates words has absolutely no idea what a patient or a condition is and will make errors and misjudgements that are unpredictable because by definition it doesn’t have all the information and it never can. At best, it can support experienced clinicians to make judgments. So far it’s worse than useless for patients.
 

Sure. I don't know exactly how telehealth systems work, but if they are a drop-in replacement for a general practitioner appointment, the latter just sends you to a laboratory for most exams (like blood and urine samples), gets the results, and then discusses them with you. It is possible that some doctors (or some medical systems) have the GP doing those tests, but this hasn't been my experience. The only tests the GP has ever done are listening with a stethoscope and measuring blood pressure. Still, requiring a lab appointment defeats the point of telehealth, but if it can treat the roughly 90% of "little illnesses" like a cold or something, it might become helpful.
Again, from my experience - telehealth means a lot of things. It usually means synchronous remote communication (a phone call or video call), but it can also mean asynchronous communication (email, basically).

A phone or video consult - and I’ve done thousands of these in the last 5 years - is helpful for limited tasks and triage for a face to face appointment but is absolutely no replacement for a F2F. They happen because they’re more convenient for the patient and sometimes for the clinician. If all the patient needs is a sick note or medication review, great. If they need even anything as simple as “do I have a chest infection” then they need examination and thus a F2F.

In the UK, they’ve mostly been deployed en masse to deal with the massive rise in patient demand for appointments after Covid (patients now ask for about 10 consultations a year, rather than the roughly 5 a year that was typical before Covid and is typical right now in Canada), and I personally think they’ve been deployed in desperation, haphazardly, to meet the demand. I think we still don’t really know how to use them most effectively or appropriately.
 

More on the IMO Gold Medal. The two models that achieved gold aren't released to the public and many details are not known. However, the authors link to a really interesting article about a third AI demonstrating gold medal performance, which used Gemini 2.5 and a clever prompting and verification strategy.

I thought that was interesting for this audience, because it shows how different these prompts are from how many people are using LLMs. Their initial prompt is below (apologies, the format has some issues due to my copying from the PDF. See page 5).

### Core Instructions ###

* **Rigor is Paramount:** Your primary goal is to produce a complete and rigorously justified solution. Every step in your solution must be logically sound and clearly explained. A correct final answer derived from flawed or incomplete reasoning is considered a failure.
* **Honesty About Completeness:** If you cannot find a complete solution, you must **not** guess or create a solution that appears correct but contains hidden flaws or justification gaps. Instead, you should present only significant partial results that you can rigorously prove. A partial result is considered significant if it represents a substantial advancement toward a full solution. Examples include:
    * Proving a key lemma.
    * Fully resolving one or more cases within a logically sound case-based proof.
    * Establishing a critical property of the mathematical objects in the problem.
    * For an optimization problem, proving an upper or lower bound without proving that this bound is achievable.
* **Use TeX for All Mathematics:** All mathematical variables, expressions, and relations must be enclosed in TeX delimiters (e.g., `Let $n$ be an integer.`).

### Output Format ###

Your response MUST be structured into the following sections, in this exact order.

*1. Summary*

Provide a concise overview of your findings. This section must contain two parts:

* **a. Verdict:** State clearly whether you have found a complete solution or a partial solution.
    * **For a complete solution:** State the final answer, e.g., "I have successfully solved the problem. The final answer is ..."
    * **For a partial solution:** State the main rigorous conclusion(s) you were able to prove, e.g., "I have not found a complete solution, but I have rigorously proven that ..."
* **b. Method Sketch:** Present a high-level, conceptual outline of your solution. This sketch should allow an expert to understand the logical flow of your argument without reading the full detail. It should include:
    * A narrative of your overall strategy.
    * The full and precise mathematical statements of any key lemmas or major intermediate results.
    * If applicable, describe any key constructions or case splits that form the backbone of your argument.

*2. Detailed Solution*

Present the full, step-by-step mathematical proof. Each step must be logically justified and clearly explained. The level of detail should be sufficient for an expert to verify the correctness of your reasoning without needing to fill in any gaps. This section must contain ONLY the complete, rigorous proof, free of any internal commentary, alternative approaches, or failed attempts.

### Self-Correction Instruction ###

Before finalizing your output, carefully review your "Method Sketch" and "Detailed Solution" to ensure they are clean, rigorous, and strictly adhere to all instructions provided above. Verify that every statement contributes directly to the final, coherent mathematical argument.
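
Because the prompt demands a fixed output layout, a response can be given a cheap mechanical sanity check before anyone reads the maths. The Python sketch below only illustrates that idea. It is not the verification strategy from the article, which is not reproduced here. The function name `check_structure` and the regex patterns are hypothetical, and how a model actually renders the section labels (bold, italics, spacing) is an assumption.

```python
import re

# Labels taken from the prompt's "Output Format" section. The exact
# rendering a model would use is an assumption, so the patterns only
# look for the label text and ignore surrounding asterisks.
REQUIRED_IN_ORDER = [
    ("summary", r"1\.\s*Summary"),
    ("verdict", r"a\.\s*Verdict"),
    ("method_sketch", r"b\.\s*Method\s+Sketch"),
    ("detailed_solution", r"2\.\s*Detailed\s+Solution"),
]


def check_structure(response: str) -> list[str]:
    """Return a list of problems; an empty list means the layout looks right."""
    problems = []
    position = 0  # each label is searched for only after the previous match
    for name, pattern in REQUIRED_IN_ORDER:
        match = re.search(pattern, response[position:], flags=re.IGNORECASE)
        if match is None:
            problems.append(f"missing or out of order: {name}")
        else:
            position += match.end()
    return problems


if __name__ == "__main__":
    sample = (
        "**1. Summary**\n"
        "**a. Verdict:** I have found a partial solution.\n"
        "**b. Method Sketch:** Induction on $n$.\n"
        "**2. Detailed Solution**\n"
        "Let $n$ be an integer.\n"
    )
    print(check_structure(sample))
```

A check like this says nothing about whether the proof is correct. It only catches responses that ignored the requested layout.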
 

Interesting as well as mildly nostalgic for my late teens when I got through to FIST 1 (Final International Selection Test, the third round of the British Maths Olympiad that feeds into the IMO) when I was 16. My older brother got through to FIST 2 and someone from our school (in the year between my brother and me) was an IMO gold medal winner in his year.

That said, AFAIK IMO questions are a mixture of maths and pattern recognition and so exactly the kind of thing I'd expect machine learning to be good at, so this is hardly earth-shattering. I'm surprised they hadn't managed this earlier.
 
