That's a great encapsulation. Additionally, there's an expectation that computers act the way we've always expected them to: they don't 'get the math wrong'. With LLMs, that expectation should really be closer to how we'd think of a person; they do in fact misremember, misunderstand, and get the math wrong.
I've definitely found that I shouldn't count on the LLM to get the rules right, even when given, e.g., the PDF, so I try to either check myself or ask the kinds of questions that might surface an "oh, wait a minute..." from it.
When you're using the ChatGPT (or Claude or Gemini) app, you're bound to its behavior, but if you build your own you have a lot more avenues for helping it along with specific prompting, tools, and context sources (although fair warning, it gets complicated fast).
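For a sense of what "build your own" means at its simplest, here's a minimal sketch assuming the OpenAI Python SDK; the model name, prompts, and rules excerpt are all illustrative placeholders, and the same shape works with Anthropic's or Google's SDKs:

```python
# Minimal "build your own" loop: you decide exactly what the model sees
# each turn, instead of taking the app's defaults. Assumes the OpenAI
# Python SDK (pip install openai) and an OPENAI_API_KEY in the
# environment; every string below is an illustrative placeholder.
from openai import OpenAI

client = OpenAI()

system_prompt = (
    "You are the GM for a tabletop RPG session. "
    "When a rule is ambiguous, say so and ask the table instead of guessing."
)
# A hand-picked rules excerpt you inject yourself, rather than hoping
# the model recalls the PDF correctly.
rules_excerpt = "Initiative: each side rolls 1d6; ties go to the players."

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you're testing
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": f"Rules in effect:\n{rules_excerpt}"},
        {"role": "user", "content": "We approach the cave mouth. What do we see?"},
    ],
)
print(response.choices[0].message.content)
```

Even this small amount of control (a pinned system prompt plus hand-picked rules text) covers a lot of the "help it along" cases; tools and retrieval are where it starts getting complicated.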
I find that at the moment it's still best as a player, because that more naturally lets you interject in the same way you might with players who aren't fully up on the rules, or who are a little 'forgetful'.
One of the biggest problems I had to work around was a sort of 'unlicensed co-creation', where the AI doesn't quite understand the distinction between the player and GM roles and eagerly steps over that boundary. Reprimanding helps for a bit, but context length kills it over time. The best fixes I've found are clear system prompting and, in particular, model choice.
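To make that concrete, here's the kind of system prompt I mean. It's a hypothetical sketch (the character name is invented), not a guaranteed fix, but drawing the boundary explicitly up front works far better than correcting mid-session:

```python
# Hypothetical system prompt targeting the role-boundary problem: the
# model plays exactly one character and never narrates the world.
# "Brother Aldric" is made up for the example.
PLAYER_SYSTEM_PROMPT = """\
You are playing exactly one player character: Brother Aldric, a cleric.
Hard boundaries:
- You control only Aldric's words, actions, and thoughts.
- You never narrate the environment, other characters, dice results,
  or the outcomes of Aldric's actions; the GM decides all of those.
- If you're unsure whether a decision is yours to make, stop and ask the GM.
The human is the GM. Wait for their narration before acting.
"""
```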
That aligns with something I’ve seen too—especially the part about not expecting the LLM to "just know" the rules, even if you provide the PDF or source material. It’s a common assumption: that having access to the rules means it can retrieve and apply them like a structured system. But that’s not really how it works.
Even with the full rules in context, the model still has to read, interpret, and synthesize meaning from the material every time it responds. It’s not pulling from a stable rule engine—it’s rebuilding its understanding on the fly, each time, through probabilistic reasoning. That introduces variability.
I tend to think of it like this: memory is the pot that cooks the soup. You can keep adding ingredients (text, documents, clarifications), but at some point, the earlier flavors fade. You don’t get more coherence just by adding more content. You need to control the temperature, the order, the timing—which is why I focus more on building stable context windows and scaffolds than just loading up resources.
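In code terms, "controlling the pot" looks less like appending to one ever-growing transcript and more like rebuilding the context fresh on every turn. A rough sketch, with all names and limits illustrative:

```python
# Rough sketch of a context scaffold: each request is rebuilt from a
# pinned system prompt, a hand-maintained rules digest, and only the
# most recent turns, so the early "flavors" never fade out of the pot.
PINNED_SYSTEM = "You are the GM. Follow the rules digest below exactly."
RULES_DIGEST = "Condensed house-rules text you curate by hand."
MAX_RECENT_TURNS = 12  # illustrative; tune to the model's context size

def build_messages(history: list[dict]) -> list[dict]:
    """Assemble a stable context window instead of an ever-growing one."""
    recent = history[-MAX_RECENT_TURNS:]
    return [
        {"role": "system", "content": PINNED_SYSTEM},
        {"role": "system", "content": f"Rules digest:\n{RULES_DIGEST}"},
        *recent,  # dicts like {"role": "user" or "assistant", "content": "..."}
    ]
```

Older turns can be summarized into the digest before they're dropped, which is the "controlling the temperature and timing" part.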
That said, I’ve also found it much easier to go with the flow than to force strict expectations. If you focus on the experience rather than the rules, LLMs have room to surprise you, especially when you let them lean into inference and improvisation. Once the boundaries are clear, the freedom inside them can produce some unexpectedly good moments.
For example, I asked the model to run Keep on the Borderlands several times to observe how it would approach the same prompt with different variations. It’s a classic, widely discussed module, and the model already had a strong sense of the tone, structure, and general content just from exposure to the vast amount of material written about it online.
Even without the actual module in context, it was able to generate believable versions of the adventure. The details varied, but the atmosphere, encounter structure, and thematic beats remained consistent enough to feel intentional. It wasn’t exact, but it felt right.
That’s where expectations can get misaligned. We assume that if the LLM has the PDF or module file, it should know exact details—what’s in room 13, how much copper is hidden, the NPC names, etc. But that’s not how LLMs work. They don’t retrieve text from a file like a database. They read, interpret, synthesize, and generate a response based on patterns—not memory.
So while giving it a file might help guide its inference, what you’re getting is still a reconstruction, not a citation. That’s why I’ve found it more effective to focus on the experience rather than expecting perfect recall. If you let the model lean into what it does well (tone, structure, improvisation), it can often surprise you in a good way.
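If you do need exact details (the room-13 case), one workaround is to put the verbatim passage in front of the model and ask it to quote, so it’s citing rather than reconstructing. A sketch, using an invented stand-in passage rather than real module text:

```python
# Forcing a citation instead of a reconstruction: paste the exact text
# yourself and constrain the answer to it. The passage is an invented
# stand-in, not actual module content.
passage = (
    "Room 13: four kobolds guard a locked chest holding 200 cp "
    "and a silver holy symbol."
)

prompt = (
    "Answer using ONLY the passage below, and quote the relevant "
    "sentence verbatim. If the passage doesn't say, say so.\n\n"
    f"PASSAGE:\n{passage}\n\n"
    "QUESTION: How much copper is in the chest?"
)
```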