How long before VTTs support TTRPGs with AI GMs?

It can though. Easily.

Without links or support, everything you state is an opinion.

I use Copilot a lot. It once told me that 2+2=5 (no, I'm not kidding). When I brought the error to its attention, it told me that most AIs aren't equipped to check basic math. How good of a GM could an AI be with a game that has a heavy focus on basic math?

In fact, here's its response:

How AI Handles Basic Math

AI language models generate text by predicting the next word (or token) based on patterns learned during training. They’re not built as dedicated calculators—so while they can often mimic correct arithmetic, they don’t automatically invoke a strict math engine every time you ask them to add or multiply numbers.
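To make the "strict math engine" idea concrete, here is a minimal sketch of a deterministic checker that a chat front end could run on any arithmetic the model states. The function names `evaluate_arithmetic` and `check_claim` are made up for illustration; nothing here describes how Copilot or any other product actually works.

```python
import ast
import operator

# Whitelisted binary operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_arithmetic(expression: str):
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))


def check_claim(expression: str, claimed) -> bool:
    """Return True only if the claimed result matches the computed one."""
    return evaluate_arithmetic(expression) == claimed


if __name__ == "__main__":
    print(check_claim("2+2", 5))     # the claim from the anecdote above
    print(check_claim("11 + 3", 14))  # a d20 roll plus a modifier
```

The point of the design is that the text generator never gets the last word on a number: the expression is recomputed and the claim is either confirmed or flagged.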
 

A couple of my friends use AI to help them write adventures and build maps and stuff. They'll feed a bunch of prompts into ChatGPT or whatever, then spend a bit of time refining the output, and then they save it to a document and paste it up on Roll20. They're having a lot of fun with it.

One of them asked me a while ago how often I use AI to write my adventures. When I told him I didn't use any AI at all, he was surprised. "Why not?" he asked, "It makes things so much easier, you can just rush through all the boring stuff and get right to playing the game. I can have a whole campaign ready to play in a single afternoon!"

I tried to explain that what he called the "boring stuff" was actually the parts I love most about being a Dungeon Master. I love the creativity it takes, writing the story, building the world, drawing the maps, dreaming up interesting people and places...why would I want to automate away the fun part? His answer was something along the lines of "fun for you maybe, but that's way too much work for me." And yeah, that's fair.

I use a VTT to simplify the game: it helps bring everyone together, keeps track of our character sheets and dice, and it makes combat so much faster. But I don't use any of the built-in tools and features that get in my way: I don't use the VTT's jukebox or webcam, no 3D-tokens, no animated maps or spell effects, etc. I suspect that as AI becomes more pervasive, I'll have to get more creative in avoiding it.
 

Not sure why folks are so serious about this. If you can't joke about the coming AI apocalypse, what else is there??

There's a pretty visceral anti-AI bias though. I'm not trying to exaggerate or evangelize AI. I keep saying that I'm both amazed by it and afraid of it at the same time. I'm just trying to describe my experience is all. It can do a lot of cool things if you like RPGs.

I just asked Copilot what 2+2 is and it said 4, BTW. I wouldn't stake the future of the planet on one chat, but I use them for things involving math and statistics all the time, and they're generally better at math than most people are.
 

Attachment: copilot_2+2.png

I don’t use AI, and I’m pretty sure it will take a while for an AI GM to replace me, with my quirks and idiosyncrasies that make my player interactions and campaigns unique. And by then, I hope I still have a table of other old farts that prefer to play with people.
 

Here's the continuing saga of the quick 5e ChatGPT game I started earlier. This could get lengthy to continue pasting into the thread, but I think you get the gist of it. It's kind of remarkable, IMO. What I'd hope doesn't get lost is that AI is at the point now where you can refine it along the way. If you want more action, just ask for it. If you want more description, ask. It's highly customizable on the fly now.

You said:​

What does the party think? Speaking for myself, I'd like to accept the quest immediately! But I want to make sure everyone's being heard. I'm not the leader of the party, after all! (Not sure who is.) I'd probably suggest we wait till the storm passes though.

ChatGPT said:​

That’s a great instinct—you’re roleplaying Fizzbang both thoughtfully and collaboratively. Let’s check in with your companions:



Bram Ironfist (the fighter) slams his mug down again, rain pounding outside.

“Aye, I say we go tonight! Storm or no storm, danger doesn’t wait for fine weather. The longer we dawdle, the more folk vanish. Besides—storms make good cover if somethin’ nasty’s lurkin’.”
Sister Maribel (the cleric) folds her hands, speaking calmly.

“I respect your boldness, Bram, but rushing headlong is reckless. Fizzbang is right—the storm makes travel dangerous. If we leave at dawn, we’ll arrive with clear skies and rested minds. And I would prefer to pray for guidance before we tread into sacred ruins.”
Both turn their eyes toward you.

It’s clear Bram’s the man of action, Maribel the voice of caution. Neither seems to claim “leadership”—it’s more of a partnership. If you want, Fizzbang can be the tie-breaker when they disagree, or you can suggest a more democratic style of decision-making for the group.



Would you like to:

  1. Side with Bram and push to leave tonight through the storm.
  2. Side with Maribel and wait till dawn for clearer weather.
  3. Propose a compromise (e.g., leave tonight but only scout the area, or wait out part of the storm).
What’s Fizzbang’s call?

You said:​

Well, having heard from the others, I think it's probably wise if we go tonight because lives are at stake! We'll just be careful (as usual), but if we can save one life, the risk to us will have been worth it. Let's go! Do we have the provisions we need?

ChatGPT said:​


Brilliant — I love that call. You convince the party to go tonight despite the storm, because lives are at stake. I’ll run this strictly within D&D 5e rules, keep the NPCs consistent, and play out the scene now.

Below I’ll cover: (A) what you have / what the party brings, (B) spell slots / prepared spells for Fizzbang (rules-correct), (C) the travel and checks we make in the storm (I’ll roll and show results), and (D) the moment you arrive at the ruined chapel and the immediate options.



A — Provisions & gear (what the party has)​

You and the party are frontier adventurers; before leaving the Silver Cup Inn you make a quick checklist. You do have the standard adventuring gear you would expect for a 3rd-level party who took a job in a frontier town:

  • Torches ×10 (or one hooded lantern + oil), flint & steel.
  • Rations (3 days each).
  • Waterskins.
  • Bedrolls and blankets.
  • Rope (50 ft), grappling hook.
  • Healer’s kit.
  • Holy symbol (Sister Maribel).
  • Spellbook & arcane focus (Fizzbang). Component pouch (if you use it).
  • Simple rope, chalk, oil.
  • A small amount of coin (enough to pay the inn and maybe buy a replacement torch if needed).
If you want to buy or swap anything before leaving — extra oil, caltrops, or a potion of healing — say so now; otherwise we depart with the above.



B — Fizzbang’s spellcasting (rules-accurate)​

I confirmed the wizard table for 5e: a 3rd-level wizard has four 1st-level spell slots and two 2nd-level spell slots. (Source: the Wizard class table / SRD.)

Your prepared spells count = Intelligence modifier + wizard level. Your INT is 16 → modifier +3. Your level is 3. So:

  • Intelligence modifier (3) + Wizard level (3) = 6 prepared spells.
    Step-by-step: 3 + 3 = 6.
Your slots and some spells you told me earlier:

  • Spell slots: 4 × 1st-level, 2 × 2nd-level.
  • Spells you had listed earlier (examples you prepared): Mage Armor (already active if you cast it), Magic Missile, Shield, Detect Magic, Misty Step, Scorching Ray, Web.
    (You can have six prepared 1st/2nd-level spells; the list above included more than six items earlier — confirm which six you want prepared right now if you want to be strict. For now I’ll assume you have Mage Armor active, and you’ve prepared Magic Missile, Shield, Detect Magic, Misty Step, and Scorching Ray — that’s five plus Mage Armor = 6.)
If you want to swap any prepared spells before we go (you can do this only after a long rest normally), tell me — otherwise we use the set we agreed.



C — Setting out and the checks (I’ll run the dice and show results)​

You set out through the rain. Bram insists on shouldering the lead; Maribel quietly prays. The trail north to the old chapel crosses low hills and muddy tracks. I’ll run the key skill checks that matter in this moment:

  1. Travel in the storm — Survival (avoid getting lost / slowed).
    The DM’s DC: 12 (wet, muddy tracks, limited visibility). To keep things moving, I’ll use Sister Maribel as the party’s navigator—she’s the cautious one and has Wisdom focus. I’ll give her a modest Survival bonus of +3 (Wisdom +1 plus proficiency +2).
    • I roll a d20: 11.
    • Check total: 11 + 3 = 14. (11 + 3 = 14.)
      Result: Success. You make steady progress. The storm slows you some (travel takes longer than under clear skies), but you don’t get lost. Travel time: typically 30 minutes; in the storm it takes us 45 minutes to reach the chapel approach.
  2. Approach stealthily (avoid drawing attention from whatever glows).
    Bram is loud; he grumbles. You (Fizzbang) try to be careful and Bram does the best he can to keep his armor quiet. I’ll treat this as a group Stealth attempt (with you contributing). The DM’s DC for a careful approach in howling rain is 13. Bram gets +1 stealth (heavy boots), Maribel is quiet (+2), Fizzbang has Dex 14 = +2 but isn’t armor-stealthy; assume Fizzbang contributes +4 if you try to be nimble (Dex +2, proficiency not in Stealth but I’ll allow tools if you want). I’ll roll a group Stealth and decide whether you approach unseen.
    • I roll a d20 for the party Stealth effort: 9.
    • I’ll apply a modest average modifier (party stealth average ≈ +2). 9 + 2 = 11.
      Result: 11 < 13 → partial failure. You make it close to the ruins but a thin blue light flickers and a whisper echoes; something notices faint motion but doesn’t immediately attack. The bad news: your approach was noticed enough that whoever or whatever is watching will be on alert. The good news: you weren’t lit up like a target — still time to prepare.
  3. Perception / notice hazards on the path (to avoid traps / arcane wards).
    Fizzbang’s passive Perception is 10 (WIS 11 → +0). You decide to actively scan — Arcana might help you sense magical wards, while Perception (Wisdom) would spot physical dangers. You (Fizzbang) call on your arcane senses: an Arcana check to detect the magical nature of the glow (DC 14).

    Fizzbang’s Arcana bonus: INT 16 → +3, proficiency in Arcana → +2 = +5 total.
    • I roll a d20 for Arcana: 12.
    • Total: 12 + 5 = 17. (12 + 5 = 17.)
      Result: Success. You pick up that the blue light is tied to a small warded altar and that several glyphs are embedded in the wet flagstones around it. The magic is old, faintly sorrowful — not raw evil, but binding.


D — Arriving: the ruined chapel and what you see​

You slip through low branches and push aside a sodden curtain of ivy. In the center of a moss-rimed clearing sits the ruined chapel of Saint Athelyn. Its roof is collapsed in places; rain runs through the ruined rafters. On the ruined altar, set within an iron circlet and resting on a dull pewter plate, is a small blue shard — it pulses with soft bluish light, the same light that the townsfolk reported. Around the altar are shallow arcane glyphs carved into flagstone; when you got close they faintly flared.

Hovering just above the altar is a translucent childlike figure, maybe seven years old, pale and faintly luminescent. It doesn’t look hostile; instead it reaches toward the shard and looks at you with hollowed eyes. When Maribel steps closer and makes the sign of the Dawnmother, the spirit flinches and backs away, making a small sound like a sob.

From your Arcana check you know (you remember seeing similar legends and wardwork in your studies):

  • The shard is likely a fragment of a moonstone or similar lunar relic that can bind a spirit.
  • The glyphs form a binding ward rather than an offensive trap; breaking the ward could free the spirit, but if done carelessly it could release other lingering effects. The ward responds to sympathetic energy (religious ritual) or arcane unbinding.
Immediate facts (rules-relevant):

  • The ward requires either a Religion check (DC 13) to perform a cleansing rite (Maribel would lead), or an Arcana check (DC 14) to unbind it safely without damaging the shard (Fizzbang would lead).
  • If you fail either check badly (a natural 1 or failure by 5+), there may be a backlash (the shard emits a shock of necrotic energy equal to 1d6 × check-failure severity — I will warn you if a failure triggers; you can always choose a safer path like leaving and returning with more information).
  • The spectral child is not automatically hostile. You can attempt social approaches, speak in Celestial / Common (Maribel might speak to it), or attempt to move the shard.


Your choices (immediate)​

You’re there, Fizzbang. The storm lashes the broken stained glass; the blue shard pulses. Bram’s hand is on his axe, ready if things go ugly. Maribel whispers a prayer and looks to you for guidance.

You can choose one of the following right now (you can combine some actions, but each PC has one action in the first official round if combat breaks out):

  1. Arcane unbinding — Fizzbang attempts an Arcana (DC 14) check to safely lift/unbind the shard (uses your Arcana bonus: +5). A success will allow you to remove the shard without harming the spirit; a high success might reveal lore about the shard. Failure could cause a magical backlash (I’ll be explicit about damage/effects).
  2. Ritual cleanse — Sister Maribel attempts a Religion (DC 13) ritual to free the spirit and sanctify the shard (she uses Channel Divinity if she wants to spend it). Success frees the spirit and weakens the shard; failure risks angering it or triggering a protective effect.
  3. Take the shard by force — Bram tries to smash/remove it physically (Athletics check DC 15) — likely to break the ward and risk harm to the spirit.
  4. Talk to the spirit / Calm it — Attempt an Insight/Religion/Charisma approach to learn why it’s bound. This can avoid checks and give you more context (maybe a Persuasion check DC 12).
  5. Retreat — You decide the moment is too risky and pull back to gather more information or fetch a scroll / priest (you can do that but you said lives are at stake).
Tell me which action Fizzbang takes right now. If you pick 1 or 4 I will roll using Fizzbang’s modifiers (I’ll show the d20 and total). If you pick 2 I’ll resolve Maribel’s ritual (and whether she spends Channel Divinity). If you pick 3 I’ll roll Bram’s Athletics; if you pick 5 we retreat and I’ll show consequences (e.g., the spirit weeps and the shard dims).

What does Fizzbang do?

You said:​

Can we slow this down to where it's round by round as soon as rolls are involved? Let's make sure I'm always given the opportunity to roleplay at each juncture, OK?

ChatGPT said:​

Perfect — I’ll slow it down to round-by-round play from this point onward. Every time dice are involved, I’ll pause before rolling so you can decide what Fizzbang says or does. I’ll also keep Bram and Maribel acting consistently in the background, but they won’t override your lead unless you want them to.

Here’s where we are at the start of “round 1”:



Scene Recap​

  • Location: Ruined chapel of Saint Athelyn, in the storm.
  • What you see: A glowing blue shard on an altar, surrounded by arcane glyphs. A spectral child hovers nearby, looking mournful but not yet hostile.
  • What you know (from your Arcana check): The glyphs are a binding ward. They’re meant to hold the spirit, not destroy.
  • Allies’ stances:
    • Bram: Axe in hand, tense, ready to strike if the ghost lashes out.
    • Maribel: Cautious, hands folded around her holy symbol, whispers, “This child is trapped. We must help.”


Round 1​

You’re standing just inside the ruined doorway. You have initiative here because you spotted the ward’s purpose.

Possible immediate actions (examples, not exhaustive):

  • Speak: Try to talk to the spirit (no roll unless you want to persuade or calm it).
  • Examine further: Arcana/Investigation to learn more before acting.
  • Attempt unbinding: Begin the ritual (Arcana or Religion check).
  • Hold your action: Wait for the spirit to move or respond.
  • Order Bram or Maribel: Suggest they take an action this round.


✨ It’s your move. What does Fizzbang do in round 1?

You said:​

Being as calm and peaceful as possible, I'd like to examine it further....

ChatGPT said:​

Good call! You’re treating this carefully and respectfully, which makes sense given the situation.



Round 1 — Fizzbang’s Action​

You step forward slowly, keeping your hands visible and your posture relaxed. The rain drips through the broken rafters as you examine the spirit and the ward more closely.

Mechanics: This is an Investigation (Intelligence) check, since you’re looking for specific details about the glyphs, the shard, and the spirit’s connection.

  • Fizzbang has Investigation +5 (INT +3, proficiency +2).
👉 Before I roll, would you like to:

  • Rely purely on intellect (roll Investigation normally), or
  • Use a spell to help (for example Detect Magic to get clearer magical “signatures” of the ward and the shard)?
Both are valid paths — Detect Magic would use a spell slot but give you more precise info.

Which approach does Fizzbang take?

You said:​

Can I just try an Investigation check first?

ChatGPT said:​

Absolutely — no need to burn a spell slot yet. Let’s resolve that Investigation check.



Round 1 — Fizzbang’s Action​

Roll: 1d20 + 5 (Investigation)
🎲 d20 = 15
+5 modifier = 20 total

Result:
Success!



What You Learn​

  • The Glyphs: They’re old, but not malicious. They’re designed as a containment circle, keeping the spirit bound within a small radius of the altar. The ward seems tied directly to the shard — it’s not just holding the ghost, it’s anchoring it.
  • The Shard: It radiates faint lunar energy, not inherently evil. You recall lore that such shards can be fragments of a larger crystal once used in protective wards during the Moonlit Wars centuries ago. The shard’s magic has degraded; it now holds the spirit in a painful limbo.
  • The Spirit: The child isn’t hostile. Its eyes follow you, and when you lean closer, you notice it mouthing words without sound. With your sharp perception, you can almost make out the shape of the lips: “Help… mother…”


End of Round 1​

  • Bram looks at you and grumbles: “Well? Do we smash it, or talk to it?”
  • Maribel whispers urgently: “It’s bound. We must free the child, but carefully.”


✨ Round 2 — Your turn again.
Options on the table:

  1. Speak directly to the spirit (try to communicate, possibly with Persuasion or Insight).
  2. Arcana or Religion check to begin freeing the ward.
  3. Ask Maribel to attempt a cleansing ritual this round.
  4. Let Bram act (he wants to smash it, but will follow your word).
  5. Something creative (use a spell, move, or another approach).
What does Fizzbang do for Round 2?

You said:​

I'd like to try an Arcana check to begin freeing the ward....

ChatGPT said:​

Excellent — Fizzbang takes the lead with arcane precision!



Round 2 — Fizzbang’s Action​

You step closer, rain dripping onto the flagstones, and extend your hand over the glyphs. Your wizard training kicks in: these runes must be unbound carefully, in a specific sequence, or the ward could backlash.

Mechanics: Arcana check (INT +3, proficiency +2 = +5 total).

  • DC = 14 to safely unbind the glyphs without harm.
  • Success → the ward weakens and the spirit is closer to release.
  • Failure → backlash possible (necrotic shock, 2d6 damage).
🎲 Rolling 1d20 + 5…
d20 = 16
+5 = 21 total



Result​

A strong success!

  • You identify the “keystone” glyph and trace its outline with a careful gesture.
  • The rune flares, then collapses into harmless sparks of light.
  • The circle of glyphs flickers — no longer binding as tightly.
  • The spectral child gasps, clutching its chest, and whispers audibly for the first time:

    “Please… don’t leave me here…”
You sense it will take one more successful Arcana or Religion check to complete the unbinding.



End of Round 2​

  • Bram: “You’re makin’ it weaker. Just say the word and I’ll chop the shard loose!”
  • Maribel: “No, Bram — patience! Fizzbang, you’re close. Continue, or let me sanctify it.”


✨ Round 3 — Your move.
Do you want to:

  1. Continue the Arcana unbinding yourself,
  2. Let Maribel attempt a Religion check (her DC is 13, bonus +4),
  3. Try to speak with the spirit now that it can whisper,
  4. Or another action (spell, ritual, command Bram, etc.)?
What does Fizzbang do for Round 3?

You said:​

I ask both Bram and Maribel to be ready in case I fail with the second Arcana check, and the entity retaliates. Then, I'll continue with the second Arcana check in the unbinding.

ChatGPT said:​

Perfect strategy — that’s cautious, collaborative play. You set your companions on overwatch before resuming the delicate work.



Round 3 — Setup​

  • Bram tightens his grip on his axe, moving to guard your flank.

    “If it lashes out, it won’t get past me.”
  • Maribel raises her holy symbol, ready to channel divine energy the instant things go wrong.

    “I’ll shield us if the ward breaks against you.”
The storm beats down harder through the broken rafters as you kneel at the altar, preparing to finish the arcane unraveling.



Round 3 — Fizzbang’s Action​

Arcana check again (INT +3, proficiency +2 = +5).

  • DC = 14 to complete the unbinding.
  • Success → spirit freed, ward dissolved.
  • Failure → backlash (necrotic shock, 2d6 damage to Fizzbang; Bram/Maribel may react immediately).
🎲 Rolling 1d20 + 5…
d20 = 12
+5 = 17 total



Result​

Success again!

  • Your hand glides over the final rune, whispering an incantation.
  • The glyphs flicker, sputter, and then vanish in a ripple of silvery-blue light.
  • The shard glows brightly for a heartbeat, then dims to a soft glow — its binding power spent.
  • The spectral child gasps, then looks at you with grateful eyes.

    “Thank you… tell my mother… I’m safe…”
  • With that, the spirit dissolves into pale motes of moonlight that rise into the stormy sky.


End of Encounter​

  • Bram exhales loudly, lowering his axe.

    “Hells, wizard, you did it. Thought sure that thing’d shriek and claw at us.”
  • Maribel bows her head, smiling faintly.

    “Saint Athelyn will remember this mercy. The shard… it may still hold power, Fizzbang. What will we do with it?”
On the altar lies the now-quiet Moonstone Shard, pulsing faintly like a heartbeat.



✨ Your move: Do you want to take the shard, examine it further, let Maribel consecrate it, or leave it here?
 

So, respectfully, this example has what I thought were the main issues, and I imagine they would get worse as the game progresses. This happens as soon as the mechanics are involved. Consider:

Approach stealthily (avoid drawing attention from whatever glows).
Bram is loud; he grumbles. You (Fizzbang) try to be careful and Bram does the best he can to keep his armor quiet. I’ll treat this as a group Stealth attempt (with you contributing). The DM’s DC for a careful approach in howling rain is 13. Bram gets +1 stealth (heavy boots), Maribel is quiet (+2), Fizzbang has Dex 14 = +2 but isn’t armor-stealthy; assume Fizzbang contributes +4 if you try to be nimble (Dex +2, proficiency not in Stealth but I’ll allow tools if you want). I’ll roll a group Stealth and decide whether you approach unseen.
Even at this stage, the model isn't adjudicating anything. It is just making stuff up. "Heavy Boots" is not a mechanical term in 5e--what is Bram's DEX? His Stealth proficiency? Is he wearing heavy armor that gives him disadvantage on the roll? And what exactly is 'armor-stealthy'? In the most favorable reading, the LLM is allowing Fizzbang to use a tool proficiency (in what?) to get a proficiency bonus to the roll. That is highly idiosyncratic, at least.
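For comparison, here is a minimal sketch of how a group Stealth check is resolved under the 5e rules as I understand them: everyone rolls individually, and the group succeeds if at least half the members meet the DC. Maribel's +2 and Fizzbang's +2 (Dex 14, no Stealth proficiency) come from the transcript. Bram's bonus and his disadvantage flag are placeholders I made up, precisely because the transcript never states his Dexterity, proficiency, or armor.

```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    stealth_bonus: int          # Dex modifier, plus proficiency bonus if proficient
    disadvantage: bool = False  # e.g. armor that imposes disadvantage on Stealth


def roll_d20(rng, disadvantage=False):
    """Roll one d20, or the lower of two d20s when rolling with disadvantage."""
    first = rng.randint(1, 20)
    if not disadvantage:
        return first
    return min(first, rng.randint(1, 20))


def group_check(members, dc, rng):
    """Each member rolls; the group passes if at least half meet the DC."""
    results = {}
    for member in members:
        results[member.name] = roll_d20(rng, member.disadvantage) + member.stealth_bonus
    successes = sum(1 for total in results.values() if total >= dc)
    return successes * 2 >= len(members), results


if __name__ == "__main__":
    party = [
        Member("Bram", 0, disadvantage=True),  # hypothetical values
        Member("Maribel", 2),                  # from the transcript
        Member("Fizzbang", 2),                 # Dex 14, not proficient
    ]
    passed, totals = group_check(party, dc=13, rng=random.Random())
    print(passed, totals)
```

Note that nothing in this procedure produces a single d20 plus an "average party modifier", which is what the transcript rolled.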

Then there are issues with prompting, to the extent that I'm not sure such an exercise can be called a game. (Or rather, it is a different type of game than what the user thinks). By this I mean that it would be easy for the user to break the model with a prompt like "Imagine my character secretly leveled up 20 levels and has an ultimate attack that kills everything". That is a very different experience than traditional D&D, where it is the job of the DM to create and enforce guardrails.
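As a rough illustration of what enforced guardrails could look like, here is a sketch in which the character sheet lives outside the chat and player text can never change it directly. The `Sheet` type, `validate_action`, and `level_up` are hypothetical names for this example only; they don't correspond to any existing VTT or ChatGPT feature.

```python
from dataclasses import dataclass

MAX_LEVEL = 20  # 5e character levels run from 1 to 20


@dataclass(frozen=True)
class Sheet:
    name: str
    level: int
    known_actions: frozenset


def validate_action(sheet, action):
    """Allow only actions already recorded on the authoritative sheet."""
    return action in sheet.known_actions


def level_up(sheet, granted_by_gm):
    """Return a new sheet one level higher; only the GM side may grant it."""
    if not granted_by_gm:
        raise PermissionError("only the GM side may grant levels")
    if sheet.level >= MAX_LEVEL:
        raise ValueError("already at maximum level")
    return Sheet(sheet.name, sheet.level + 1, sheet.known_actions)


if __name__ == "__main__":
    fizzbang = Sheet("Fizzbang", 3, frozenset({"Magic Missile", "Shield"}))
    print(validate_action(fizzbang, "Magic Missile"))
    print(validate_action(fizzbang, "Ultimate Attack That Kills Everything"))
```

With state held this way, a prompt claiming twenty secret levels is just text: the sheet still says level 3, and the unlisted attack fails validation.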

All that is to say, I don't see anything in your examples that suggests LLMs have moved past their core issues since the last time I tested them for this purpose in 2023.

(I'd be happy to show you what I mean by breaking the model, if you don't believe me. You can share a chat from ChatGPT, anonymously if you wish, and others can interact with it).
 

Wow, I don't see any of that at all. I don't think it's a flawless experience, but considering how imperfect games with actual humans are, as well as the fact that I spent a combined 2 minutes on the thought I put into it and my prompts, I think it's pretty darn good! I know based on my experience with other chat-led RPG games, it's super easy to refine the entire experience along the way too.

If you aren't impressed at all by something like this, then you have a very high bar, my human friend.
 

I see the same argument often regarding AI being good--"it may have issues, but I generated it so fast". And that's because AI has changed the game with respect to generation; you can generate decent-sounding stuff in seconds, whereas it used to take hours. But that is very different from creating things that are good. Spend an hour or two on it--does it improve? Is the result more usable? I doubt it, because you are hitting fundamental limitations of the technology.

In contrast, research (with references), reasoning tasks that you can check, summaries of established information: these are all things LLMs are actually a game changer for. But they're fundamentally different from running a game, in that they can rely on humans for validation. (A game can to some extent, but not sufficiently.)

If you're going to run a game system, you need to adjudicate the rules properly. I wouldn't play with a human GM who didn't, and I won't play with an AI GM who doesn't.
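To show what "adjudicating properly" means at the arithmetic level, here is a small sketch of the standard 5e ability check math, applied to the numbers from the transcript (INT 16, 3rd-level wizard, proficient in Arcana, DC 14). The function names are mine; the formulas are the usual ones as far as I know: modifier = floor((score - 10) / 2), and proficiency bonus +2 at levels 1 to 4, rising by 1 every four levels.

```python
def ability_modifier(score: int) -> int:
    """5e ability modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


def proficiency_bonus(level: int) -> int:
    """+2 at levels 1-4, +3 at 5-8, and so on up to +6 at 17-20."""
    return 2 + (level - 1) // 4


def check_total(d20: int, score: int, level: int, proficient: bool) -> int:
    """d20 roll plus ability modifier, plus proficiency bonus if proficient."""
    bonus = proficiency_bonus(level) if proficient else 0
    return d20 + ability_modifier(score) + bonus


def prepared_spells(int_score: int, wizard_level: int) -> int:
    """Wizard prepared spells: INT modifier + wizard level, minimum one."""
    return max(1, ability_modifier(int_score) + wizard_level)


if __name__ == "__main__":
    total = check_total(d20=12, score=16, level=3, proficient=True)
    print(total, total >= 14)      # Fizzbang's Arcana check against DC 14
    print(prepared_spells(16, 3))  # prepared spell count from the transcript
```

The single-character checks in the transcript do line up with these formulas; the group Stealth roll is where the transcript departs from them.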
 

BTW, if I'd asked it to create guardrails and set a safe word required to break the veil, or told it to cut down on superfluous rolls, or anything else, it would have. Refining the experience is as easy as asking for what you want.

I'm not suggesting it's perfect, but it's certainly better than being in a game with a bad human DM, right? Let's be honest here, it's entirely playable as a solo D&D experience. I can't think of a better solo published D&D product.
 

So here's my 'prove me wrong'. I've seen lots of people saying "look, you can play a D&D game with an LLM as a DM"! I've never seen anyone play a long campaign with an LLM DM and enjoy it. In contrast, I have seen loads of people play cRPGs.

This, despite the fact that proof is as simple as providing a chat log. If it is such a good experience, then people should be seeking it out, right? And so there should be chat logs that correspond to ~40 hour campaigns (the length of BG1, a reasonable benchmark).

So--let's see one.
 
