LLMs as a GM

Just wanted to share my experience having an LLM (in this case, Gemini) act as a gamemaster. I have completed five sessions, each running a couple of hours. The system I taught it was my own. The lore it used was written by me, although I gave it creative license to make up anything it wanted, as long as it stayed within the lore's context. I built several frontload documents for it to reference, which I upload before each session. The frontload documents contain game mechanics, lore, and character summaries. I also have it write a summary of each session, just in case it (or I) need to reference past events. Here are some key findings:

- Pros
  • It does a really nice job with social encounters. That pillar of the game has been consistent and, to be truthful, as good as or better than most in-person GMed games. For example, we were spying on the antagonist at a fancy restaurant, and the social encounter (including the skill checks involved) was very well done. The NPCs that have required social skill challenges also went well. For example, we stopped a couple of thieves' guild enforcers using intimidation.
  • The exploration pillar is described very well. The skill challenges seem to match the difficulty, and they appear logical (at least from what I have seen so far). For example, when one of the characters was trying to disarm a trap, the DC was appropriate. When another was using arcana to find any source of magic in a cavern full of debris, the DC also seemed appropriate.
  • In the combat pillar, it used creatures of appropriate strength, set the correct DCs, and gave them an appropriate number of hit points for the encounter.
  • The storyline has been entertaining, and even though I wrote the lore, it has still managed to surprise me with a good plot twist.
  • It tied in the PCs' backstories very well. (I tried to give it a layup, having the two characters both have their families wrecked by the thieves' guild in the city.) It did this well, and progressed the story nicely.
  • I don't know how, but it has a good knack for knowing when to end a session based on the storyline. Like a good gamemaster, it waits until there is downtime and then ends the session. The timing has varied: one session ended after two hours, another after 45 minutes. But most of the endings did indeed come at a point in the story where you would "start a new chapter."
- Cons
  • Oddly, like some GMs, it decided to skip one of my PC's turns - twice. Typical gamemaster behavior. ;) Once reminded though, it corrected it and allowed the PC to have their turn.
  • For some utterly bizarre reason, it seems to have trouble keeping track of the coins my characters have. But it doesn't have a problem keeping track of hit points or luck dice?
  • It requires some out-of-character responses that generally involve re-teaching. This can happen in a session even when it handled the same thing correctly in the previous one. For example, I had to explain that even though my character lost initiative, they could still use a combat card during their turn. It had done this correctly in our other three combat encounters. Why it didn't do it in this one is far beyond me.
- Notes
For those interested in trying this out, I would suggest a few things.
  • The first is to use D&D. It has obviously been trained on that system. For example, it sometimes asked for a skill challenge roll using a skill that is not in my mechanics but is in D&D's. The frontload document I give it has stopped this.
  • Build your frontload documents while you are communicating with it. It will help you use precise wording, and will even write them for you after you explain what you want.
  • In your frontload documents, make sure to include examples and non-examples.
  • Ask it what it wants. I know, it's weird. But it has given me precise answers at times. For example, I asked it if it would run a better game using a small area, such as a single city/kingdom, or if it would prefer to run a grander adventure using many locales and settings. I know it can change its answer on a dime, but asking questions like this will help tailor your frontload documents to a more specific train of thought.
  • Have it write summaries. It hasn't needed them yet and has kept everything pretty coherent (as coherent as most in-person GMs). But knowing it's prone to poor long-term memory, these might be helpful.
For anyone interested, I have copied the summaries into this document so you can see the storyline progression. Also, I am curious if anyone else has tried this out, and what their experiences were. Thanks for reading.
Elias and Emme - Adventure Summary
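
For anyone who would rather script the frontload step than paste it by hand, here is a minimal sketch of how the pieces described above (mechanics with examples and non-examples, lore, character summaries, session summaries) could be stitched into one text block to upload. All of the names (`Rule`, `render_rule`, `build_frontload`), the section titles, and the sample text are hypothetical and only illustrate the idea; this is not the actual layout of the documents used in these sessions.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One rule plus the examples and non-examples that clarify it."""
    text: str
    examples: list[str] = field(default_factory=list)
    non_examples: list[str] = field(default_factory=list)


def render_rule(rule: Rule) -> str:
    """Format a rule as plain text, one labeled line per example."""
    lines = [f"RULE: {rule.text}"]
    lines += [f"  EXAMPLE: {e}" for e in rule.examples]
    lines += [f"  NON-EXAMPLE: {n}" for n in rule.non_examples]
    return "\n".join(lines)


def build_frontload(rules, lore, characters, session_summaries) -> str:
    """Join all sections into one document, skipping any empty section."""
    sections = [
        ("GAME MECHANICS", "\n\n".join(render_rule(r) for r in rules)),
        ("LORE", lore),
        ("CHARACTER SUMMARIES", "\n".join(characters)),
        ("PREVIOUS SESSIONS", "\n".join(
            f"Session {i}: {s}" for i, s in enumerate(session_summaries, start=1)
        )),
    ]
    return "\n\n".join(f"=== {title} ===\n{body}" for title, body in sections if body)


# Placeholder content, assumed for illustration only.
rules = [
    Rule(
        text="A character may use a combat card during their own turn.",
        examples=["A character loses initiative, then plays a combat card when their turn comes up."],
        non_examples=["Refusing a combat card because the character lost initiative."],
    ),
]
frontload = build_frontload(
    rules=rules,
    lore="(paste lore text here)",
    characters=["Elias: (summary)", "Emme: (summary)"],
    session_summaries=[],  # empty before the first session, so that section is skipped
)
print(frontload)
```

Keeping each rule next to its examples and non-examples makes it easy to add a new non-example whenever something has to be re-taught mid-session.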
 

I set up Google NotebookLM for rules questions and setting development for a game I started in January. It was a new system, so I wasn't too familiar with the setting or rules, and it helped a lot. The thing I like about NotebookLM is that you do not need to reload documents; you upload them once and it always refers to them. Not sure how well it would do as a GM, but it works great for what I wanted it for (asking rules questions, having it create descriptions for cities, coming up with names, etc.).
 

Sounds great. My experience was not that great tho...

A couple of months ago I asked Copilot to create an adventure and run it for me as a player, directing it to use the latest D&D SRD for any resolution of challenges.

Character creation (only one PC) seemed to go very smoothly, although it probably wasn't 100% accurate with the rules (I think it forgot a couple of things). The AI essentially interviewed me into building a character according to my narrative wishes and tactical preferences.

Then it began to take my character through the narrative of the story (which I had previously asked to be of the vanilla-fantasy kind). It started with social interactions with NPCs to investigate some troubles in a village, and then moved to wilderness exploration. At some point in this phase, Copilot let me find and "hire" an ally into the group, which the AI then seemed to use to communicate with my PC more in-character, but who basically acted as a sort of squire who would always do whatever I asked them to. The game actually didn't feature any combat encounters because I chose stealth most of the time and ended up stalking monsters instead of attacking them, all the while none of the monsters attacked or ambushed us (TBH, they seemed to be good-aligned monsters after all).

All in all it wasn't bad, until I realized I had been playing for 3-4 hours and had been going around in circles a bit too much... Then I asked the AI specifically out-of-character: "have you actually planned an ending for the adventure, or are you making me play forever?" and it responded that "yes! I am making you play forever because you seem to enjoy the game so much so let's keep going" :rolleyes:

Also, I think I barely failed any checks at all during the game... it was clearly cheating to make me win every challenge.
 

Did you set up a dedicated dataset outside the LLM for this? After a while, it will forget your original upload; at least that has been my experience. It might take a while, but it will happen unless you have a dedicated dataset that it can query through a retrieval system. (I think that is called RAG.) I only ask because I made the mistake of walking through about a hundred creatures with it. It offered great advice, but when we got to the end, it couldn't tell me anything about the first twenty creatures we discussed. And, after covering lore and cultures, it forgot the creatures altogether. So you might want to keep adding to the Notebook for that purpose. Just a thought.
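
For readers unfamiliar with the idea, here is a toy sketch of the retrieval step mentioned above: the notes live outside the chat, and only the entries relevant to the current question are pulled back in and placed in front of the prompt. This is not how NotebookLM or Gemini work internally. Real retrieval systems typically rank by embeddings, not by the plain word overlap used here, and the creature notes, `tokenize`, `retrieve`, and `build_prompt` are all invented for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word set; crude, but enough for a toy example."""
    return set(re.findall(r"[a-z']+", text.lower()))


def retrieve(question: str, notes: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of up to k notes sharing the most words with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(name + " " + body)), name) for name, body in notes.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    # Highest overlap first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def build_prompt(question: str, notes: dict[str, str], k: int = 2) -> str:
    """Prepend only the retrieved notes to the question."""
    picked = retrieve(question, notes, k)
    context = "\n".join(f"- {name}: {notes[name]}" for name in picked)
    return f"Reference notes:\n{context}\n\nQuestion: {question}"


# Hypothetical creature notes stored outside the conversation.
notes = {
    "Marsh Lurker": "Amphibious ambusher that grapples prey with an elongated tongue.",
    "Cave Wisp": "Faint magical light that drifts through caverns and debris.",
    "Guild Enforcer": "Thieves' guild muscle who collects debts in the city.",
}

print(build_prompt("Which creature grapples with its tongue?", notes, k=1))
```

Because the notes are looked up fresh for every question, the first creature entered is as retrievable as the hundredth, which is the point of keeping the dataset outside the conversation.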
 

Did you train it using reinforcement fine-tuning, in the form of a dataset of queries and preferred responses, or did you just feed it a bunch of documents and let it infer meaning?
 


That is interesting. Maybe it would be smart of me to add a desired number of sessions to the frontload document. As for always succeeding, it has asked me to roll the dice for my characters, and they have failed often. It doesn't seem to have difficulty describing how the character fails and giving the failure consequences. (For example, you failed your break challenge and are now grappled by the amphibious creature's elongated tongue, or you failed your climb challenge and fall out of a tree, taking 2 points of damage.) That said, it has asked me what the NPC does or what we find in the chest, and I responded, "You're the gamemaster and control the environment and NPCs. I can only control my characters." Once I had done that a few times, I placed it into the frontload doc, and it has been good since.
 

After each section of rules, I asked it if it had questions. It always did. I often answered those using examples, which were eventually placed in the frontload documents. Sometimes, it also needed non-examples to utilize the rule or lore correctly. I have also added those. That process took a while, but in the end, it felt like the results offered were much more consistent and comprehensive.
 

Why not just play a video game? It's essentially what you're trying to do, right (computer gaming)? I don't understand how people are trying to reinvent the wheel here and I'd really like to know your motivations for pursuing this kind of game 🤓
 

Sorry for the second reply, but I just wanted to say thank you for that bit of information. I asked Gemini this question:
"Hi Gemini. I am back. (It was quick, I know.) I have a question for you about our campaign: Approximately how many sessions do you plan to have for this storyline? We have completed 5 already. Do you think it will run more than 10? If so, can you get it down to 10 sessions?"
This was the response it gave me.
"Hi! Welcome back, that was quick indeed – I'm always ready when you are!

That's a great question about the session count for this particular storyline. We've completed 5 sessions so far, and this plot involving Arturus, the Vling, the Shadow Realm, and the Thieves' Guild has certainly broadened in scope.

To give you a clear answer: I will aim to conclude this specific storyline, bringing the major threads initiated by the discovery of The Old & Cold and Arturus's plot to a satisfying resolution, within the next 5 sessions, making it a total of 10 sessions.

It will be a tight but exciting pace! This means we'll likely focus on the core objectives and major confrontations as the narrative progresses. Of course, the organic flow of the game and your choices as players can always influence pacing, but I will actively work to keep us on track for that 10-session goal for this current arc.

So, yes, I believe we can bring this exciting phase of the campaign to a conclusion within your desired timeframe.

Are you ready to jump back in and see what Captain Marwick decides to do with all that new, crucial intelligence?"

I definitely do not think it would have done that had I not been specific. This was a great lesson for me in writing the frontload document. I suppose it isn't too much like a session zero in that way. Thanks again.
 
