RPG Evolution - The AI DM: Monster Maker Mischief

With AI confidently creating anything we ask of it, even when it's factually inaccurate, how does it perform when creating monsters?


Picture courtesy of Pixabay.

What's a Monster?​

When generating monsters for fantasy tabletop role-playing games, it's difficult to pinpoint what counts as an accurate depiction of a creature. After all, mythology is a rich tapestry of interwoven ideas, and a beast's background is only codified later through retelling. Is a basilisk a creature with eight legs or a snake?

So it's useful to use dinosaurs as an example. Thanks to paleontology, we have a general sense of how large certain dinosaurs were, how they behaved, the ecosystems where they lived (e.g., grasslands, forests, fresh or saltwater), and their dietary habits (carnivore, herbivore, etc.). If AI does its homework, while there will undoubtedly be variance in statistics and abilities, it should get a dinosaur's core attributes right.

Meet Our Contestants​

ChatGPT is a (currently) free research preview and is one of the most popular AI tools due to its free access. Notion is a "freemium" productivity and note-taking web application that includes AI. Both are built on Large Language Models (LLMs), which generate their responses from patterns in large text data sets, much of it drawn from the Internet.

The first test will be to create Fifth Edition Dungeons & Dragons statistics for shonisaurus.

Meet Shonisaurus​

Shonisaurus is noteworthy because it was a huge sea creature, an air-breathing reptile that grew up to 50 feet long. A quick search reveals that any shonisaurus stat block should be Huge or larger, have high Strength and Constitution, let the creature hold its breath, and give it a swim speed that's faster than its land speed. Just about everything else is open to interpretation.

How did the two LLMs do when asked "Create D&D 5E stats for shonisaurus"? Here's what Notion came up with:

Shonisaurus​

  • Huge beast, unaligned
  • Armor Class 16 (natural armor)
  • Hit Points 195 (17d12 + 85)
  • Speed 60 ft., swim 60 ft.
  • STR 25 (+7) DEX 14 (+2) CON 21 (+5) INT 2 (-4) WIS 12 (+1) CHA 5 (-3)
  • Skills Athletics +12, Perception +5
  • Senses passive Perception 15
  • Languages
  • Challenge 9 (5,000 XP)
  • Hold Breath. The shonisaurus can hold its breath for 1 hour.
  • Actions
  • Bite. Melee Weapon Attack: +12 to hit, reach 15 ft., one target. Hit: 30 (4d10 + 7) piercing damage.
  • Tail. Melee Weapon Attack: +12 to hit, reach 20 ft., one target. Hit: 24 (4d6 + 7) bludgeoning damage.
  • Legendary Actions
  • The shonisaurus can take 3 legendary actions, choosing from the options below. Only one legendary action option can be used at a time and only at the end of another creature's turn. The shonisaurus regains spent legendary actions at the start of its turn.
  • Detect. The shonisaurus makes a Wisdom (Perception) check.
  • Tail Swipe. The shonisaurus makes a tail attack.
  • Fling. The shonisaurus makes a bite attack against a target and, if successful, flings the target up to 30 feet away.

Notion gives shonisaurus legendary actions on the scale of a dragon, including Detect, Tail Swipe, and Fling. This isn't out of character for a Huge creature with 17 Hit Dice, but at up to 50 feet long it should probably be a size category larger. Its land speed also seems too fast for a creature that has flippers instead of feet.

Let's see how ChatGPT manages the shonisaurus:

Shonisaurus​

  • Huge beast, unaligned
  • Armor Class: 16 (natural armor)
  • Hit Points: 198 (12d12+84)
  • Speed: swim 50 ft.
  • STR DEX CON INT WIS CHA
  • 24 (+7) 10 (+0) 24 (+7) 2 (-4) 12 (+1) 5 (-3)
  • Skills: Perception +6
  • Senses: darkvision 60 ft., passive Perception 16
  • Languages: --
  • Challenge: 10 (5,900 XP)
  • Amphibious. The shonisaurus can breathe air and water.
  • Actions:
  • Bite. Melee Weapon Attack: +11 to hit, reach 10 ft., one target. Hit: 28 (4d10 + 7) piercing damage.
  • Tail Slap. Melee Weapon Attack: +11 to hit, reach 15 ft., one target. Hit: 32 (4d12 + 7) bludgeoning damage.
  • Swallow. The shonisaurus can attempt to swallow a Medium or smaller creature that it has grappled with its bite attack. The target must make a DC 19 Dexterity saving throw or be swallowed by the shonisaurus. A swallowed creature is blinded and restrained, it has total cover against attacks and other effects outside the shonisaurus, and it takes 21 (6d6) acid damage at the start of each of the shonisaurus's turns. The shonisaurus can have up to two Medium creatures, four Small creatures, or eight Tiny creatures swallowed at one time.
  • Frightful Presence. Each creature of the shonisaurus's choice within 30 feet of it and aware of it must succeed on a DC 17 Wisdom saving throw or become frightened for 1 minute. A creature can repeat the saving throw at the end of each of its turns, ending the effect on itself on a success. If a creature's saving throw is successful or the effect ends for it, the creature is immune to the shonisaurus's Frightful Presence for the next 24 hours.

While ChatGPT got the size and stats about right, and its speeds make more sense, the fact that it made a swimming creature amphibious shows the limitations of a tool like this. It's interpreting a swimming creature that's a whale-like reptile as logically being amphibious. It also gives the shonisaurus Swallow (appropriate) and Frightful Presence (might be pushing it for a natural creature).

You won't find the fact that shonisaurus breathes air on its Wikipedia entry. Shonisaurus was an ichthyosaur, which means it had to surface for air like a porpoise. In short, LLMs that take into account more sources than just Wikipedia will likely get closer to an accurate depiction of the beast. In this case, Notion did a better job.
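
The numbers themselves can be sanity-checked too. The sketch below assumes the usual 5E convention that a stat block's hit points are the average roll of the hit dice plus the flat bonus, rounded down; the function name `average_hit_points` is hypothetical, made up for this illustration.

```python
import math


def average_hit_points(dice_count: int, die_size: int, bonus: int) -> int:
    """Average of NdS + bonus, rounded down (assumed 5E stat block convention)."""
    # The average roll of a single die with S sides is (S + 1) / 2.
    return math.floor(dice_count * (die_size + 1) / 2 + bonus)


# Values copied from the two shonisaurus stat blocks above:
# (listed hit points, number of dice, die size, flat bonus)
blocks = {
    "Notion": (195, 17, 12, 85),
    "ChatGPT": (198, 12, 12, 84),
}

for name, (listed, dice_count, die_size, bonus) in blocks.items():
    expected = average_hit_points(dice_count, die_size, bonus)
    status = "matches" if expected == listed else "does not match"
    print(f"{name}: listed {listed}, dice average {expected} ({status})")
```

Under that assumption, Notion's 195 is consistent with 17d12 + 85, while ChatGPT's 198 is not what 12d12 + 84 averages to (162).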

Let's try another one.

Meet Tanystropheus​

Tanystropheus was a 20-ft.-long archosauromorph reptile from the Middle and Late Triassic epochs. It is notable for its very long neck (nearly 10 feet long), giving it considerable reach. Tanystropheus' paleobiology is somewhat controversial in that its peculiar appearance has caused disagreement among paleontologists as to its biology: it was originally thought that it was confined to waterways due to the massive neck. Recent biomechanics has pegged it more as a shore predator who "fished" with its neck.

Given these traits, we would expect a Large or larger creature with a bite attack that has a reach of at least 10 feet. It should be a swimmer, and if the AI uses the most recent data about it, it will give tanystropheus the ability to hold its breath instead of being amphibious.

Here's what Notion came up with:

Tanystropheus​

  • Large beast, unaligned
  • Armor Class 14 (natural armor)
  • Hit Points 76 (8d10 + 32)
  • Speed 30 ft., swim 30 ft.
  • STR 18 (+4) DEX 14 (+2) CON 18 (+4) INT 2 (-4) WIS 12 (+1) CHA 5 (-3)
  • Skills Athletics +6, Perception +5
  • Senses passive Perception 15
  • Languages
  • Challenge 4 (1,100 XP)
  • Hold Breath. The tanystropheus can hold its breath for 1 hour.
  • Actions
  • Bite. Melee Weapon Attack: +6 to hit, reach 10 ft., one target. Hit: 15 (2d10 + 4) piercing damage.
  • Tail. Melee Weapon Attack: +6 to hit, reach 15 ft., one target. Hit: 11 (2d6 + 4) bludgeoning damage.
  • Legendary Actions
  • The tanystropheus can take 2 legendary actions, choosing from the options below. Only one legendary action option can be used at a time and only at the end of another creature's turn. The tanystropheus regains spent legendary actions at the start of its turn.
  • Detect. The tanystropheus makes a Wisdom (Perception) check.
  • Tail Swipe. The tanystropheus makes a tail attack.

As with shonisaurus, Notion provides a walking speed equal to the swim speed, which seems unlikely. It also gives the tail a longer reach than its neck, making it effectively larger (at least 25 feet reach between the two) than the creature's established full length. For some reason it has Legendary Actions as an 8 HD critter that has a challenge rating of 4 (Notion really loves giving creatures Legendary Actions!). To its credit, it gave the tanystropheus only 2 of them.

Here's how ChatGPT did:

Tanystropheus

  • Large beast, unaligned
  • Armor Class: 15 (natural armor)
  • Hit Points: 112 (15d10+30)
  • Speed: 30 ft., swim 40 ft.
  • STR DEX CON INT WIS CHA
  • 18 (+4) 16 (+3) 14 (+2) 2 (-4) 12 (+1) 5 (-3)
  • Skills: Perception +3
  • Senses: passive Perception 13
  • Languages: --
  • Challenge: 5 (1,800 XP)
  • Multiattack. The Tanystropheus can use its Bite attack twice per turn.
  • Actions:
  • Bite. Melee Weapon Attack: +7 to hit, reach 10 ft., one target. Hit: 18 (3d8 + 4) piercing damage.
  • Tail Slap. Melee Weapon Attack: +7 to hit, reach 10 ft., one target. Hit: 16 (3d6 + 4) bludgeoning damage.
  • Long Neck. The Tanystropheus has a long neck that can extend up to 30 feet. It can use its Long Neck ability to make Bite or Tail Slap attacks against creatures up to 30 feet away.
  • Hold Breath. The Tanystropheus can hold its breath for up to 30 minutes.

ChatGPT does better with this one, but makes a curious choice: it creates a special ability known as "Long Neck" that lets its Bite or Tail Slap reach creatures up to 30 feet away. The attacks it covers should simply list a reach of 30 ft., but it seems as if ChatGPT is drawing on a different rule set for this trait, possibly an older version of D&D.

Despite that peculiarity, ChatGPT did a better job with this one.
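
The attack bonuses in all four stat blocks can be cross-checked against their own Strength scores and challenge ratings. This is a rough sketch that assumes the common 5E convention of attack bonus = proficiency bonus + Strength modifier, with proficiency derived from challenge rating; published monsters sometimes deviate from it, and the function names are hypothetical.

```python
def ability_modifier(score: int) -> int:
    """5E ability modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


def proficiency_bonus(challenge: int) -> int:
    """Proficiency bonus by challenge rating (assumed valid for CR 1 and up)."""
    return 2 + (challenge - 1) // 4


# (label, Strength score, challenge rating, listed attack bonus),
# copied from the four stat blocks above.
blocks = [
    ("Notion shonisaurus", 25, 9, 12),
    ("ChatGPT shonisaurus", 24, 10, 11),
    ("Notion tanystropheus", 18, 4, 6),
    ("ChatGPT tanystropheus", 18, 5, 7),
]

for label, strength, challenge, listed in blocks:
    expected = ability_modifier(strength) + proficiency_bonus(challenge)
    status = "consistent" if expected == listed else "inconsistent"
    print(f"{label}: listed +{listed}, expected +{expected} ({status})")
```

By this measure, both tanystropheus blocks and ChatGPT's shonisaurus are internally consistent, while Notion's shonisaurus lists +12 where its own Strength and challenge rating suggest +11.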

The Problem with Internet Content​

Because Large Language Model AIs use undisclosed data sets, we can't be sure where they pull inspiration for their 5E monsters. But we can make a few guesses.

For one, the open game license movement has been around since Third Edition and accelerated sharply under Pathfinder. As a result, there is an enormous body of work on the Internet that eclipses even 5E content and that LLMs might pull from. In my tests, the AI often provided content that was obviously from a different edition of D&D (the "Long Neck" trait being one example).

But OGL content from companies like Paizo and WOTC is at least playtested before it's released. The other giant source of publicly available D&D data is wikis and other fan-made content. This wild and woolly world of creativity is less constrained by game balance, and likely eclipses even the official OGL content.

The third thing to consider is that from a search engine optimization perspective, the characters "5E" are found both in "D&D 5E" and "D&D 3.5E". We might get better results typing out "Dungeons & Dragons Fifth Edition" instead of "D&D 5E" when asking LLMs for RPG content.
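
The overlap is easy to demonstrate with a plain substring test. This only illustrates the string overlap itself; it makes no claim about how any search engine or LLM actually matches or tokenizes text, and the spelled-out label for 3.5 is an assumed example.

```python
# The short tag appears inside both edition abbreviations.
short_tag = "5E"
abbreviated = ["D&D 5E", "D&D 3.5E"]
short_hits = [label for label in abbreviated if short_tag in label]

# The spelled-out name only appears in the Fifth Edition label.
long_tag = "Fifth Edition"
spelled_out = [
    "Dungeons & Dragons Fifth Edition",
    "Dungeons & Dragons 3.5 Edition",  # assumed spelling, for illustration
]
long_hits = [label for label in spelled_out if long_tag in label]

print(short_hits)  # both abbreviations contain "5E"
print(long_hits)   # only the Fifth Edition label contains "Fifth Edition"
```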

Should You Use It?​

Both Notion and ChatGPT can be enormously helpful in generating monster ideas. But they're also significantly flawed in their approach, such that game masters will want to use them with caution. In both experiments, I wasn't satisfied with either result and preferred to mix and match the two versions to create my own.

But they still created something out of nothing, and the results were certainly good enough for a DM to use on the fly in a home game. In that regard, they're probably good enough for most gamers to use.
 

Michael Tresca


Oddly, what we have with LLMs is Artificial Creativity. We already had AI. All computers are intelligent, in the same way anything able to process cause and effect is. What most people are thinking about is Artificial Sentience. That little point of cognition that leads to something going, “Wait a moment, but what about ME?”
These new LLMs kinda reinforce something for me I’ve suspected for decades. It's not a switch that turns on for any of them. It’s a spectrum. A sliding scale of factors.
 



Clint_L

Hero
And yes, as a GM, RPG writer, and game designer... This scares the hell out of me. Not because it might replace me. But how easy it is to articulate my own creativity from just a few lines into an adventure. Makes me question the concepts of how creativity works. Existential stuff.
Yes, this is what I've been trying to understand, as well. I teach both creative writing (stories, scripts, poems, journalism, etc.) and language and literature (essays, commentary, etc.). In creative writing, it has made me wonder about the relationship between our creativity and the actual job of writing (the "grunt work").

One thing that getting CW students to play with Chat demonstrates is how much farther they can carry an idea when they don't have to do most of that grunt work. But does this bring a risk of their creativity becoming homogenized (it's already pretty homogenous due to cultural factors)?

Essays are another issue - they have been the staple of academic assessment in a vast array of subjects for generations. Basically since the invention of modern education. There are a lot of reasons for this (public education being rooted in Enlightenment values, simple efficiency, etc.), but the fact that a non-sentient AI can produce very good essays off minimal prompts is startling and raises the question of how useful essays have ever been as an educational tool.

This is not the place for an in-depth discussion of those issues, but the bleed into what we do when preparing games is inevitable.

It really is an existential crisis for educators.
 

Tangent: I have used quite a few alternatives to GPT4, including GPT4ALL and the new StableLM. Both of these, while LLMs, are barely up to chatbot levels. GPT 3.5 and 4 are several orders of magnitude more knowledgeable. The other two systems can pretend to have a conversation, but know almost nothing about anything. In fact, they make things up just to chat. GPT 3.5 and 4 can actually hold a conversation about a subject, or hold a conversation over time without losing track of the point in just a few moments.

In other words, these other systems are just a slightly better ELIZA bot. GPT4 is an expert system on a wide range of topics. I mean, it makes things up too, sometimes, but it knows a lot of things as well.
 
