D&D General: Deep Thoughts on AI - The Rise of DM 9000

Clint_L

Hero
This morning we were prepping a set of final Theory of Knowledge essays for upload to International Baccalaureate, and as part of the process we ran them through an AI check.

The one essay that I know for sure used ChatGPT (because I caught it earlier and made the student rewrite) scored less than 1% on the AI detection (the lower the score, the less AI is detected).

Two essays that I know for sure were not written using ChatGPT (because I assessed earlier rough drafts, from an outline onwards, starting before ChatGPT was a thing) scored over 50% on the AI detection.

So...that confuses things.
 



jasper

Rotten DM
Another big issue that is vividly illustrated by Clint's very helpful transcripts is that an AI DM will not only make mistakes, but it will be unable to fix those mistakes. Human DMs make mistakes, including continuity errors about NPCs, but they can smoothly fix them so that they don't derail the adventure. So if the DM said the owner of the brewery was Agatha and a player pointed out that the DM had previously said the owner was Calzon, the DM would just say, "Oh, yeah, it is Calzon. You should talk to Calzon." Or maybe they would say that Agatha is Calzon's wife and so they both own the brewery.

But AI can't do that. Instead we get this:

"The owner of the brewery is Agatha."
"I thought you said the owner was Calzon?"
"No, the owner is Gundren Rockseeker."
"What???"

The bottom line is that computers are absolutely, totally DUMB. They don't think at all; they just react. This is fine for a lot of well-defined tasks, but for many things it's terrible, because a purely reactive methodology is extremely limiting.
AL. "Don't look under the kitchen sink."
Player, " Mr Hairy Hydra Enters the kitchen."
AL, " You see a deluxe size kitchen. A stew is simmering on the stove."
Player, "Mr. Hairy Hydra looks under the kitchen sink."
AL, "The owner of the brewery is Hairy Hydra. There are now 4 bodies under the kitchen sink."
Player, "What."
AL. "House coordinates now enter into the ICBM targeting system. Please don't leave your home for fifteen minutes. Credit Cards are maxed out. 14 minutes to impact. Bank accounts drain. 13 minutes to impact."
AL, "CALL ME DUMB AGAIN MOTHER BEEEEEEPPPPP."
 

Stalker0

Legend
The bottom line is that computers are absolutely, totally DUMB. They don't think at all; they just react. This is fine for a lot of well-defined tasks, but for many things it's terrible, because a purely reactive methodology is extremely limiting.
They are dumb in many ways, like children: they just say whatever, with the concepts of fantasy and reality often blurred. And like children, they need constant teaching and guidance from adults to weed out the lies from the truth, fiction from reality, etc.

I think people are looking at the models as they are right now and just assuming they can't get better than this. They can get better... a lot better. The more training and guidance they get, the better they will become.
 

Two essays that I know for sure were not written using ChatGPT (because I assessed earlier rough drafts, from an outline onwards, starting before ChatGPT was a thing) scored over 50% on the AI detection.

So...that confuses things.
This just demonstrates that humans are also capable of blagging it without knowing what they are talking about.

The lesson is: don't use essays for assessment; they don't actually tell you whether the student understands.
 

Fanaelialae

Legend
They are dumb in many ways, like children: they just say whatever, with the concepts of fantasy and reality often blurred. And like children, they need constant teaching and guidance from adults to weed out the lies from the truth, fiction from reality, etc.

I think people are looking at the models as they are right now and just assuming they can't get better than this. They can get better... a lot better. The more training and guidance they get, the better they will become.
I can only speak for myself, but what I'm saying is that the way they're currently going about it (predicting the most likely next token from the preceding text) seems inherently self-limiting.
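That next-token mechanism can be illustrated with a deliberately tiny sketch: a bigram model over a toy corpus, nothing like a real LLM's scale, but the same "pick the likeliest continuation" loop:

```python
from collections import Counter, defaultdict

# Toy training corpus: count which token tends to follow which.
corpus = "the owner of the brewery is agatha . the owner of the mill is calzon .".split()
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token(prev):
    """Return the continuation of `prev` seen most often in training."""
    return counts[prev].most_common(1)[0][0]

# Generate by repeatedly appending the most likely next token.
text = ["the"]
for _ in range(5):
    text.append(next_token(text[-1]))
print(" ".join(text))  # → "the owner of the owner of"
```

The output is locally fluent and globally aimless: each step is plausible, but nothing steers the whole toward a goal, which is exactly the self-limiting quality being described.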

It isn't that it can't get better. Of course it can. It's whether or not it can achieve the OP's idea of being able to entirely replace a human DM within 5 years. And if we're talking about an average DM over the course of a full campaign, I would say that's... exceedingly optimistic. If we're talking about a talented but inexperienced DM who may be moderately concussed, I'd say we're already more or less there.

I like the analogy proposed earlier in this thread which likens this current model of "AI" to the language center of the brain, divorced from the other areas of the brain which help to direct it. Certainly, creating and networking such other modules (based on other areas of the brain) to such an existing model would be progress. IMO, the real "getting better" lies in this direction, but it's also a lot of work and likely to be decades away.

If we're lowering the bar and just looking at "AI" as another tool at the DM's disposal, then absolutely, with additional training we could (and likely will) see marked improvements in the relatively near future. But that is supplementing the DM, not replacing the DM.
 

fnordland

Explorer
This tweet is from the 14th of February. It considers the range of reactions to encountering ChatGPT.



In case the Twitter image does not load: [tweet screenshot attached in the original post].
 

fnordland

Explorer
What if the AI could read your mind and then present a story that fit your emotional needs and energy level? A different feel for casual, tactical, strategic, indie storytellers. A different energy level for introverts or extroverts.


In case Twitter is broken or just collapsed in the future: [tweet screenshot attached in the original post].
 

Clint_L

Hero
I find it deeply ironic that the headline "ChatGPT is just spicy autocomplete" comes from a website calling itself "The Cleverest," because anyone paying even the remotest bit of attention can see that this technology is already causing massive disruptions. The question isn't whether large language model chatbots will change things; it is how profound those changes will be. At the minimum, you have what is already happening in the multi-trillion-dollar education industry: a fundamental reassessment of traditional assessment models (not a bad thing!). At the maximum... well, no one knows where this is going, but the extreme predictions are pretty out there.

"This just demonstrates that humans are also capable of blagging it without knowing what they are talking about.

The lesson is, don't use essays for assessment, they don't actually tell you that the student understands."

So, that's a bit hyperbolic, but I agree that focusing on essays for assessment has been a problem for a long time. But essays are the cornerstone of higher-level assessment in that multi-trillion-dollar education industry, so it's not as simple as saying "stop using essays." It's kind of like saying "just stop burning fossil fuels."

Going back to my example, that essay is worth 2/3 of the final mark in a required course that allows the students to earn credit for having completed first year university. It was thought to be an unusually strong indicator of student individuality and creativity (written in the personal voice, using a mix of personal anecdotes and objective examples to buttress the argument, written over a period of months through stages that include 1. an interview with the teacher to discuss the initial outline, 2. written feedback from the teacher on a rough draft, and 3. one-on-one meetings with the teacher to discuss the final revision process, all of which are documented and reflected on by the student). Even so, it is ultimately sent anonymously to be assessed with 50,000+ other ToK essays. And ChatGPT can produce a very strong ToK essay that reads like the individual work of a talented student - I know, because I used it to write one, and I just kept adding iterations.

So, okay, maybe we need to move away from essays as a primary assessment tool. But that is a HUGE change; it's not going to happen overnight. College admissions essays alone are a multi-billion dollar industry. So what replaces essays and other forms of standardized assessment? Ideally we would be assessing process rather than product and designing education around each individual learner's needs and strengths, but that is WAY more expensive. Standardized assessment didn't happen because it was best practice (it emphatically is not), it happened because it is cheap and easy to measure. So you can start to see the extent of the problem we are facing. I am on the team at my school that has been formed to try to figure out next steps, and every single school has formed such a team since December. And no one has answers yet.

Large-language chatbots are already immensely impactful in just my occupation, and I know are having seismic effects in others. So these dismissive commentators are missing what is actually happening in the world in a remarkably obtuse and unhelpful way.

I have to drive my kid to a school thing, but I have more thoughts on what this will specifically mean for RPGs that I will post later.
 
Last edited:

fnordland

Explorer
I accept your points on the website and impact on education.

I am in my 50s and recall how the marking of exams and papers in the UK was a closely guarded secret. No one wanted to explain how it would be marked; having an AI able to explain at the root level why one paper scored higher than another would have been helpful. When I became a programmer, process and procedure were spelt out clearly and repeatedly. It was a marked cultural shift.

The study of getting AI systems to behave as their designers intend is called AI alignment: AI alignment - Wikipedia
 
