D&D General D&D and AI Text-to-Image Generation

Byron.the.Bard

Villager
Byron the Bard has had fun with D&D inspired text-to-image generation!

Here's a little bit of context:

Machine learning models for generating images from text prompts have improved significantly over the last few years. These models are trained on a large dataset of captioned images, and encode knowledge about the world and language statistically. Once trained, a model can be given a text prompt and will create an original image on demand. Here we have some fun by trying to elicit how much knowledge of the world of roleplaying games, and specifically Dungeons and Dragons, one of these models can capture.

Take a look for instance at the beholders generated by an AI:

a-band-of-kobolds-fight-against-a-beholder.png
a picture of a beholder by Picasso.png



To read more and see more images, check it out at: D&D and Text-to-Image Generation
 

There's a fascinating (and kind of scary) new generative neural network that can create whole images purely from text prompts, images that are potentially photorealistic and modifiable (e.g. asking the computer to add a sofa or remove a plant). It can even work in different styles or artistic flourishes.

I immediately saw the potential for use in generating character art in D&D. It's not directly accessible by the public, sadly, but there are exciting developments on this front.

Here's a link.

Edit: hah, serves me right for not going to your link first! This is just the more advanced version of the thing you've discussed.
 




Guest 7034872

Guest
Yeah...Artflow doesn't handle non-human faces very well, does it? Like, not even tieflings, to say nothing of dragonborn.

Everything non-human looks like an elf or an orc.
All true. I'm still surprised by just how plausible the human faces are, though. I had no idea the software had come this far.
 

J.Quondam

CR 1/8
All true. I'm still surprised by just how plausible the human faces are, though. I had no idea the software had come this far.
AI does photorealistic faces, too:


On that site, there are no inputs or prompts or anything; just hit reload to get a new picture. Usually you have to pore over the image pretty closely to spot the little discrepancies that show it's artificially generated. (Though once you spot some of the weirder ones, they can't be unseen, heh!)

It's a little shocking how good the tech has become, and how fast it continues to evolve.
 


Eventually all formerly creative media will be automatically-generated by AI, probably.
Unlikely. Statistical models can become quite adept at generating patterns with which they have some familiarity. E.g. the one I linked that allows people to alter the style of an image, or request the same painting from a different angle. But there remains a key problem with such approaches, one that is unlikely to be resolved quickly or easily.

These models do not contain semantic content.

Remember when that new GPT model was announced, and the people making it made ominous promises not to release it for others to use because they were afraid that it was too dangerous? Yeah, that can generate maybe a couple of paragraphs of text without (much) need for humans to fix the problems. As shown with the "unicorns that speak perfect English" fake article, though, it breaks down really badly once you get past three or four paragraphs, becoming wildly unhinged and self-contradictory. That's because it contains no semantic content, only statistically observed syntactic content.

Neither DALL-E nor GPT-3 has the smallest scrap of semantic content, of grasping the meaning and significance of a piece. A great demonstration of this comes up when trying to create an AI that generates classical music. You can often train one that can generate some really beautiful and surprising passages of, say, baroque music in the style of Bach, but it will struggle mightily with introductions and cadences, because the AI doesn't understand anything; it cannot "see" that a piece needs a satisfying ending, to say nothing of knowing what "satisfying ending" means. Much of the time it will produce a bizarre infinitely-continuing piece, where chunks of beautiful Bach-like filigree are embedded in a sea of random noise notes. Better training can reduce the amount of noise, but you need a whole different approach to write music that will actually be pleasant to listen to.

For relatively simple things, like still images, or simple and short videos that only change small parts (like faces) of extant video, then yes, there is the possibility that these tools could someday replace human effort for all but minor finishing touches. For novels or cartoons or performance music? No, not really, not any time soon. We're pushing the boundaries of what's possible, and we'll be running into limits of data storage and usability soon.

Plus...the instant you throw a genuine unknown at one of these things it chokes. Sometimes badly. As noted with my "wow this thing can't even do tiefling" thing; it just spits out humans, elves, and orcs. Maybe dwarves and halflings too, haven't tested them yet.

So yeah. The much more realistic concern is these things being used to automate parts of entertainment that currently do require humans, turning certain aspects of some types of creative work into mere editing. But even writing a news article is gonna be tough when you need to actually reference real names, quotes, etc. and not fictive, made-up ones.

As long as AIs deal only in long-scale statistical associations between things, that is, only in syntactic content, then no matter how complex they become, they will never completely replace human effort in creative media. You need semantic content, an understanding of how the meanings and purposes relate to one another, to produce most creative works of any meaningful length.
 

Thanks for that, it makes me feel a bit better. "AI making artists obsolete" is one of those ideas that gives me the willies (although a lot of the AI generated images I've seen give me a very strong "uncanny valley" feeling).

I did find it amusing that even this AI thinks that Demogorgon is that creature from Stranger Things.
 



Byron.the.Bard

Villager
Unlikely. Statistical models can become quite adept at generating patterns with which they have some familiarity. E.g. the one I linked that allows people to alter the style of an image, or request the same painting from a different angle. But there remains a key problem with such approaches, one that is unlikely to be resolved quickly or easily.

These models do not contain semantic content.

Yeah, I think that is very right, and some of the failures in the post illustrate that the model is only parsing syntactic tokens. In a way, one of the challenges in using such models is crafting a good (syntactic) string with enough tokens and hacks that the model spits out something relevant.

I would say that in many similar statistical models a critical area is where the model does not have examples and must interpolate/extrapolate. I think that is both problematic and critical: without semantic grounding the model can hardly fill the void meaningfully; but without the constraint of semantics, it can bend syntactic categories and generate wild outputs. Finding these outputs may be fun and inspiring for human artists (or for anyone like me who cannot draw a beholder! :D)
 

I immediately saw the potential for use in generating character art in D&D. It's not directly accessible by the public, sadly, but there are exciting developments on this front.

Here's a link.
thank you

Edit: on the wait list for the big one (can't imagine I will get in), and the mini one keeps telling me it's busy... I tried for purple hair metal winged hawkwoman just to see what it would do, but no go.
 


overgeeked

B/X Known World
Byron the Bard has had fun with D&D inspired text-to-image generation!

Here's a little bit of context:



Take a look for instance at the beholders generated by an AI:

View attachment 249593
View attachment 249594


To read more and see more images, check it out at: D&D and Text-to-Image Generation
Something like that but for black and white line art would be amazing. How does one dive into AI-generated art? And what are the copyright issues with AI-generated art?
 



Beleriphon

Totally Awesome Pirate Brain
AI does photorealistic faces, too:


On that site, there are no inputs or prompts or anything; just hit reload to get a new picture. Usually you have to pore over the image pretty closely to spot the little discrepancies that show it's artificially generated. (Though once you spot some of the weirder ones, they can't be unseen, heh!)

It's a little shocking how good the tech has become, and how fast it continues to evolve.
You know what's funny about some of those? I instantly recognized the initial input parameters. There's a particular angle that pops up for middle aged blonde women that is clearly derived from a photo of Angela Merkel.
 

Yaarel

Mind Mage
You know what's funny about some of those? I instantly recognized the initial input parameters. There's a particular angle that pops up for middle aged blonde women that is clearly derived from a photo of Angela Merkel.
Regarding the Artflow.AI face generator, I suspect the "base" face is more a composite of German Americans, rather than any specific person.

A way to get around this is to name a specific celebrity, but even then, a few of the celebrity's features sometimes end up superimposed on the German-American base. The German composite seems to be used for both male and female faces, so the resulting celebrity isn't instantly recognizable. For example, "keanu-reeves" looks less like the actor himself, and more like a German-American composite.

The database only recognizes some celebrities, but not others. So it seems each datapoint was entered separately. It takes time to figure out who is in the database.

Also, if you put in a first name and a last name, it sometimes interprets this as two different celebrities, and morphs them together. It is surprising when one celebrity is a man and the other is a woman, where the result can be 1. male, 2. female, or 3. intersex.
 