D&D General DALL·E 3 does amazing D&D art

So yesterday I wanted to make a pic for a token for an upcoming one-shot.

I fed the same prompt into ChatGPT, Microsoft Designer, and a local copy of Stable Diffusion using the 'DynaVision XL' SDXL model checkpoint (essentially the base training model).

Got three extremely different results:

ChatGPT:

View attachment 403467

Microsoft Designer:

View attachment 403468

Local Stable Diffusion:

View attachment 403469

The prompt:
"A vibrant 1970s fantasy style RPG illustration of a freckled dark brown-skinned dwarf woman, age 29, in a fullbody shot. She wears a platemail breastplate, and her black wavy hair is tied into a ponytail, adding to her fierce and majestic appearance. She holds a giant fantasy warhammer in both hands, standing confidently in a bustling underground dwarven cavern fortress. The cobblestone streets are lined with lively market stalls, filled with various goods from weapons to food. The underground town is filled with dwarves in medieval merchant attire, engaging in various activities, creating a lively and dynamic atmosphere. The lighting is moody, casting a deep tone over the scene. The dwarves are seen bartering and socializing, with the overall atmosphere being one of camaraderie and industriousness. The camera captures her full body."

While I am not sure it will give you the style you're after, SDXL-based models usually don't respond well to long-ish prompts.

1745627676952.png


This prompt does well with more recent models, but also fails to describe the style. Maybe more description of the actual style would help?
 

This looks more like a human hanging out with a bunch of dwarves; like Snow White if she was more likely to smash your skull in instead of eating an apple.
 

Did ChatGPT see the word "socializing" and turn the dwarves into Communists?
Socializing, hammer, industrious, goods, camaraderie.

I can actually sort of understand how the A.I. got "communism" out of that. Kind of almost maybe. Like if you can imagine the sensibility of a robot, but crossed with a student barely paying attention in Poli Sci, you get something that would read that prompt and say, "Oh! You mean communism!"
 

cbwjm said:
This looks more like a human hanging out with a bunch of dwarves; like Snow White if she was more likely to smash your skull in instead of eating an apple.

Yeah, I think there aren't enough images of female dwarfs for the model to have a clear idea of what one looks like. Maybe one should actually describe a short woman with a beard instead of "female dwarf".
 

Socializing, hammer, industrious, goods, camaraderie.

I can actually sort of understand how the A.I. got "communism" out of that. Kind of almost maybe. Like if you can imagine the sensibility of a robot, but crossed with a student barely paying attention in Poli Sci, you get something that would read that prompt and say, "Oh! You mean communism!"

I actually love that it made that leap.

Hammer? Clearly you mean Hammer and Sickle.
Socializing? Nay, SOCIALISM.
Industrious? Take control of the tools of Industry!
Camaraderie?! Comrade amirite?!?
 


While I am not sure it will give you the style you're after, SDXL-based models usually don't respond well to long-ish prompts.

View attachment 403540

This prompt does well with more recent models, but also fails to describe the style. Maybe more description of the actual style would help?
What were your prompt, models, and settings for this one? Looks like you've landed on a much better set of models and LoRAs than I had.
 

I used your prompt, minus the parts that don't really describe anything visual:

"A vibrant 1970s fantasy style RPG illustration of a freckled dark brown-skinned dwarf woman, age 29, in a fullbody shot. She wears a platemail breastplate, and her black wavy hair is tied into a ponytail, adding to her fierce and majestic appearance. She holds a giant fantasy warhammer in both hands, standing confidently in a bustling underground dwarven cavern fortress. The cobblestone streets are lined with lively market stalls, filled with various goods from weapons to food. The underground town is filled with dwarves in medieval merchant attire, engaging in various activities. The lighting is moody, casting a deep tone over the scene. The dwarves are seen bartering and socializing. The camera captures her full body."

The model was HiDream-Full, one of the best open-source models currently available.
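
The trimming described above, dropping clauses that don't describe anything a model can draw, can be roughed out in a few lines of Python. This is only a minimal sketch: `strip_phrases` and `NON_VISUAL` are hypothetical names, and the phrase list simply mirrors the two cuts made by hand in this thread. It isn't a feature of HiDream or any other tool.

```python
# Hypothetical helper: drop clauses that describe mood rather than visible content.
# The phrase list is an assumption, copied from the two hand-made cuts in this thread.
NON_VISUAL = [
    ", creating a lively and dynamic atmosphere",
    ", with the overall atmosphere being one of camaraderie and industriousness",
]


def strip_phrases(prompt: str, phrases: list[str]) -> str:
    """Return the prompt with each listed phrase removed and whitespace collapsed."""
    for phrase in phrases:
        prompt = prompt.replace(phrase, "")
    return " ".join(prompt.split())


original = (
    "The underground town is filled with dwarves in medieval merchant attire, "
    "engaging in various activities, creating a lively and dynamic atmosphere. "
    "The dwarves are seen bartering and socializing, with the overall atmosphere "
    "being one of camaraderie and industriousness."
)

trimmed = strip_phrases(original, NON_VISUAL)
print(trimmed)
```

Exact-match removal is deliberately dumb: deciding which phrases count as "non-visual" still has to be done by eye.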
 

Is that the same thing as Bing/Dall-E?
Adding to what @Kichwas said: from what I can tell, bing.com/create is the purely text-based DALL·E 3 one, so you can't feed it an image or edits, or tell it to take the previous image and iterate on it. You do get 15 fast credits per day before it switches to slow mode.

Hmm, looks like there's a Flux model now for generating RPG stuff, wonder how well it works. Could be useful to generate a base to then i2i it in another model for the style desired?
 

In my experience, to significantly alter the style of an image with an image-to-image workflow, you need enough denoise that only the basic composition of the image remains. I have had good success with i2i by first generating a basic outline with a model that is very good at understanding my prompt, then running i2i with a less precise but prettier model. But I don't think starting with Flux and then running i2i in a model with the right style would work that well: the fine details would be lost during denoising.
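
To put a rough number on "enough denoise": as far as I know, the Hugging Face diffusers img2img pipelines use the `strength` setting to decide how many of the scheduled steps are actually run on the noised input image. The sketch below only mirrors that arithmetic in plain Python. `steps_run` is a hypothetical name, and other front ends may compute this differently.

```python
def steps_run(num_inference_steps: int, strength: float) -> int:
    """Denoising steps applied to the input image (diffusers-style arithmetic, assumed)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # Low strength: few steps, the input survives mostly intact.
    # High strength: most steps run, so mainly the composition survives.
    return min(int(num_inference_steps * strength), num_inference_steps)


for strength in (0.25, 0.5, 0.75):
    print(strength, steps_run(40, strength))
```

With 40 scheduled steps this gives 10, 20, and 30 steps. The more of the schedule is re-run, the more of the input's fine detail is regenerated rather than kept.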

I tried an i2i by telling ChatGPT 4o to identify the style of the second image in the original post and apply it to the image I had made earlier with HiDream. The result was this one:

1745672203098.png


The dwarves were nearly lost in the background, and so were the freckles on the subject.
 
