Can you give me an example? I'm still dumb on this.
Well, of course.
In an earlier post you used the prompt "(((full body visible))) 20 yr old woman with pink dyed hair dressed as an assassin in a burning factory, intricate, highly detailed, 8k ultra-realistic, colorful, painting burst, beautiful symmetrical face, a nonchalant kind look, realistic round eyes, tone mapped, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, dreamy magical atmosphere, 4k, looking at the viewer"
This style of prompt generates tokens that the model uses, but the model doesn't know the relationships between them. That's why earlier models either performed better when they had a single subject (every keyword pointed at the only subject, or at the image in general) or suffered from concept bleed. If you needed two women, adding "wearing a yellow tote bag" would leave the AI to randomly decide which woman should wear it. You could improve the odds by removing commas and putting related keywords close together, but the limitations of the "text encoder" part of the AI model showed quickly. The original text-encoding model (CLIP) was around 500 MB in size and could only do limited encoding of text into tokens usable by the image-generating model.
Newer models use a much larger text encoder, T5-XXL in the case of Flux, which is around 45 GB unpruned. It can understand much, much more natural text, and the relationships between words, so it can generate tokens that represent more complex things. However, if you feed this newer encoder a prompt in the old style, it still doesn't know which woman is holding the yellow tote bag, because that wasn't in the prompt. Also, to make sure the large encoder is used well, the image-generation part of the model is trained on images with much, much longer captions, so it can learn concepts more easily, and it responds better when prompted in natural language. I'll post a few images and prompts to illustrate later today.
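To make the tote bag example concrete, here's a small illustrative sketch (my own hypothetical prompts, not from the original post) contrasting the two styles. The exact wording is invented; the point is that the keyword style carries no syntax binding the bag to either woman, while the natural-language style does:

```python
# Hypothetical illustration: the same two-woman scene in both prompt styles.

# Old keyword style: a flat bag of fragments. Nothing links
# "yellow tote bag" to either woman, so the model picks at random
# (concept bleed).
old_prompt = (
    "2 women, pink hair, blonde hair, yellow tote bag, "
    "city street, highly detailed, 8k"
)

# Natural-language style for a large encoder like T5-XXL: the grammar
# itself binds each attribute to a specific subject.
natural_prompt = (
    "Two women standing on a city street. The woman on the left has "
    "pink dyed hair and carries a yellow tote bag; the woman on the "
    "right is blonde and carries nothing."
)

# Splitting the keyword prompt shows the isolated fragments the old
# encoder had to work with:
print([k.strip() for k in old_prompt.split(",")])
```

In the first prompt, "yellow tote bag" is just one fragment among seven; in the second, "carries a yellow tote bag" is grammatically attached to "the woman on the left", which is exactly the relationship a larger text encoder can exploit.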