DALL·E 3 does amazing D&D art
Jfdlsjfd said:

Well, of course :)

In a former post you used the prompt "(((full body visible))) 20 yr old woman with pink dyed hair dressed as an assassin in a burning factory, intricate, highly detailed, 8k ultra-realistic, colorful, painting burst, beautiful symmetrical face, a nonchalant kind look, realistic round eyes, tone mapped, intricate, elegant, highly detailed, digital painting, art station, concept art, smooth, sharp focus, illustration, dreamy magical atmosphere, 4k, looking at the viewer".

This style of prompt generates tokens that the model uses, but the model doesn't know the relationships between them. That's why earlier models either performed better when they had a single subject (every keyword pointed to the only subject, or to the image in general) or suffered from concept bleed. If you needed two women, adding "wearing a yellow totebag" would lead the AI to decide at random which woman should wear it. You could improve the odds by removing commas and putting related keywords close together, but the limitations of the "text encoder" part of the model showed quickly. The original text encoder (CLIP) was about 500 MB in size and could only do limited encoding of text into tokens usable by the image-generating model.

Newer models use a much larger text encoder, T5-XXL in the case of Flux, which is around 45 GB unpruned. It can understand much more natural text, and the relationships between words, in order to generate tokens that can represent more complex things. However, if you prompt this newer encoder with prompts in the old style, it still doesn't know which woman is holding the yellow totebag, because that wasn't in the prompt. Also, to make sure the large encoder is used well, the image-generation part of the model is trained on images with much longer descriptions, so it learns concepts more easily and responds better when prompted in natural language. I'll post a few images and prompts to illustrate later today.
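The "concept bleed" problem described above can be sketched with a toy simulation (no real diffusion model involved; the subject names and functions here are purely illustrative). A bag-of-keywords prompt carries no grammar binding an attribute to a subject, so assignment is effectively a coin flip, while a natural-language prompt states the binding explicitly:

```python
import random

def keyword_prompt(subjects, attributes, seed=None):
    """Bag-of-keywords style: each attribute lands on a random subject,
    mimicking the concept bleed seen with small text encoders."""
    rng = random.Random(seed)
    scene = {s: [] for s in subjects}
    for attr in attributes:
        # Nothing in the prompt says who owns the attribute -> coin flip.
        scene[rng.choice(subjects)].append(attr)
    return scene

def natural_language_prompt(bindings):
    """Natural-language style: the sentence itself states who owns what,
    so a capable encoder can resolve the binding deterministically."""
    return {subject: list(attrs) for subject, attrs in bindings.items()}

# Keyword style: "two women, pink hair, yellow totebag" -- ambiguous.
ambiguous = keyword_prompt(["woman A", "woman B"],
                           ["pink hair", "yellow totebag"], seed=0)

# Natural style: "woman A has pink hair; woman B carries a yellow totebag."
explicit = natural_language_prompt({"woman A": ["pink hair"],
                                    "woman B": ["yellow totebag"]})

print(ambiguous)   # attribute-to-subject assignment varies with the seed
print(explicit)    # deterministic: {'woman A': ['pink hair'], 'woman B': ['yellow totebag']}
```

The point is not the code itself but the contrast: in the first call, which woman ends up with the totebag depends on chance, exactly as with older CLIP-based models; in the second, the ownership is part of the input, which is what a T5-class encoder can actually exploit.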