D&D General: D&D and AI Text-to-Image Generation

Yaarel

Mind Mage
Also using Artflow.AI, there seems to be no way to change the color of the eyebrows, so it's pretty much the same eyebrows regardless of whose face it is.

I was trying to create a portrait of a mythologically accurate Norse elf, where white eyebrows are a notable feature, but neither the white eyebrows nor the luminous aura seemed doable.

The "Nordic man" result was actually offensive. The stereotype is Conan the Barbarian (who, again, is a non-Scandinavian archetype), but the result looks more like some unkempt German American homeless guy. Heh.

I was able to clean up the stereotype by adding "short-hair smooth-face", but the result is still German American, rather than a Norse alfr.

Viking Era Norse valued being tough and rugged, and probably many were. At the same time, Norse culture cared very much about personal hygiene, hair grooming, and fashionable appearance.
 


Yora

Legend
Even when AI is more humanly creative, there will still be humans "directing" the creative output.
Directing is the right term.
A theatre or movie director does not appear in the play and might not have touched any of the sets and props, or ever made a sketch for the production people. Simply saying "I like that" and "I don't want that" can become a hugely creative process, where thousands or tens of thousands of choices determine the final output.
 

Yaarel

Mind Mage
Directing is the right term.
A theatre or movie director does not appear in the play and might not have touched any of the sets and props, or ever made a sketch for the production people. Simply saying "I like that" and "I don't want that" can become a hugely creative process, where thousands or tens of thousands of choices determine the final output.
And I see AI as empowering artists.

So much artistic time is "wasted" on process. AI does the dirty work, so the artist can focus on achieving the "vision".
 

Byron.the.Bard

Villager
Something like that, but for black and white line art, would be amazing. How does one dive into AI-generated art? And what are the copyright issues with AI-generated art?

I think you can always try to nudge the model in that direction by inputting something like 'beholder in black-and-white', 'beholder in black and white line art', or 'etching of a d&d beholder'. The model is not specialized for black-and-white images, but it should be able to produce something reasonable.
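To make that trial-and-error less haphazard, one approach is to generate the prompt variants systematically. A minimal sketch, assuming a hypothetical `generate()` call standing in for whatever inference API the model actually exposes (the real dalle-mini interface may differ):

```python
# Systematically build prompt variants that steer a text-to-image
# model toward black-and-white line art.

subject = "beholder"
style_suffixes = [
    "in black and white",
    "black and white line art",
    "pen-and-ink drawing",
    "etching, monochrome",
]

# One prompt per style suffix, e.g. "beholder, pen-and-ink drawing".
prompts = [f"{subject}, {s}" for s in style_suffixes]

def generate(prompt):
    """Hypothetical placeholder for a real model call; it just echoes
    the prompt so the sketch runs without any model installed."""
    return f"<image for: {prompt}>"

for p in prompts:
    print(generate(p))
```

Running each variant and comparing the results is usually faster than tweaking a single prompt over and over.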

The copyright question, I am afraid, is a big gray area. As far as I know, it is debated who owns what and who bears the rights, responsibilities, and duties. There is discussion about whether ownership belongs to the creator of the model, the final user of the model, or even the model itself (although this last avenue seems less successful at the moment: Import AI 289: Copyright v AI art; NIST tries to measure bias in AI; solar-powered Markov chains).
Some models, like the one I used, come with specifications of intended use (dalle-mini/dalle-mini · Hugging Face), but I doubt they cover all potential uses, and I am not sure what their legal strength would be. A lawyer in this area may know better :)
 

Yaarel

Mind Mage
Wow. The Dall-E Mini image generator conjures up D&D maps that I find useful as a DM.

Here I entered "d&d map", and it offered an assortment of randomly generated map tiles.

[Attached images: dallemini_2022-6-19_5-0-57 d&d map.png, dallemini_2022-6-19_1-38-53 d&d map.png]
To my surprise, the AI even correctly interpreted the words "d&d forest map".
[Attached image: dallemini_2022-6-19_5-14-34 d&d forest map.png]

This is amazing for on-the-fly encounters.

It is fun as a DM to make up explanations for what is where, and why.
 

Yaarel

Mind Mage
The art that Dall-E-mini generates deeply impresses me.

Allowing for an artistic style (one that reminds me of expressionism, cubism, and American primitivism), each piece evidences remarkable "composition".

Composition is the use of lines, colors, and so on to achieve a self-sufficient, holistic image that has balance and focus. It is difficult to describe composition in words, never mind write computer code for it, but it is the essence of "art".

Obviously, the Dall-E program had its neural network trained on many real-life human masterpieces from art history. But somehow it is able to recognize the presence of composition within these diverse works. It is even able to achieve composition each time, for each new image.

Dall-E is an artist. This is perhaps the most "human" aspect of the AI.
 

Yaarel

Mind Mage
Yeah, I think that is very right, and some of the failures in the post illustrate that the model is only parsing syntactic tokens. In a way, one of the challenges in using such models is crafting a good (syntactic) string with enough tokens and hacks that the model spits out something relevant.

I would say that in many similar statistical models, a critical area is where the model does not have examples and must interpolate/extrapolate. I think that is both problematic and critical: without semantic grounding, the model can hardly fill the void meaningfully; but without the constraint of semantics, it can bend syntactic categories and generate wild outputs. Finding these outputs may be fun and inspiring for human artists (or for anyone like me who cannot draw a beholder! :D)
Heh, entering a text is like a D&D Wish spell! Be careful what one wishes for!

I wish there was a way to control the way the AI parses a text entry.

So far, I have occasionally been hyphenating certain words together, and the results seem to interpret the linked words as a separate unit first, before interpreting the other words with them. It is kinda hard to tell. Word linking (and maybe word order) produces different results, but it is hard to say what causes what.

I have wrestled with two text difficulties.

• The interpretation gets "preoccupied" with a specific word; in that case, adding another word can help expand the possible meaning of the word, so as to produce the correct result.

• Adding too many words "distracts" the interpretation, and the results lose track of an important word.
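Both difficulties above can be probed mechanically. A minimal sketch (not part of any real tool): drop one word at a time to see which word the model is "preoccupied" with, and append one qualifier at a time to see when extra words start "distracting" it:

```python
# Diagnostic prompt variants for a text-to-image model.

def leave_one_out(words):
    """All prompts with a single word removed. If a result barely
    changes when a word is dropped, the model was ignoring that word;
    if it changes drastically, the model was fixated on it."""
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]

def add_one(base_words, qualifiers):
    """All prompts with a single extra qualifier appended, to find
    the point where added words start diluting the important ones."""
    return [" ".join(base_words + [q]) for q in qualifiers]

base = ["norse", "elf", "portrait"]
print(leave_one_out(base))
# ['elf portrait', 'norse portrait', 'norse elf']
print(add_one(base, ["white-eyebrows", "luminous"]))
# ['norse elf portrait white-eyebrows', 'norse elf portrait luminous']
```

Feeding each variant to the generator and eyeballing the outputs gives a rough sense of which words carry the most weight.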
 

Byron.the.Bard

Villager
Heh, entering a text is like a D&D Wish spell! Be careful what one wishes for!

I wish there was a way to control the way the AI parses a text entry.

So far, I have occasionally been hyphenating certain words together, and the results seem to interpret the linked words as a separate unit first, before interpreting the other words with them. It is kinda hard to tell. Word linking (and maybe word order) produces different results, but it is hard to say what causes what.

Ahaha! That's a cool way to put it! :D But a Wish that gets processed by a modron :)

But, yes, the model distributes its attention over all the words in the input, so building a good string is necessary to direct its output.
It is also quite interesting how similar models parse the input: GPT-3, for instance, does not see words as we would; for it, the basic unit is a token, which may be a subset of a word, a whole word, or more than one word (cool experiment here: link). I do not know the internals of Dall-E Mini, but I bet it uses a similar tokenizer; exploiting this could give more control over the output.
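The token-versus-word distinction is easy to illustrate. Real GPT-3 and dalle-mini tokenizers use learned byte-pair-encoding vocabularies; this toy vocabulary is invented just to show the idea of greedy longest-match segmentation, where one word can come out as several tokens:

```python
# Toy subword tokenizer: greedy longest-match against a tiny
# hand-made vocabulary (real BPE vocabularies are learned from data).

def tokenize(text, vocab):
    """Segment text into the longest vocabulary pieces, left to right.
    Falls back to single characters for anything not in the vocab."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: emit as-is
            i += 1
    return tokens

VOCAB = {"be", "hold", "er", "behold"}
print(tokenize("beholder", VOCAB))
# ['behold', 'er']
```

So the model never sees "beholder" as one unit here; it sees "behold" + "er", which hints at why hyphenation and word order can shift the output in surprising ways.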
 

