So yesterday I wanted to make a pic for a token for an upcoming one-shot.
I fed the same prompt into ChatGPT, Microsoft Designer, and a local copy of Stable Diffusion using the 'DynaVision XL' SDXL model checkpoint (essentially the base training model).
Got three extremely different results:
ChatGPT:
View attachment 403467
Microsoft Designer:
View attachment 403468
Local Stable Diffusion:
View attachment 403469
"A vibrant 1970s fantasy style RPG illustration of a freckled dark brown-skinned dwarf woman, age 29, in a fullbody shot. She wears a platemail breastplate, and her black wavy hair is tied into a ponytail, adding to her fierce and majestic appearance. She holds a giant fantasy warhammer in both hands, standing confidently in a bustling underground dwarven cavern fortress. The cobblestone streets are lined with lively market stalls, filled with various goods from weapons to food. The underground town is filled with dwarves in medieval merchant attire, engaging in various activities, creating a lively and dynamic atmosphere. The lighting is moody, casting a deep tone over the scene. The dwarves are seen bartering and socializing, with the overall atmosphere being one of camaraderie and industriousness. The camera captures her full body."
While I'm not sure it will give you the style you're after, SDXL-based models usually don't respond well to long-ish prompts.
This prompt does well with more recent models, but also fails to describe the style. Maybe more description of the actual style would help?
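On the prompt-length point, here's a quick way to sanity-check a prompt before feeding it to an SDXL checkpoint. It's only a rough sketch: it assumes the CLIP-style text encoders SDXL uses read a 77-token window (about 75 usable tokens once the start and end markers are counted), and it estimates tokens by counting words and punctuation marks rather than running the real tokenizer. The names `rough_token_count`, `over_budget` and `CLIP_PROMPT_BUDGET` are made up for this example.

```python
import re

# Assumption: the CLIP-style text encoders read a 77-token window, of which
# about 75 are left for the prompt (start and end markers take the other two).
CLIP_PROMPT_BUDGET = 75


def rough_token_count(prompt: str) -> int:
    """Crude estimate: each word and each punctuation mark counts as one token.

    The real tokenizer is BPE-based and may split rare words or numbers into
    several pieces, so treat this as a ballpark figure, not an exact count.
    """
    return len(re.findall(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]", prompt))


def over_budget(prompt: str, budget: int = CLIP_PROMPT_BUDGET) -> bool:
    """Return True when the estimated token count exceeds the budget."""
    return rough_token_count(prompt) > budget


if __name__ == "__main__":
    short_prompt = "1970s fantasy RPG illustration, dwarf woman, warhammer, full body"
    print(rough_token_count(short_prompt), over_budget(short_prompt))
```

The prompt quoted above is well over a hundred words, so by this estimate it goes far past a single window. As far as I know, what happens to the overflow depends on the front end (some cut it off, some split the prompt into chunks), so putting the subject and style keywords first is the safer bet.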