VHawkwinter
Adventurer
The process you're describing is already doable if you're running a model on your own video card with sufficient VRAM. From your comments about this workflow, I assumed that was what you were doing.
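For anyone curious what running it on your own card actually looks like, here's a minimal text-to-image sketch using the Hugging Face diffusers library. The model id is the classic SD 1.5 checkpoint (it may have moved on the Hub since), and the prompt and settings are just placeholder examples:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion checkpoint onto the local GPU.
# Half precision keeps it within roughly 4-6 GB of VRAM.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# One generation pass; tweak steps and guidance to taste.
image = pipe(
    prompt="a dwarf paladin with a warhammer, oil painting",  # example prompt
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("paladin.png")
```

Nothing leaves your machine; the weights download once and everything after that runs offline.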
......
I assume companies with their own customised models trained on art they own (say, Hasbro having fine-tuned some model or other on MtG and D&D art {I don't know if there are any contractual terms that would prevent them from doing so; for the sake of this argument I am assuming not}) are doing something like that: having their employees connect to an internal server with a GPU and add jobs to a queue, rather than connecting to an external cloud service.
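Purely illustrative, but an internal setup like that doesn't have to be fancy. It could be as simple as a single worker loop on the GPU box polling a shared folder for job files; the paths and job format here are invented for the sketch:

```python
# Hypothetical single-GPU worker: employees drop JSON job files into a
# shared network folder, and this loop renders them one at a time.
import json
import time
from pathlib import Path

QUEUE_DIR = Path("/srv/imagegen/queue")  # assumed shared folder
DONE_DIR = Path("/srv/imagegen/done")

def render(job: dict) -> None:
    # Placeholder for the actual pipeline call (see the snippet above).
    print(f"rendering {job['id']}: {job['prompt']!r}")

while True:
    for job_file in sorted(QUEUE_DIR.glob("*.json")):
        job = json.loads(job_file.read_text())
        render(job)
        job_file.rename(DONE_DIR / job_file.name)  # mark it finished
    time.sleep(2)  # poll every couple of seconds
```

A real shop would presumably use a proper queue and authentication, but the point is the jobs never leave the building.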
Ultimately, by the time the workflow approaches the level of control you have as an artist doing everything manually, it will include a digital artist doing many things manually anyway: paintovers, moving parts around, fixing errors, etc. Just with less personality than if they had done it all by hand.
I am inclined to agree with Maxperson that when you hand an artist local diffusion image-generation tools, mandate that they incorporate them into their workflow, and have them iteratively edit the image until it matches the vision they have for a piece, it's not the same as some guy just prompting. By that point, though, I would argue it's getting closer and closer to the older shortcut workflow of painting over a collage of photos or rough 3D renders. I assume that's how Hasbro is making its artists use it.
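The fixing-errors half of that workflow is mostly inpainting: mask the broken region and regenerate only that part while keeping the rest of the image. A rough diffusers sketch, with the file names and prompt invented for the example:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

base = Image.open("draft.png").convert("RGB").resize((512, 512))
# White pixels in the mask mark the region to regenerate
# (e.g. a mangled hand); black pixels are left untouched.
mask = Image.open("hand_mask.png").convert("RGB").resize((512, 512))

fixed = pipe(
    prompt="armoured gauntlet gripping a sword hilt",
    image=base,
    mask_image=mask,
).images[0]
fixed.save("draft_fixed.png")
```

You paint the mask yourself in an editor, which is exactly why the artist's judgement is still doing the steering.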
A few years ago I saw people posting work in progress from Stable Diffusion where they provided not just a prompt, but a prompt plus a sketch with the composition and character shapes outlined, had the model render the sketch into a full image, and then did paintovers and edits in an editor (Krita, maybe? It doesn't matter) and kept going. Again, though, they weren't using Bing or whatever; they were doing it on their own video card with one of the offline apps.
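That sketch-plus-prompt step is what the diffusers library calls img2img: the model starts denoising from your sketch instead of from pure noise, so your composition survives into the output. A minimal sketch, with hypothetical file names:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Rough sketch with the composition and character shapes blocked in.
init = Image.open("composition_sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="knight duelling a wyvern on a cliff, dramatic lighting",
    image=init,
    strength=0.6,        # lower = stick closer to the sketch
    guidance_scale=7.5,
).images[0]
result.save("pass_01.png")
```

Feed the output back in as the next init image after a paintover and you get exactly the iterative loop those posts were showing.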
I am skeptical you could get there without any art skills; that is, I doubt someone like Maxperson, who apparently has none, could get a high-quality outcome with image-to-image prompting alone. But it would certainly lower the required skill level, and it would give Maxperson better results than they could achieve on their own.
This thread has me somewhat interested in demonstrating just how you could use this stuff to build something to spec without doing a fully manual paint job, and in documenting what it looks like to execute a specific vision iteratively rather than pulling the lever on a random-noise procedural generator like a slot machine and hoping for the best; or in showing what some company like Hasbro might be making its artists do. As mentioned, I did try it out when it was new, so I have some familiarity with the process. But I'm not invested enough to go to the trouble of reinstalling the diffusion model tools and whatnot, and I also wouldn't be surprised if the 2026 versions of those tools demand better hardware than my nearly ten-year-old GTX 1070.
Like a challenge where I'm limited to free editors, the AI tools, and my mouse: no Wacom stylus, no nice brushes, etc. Maybe record the whole thing to video via OBS, then do a commentary track covering my thoughts, frustrations, and the limitations I notice while going through the experiment. Maybe some day. But I certainly wouldn't be using cloud services for it.







