To make such a model you'd also need a lot of training data, i.e. loads and loads of sequences of WIP pictures. Artists sometimes post WIPs, but they rarely share the entire edit history, so getting enough data to train a model would be challenging, to say the least.
You can probably make a LoRA...