And if that's not what people are saying and they're not interchangeable with the training images how could there possibly be any IP infringement?
Well, there are several arguments here:
One, made by Sarah Silverman, and others, is that if the AI can accurately summarize or largely duplicate an original work, that is infringement. Basically, if you can give the generative AI a request to duplicate an existing work, the original persists in the system in sufficient form to be considered infringement. This argument isn't doing so hot in the courts at the moment, but some analysis I have read suggests that is because the topic is new, and that as more issues arise, this argument may gain more legs.
Another is a simple, but powerful, technical point that already has legal precedent - in order to use a work as training data, the original work must be copied as digital data in the AI training system! Done without permission, that, itself, is a violation of copyright, which prohibits duplication of covered works. While there is a long tradition of allowing individual consumers to create backups or copies for personal use, doing so for other purposes is another matter.
Remember Napster? This is what ultimately took down Napster. You can't make a digital copy of someone else's work and do whatever you want with it, just because you feel like it. Period.
Early generative AI work leaned heavily on its being research, sliding in under the educational arm of Fair Use. But that argument ceases to hold as you open distribution of the result to the general public and commercial concerns.
You will note that the training sets used often avoid bodies of work controlled by major business concerns. Nobody has gone and scanned a bunch of Disney animated films and used them as training data, because Disney has the wherewithal to defend itself. Instead, training data is often lifted from the internet, and from smaller artists who cannot mount a significant legal defense individually.
This strongly suggests that the folks getting this data know it is legally dicey.