We know that Meta's AI, at least, was trained on copyrighted data. We have actual proof of this, and lawsuits have been filed. Apparently (as of two days ago) Meta is claiming that the 7 million books it pirated had "no economic value" and that its use of them is protected under "fair use" because, it claims, its models don't reproduce the entire book.
Now, I got Gemini to reproduce pretty much the entirety of the grave cleric, which is not OGL. That means the idea that AI won't reproduce copyrighted material is bogus. Maybe some AIs won't, but others will.