When it's possible for literally anyone to obtain biological eidetic memory, for a fee, the comparison might be in the ballpark. If someone with perfect memory sells you the text of a book, that's still copyright infringement. If they use lots of recognizable chunks of that book in an article or essay, that's still plagiarism.
Sure, like I said, it's how the text is used that matters the most. Which is why I gave the example of memorizing a play. Perfectly fine to just have it in your head, but once you start giving performances or narrating books, then that's another story.
Plagiarism and copyright violations are a little more difficult to nail down. I was just reading a post from deeplearning.ai, where Andrew Ng (the cofounder of Coursera and the aforementioned site) often posts ML-related information. Andrew talked about the possibility of certain models "understanding" the problem they are solving. He starts off by saying "understanding" is as thorny an issue as consciousness, and is more of a philosophical question, but nevertheless, he talks about some research done on Othello-GPT: an LLM trained on Othello moves...and only the moves of Othello. In fact, the LLM wasn't even told it was learning anything about Othello.
To me, the work on Othello-GPT is a compelling demonstration that LLMs build world models; that is, they figure out what the world really is like rather than blindly parrot words. Kenneth Li and colleagues trained a variant of the GPT language model on sequences of moves from Othello, a board game in which two players take turns placing game pieces on an 8x8 grid. For example, one sequence of moves might be d3 c5 f6 f5 e6 e3…, where each pair of characters (such as d3) corresponds to placing a game piece at a board location.
During training, the network saw only sequences of moves. It wasn’t explicitly told that these were moves on a square, 8x8 board or the rules of the game. After training on a large dataset of such moves, it did a decent job of predicting what the next move might be.
The key question is: Did the network make these predictions by building a world model? That is, did it discover that there was an 8x8 board and a specific set of rules for placing pieces on it, that underpinned these moves? The authors demonstrate convincingly that the answer is yes. Specifically, given a sequence of moves, the network’s hidden-unit activations appeared to capture a representation of the current board position as well as available legal moves. This shows that, rather than being a “stochastic parrot” that tried only to mimic the statistics of its training data, the network did indeed build a world model.
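The probing technique described above can be sketched in miniature. This is a toy illustration only: the real Othello-GPT work probes a trained transformer's hidden activations, while here the "activations" are fabricated as a random linear encoding of a hypothetical 8x8 board state, just to show what fitting a linear probe and checking its decoding accuracy looks like. All names and sizes are assumptions.

```python
# Toy sketch of the linear-probe idea behind the Othello-GPT result.
# Assumption: real hidden activations come from a trained transformer;
# here we fabricate activations that linearly encode a fake board state.
import numpy as np

rng = np.random.default_rng(0)

n_games, hidden_dim, n_cells = 500, 128, 64  # hypothetical sizes

# Fake "board states": each of the 64 cells is empty (0) or occupied (1).
boards = rng.integers(0, 2, size=(n_games, n_cells)).astype(float)

# Fake "hidden activations": a fixed random linear map of the board plus
# noise, standing in for what a trained network's hidden layer might hold.
W_true = rng.normal(size=(n_cells, hidden_dim))
acts = boards @ W_true + 0.1 * rng.normal(size=(n_games, hidden_dim))

# Linear probe: least-squares regression from activations back to cells.
probe, *_ = np.linalg.lstsq(acts, boards, rcond=None)
decoded = (acts @ probe) > 0.5

accuracy = (decoded == boards.astype(bool)).mean()
print(f"probe accuracy: {accuracy:.3f}")
```

If a simple linear probe can read the board state out of the activations with high accuracy, that is evidence the representation is actually there, which is the substance of the authors' claim.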
I hope more investigation and research goes into this, because if it holds up, I think we need to look at LLMs through a different lens.
I actually have an argument on the pro-creator side. The problem with an AI having memorized a corpus of text, videos, or other "information" is how easily that material can be disseminated by it. Of course, one can make the argument that the AI still needs to be connected to the internet, and that humans have this same capability. The obstacle for a human is the time it would take to transfer the knowledge from brain to computer (i.e., type out the text). But once that's done, dissemination would be as trivial for a human as for an AI.
Of course, we already have legal protections in place to stop humans from doing that, e.g., pirating material. So I think it's a weak pro-creator argument, but still one that should be addressed.