When it's possible for literally anyone to obtain biological eidetic memory, for a fee, the comparison might be in the ballpark. If someone with perfect memory sells you the text of a book, that's still copyright infringement. If they use lots of recognizable chunks of that book in an article or essay, that's still plagiarism.
Sure, like I said, it's how the text is used that matters the most. Which is why I gave the example of memorizing a play. Perfectly fine to just have it in your head, but once you start giving performances or narrating books, then that's another story.
Plagiarism and copyright violations are a little more difficult to nail down. I was just reading a post from deeplearning.ai, where Andrew Ng (the cofounder of Coursera and the aforementioned site) often posts ML-related information. Andrew talked about the possibility of certain models "understanding" the problem they are solving. He starts off by saying "understanding" is as thorny an issue as consciousness, and is more of a philosophical question, but nevertheless, he talks about some research done on Othello-GPT: an LLM trained on Othello moves...and only the moves of Othello. In fact, the LLM wasn't even told it was learning anything about Othello.
To me, the work on Othello-GPT is a compelling demonstration that LLMs build world models; that is, they figure out what the world really is like rather than blindly parrot words. Kenneth Li and colleagues trained a variant of the GPT language model on sequences of moves from Othello, a board game in which two players take turns placing game pieces on an 8x8 grid. For example, one sequence of moves might be d3 c5 f6 f5 e6 e3…, where each pair of characters (such as d3) corresponds to placing a game piece at a board location.
During training, the network saw only sequences of moves. It wasn’t explicitly told that these were moves on a square, 8x8 board or the rules of the game. After training on a large dataset of such moves, it did a decent job of predicting what the next move might be.
The key question is: Did the network make these predictions by building a world model? That is, did it discover that there was an 8x8 board and a specific set of rules for placing pieces on it, that underpinned these moves? The authors demonstrate convincingly that the answer is yes. Specifically, given a sequence of moves, the network’s hidden-unit activations appeared to capture a representation of the current board position as well as available legal moves. This shows that, rather than being a “stochastic parrot” that tried only to mimic the statistics of its training data, the network did indeed build a world model.
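The probing technique described above can be sketched in miniature. This is a toy illustration only: the real Othello-GPT work probes a trained transformer's hidden activations, while here the "activations" are fabricated as a random linear encoding of a hypothetical 8x8 board state, just to show what fitting a linear probe and checking its decoding accuracy looks like. All names and sizes are assumptions.

```python
# Toy sketch of the linear-probe idea behind the Othello-GPT result.
# Assumption: real hidden activations come from a trained transformer;
# here we fabricate activations that linearly encode a fake board state.
import numpy as np

rng = np.random.default_rng(0)

n_games, hidden_dim, n_cells = 500, 128, 64  # hypothetical sizes

# Fake "board states": each of the 64 cells is empty (0) or occupied (1).
boards = rng.integers(0, 2, size=(n_games, n_cells)).astype(float)

# Fake "hidden activations": a fixed random linear map of the board plus
# noise, standing in for what a trained network's hidden layer might hold.
W_true = rng.normal(size=(n_cells, hidden_dim))
acts = boards @ W_true + 0.1 * rng.normal(size=(n_games, hidden_dim))

# Linear probe: least-squares regression from activations back to cells.
probe, *_ = np.linalg.lstsq(acts, boards, rcond=None)
decoded = (acts @ probe) > 0.5

accuracy = (decoded == boards.astype(bool)).mean()
print(f"probe accuracy: {accuracy:.3f}")
```

If a simple linear probe can read the board state out of the activations with high accuracy, that is evidence the representation is actually there, which is the substance of the authors' claim.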
I hope more investigation and research goes into this, because if it holds up, I think we need to look at LLMs through a different lens.
I actually have an argument on the pro-creator side. The problem with an AI having memorized a corpus of text, videos, or other "information" is how easily that material can be disseminated by it. Of course, one can make the argument that the AI still needs to be connected to the internet, and that humans have this same capability. The obstacle for a human is the time it would take to transfer the knowledge from brain to computer (i.e., type out the text). But once that's done, dissemination would be as trivial for a human as for an AI.
Of course, we already have legal protections in place to stop humans from doing that, e.g., pirating material. So I think it's a weak pro-creator argument, but still one that should be addressed.