Unbelievable Scale of AI’s Pirated-Books Problem




Stop right there.
People can be wrong, or have a difference of opinion, without being anywhere near "bad faith".
If you aren't open to that difference, this discussion should not continue.

If one is going to compare an individual, not making money, to the most powerful, wealthiest corporations in the world who are trying to make a profit?

That is far more than a difference of opinion, and if that's not bad faith then I still want no part of it.

 

2) whether what the “AI” programs produce would themselves be covered by copyright, and if so, who owns that copyright.

My understanding is that the current ruling is that the results of generative AI are not and cannot be protected by copyright. Of course, proving that a work is AI-generated is not, at the moment, possible, making enforcement difficult.
 

Discussion about “AI” always brings out the weirdest arguments.

Illegally downloading a book is piracy. Full stop. There’s no debate to be had. Existing laws are quite clear on this.

We don’t need to wait for new laws on the books, we just need existing laws enforced…against mega-corporations instead of regular people for a change. Which we all know will never happen.

Fair use does not now, nor has it ever covered commercial use. However the mega-corps got the training data (it was illegally, see piracy above), they’re quite clearly pushing to make money off their “AI” programs and LLMs.

The only real questions are: 1) whether what the “AI” programs produce would be considered derivative works in violation of copyright rather than ruled similar enough to what human artists do, and; 2) whether what the “AI” programs produce would themselves be covered by copyright, and if so, who owns that copyright.
I think the core problem is that whatever "AI" programs create, it's tainted by being based on material they don't have rights to. You have to go back to the fact that the operator of the application doesn't own or have rights to use the data the application is using. Every single generation from that "AI" is using stolen data. Every time.
 


And we have to be clear, there are two separate violations here. The first was that they literally downloaded the archive via a torrent. They stole the works, plain and simple. That was them getting the data in the first place. The second is that by using those books against their licenses (stated in the copyright statements at the beginning of every book), each use of them is a further injury to the owners of the works.

At the end of the day though, I think we're going to see "AI" die down a bit "soon". They're too unreliable for a lot of business use, the IP issues are going to cause a mountain of litigation, they cost a ton of money to operate but don't seem to have much of a model for revenue, and then there's the environmental issues. Unless there's a significant, and I mean really significant, change in how they work and how much they cost to operate, I don't see them riding this big for long. It just costs way too much and right now everyone's speculating that there's a killer app in there to make a mountain of money on to match the mountain of litigation.

Where they're doing really amazing things is in scientific applications and that doesn't have any of the IP problems we're talking about.
 

I mean am I supposed to cheer for the folks breaking copyright law or do I cheer for enforcement? It's a very confusing time we live in. For what it's worth I honestly see it as whack-a-mole and in the end futile.
Why is your choice a binary one between those two groups? Is there nobody else in the world you could cheer for? How about the actual victims? Like the people whose work is being pirated? People who make the game books you use? Struggling artists whose hard work is being pirated?

And what has ‘law enforcement’ to do with it? Copyright violation will be settled via lawsuits in civil courts, as is happening right now with a number of high profile cases. It’s not something you call the cops for.
 

And we have to be clear, there are two separate violations here. The first was that they literally downloaded the archive via a torrent. They stole the works, plain and simple. That was them getting the data in the first place. The second is that by using those books against their licenses (stated in the copyright statements at the beginning of every book), each use of them is a further injury to the owners of the works.

At the end of the day though, I think we're going to see "AI" die down a bit "soon". They're too unreliable for a lot of business use, the IP issues are going to cause a mountain of litigation, they cost a ton of money to operate but don't seem to have much of a model for revenue, and then there's the environmental issues. Unless there's a significant, and I mean really significant, change in how they work and how much they cost to operate, I don't see them riding this big for long. It just costs way too much and right now everyone's speculating that there's a killer app in there to make a mountain of money on to match the mountain of litigation.
Indeed. The piracy and the AI training are two separate things. Those in the “AI training is not piracy” camp are fundamentally misunderstanding the situation and confusing the two things.
 

At the end of the day though, I think we're going to see "AI" die down a bit "soon". They're too unreliable for a lot of business use, the IP issues are going to cause a mountain of litigation, they cost a ton of money to operate but don't seem to have much of a model for revenue, and then there's the environmental issues.

Add to that the likelihood that they won't be getting much better any time soon. Just statistically, making them substantially better requires doubling or quadrupling the size of the data set. And they've already taken/used so much that there's not a doubling or quadrupling to be had.
 
