Unbelievable Scale of AI’s Pirated-Books Problem

At the end of the day though, I think we're going to see "AI" die down a bit "soon". They're too unreliable for a lot of business use, the IP issues are going to cause a mountain of litigation, they cost a ton of money to operate but don't seem to have much of a model for revenue, and then there are the environmental issues. Unless there's a significant, and I mean really significant, change in how they work and how much they cost to operate, I don't see them riding this big for long. It just costs way too much, and right now everyone's speculating that there's a killer app in there that will make a mountain of money to match the mountain of litigation.
You're conflating certain models' practices with AI in general. AI is not going anywhere. It's in heavy use across a lot of businesses, especially in finance and medicine, and enables many current features in both sectors.
 



With respect, the choice to have material out there as a "loss leader" should be the creator's, and not the consumer's. It is, simply, not our right to choose. Literally - this is about rights to material.
What loss if I bought the album anyway?
 

And we have to be clear, there are two separate violations here. The first was that they literally downloaded the archive via a torrent. They stole the works, plain and simple. That was them getting the data in the first place. The second is that by using those books against their licenses (stated in the copyright statements at the beginning of every book), each use is a further injury to the owners of the works.

At the end of the day though, I think we're going to see "AI" die down a bit "soon". They're too unreliable for a lot of business use, the IP issues are going to cause a mountain of litigation, they cost a ton of money to operate but don't seem to have much of a model for revenue, and then there are the environmental issues. Unless there's a significant, and I mean really significant, change in how they work and how much they cost to operate, I don't see them riding this big for long. It just costs way too much, and right now everyone's speculating that there's a killer app in there that will make a mountain of money to match the mountain of litigation.

Where they're doing really amazing things is in scientific applications and that doesn't have any of the IP problems we're talking about.
I think the news out of Apple and its “Apple Intelligence” should throw a lot of cold water on businesses but this is also how bubbles start to form.
 

What loss if I bought the album anyway?
Whether you did that or not, you still violated the artist’s rights without their consent. It’s not your call to make whether depriving them of that right ultimately benefitted them or not. Only they have the right to make that determination; you can’t decide it for them. That’s what the ‘right’ part of copyright means.

———-
To everybody—

To be clear, that age-old defence of piracy is not one supported by this site. While you haven’t done so, I can see exactly where this conversation is headed after having seen it a thousand times here on these boards, so please be careful not to step over the line into encouraging piracy. Thanks.

To everybody in the thread: please drop the topic of whether or not piracy as a concept is OK. That topic has been well and truly litigated, not only here but everywhere people gather on the internet, and we don’t want to derail the thread.
 

Indeed. The piracy and the AI training are two separate things. Those in the “AI training is not piracy” camp are fundamentally misunderstanding the situation and confusing the two things.
Training is not piracy, but it is creating a derivative, which without an explicit license is copyright infringement.
 


You're conflating certain models' practices with AI in general. AI is not going anywhere. It's in heavy use across a lot of businesses, especially in finance and medicine, and enables many current features in both sectors.
There are uses for current types of "AI" in analyzing very large data sets, and those are useful in both business and scientific applications. LLMs, and the tools we have now for generating art and video, are significantly limited by their lack of understanding of context. That means the hallucinations we've all seen are not fixable without completely changing how the system works.

Ok, example... AI image generators famously can't draw hands. The reason is that they're trained on images but don't understand any of them. They just know that there's commonly this shape at the end of that shape. They don't know what a hand or fingers are, just that they're shapes generally of a range of colors. So when they see a still image of a person, that person's hands might be hidden, or not. They might be in such a position that anywhere from zero to five fingers are visible. The person might not actually have five fingers. So all the AI "knows" is that at the end of the hand shape are some number of shapes (fingers) that are often bent. It doesn't understand why they bend or that it's even a three-dimensional object that bends. Just that it's a curved shape. Sometimes. Without "understanding" what a hand is and what fingers are and that they're connected and can move in certain ways and that some things indicate an injury and what an injury is and so on... it can't really model a hand. Or know how to show one correctly. So it just tries to give its best approximation of a hand shape, and sometimes it's right.

Understanding context is the thing that'll get us to AGI (Artificial General Intelligence) and I honestly don't see that coming for years, likely a decade or more. I could be surprised. But this is a considerably tougher nut to crack than what they've done so far.
 

The ground's shrinking under this exploitation of AI. I saw it in articles starting early last year. Now, we see Fortune 100s (Apple, Meta) getting sued over AI-related processes.

[Reaction GIF: this-doesn't-look-good-jonas-taylor.gif]
 

Training is not piracy, but it is creating a derivative, which without an explicit license is copyright infringement.
I am aware. I feel like your response had an unspoken “But…” before it, as though you feel it contradicts what I said, but it does not. Or perhaps it had an unspoken “Yes, and…” and you were just agreeing with me? It’s so hard to tell with text only.
 
