Unbelievable Scale of AI’s Pirated-Books Problem

Speaking of Napster, it's interesting how some have changed their tune (pun intended) about piracy. Either because they got older and wiser or it's finally hitting close to home for them.
I worked for Shawn Fanning (the guy who created Napster) at EA. To be honest, I don't really have anything positive I can say about the man while I've got a whole lot of negative. (I'm trying really hard here to stick by the rules on personal attacks)
 


In one corner, we have the university student downloading a song, making no money.

Yes, making money off it is worse, but the student download still represents lost income for the creator.

The basic idea is simple - if you want content someone worked to create, you should pay them for that work. This is true if you are a megacorp, or a university student. Nobody is entitled to content, other than Fair Use.
 

So if it wasn't a megacorp doing it, we'd be okay with it? At what dollar point is piracy not okay? Also, at which point are copyright laws/extensions okay? We used to decry Disney and others whenever they wanted copyright laws extended.

Is it ever OK? Not really my place to put a line in the sand.

Is it remotely the same thing (Student + Napster vs Megacorp looking to make Billions) or is it a massive, grotesque false equivalency?

I'll leave that for you to decide.
 

This was the plot of a Judge Dredd story from the 1990s. An artist arrives in Mega-City One with the intention of getting work as an illustrator. The artist goes through a series of annoying misadventures, including being fined by a particular Mega-City One judge for breaking some obscure and stupid law, before he finally makes it to the publisher and shows off his portfolio. The publisher is impressed, but refuses to hire the artist. There's no need, you see: the publisher showed the artist's work to his robots, who are now producing art in the same style. While it's illegal to reproduce someone's work without permission, you can't copyright a style, so the publisher's actions don't violate the law. The story ends with the artist going off the deep end and getting sentenced to an iso-cube.

Copyright law is pretty complicated and I'm not sure where I stand on this. Is having an AI "learn" the same thing as an artist copying the style of someone else? I'm out of my depth here. Practically speaking, I don't think we're going to put the genie back in the bottle here.
 

Here's one that's going to be funny in the next few years.

Man kneels down, and bottles some water for his hike.
Gov/Corp entity extracts billions of litres of water and ships it out.

Totally the same.
 


Is it remotely the same thing (Student + Napster vs Megacorp looking to make Billions) or is it a massive, grotesque false equivalency?

I'll leave that for you to decide.

Well, part of the issue is that saying "Student + Napster" is misleading.

It was Thousands and Thousands of Students + Napster.
 



Yes, making money off it is worse, but the student download still represents lost income for the creator.

The basic idea is simple - if you want content someone worked to create, you should pay them for that work. This is true if you are a megacorp, or a university student. Nobody is entitled to content, other than Fair Use.
Very true, and what's more, creators can put conditions on how they want their creations to be used. They can say it's free for personal use but requires a license for corporate use.

There's a handful of "ethical" LLMs out there that specifically only train with open source or public domain material. But every other one out there, they've stolen from millions of creators to train their AIs and every prompt that pulls upon their work to populate the result is another theft. That is at the root of the industry and they have to resolve this issue. OpenAI was set up as a non-profit specifically to try to get around the "commercial use" portion of Fair Use.

But really this is tech bro "disruption" at play... move fast, break things. In this case, they stole millions of works and then used them to make money. That's the bottom line. I don't think there's a way around that. And both the stealing of the works and the using of them are separate legal violations because the first is about how they acquired them and the second is that they don't have a license for using the works.

I get that LLMs and what we're calling "AI" these days are useful for some things. But we have to deal with the fact that they were built entirely upon stolen work.
 
