The AI Red Scare is only harming artists and needs to stop.

Accessed, not Obtained. Accessed.
Obtaining implies ownership.

Seems to be splitting hairs, but I'm happy to say "accessed" in place of "obtained" in that sentence without changing it in any way. The key point is that a copy of the information passes to and from a whole series of computers before it lands on the computer of the accessor, and often stays there for some length of time. The accessor of the text doesn't own it any more than the owner of a book owns the text of the book, but he still has a copy of it.
 


My problem is I'm remarkably unconvinced that anyone making claims on that subject, for any purpose, is really as knowledgeable as they think they are; there are way too many incentives for people to believe what they want to believe on this topic, including people who study one half or the other of it professionally.
My friend...you're on a web forum.
 

My objection isn't that OpenAI read a bunch of data they scanned from the internet. If all they were doing was having their AI read the contents of their browser cache as it was being displayed in the browser window, I don't have any problem with that. I don't think it's possible for a copyright holder to post public-facing internet content without implicitly giving permission for people (or entities) to create copies of that content for the express purpose of viewing those copies in internet browsers. Allowing those cached copies to exist is literally the express purpose of posting public-facing content online, so it would be nonsensical to say you're posting public-facing content online but also withholding that permission.

Agreed.

My objection would be if, as I suspect is the case, OpenAI copied the content they scraped from the internet into a training database separate from any browser cache. Saving a new copy of online content into a database separate from your browser cache is not standard practice when viewing a website, so there's no implicit permission to do so.

I don't know what they did, but if this is the basis of your objection, are you saying that if they fed the input directly into the training program without making two copies of it on the computer (indeed, potentially just feeding it into a service of some sort and thus never making any permanent file of it), then in your mind the whole procedure would be legal?

Because if I had known that some judge was going to rule that way and I was running an AI firm, I totally would have had the programmers write things that way in order to follow the strict letter of the law. But I believe your technical approach to legal rights fundamentally breaks down as non-transparent law. The law should never be such that it turns on unforeseen technicalities.

And there is also precedent for why your technical letter-of-the-law approach is flawed, which I mentioned before, and that is internet browsers. OpenAI is far from the first company to scan the whole internet into a database and then make a derivative work of it. Google is the first company to do that. And they are still doing it. They have web crawlers that go out and read all the words, put them into a database, and use that data as the basis for making search engines. They then use that search engine as the basis for developing revenue from ads. So your strict letter-of-the-law approach based on technicalities makes not only training an AI illegal, but also building a search index for a web search engine.
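To make the parallel concrete, here's a minimal sketch (in Python, with made-up page data; the URLs and text are purely illustrative, not anything Google actually stores) of what a search indexer does at its core: copy the words of every crawled page into an inverted index, a database structure mapping each word to the pages that contain it.

```python
from collections import defaultdict

# Hypothetical crawl output: URL -> page text (stand-ins for real crawled pages).
pages = {
    "https://example.com/a": "the quick brown fox",
    "https://example.com/b": "the lazy brown dog",
}

# Build an inverted index: each word maps to the set of URLs containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# A query is answered by intersecting the posting sets of its words.
def search(query):
    results = [index.get(word, set()) for word in query.lower().split()]
    return set.intersection(*results) if results else set()

print(search("brown fox"))  # {'https://example.com/a'}
```

The point being: the words of every page end up copied into the database either way; the index is just a different derivative structure than a model's training set.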

And further, because the objection rests on a technicality, as I said, I could just do this without storing a file at all. And heck, for all I know, they never did store a file. Maybe they just put it all into some sort of database structure immediately upon crawling the web with a custom web crawler.
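By way of illustration, here's a minimal sketch (Python; `process_text` is a hypothetical stand-in for whatever the downstream pipeline does) of fetching a page and handing the text straight to processing in memory, with no permanent file ever written:

```python
import urllib.request

def process_text(text):
    # Hypothetical stand-in for whatever consumes the text downstream
    # (tokenization, indexing, training, etc.).
    print(f"processed {len(text)} characters")

# Fetch a page and pass the response body directly along in memory;
# nothing is ever saved to disk as a file.
with urllib.request.urlopen("https://example.com") as response:
    process_text(response.read().decode("utf-8", errors="replace"))
```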

The web browsers that you use to search the web are just one way of accessing and displaying the information on the internet. You can write - and I have written - automated web crawlers that involve no human viewing at all and which just download information from websites. In my case, I was downloading genetic transcriptions from NCBI for use in things like automated annotation and eventually protein folding, but you can do this to any website. For example, I have for a while now considered writing a simple crawler (with a suitable wait period between requests) to download all my past posts at EnWorld so that I'll have a copy in the event EnWorld blows a fuse.
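A crawler like the one described might look like this minimal sketch (Python; the URL list and output filenames are hypothetical placeholders), pausing between requests so as not to hammer the server:

```python
import time
import urllib.request

# Hypothetical list of pages to archive (e.g., one's own forum post pages).
urls = [
    "https://example.com/posts?page=1",
    "https://example.com/posts?page=2",
]

WAIT_SECONDS = 5  # polite delay between requests

for i, url in enumerate(urls, start=1):
    # Download the page with no browser involved.
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    # Save a local copy of each page.
    with open(f"post_page_{i}.html", "w", encoding="utf-8") as f:
        f.write(html)
    time.sleep(WAIT_SECONDS)
```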

I think you are getting lost down a technical rabbit hole that doesn't really matter.

(And if they're skipping the browser and just scraping copyrighted material directly into a training file, I'd say that's an even more blatant copyright violation.)

So is Google also guilty of a mass copyright violation and can be sued by every website it's ever crawled with its own web crawlers? Is every web search engine also guilty of copyright violation?
 

I'll mind it.
Taking a really good photo is a skill. And two photographers can each take a photo of the same subject, and get very different results. That difference is what makes it copyrightable.
If you stood at the north end of Tower Bridge in London, took a photo looking south, and then claimed copyright on it, that would suggest that for however long your copyright lasts I (or anyone else) couldn't go there, take the same photo, and do the same things with it that you did.

Which is bollocks.
 

For the millionth time, that's not what AI does.

If you want to claim that an AI being fed information and blindly and uncomprehendingly applying it according to its code is the same as a human taking inspiration then prove it.
We don't know enough about how the brain works to be reliably able to either prove or disprove this.

The brain is fed information, synthesizes and processes it, and causes the body to produce an output of some sort - art, music, writing, whatever.

The AI generator is fed information, synthesizes and processes it, and causes the computer to produce an output of some sort - art, music, writing, whatever.

The information comes, or can come, from the same source(s). No difference there.

The only difference is that some people know how the synth-and-process piece works in the AI generator, because they programmed it that way; but as yet nobody's anywhere near sure about how the human brain does it. What matters - and what makes it the same thing - is that in both cases it's being done at all.
 


For example, OpenAI successfully defended itself from writers like Sarah Silverman who argued that she should have been financially compensated for OpenAI processing her memoir as input to their model.
Not yet they haven't. The judge dismissed most of the claims brought by Silverman and other authors, but not all. The claim for violation of California’s unfair competition law was permitted to advance, and the claim for direct copyright infringement remains in the lawsuit as well. Curiously, OpenAI did not move to dismiss that last claim.

So much of this is unsettled law. It might go generative AI's way, it might not, it might land somewhere in the middle. Who can say? We'll have to see how the many current and threatened legal actions play out.

There's also the issue that currently the output of an LLM cannot itself be protected by copyright. For example, the US Copyright Office reviewed a registration for a graphic novel consisting of human-written text and AI-generated images. They ruled that as a whole the book "constituted a copyrightable work, but that the individual images themselves could not be protected by copyright."
 

AI isn't going anywhere; it's essentially the next Industrial Revolution.
They said the same thing about "THE BLOCKCHAIN!" How'd that work out?

We don't know enough about how the brain works to be reliably able to either prove or disprove this.
Yes we do. We know how the brain works and we know how AI works and we know they're nowhere near the same. What are your credentials to claim otherwise? What new discovery have you made that leads you to believe you're right and neurologists and mental health experts are wrong?
 

It still turns on the idea that an AI that learns to do art is held to a different standard than a human who, using the same material to learn to do art, isn't. That's self-evident to some people, and anything but to others.
Of course there are different standards at play here. Humans have personhood and rights under the law. AI does not have personhood and rights under the law.
 

"AI doesn't kill culture. People kill culture."

Yeah. Because AI isn't about culture. It is about the data you put into it. Cultural bits are only one kind of data.

Thinking about AI (generative or otherwise) from only the point of view of writing fictional text or making pretty pictures is wearing blinders so you don't see most of the possibilities.
 
