The AI Red Scare is only harming artists and needs to stop.



By current technical standards, it's still not generally accepted that current neural net systems, especially LLMs, count as minds.

We are impoverished linguistically here. I had believed that Gottlob Frege had fully described language until I started playing with ChatGPT 3 and realized that I was witnessing the intelligent production of language without either the sense or reference of the words. (It certainly lacks reference, and if it had the sense we'd have to admit it was a mind.) I honestly don't know what an LLM should be described as, as it's not sentient and it's not really capable of reasoning, but it is capable of more intelligent speech than, say, 95% of the human population. And it really is intelligent speech. It doesn't err any more often than most people I talk to, and really, my words aren't consistently chosen perfectly on the first pass either. Sure, existing models have very limited memory and context and don't persist an identity, but I've seen tremendous progress on that in the last 3 years, and even without that extra trick there is something there that, if it isn't a full human mind in the sense we normally think of it, is still some fraction of one that we as yet have no name for. Either it's not fully empty or else we are.
 

We are impoverished linguistically here. I had believed that Gottlob Frege had fully described language until I started playing with ChatGPT 3 and realized that I was witnessing the intelligent production of language without either the sense or reference of the words. (It certainly lacks reference, and if it had the sense we'd have to admit it was a mind.) I honestly don't know what an LLM should be described as, as it's not sentient and it's not really capable of reasoning, but it is capable of more intelligent speech than, say, 95% of the human population. And it really is intelligent speech. It doesn't err any more often than most people I talk to, and really, my words aren't consistently chosen perfectly on the first pass either. Sure, existing models have very limited memory and context and don't persist an identity, but I've seen tremendous progress on that in the last 3 years, and even without that extra trick there is something there that, if it isn't a full human mind in the sense we normally think of it, is still some fraction of one that we as yet have no name for. Either it's not fully empty or else we are.
Yeah, I didn't want to disrespect the artists (who are really running into trouble) by hijacking the thread, but now that it's been brought up... the philosophical implications are kind of interesting.

These things can write better than 95% of people, including me, at least for short stretches... does that mean that what we thought of as creativity isn't as unique as we thought? Or if it is, is it something only a few people have?

If it's just a glorified autocomplete, and yet pretends to be human and apes humans so well, maybe we're closer to a glorified autocomplete than we thought? How many original thoughts do we actually have? People can be pretty predictable in a lot of situations...to what extent are we a ChatGPT made of meat?
 

Homer did not consent to Tolkien's use of his ideas either. Neither did Tolkien consent to, or get paid by, all the RPG authors who used his ideas.
Had copyright existed at the time of Homer, it would have long expired by the time Tolkien rolled around. And the issue of Tolkien (and his estate) giving consent or being paid by RPG authors using his ideas is FAR too complicated to brush away as Tolkien "not giving consent or getting paid". Clearly, there are instances where licensing fees have been paid and where unapproved uses of his IP have been halted through legal action.
 

Yeah, I didn't want to disrespect the artists (who are really running into trouble) by hijacking the thread, but now that it's been brought up... the philosophical implications are kind of interesting.

I don't think it's hijacking at all. I think that the philosophical implications here are central to understanding this, and to why analogies about meat and sausage are so much self-serving self-deception. I think it's important to address what it actually is, rather than living inside an unreflected-upon mental construct that basically has nothing to do with reality.

If we can't understand the thing - spoilers: we can't understand it - we'd at least better be trying to do so.

These things can write better than 95% of people, including me, at least for short stretches... does that mean that what we thought of as creativity isn't as unique as we thought?

I mean, what is creativity? Back in the 1960s we thought intelligence was something that spontaneously arose out of complexity. People - smart people - thought that if they just made a machine that could play chess, it would spontaneously evolve into something intelligent. It was, in retrospect, the spontaneous generation theory of life applied to intelligence. It was like those 19th century scientists looking at cells for the first time and going, "Well, it has a membrane and there is a black dot in the middle. It can't be that complicated." Now we are at the point where I think we're getting close to being able to define intelligence and know what intelligence is, and the answer would knock those 1960s researchers out of their chairs and, if we are honest, should be knocking us out of our chairs right now.

Turns out "intelligence" is a word like "magic" that we use for something we don't understand. And it's probably not, actually certainly not "a thing", but likely a bunch of things only some of which we as "thinking apes" actually have.

Or if it is, is it something only a few people have?

It's a good question. I don't know. I had always considered creativity an error-handling routine. My theory was that it was based on faulty memory and was what we did to fill in the gaps in those memories. But I don't know, and I don't know how accessible that routine is to most people - even smart people. There is clearly a divergence here, as we are discovering that abilities I had just assumed were ubiquitous, like actually seeing things in one's mind's eye, are also algorithms running on processors of very diverse power across the human population.

If it's just a glorified autocomplete, and yet pretends to be human and apes humans so well, maybe we're closer to a glorified autocomplete than we thought?

I think that's the inevitable conclusion. I mean, the more you study parrots and baboons, the more obvious it is that we have processors and processes they don't have, but it could be that human language production is more like a parrot and autocomplete than we really want to admit.

As an autistic person missing certain apparently common processors of my own, this explains a massive amount about what I thought was a bizarrely missing channel in human speech production, one I could never figure out how the "normals" filled in. Namely, it always bothered me that sentences weren't prefixed by their purpose, and I couldn't understand how people fully understood language without that context, yet speakers acted as if they were understood and listeners received the speech as if it had been understood. I always thought people filled the missing channel with intonation and body language, which is part of the processing complex I'm bad at, being on the low end of that spectrum. But the more human speech I processed online, the more I realized that was wrong. Humans weren't even aware they were missing that sense channel, and that was the source of so much human miscommunication. Worse, much of human speech was obviously being produced without the human understanding why they spoke as they did or what purpose they were trying to accomplish.

And I think the answer is that we're all closer to parrots doing glorified autocomplete than we'd like to think. I think people string together words based on prior patterns they've heard, and they keep doing that until they achieve some desired response. So many conversations just boil down to repeated queries of "Are you on my side?" where the words don't even matter. Listen for them sometimes.

Now that's not entirely fair. There is clearly something we have that parrots and gorillas don't have. But I wonder what the functional range of that extra thing is. It's not always clear to me that people understand what they actually said, much less why they said it.
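To make the "glorified autocomplete" idea concrete, here's a toy sketch. It's entirely my own illustration: real LLMs learn neural representations rather than raw word counts, so take this as an analogy, not a description of how they work.

```python
import random
from collections import defaultdict

# Toy bigram "autocomplete": choose each next word based purely on
# which words have followed the current word in previously seen text.
corpus = (
    "i think we are closer to autocomplete than we want to admit . "
    "i think people string words together from prior patterns . "
    "people repeat patterns they have heard before ."
).split()

# For every word, record which words have followed it.
following = defaultdict(list)
for word, nxt in zip(corpus, corpus[1:]):
    following[word].append(nxt)

def babble(start, length=10):
    """Generate "speech" by repeatedly sampling a previously seen continuation."""
    words = [start]
    for _ in range(length):
        options = following.get(words[-1])
        if not options:
            break
        words.append(random.choice(options))
    return " ".join(words)

print(babble("people"))  # e.g. "people string words together from prior patterns ."
```

Stringing words together from prior patterns until the listener gives the desired response is, mechanically, not so far from that loop, just with vastly better statistics.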

How many original thoughts do we actually have?

Do you notice how many words only survive in speech as part of certain phrases? I mean, I do it too; even though my working vocabulary is in the 38,000-word range, I'm not sure now that I don't just have a bigger word salad.

I have so many interactions online these days where I could google every single sentence the person says and find it out there, and not just one example. And even where there is some word variation, it's no bigger than the range of variation a chatbot uses to say "yes" creatively.

People can be pretty predictable in a lot of situations...to what extent are we a ChatGPT made of meat?

Or to what extent is ChatGPT, by being trained on what we say, just us made out of wires? It's not us yet, I think we can agree on that. But how much of us is it really? Would 9 more processing complexes be enough? 99?
 

The people going "If purchasing isn't owning then piracy isn't theft" are moral cretins

Do you understand how the internet works?

Mod note:
So, the insults and condescending tone are a real problem.

Please re-evaluate how you engage with others on this topic. We need you to show significantly more respect for the intelligence and persons of those you are talking to, or recuse yourself.
 

Coverage of the future of AI is too credulous.

Ed adds:
There’s seemingly no limit to this phenomenon. When Altman [of OpenAI] says something genuinely, truly insane — like how he wants to raise $7tn, which is the combined GDP of Germany and France, to build up a dedicated AI semiconductor manufacturing capacity — the response isn’t to immediately dismiss it with a chorus of derisive laughter, but to actually discuss it with a level of seriousness that isn’t even remotely warranted.
 

In reality, the only copy that was made was the temporary copy that is made when anything on the internet is viewed, which is an essential aspect of the technology without which the whole internet would have to be taken down.
Did OpenAI train ChatGPT using only the temporary files cached by their internet browsers? Or did they save all the copyrighted files they scraped onto a server; compile those otherwise unmodified works into an indexed training set; and then use that stored data as a commercial asset for one full cycle of their AI training process before deleting it?

I suspect the second scenario is the more likely of the two, and that's the part of the process I can't get behind. I don't have any strong opinions about a neural net creating images in a particular artist's style, especially if the machine demonstrably doesn't store copies of that artist's work. One can argue the neural network isn't violating any copyrights if it doesn't have anything to copy. So, for the sake of argument, let's not even talk about that.

Let's talk about the private company who's using copyrighted works as building blocks in a training set stored on its private servers, for use as a software development tool. That data set isn't creative, it isn't transformative, and it isn't Fair Use. If someone is going to compile copyrighted works into a proprietary commercial asset they're using to add value to their for-profit software development process, they should have to get permission from the copyright holders before doing so. Anything short of that is a copyright violation or two (or a few billion, as the case may be).
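For what it's worth, the distinction I'm drawing is easy to state in code. Everything below (the URLs, the file layout, the function names) is hypothetical, just to illustrate the difference between the transient copy a browser makes and a persistent, indexed corpus:

```python
import json
import pathlib
import urllib.request

# Hypothetical URLs standing in for copyrighted pages scraped off the web.
urls = ["https://example.com/article-1", "https://example.com/article-2"]

def view_page(url):
    """Scenario one: fetch a page, read it, let the copy evaporate.

    The bytes live only in memory while the page is being used. This is
    the "temporary copy" that everyone's browser makes.
    """
    with urllib.request.urlopen(url) as resp:
        return resp.read()  # nothing is persisted after the caller is done

def build_training_set(urls, root="corpus"):
    """Scenario two: persist every page and index it as a dataset.

    The scraped works are stored, unmodified, as a durable asset that a
    training pipeline reads from later. This stored, indexed copy is the
    part of the process under dispute.
    """
    root = pathlib.Path(root)
    root.mkdir(exist_ok=True)
    index = []
    for i, url in enumerate(urls):
        doc = root / f"doc_{i}.html"
        doc.write_bytes(view_page(url))         # durable copy on disk
        index.append({"id": i, "source": url})  # searchable index entry
    (root / "index.json").write_text(json.dumps(index))
```

The first function is what the "browser cache" argument covers; the second is what I'm arguing needs permission.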
 

Did OpenAI train ChatGPT using only the temporary files cached by their internet browsers? Or did they save all the copyrighted files they scraped onto a server; compile those otherwise unmodified works into an indexed training set; and then use that stored data as a commercial asset for one full cycle of their AI training process before deleting it?

I suspect the second scenario is the more likely of the two

I'm not sure how they did it, but the key point is that, as far as I know, they obtained all the data legally. The point I was making was that merely obtaining an electronic copy doesn't violate copyright, or everyone's browser cache would make them a felon.

and that's the part of the process I can't get behind. I don't have any strong opinions about a neural net creating images in a particular artist's style, especially if the machine demonstrably doesn't store copies of that artist's work. One can argue the neural network isn't violating any copyrights if it doesn't have anything to copy. So, for the sake of argument, let's not even talk about that.

Fair enough. At the least, I admit you have an interesting perspective.

Let's talk about the private company who's using copyrighted works as building blocks in a training set stored on its private servers, for use as a software development tool. That data set isn't creative, it isn't transformative, and it isn't Fair Use.

No data set is creative, transformative, or fair use in and of itself. That would be like calling a book I bought in the bookstore creative, transformative, or fair use. Obviously it isn't, but at the same time that's nonsense, because those terms are applied to derivative works. It's not clear to me the data set is a derivative work. The question is whether they acquired the data set legally. As far as I know they did. They certainly wouldn't have the right to distribute that data set or sell it, but the data set itself - which I admit I haven't really given much thought to - doesn't strike me as critical to the discussion.

I think we both agree they would have had a right to read the data or look at the data. I always assumed that they just picked a bunch of things on the internet to scan which they could legally read, and then did so. It would be interesting if the terms of service of any of those sites at the time specifically blocked the use of the data for machine training, but I doubt it, unless there was some blanket prohibition against using the text as the basis of scientific research. No one was thinking about those things at the time. No one was regularly saying anything in their terms of service like, "Sign here if you agree when accessing this information not to use it to train an AI."

Even then we wouldn't really be talking about copyright violation. We'd be discussing a breach of contract or theft of services (say, if you obtained all of the NY Times archives without paying for access).

So then the question is whether creating a value-added product from the data they could legally read was a violation of copyright. And then, I think that gets back to the question of whether the product they made was creative, transformative, or fair use. But earlier you said: "One can argue the neural network isn't violating any copyrights if it doesn't have anything to copy. So, for the sake of argument, let's not even talk about that."

If someone is going to compile copyrighted works into a proprietary commercial asset they're using to add value to their for-profit software development process, they should have to get permission from the copyright holders before doing so. Anything short of that is a copyright violation or two (or a few billion, as the case may be).

Whether they should or not is a different question. What I'm asserting is that it was not a copyright violation, and that at the time the law was silent on whether they should get permission to use copyrighted works in that manner. Again, the law is pretty silent about reading copyrighted works; it focuses on the distribution of them, or on derivative works - that is, on whether plagiarism has occurred. And in this case they didn't distribute a copy of the works, and the derived works are not, in the general case, plagiarism (and where they are, the case in my opinion is against the individual work, not the language or image model itself).

Whether or not someone ought to think hard about this and write some laws to protect intellectual property holders, I'm not sure, because I'm not sure in the general case whether that actually benefits society or is enforceable. But I have been saying for about two decades that we ought to be more forward-thinking and start preparing the law for the inevitable AI revolution before it happens, and I do have some thoughts about that. And look, the inevitable AI revolution has started and we aren't legally ready for it. Worse is probably yet to come.
 

The world would be much better off if the word "to steal" were reserved only for cases where the original owner is deprived of the thing stolen.

If I take your car so now you have nothing to drive around in, I have stolen your car.

If I build an exact replica of your car, but you still have your car, I may have broken some laws, but it would be so useful if we'd agree that whatever I did, I did not steal your car.
So if we're going to reframe the argument as "yes, it's breaking the law... but which law?" then I'm okay with it. The point has always been: the law is being broken.
 
