How Generative AI's work

Whizbang Dustyboots · Mar 18, 2024

I think it helps people understand generative AI if you point out that it's essentially doing the same thing the type-ahead feature on smartphone messaging does, which no one thinks makes their phone sentient or uniquely insightful.

MarkB · Mar 18, 2024

Whizbang Dustyboots said:
I think it helps people understand generative AI if you point out that it's essentially doing the same thing the type-ahead feature on smartphone messaging does, which no one thinks makes their phone sentient or uniquely insightful.

Sure, to much the same extent that sailing a boat around the world is essentially the same thing as pushing a rubber ducky around the bathtub.

Clint_L · Mar 18, 2024

Whizbang Dustyboots said:
I think it helps people understand generative AI if you point out that it's essentially doing the same thing the type-ahead feature on smartphone messaging does, which no one thinks makes their phone sentient or uniquely insightful.

I suggest you read the article I linked.

MarkB · Mar 18, 2024

Clint_L said:
There's a developing body of research exploring the possibility that Generative AI, Chat in particular, exceeds its original design parameters in ways that suggest a level of understanding. i.e.:

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing.

www.quantamagazine.org

Interesting. One thing that occurs to me is that, if an LLM is essentially static once it's been trained and tested, and is thereafter just generating the next word based upon percentages, does that mean that it's fundamentally uncreative - or does it mean that it did all its creativity, on every conceivable subject, during the training process, as it established the interrelationships of words from which those percentages are derived, and that the prompts we provide are simply a means of accessing one tiny facet of that already-completed creative process at a time?

gorice · Mar 18, 2024

Clint_L said:
There's a developing body of research exploring the possibility that Generative AI, Chat in particular, exceeds its original design parameters in ways that suggest a level of understanding. i.e.:

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing.

www.quantamagazine.org

I'm no expert on this stuff, but I found this article confusing. The argument seems to be something like 'the LLM is stochastically parotting a selection of stochastic parrot skills, and is therefore unlikely to be a stochastic parrot.'

There is also a lot of vague qualification in use. Saying that the LLM is 'displaying skills that add up to what some would argue is understanding' is particularly rich. Who is saying that, and according to which theory of mind?

The vagueness and speculation are actually the most troubling part of the article, for me. These things are just black boxes, and people are projecting their hopes and fears onto processes that even the experts can't reconstruct.

Clint_L · Mar 19, 2024

gorice said:
I'm no expert on this stuff, but I found this article confusing. The argument seems to be something like 'the LLM is stochastically parotting a selection of stochastic parrot skills, and is therefore unlikely to be a stochastic parrot.'

I'm not sure how you got that out of it.

gorice said:
There is also a lot of vague qualification in use. Saying that the LLM is 'displaying skills that add up to what some would argue is understanding' is particularly rich. Who is saying that, and according to which theory of mind?

The vagueness and speculation are actually the most troubling part of the article, for me. These things are just black boxes, and people are projecting their hopes and fears onto processes that even the experts can't reconstruct.

Well, it's science - scientists are always going to frame results in terms of probability and leave room for alternative hypotheses. However, the assessment by an independent scientist is pretty unequivocal: “[Generative AIs] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight."

And later: "...Hinton thinks the work lays to rest the question of whether LLMs are stochastic parrots. “It is the most rigorous method I have seen for showing that GPT-4 is much more than a mere stochastic parrot,” he said. “They demonstrate convincingly that GPT-4 can generate text that combines skills and topics in ways that almost certainly did not occur in the training data.”

Note that this is far from the only experiment that has been conducted. There is a raft of research currently being done on generative AI, as you might expect, and collectively they indicate a number of unexpected ways that models like Chat seem to go beyond stochastic parroting. And, as I note above, we don't understand the human mind very well, so it is a challenging point of comparison. For example, to what extent is human thought "merely" stochastic parroting?

ichabod · Mar 19, 2024

Clint_L said:
Well, it's science - scientists are always going to frame results in terms of probability and leave room for alternative hypotheses. However, the assessment by an independent scientist is pretty unequivocal: “[Generative AIs] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight."

The problem here is that they don't have access to the training data. This is mentioned in the article. So how can they say that it's not in the training data if they don't have the training data?

Clint_L said:
And later: "...Hinton thinks the work lays to rest the question of whether LLMs are stochastic parrots. “It is the most rigorous method I have seen for showing that GPT-4 is much more than a mere stochastic parrot,” he said. “They demonstrate convincingly that GPT-4 can generate text that combines skills and topics in ways that almost certainly did not occur in the training data.”

I did not find the description of their method very rigorous. They have created a layer of abstraction removed from the actual processing to make their case. They use that abstraction layer to assume that LLM already has understanding. Then they show that this understanding is growing faster than the error rate is supposed to drop according to statistical models. Again, note that they don't have the actual change in the error rate, they are using a statistical proxy. It's not clear to me how dependent these models are on the particular LLMs, and how much bias could be introduced by using the wrong model.

I think methods are rather weak, and full of a lot of assumptions. I am not convinced at all.

GrahamWills · Mar 19, 2024

Clint_L said:
There's a developing body of research exploring the possibility that Generative AI, Chat in particular, exceeds its original design parameters in ways that suggest a level of understanding. i.e.:

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing.

www.quantamagazine.org

Nice article -- thanks for the reference. As often is the case, a lot depends on what we mean by "understanding".
The core technology absolutely is simply predicting the next word from a. sequence of words. However, the way in which this is done is with a vast number of parameters. The question of understanding, then, is whether those parameters indicate understanding.

Now the article is generally good and well with reading, but it has a bit of a straw man in it:

The team is confident that it proves their point: The model can generate text that it couldn’t possibly have seen in the training data, displaying skills that add up to what some would argue is understanding.

It is very clear that the model does not simply generate text it say in its training data -- it generates combinations of fractions of text that it saw in its training data. If I ask it to write a poem about hobbits, St Bridgit and calculus, it will absolutely "generate text that it couldn’t possibly have seen in the training data".

The article builds a separate model for text that ties "skill nodes" to "word nodes" with the idea being that you van then correlate word usage to skill usage, and skill usage is what defines understanding. So LLMs can be said to have understanding if their word output shows that they are using skills in sensible ways. Apologies to the authors for this huge simplification of their argument.

I have some issues with this:

The researchers are really proving not that LLMs understand anything, but that they behave the same way as something that understands. Their quantification is helpful for science, but honestly, if you read some AI generated text, it's pretty clear that they behave the same way we do -- and we (hopefully) are understanding engines, so this isn't really anything new.
Their statement that understanding is equivalent to skill usage is one definition of understanding, but I'm not sure I'm 100% onboard with that as sufficient.
They state: “What [the team] proves theoretically, and also confirms empirically, is that there is compositional generalization, meaning [LLMs] are able to put building blocks together that have never been put together. This, to me, is the essence of creativity.” -- is it, though? Is it really creative to randomly put stuff together that have never been put together before? I feel there needs to be a bit more than that.

Overall, a great paper, and the use of bipartite knowledge graphs is a very clever idea that will hopefully allow us to quantify how the skill level of an LLM. I loom forward to seeing this use in the future. However, I still feel that the LLM is a stochastic parrot, but the stochastic process is so complex that the results simulate understanding without having actual understanding.

I also realize that there is a strong and valid philosophical position that if the results look like understanding, then it is understanding (the "if it looks like a duck" argument). Totally valid, and if that's your feeling, I cannot refute it. For me, though, it's not.

FrogReaver · Mar 19, 2024

GrahamWills said:
However, I still feel that the LLM is a stochastic parrot, but the stochastic process is so complex that the results simulate understanding without having actual understanding.

I also realize that there is a strong and valid philosophical position that if the results look like understanding, then it is understanding (the "if it looks like a duck" argument). Totally valid, and if that's your feeling, I cannot refute it. For me, though, it's not.

Then consider the reverse. Using your definition of ‘understand’, can you prove you or any human is not also a stochastic parrot?

GrahamWills · Mar 20, 2024

FrogReaver said:
Then consider the reverse. Using your definition of ‘understand’, can you prove you or any human is not also a stochastic parrot?

I am a professional mathematician by training, so I am very dubious of any attempt to prove anything that does not have a solid definitional structure. My only proof is evident to myself alone; my self-knowledge that I have consciousness. I cannot apply that to others.

How Generative AI's work

Whizbang Dustyboots

Gnometown Hero

MarkB

Legend

Clint_L

Hero

MarkB

Legend

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

gorice

Hero

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

Clint_L

Hero

ichabod

Legned

GrahamWills

Hero

New Theory Suggests Chatbots Can Understand Text | Quanta Magazine

FrogReaver

As long as i get to be the frog

GrahamWills

Hero

Similar Threads