I like to refer to LLMs in particular as "advanced autocomplete". The transformer model was genuinely innovative! But it's still autocomplete, which we've had for decades now.
One of my favorite facts I have learned during this latest hype cycle is that the coinage of the term AI was a branding move by John McCarthy, made primarily so that the Dartmouth workshop he proposed in 1955 (held in 1956) would not fall under the umbrella of Cybernetics, so the organizers wouldn't have to deal with Norbert Wiener.
I can understand that. I appreciate some of Wiener's work, but what I've read about him suggests he was eccentric to the point of exasperating.
Great points.
Only thing I'd add is that this incremental advance on existing technologies is built on theft of IP on a massive scale. The hucksters in charge of these machine learning companies themselves describe the lawsuits from the NY Times and big-name authors as an 'existential threat', and if the courts in Europe or the US find in favor of the copyright holders the whole thing goes up in a puff of smoke.
Machine learning involves the application of "neural networks" to massive data sets. And despite the name, neural networks have nothing to do with neurones. They are just a complicated and non-linear version of discriminant analysis, invented nearly 100 years ago. Fisher used it to assign plants to different species based on some measures such as sepal width. The "training set" contains plants where both the measurements and the species are known.
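For readers who haven't met discriminant analysis, here is a minimal sketch of the two-class linear version in plain Python. The measurements are invented for illustration (they are not Fisher's iris data), and the names `fit_lda` and `predict` are hypothetical; real work would use a statistics library.

```python
# Toy two-class linear discriminant on two measurements per plant.
# All numbers are made up for illustration; they are NOT Fisher's iris data.

def mean(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, m):
    # 2x2 within-class scatter matrix: sum of (x - m)(x - m)^T
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(class_a, class_b):
    """Return (weights, threshold) separating class A from class B."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    # Discriminant direction: inverse pooled scatter times difference of means
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Threshold: score of the midpoint between the two class means
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]

def predict(w, threshold, x):
    score = w[0] * x[0] + w[1] * x[1]
    return "B" if score > threshold else "A"

# "Training set": (sepal width, petal length) with the species already known.
SPECIES_A = [(3.4, 1.5), (3.1, 1.3), (3.6, 1.6), (3.3, 1.2)]
SPECIES_B = [(2.7, 4.4), (3.0, 4.8), (2.6, 4.1), (2.9, 4.6)]

w, threshold = fit_lda(SPECIES_A, SPECIES_B)
print(predict(w, threshold, (3.2, 1.4)))  # A
print(predict(w, threshold, (2.8, 4.5)))  # B
```

The fitted weights say nothing about why the two species differ; they only draw a line between the labelled examples.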
LLMs take strings of text and predict what text will come next. That involves massively greater training sets, and much more complex outputs, but the underlying principle is the same.
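The "predict what comes next" idea can be shown at toy scale with a bigram counter. This is a deliberately crude sketch: it is not how a transformer works internally, only an instance of the same train-then-predict pattern. The corpus and the function names are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def autocomplete(following, start, length=5):
    """Greedily extend `start` one predicted word at a time."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(following, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

# Hypothetical training text, chosen only to keep the counts easy to check.
CORPUS = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(CORPUS)

print(predict_next(model, "the"))    # cat  ("cat" follows "the" twice)
print(predict_next(model, "zebra"))  # None (never seen in training)
```

Everything the toy model "knows" is a table of counts from its training text, which is the sense in which the underlying principle carries over.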
Here's an article which gives the history before making some of the observations I've offered (though with a less critical tone).
https://www.sciencedirect.com/science/article/pii/S0047259X24000484
Very nice, thanks!
If you take any very large quantity of non-random data and use a powerful enough statistical algorithm you are likely to find some predictable patterns. But these patterns do not automatically provide an explanation of how the patterns were generated.
The Greeks, over 2000 years ago, discovered that the apparent movement of the planets in the Zodiac could be modelled, to an acceptable degree of accuracy, by using epicycles. We now know that using enough epicycles (given a powerful enough computer) it is possible to model almost any pattern. But the epicycle model (however far it is taken) will not tell us that planets move round the sun.
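In modern terms, a stack of epicycles is a complex Fourier series: each term is one circle turning at a fixed rate. A minimal sketch, assuming a made-up square "orbit" sampled at 16 points (obviously not something a planet does) and hypothetical helper names:

```python
import cmath

def epicycles(path):
    """Discrete Fourier transform: one complex coefficient per circle."""
    n = len(path)
    return [
        sum(z * cmath.exp(-2j * cmath.pi * k * t / n) for t, z in enumerate(path)) / n
        for k in range(n)
    ]

def reconstruct(coeffs, t, keep):
    """Position at sample time t using only the `keep` largest circles."""
    n = len(coeffs)
    biggest = sorted(range(n), key=lambda k: abs(coeffs[k]), reverse=True)[:keep]
    return sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in biggest)

def max_error(path, keep):
    """Worst-case distance between the path and its epicycle model."""
    coeffs = epicycles(path)
    return max(abs(reconstruct(coeffs, t, keep) - z) for t, z in enumerate(path))

def square_path(points_per_side=4):
    """Points around a square with corners at +/-1 +/-1j, as complex numbers."""
    side = [complex(1, -1 + 2 * s / points_per_side) for s in range(points_per_side)]
    path = []
    for quarter in range(4):
        # Rotate the first side by 0, 90, 180, 270 degrees
        path.extend(z * (1j ** quarter) for z in side)
    return path

PATH = square_path()
for keep in (1, 2, 4, len(PATH)):
    print(keep, "circles -> max error", max_error(PATH, keep))
```

With one circle per sample point the fit at the sampled positions is exact up to floating-point rounding, yet nothing in the coefficients says what actually produced the path.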
I suggest the large language models (while undoubtedly very powerful and useful tools) succeed in modelling "intelligence" for the same reason that epicycles model the apparent movement of the planets. There are clearly patterns (including a lot of repetitions) in any vast (terabyte) collection of cultural information, and using enough computer power and suitable algorithms at least some of these patterns can be identified and used to make plausible-looking predictions. However, I suspect that using the Turing Test "black box" model to assess them tells us very little about the underlying "intelligence" which generated the original patterns, as it is quite clear that the LLMs do not UNDERSTAND the information they are processing.
This was the main criticism made of discriminant analysis back in the 1970s, by McFadden and others, looking at choice of transport mode. They wanted an explicit model of choice, not a black box prediction.
Most of the problems of "machine learning" were discovered in the early days of statistical modelling, but have been forgotten.
Dr Emily Bender’s testimony is spot on, IMO. She’s the person who taught me that AI is a 1950s marketing effort.
Still searching for the video. Here’s written testimony.
https://democrats-science.house.gov/imo/media/doc/Dr.%20Bender%20-%20Testimony.pdf
Bender's work is categorically excellent, agreed.
Dr. Emily Bender is an amazing starting-off point for unwrapping the hype machine. Now knowing her name, and now knowing that MYSTERY AI HYPE THEATRE 3000 exists
(here: https://www.dair-institute.org/maiht3k/ )
this has been my Sunday morning's unexpected moment of excellence.
Thank you for introducing me to this woman.
Kudos, Kathy E Gill, and as always thank you, Dave Karpf, for encapsulating so much in your Substack. You are my Coles Notes, my starting-off point for Breaking Developments in all things Tech.
You’re welcome! Yes Dr Bender is awesome!
Thanks for a very useful reference. In the 1960s I was asked to look at how to interface future IMIS (Integrated Management Information Systems) with human users, and I proposed an electronic clerk interface that used a syntactically simple language, CODIL (using the human terminology), which should work symbiotically as a member of a human work team. The key requirements (which the latest AI developments fail to meet) were TOTAL transparency and full human-readable self-documentation. In effect nothing of the shared task should be hidden in a "black box." I was twice made redundant because the idea did not fit the then-fashionable AI paradigm, and I am now reassessing the project archives. It seems that CODIL can be considered an accidentally reverse-engineered model of how the human brain's short-term memory processes information, that much of our intelligence depends on our ability to share cultural knowledge (i.e. we are good copycats), and that our intelligence is built (using Sir Isaac Newton's words) on the shoulders of giants.
Thanks for making the point that these heralded and partially extant tools do not actually work.
Total side note, but I noticed an eerie similarity between your header image and one I generated with Stable Diffusion for one of my own articles last year: https://allscience.substack.com/p/ai-head-to-head-gpt-35-vs-gpt-4
I have long since lost my exact prompt details, but it's interesting to see how easily it defaults to recycled archetypes/stereotypes!
iiiiiiinteresting.
Yep! As GenAI scales, I wonder if we’ll see a wave of “accidental plagiarism” incidents as people create what they think is original work via bespoke prompts but the algorithm uses shortcuts or defaults to a median result per its training data. Also curious what impact we’ll see from more and more of the internet (and thus AI training data) becoming itself AI-generated, in an infinite recursion loop.
As best I understand, an LLM (what we have today) is based on a massive data set that has connections assigned for predictive relationships (what cynics call autocomplete). Based on Altman's call for trillions for more processing power and data collection, the hope seems to be that all those connections will form a network that will somehow spontaneously develop "real" "AI", like a human brain. Will it work? Who can say. Odds are we wind up with a supercomputer that tells us the answer is "42".
It's worth observing that the change in meaning of "algorithm" from "reliable mathematical procedure to solve a given problem" to "statistical model with unknown properties" has helped the credulous acceptance of "machine learning".
I really like the idea of making a distinction between something genuinely new and something substantive but rebranded. It's a useful way to view this critically.
Perhaps one potential criticism is that something can be imminently world-altering because of the financial might of those backing it. You write (excellently) a lot about the power concentrated in Silicon Valley. In other words, even bullshit generators can be imminently world-altering in a world run on futurism by a handful of cashed-up ideologues and nutters.
Perhaps too, a broader consideration of potential and actual use cases might in turn broaden your view on the novelty and transformative potential of this tech. Even if we take "bullshit generator" as a given, a business consultant and an artist are going to see those as two profoundly different things. For example, in abstract/experimental artistic ontologies where accuracy has limited to no meaning, it's difficult to parse the concept of "bullshit". I find those use cases interesting to think about, even if they are fringe.
Thanks, and agreed. I wrote a piece a year or so ago about generative AI as satisficing tools. I still think there's a lot to that. There are significant uses for tools like these, so long as they're constrained to what they're good for.
https://davekarpf.substack.com/p/on-generative-ai-and-satisficing
Thanks for the link and reply, that was a good read too. A good complementary piece to this one, I thought. Satisficing strongly reminds me of an idea from a Peter Watts novel: "There's no such thing as survival of the fittest. Survival of the most adequate, maybe. It doesn't matter whether a solution's optimal. All that matters is whether it beats the alternative."
> Are LLMs a genuinely new phenomenon, like the steam engine, or are they a significant incremental advance like, say, broadband-speed internet?
I think the answer is *both*, on different timelines. There are a lot of reasons the conversation around LLMs is so confusing and contentious; one is that people are talking about different time scales. The applications we can actually put our hands on today range from snake oil to significant incremental advance. However, the core capabilities of LLMs will continue to advance. More importantly, it's still very, very early days in figuring out how to make good use of these models. How to get information in and out of them ("retrieval-augmented generation", tool use, etc.), which application domains they're best suited for, redesigning user interfaces around chat instead of mouse clicks, etc.
The term 'general purpose technology' made me think: no, it is not, but it is a specific approximation technology that is more 'general data'-driven, and the results suggest it is 'general-purpose'.
I can think of lots of ways that AI can be put to use in science and engineering. Google’s protein folding project was a good example. But an LLM is not the right AI tool for any of them. The current wave of LLMs does not impress me as anything I would use for anything that was really important.
Exactly. As a biologist and a "data scientist", I use machine-learning techniques myself (although I don't call them "AI", in view of all the nonsense associated with that term these days), but I consider LLMs just Stupid Computer Tricks (with apologies to David Letterman).
"Andohbytheway, it isn't so clear that it actually works this time either.": It mostly doesn't, a fact that should be emphasized in all discussions of "AI" but typically isn't even mentioned. See, for example, "The fallacy of AI functionality":
https://dl.acm.org/doi/abs/10.1145/3531146.3533158
"Deployed AI systems often do not work. They can be constructed haphazardly, deployed indiscriminately, and promoted deceptively. However, despite this reality, scholars, the press, and policymakers pay too little attention to functionality. This leads to technical and policy solutions focused on 'ethical' or value-aligned deployments, often skipping over the prior question of whether a given system functions, or provides any benefits at all."
When I worked at the Swedish Institute of Computer Science during the 1990s, the institute was divided into three departments or "labs". One of them was called the "Knowledge-Based Systems" lab. Members of KBS lab worked on what some people called "AI" (this was the "expert system" era), but at least some members of KBS lab avoided the term "AI", which had been brought into disrepute by previous iterations of the "AI" hype cycle. Unlike clowns like Sam Altman, they were genuine, honest scientists.
In my work as a biologist and a "data scientist"*, I've used "machine learning", which is a term of convenience for certain kinds of statistical modeling (e.g., support-vector machines, gradient-boosted regression trees, and, yes, various flavors of artificial neural network). It isn't intelligent, not even close, if by "intelligent" we mean exhibiting anything like the breadth, flexibility, or reliability of reasoning by even a rather stupid human. And no, it isn't just a matter of expanding the training data. One clue to the contrary is that humans learn to recognize faces, generate grammatical sentences, etc. on the basis of far smaller data sets than state-of-the-art machine-learning systems require.
Moreover, this isn't surprising to anybody who knows even a little about evolution or how brains work. Intelligence isn't "one weird trick". It's a whole bag of tricks, which evolved over a vast span of time (e.g., the divergence time between humans and chimps is circa seven million years) and many of which aren't well understood yet, let alone replicated with computers. I know of no good reason to doubt artificial intelligence is possible but several good reasons to doubt it will happen within the next 20 years.
*Another widely used but cringe-inducing term. Which scientists aren't "data scientists"? (String theorists?)
I’m really looking forward to this AI boom popping. I’m quite sick of it.
LLMs are quite useful to help me code. I probably save 30 minutes a week. That’s not nothing, but it’s also not a game changer. It’s not a $100 billion data center worth of usefulness.
I read a lot of short story blogs, and I think Claude can write short stories better than like 80% of them. As an example, check out this post of mine: https://substack.com/home/post/p-145637657?r=loua&utm_campaign=post&utm_medium=web
So I'm a little more inclined than you to believe the hype!