37 Comments
Luigi's avatar

I like to refer to LLMs in particular as "advanced autocomplete". The transformer model was genuinely innovative! But it's still autocomplete, which we've had for decades now.
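To make the "autocomplete" framing concrete, here is a toy sketch of my own (nothing like how a production model actually works): a next-word predictor can literally be a table of counted word pairs.

```python
from collections import Counter, defaultdict

# Toy "autocomplete": count which word follows which, then predict the most
# frequent follower. Real LLMs use learned transformer weights over huge
# corpora rather than raw counts, but the task is the same: given the text
# so far, guess the next token.
corpus = "the cat sat on the mat and the cat chased the dog".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def autocomplete(word):
    counts = followers.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(autocomplete("the"))   # -> 'cat' (it followed 'the' most often)
print(autocomplete("sat"))   # -> 'on'
```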

Pablo's avatar

One of my favorite facts I have learned during this latest hype cycle is that the coinage of the term AI was a branding move by John McCarthy, made primarily so that the Dartmouth workshop he proposed in 1955 would not fall under the umbrella of Cybernetics, and they wouldn't have to deal with Norbert Wiener.

Glen's avatar

Great points.

Only thing I'd add is that this incremental advance on existing technologies is built on theft of IP on a massive scale. The hucksters in charge of these machine-learning companies themselves describe the lawsuits from the NY Times and big-name authors as an 'existential threat', and if the courts in Europe or the US find in favor of the copyright holders, the whole thing goes up in a puff of smoke.

John Quiggin's avatar

Machine learning involves the application of "neural networks" to massive data sets. And despite the name, neural networks have nothing to do with neurones. They are just a complicated and non-linear version of discriminant analysis, invented nearly 100 years ago. Fisher used it to assign plants to different species based on some measures such as sepal width. The "training set" contains plants where both the measurements and the species are known.
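To make that lineage concrete, here is a minimal sketch of the Fisher-style setup on the iris measurements, assuming scikit-learn (the details are mine, for illustration only, not from the article linked below):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# Fisher's original problem: given sepal/petal measurements, predict the species.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)          # "training set": measurements with species known
print(lda.score(X_test, y_test))   # accuracy on plants the model hasn't seen
```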

LLMs take strings of text and predict what text will come next. That involves massively greater training sets, and much more complex outputs, but the underlying principle is the same.

Here's an article which gives the history before making some of the observations I've offered (though with a less critical tone).

https://www.sciencedirect.com/science/article/pii/S0047259X24000484

Dave Karpf's avatar

Very nice, thanks!

Chris Reynolds's avatar

If you take any very large quantity of non-random data and use a powerful enough statistical algorithm, you are likely to find some predictable patterns. But these patterns do not automatically provide an explanation of how they were generated.

The Greeks, over 2000 years ago, discovered that the apparent movement of the planets in the Zodiac could be modelled, to an acceptable degree of accuracy, by using epicycles. We now know that with enough epicycles (given a powerful enough computer) it is possible to model almost any pattern. But the epicycle model (however far it is taken) will not tell us that the planets move round the sun.
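If it helps, here is a toy numerical version of that point (my own illustration only): fit an arbitrary periodic "observation" with more and more epicycles, i.e. a truncated Fourier series, and watch the fit improve without the model ever containing heliocentrism.

```python
import numpy as np

# A toy version of "enough epicycles can fit anything": approximate an
# arbitrary periodic signal (a sawtooth standing in for a planet's apparent
# motion) with a sum of uniform circular motions, i.e. a truncated Fourier
# series. The fit improves as epicycles are added, but nothing in the fit
# says "the planets orbit the sun".
t = np.linspace(0, 2 * np.pi, 1000, endpoint=False)
signal = t / np.pi - 1.0                      # sawtooth "observations" in [-1, 1)

coeffs = np.fft.rfft(signal) / len(signal)    # one complex coefficient per epicycle

def epicycle_fit(n_epicycles):
    """Rebuild the signal from only the first n epicycles."""
    approx = np.full_like(t, coeffs[0].real)
    for k in range(1, n_epicycles + 1):
        approx += 2 * (coeffs[k].real * np.cos(k * t) - coeffs[k].imag * np.sin(k * t))
    return approx

for n in (2, 10, 50):
    err = np.mean(np.abs(signal - epicycle_fit(n)))
    print(f"{n:3d} epicycles -> mean error {err:.3f}")
```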

I suggest the large language models (while undoubtedly very powerful and useful tools) succeed in modelling "intelligence" for the same reason that epicycles model the apparent movement of the planets. There are clearly patterns (including a lot of repetition) in any vast (terabyte-scale) collection of cultural information, and with enough computer power and suitable algorithms at least some of these patterns can be identified and used to make feasible-looking predictions. However, I suspect that using the Turing Test "black box" model to assess them tells us very little about the underlying "intelligence" which generated the original patterns, as it is quite clear that the LLMs do not UNDERSTAND the information they are processing.

John Quiggin's avatar

This was the main criticism made of discriminant analysis back in the 1970s, by McFadden and others, looking at choice of transport mode. They wanted an explicit model of choice, not a black box prediction.

Most of the problems of "machine learning" were discovered in the early days of statistical modelling, but have been forgotten.

⚡️Kathy E Gill's avatar

Dr Emily Bender’s testimony is spot on, IMO. She’s the person who taught me that AI is a 1950s marketing effort.

Still searching for the video. Here’s written testimony.

https://democrats-science.house.gov/imo/media/doc/Dr.%20Bender%20-%20Testimony.pdf

Dave Karpf's avatar

Bender's work is categorically excellent, agreed.

Liam Gee's avatar

Dr. Emily Bender is an amazing starting-off point for unwrapping the hype machine. Now knowing her name, and now knowing that MYSTERY AI HYPE THEATRE 3000 exists:

(here: https://www.dair-institute.org/maiht3k/ )

this has been my Sunday morning's unexpected moment of excellence.

Thank you for introducing me to this woman.

Kudos, Kathy E Gill, and as always thank you, Dave Karpf, for encapsulating so much in your substack. You are my Coles Notes, my starting-off point for breaking developments in all things tech.

⚡️Kathy E Gill's avatar

You’re welcome! Yes Dr Bender is awesome!

Chris Reynolds's avatar

Thanks for a very useful reference. In the 1960s I was asked to look at how to interface future IMIS (Integrated Management Information Systems) with human users, and I proposed an electronic-clerk interface using a syntactically simple language, CODIL (built on the humans' own terminology), which would work symbiotically as a member of a human work team. The key requirements (which the latest AI developments fail to meet) were TOTAL transparency and full human-readable self-documentation. In effect, nothing of the shared task should be hidden in a "black box." I was twice made redundant because the idea did not fit the then-fashionable AI paradigm, and I am now reassessing the project archives. It seems that CODIL can be considered an accidentally reverse-engineered model of how the human brain's short-term memory processes information, and that much of our intelligence depends on our ability to share cultural knowledge (i.e. we are good copycats): our intelligence is built (in Sir Isaac Newton's words) on the shoulders of giants.

Alicia Bankhofer's avatar

Thanks for making the point that these heralded and partially extant tools do not actually work.

Eric Fish, DVM's avatar

Total side note, but I noticed an eerie similarity between your header image and one I generated with Stable Diffusion for one of my own articles last year: https://allscience.substack.com/p/ai-head-to-head-gpt-35-vs-gpt-4

I have long since lost my exact prompt details, but it's interesting to see how easily it defaults to recycled archetypes/stereotypes!

Dave Karpf's avatar

iiiiiiinteresting.

Eric Fish, DVM's avatar

Yep! As GenAI scales, I wonder if we’ll see a wave of “accidental plagiarism” incidents as people create what they think is original work via bespoke prompts, but the algorithm uses shortcuts or defaults to a median result per its training data. Also curious what impact we’ll see from more and more of the internet (and thus AI training data) becoming itself AI-generated, in an infinite recursion loop.
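A toy way to picture that recursion (entirely my own sketch, with a one-dimensional Gaussian standing in for "the training data" rather than anything like a real LLM pipeline):

```python
import numpy as np

# Toy "training on your own output": fit a simple model (here just a mean and
# a spread) to the data, sample fresh "data" from the fitted model, and repeat.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=20)   # generation 0: "human" data

for generation in range(1, 1001):
    mu, sigma = data.mean(), data.std()
    data = rng.normal(mu, sigma, size=20)        # next generation trains on model output
    if generation % 250 == 0:
        print(f"generation {generation}: spread = {sigma:.4f}")

# The spread tends to drift toward zero: each generation forgets a little of
# the tails, so the "data" gets blander and blander.
```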

Cheez Whiz's avatar

As best I understand, an LLM (what we have today) is based on a massive data set with connections assigned for predictive relationships (what cynics call autocomplete). Based on Altman's call for trillions for more processing power and data collection, the hope seems to be that all those connections will form a network that somehow spontaneously develops "real" "AI", like a human brain. Will it work? Who can say. Odds are we wind up with a supercomputer that tells us the answer is "42".

John Quiggin's avatar

It's worth observing that the change in meaning of "algorithm" from "reliable mathematical procedure to solve a given problem" to "statistical model with unknown properties" has helped the credulous acceptance of "machine learning".

Nick Blood's avatar

I really like the idea of making a distinction between something genuinely new and something substantive but rebranded. It's a useful way to view this critically.

Perhaps one potential criticism is that something can be imminently world-altering because of the financial might of those backing it. You write (excellently) a lot about the power concentrated in Silicon Valley. In other words, even bullshit generators can be imminently world-altering in a world run on futurism by a handful of cashed-up ideologues and nutters.

Perhaps too, a broader consideration of potential and actual use cases might in turn broaden your view on the novelty and transformative potential of this tech. Even if we take "bullshit generator" as a given, a business consultant and an artist are going to see those as two profoundly different things. For example, in abstract/experimental artistic ontologies where accuracy has limited to no meaning, it's difficult to parse the concept of "bullshit". I find those use cases interesting to think about, even if they are fringe.

Dave Karpf's avatar

thanks, and agreed. I wrote a piece a year or so ago about generative AI as satisficing tools. I still think there's a lot to that. There are significant uses for tools like these, so long as they're constrained to what they're good for.

https://davekarpf.substack.com/p/on-generative-ai-and-satisficing

Nick Blood's avatar

Thanks for the link and reply, that was a good read too. A good complementary piece to this one, I thought. Satisficing strongly reminds me of an idea from a Peter Watts novel: "There's no such thing as survival of the fittest. Survival of the most adequate, maybe. It doesn't matter whether a solution's optimal. All that matters is whether it beats the alternative."

Steve Newman's avatar

> Are LLMs a genuinely new phenomenon, like the steam engine, or are they a significant incremental advance like, say, broadband-speed internet?

I think the answer is *both*, on different timelines. There are a lot of reasons the conversation around LLMs is so confusing and contentious; one is that people are talking about different time scales. The applications we can actually put our hands on today range from snake oil to significant incremental advance. However, the core capabilities of LLMs will continue to advance. More importantly, it's still very, very early days in figuring out how to make good use of these models. How to get information in and out of them ("retrieval-augmented generation", tool use, etc.), which application domains they're best suited for, redesigning user interfaces around chat instead of mouse clicks, etc.
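For what it's worth, here is a deliberately simplified sketch of the retrieval-augmented pattern mentioned above (a toy word-overlap retriever; every document and function name here is mine, for illustration only, and the final call to an actual model is left out):

```python
# Toy retrieval-augmented generation: look up the most relevant snippet first,
# then hand it to the model inside the prompt. The retriever is a crude
# word-overlap score; real systems use embeddings and a real model call.
documents = [
    "The steam engine transformed manufacturing and transport in the 1800s.",
    "Broadband internet sped up existing online activity rather than creating it.",
    "Large language models predict the next token from patterns in text.",
]

def retrieve(question: str) -> str:
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    context = retrieve(question)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do large language models work?"))
# The printed string is what would be sent to the model.
```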

Gerben Wierda's avatar

The term 'general purpose technology' made me think: no, it is not, but it is a specific approximation technology that is more 'general data'-driven, and the results suggest it is 'general-purpose'.

Bill Flarsheim's avatar

I can think of lots of ways that AI can be put to use in science and engineering. Google’s protein folding project was a good example. But an LLM is not the right AI tool for any of them. The current wave of LLMs does not impress me as something I would use for anything really important.

werdnagreb's avatar

I’m really looking forward to this AI boom popping. I’m quite sick of it.

LLMs are quite useful for helping me code. I probably save 30 minutes a week. That’s not nothing, but it’s also not a game changer. It’s not a $100 billion data center’s worth of usefulness.

Jonah Ogilwy's avatar

I read a lot of short story blogs, and I think Claude can write short stories better than like 80% of them. As an example, check out this post of mine: https://substack.com/home/post/p-145637657?r=loua&utm_campaign=post&utm_medium=web

So I'm a little more inclined than you to believe the hype!

Philip Koop's avatar

I wonder whether you are thinking of "Tesler's Theorem", which says that "AI is whatever hasn't been done yet." This was used in the goalpost-moving context you mention. It's called the "AI Effect". https://en.wikipedia.org/wiki/AI_effect

ETA: I guess the pithy way of making your point is "AI used to be whatever hasn't been done yet. Now AI is whatever has been done already."
