13 march 2025
The field of NLP has been in a bit of a crisis since the arrival of LLMs. Unofficially, the primary goal of the field has always been to produce a system (or systems, plural) that can manipulate, understand, and generate text as effectively as a human (though not necessarily in the same way). My claim is that this goal has effectively been reached, since around GPT-3. Certainly, if you showed a ChatGPT-generated text to someone with no knowledge of LLMs and told them it was produced by a human, they would be very unlikely to suspect anything.
Understandably, this has caused a bit of discord among NLP researchers. No scientist genuinely hopes the central mysteries of their field will be solved, as they would then be out of a job. In NLP, this has led a majority of researchers to pivot to LLMs, and a minority to write highly critical position papers. This does not seem to me like a particularly sustainable arrangement.
In fact, the hottest topics in LLM research at the moment, namely "agentic behavior" and LLM reasoning [1], have pretty much breached containment and escaped from NLP, returning instead to ancient AI topics like search and logic. I think this is good evidence for my claim above: it suggests we have modeled language so well that the only thing left to model is the actual thought that produces language! If you are a linguist who strongly believes that language is a distinct and self-contained psychological process, this should be great news for you.
Okay, back to NLP. NLP is a bit of a weird field in that it simultaneously studies methods for working with natural language data and the data itself. This leads to papers such as the original word2vec paper, which both proposes a new method for various NLP tasks (static word embeddings) and uses said method to make the observation that words with analogical relationships form a predictable structure when embedded in these vector spaces (see the sketch after the list below). Likewise, NLP conferences tend to draw a bimodal distribution of people:
- Engineering-minded folks who are interested in optimizing performance on valuable tasks
- Science-minded people who are interested in understanding how and why certain methods work
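To make the word2vec observation concrete, here is a minimal sketch using gensim's downloader and its pretrained Google News vectors (both are assumptions about your setup; any pretrained static embeddings would do):

```python
# A minimal sketch of the word2vec analogy observation.
# Assumes gensim is installed; "word2vec-google-news-300" is the pretrained
# model name in gensim's downloader (a large one-time download).
import gensim.downloader as api

kv = api.load("word2vec-google-news-300")

# king - man + woman lands near queen in the embedding space.
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```

The top hit should be "queen", which is exactly the kind of structure the paper pointed out falls out of the embeddings for free.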
So, given my claim that LLMs effectively solve NLP, what are my predictions? I suspect that continuing to hill-climb with more and more accurate models is a losing battle, or at least an unexciting one. There will certainly always be room to optimize, but the improvements will only get smaller, and the focus will largely shift to improving the speed, efficiency, and portability of models.
On the other hand, I think there is a huge amount of potential in exploring how and why these models work, and in using them as models to understand language. My advisor liked to rant that we now have complete models of language that are small enough to run on a laptop, and yet people who study language refuse to use them.
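As one illustration of what "using them as models of language" could look like, here is a minimal sketch (assuming PyTorch and Hugging Face transformers, with GPT-2 standing in as the laptop-sized model) that computes per-token surprisal, a quantity psycholinguists routinely relate to human reading times:

```python
# A sketch of using a small pretrained LM as a psycholinguistic tool:
# per-token surprisal (-log2 probability given left context) for a classic
# garden-path sentence. Assumes torch and transformers are installed.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tokenizer("The horse raced past the barn fell.", return_tensors="pt").input_ids

with torch.no_grad():
    log_probs = torch.log_softmax(model(ids).logits, dim=-1)

# The token at position t is predicted by the logits at position t-1.
for t in range(1, ids.size(1)):
    surprisal = -log_probs[0, t - 1, ids[0, t]].item() / math.log(2)
    print(f"{tokenizer.decode(ids[0, t])!r}: {surprisal:.2f} bits")
```

A linguist could run minimal pairs or garden-path sentences through this and compare the surprisal profile against human reading-time data, all on a laptop.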
Perhaps moving in this direction would mean less funding from industry and less hype from the media. But it feels like far more interesting and fundamental science.
Of course, I may be as wrong as everyone who declared that physics was solved just before quantum mechanics.
mg
[1] I realize these are distinct topics, but they kind of have the same vibe.