7 march 2025

I suspect that studying a language is a bit like studying a snowflake. Even at a distance, you can see the structure and intentionality, and as you get closer and pull out a magnifying glass you can see more tiny, beautiful details. These details seem to keep getting smaller almost without end, like a fractal.

Now, as the expert, you can fill a book with your knowledge about the snowflake. Turns out, others have written their own books on their own snowflakes, and browsing through them you can find patterns that look the same as yours. You might even devote your life to reading these books and sorting the snowflakes into piles based on their patterns, so you can now understand (roughly) all of the snowflakes in the world. However, it's a bit uncertain whether you've really learned much about snowflakes at all!

This is unashamedly a bit of a rant about the current research agenda of (formal) linguistics. The vast majority of garden-variety generative grammar work is endlessly looking closer at snowflakes, hoping that maybe if we look close enough we will suddenly see what snowflakes are. Meanwhile, typology is going around recording data on various snowflakes and making up categorizations, in hopes that maybe a big enough sample will reveal the truth.

We understand what snowflakes are because we understand how they are formed. It seems to me that this must be how we should understand language, too. We would need to understand how it is produced and processed in the brain, how it arises within a population, how it spreads and changes, what laws of information theory govern it, and how variation occurs.

Not to stretch the metaphor too far, but I do suspect that language has a fractal attractor. That is, the formation of a language instance is so sensitive to some random starting condition (whatever that means in the context of a population) that it is impossible to predict the full extent of its rules (which have infinite complexity, limited only by hardware restrictions).

Worth noting: this really only applies to morphology, syntax, semantics, and up. Phonology (❤️) is obviously correct and obviously universal, with good biological motivation.

mg