What Are Language Families? Part I: The Indo-European Connection

Sam Quillen
Jan 24, 2024

Speakers of most of the world’s languages are at least vaguely aware that their tongue is related to some others. For people in the West, the most familiar example is the Romance languages, which began as Latin slang in Italy, Spain, Gaul, and elsewhere and grew up to become great languages in their own right. Fewer people are aware that Latin itself was just one branch of a family tree whose roots were already deep-set when Rome was just a minor Italian city-state.

The Romance languages are part of the broader Indo-European language family, which includes virtually all of the languages of Europe (which mainly fall into the Romance, Germanic, and Slavic groups), as well as those of Iran and northern India. At least 4,000 years ago, the forebears of all these peoples, warlike pastoralists sometimes called the Aryans, left their ancient home near the Black Sea and set out to conquer a large swath of the world.

In the millennia that followed, they mixed with other populations (which is why today’s Pakistanis and Norwegians look different) and developed vastly different cultures. But their languages still hark back to their common roots. I am currently working in Pakistan, where even my rudimentary efforts to learn Urdu have yielded some comforting traces of familiarity. Another nation has kept in touch with its Aryan heritage so well that it still calls itself Iran.

Relatives can drift apart in four thousand years, but there will always be a connection.

Even this cursory survey reveals some commonality that has endured over the millennia. Other features remain as well: for example, Indo-European languages tend to be highly inflected compared to many other languages. That is, words take on different endings depending on their role in the sentence (e.g., “I see” versus “he sees”), something that is actually relatively uncommon among languages globally. But the way that different tongues do these things does, of course, vary widely.

Some languages change more than others — English and the major Romance languages, thanks to the turbulent history of the Middle Ages and the later age of empire, have shifted more than most. Likewise, at the opposite end of the Indo-European world, classical India’s Sanskrit-speaking empires gave way to medieval states whose people garbled their Sanskrit dialects into new languages, while preserving the old tongue in their temples.

On the other hand, languages with relatively stable, isolated populations can be handed down the generations largely intact. While modern English speakers struggle with Shakespeare (let alone Old English), today’s Icelanders have little trouble with Viking sagas. Even Russian today is still fairly close to the speech of a thousand years ago. Lithuanian, often described as the most conservative of the living Indo-European languages, has drifted so little that its speakers can still recognise some basic Sanskrit words (see the chart above).

The father of this field of linguistics was Sir William Jones, a lawyer and polyglot of Welsh descent who left Britain in 1783 for a lucrative post as a judge in Bengal, a position that came with a knighthood. While posted there, he set himself to studying Sanskrit. His argument for a deep kinship between the Indian holy language and Latin and Greek, set out in a lecture in 1786, is said to have met with ridicule at first, but it went on to become the foundation of comparative linguistics.

The Bhagavad Gita, like other sacred texts of Hinduism, is written in Sanskrit. It was reportedly a favourite book of Heinrich Himmler, the Nazi boss who saw it as an exemplar of pure Aryan religion.

Indo-European remains by far the most-studied and best-understood language family. That makes sense, as the heirs of the Aryans constitute fully 46% of the world’s population, and billions more learn juggernaut languages from English to Hindi. Linguists have even reconstructed the Proto-Indo-European language of our most ancient ancestors, using cognates and other links from attested (i.e., written) classical languages like Latin, Greek, Sanskrit, and Persian to extrapolate back into the prehistoric past. If you are curious, ChatGPT does a remarkably good job rendering and explaining sample texts in it.

It is an awkward fact that even the best linguists will always be best at analysing languages similar to their own. I am sure it is not a coincidence that my own forays into Arabic and India have resulted in two of my most-criticised articles. But lessons first distilled in Europe and South Asia can be applied globally. In the next part, we can channel our inner William Jones and study how the apparent cacophony of around 7,000 individual languages in the world actually resolves into a relatively small number of major chords.



Former linguistics student; current investment bank analyst who sometimes thinks about something other than spreadsheets