The Principles of Linguistics, in Plain English

Sam Quillen
Mar 22, 2022

Linguistics, the study of how people communicate with one another, is probably the most versatile of the sciences, with applications from psychology to computer coding. However, for something that we all use every day of our lives, language comes with a scholarly lexicon that can be pretty hard to understand.

To grasp the basics of the field, it is helpful to know how it developed. For thousands of years before linguistics existed as a distinct discipline, grammarians were assiduously describing the proper way to speak and write in great languages like Latin and Arabic. These scholars were the first to admit that their work was necessary because most people did not speak properly. As early as the 4th Century A.D., Roman proto-linguists compiled entire books to gripe about the errors the rabble were making. Perhaps ironically, the vulgarities are of a lot more interest to scholars today.

Linguistics became formally scientific in the 19th Century thanks to the efforts of the Germans. Unlike Latin and Greek, ancient Germanic was never written down, so men like Jacob Grimm had to devise more universal laws for how to describe and study language.

The most elementary exercise for linguistics novices is to transcribe words in the International Phonetic Alphabet. My partner and I had different accents, so whenever we messed up we would blame that to get off the hook.

At the most basic level, to talk about a language, you need to cover individual words (how they are pronounced and what they mean), and how they fit together. For pronunciation, we use the International Phonetic Alphabet, a neutral set of Latin-ish letters that encompasses all the vowels and consonants in any language in the world.

How to pronounce a word is one thing, but what it means can be more complicated. Language changes constantly, and there is a whole field of linguistics devoted to tracing words’ etymologies and how their meanings shift over time. For a few key words with philosophical or political portent, a subtle difference in how two groups define a word can be of lethal importance.

If you are interested in learning more about this area, speaking English is a rare blessing. English boasts one of the richest etymological heritages of any language in the world. It is a Germanic language, but thanks to its unique history, the majority of its vocabulary is Latin or Greek (usually via Norman French). And unlike other peoples, most infamously the French, English speakers have always borrowed with gusto from other languages all over the world. If you spot a word that looks a bit foreign, or even a familiar one like “admiral,” “ketchup,” or “tattoo,” try looking it up in the Online Etymology Dictionary; you would be surprised how weird our words are. (To demonstrate the point, every italicised word in this paragraph is one that came from a foreign language.)

The Vikings were the first of several waves of invaders that crashed over England in the Middle Ages and changed our language forever (photo credit: Entertainment Weekly).

Once we understand individual words, the next step is how they fit together. This topic was near and dear to grammarians’ hearts for millennia before linguistics was a formal discipline, but like the IPA, syntax is now a science. Probably the best-known grammatical feature is constituent word order, i.e., the arrangement of the subject, object, and verb.

English, like most modern European languages and about a third of languages around the world, goes SVO. The most common order worldwide, used by about half of all languages, including Latin, Persian, and Japanese, is SOV (e.g., Caesar Galliam vincit, literally, “Caesar Gaul conquers”). A smaller share of languages are VSO, and the other three permutations are vanishingly rare. A few languages are the perfect reverse of English, while others sound like Yoda. Take a moment to reflect on the plight of the generations of PhD students who spent months in the high Caucasus and Brazilian Amazon before they finally found a few tribes who say “Capybaras I see” (OSV).
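To make the six permutations concrete, here is a minimal Python sketch that shuffles one toy clause through every constituent order. The clause, the `arrange` and `all_orders` helpers, and the one-word-per-role simplification are all assumptions made for illustration; real languages also change word forms, not just positions.

```python
from itertools import permutations

# Hypothetical toy clause: one English word per grammatical role.
CLAUSE = {"S": "I", "V": "see", "O": "capybaras"}


def arrange(clause, order):
    """Return the clause's words in the given order, e.g. "SOV"."""
    return " ".join(clause[role] for role in order)


def all_orders(clause):
    """Map each of the six possible constituent orders to a sentence."""
    return {"".join(p): arrange(clause, p) for p in permutations("SVO")}


if __name__ == "__main__":
    for order, sentence in all_orders(CLAUSE).items():
        print(order, "->", sentence)
```

Calling `arrange(CLAUSE, "OSV")` gives the Yoda-style “capybaras I see,” while `"SVO"` gives the ordinary English order.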

For more sophisticated analysis of how every part of speech fits into place, we use syntax trees. This is not a simple concept to explain in a brief article, but at a high level, these trees branch into noun, verb, and preposition phrases, and these further into determiners (e.g., “the,” “a”), nouns, adjectives, prepositions, and other fun little twigs. They take some time to master, but once you do, it can be a stimulating way to while away a long plane flight.
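For readers who think in code, a syntax tree can be sketched as nested tuples. The sentence, the phrase labels, and the `words` and `bracketed` helpers below are illustrative assumptions rather than a standard notation from any particular textbook.

```python
# A branch is (label, child, child, ...); a leaf is (part_of_speech, word).
# Hypothetical analysis of "the little bear fishes in the river".
TREE = (
    "S",
    ("NP", ("Det", "the"), ("Adj", "little"), ("N", "bear")),
    ("VP",
        ("V", "fishes"),
        ("PP", ("P", "in"), ("NP", ("Det", "the"), ("N", "river")))),
)


def is_leaf(node):
    """A leaf pairs a part-of-speech label with a single word."""
    return len(node) == 2 and isinstance(node[1], str)


def words(node):
    """Read the sentence back off the tree, left to right."""
    if is_leaf(node):
        return [node[1]]
    result = []
    for child in node[1:]:
        result.extend(words(child))
    return result


def bracketed(node):
    """Render the tree in labeled-bracket form."""
    if is_leaf(node):
        return f"[{node[0]} {node[1]}]"
    inner = " ".join(bracketed(child) for child in node[1:])
    return f"[{node[0]} {inner}]"


if __name__ == "__main__":
    print(" ".join(words(TREE)))
    print(bracketed(TREE))
```

The sentence (S) splits into a noun phrase and a verb phrase, and the verb phrase in turn contains a preposition phrase with its own noun phrase inside it, which is the branching described above.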

I hope the little bear is paying closer attention to catching dinner than to seeing the forest full of syntax trees.

Once you have mastered the theoretical side of linguistics, it is time to apply your skills to the messy world of real languages. One popular exercise is to look at a map, usually of some unfamiliar place like a Polynesian island or medieval Poland, screened over with a chart of how a certain word is pronounced. Based on principles like people near a river sharing a common dialect, or mountains keeping communities apart, you can deduce how people in a certain blank part of the map will pronounce it.

I might have had an easier time learning linguistics if I had been learning this instead of what 13th Century Mazovian peasants called plows.

At some point, words have shifted so much that two communities can no longer understand one another, and a new language is born. In the real world, what constitutes a language as opposed to a dialect has more to do with politics than intelligibility. Mandarin and Cantonese Chinese are at least as far apart as Spanish and Italian, while Danes and Norwegians can understand one another quite readily. As the saying goes, a language is a dialect with an army and a navy.

But one of the oldest and most popular fields in linguistics is devoted to cataloguing and grouping the languages of the world. The broadest units are language families. Of the 8,000 or so languages spoken today, all fit into one of about 150 families. The biggest of these by far is the Indo-European family, including almost all of the languages of Europe, plus those of Persia and North India. Since their speakers went their separate ways several millennia ago, a vast gulf has opened between English and Hindi. But if you look closely, especially at fundamental words like numbers, you can still find similarities. Sharp Lithuanians can even understand basic Sanskrit.

Other major language families include Afro-Asiatic (including Arabic, Hebrew, and other Near Eastern languages), Sino-Tibetan (centered on China), Austronesian (the Pacific Islands), and Niger-Congo (Sub-Saharan Africa). A huge share of languages (albeit not of speakers) fall into minor families, or are isolates all to themselves. About an eighth of all languages are spoken only on the island of New Guinea. There is an excellent summary of all this on a site called KryssTal — it was one of the first things I read that got me interested in linguistics.

It is no accident that language boundaries so often coincide with today’s international borders. Most of the world’s conflict zones lie where they do not.

Below the family level, languages are divided into progressively more specific taxa until we arrive at individual languages. To again use our own family, the major Indo-European branches include Germanic, Romance, Slavic, Iranian, and Indo-Aryan (i.e., Indian). Just a few thousand years ago, the ancestors of everyone from Cork to Calcutta shared one common tongue.

By extension, it is possible (though unlikely) that all the languages of the world have a common origin, but thanks to hundreds of generations of kids ignoring their grammar teachers it is impossible to trace it now. If people paid attention to pedantic purists, the world probably would be a lot more peaceful and united today. In my opinion, at least, it would also be a lot less interesting.

Written by Sam Quillen

Former linguistics student; current investment bank analyst who sometimes thinks about something other than spreadsheets