Freethought Forum - View Single Post

erimir · #4 01-15-2018, 02:05 AM

Quote:

Originally Posted by The Man

I'm not actually sure if this or The Sciences is a more appropriate place to put this thread, but the question I have is largely historical in scope, so I'll put it here.

You could justify it either way, but I'd note that while the difference you're noticing is connected to etymology, which is historical ("What are the origins of many multisyllabic words in English?"), it's also reflected synchronically. That is, if you compare English with Greek or the Romance languages, you'll find that English generally has a higher percentage of monosyllabic words, or shorter words more generally. (This would probably also hold up in a comparison of Old English with Latin or Ancient Greek, as it's reflected across the various members of those language branches.)

Quote:

I've gotten sucked into a vortex of looking up the etymologies of English words, and I've noticed an increasingly ubiquitous pattern: longer words seem much likelier to have Greek or Latin etymologies than shorter ones [...]. Most of the one-syllable English words I've looked up have had Germanic origins of some sort, while almost all the three-syllable-and-above words have had either Greek or Latin roots.
[...]
As a result, I'm trying what's probably a more reliable approach: asking a forum that contains actual linguistic experts (e.g., erimir). Is there a correlation between English word length and linguistic origin? Purely beyond the searches I've run, writing advice that's stuck in my head leads me to suspect there is.

Germanic languages tend to have monosyllabic roots, especially. Words can be longer for various reasons, such as inflections (long-er, writ-ing, wish-es, etc.), derivations (slow-ness, slow-ly, writ-er, etc.) or compounding (light-house, al-so, etc.).

Short words tend to have Germanic etymologies because of this phonological difference between the language families. Germanic languages tend to put the stress on the first syllable, and over time they lost many of the later syllables (all those magic silent e's in English used to be pronounced, for example).

I would also note that many borrowings from French are only one or two syllables. This is because French is the Romance language that has reduced or eliminated the most vowels. We also borrowed more commonplace words from it due to the Norman invasion, whereas Latin and Greek words entered English largely through academic and scientific writings. So we have words like "debt", "doubt", "chief", "beef", "pork", etc.

Borrowings from other languages will reflect the phonological patterns in those languages too. Chinese languages tend to have words with one or two syllables, so we have "tea" (and "chai"), "gung-ho", "ketchup", "ginkgo", "kowtow", "typhoon", etc. I suspect we're unlikely to borrow many single-syllable Chinese words, since many will already have a meaning in English, especially since, if you ignore tone, Chinese has far fewer possible monosyllables than English. But we have far fewer borrowings from languages that aren't French, Latin, or Greek.

So differences in phonology explain part of it. But part is probably also related to the types of words that would be borrowed.

It's probably true that technical words tend to be longer across languages, and those words are even more likely to be borrowings (particularly in English), while function words (prepositions, pronouns, conjunctions, etc.) tend to be shorter across languages, and function words are less likely to be borrowed.

To illustrate the point with technical words... Icelandic tries to avoid Latin and Greek borrowings, but the Germanic neologisms they use instead are often longer words too (for example, veðurfræði means weather-science, i.e. meteorology, which is quite similar to what the Greek roots mean anyway). A lot of Latin and Greek technical words are also compounds (anything ending with -ology or -ography, for example), and compounds obviously tend to be longer, since they must contain two roots instead of one. If we tried to use native Anglo-Saxon words for a lot of technical vocabulary, we'd probably end up doing similar things (we could say, perhaps, weathercraft or weatherlore) and they'd still end up being multisyllabic.

So the type of words we borrow (more technical words, fewer commonplace words, very few function words) also has an effect.

Borrowings are also moderated by English's own phonological patterns. As I mentioned, we don't have many monosyllabic borrowings from Chinese since there's a lot of overlap with preexisting English words. Similarly, we'd be unlikely to borrow a particularly long Japanese or Swahili or Turkish word. A German word like Backpfeifengesicht is unlikely to enter common use, and if it does, it would probably end up being shortened. Schadenfreude is probably about as long as it gets if we're not familiar with the components of the word (and most people elide the final vowel, in my experience).

Specifically for Latin and Greek, as compared to other sources of borrowings, I would say since we're familiar with so many Latin and Greek roots, we can combine them into longer words. As you can imagine, a large amount of scientific vocabulary didn't actually exist in Latin when it was a spoken language, since those words were created to describe new concepts. "Colonoscopy" is not a word used by the ancients, I'm guessing.

Many of these patterns would also apply to borrowings in other languages! I took a semester of Swahili, and Swahili, like English, has a vocabulary dominated by borrowings (in the case of Swahili, from Arabic and, to a lesser extent, English). But while Swahili words tend to be much longer than English ones because of more extensive inflection, I noticed that Arabic-origin verbs tended to have longer roots than Bantu-origin verbs. The verb for "to eat" is ku-la and "to be" is ku-wa, but "to travel" is ku-safiri (yes, related to safari) and "to discuss" is ku-jadili. I can be relatively assured of the etymologies because Bantu verbs always end in 'a', and so the ones that don't are borrowings, usually Arabic.
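
For anyone who wants to play with that final-vowel rule of thumb, here is a minimal Python sketch of it. The function name looks_borrowed and the "ku-" spelling convention are my own assumptions for illustration, and the word list is just the four verbs mentioned above. Note that the rule only works in one direction: a stem that doesn't end in 'a' is flagged as a likely borrowing, but a stem that does end in 'a' isn't thereby proven to be Bantu.

```python
def looks_borrowed(infinitive: str) -> bool:
    """Rule of thumb: a Swahili verb stem that does not end in 'a'
    is probably a borrowing (hypothetical helper, not a real library)."""
    prefix = "ku-"  # infinitive marker, written with a hyphen as in the post
    stem = infinitive[len(prefix):] if infinitive.startswith(prefix) else infinitive
    return not stem.endswith("a")


# The four verbs used as examples in the post.
verbs = ["ku-la", "ku-wa", "ku-safiri", "ku-jadili"]

for verb in verbs:
    label = "likely borrowed" if looks_borrowed(verb) else "possibly Bantu"
    print(f"{verb}: {label}")
```

Running it on a larger verb list would only give a rough first pass, and actual etymologies would still need checking in a dictionary.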

Quote:

Orwell advised, in "Politics and the English Language", to avoid using too many words of foreign (especially Greek or Latin) origin, and also, "Never use a long word when a short word will do".

But this is just an imperfect heuristic for avoiding words that won't be understood, because they're technical jargon or overly "learned". Plenty of foreign words in English are common and easily understood. (For example, "plenty", "foreign", "common" and "easy" all come from French.)

If you perceive the word as foreign, on the other hand, that might be a better indicator that it's less likely to be understood than whether it actually is foreign. (But if you intentionally avoided non-Germanic words in English and used archaic Germanic words, you would be using fewer foreign words but be harder to understand.)

Generally speaking, I find hard and fast rules like that for writing to be misguided. Elements of Style contains a lot of stupid rules, for example.