Sunday, March 11, 2018

A few word-play jokes

First, a translation of a poem (ci-poem to be exact) by Ms. Li Qingzhao (李清照, 1084 – ca 1155/1156), a poet at the turn of the Northern-to-Southern Song dynasty.

李清照《永遇乐·落日熔金》
落日熔金,Sunset of molten gold
暮云合璧,Evening clouds of enclosing jade
人在何处。 Where am I standing?
染柳烟浓, Mist coloring the willows thickens
吹梅笛怨, Flute plays “The plum of melancholy”
春意知几许。 How's the springtime coming?
元宵佳节, The joyous Festival of Lantern
融和天气, in this clement weather
次第岂无风雨。 “Will it not be windy and rainy soon?”
来相召、香车宝马,谢他酒朋诗侣。 “Sorry”, said I to my wine-and-poetry friends, who came to invite me for an outing, in their fragrant BMW

Second, a list of words offered to "improve" English vocabulary, with a caution to the readers when I posted it to Weibo. And the "facts" stated therein are not to be trusted.

English vocabulary (non-)study
英语词汇的(非)学习
Learners of limited vocabulary should wear gas masks to avoid poisoning.
词汇有限的学习者须戴防毒面具

* infantry:
In the mid-20th century, the first public child care facility in the US was established in the suburb of Chicago, Jenkins Infantry, named after the owner Mary Jenkins.

* indefatigable
At the end of the 3-month clinical trial, 35% of the volunteers presented no change in either the body-mass index or the normalized adipose quantity. These indefatigable participants were advised to join a more aggressive weight watch program.

* bruxiathesaurus
A group of international paleontologists recently discovered never-before-seen dinosaur fossils, tentatively named bruxiathesaurus, on the evidence that these creatures apparently would grind their teeth while sleeping. Bruxia or bruxism, grinding or clenching teeth at night, is common among Homo sapiens. This is the first time dinosaurs have been found to have this behavior.

* infarction
Some patients with irritable bowel syndrome (IBS) try to “hold in” flatulence. There is no controlled study on either any benefit or harm done by this practice of infarction.

Sunday, February 4, 2018

The Multilingual Idioms List

Linguaholic created a crowdsourcing project, The Multilingual Idioms List. I think two things are new in this project.

  • As far as I know, there has never been a dictionary that pairs idioms, and only idioms, from different languages. It's true that numerous dictionaries of idioms for a specific language have been published. The explanations or definitions of the idioms may be in the same language as the idioms, or in a different language. When they are in a different language (called the target language for the sake of argument), more often than not a matching idiom in the target language cannot be found, and a wordy explanation is provided. The Multilingual Idioms List project handles this situation differently: it leaves the entry blank on the target language side. This is actually a good thing. It either positively acknowledges the lack, or catches readers' attention and waits for other native speakers to find a good idiom later.
  • The List is multilingual, not limited to two languages. Unlike the compilers of any published dictionary of idioms where the source and target languages differ, the contributors, or in a sense lexicographers, of the crowdsourced List are not language professionals. This is not a big problem since the List is not a highly technical dictionary. The big advantage, on the other hand, is that the contributors are almost all native speakers. This is significant because good or even correct usage of idioms depends very much on real-life experience in the language environment. Being native may be more relevant to this project than being professional, if being both is not possible.

Today, I made a small contribution to the List, by adding the column Chinese (since no one before me had done that), and providing a dozen or so idioms, as follows:

a bitter pill: 不得不吞的苦果
a piece of cake: 小菜一碟
Achilles' heel: 软肋
add insult to injury: 雪上加霜;往伤口上撒盐
an arm and a leg: 倾家荡产
beat around the bush: 拐弯抹角
best of both worlds: 两全其美
bite the bullet: 硬着头皮上
burn the midnight oil: 开夜车
cast in stone: 板上钉钉
cat nap: 打个盹儿
from A to Z: 从头到尾
from scratch: 从零开始
have eyes in the back of one's head: 眼观四路,耳听八方
hit the road: 上路
let the cat out of the bag: 抖包袱
kick the bucket: 见阎王
off the hook: 如释重负

In Chinese, there are different types of idioms. 成语 (literally probably "solidified or invariable phrases") are more formal and literary, mostly of four characters, such as "自相矛盾" ("self-contradictory"), "纸上谈兵" ("talk of military strategy (only) on paper"). 歇后语 (literally "sentences said after taking a rest") are colloquial proverbs, such as "和尚打伞,无法无天" ("A monk holds up an umbrella. No hair|law. No sky.", or "The dharma is obscured and heaven blocked."). Obviously some idioms are in neither category, and yet are expressions that cannot be literally interpreted, such as "硬着头皮上", literally "go ahead with hardened scalp", which I consider matching "bite the bullet" in English.

I can think of one improvement that could be made to the current List. It would be nice to provide a place to enter the literal translation of an idiom and, optionally, a brief explanation. For instance, I would love to add that the Chinese idiom "软肋" for "Achilles' heel" literally means "soft rib", because the rib bone is relatively weak and fragile, and that "雪上加霜" for "add insult to injury" literally means "add frost on top of snow", a phrase that may not need an explanation. With these additions, the List would be more fun to read. We would learn, for instance, that instead of "beating around the bush", the Chinese "make turns and scratch corners" ("拐弯抹角") and the French "turn around the pot" ("tourner autour du pot"). While English-speaking people consider Greek a difficult language ("It's all Greek to me!"), Chinese is by far the language most often regarded this way by other peoples; "Chinese" occurs 24 times out of about 100, compared to 12 for "Greek", on the Wikipedia page for Greek to me. Through this List, we learn a little more about different cultures. But the technical limitation of the List is understandable; it is in the format of a spreadsheet, where adding two more columns (literal meaning and explanation) for each language would make the list too hard to read. One alternative is to add comments to the spreadsheet cells, where the comments are not shown unless the mouse is over the cell.

Overall, this is a great project. I hope they'll set up a Wikipedia page, with versions in many different languages contributed by the same volunteers that build the List.

Saturday, November 11, 2017

Chinese translation of a poem by Kahlil Gibran

Kahlil Gibran (1883 – 1931) was an accomplished Lebanese poet. His well-known poem On Children

Your children are not your children.
They are the sons and daughters of Life's longing for itself.
They come through you but not from you,
And though they are with you, yet they belong not to you. 
has been translated into Chinese as follows:
你们的孩子,都不是你们的孩子
乃是生命为自己所渴望的儿女。
他们是借你们而来,却不是从你们而来
他们虽和你们同在,却不属于你们。 
or in another version:
你的儿女,其实不是你的儿女。
他们是生命对于自身渴望而诞生的孩子。
他们借助你来这世界,却非因你而来,
他们在你身旁,却并不属于你。

The second line, plainly paraphrased, means that the children are the offspring or outcome of the longing of Life for itself. Here Life acts as an entity, as if it existed in space and time. It tries to find itself, and in the process are born the children who appear to belong to you, the addressee of the author. The Chinese rendering of this abstract description, "生命为自己所渴望的儿女", is a grammatically perplexing one. Let's build up from the basics. "他所渴望的是工作" is "What he longs for is a job". Based on that model, "自己所渴望的" must mean "what (someone/something) he/she/it-self longs for", or here specifically, "what (something) itself longs for". (I added "someone" or "something" solely because the word he/she/it-self cannot stand alone.) Now, if we substitute Life for this something, we get "what Life itself longs for", or "生命自己所渴望的" in Chinese. That doesn't match the original meaning; the author intends to say the children are the outcome of the longing, not the object of the longing. Life longs for itself, and this longing process begets the children. Unfortunately, the translation "生命为自己所渴望的儿女" is not saying the same thing, either. In fact, it says something a native Chinese speaker has trouble understanding. I can't even think of a good literal translation of this ambiguous and possibly ungrammatical phrase. In contrast, the second translation, "他们是生命对于自身渴望而诞生的孩子", is a good one, thanks to the extra word "诞生" added by the translator. Literally it says "They are the children born out of Life's longing for itself", which is remarkably close to Gibran's original.

The third line is deceptively simple. What exactly does the author mean by "through you but not from you"? The first Chinese translation, "他们是借你们而来,却不是从你们而来", uses "借" (v. "to borrow"; prep. "with the help of") for "through", and "从" for "from". The second translation, "他们借助你来这世界,却非因你而来", uses "借助" ("with the help of") for "through", and "因" ("because", "because of", "due to") for "from". Both translations interpret "through you" as "with the help of you". The first literally renders "from", while the second changes it to "because of". I checked the translations of this line into a few other languages. For example:
Spanish: Vienen a través vuestro, pero no de vosotros.
French: Ils viennent à travers vous mais non de vous.
German: Sie kommen durch dich, aber nicht von dir.
Italian: Tu li metti al mondo, ma non li crei.
Only the Italian version does not literally translate the prepositions "through" and "from" in the original poem. Instead, the sentence means, plainly put, "You put them into the world, but do not create them."

The Italian rendering, in my opinion, has strayed a little too far from the author's possibly deliberate wording, which borders on a mischievous play on words. Similarly, the Chinese translations, which change the author's "through" to "with the help of" and (in one case) "from" to "because of", would be frowned upon by the author. We know that, unlike scholarly translation, which should be literal, some or even a great deal of flexibility is allowed in the translation of literary, especially poetic, works. But the Spanish, French and German translations I found all stubbornly stick to the literal mapping of the two prepositions. My take on this is that if the original poem can be understood in its original language and also in the target language with a literal translation, no word change should be made, and I believe that is exactly the case here. We can make sense of "They come through you but not from you" if we use a good analogy. Imagine a scene in which bright sunlight shines through the window and comes into the room. This sunlight (the children in Gibran's poem) comes through the window glass (you) and yet it is not truly from the window or glass, but from the sun. In this interpretation, the light travels literally through the glass, without the help of the glass (contrary to both Chinese interpretations), without the glass somehow putting the light down into the room (contrary to the Italian interpretation), and having no cause-and-effect relation with the glass (contrary to the second Chinese translation). The light belongs to the sun because the sun created it. The light can come into the room simply because the window is the only transparent part of the external wall. Gibran's "through you but not from you", when likened to "through the window glass but not from the glass", is a clever play on the prepositions and yet makes perfect sense. There is no need to replace them unless they would be misunderstood.
The best Chinese translation may simply be a literal one, "他们通过你而来,却不是从你而来". If needed, a translator's note can be provided to help the reader. Anything else will likely tarnish the beauty of this line.

Thursday, September 14, 2017

Language difficulty

Chinese has been widely considered to be one of the most difficult languages in the world. What constitutes the difficulty of a language? Can it be measured and how? Whenever someone posts a message about language difficulty on a forum, it almost always generates a heated discussion. Comments range from "English is the easiest because the verbs have minimal conjugation and nouns have no gender", "Chinese and Japanese are hard because there're too many characters or kanji", to "No language is inherently more difficult than any other because native speakers grow up speaking it with about the same effort", and "Language difficulty is subjective perception", to name a few.

Most language enthusiasts on various forums are not scholars. The diversity of those opinions results from the lack of a good definition of language difficulty. But we can tell that most people are referring to the difficulty experienced by an adult (not a young child) in learning a foreign language (not the mother tongue), and in many cases the adult's native language is English. If we qualify the discussion with these requirements, i.e.

  • the learner is an adult;
  • the language whose difficulty is evaluated is learned by the adult as a foreign language;
  • the difficulty is evaluated when the adult's native language is specified,
then a measurement of language difficulty becomes meaningful.

I believe that in many social sciences, there are two general methods to measure a quantity, internal and external. For example, in linguistics, a researcher can define a set of factors pertinent to the correlation between orthography (spelling) and pronunciation in order to calculate the orthographic depth of a language, i.e. "the degree to which a written language deviates from simple one-to-one letter-phoneme correspondence". Alternatively, one can simply conduct a controlled study among a group of people (cohort) and see which language causes how many spelling errors in dictation or in a similar experiment.

When it comes to rating language difficulty, we can devise a set of rules, individually assess each language against these rules, and then sum the rule ratings (with weights). Example rules are the percentage of words that have a cognate or loan relationship with words in the learner's native language, whether the nouns have genders and cases, how many variations there are in verb conjugation, whether the dominant word order differs from that of the learner's native language, etc. For lack of a better term, we may call this an internal evaluation.
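To make the idea of an internal evaluation concrete, here is a minimal sketch of such a weighted sum. Everything in it is an assumption made up for illustration: the rule names, the weights, the 0-10 rating scale, and the two placeholder languages are hypothetical and do not come from any published study.

```python
# A sketch of an "internal evaluation": rate a language against a set of
# rules and combine the ratings as a weighted sum.
# All rule names, weights and ratings below are hypothetical.

# Hypothetical weights; they sum to 1 so the score stays on the 0-10 scale.
WEIGHTS = {
    "non_cognate_vocabulary": 0.4,  # share of words with no cognate or loan link
    "noun_gender_and_case": 0.2,    # burden of noun genders and cases
    "verb_conjugation": 0.2,        # number of variations in verb conjugation
    "word_order_difference": 0.2,   # distance from the learner's native word order
}


def difficulty_score(ratings, weights=WEIGHTS):
    """Return the weighted sum of per-rule ratings (each rating is 0-10)."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[rule] * ratings[rule] for rule in weights)


# Made-up ratings for two placeholder languages, as seen by one learner group.
language_a = {
    "non_cognate_vocabulary": 3,
    "noun_gender_and_case": 5,
    "verb_conjugation": 6,
    "word_order_difference": 1,
}
language_b = {
    "non_cognate_vocabulary": 9,
    "noun_gender_and_case": 0,
    "verb_conjugation": 1,
    "word_order_difference": 4,
}

for name, ratings in [("Language A", language_a), ("Language B", language_b)]:
    print(name, round(difficulty_score(ratings), 2))
```

The hard part of a real study is of course choosing the rules and the weights, which must be fixed for a specified native language of the learner; the arithmetic itself is trivial.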

The external evaluation, on the other hand, has been done and is widely quoted. The best-known data for native English speakers are from the Defense Language Institute (DLI) of the US, which statistically measures the time learners take to achieve a certain language proficiency level. The official Web page for this study is https://www.ausa.org/articles/dlis-language-guidelines, duplicated below for your convenience.

  • Category I languages, 26-week courses, include Spanish, French, Italian and Portuguese.
  • Category II, 35 weeks, includes German and Indonesian
  • Category III, 48 weeks, includes Dari, Persian Farsi, Russian, Uzbek, Hindi, Urdu, Hebrew, Thai, Serbian Croatian, Tagalog, Turkish, Sorani and Kurmanji
  • Category IV, 64 weeks, includes Arabic, Chinese Mandarin, Korean, Japanese and Pashto
The earliest version of this data was on a Web page of Dr. William Baxter of the University of Michigan, who obtained it "from documents I got at a workshop of some kind" (private email). But Dr. Baxter later removed it from his Web site, so you have to retrieve it from archive.org; it is duplicated below.

Each group below lists:
  • the languages included (languages regularly offered at the University of Michigan are in capital letters; this is NOT a complete list);
  • Hours: hours of instruction required for a student with average language aptitude to reach level-2 speaking proficiency;
  • Level: speaking proficiency level expected of a student with superior language aptitude, after 720 hours of instruction.

GROUP I: Afrikaans, Danish, DUTCH, FRENCH, Haitian Creole, ITALIAN, Norwegian, PORTUGUESE, Romanian, SPANISH, Swahili, SWEDISH
   Hours: 480    Level: 3
GROUP II: Bulgarian, Dari, FARSI (PERSIAN), GERMAN, (Modern) Greek, HINDI-URDU, INDONESIAN, Malay
   Hours: 720    Level: 2+ / 3
GROUP III: Amharic, Bengali, Burmese, CZECH, Finnish, (MODERN) HEBREW, Hungarian, Khmer (Cambodian), Lao, Nepali, PILIPINO (TAGALOG), POLISH, RUSSIAN, SERBO-CROATIAN, Sinhala, THAI, TAMIL, TURKISH, VIETNAMESE
   Hours: 720    Level: 2 / 2+
GROUP IV: ARABIC, CHINESE, JAPANESE, KOREAN
   Hours: 1320   Level: 1+
That data differs considerably from DLI's current data. I had some email exchanges with DLI, but they didn't explain these discrepancies.

[Update 2018-04]
Dr. Robert Marzari, the author of Leichtes Englisch, schwieriges Französisch, kompliziertes Russisch, kindly sent me a summary of the result of his research and granted me permission to post it here.

In my book I tried to evaluate the difficulty of seven European languages (English, French, Spanish, Italian, Russian, Polish - and German) for a German speaking learner; for the evaluation of the German language I imagined a Romance speaker, i.e. a mixture of a French, Italian and Spanish speaker. The results of the evaluation therefore do not show absolute degrees of complexity, but rather relative degrees of difficultness, i.e. relative to a German or Romance speaker.
   If you could get hold of my book (at a University library perhaps?) just take a look at the charts on pages 269 to 275: On these charts I give the results of my evaluation of those seven languages according to the linguistic subsystems of phonetics, writing system, grammar, lexicon and textual structurization (i.e. reading difficulty).
   According to these the degree of a learner`s difficulty is as follows:
          active competence    passive competence   complete competence
          (speaking+writing)   (reading)
Spanish   29 points            11 points            40 points
English   33 points            13 points            46 points
Italian   35 points            13 points            48 points
French    43 points            10 points            53 points
Russian   51 points            15 points            66 points
German    50 points            18 points            68 points
Polish    54 points            16 points            70 points

This excellent research indicates that, for a native German speaker, language difficulty ranks as Spanish < English < Italian < French < Russian < Polish, which is quite consistent with many polyglots' experience, although reading has a slightly different order. Apparently this research uses an internal evaluation (see above for a description), rating various aspects of a language instead of measuring students' learning effort. Thus, placing German in this list makes sense even though the learners of German are assumed to speak a different native language, a Romance language instead of German.
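The orderings in Dr. Marzari's table are easy to double-check with a few lines of code. The sketch below only copies the points from the table; the names SCORES and ranking are my own, introduced for illustration. Note that the German row is rated for a Romance speaker, unlike the other six rows.

```python
# Points copied from Dr. Marzari's table: (active, passive, complete).
# SCORES and ranking() are illustrative names, not part of his research.
SCORES = {
    "Spanish": (29, 11, 40),
    "English": (33, 13, 46),
    "Italian": (35, 13, 48),
    "French": (43, 10, 53),
    "Russian": (51, 15, 66),
    "German": (50, 18, 68),
    "Polish": (54, 16, 70),
}

ACTIVE, PASSIVE, COMPLETE = 0, 1, 2


def ranking(column):
    """Languages sorted from easiest to hardest by the given column.

    sorted() is stable, so languages with equal points keep table order.
    """
    return sorted(SCORES, key=lambda lang: SCORES[lang][column])


# Complete competence should be the sum of active and passive competence.
for lang, (active, passive, complete) in SCORES.items():
    assert active + passive == complete, lang

print("complete:", " < ".join(ranking(COMPLETE)))
print("reading: ", " < ".join(ranking(PASSIVE)))
```

Running it confirms that every "complete" figure is the sum of the other two, and it shows where the reading order departs from the overall order: French, for example, has the lowest reading score even though it ranks fourth overall.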

Unfortunately, I'm not aware of any other research on this topic. But as you can already see, an otherwise hot topic can be made quite cool by the above analysis, cool as opposed to hot or debatable, and cool in the sense of being interesting.