Saturday, November 11, 2017

Chinese translation of a poem by Kahlil Gibran

Kahlil Gibran (1883 – 1931) was an accomplished Lebanese poet. His well-known poem On Children

Your children are not your children.
They are the sons and daughters of Life's longing for itself.
They come through you but not from you,
And though they are with you, yet they belong not to you. 
has been translated into Chinese as follows:
你们的孩子,都不是你们的孩子
乃是生命为自己所渴望的儿女。
他们是借你们而来,却不是从你们而来
他们虽和你们同在,却不属于你们。 
or in another version:
你的儿女,其实不是你的儿女。
他们是生命对于自身渴望而诞生的孩子。
他们借助你来这世界,却非因你而来,
他们在你身旁,却并不属于你。

The second line, plainly paraphrased, means that the children are the offspring or outcome of the longing of Life for itself. Here Life acts as an entity as if it exists in space and time. It tries to find itself, and in the process, are born the children who appear to belong to you, the addressee of the author. The Chinese rendering of this abstract description, "生命为自己所渴望的儿女", is a grammatically perplexing one. Let's build up from the basics. "他所渴望的是工作" is "What he longs for is a job". Based on that model, "自己所渴望的" must mean "what (someone/something) he/she/it-self longs for", or here specifically, "what (something) itself longs for". (I added "someone" or "something" solely to work around the problem that the word he/she/it-self alone cannot stand alone.) Now, if we substitute Life for this something, therefore, "what Life itself longs for" or "生命自己所渴望的" in Chinese, that doesn't match the original meaning; the author intends to say the children are the outcome of the longing, not of what Life longs for. Life longs for itself and this longing process begets the children. Unfortunately, the translation "生命为自己所渴望的儿女" is not saying the same thing, either. In fact, it says something a native Chinese speaker has trouble understanding. I can't even think of a good literal translation of this ambiguous and possibly ungrammatical phrase. In contrast, the second translation, "他们是生命对于自身渴望而诞生的孩子" is a good one, thanks to the extra word "诞生" added by the translator. Literally it says "They are the children born out of Life's longing for itself", which is remarkably close to Gibran's original.

The third line is deceptively simple. What exactly does the author mean by "through you but not from you"? The first Chinese translation, "他们是借你们而来,却不是从你们而来", uses "借" (v. "to borrow"; prep. "with the help of") for "through", and "从" for "from". The second translation, "他们借助你来这世界,却非因你而来", uses "借助" ("with the help of") for "through", and "因" ("because", "because of", "due to") for "from". Both translations interpret "through you" as "with the help of you". The first renders "from" literally, while the second changes it to "because of". I checked the translations of this line into a few other languages. For example:
Spanish: Vienen a través vuestro, pero no de vosotros.
French: Ils viennent à travers vous mais non de vous.
German: Sie kommen durch dich, aber nicht von dir.
Italian: Tu li metti al mondo, ma non li crei.
Only the Italian version does not literally translate the prepositions "through" and "from" in the original poem. Instead, the sentence means, plainly put, "You put them into the world, but do not create them."

The Italian rendering, in my opinion, has gone a little too far from the author's possibly deliberate wording that borders on mischievous play of words. Similarly, the Chinese translations, which change the author's "through" to "with the help of" and (in one case) "from" to "because of", would be frowned upon by the author. We know that unlike scholarly translation which should be literal, some or even a great deal of flexibility is allowed in translation of literary especially poetic works. But the Spanish, French and German translations I found all stubbornly stick to the literal mapping of the two prepositions. My take on this is that if the original poem can be understood in its original language and also in the translated language with literal translation, no word change should be made, and I believe that is exactly the case here. We can make sense of "They come through you but not from you" if we use a good analogy. Imagine the scene in which bright sunlight shines through the window and comes into the room. This sunlight (the children in Gibran's poem) comes through the window glass (you) and yet it is not truly from the window or glass, but from the sun. In this interpretation, the light travels literally through the glass, without the help of the glass (contrary to both Chinese interpretations), without the glass somehow putting the light down into the room (contrary to the Italian interpretation), and having no cause-and-effect relation with the glass (contrary to the second Chinese translation). The light belongs to the sun because the sun created it. The light can come into the room simply because only the window out of the whole external wall is transparent. Gibran's "through you but not from you", when likened to "through the window glass but not from the glass", is a clever play of the prepositions and yet makes perfect sense. There is no need to replace them unless misunderstood. 
The best Chinese translation may simply be a literal one, "他们通过你而来,却不是从你而来". If needed, a translator's note can be provided to help the reader. Anything else will likely tarnish the beauty of this line.

Thursday, September 14, 2017

Language difficulty

Chinese has been widely considered to be one of the most difficult languages in the world. What constitutes the difficulty of a language? Can it be measured and how? Whenever someone posts a message about language difficulty on a forum, it almost always generates a heated discussion. Comments range from "English is the easiest because the verbs have minimum conjugations and nouns have no gender", "Chinese and Japanese are hard because there're too many characters or kanji's", to "No language is inherently more difficult than any other because native speakers grow up speaking it with about the same effort", and "Language difficulty is subjective perception", to name a few.

Most language enthusiasts on various forums are not scholars. The diversity of those opinions results from the lack of a good definition of language difficulty. But we can tell that most people are referring to the difficulty experienced by an adult (not a young child) in learning a foreign language (not the mother tongue), and in many cases the adult's native language is English. If we qualify the discussion with these requirements, i.e.

  • the learner is an adult;
  • the language whose difficulty is evaluated is learned by the adult as a foreign language;
  • the difficulty is evaluated when the adult's native language is specified
then a measurement of language difficulty becomes meaningful.

I believe that in many social sciences, there are two general methods to measure a quantity, internal and external. For example, in linguistics, a researcher can define a set of factors pertinent to the correlation between orthography (spelling) and pronunciation in order to calculate the orthographic depth of a language, i.e. "the degree to which a written language deviates from simple one-to-one letter-phoneme correspondence". Alternatively, one can simply conduct a controlled study among a group of people (cohort) and see which language causes how many spelling errors in dictation or in a similar experiment.

When it comes to rating language difficulty, we can devise a set of rules and individually assess each language against these rules and then sum the rule ratings (with weights); e.g., percentage of words that have cognate or loan relationship with the words in the learner's native language, whether the nouns have genders and cases, how many variations in verb conjugation, whether the dominant word order differs from that of his native language, etc. For lack of a better term, we may call this an internal evaluation.
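The internal evaluation described above amounts to a weighted sum, which can be sketched as follows. This is only an illustration: the rule names, the weights, and the 0-to-1 ratings are all hypothetical, made up to show the arithmetic, and are not taken from any published study.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    weight: float  # assumed importance of the rule (hypothetical)
    rating: float  # assumed rating for one language, 0 (easy) to 1 (hard)


def internal_difficulty(rules):
    """Return the weighted sum of the rule ratings."""
    return sum(rule.weight * rule.rating for rule in rules)


# Hypothetical ratings of one imaginary language for one imaginary learner.
example_rules = [
    Rule("few cognates or loanwords shared with the native language", 3.0, 0.8),
    Rule("nouns have genders and cases", 2.0, 0.5),
    Rule("many variations in verb conjugation", 2.0, 0.25),
    Rule("dominant word order differs from the native language", 1.0, 0.0),
]

score = internal_difficulty(example_rules)
print(f"internal difficulty score: {score:.2f}")
```

The score only means something relative to other languages rated with the same rules and weights for the same native language, which is why the learner's native language has to be specified first.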

The external evaluation, on the other hand, has been done and is widely quoted. The best-known data for native English speakers are from the Defense Language Institute (DLI) of the US, which statistically measures the time learners take to achieve a certain language proficiency level. The official Web page for this study is https://www.ausa.org/articles/dlis-language-guidelines, duplicated below for your convenience.

  • Category I languages, 26-week courses, include Spanish, French, Italian and Portuguese.
  • Category II, 35 weeks, includes German and Indonesian
  • Category III, 48 weeks, includes Dari, Persian Farsi, Russian, Uzbek, Hindi, Urdu, Hebrew, Thai, Serbian Croatian, Tagalog, Turkish, Sorani and Kurmanji
  • Category IV, 64 weeks, includes Arabic, Chinese Mandarin, Korean, Japanese and Pashto
The earliest version of this data was on a Webpage of Dr. William Baxter of the University of Michigan, which he got "from documents I got at a workshop of some kind" (private email). But Dr. Baxter later removed it from his Website, so you have to reference it from archive.org, duplicated below.

Languages included: languages regularly offered at the University of Michigan are in capital letters; this is NOT a complete list.
Hours: hours of instruction required for a student with average language aptitude to reach level-2 speaking proficiency.
Level: speaking proficiency level expected of a student with superior language aptitude, after 720 hours of instruction.

GROUP I (Hours: 480; Level: 3)
  Afrikaans, Danish, DUTCH, FRENCH, Haitian Creole, ITALIAN, Norwegian, PORTUGUESE, Romanian, SPANISH, Swahili, SWEDISH
GROUP II (Hours: 720; Level: 2+ / 3)
  Bulgarian, Dari, FARSI (PERSIAN), GERMAN, (Modern) Greek, HINDI-URDU, INDONESIAN, Malay
GROUP III (Hours: 720; Level: 2 / 2+)
  Amharic, Bengali, Burmese, CZECH, Finnish, (MODERN) HEBREW, Hungarian, Khmer (Cambodian), Lao, Nepali, PILIPINO (TAGALOG), POLISH, RUSSIAN, SERBO-CROATIAN, Sinhala, THAI, TAMIL, TURKISH, VIETNAMESE
GROUP IV (Hours: 1320; Level: 1+)
  ARABIC, CHINESE, JAPANESE, KOREAN
That data differs considerably from DLI's current data. I had some email exchanges with DLI, but they didn't explain these discrepancies.

[Update 2018-04]
Dr. Robert Marzari, the author of Leichtes Englisch, schwieriges Französisch, kompliziertes Russisch, kindly sent me a summary of the result of his research and granted me permission to post it here.

In my book I tried to evaluate the difficulty of seven European languages (English, French, Spanish, Italian, Russian, Polish - and German) for a German speaking learner; for the evaluation of the German language I imagined a Romance speaker, i.e. a mixture of a French, Italian and Spanish speaker. The results of the evaluation therefore do not show absolute degrees of complexity, but rather relative degrees of difficultness, i.e. relative to a German or Romance speaker.
   If you could get hold of my book (at a University library perhaps?) just take a look at the charts on pages 269 to 275: On these charts I give the results of my evaluation of those seven languages according to the linguistic subsystems of phonetics, writing system, grammar, lexicon and textual structurization (i.e. reading difficulty).
   According to these the degree of a learner's difficulty is as follows:
          active competence    passive competence   complete competence
          (speaking+writing)   (reading)
Spanish   29 points            11 points            40 points
English   33 points            13 points            46 points
Italian   35 points            13 points            48 points
French    43 points            10 points            53 points
Russian   51 points            15 points            66 points
German    50 points            18 points            68 points
Polish    54 points            16 points            70 points

This excellent research indicates that, for a native German speaker, language difficulty ranks as Spanish < English < Italian < French < Russian < Polish, which is quite consistent with many polyglots' experience, although reading has a slightly different order. Apparently this research uses an internal evaluation (see above for a description), rating various aspects of a language instead of checking students' learning challenge. Thus, placing German in this language list makes sense even though the learners of German speak a different native language, a Romance language instead of German.
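To make the orderings easy to check, the points in Dr. Marzari's summary can be sorted column by column. The sketch below uses only the numbers quoted above. The variable and function names are mine, and German is kept in the list even though its learner is a Romance speaker rather than a German speaker.

```python
# Points from the summary above: (active, passive, complete).
points = {
    "Spanish": (29, 11, 40),
    "English": (33, 13, 46),
    "Italian": (35, 13, 48),
    "French": (43, 10, 53),
    "Russian": (51, 15, 66),
    "German": (50, 18, 68),
    "Polish": (54, 16, 70),
}

# Each "complete" score should be the sum of the active and passive scores.
assert all(active + passive == complete
           for active, passive, complete in points.values())


def ranking(column):
    """Languages sorted from easiest to hardest by the given column index."""
    return sorted(points, key=lambda lang: points[lang][column])


by_complete = ranking(2)  # overall order
by_reading = ranking(1)   # reading (passive competence) order
print("complete:", " < ".join(by_complete))
print("reading: ", " < ".join(by_reading))
```

English and Italian tie at 13 points for reading, so their relative position in the reading order is arbitrary. Python's sort is stable and simply keeps them in table order.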

Unfortunately, I'm not aware of any other research on this topic. But as you can already see, an otherwise hot topic can be made quite cool by the above analysis, cool as opposed to hot or debatable, and cool in the sense of being interesting.

Monday, July 10, 2017

Tian Ji's horse racing and the electoral vote system

[The following was written on November 11, 2016.]

The author of the famous military strategy book The Art of War, Sun Wu, commonly known as Sun Tzu, had a descendant, Sun Bin, who also wrote a book with the same title. In ca. 340 BC, Sun Bin advised his patron Tian Ji at a horse racing event and won the race. The following is the excerpt from Sima Qian's Records of the Grand Historian about this interesting story:

齐使者如梁,孙膑以刑徒阴见,说齐使。齐使以为奇,窃载与之齐。齐将田忌善而客待之。忌数与齐诸公子驰逐重射。孙子见其马足不甚相远,马有上、中、下、辈。于是孙子谓田忌曰:“君弟重射,臣能令君胜。”田忌信然之,与王及诸公子逐射千金。及临质,孙子曰:“今以君之下驷与彼上驷,取君上驷与彼中驷,取君中驷与彼下驷。”既驰三辈毕,而田忌一不胜而再胜,卒得王千金。于是忌进孙子于威王。威王问兵法,遂以为师。
(The ambassador of the Qi state went to the Liang state. Sun Bin, as a convicted criminal, went to visit and talk to him secretly. The Qi ambassador regarded Sun as valuable and carried him back to Qi. Tian Ji, the Qi general, gave him a warm reception. Ji and some princes often bet heavily on horse racing. Mr. Sun saw that all the horses were about equally capable, rated superior, average, and inferior. So Sun advised Tian Ji, "Sir, you just bet heavily. I'll make you win." Tian Ji trusted him and bet a thousand units of gold with the king and the princes. Right before the race, Mr. Sun said, "Use your inferior horse to race with his best horse, use your average horse to race with his inferior horse, and use your best horse to race with his average horse." After three rounds, Tian Ji lost one and won two of the three rounds, and carried away one thousand units of gold. Then Ji recommended Mr. Sun to King Wei, who interviewed Sun on military tactics and appointed him Chief of Staff.)
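Sun Bin's advice can be checked by brute force over all six ways of matching three horses against three. The speeds below are hypothetical. The only assumption, which the text hints at with "about equally capable", is that each of Tian Ji's horses is slightly slower than the king's horse of the same grade but faster than the king's horse one grade down.

```python
from itertools import permutations

# Hypothetical speeds (higher is faster), invented for illustration.
king = {"superior": 9, "average": 6, "inferior": 3}
tian_ji = {"superior": 8, "average": 5, "inferior": 2}


def wins(order):
    """Count Tian Ji's wins when his horses, in the given order, meet the
    king's superior, average, and inferior horses respectively."""
    return sum(tian_ji[mine] > king[theirs]
               for mine, theirs in zip(order, king))


# Try every assignment of Tian Ji's horses and keep the one with most wins.
best_order = max(permutations(tian_ji), key=wins)
print(best_order, wins(best_order))
```

Under these assumed speeds, racing grade against grade loses all three rounds, and the only order that wins two rounds is the one Sun Bin recommended: inferior against superior, superior against average, average against inferior.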

Fast forward to 2016. We see that the electoral vote in the US presidential race matters while the popular vote does not and that the two votes mathematically represent two different winners in this 2016 presidential race. Although neither Hillary Clinton nor Donald Trump can move her or his supporters from one state to another, there is similarity between the electoral vote system and Tian Ji's winning strategy. If democracy is the name given to the principle of the minority obeying the majority, the popular vote is the only true democracy. (As of this writing, Clinton has won 60,274,974 popular votes, while Trump has won 59,937,338.)
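The similarity can be seen in a toy election. The three states, their elector counts, and all vote numbers below are invented for illustration and have nothing to do with the actual 2016 figures. Candidate B, like Tian Ji, gives up one contest by a wide margin and wins the other two narrowly.

```python
# Hypothetical three-state election; every number here is invented.
# Each entry: (electors, votes for candidate A, votes for candidate B).
states = {
    "State 1": (10, 900, 100),  # A wins by a landslide
    "State 2": (10, 480, 520),  # B wins narrowly
    "State 3": (10, 490, 510),  # B wins narrowly
}

popular_a = sum(a for _, a, _ in states.values())
popular_b = sum(b for _, _, b in states.values())

# Winner-take-all: all of a state's electors go to whoever leads there.
electoral_a = sum(e for e, a, b in states.values() if a > b)
electoral_b = sum(e for e, a, b in states.values() if b > a)

print("popular vote:  ", popular_a, "vs", popular_b)
print("electoral vote:", electoral_a, "vs", electoral_b)
```

In this sketch A leads the popular vote while B leads the electoral vote, just as Tian Ji wins the bet despite owning the slower horses.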

The reasons some people decide not to vote are (A) equal dislike of the candidates; (B) lack of interest in politics; (C) the belief that, in a non-swing state, one person's vote matters little. Group C may be small. But it's the only one of the three that would make a difference if the American electoral vote system were abolished or even mitigated (e.g., by adjusting the weights, i.e. the electors assigned to different states). If that happened, swing states would have lower voter turnout and non-swing states higher. But since there are fewer swing states than non-swing states, the total popular vote count would be higher.

Sunday, April 16, 2017

自由: "freedom" or "liberty"?

A Chinese reader asked me about the difference between "freedom" and "liberty" when translating Chinese "自由" into English. We can find many answers with a Google search for "difference between freedom and liberty". One article maintains that "Freedom is a state of being capable of making decisions without external control", while liberty "is freedom which has been granted to a people by an external control". And some like this laboriously attempt to make a clear distinction between these two words.

Having read a handful of such answers but not satisfied with any of these, I told the person asking me the question: 1. the etymology of the two words differs; 2. in general usage, "liberty" is more abstract and philosophical than "freedom". Other than these two points, there is no difference, but in different contexts, only one of the two words is more common. For example, nowadays we say "freedom of speech", not "liberty of speech". (But see the ngram figure in Appendix 1.) We say "Liberty, Equality, Fraternity", not "Freedom, Equality, Fraternity". These set phrases are by convention, just as in Chinese idiom "破釜沉舟" ("cut off all means of retreat", "decide to fight to death"), not "破釜沉船", even though "舟" and "船" are completely synonymous.

Making distinctions between words is so intriguing that someone has even built a Web site www.differencebetween.net dedicated to this task. Language professionals and general public alike are fond of writing articles on these topics. While many such articles are valuable contributions to the correct usage in English, there is one common deficiency not fully recognized: the judges are the native speakers of the language, not linguists or scholars. An age-old debate among lexicographers is relevant here: Should a dictionary be prescriptive, directing people toward correct or supposedly correct usage, or be descriptive, faithfully documenting the actual usage in the native speaker community? Nowadays there may be more dictionaries in the latter category, presumably consistent with the increased level of public education. In the case of "freedom" vs. "liberty", if enough people, not English-as-a-foreign-language learners but native speakers, ask the question about their difference, the very fact that they ask this is a sign that the distinction, if there is a theoretical one, hardly exists in practice. Instead of making a great effort to separate them, it would be better to acknowledge, in modesty, the lack of difference between them.

________________________

Appendix 1

This figure is the Google ngram showing the historical usage of "freedom of speech" and "liberty of speech". We can see that from the mid-19th century on, "freedom of speech" has significantly gained in usage over "liberty of speech". But before that time, it only had slightly higher usage frequency.

Appendix 2

Some Weibo users gave me a few helpful pointers on this topic. One user informed me that political theorist and philosopher Isaiah Berlin's Four Essays on Liberty used "freedom" and "liberty" interchangeably. Two other users directed me to political scientist Hanna Pitkin's Are Freedom and Liberty Twins? According to Pitkin, most people don't make a distinction between these two terms, but Hannah Arendt is an exception. However, the author questioned Arendt's distinction from the point of view of political science as well as etymology (see the bottom of p.6 and p.9 of the article).

Appendix 3

The prescriptive-descriptive dichotomy, however, only applies to everyday language usage. In academic fields, especially in science and technology, but to some extent in the social sciences and humanities as well, the "prescriptive" approach should be supported, in accordance with the principle of division of linguistic labor as proposed by the philosopher Hilary Putnam. Take osteoarthritis as an example. An educated English speaker would think this meant inflammation (-itis) of a bone (osteo-) joint (-arthr-). But it does not. Then, should the distinction between "freedom" and "liberty", if nonexistent in practice, be made in academic circles as two different terms in the social sciences or humanities, followed by educative admonition to the public about the research outcome? Scholars have the freedom of research and can make any distinction between any pair of words in their research. In fact, social scientists and particularly philosophers habitually do that. As to whether the distinction should be imposed on the public: no!

Monday, January 9, 2017

Comparison of Chinese and Western Etymology

In my last post, I said "Most languages in the world take the alphabetic writing system. Studying the internal history of its vocabulary primarily means analyzing phonological and morphological changes through time." In this post,[note1] I'll expand on that point and contrast that with the Chinese tradition.

Take the word language as an example. In English, we read

late 13c., langage "words, what is said, conversation, talk," from Old French langage "speech, words, oratory; a tribe, people, nation" (12c.), from Vulgar Latin *linguaticum, from Latin lingua "tongue," also "speech, language," from PIE *dnghu- "tongue" (see tongue (n.)).
The -u- is an Anglo-French insertion (see gu-); it was not originally pronounced. Meaning "manner of expression" (vulgar language, etc.) is from c. 1300. ...

Source: Online Etymology Dictionary

In Spanish, we have

idioma m. language. [LL. idiōma: id. <Gk. idiōma: peculiarity (as lang.) <idiousthai: to make one's own <idios. See idio-.]; idiomático,ca a. idiomatic. [Gr. idiōmatikos: particular.]
Source: A Comprehensive Etymological Dictionary of the Spanish Language with Families of Words based on Indo-European Roots by Edward A. Roberts, 2014.

And most importantly, in French, we have

LANGUE, sf. a tongue; formerly lengue, from L. lingua. For in=en=an see § 71, and Hist. Gram. p. 48. — Der. langage, languette.
Source: An Etymological Dictionary of the French Language by Auguste Brachet, 1882.

The reason for my praise "most importantly" is that Auguste Brachet, the "romanistischer Autodidakt"-turned-professor according to (German) Wikipedia, created a monumental masterpiece in not just French etymology but etymology in general. In addition to what a regular etymologist would do, such as tracing the word form to its etymons in the same or other languages, Mr. Brachet systematically summarized the rules of the morphological and phonological changes and applied them to individual words in his dictionary. In the said example, he noted that for the development in > en > an from Latin lingua to French langue, the reader can consult his rule 71 in the book, where he says

I in Latin position [i.e. "when followed in the Latin word by two consonants" according to him, a convention not exactly the same as adopted today; my note] is changed to e in Merovingian Latin: thus fermum, ..., for firmum, ...' and this e, pronounced ei (see § 66), has produced two distinct French forms, according as it has preferred the open è sound, or the i sound.

You can choose to follow up to rule 66 in this book and p.48 of his A Historical Grammar of the French Tongue for more information about these sound (phonological) as well as spelling (orthographic) changes.

Western etymological publications may be divided into two groups: (1) dictionaries that give etymons or source words; (2) scholarly books and research articles on phonological and orthographic changes over time. Mr. Brachet's dictionary is unique in that it merges the two into one, so that the reader is conveniently offered the explanation of sound changes right in the headword entry, obviating the need to research as to why, e.g., the first i in *linguaticum would change to a in the history of the English word language.

However, a word comprises not only its sound and spelling but its meaning as well, which etymology cannot avoid tracing. But as linguist Calvert Watkins warned us, it is "more hazardous to attempt to reconstruct meaning than to reconstruct linguistic form". Sense development is much less researched and also less described in dictionaries. Unlike phonology, semantics or the study of meanings of words is not easily subject to formal (as in "formal logic") structural analysis. And yet tracing the sense development is the primary task of Chinese etymology. Chinese phonological development is a separate field of study; it is not incorporated in etymology, because the meaning of Chinese characters (or words, whose meanings are almost always based on the component characters) is largely dissociated from the sound. Take the character 文 ("text") as an example.


Source: 谢光辉《汉语字源字典》, 北京大学出版社, 2000年, 29页
Translation of the embedded text: "文" is a pictographic character. "文" in oracle bone script (甲骨文) and bronze inscription script (金文) resembles a standing person facing forward. His chest bears tattoo of decorative patterns. This is in fact a vivid description of the ancient "文身" (tattoo) custom. Thus the original meaning of "文" was a person with tattoo on his body, as well as pattern, texture. Later, the meaning was extended to character, article, culture, civilization etc.

That was a typical entry of Chinese character etymology. For simple characters especially pictographic ones, it is simply pure 依类象形 or description of the object according to what it looks like. The focus is on the meaning, not the reading or sound. Some more complicated characters may be decomposed into elements each of which is analyzed the same way, as in the case of "秦" (see my last post).

Needless to say, the majority of the characters (at least 80%) are of the type 形声字 or characters of form and sound, such as "指" (finger; to point), where the form radical "扌" suggests the meaning, i.e. something related to the hand, and the sound component "旨" suggests its reading, i.e. zhǐ. The classical Shuowen Jiezi (说文解字) dictionary, unsurprisingly, points out that this character "从手旨聲" (the meaning is based on "手" and the sound on "旨").

Similarities and differences between Chinese and Western etymology can also be revealed from the definition of the word etymology itself. The Webster dictionary defines it as "the history of a linguistic form (as a word) shown (1) by tracing its development since its earliest recorded occurrence in the language where it is found, (2) by tracing its transmission from one language to another, (3) by analyzing it into its component parts, (4) by identifying its cognates in other languages, or (5) by tracing it and its cognates to a common ancestral form in an ancestral language" (I added the parenthesized numbers). Thus we see that most western dictionaries with etymological information meet the requirements (1) and (2), sometimes (3). Wiktionary and Friedrich Kluge's An Etymological Dictionary of the German Language also meet (4) and (5) most of the time. What if we apply these requirements to Chinese character etymology? (1) is often met if we interpret it as finding the first occurrence in history, which nowadays is made drastically easy with the aid of a computer-based search. But tracing its development in the course of long history, either inside Chinese or (2) across different languages, is rarely done. (3) is done, though with significant differences from that in western languages. (4) and (5) are rare because they're mostly irrelevant to Chinese characters.

How is analyzing a Chinese character into its components special compared to the western tradition? While a character e.g. "指" can be analyzed into "扌" (for meaning) and "旨" (for sound), there is no systematic change of a component from one form to another. Take rule 126, one of the many summarized by Mr. Brachet for French, as an example, "Before a, initial c ... passes through the successive aspirated sounds k'h, tk'h, kch, ch." He supports this rule of ca- > ch- with about 80 words as evidence, champ < campus, chien < canis, etc. Can we construct an analogy of this rule and find supporting examples in Chinese etymology? Since Chinese does not use an alphabetic writing system, there's hardly any need in dealing with the sound change of a character in etymology. Instead, we may substitute the change in form of a character. For example, after studying the 金文, 小篆, and 楷体 forms of "指" and other characters with "扌" on the left side, we may conclude that all (or most) such characters have gone through the predictable change of this radical in these forms, just as the French ca- changed to ch-. Similarly, all or most characters with "旨" on the right side probably went through the same change as shown here (see the row for 字源演变). Thus, we find in etymological studies a parallel between Chinese and western languages in identifying common component change in characters or words.

However, Chinese etymological dictionaries are also interested in finding the "root cause" of the most basic characters. Because the characters are ultimately from pictographs in origin, this "root cause" finding is mostly "依类象形" (describing the object according to what it looks like). If we must find a parallel for this practice in western etymology, it is equivalent to answering the question why e.g. the Proto-Indo-European stem from which the Modern English word word is ultimately derived is *were-, that is, why that sound. Obviously, except for some onomatopoeias, there is no answer, or no such research. While Chinese etymologists have forged ahead in that direction, so far this "research" is, I'm afraid, very much based on guesswork, simply because there is no record left in history about why a specific character was invented to be of that form. "文" may indeed be a symbol for a person with a tattoo, but there is no hard proof, and this kind of interpretation is too error-prone. In my last post, I quoted the article 许慎为何将象释成母猴——“为”字趣释 (Why did Xu Shen interpret an elephant as a female monkey: interesting interpretation of character "为"). In a recent Weibo blog post, a scholar interpreted, purely based on its resemblance, "夷" in its original oracle bone script as a person squatting, while in 《汉语字源字典》 (Dictionary of Chinese Character Etymology) by another scholar in this field, it was thought to represent a man bound by ropes, to serve as a slave or for sacrifice. On this stretch of imagination, I have but one comment: "汉字字源,看图识字,见仁见智" (Chinese character etymology / Look at pictures and learn to be literate / Trust your opinions and beliefs).

________________________
[note1] Due to the unique nature of the Chinese language, etymology can be of characters as well as words. This post is about character etymology.

Sunday, September 4, 2016

Why is it rare to see Chinese etymology?

People speaking English as the native language are used to dictionaries in which each headword contains not only the definition of the word and example phrases or sentences, but also brief etymology, as in this example in the Merriam-Webster dictionary for the word word.

Middle English, from Old English; akin to Old High German wort word, Latin verbum, Greek eirein to say, speak, Hittite weriya- to call, name
First Known Use: before 12th century

A Chinese dictionary, on the other hand, almost never gives the etymology. In this blog posting, I'll try to explain why.

For the sake of discussion, we need to make a distinction between two types of Chinese dictionaries. Due to the nature of the Chinese language, the English word dictionary (or its equivalent in most other languages) can mean either "字典" (literally "character-dictionary") or "词典", also written as "辞典" (literally "word-dictionary"), in Chinese. I have not seen a dictionary for general Chinese words published by anyone that contains etymological information for the headwords.[note1] Hereinafter, a Chinese etymological dictionary refers only to a character-dictionary.

The disappointment at the lack of an etymological dictionary of Chinese words does not extend to dictionaries of Chinese characters or 字典. Back in the Eastern Han dynasty (25–220 AD), the scholar Xu Shen (c. 58 – c. 147 AD) wrote the monumental dictionary Shuowen Jiezi (literally "Explaining Graphs and Analyzing Characters" according to Wikipedia). Since Xu lived in a period only a thousand years or less after a large number of Chinese characters were invented, the etymology he gave in the book for each of the 9,000-plus characters is mostly trustworthy. Take the character "秦" (qín) as an example.[note2]

伯益之後所封國。地宜禾。从禾,舂省。一曰秦,禾名。𥠼,籒文秦从秝。匠鄰切
(The fief given to the descendant of Boyi. The land is suitable for crops. The character has a meaning based on "禾" ("crop") and contains an abbreviation or syncope of the character "舂". Another theory claims that this character is the name of a crop. This character in Zhouwen script [a script used just before the time of the First Emperor], "𥠼", is based on "秝". Pronounced with the initial consonant of 匠 combined with the final of 鄰.)

This is an excellent example of Chinese character etymology; it not only describes the source of the character but also analyzes the morphology or form of the character, as evidenced by the construction of "秦" through "禾" and part of "舂". The significance of Xu's book in the history of the Chinese language is such that almost two millennia later, scholars are still using his book in research. The only major revision came after the 1899 discovery of oracle bones, which the Shang dynasty (c. 1600 BCE – c. 1046 BCE) people used for divination. The oracle bone script predates Xiaozhuan script, the primary source for Xu Shen's character etymology because the latter is the earliest script known to Xu. Owing to this gap in knowledge, Xu inevitably made numerous mistakes in his otherwise near-perfect dictionary. One good example can illustrate the point. In the article 许慎为何将象释成母猴——“为”字趣释 (Why did Xu Shen interpret an elephant as a female monkey: interesting interpretation of character "为"), the author explains how the simple character "为", meaning "for" or "to do" nowadays, evolved from the oracle-bone pictograph depicting a man holding an elephant by a leash, which Xu Shen mistook for a female monkey. (By the way, elephants indeed roamed around middle and northern China three thousand years ago, but the species was not the same as in southern China or India today.)

With all this background information, we may now answer the question of why it is rare to see Chinese etymology. By that I don't mean you can't find character etymology at all. Books such as 《汉语字源字典》 ("Dictionary of Chinese Character Etymology") and the Web site Chinese Etymology by Richard Sears are available. But this information is almost never incorporated into a Chinese dictionary other than a specialized etymological dictionary. If a general English reader is not more academically inclined than a Chinese reader, why does a common English dictionary such as the Webster, American Heritage, or OED (Oxford English Dictionary) include etymology without hesitation? The reason may be that Chinese (character) etymology almost never helps a reader in studying the Chinese language, due to the long history and evolution of the characters. (Can you stretch your imagination far enough to associate the scene of a man and an elephant with the sense of "for" or its slightly older sense of "to do"? See above.) In addition to the long history, I believe there's another, more subtle, element clouding Chinese etymology. Most languages in the world use an alphabetic writing system. Studying the internal history of the vocabulary of such a language primarily means analyzing phonological and morphological changes through time; e.g., there was a systematic change of f to h in Spanish for a large number of words. Secondary, and less often undertaken, is the study of the semantic evolution of words; it's less done because it is "more hazardous to attempt to reconstruct meaning than to reconstruct linguistic form", as linguist Calvert Watkins said. Chinese characters, however, rarely went through systematic morphological changes that apply to a large number of characters and, since Chinese is not based on an alphabetic writing system, phonological changes are not conducive to the study of etymology per se. This leaves a large part of Chinese etymology to the study of semantic evolution, which is, as stated, more error-prone in scholarly reconstruction.

There is another reason for not incorporating etymology in Chinese dictionaries. Many characters originate from pictographs or pictograph-like glyphs such as Xiaozhuan script. Publications have to render them as images instead of text, which is an editorial inconvenience. The images with their explanatory texts take a significant amount of space relative to the definitions and usage examples, which a regular user cares more about. This is in contrast with the etymology in an English dictionary, which can be made brief and still make sense to the minority of interested readers. And yet a third reason may be that it's just the custom of Chinese lexicography, i.e. no etymology except in specialized dictionaries. This is probably also the reason why dictionaries of languages other than English lack etymology. (Try to find etymology in any dictionary of Spanish, French, German or Italian in a bookstore or library!) But nobody knows the original cause or reason for this custom.

Therefore, unlike a language where a student may make use of etymology in vocabulary study optionally combined with some mnemonics (as demonstrated in my book for Spanish), the Chinese characters have to be studied in a different way. Etymology comes in handy only for the very first few characters, such as "火" ("fire"), "山" ("mountain"), which are frequently used to impress complete beginners. After 10 or 20 such "pictographs", rote memory is commonly adopted, but books such as Tuttle Learning Chinese Characters that laboriously make up mnemonics are helpful. Fortunately, a large portion of the character repertoire consists of characters combining two parts, one more or less representing its meaning and the other representing the sound. However, in none of these cases would etymology play any role.

___________________________________
[note1] By emphasizing "general", I'd like to point out that a special group of Chinese words, 成语 (idioms), are an exception, in that dictionaries of Chinese idioms almost always give the first occurrence of the idioms and sometimes even briefly describe the sense development as well.
With regard to dictionaries of words in general, one may think of the book 《辭源》, literally "word origin". First published in 1925, it takes a misleading title because it's no more than a dictionary (albeit of high quality) of Chinese words with no etymology. In fact, even if we take an alternative interpretation of "辭源" as "first occurrence of word", this book fails as well; e.g., the entry for "中国" does not list its first occurrence in the Book of Documents, or the bronze inscription which the Book records. Another book we can even more readily dismiss is the 《詞源》 by Zhang Yan in the Song dynasty because the book is on the subject of the literary genre ci (詞), not "words".
[note2] Incidentally, the character "秦" is significant in that traditionally many scholars including Paul Pelliot believed that it is the ultimate source for the word China in many languages in the world, although more recent research attributed the origin to "晋". Two other sources of the word referring to China are Khitan, as in the case of Russian, and silk.

Saturday, August 13, 2016

Translation of "technical"

The dictionary translation of "technical" is "技术的", as in "technical skill", "technical innovations". But the word is often used in a more general, "non-technical", context, particularly as an adverb, "technically", e.g., "Technically, driving at 31 mph at a speed limit of 30 is speeding." In this case, instead of "技术的", a very natural Chinese equivalent may be "严格说来" (strictly speaking).

Another example (modified from the original),

--- begin quote ---
the problems are technical, not systemic. Afterward, when she told her sister they had named the problems as "technical," her sister responded “What does that mean?” Indeed that was the question I had, because the discussion was not about technical issues at all
--- end quote ---

The word "technical" literally translated as "技术的" in this context indeed causes confusion to people who speak no English at all, but might make some sense if the Chinese reader knows a little English. A more meaningful translation, I think, would be "具体操作的", as in "这些问题是有关具体操作的,而不是整体上的(或体制上的)". But if the reader or listener is moderately proficient in English, the translation "这些问题是有关技术性细节的" works, too.

Saturday, May 28, 2016

"Oriental" is not derogatory

On May 20th, Obama signed a bill that removes "Negro," "Oriental" and a few other terms from federal laws, specifically, "striking 'a Negro, Puerto Rican, American Indian, Eskimo, Oriental, or Aleut or is a Spanish speaking individual of Spanish descent' and inserting 'Asian American, Native Hawaiian, a Pacific Islander, African American, Hispanic, Puerto Rican, Native American, or an Alaska Native'." The bill, sponsored by New York congresswoman Grace Meng, an Asian American born in 1975, focused on the word "Oriental" but included other derogatory terms such as "Negro".

No doubt "Negro" is offensive and derogatory, reminding us all of the dark history of slavery. But does "Oriental" have the same effect, arousing a mental image of Chinese exclusion, coolies, or other more subtle discriminations in later decades? As an Asian American myself who came to the United States in the early 1990s, I say No to this specific question. Discrimination against Asian Americans has never been completely eliminated and takes different forms from those against, say, African Americans: secretly raising college entrance standards, racial slurs in public broadcasts with impunity, and others. But it never occurred to me that the word "Oriental" would be offensive to me in any way. About twenty years ago, I worked at a lab, where we all shared one telephone. One day the phone rang. My coworker, a white technician, came to me saying, "It's for you. The guy has an Oriental accent". That sounded absolutely normal to me. Interestingly, I only now realize that the word "Oriental" has indeed rarely been used in recent years. In fact, I don't recall hearing it again in daily conversation ever since. But that may be just due to a natural evolution of the English language in which some words gain and some words lose popularity, rather than people's realization of a newly acquired offensive sense.

I'm not the only Oriental, a.k.a. Asian, who considers the word neutral. Two years ago, a reader commented on an article saying "the word 'Oriental' is still widely used here in Japan". I want to add that the word is also commonly used as part of English translations for thousands if not millions of hotels, restaurants, and all kinds of businesses in China, including the famous 东方明珠, officially named Oriental Pearl Tower, the tallest structure in China from 1994 to 2007 and one of the most visited places in Shanghai. Right after Obama signed the bill, an Asian American wrote My 'Oriental' Father: On The Words We Use To Describe Ourselves on NPR.org. Her father emigrated from Hong Kong to the US in 1969 and has always insisted on using the term "Oriental" to refer to himself and the style of his Chinese restaurant, in spite of the author's repeated reminders that the term has picked up an offensive connotation over the years. Readers of the article generally consider "Oriental" to be neutral as well. I can't agree more with the following comment currently at the top:

As a dumpy old white guy, I have never thought of Oriental as a disrespectful term. Yet, regardless of my feelings on the matter, if someone feels marginalized by the term, it shouldn't be a problem for me to use a word or phrase that they find more appropriate.

That being said, there is indeed a distinction we can make between self-reference and reference to others, as one reader comments:

This is a critical point that is very different from words used by others to describe each of us. Your wife [referring to another reader's comment] is comfortable referring to herself as "Oriental," like the author's father. But it may be different for her if someone else uses the same word in a different way, such as "it is hard to tell what Orientals are thinking" or "inscrutable Oriental."

That is because there is often a need to consider intent (versus ignorance) in the words used by others to describe each of us. A shift to geographically based terms like European, African, Asian reduces that need somewhat.

Very well said! However, whether a word becomes derogatory should follow a simple "democracy" rule, so to speak. If a large number of people speaking this language use the word in a derogatory sense, it is so. If not, it is not. There's no magic. It's a descriptive rule challenged, in this case, not by prescriptive linguists or scholars but, ironically, by some younger-generation Asian Americans, up to Congresswoman Grace Meng, good intentions notwithstanding. Although eliminating one word from our vocabulary or limiting its use to specialized areas is harmless, if we continue to move words into the dictionary of tabooed language, our life will nevertheless become increasingly inconvenient.

By the way, it would be interesting to find the origin of the new, allegedly derogatory, connotation of "Oriental", something no article I've read touched upon. It's not likely that one single incident or a fictional scene created such a dramatic effect. Certain young Asian Americans may have suffered from subtle, implicit unfairness in a context where the word "Oriental" was used. If this wild guess is completely unfounded, another source of this connotation may be a continuation and resurgence of Orientalism, most famously expounded by Palestinian-American scholar Edward Said in the late 1970s. In a Foreign Policy article Chinese Is Not a Backward Language, the author uses the term "Orientalism 2.0" as a label for the re-emerging notion of western superiority and corresponding eastern inferiority. Is there a causal association with "Oriental" derogation? The Orientalist ideas are largely restricted to academic circles. If the derogatory sense of "Oriental" has truly been felt mostly by scholars and "leaked" to some highly educated young Asian Americans, that may indeed be the origin of the new connotation we are looking for, and it's consistent with the fact that the general public is not aware of the semantic evolution.

Saturday, March 12, 2016

English "can" and Chinese "会"

An auxiliary verb is one that cannot be used alone and must work with a regular verb. English "can" is an example, e.g. "I can speak Chinese", where the verb "speak" cannot be omitted. But in the case of Chinese "会", both "我会说中文" and "我会中文" are perfectly grammatical. In this blog posting, we'll compare the English "can" with its Chinese counterpart "会" particularly in the context of language study.

The sentence "我会中文" must be translated to English as "I know Chinese", or "I can [a verb such as speak] Chinese", but not "I can Chinese", because "会" is used as a regular transitive verb, a usage that does not exist for English "can". In the first translation here, "会" matches "know". But if you mull over the connotation, there's a subtle nuance that easily escapes our attention. To know is to have knowledge. "I know Chinese" implies that I have knowledge of this language, a passive knowledge not readily leading to an action. The Chinese "会", on the other hand, often suggests a more active role, and "我会中文" is more accurately translated as "I can [a verb such as speak] Chinese" than as "I know Chinese". The only problem with this "more accurate" translation is that we can't assume "会" is unambiguously "can speak"; of the various aspects of the language skill, speaking is only one, parallel with reading, writing and listening comprehension.

There seems to be a deficiency in second language education in China when compared to that in other countries. "哑巴英语" (literally, "mute or dumb English"), referring to English education with emphasis on scoring high on paper tests at the expense of speaking skills, was and probably still is widespread in China. But language study in other countries is generally in better shape, and there someone said to know a language is assumed to be able to speak that language. As a result, "我会中文" and "I can speak Chinese" become equivalent in real-life situations.

It's obvious that Chinese "会" is used as an auxiliary verb when it's followed by a regular verb, just like English "can". When "会" is followed by a noun, a usage missing for English "can", it is a full-fledged regular verb. In this sense, "会" means "be capable of" or "know" as in "know a language". The noun that follows must represent a type of skill. A language is probably the most common example. But many other skills work as well, e.g., "他会魔术" ("he can do magic", "he knows how to perform magic"), "他会书法" ("he can do calligraphy", "he's good at calligraphy"), "他会量子力学" ("he knows quantum mechanics", although this English sentence may be better interpreted as "他懂量子力学"). In other cases, it becomes ambiguous whether the object is a noun or verb, e.g., "我会游泳" ("I can swim", "I know how to swim"), where "游泳" can be both a noun and a verb.

Chinese is not the only language where the counterpart of "会" may function not only as an auxiliary verb but also as a regular verb. In the Facebook Polyglots group, one German learner asks, "Why do I come across sentences where the main verb is left out; 'Ich kann Deutsch auch'....Where is 'Sprechen'?!". That's simply because the German word "können" (for which "kann" is the first person singular form) serves as a regular verb here. Interestingly, the question asks "Where is 'Sprechen' [speak]?", consistent with the above observation that "speaking" is the dominant or default aspect of the language skill.

Wednesday, January 20, 2016

Restrictive and non-restrictive clauses

In English, a restrictive clause restricts the scope of the noun or pronoun in front of it (antecedent, head word), while a non-restrictive clause does not. For example,

Restrictive: The New Yorkers who like to walk are healthy.
Non-restrictive: The New Yorkers, who like to walk, are healthy.

In a posting to the Facebook Polyglots group, I'm surprised to find that many non-English-native-speakers have a hard time understanding the difference. I started the discussion because I wanted to see how the sentences are translated to other languages, especially German, where commas are used "profusely". (The two commas in the English sentence are essential in making the distinction between the two types of clauses.) According to the polyglots' responses, it looks like the distinction exists in Romance languages (French, Spanish, Italian, etc.), but not in many others (German, Polish, possibly Russian). In the latter group of languages, breaking up the sentence into two parts is a solution, e.g., "The New Yorkers like to walk and are healthy".

The reason I bring up this topic here is that, when I think of the distinction in Chinese, I find that it too has the difficulty: both sentences would be translated as "爱走路的纽约人身体健康". Does that mean only those New Yorkers who like to walk are healthy (in the restrictive sense), or New Yorkers in general are healthy because they like to walk (in the non-restrictive sense)? If we were to ask the people who understand Chinese and more or less know that New Yorkers walk a lot, I bet most people will interpret it the non-restrictive way: New Yorkers like to walk and they are healthy. But I strongly believe this is context-dependent. By that I mean, if we ask people who understand Chinese and know that Houston is the fattest city in America how to interpret "爱走路的休斯顿人身体健康" (literally "The Houstonians(,) who like to walk(,) are healthy", where the commas are ambiguous as in Chinese), I'm sure most will think in the restrictive sense: Only those Houstonians who like to walk are healthy. It would be unthinkable to say Houstonians in general like to walk, because many start to pant after dragging their unwieldy bodies for one-eighth of a mile. Sadly, fat Houstonians and lean New Yorkers affect the way we read an English sentence.

Lack of distinction between restrictive and non-restrictive clauses in a specific language of course does not mean the grammarians of that language are unaware of it. In the case of Chinese, 定语, an attributive word, phrase or clause, is said to have both 修饰 (literally "decorative", corresponding to "non-restrictive" here; not "modifying" as some would translate it) and 限制 ("limiting", "restrictive") functions. Nevertheless, most Chinese are not aware of the distinction and subconsciously mix the two up, leading to confusion or misinterpretation.

Lastly, I'd like to point out that if English uses an attributive word instead of a clause, the same ambiguity arises. Consider "The hard-working first-generation immigrants deserve our respect". It can mean (restrictive) "The first-generation immigrants that are hard-working deserve our respect", or (non-restrictive) "The first-generation immigrants, who are hard-working, deserve our respect". Since the first-generation immigrants in general are relatively hard-working, the second interpretation may prevail. But if you are of the opinion that a significant proportion of first-generation immigrants are just as lazy as the population in general, the first interpretation sounds better.

Wednesday, November 25, 2015

What language is popular? A revisit

Four and a half years ago, I checked language popularity based on the number of shelves for the foreign language books at a local Borders Bookstore here in south Texas. The result was as follows:

March 2011 at Borders
Spanish: too many
French: 4
Italian: 3
German, Japanese: 2
Latin: 1.5
Arabic, Chinese, Portuguese: 1
Russian: less than 1
Korean: less than 0.5

The Borders bookstore was closed down soon after that. Recently I went to a local Barnes and Noble store and checked the foreign language books just as I did before, and the result is:

November 2015 at Barnes and Noble
Spanish: 7
French: 3
Italian: 1.3
Chinese: 1
German, Japanese: 0.8
Russian: 0.6
Latin, Portuguese: 0.5
Arabic: 0.4
Korean, Vietnamese: 0.3

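The following is a summary of the two sets of data, in shelves (Spanish is excluded; the numbers less than 1 and 0.5 are artificially set to 0.8 and 0.4, respectively):

Language: 2011, 2015
French: 4, 3
Italian: 3, 1.3
German: 2, 0.8
Japanese: 2, 0.8
Latin: 1.5, 0.5
Arabic: 1, 0.4
Chinese: 1, 1
Portuguese: 1, 0.5
Russian: 0.8, 0.6
Korean: 0.4, 0.3
Vietnamese: no data, 0.3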

It's probably safe to assume that the two bookstores carry approximately the same total number of books. Then we see that either foreign language books genuinely sold less in 2015 than in 2011, or the two stores have a different focus, i.e., Barnes and Noble cares less about these language books than Borders did.

But more interesting is the difference in change for specific languages. Of all that have data for both years (excluding Spanish), only Chinese has stayed at the same level, taking one full bookshelf in both years. On average, the 2015-to-2011 ratio is 0.57. Relative to that, Chinese is way above average in shelf space, and French, Russian and Korean are above average as well. The rest are attracting lower customer interest now than before, Portuguese > Italian > German = Japanese = Arabic > Latin. Latin, with the biggest drop, occupies one-third of its earlier shelf space after four and a half years. There may be a shift of people's interest away from pure intellectual enjoyment to practical economic benefit.
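The arithmetic behind these ratios can be reproduced with a short script. This is only a sketch: the shelf counts are copied from the two lists above (with "less than 1" and "less than 0.5" set to 0.8 and 0.4 as already noted), and it assumes the 0.57 figure is the plain average of the per-language ratios rather than a ratio of total shelf space. The variable and function names are made up for illustration.

```python
# Shelf counts from the two bookstore visits (Spanish excluded).
# "less than 1" and "less than 0.5" are set to 0.8 and 0.4, as in the post.
shelves_2011 = {
    "French": 4, "Italian": 3, "German": 2, "Japanese": 2, "Latin": 1.5,
    "Arabic": 1, "Chinese": 1, "Portuguese": 1, "Russian": 0.8, "Korean": 0.4,
}
shelves_2015 = {
    "French": 3, "Italian": 1.3, "Chinese": 1, "German": 0.8, "Japanese": 0.8,
    "Russian": 0.6, "Latin": 0.5, "Portuguese": 0.5, "Arabic": 0.4,
    "Korean": 0.3, "Vietnamese": 0.3,
}


def shelf_ratios(before, after):
    """Return the after/before ratio for every language present in both years."""
    return {lang: after[lang] / before[lang] for lang in before if lang in after}


ratios = shelf_ratios(shelves_2011, shelves_2015)

# Assumption: the average is the unweighted mean of the per-language ratios.
average_ratio = sum(ratios.values()) / len(ratios)

# List languages from the smallest drop to the biggest drop.
for lang, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    position = "above" if ratio > average_ratio else "below"
    print(f"{lang:<11}{ratio:.2f}  ({position} average)")
print(f"{'average':<11}{average_ratio:.2f}")
```

Vietnamese drops out automatically because it has no 2011 figure. Weighting each language equally is a choice; dividing total 2015 shelves by total 2011 shelves would give a somewhat different overall number.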

This is of course a crude way to measure language popularity. Commercial bookstores such as Borders and Barnes and Noble make an effort to meet the demand of the market but there's no perfect, up-to-the-minute, match.

For completeness, here is another measurement of language popularity, based on an April 2014 poll of what language the language-loving people are studying conducted in the Polyglots group on Facebook, which had 16,000 members at the time. The popularity order,

French (935) > Spanish (807) > German (799) > English (651), Italian (448), ..., Mandarin+Chinese (360)

is obviously different from the bookstore popularity in 2011 or this year. For one, the stores are in Texas, where the presence of Spanish is particularly strong. Secondly, the market-determined popularity may differ from that in the Polyglots-poll because the latter is more of a reflection of the fun or leisure time enjoyment, not necessarily going along with the economic development in the region where a specific language is spoken; while studying Chinese may increase your chance to find a job in the global market, it may not be as much fun as studying the sexy Italian. It would be interesting, though, if the Polyglots group could conduct a poll every once in a while, so that we could see if the fun factor could also go up and down as the usefulness does. It probably will, but on a much smaller scale.

Saturday, November 7, 2015

On a proposed new name for China combining Mainland and Taiwan

Xi Jinping of Mainland China and Ma Ying-jeou of Taiwan, the top leaders of the two sides, had a historic meeting on November 7, 2015. It was an unprecedented event to ease the tension across the Taiwan Strait since one government split into two in 1949. Some people on the Internet are excited about making up new names for a possible merger of the two governments.[note1] It would be unfair to reuse either of the current names

  • 中华人民共和国, People's Republic of China, used in Mainland
  • 中华民国 (中華民國), Republic of China, used in Taiwan (but 中华台北, Chinese Taipei, in certain international events)

if such merger were to happen. Of all possible names proposed in the half-serious half-hilarious online discussions, "中华共和国" merits linguistic and historical analyses. The first component of this neologism, "中华", combines the central theme of the names from both Mainland and Taiwanese governments, and represents the common cultural element that ties both sides together. Generally translated as "China", it literally means "central" or "middle" (as in "Middle Kingdom") and "flowery beauty" (see Name of China).

The latter component of "中华共和国", i.e. "共和国", is not that straightforward. It is the third component of the compound word "中华·人民·共和国", but superficially differs from the second component of the compound "中华·民国". But are they truly different, and how? Confusingly, both governments take "Republic" as part of the official English name. While English "republic" most commonly maps to "共和国" in Chinese, the Wikipedia 共和制 article states that "在東亞有一些共和制國家也以舊譯「民國」为名" (In East Asia, there are some republican states that take an old translation "民國" as their names.) So "民国" as a general term (as opposed to a proper noun) is simply an alternative translation of "republic". The problem with this one-to-many relationship, "republic" translated as both "共和国" and "民国", is that "中华共和国" will have to be translated as "Republic of China". But that English phrase has already been taken; it's the official English name for "中华民国". While the Mainland government may be happy with the Chinese name "中华共和国", its potential English name "Republic of China" would be a horrible choice precisely because of the conflict with the English name of the current Taiwanese government. (This reminds me of cybersquatting, in which people register Internet domain names in the hope of selling them for a good price later. But the Taiwanese government of course didn't do this intentionally and had no intention of making money out of such a good English name.) I have not found the originator of this English translation for "中华民国". A message was sent to taiwan.gov.tw asking for help but I have yet to receive a response. Apparently, soon after 中华民国 was established in 1912, both "Republic of China" and "Nationalist China" were used, but the former prevailed as time passed, even though "Nationalist China" is literally closer in meaning.

The English word "republic" comes from Latin "res publica", meaning "public affair(s), public matter(s), public thing(s)" (not "people's public affairs" or "people's affairs" as some sources claim). Over time, it has evolved into the modern sense of "a form of government or country in which power resides in elected individuals representing the citizen body and government leaders exercise power according to the rule of law" (from Wikipedia Republic). The Latin source of the word does not inherently carry its modern meaning, nor does it map to Chinese "共和国" or "民国". So we just need to examine the relationship between the modern meaning of "republic" and the two Chinese translations. "民国" is an easy term, literally "people's country" or "people's state", or a country "of the people, by the people, for the people". On the other hand, "共和国" is more complicated. "共和" as the name of the political system was coined by Japanese scholar Bankei Otsuki (大槻磐溪), referring to the Gonghe Regency, when the Zhou Dynasty was ruled jointly by two dukes more than 2000 years ago.[note2] This translation using Kanji characters was later brought into the Chinese vocabulary as a loan word. Now, it's clear that neither "共和国" nor "民国" is a perfect translation of "republic" in its modern sense, in terms of the intension (not intention) of the words. But "民国" as "people's country" is definitely closer than "共和国" if understood literally; why a country ruled by two dukes would in any way be likened to a modern republic is beyond me. Unfortunately, for unknown historical reasons, the Chinese word "共和国" is actually far more common as a translation of English "republic" and "民国" has fallen out of fashion.[note3]

In short, the proposed name "中华共和国" is a good one in Chinese. But its translation into English and all other languages with a writing system not directly based on Chinese characters poses a curious challenge.

_____________________
[note1] The Mainland government calls the Taiwanese counterpart "authority", not "government". But that's a point beyond my interest.
[note2] Nowadays more historians seem to cite another event at the very beginning of the Chinese chronology, i.e. 841 BCE, as the source for "共和". This is called "共伯和干王位" (The Count of Gong named He ruled as a regent.).
[note3] As a proper noun, "民国" is short for "民国时期", referring to a historical period, either from 1912 (beginning of ROC) to 1949 (beginning of PRC) used in Mainland, or from 1945 (beginning of ROC's rule of Taiwan) on.

Saturday, September 26, 2015

Funny Chinese transliterations to help remember English words

Some Chinese guy with too much free time came up with interesting Chinese transliterations of some English words. Part of the list is as follows, with my English translation of the transliterations in parentheses.

救护车 ambulance 俺不能死 (I can't die)
雄心 ambition 俺必胜 (I must win)
强壮 strong 死壮 (die strong; this Chinese "word" only exists as a transliteration for English "strong")
羡慕 admire 额的妈呀 (Oh my God; literally, my mom; 额 is a dialectal pronunciation of 我)
脾气 temper 太泼 (too surly or boorish and rowdy)
经济 economy 依靠农民 (rely on farmers)
海关 customs 卡死他们 (block them to death)
怀孕 pregnant 扑来个男的 (a man throws himself down on me)
地主 landlord 懒得劳动 (too lazy to work)

What's special about these phonetic transliterations is that they are meaningful phrases or sentences on their own and there's a semantic connection, although no equivalence, with their English counterparts. An "ambulance" doesn't mean "I can't die", but imagine what the person being transported is saying to himself. "Customs" ("custom" in the original posting) maps to "block them to death", consistent with the practice of economic protectionism.

English-speaking people learning a European language, particularly a Romance (Latin) language, can greatly benefit from etymology.[note] When etymology fails to help, some sort of mnemonics may be conjured up, unless the learner prefers rote memorization, as young children tend to do. Alison Matthews and Laurence Matthews' Tuttle Learning Chinese Characters does exactly that to help English speakers learn Chinese words. The book exists because there's too little etymological connection between English and Chinese. To go in the other direction, a Chinese student cramming English vocabulary has to rely on mnemonics as well. I only hope to see a more complete list of Chinese transliterations than the one shown above, ideally published as a pocket dictionary, so the student can look up a word to read the suggested mnemonic while enjoying the fun that serves to strengthen the memory.

___________________
[note] In fact, I've been writing my book Learning Spanish Words Through Etymology And Mnemonics for almost a year, inspired by this idea. See here for more details.

Monday, July 13, 2015

Prepositional separation as a difficulty of Chinese

For ease of understanding, I call this prepositional separation: In Chinese, a concept normally denoted by a single prepositional word in most other languages must be expressed by two characters or words separated by other words. For instance, English "in" is "在...中" or "在...里面" in Chinese.[note] In short simple sentences, this is not a problem. But when a sentence becomes longer, even a native speaker begins to struggle when he crams more and more intermediate components into short-term working memory, in eager anticipation of the end marker such as "中" or "里面", to finish processing the information. Take the following as an example,

He put the ring in a brightly colored box that has an exquisitely decorated label on it, which reads "For Julia".
他把戒指放进了鲜艳颜色的、上有精心装饰写有“给朱莉娅”的标签的盒子。

Of course, a good translator may choose to break up the long-winded Chinese sentence, precisely because the long-winded attributive clause sounds awkward, unnatural, or simply non-Chinese. In addition to attributive clauses, Chinese suffers from potentially long complement clauses as well. Take the following as an example.

A sophisticated computer behaves like a human in the sense that it can generate its own commands according to the current situation it encounters.
一台高级计算机(电脑)在以下意义上如人一般行动:它能根据它当场碰到的情况产生出指令。

The Chinese translation is almost forced to have two sentences; otherwise, the "in the sense that" clause would be too foreign to a Chinese ear.

Separation of a semantic structure that provides one single functionality is generally undesirable, in Chinese or any language. Winston Churchill allegedly made up a sentence, "This is the kind of arrant pedantry up with which I will not put", in response to a picky editor. German has tons of separable verbs (as well as its sentence-final "not") that swell the brains of simultaneous interpreters. Fortunately, English and many other languages normally consolidate the words or phrases that represent one concept, or can do so as an option. You can say "put on the jacket", or "put the jacket on". But when the "jacket" becomes long due to a series of adjectives, it's unlikely you'll separate "put" and "on", and you'll definitely not do so if "jacket" has an attributive clause. In Chinese, on the other hand, the "word ... word" (e.g. "在...中") construct is the only option. A translator has to be clever enough to shorten the "..." part to a comfortable level.

The Chinese language is difficult not just because of its large repertoire of characters, but also because of other aspects such as its prepositional separation, which may become increasingly cumbersome in expressing complicated ideas in the modern world.

_______________________
[note] Technically, this kind of adposition is called a circumposition, although some linguists only use the word "preposition".

Saturday, May 23, 2015

Translation of a poem "Snow falling on high branches..."

Somebody asked for a translation of this poem:

雪落高枝映蕾菲
惊洁美
卓然岂摧眉

My translation:

Snow falling on high branches befits blooming flowers
I'm amazed at this clean beauty
So demonstrative as to debase myself?

The last sentence may be a challenge. 摧眉 literally means "to lower one's eyebrows". But figuratively it means "lose confidence; flatter (in the bad sense)".

This poem, or technically a ci (词), was probably composed by a person named 啼非 in 2004.

(The image of the fan below is from Mr. Vishal Upadhayay's Facebook posting.)

Saturday, April 25, 2015

English words that seem to have opposite meanings, "sanction", "bash", "bashful"

Somebody in a forum mentioned a few English words that seem to have opposite meanings. One is "sanction", which according to Wiktionary means (1) an approval, by an authority, generally one that makes something valid; (2) a penalty, or some coercive measure, intended to ensure compliance; especially one adopted by several nations, or by an international body. A quick response from another guy is "really? somebody should tell Putin that sanction can mean something better!!" That's interesting! Anyway, I've always worried about the seemingly conflicting meanings of this word and sometimes avoid using it in my writing or speech, unless it's very clear, as in "economic sanction (against a country)".

Another example mentioned is "bash". On the one hand, it means to strike heavily or to criticize harshly (source), as in "Bashing Hillary? Don't go there, GOP". On the other hand, the word "bashful" means shy, timid (source). How could you justify these two opposite human behaviors with one single word (or word with a suffix)? This word, though, is different from "sanction", in that "bashful" actually has a completely different etymology and is related to the verb "abash", to make ashamed or embarrassed (source). In any dictionary, when two words have the same spelling but are descended from two etymological sources, they're usually listed as two headwords or entries. In the case of "bashful", it's not the word "bash" with the suffix "-ful"; it was formed from "abashed" + "-ful" and later lost the initial unstressed vowel (a process called aphesis). "Bash" did once have the same meaning as "abash", but that sense is now obsolete.

Well, some words look as if they have opposite meanings but don't, such as "flammable" and "inflammable". That's another topic, for another time.

Thursday, March 26, 2015

Thought experiment: reading speed of a trilingual

I've always wondered about this. Imagine a person perfectly trilingual[note1] in English (or any language using a phonetic writing system hereinafter), Korean, and Chinese. I predict that his reading speed is Chinese > Korean > English. The speed can be measured by having him read a culture- and language-neutral passage telling the same simple story. (But see [note2].) I suppose that the amount of information the brain and eyes acquire within a given moment is the greatest for Chinese, less for Korean and the least for English. Suppose the font size is about the same on three sheets of paper (or computer screens), each written in one of the three languages. The eyes scan from left to right. Due to the nature of English, the eye movement is completely linear. The amount of information thus gained within, say, one tenth of a second, may be less than in the case of Chinese. A language- and culture-neutral passage about the same content is normally printed shorter in Chinese than in English. But the eye scanning speed is about the same. So the total reading time will be shorter for Chinese. Let's call that the information-density advantage of Chinese.
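The information-density argument can be written as a rough relation. This is only a sketch of the reasoning above, not a measured result; the symbols are my own assumptions: \ell_X is the printed length of the passage in language X, v is the eye scanning speed (assumed equal across languages), and T_X is the resulting reading time.

```latex
T_X = \frac{\ell_X}{v},
\qquad
\ell_{\mathrm{Chinese}} < \ell_{\mathrm{English}}
\;\Longrightarrow\;
T_{\mathrm{Chinese}} < T_{\mathrm{English}}
```

The conclusion holds only as long as the scanning speed v really is the same for both scripts, which is exactly what a test with a real trilingual reader would have to check.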

Secondly, I suppose human eyes focus on circular areas, instead of horizontal lines. That is, the focal point is a round dot, and the farther away from it in any direction, the less the focus; the eyes do not focus on a fixed-width horizontal line. Chinese characters have the advantage of packing semantic information in a square block, compared to English, which spreads the same information out horizontally and requires a left-to-right scan by the eyes to acquire it.

The Korean language is a unique case in that its letters are arranged both in a block and horizontally. It's a perfect compromise between English and Chinese. The eyes can gain some information focusing on one spot but usually must move horizontally to gain enough for a word. So the same person reading the same content should read Korean at an intermediate speed.

On the other hand, a March 24, 2015 article reports research by Georgetown University scientists, After learning new words, brain sees them as pictures (probably based on this publication). Indeed, our brain does not process English text strictly linearly when our eyes scan the line horizontally. This means that the distinction between a language with largely block-based information (Chinese) and one with line-based information (English etc.) is not so dramatic. Nevertheless, I would be delighted if we could find a person perfectly trilingual in English, Korean and Chinese, or at least perfectly bilingual in two of the three, to test his reading speed. Personally I'm not sure if I'm perfectly bilingual in English and Chinese. But I do feel that my reading speed in Chinese is slightly better than in English, although I have not done a well-prepared test.[note2]


________________________
[note1] He's said to be a maximal trilingual, in the field of Second Language Acquisition.
[note2] Obviously the test must be given by another person. Since the person taking the test naturally reads the passage faster if he's already read the same content in any language, the passage actually cannot contain the same content translated into different languages in a real implementation of this test. How the test can be given in the most appropriate way may be a technical challenge.

Tuesday, March 17, 2015

My first book

I just self-published my first book, Random Thoughts while Studying English and Chinese. The table of contents is as follows:

1 "主席" (Zhǔxí) was Chairman, is President
2 "Ni Hao Ma" (你好吗) is not a native Chinese greeting
3 Affective meaning of "中国" (China)
4 Chinese Character Usage Frequency
5 2015: The Year of "羊"
6 Why do Chinese choose uncommon English names?
7 Scholarly translation should be literal
8 "We should all be learning Chinese..." thus they say
9 "He" (他) and "she" (她) mix-up for Chinese students
10 "第几" has no English equivalent
11 New Chinese Acronyms
12 "Dragon" for "龙": a mistranslation?
13 Joke due to translation: "Oracle bone script" was registered as a software brand by Americans
14 Chinese religious language
15 "Modern" and "现代" or "近代"
16 Translation: a case study, "feminism" and "女权主义"
17 Translation of "computer", "calculator", and others
18 New Year's Wish: Less new usage of 被 (bei)
19 "谢谢叔叔!" (Thank you Uncle) said not to a family relative
20 "NBA" as an entry in Chinese dictionary
21 ESL methods: bilingual and immersion
22 Linguistic authority
23 Chinese "empty word" 虚词
24 虚词"虽然": empty word "although"
25 虚词"当然": empty word "of course"
26 虚词"很": empty word "very"
27 Interjection (叹词)
28 Why should the Chinese language not adopt a phonetic writing system?
29 Learning ... as a second language
30 Technical document needs literal translation
31 Levels of translation quality proposed by Yan Fu: A small example
32 Proper name translation: semantic or phonetic
33 Proper name translation: standardization
34 What language is popular?
35 Which English letter do Chinese pronounce wrong the most?
36 Language education to solve Chinese ethnic conflict

Any comments and reviews, good or bad, are highly appreciated. In case you ask, the landscape painting featured on the cover of the book is my hand-drawn replication of 山水图 (Picture of Mountains and Water) by 黄鼎 (Huang Ding, 1660-1730), an early Qing dynasty painter.

Saturday, February 28, 2015

Excerpts from a book on second language acquisition

Second Language Acquisition by Susan M. Gass (C) 2013, Taylor & Francis, 4th ed

It's a comprehensive literature review or overview of the academic research on second language acquisition, citing over 1000 references. Very complete, yet boring and verbose. The following are excerpts that I personally think are interesting. (Some text can be found at archive.org.)

p.11
The basic assumption in SLA research is that learners create a language system, known as an interlanguage (IL). This system is composed of numerous elements, not the least of which are elements from the NL and the TL.
Central to the concept of IL is the concept of fossilization, which generally refers to the cessation of learning. The Random House Dictionary... "to become permanently established in the interlanguage of a second language learner in a form that is deviant from the target-language norm and that continues to appear in performance regardless of further exposure to the target language."

p.56
English does not allow resumptive pronouns in relative clauses (I saw the woman who she is your son's teacher).

p.71
pronominal reflexes (or pronoun retention/resumptive pronoun), a phenomenon--common in many languages (including informal English)

p.90
Table 4.1 Hierarchy of Difficulty (Source: Adapted from Stockwell, Bowen & Martin, 1965)

Category          Example
Differentiation   English L1, Italian L2: to know versus sapere/conoscere  ← most difficult
New category      Japanese L1, English L2: article system
Absent category   English L1, Japanese L2: article system
Coalescing        Italian L1, English L2: the verb to know
Correspondence    English L1, Italian L2: plurality  ← easiest; two forms are used in roughly the same way

p.118
In general, children have better phonology, but older learners often achieve better L2 syntax

p.164
the notion of equipotentiality, expressed by Schachter (1988). She pointed out that children are capable of learning any language.

p.165
Lexical categories...: nouns, adjectives, verbs, adverbs, and so forth. These are referred to as content words. Functional categories... (e.g., articles, possessives), or they may be categories consisting of grammatical morphemes (e.g., plurals, tense markers).

p.194
language learning is largely lexical learning (e.g., Chomsky, 1989) (Some notes on economy of derivation and representation. MIT Working Papers in Linguistics, 10, 43-74)

p.210
Oral repetition correlated with general proficiency, but visual repetition (writing words over and over, memorizing the spelling letter by letter, writing new words and translation equivalents repeatedly) negatively predicted vocabulary size and general proficiency.

p.257
A comparable example took place at a G8 summit in Okinawa, Japan. Prior to the summit, Prime Minister Mori of Japan spent time brushing up on his English. Upon meeting President Clinton, he apparently became flustered and, instead of saying, How are you?, said instead: Who are you? President Clinton responded: I'm Hillary Clinton's husband. However, Prime Minister Mori, unaware that he had asked the wrong question, was anticipating a response something like I'm fine, and you? and responded I am too.

p.538
Thanks go to Caroline Latham for bringing this example to our attention.

p.266
Focused attention was most beneficial for syntax and least for the lexicon. In addition, there was a diminished effect for proficiency, with focused attention having a greater effect in early stages of learning.

p.435
The original formulation of CPH (Critical Period Hypothesis) came from Lenneberg (1967), who noted that "automatic acquisition from mere exposure to a given language seems to disappear [after puberty], and foreign languages have to be taught and learned through a conscious and labored effort. Foreign accents cannot be overcome easily after puberty"... The Sensitive Period Hypothesis predicts sensitivity, but not absolute drop-offs, such that a learning decline might be gradual.

p.438
Their (Bialystok and Hakuta (1994)) recalculations also revealed a deterioration in proficiency starting after age 20--well after the proposed biological changes suggested by the CPH.

p.439
Examples of easy structures are word order in simple sentences and pronoun gender; examples of difficult structures are articles and subcategorization features. Easy structures did not show age-related effects, whereas difficult structures did. He (DeKeyser (2000)) ties this to explicit and implicit learning, claiming that younger learners have intact the ability for implicit and explicit learning, whereas adults have lost their ability to learn implicitly.

p.440
DeKeyser and Larson-Hall (2005)...: Children necessarily learn implicitly; adults necessarily learn largely explicitly. As a result, adults show an initial advantage because of shortcuts provided by the explicit structure, but falter in those areas in which explicit learning is ineffective, that is, where rules are too complex or probabilistic in nature to be apprehended fully with explicit rules. Children, on the other hand, cannot use shortcuts to the representation of structure, but eventually reach full native speaker competence through long-term implicit learning from massive input. This long-term effect of age of onset is most obvious to the casual observer in pronunciation, but on closer inspection appears to be no less robust in the domain of grammar.

p.442
The primary difference between children and adults is in the mastery of phonology, which can hardly be due to input differences.

p.458
Studies indicate that motivational arousal is greatest for tasks that are assumed to be of moderate difficulty (see the discussion in Brehm and Self, 1989)

p.462
Anxiety is not always a negative factor in learning. ...: low levels help, whereas high levels hurt.

p.463
Hoffman (1986) notes that anxiety can direct attention away from meaning and toward pure form (acoustic properties, order of presentation, phonetic similarities).

p.470
strategy instruction was found to be substantially more effective when ... when the strategies targeted reading, speaking, and vocabulary, rather than writing, listening, and grammar.

p.473
Although adults show a faster speed of learning an L2, children seem to have an overall advantage in terms of ultimate attainment, at least for phonology and, possibly, syntax.

pp.480-1
Table 18.1 Definitions of Bilingualism

p.484
Cook (2005, Multi-competence: Black-hole or worm-hole?) argued that there are effects of multilingualism on how individuals process their NL, even individuals with a minimal knowledge of an L2.

p.488
in early L3 production, certain functions, such as prepositions, articles, and conjunctions, tend to come from the L2 and not the NL. This may occur even when the two languages are not phonetically similar.

p.489
Cenoz (2001)... cross-linguistic influence... linguistic distance is one factor. This was the case for all learners, regardless of language dominance... Age is another (factor), with older learners showing more cross-linguistic influence than younger children. There are language-related factors as well, with more transfer of content words than functional words.

Saturday, January 31, 2015

Affective meaning of "中国" (China)

"中国" (China), literally "Middle Country" or "Middle Kingdom", probably acquired its affective meaning quite late in history. This means that the word was used as a purely geographical term for a long time in Chinese history, unlike a few other terms such as "华夏", or during the Tang dynasty "大唐" (Great Tang, 618-907 CE). Mention of the latter two subconsciously aroused an emotional sense of national pride among their contemporaries.

One piece of evidence for this observation comes from its occurrence in a book written in the Eastern Jin Dynasty (317–420). Faxian (or Fa-Hien, Fa-hsien) (337–c. 422) "was a Chinese Buddhist monk who traveled by foot all the way from China to India, visiting many sacred Buddhist sites." His 《佛国记》 (Record of Buddhistic Kingdoms) (a.k.a 《法显传》, Biography of Faxian) states that "从是以南,名为中国。中国寒暑调和,无霜、雪。人民殷乐无户籍官法。" (The place from here toward the south is named the Middle Country. In the Middle Country, cold and heat are harmonized, and there's no frost or snow. The people are well-to-do and happy. Household registry or government laws do not exist.)

Why does Faxian's book serve as evidence for the lack of an affective sense of "中国" in early China? The word "中国" first occurred in the earliest book in Chinese history, 《尚书》 (Book of Documents)[note], which was compiled by Confucius (551–479 BC) and read and memorized by every pupil that could afford a minimum education in ancient China. Monk Faxian definitely knew most if not all words in the book. But he used the word "中国" to denote the central area of today's Indian subcontinent, or to be precise, to translate the Sanskrit word "Madhya-Des", paraphrased as "central territory" or "central kingdom" or "midland country" by various scholars, which existed during the Gupta period (approximately 320 to 550 CE). One is hard-pressed to imagine that Faxian would have chosen "中国" if it had already acquired a sense beyond its literal meaning. If "中国" had been used as dearly as words such as "夏" or later in history "大唐", Faxian would have considered a different character combination to translate the Sanskrit word, or perhaps added a translator's note if "中国" had to be used.

More than a millennium later, 《四库全书总目》 (Annotated Catalog of the Complete Imperial Library, completed in 1798 by the Qing dynasty) criticized Faxian's word choice by saying that he "以天竺为中国,以中国为边地" ("considered India as the Middle Country and China as the frontier"; note the same word "中国" could mean both "Middle Country" and "China"). What we can read off of this critique is that the word "中国" had acquired its affective sense by the late 18th century. From then on, it was not to be used to mean just the middle part of some land, or any country in the middle of something larger. It uniquely refers to the Middle Country or the Middle Kingdom or China, with its rich culture and history. Yet two centuries later, it's our turn to criticize the editors of the imperial catalog for failing to see that those born later in history should judge a historical figure only from the perspective of that figure's own time.

___________
[note] 《尚书·卷十四·梓材》: "皇天既付中国民,越厥疆土,于先王肆" (The heaven already obliged [the Zhou family] to govern the people in the Middle Country. If we extend and develop the territory, our ancestor's Dao will flourish.)