Saturday, April 3, 2021

Mutual intelligibility to distinguish between language and dialect: case of Chinese and Cantonese

Sometimes it is debatable whether two language varieties are two different languages or two dialects of a single language. The answer comes down to the concepts of "language" and "dialect". Among the various criteria for distinguishing a language from a dialect, mutual intelligibility may be the most popular one, and it appears easy to apply. But is it really easy?

1. First of all, we have to absolutely refrain from any political and nationalist influences if we are determined to adopt the mutual intelligibility criterion. They are not conducive to a technical or linguistic study. Although non-linguistically based definitions serve other, pragmatic purposes, they are not part of the following discussion.

2. Mutual intelligibility requires that each party understand the other, whether as speaker or author. One-way or uni-directional understanding may only serve as an intermediate step in measuring the degree of understanding.

3. Mutual intelligibility itself does not stipulate the modality of the source language production. It is generally interpreted as understanding of speech, but that is only because the majority of the world's languages use an alphabetic writing system, so that speech and written text are generally consistent. (There is the concept of orthographic depth, which measures this consistency.) In the case of a character-based writing system, strictly applying the mutual intelligibility criterion requires separate analyses by modality, one for speech and the other for writing. In the case of Chinese and Cantonese, it is generally agreed that a person speaking a variety of Chinese (typically Mandarin) with no ability in Cantonese and a person speaking Cantonese with no ability in the Chinese variety that the other person speaks cannot verbally communicate. Therefore Chinese and Cantonese are said to be different languages in terms of oral mutual intelligibility. From this point to the end of this blog posting, let's discuss written mutual intelligibility only.

4. To test whether two language varieties are languages or dialects of one language, we must not fall for the fallacy of contrived test materials cherry-picked to prove a pre-supposed conclusion. This practice is particularly widespread when people, not just language amateurs but also professional linguists, argue for the two-language verdict on Chinese and Cantonese. The correct test should be based on a very large language corpus. When giving materials to volunteers in a test, the sentences must be randomly selected from a comprehensive corpus, ideally the whole content of the Internet, probably supplemented by some text that is commonly produced but rarely uploaded to the Internet. Notably, in the case of Cantonese, if the test materials contain a higher ratio of Cantonese-specific characters and words than average, the test is biased and becomes invalid.

5. To measure the percentage of the test material that is understood, the multiple-choice questions should have a relatively high number of choices (at least 4), to reduce the chance of correct answers by random guessing.

So far I have outlined an experiment to check whether Chinese and Cantonese are languages or dialects by strictly applying the mutual intelligibility criterion. We can see that the result is not a Yes or No, but a percentage, unless you arbitrarily declare that above a certain cut-off value they are dialects and below that they are languages.
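The experiment outlined in points 2, 4 and 5 could be sketched roughly as follows. This is only a minimal illustration, not an established procedure: the function names are hypothetical, the corpus is assumed to be a plain list of sentences, and taking the weaker of the two one-way rates as the "mutual" figure is my own assumption. The cut-off remains an arbitrary parameter, exactly as noted above.

```python
import random


def sample_test_items(corpus, k, seed=None):
    """Draw k sentences uniformly at random, without replacement (point 4)."""
    rng = random.Random(seed)
    return rng.sample(corpus, k)


def comprehension_rate(answers, answer_key):
    """Fraction of multiple-choice questions answered correctly (point 5)."""
    if not answer_key or len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must be non-empty and equal in length")
    correct = sum(given == expected for given, expected in zip(answers, answer_key))
    return correct / len(answer_key)


def mutual_rate(a_reads_b, b_reads_a):
    """Assumption: the weaker direction is taken as the mutual figure (point 2)."""
    return min(a_reads_b, b_reads_a)


def verdict(rate, cutoff):
    """The cutoff is arbitrary; the experiment itself yields only a percentage."""
    return "dialects" if rate >= cutoff else "languages"
```

For instance, `verdict(mutual_rate(0.7, 0.9), cutoff=0.6)` gives `"dialects"`, while the same rates with a cut-off of 0.8 give `"languages"`, which shows how much the Yes/No answer depends on the declared cut-off rather than on the measurement.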

I personally only know Chinese, specifically its Mandarin and Sichuanese dialects, and don't know Cantonese at all. In terms of written mutual intelligibility, I don't know what percentage of an absolutely randomly selected Cantonese document I can read and understand. If I may hazard a guess, I would say at least 70%, i.e. I can answer 7 or more out of 10 reading comprehension questions correctly. But without such an experiment, it's only a guess.

6. To make this discussion complete, we have to avoid one trivial trap in applying the mutual intelligibility criterion, which we must consider to be a necessary but not sufficient condition. We cannot conclude that language varieties A and B are dialects merely because they meet the mutual intelligibility requirement. The missing condition that must also be met is that A and B are under one genus as defined by Dryer and Haspelmath. As other scholars have done, we add this condition to preclude the obviously incorrect but otherwise possible conclusion that, for example, Chinese and Japanese become two dialects of one language because a Chinese person and a Japanese person can communicate by writing. We avoid this specious claim by recognizing that Chinese and Japanese are not closely related, or specifically, not of one genus in language classification. (When using Dryer and Haspelmath's Genealogical Language List, we should, for the purpose of strictly applying the mutual intelligibility criterion to distinguish languages from dialects, disregard the fact that they list Cantonese under the heading of Chinese.)

Summary: It is possible to strictly apply the mutual intelligibility criterion to determine whether Chinese and Cantonese are two languages or dialects. Due to the unique writing system, this criterion must be separated into oral and written intelligibility. Thus, in terms of oral mutual intelligibility, Chinese and Cantonese can be said to be two languages. In written mutual intelligibility, the decision can only be made after an actual experiment and after setting a cut-off value for intelligibility.

Thursday, February 4, 2021

2021: The Year of "牛"

Of the twelve Chinese zodiac animals, some have several competing English translations, such as 羊 as "sheep", "goat" or "ram", and 鸡 as "rooster" or "hen". When a year falls on such a zodiac animal, there is invariably a debate as to which English word is the best fit. Other animals are much less debated. For instance, nowadays 牛 is almost always translated as "ox".

So another question arises: why is "ox" preferred to, say, "bull", "cow", "buffalo", or "cattle"? In fact, such translations did exist, but they gradually died out over the past decades, specifically since the 1960s or 1970s according to Google Ngram. Why "ox" eventually came out on top is not easy to explain, as is the case with many things in human languages. Let's break up the question a little bit. To be precise, an ox is a castrated male bovine, a bull is an uncastrated one, and a cow is a female. I think the reason why the word "cow" is not chosen, in spite of its higher usage frequency, is that the western zodiac has Taurus, which is male, and that word and its referent probably had some influence on the early choice of word for the Chinese zodiac animal 牛. Next, let's analyze the choice between "ox" and "bull". According to an Internet user who answered the question I posted to a Facebook group, an ox is a bovine trained as a draft animal, as stated on Wikipedia. Similarly, a 牛 in mainstream traditional Chinese culture is also a draft animal, not a source of food (beef, milk, etc.). In this sense, English "ox" is a more appropriate translation than "bull".

Back in 2015, I blogged about the English word for 羊 as the Chinese zodiac animal, and I proposed that, to eliminate the ambiguity in the Chinese word or character, we simply find the lowest-level biological taxon that contains all the species the various English names refer to. For example, a sheep belongs to the genus Ovis, which belongs to the subfamily Caprinae, and a goat belongs to the genus Capra, which also belongs to the subfamily Caprinae. Therefore, the best word to translate 羊 is "caprinae". Well, it is best only if we can ignore the ignorance of the general public. But generally that's not a very good idea. Fortunately, in the case of 牛, the word "cattle" seems to cover "ox", "bull", "cow" and even "buffalo", and "cattle" happens to be a common word that even an ignorant John Doe knows the meaning of. So I think "cattle" is the best translation for 牛. But it's too late to promote this because English speakers have already been saying "ox" for Chinese 牛 for 50+ years.
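The "lowest shared taxon" idea amounts to finding the deepest rank at which the lineages of all the candidate animals still agree. Here is a small sketch of that, assuming each animal's lineage is written out by hand from the most general rank to the most specific. The `LINEAGES` table and the function name are hypothetical, and the family Bovidae is added by me for illustration; only the genus and subfamily names come from the text above.

```python
# Hand-written lineages, most general rank first (family, subfamily, genus).
LINEAGES = {
    "sheep": ["Bovidae", "Caprinae", "Ovis"],
    "goat": ["Bovidae", "Caprinae", "Capra"],
}


def lowest_common_taxon(names, lineages=LINEAGES):
    """Return the most specific taxon shared by every named animal, or None."""
    paths = [lineages[name] for name in names]
    common = None
    # Walk the ranks in parallel and stop at the first rank where they differ.
    for ranks in zip(*paths):
        if all(rank == ranks[0] for rank in ranks):
            common = ranks[0]
        else:
            break
    return common


print(lowest_common_taxon(["sheep", "goat"]))  # Caprinae
```

The same function could be pointed at lineages for "ox", "bull", "cow" and "buffalo", but those would have to be filled in from a taxonomy reference first.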

Sunday, January 24, 2021

English teacher's accent (英语老师的口音)

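In a Facebook language-learning group, someone said that he teaches English at a primary school in France. The school hired an English teacher of Indian descent (English is her native language), and he is this new teacher's supervisor. Other teachers at the school complained to him that the new teacher's accent is a problem for the children, especially when she pronounces the r sound. He thinks these teachers are wrong, because he believes that getting used to different accents is a very good thing. (He said "i dont want to be angry. I want to make them understand the wonderful benefit of learning from different accents. Do you have any suggestions?")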




Thursday, December 3, 2020

First floor vs ground floor 楼层的称谓

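American English calls the bottom level of a building the first floor, followed upward by the second floor, third floor, and so on. British English calls the bottom level the ground floor, followed upward by the first floor and second floor, and this is still the case today. European countries follow the British convention, as do many other regions of the world. In Latin America, however, both the British and the American conventions are in use, depending on the country, and there seems to be no pattern; see the Wikipedia article Storey. But Wikipedia is wrong about Mexico, which uses the American convention. (Note: "British convention" and "American convention" are only convenient labels and do not imply that the convention of a given region originated in Britain or the US.)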





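If we want to be pedantic, the British convention is actually illogical. The bottom level is called the ground floor (literally 地楼层) and the level above it the first floor (第一楼层). Since the word floor is used for the bottom level, that level is acknowledged to be one of the floors, so why does it have no ordinal number in the set of floors? If you have three sons, the eldest can be called 大儿子 ("big son"), but the next two are of course called 二儿子 ("second son") and 三儿子 ("third son"), never "first son" and "second son", no matter how special the eldest is. So separating the "ground floor" from the "first floor" is a forced argument. The origin of this convention is unclear for now, but the American convention we are familiar with, adopted after the British colonists came to North America, is no doubt more reasonable. Because of Britain's historical influence, more countries in the world use the British convention. By population, however, since China's convention is the same as America's, the people who use the British convention may well be fewer, even if Indians are included.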


Sunday, November 15, 2020

"drawing" and "painting"

On Weibo or Microblog, the Chinese social network, the blogger 芝加哥艺术博物馆 (Art Museum of Chicago) made a posting about Claude Monet, and quoted him as saying "I never had one [studio] and personally I don’t understand why [people] would want to shut themselves up in some room. Maybe for drawing, sure, but not for painting" (my bold text), and offered a Chinese translation as "我从来没有过画室,我也不明白为什么要把自己关起来。也许是为了绘画,但不是为了绘画". Other than missing "people", the English quote is grammatically correct, and more or less faithful to the original quote in French.[note]

But the confusing part of the Chinese translation is 绘画. Its first occurrence is for "drawing", the second for "painting". What's the difference between drawing and painting (or dessiner and peindre in French)? First, drawing is more about creating art with dry or somewhat dry materials, with a pencil, pen, charcoal, etc. Painting, which reminds us of painting a room or a car, is more about creating art with wet materials, including paint and acrylic. Secondly, drawing focuses on the outline while painting focuses on colors. Lastly, drawing is traditionally black and white while painting must have various colors. These differences I list here are obviously not hard and fast rules, especially in modern art. (Note: Monet died about a century ago.) You can find other people's opinions with a Google search.

What about the Chinese words for "drawing" and "painting"? The Wikipedia page for drawing has its Chinese page titled 素描, literally "black-and-white outline", and that for painting has the Chinese page 绘画. This latter Chinese word is translated as both "drawing" and "painting" in English. Etymologically, both 绘 and 画 emphasize drawing more than painting. But as I discussed earlier, it's wrong to look for the modern meaning of a word in its original meaning; we should only look at its meaning as the word is used today. On the other hand, 素描 precludes the possibility of a colored outline, which, needless to say, was indeed rare in Monet's time.

So, how do we translate Monet's words into Chinese, making a distinction between "drawing" and "painting"? Unfortunately, in spite of splendid Chinese culture and civilization, the vocabulary of the Chinese language is not rich enough to expose this nuance in what Monet tried to convey. A less than perfect translation of his words, judging by the context, may be "(关在画室里)打画稿可以,画一幅画不行" (literally, "(shutting oneself up in a room/studio), making a sketch is OK, making a painting is not OK"). This is a roundabout way to paraphrase Monet and it depends on my understanding of his attitude toward nature and his personal way to represent nature. Until we artificially designate one Chinese word for "drawing" and the other for "painting", the sentence cannot be literally translated. If we do go for 素描 for "drawing" and 绘画 for "painting", the Chinese reader will definitely get confused, unless a translator's note is given to that effect.

[note] "Mon atelier ! Mais je n'ai jamais eu d'atelier, moi, je ne comprends pas qu'on s'enferme dans une chambre. Pour dessiner, oui : pour peindre, non" (source: Wikipedia). Note that there is no word for "maybe" as in the English translation, which also misses the word "studio" and renders "oui" as "sure" instead of "yes".

Tuesday, September 8, 2020

Do not use etymology to determine current meaning of a word

It may sound obvious. When you want to know the meaning of a word, you look it up in a dictionary and check the definitions, probably with some examples. Only if you're interested in its origin will you check its etymology. But in reality, we see that a lot of people trying to explain the connotations or nuances of meaning of a word resort to etymology. For example, in my 2017 blog 自由: "freedom" or "liberty"?, I criticized those who rack their brains trying to come up with semantic differences between freedom and liberty where there are none (although which word is more customarily used in which set phrase does show a difference in frequency).

Recently, in a Weibo posting, a Chinese blogger tried to justify his translation of draconian as "惨无人道" ("inhumanely atrocious"). He was reading the following passage of an MIT Technology Review article, Every country wants a covid-19 vaccine. Who will get it first?:

"By then, though, China had a different problem: not enough covid-19. Its draconian lockdown measures had quashed the virus at home so effectively that doctors couldn’t find patients to fully test their vaccine on."

His comment is, "这里特别用到一个极其恶毒的词语叫draconian,可以翻译为惨无人道" ("An extremely vicious word is used here called draconian, which can be translated as inhumanely atrocious"). When other readers pointed out to him that his understanding of this word was incorrect, he justified his interpretation by finding the origin of draconian, which is the Athenian lawmaker named Draco, known for making harsh laws.

So much for this story. Let's re-read the renowned linguist Thomas Pyles's frequently quoted statement that "[t]here is a widespread belief, held even by some quite learned people, that the way to find out what a word means is to find out what it previously meant — or, preferably, if it were possible to do so, what it originally meant--a notion similar to the Greek belief in the etymon... such an appeal to etymology to determine present meaning is as unreliable as would be an appeal to spelling to determine modern pronunciation." (The Origins and Development of the English Language, 1964 ed., pp304-5). Not heeding this warning, we would say calculate only if we were to count pebbles because calculate comes from Latin calx ("stone"), and we would either quarantine potential SARS-CoV-2 virus carriers for 40 instead of 14 days or flatly refuse to use the word quarantine because the word inherently meant "forty".

Monday, September 7, 2020

Linguists' responses to school dismissing professor saying 那个 in communication class

A filler word in a language is uttered when the speaker hesitates in speech. While most languages have eh, ah or m, some languages have their own language-specific words. For example, some English speakers say you know for this purpose, and Chinese speakers may say 那个 (pronounced like naygher or nagher without the trailing rhotic vowel; pinyin: nèige or nàge). According to the Los Angeles Times, University of Southern California business school professor Greg Patton gave 那个 (nèige) as an example of a Chinese filler word in his business communication class and was dismissed by the school, which acted on the complaint of certain African American students in his class. The following are a few of the most-liked comments on this news in the Facebook Linguistics group:

* What a ridiculous thing. An inoffensive word in another language sounds close to an offensive word in your native language and so you get the professor fired? Perhaps those students need to learn some tolerance about linguistic differences.

* I can't be the only one to whom this part of the identity movements in the US feels very much like a toxic and bigoted form of American cultural colonialism, where certain groups within the US try to force their form of cultural ethics onto the rest of the world?
How is it reasonable for Americans (or more generally, mono-lingual English speakers) to demand respect for their own culture or ethnicity, but demand other cultures to adapt themselves to their own highly culturally-specific standards? How is it acceptable in the English-speaking academic world to demand non-English speakers to adapt their native language "because it sounds offensive" to an outsider?
-- * [my follow-up comment] (if we expand this topic a little bit) These students' complaint and the school's decision about the professor who indicated the usage of the word in clear context will have an effect of alienating Chinese Americans who overall supported the Black Lives Matter movement, which, like any movement, ought to recruit as many supporters as they can. These two things should be separated. But unfortunately humans are human.

* [me to another commenter] You mean he should have chosen another filler word? In Chinese, eh or its variant ah is pretty much the only other one. But 那个 is so common and distinctive in Chinese that not mentioning it can be considered a fault in teaching. By the way, the fact that there're 10,000 characters in Chinese is irrelevant to what filler words exist in Chinese.

Other comments:

* The fact that the professor introduced that it was another languages’ conversational manager word and then said he word makes all the difference. If these students conversed with someone in a Chinese dialect, would these students try to get the Chinese student expelled?
* My best friend, who is Black, visited China on a short term abroad in a business course in school. He obviously heard this term used as it is part is casual language. Should he have been angry with the tour guides, restaurant employees, etc? He told me he was initially confused and even worried that it was meant to be hurtful but after the linguistic meaning was explained to him it all made sense and he no longer felt any distress. Why didn’t this help the USC students? I feel for them but I also feel for the professor.
* I remember in grad school a Colombian woman gave a teachimg presentation in spanish and used negro in reference to black people. It is the correct term for the color in spanish and many people in central and South America use the term for their skin color as well. But i heard audible gasps from listeners in the room. Thankfully, everyone had what I felt was enough maturity to realize it was not a slur she was using, but a word in another language. I think someone asked her privately afterwards about the use, but nothing else ever came of it.
-- * [a follow-up comment] maturity and cultural understanding make a big difference
* the student response feels extreme from my perspective as a white linguist who also teaches communication, but I also think it would have been better if he could have chosen a different example or given a call out th…
-- * [a follow-up comment] How would a teacher of philosophy teach about Kant in the USA?
-- * [a follow-up comment] What example would you use? In 10+ years as a Chinese speaker, I’m not sure I’ve heard any other word in Chinese used in that way.
* What would happen if the Professor was Chinese and explained the same thing?
* Ugh... it’s never ending with stupid people
* as a member of the human race who can think I can logically deduce that the professor did not mean to use the N word.
* I thought I had seen stupidity at its lowest level. I was wrong.
* Monolingual people problems!
* This is bullshit. America, land of the contextually dead.

Sunday, August 2, 2020

Tones are more immune to interference

A Chinese American person recently told me she had better listening skills in Chinese than in English (note: not listening comprehension, but just listening, or speech sound recognition). It's surprising because she was born and grew up in the US, never living in a Chinese-speaking country except for short periods. She admitted that her conclusion may have a confounding factor: both her parents are Chinese immigrants and speak clear Mandarin to her at home. I told her that her better listening may be related to the fact that the tone of Chinese, or any tonal language for that matter, offers high immunity to interference. This means that the listener can discern the speaker's tone even if there is ambient noise, if the speaker does not utter syllables clearly, or if the distance between the speaker and the listener significantly reduces the sound volume. Under less than optimal conditions, if different tones of a sound in the language alter the meaning of the sound, there will be less loss of information carried to the listener because the tone is more immune to interference than other phonemic features of the sound.

So, tone is a desirable feature of a language. But why is it that only some languages are tonal? According to the 2015 article Climate, vocal folds, and tonal languages: Connecting the physiological and geographic dots, tonal languages are generally distributed in humid regions of the world, while non-tonal languages are in arid or dry regions. To produce tones, the human vocal organs require a favorable ambient environment, and "very cold/dry regions apparently serve as barriers to the spread of (complex) tone".

Saturday, July 25, 2020

Adjective in the form of the past participle of an intransitive verb

First, a few grammatical terms. Everyone knows what an adjective is, like "big" in "a big car". Past participle (PP hereinafter) is a form of a verb that you use after "have" to indicate a completed action, like "opened" in "I have opened the door". A verb is intransitive when it is not followed by an object, like "happen" in "The incident happened", although it can be followed by a complement indicating time, place, etc. A transitive verb is followed by an object, like "hit" in "He hit him".

Sometimes the PP of a verb can be used as an adjective, like "opened" in "the opened jar", referring to the jar that was opened (by somebody), which is slightly different from "the open jar", where the speaker emphasizes the state of the jar more than someone's opening action.

All is fine if the verb is transitive. That is, there is no problem in using PP of a transitive verb as the modifier of a noun (nominal modifier), serving the function of an adjective. But can PP of an intransitive verb do so? The answer is sometimes but not always. We can say "an expired license", which is the same as "a license that has expired". The phrase "the disappeared man" seems to be acceptable, referring to the man that has disappeared, not necessarily implying that the man was forced to disappear by e.g. abduction. (The verb disappear does have the rare transitive sense of "to make vanish" according to Wiktionary, but we don't discuss it here.) On the other hand, we cannot say *"a come guest" (* means incorrect) and have to say "a guest that has come".

An interesting question is: how do we know when the PP of an intransitive verb can be used as an adjective or nominal modifier? I posted a question to the Facebook Linguistics group. One reader, apparently a linguist, referred me to the concept of "unaccusative verb". According to Wikipedia, "an unaccusative verb is an intransitive verb whose grammatical subject is not a semantic agent. In other words, it does not actively initiate, or is not actively responsible for, the action of the verb." Let me paraphrase. Just because a word (or phrase) is the grammatical subject in front of a verb doesn't always mean it actively (主动地) takes the action indicated by the verb. For example, "The window broke" doesn't mean the window wanted to break and therefore broke. It broke probably because someone broke it, or the bad weather caused it to break. This is different from "A guest comes" because the guest can walk and take action by himself and comes. Note that in linguistics, "accusative" refers to the relationship between the verb and its immediate action on its direct object; it has nothing to do with the action of accusing someone of doing something bad, although "John accuses Jake" does have the accusative action in it ("accuses Jake").

The article goes on to say "[u]naccusative past participles can be used as nominal modifiers with active meaning", and gives a criterion to identify such verbs. For example, in the archaic sentence "He is fallen/come" (which means He, usually referring to Jesus, has fallen / come), because "is" instead of "has" is used, both "fall" and "come" are unaccusative. Well, obviously, in Modern English, only "a fallen tree", not *"a come visitor", makes sense. So I'm afraid we can only say some unaccusative past participles can be used as nominal modifiers or adjectives. The article lists 6 groups of unaccusative verbs given by Perlmutter (1978). But I don't think all are fit to be used as nominal modifiers. Specifically, I would say (a) and (c) won't work (e.g. *"the happened event"). In (f), only "survive" works.

For native English speakers, this is a non-issue because which intransitive verb can and which cannot be turned into PP and act as an adjective naturally comes to the mouth or pen (nowadays keyboard). For English learners, it may be more fruitful to just learn them by reading and listening than by studying the grammatical rule. Nevertheless, the linguists' effort to decipher the underlying grammatical rule is intriguing to the curious mind.

Monday, February 10, 2020

"self-driving" vs "self-driven", "self-limiting" vs "self-limited"

In English, the compound adjectives <NP>-<V>ing (noun or noun phrase followed by verb in its -ing form) and <NP>-<V>ed (noun or noun phrase followed by verb in its -ed or past participle-like form) imply different relationships between <NP> and <V>. Specifically, in the former case, <NP> is the object[note] of the action <V>, while in the latter, <NP> is the agent of <V>. For example, "man-made" implies that man makes (whatever follows), as in "a man-made satellite". If you were to say "man-making", it would denote something that makes man or a human!

But this analysis seems to break when the first element is the word "self". A Google exact phrase search for "self-driving car" currently returns about 6,980,000 results and a search for "self-driven car" returns about 540,000. While the latter -ed form is less than 10% of the -ing form, most articles appear to be written by native speakers, suggesting that both forms are accepted (but people may be subconsciously treating "self" as an object more than an agent?). After all, it makes sense because "self" means, well, self; there's no need to distinguish between agent and object.

The recent coronavirus causes pneumonia that is self-limited, according to China’s National Health Commission. So, let's check "self-limiting disease" vs. "self-limited disease", a term referring to a disease that runs its course without medical treatment (treatment may speed up the process, but that's a separate point). "Self-limiting" is slightly more popular than "self-limited", 118,000 vs. 105,000 on Google. Indeed, when the <NP> is "self", either the -ing or the -ed form of the verb is accepted.
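As a quick check of the arithmetic behind these comparisons, the ratios can be computed directly from the hit counts quoted above. This is only a sketch with a hypothetical helper name, and Google result counts are rough estimates at best.

```python
def ed_to_ing_ratio(ed_hits, ing_hits):
    """Ratio of hits for the -ed form to hits for the -ing form."""
    return ed_hits / ing_hits


# Hit counts as quoted in the text.
driving = ed_to_ing_ratio(540_000, 6_980_000)   # "self-driven car" vs "self-driving car"
limiting = ed_to_ing_ratio(105_000, 118_000)    # "self-limited disease" vs "self-limiting disease"

print(f"self-driven / self-driving:   {driving:.1%}")
print(f"self-limited / self-limiting: {limiting:.1%}")
```

The first ratio comes out under 10%, consistent with the statement above, while the second is close to parity.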

[note] A more technical term for "object" here is "patient", not in any way related to a sick person in a hospital.