Wednesday, November 19, 2025

A Corollary Approach To Vietnamization

Partial Syllabic Correspondences In Vietnamese Etymology

by dchph






The corollary method emphasizes forms in which at least one syllable aligns with Chinese or Yue sources. These partial correspondences reveal how cultural and literary channels reinforced borrowing, while semantic shifts and vernacular creativity reshaped the lexicon.

I) A corollary approach defined

Definition: A corollary approach is one of several analogical methods used to establish affiliated linguistic attributes in etymological candidates by aligning lexical properties through shared semantic peculiarities, intrinsic traits, and formal resemblance.

Scope: Sinitic‑Vietnamese forms diverged from Sino‑Tibetan sources under conditions shaped by competence and performance. Vietnamese speakers, exposed to diverse Chinese dialects – military, migrant, and courtly – over more than a millennium in Annam Đôhộphủ, adopted and adapted forms. Native items yielded or coexisted, explaining the abundance of cognates alongside exceptions.

Method: To identify Chinese‑Vietnamese cognates, the corollary approach tests representative sets. Items are selected for plausibility yet remain open to debate. Cognacy is posited when multiple items converge on shared roots; other correspondents may belong to the same genre and await discovery. This technique complements other methods, drawing on reconstructions and comparative analyses.

Irregularity Principle: Not all sound changes follow the strict rules governing scholarly Sino‑Vietnamese strata. Disputable items and unconventional developments are treated under an irregular paradigm driven by frequency and usage. Rejecting some proposed cognates does not invalidate others; each etymon is assessed on its own merits.

Rationalization: Examples are advanced through reasoning and induction, supported by attested sound change patterns. Even seemingly odd internal shifts – such as {p > t} or {p > b > ʔb > ɓ} from ancient Annamese to modern Vietnamese – can be justified when corroborated by historical phonology. Newly established paradigms can lead to new etyma.

Caveat: Outlandish examples are selectively included as supplements to accepted patterns. They may require further evidence and are not intended to codify rigid rules, as cross‑genre differences limit direct transferability.

Once the correspondences below are accepted, it becomes clear why patterns in Sino‑Vietnamese sound changes that appear irregular are, in fact, systematic:

  • 額 é → SV ngạch 'amount, forehead'
  • 岸 àn → SV ngạn 'bank' 
  • 罷 bà → SV bãi 'on strike'
  • 畢 bì → SV tốt 'graduation'
  • 必 bì → SV tất 'inevitable'
  • 季 jì → SV quý 'season'
  • 節 jié → SV tiết 'festival'
  • 偏 piān → SV thiên 'bias'
  • 匹 pǐ → SV thất 'match, lone'
  • 起 qǐ → SV khởi 'rise'
  • 七 qī → SV thất 'seven'
  • 煽 shān → SV phiến 'incite'
  • 攝 shè → SV nhiếp 'act for'
  • 濕 shī → SV thấp 'damp'
  • 灣 wān → SV loan 'bay'
  • 熄 xī → SV tức 'put out'
  • 學 xué → SV học 'study'
  • 左 zuǒ → SV tả 'left'
  • 郵 yóu → SV bưu 'postal'

The list of correspondences could be extended indefinitely. Each example demonstrates that what appears to be irregularity is in fact governed by underlying rules — for instance, b > t or sh‑ ~ nh‑. When such interchanges recur across more than six consistent pairs (the “six‑strike threshold”), they can no longer be dismissed as anomalies; rather, they constitute a systematic principle of sound change.
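The threshold described above can be sketched as a simple tally. This is only an illustration, not the author's procedure: the function name `systematic_correspondences`, the constant `SIX_STRIKE`, and the hand-segmented sample pairs are assumptions introduced here, and no phonological analysis is performed.

```python
from collections import Counter

SIX_STRIKE = 6  # threshold named in the text: more than six consistent pairs


def systematic_correspondences(pairs, threshold=SIX_STRIKE):
    """Return the correspondences attested in more than `threshold` pairs.

    `pairs` is an iterable of (source_initial, target_initial) tuples that
    were segmented by hand beforehand (an assumption of this sketch).
    """
    counts = Counter(pairs)
    return {corr: n for corr, n in counts.items() if n > threshold}


# Hand-segmented (Mandarin initial, Sino-Vietnamese initial) pairs for a few
# items from the list above; illustrative only.
sample = [
    ("b", "t"),    # 畢 → tốt
    ("b", "t"),    # 必 → tất
    ("p", "th"),   # 偏 → thiên
    ("p", "th"),   # 匹 → thất
    ("sh", "nh"),  # 攝 → nhiếp
]

print(Counter(sample).most_common())
print(systematic_correspondences(sample))  # stays empty until a pair recurs 7+ times
```

On so short a sample nothing clears the threshold; the check becomes meaningful only when it is run over the extended lists.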

For clarity, we set aside the Shijing 詩經 (Book of Odes) (cf. Karlgren 1945), since its archaic glosses can be stretched to match nearly any word etymologically and risk obscuring the patterns under discussion.

The same principle applies within Sinitic‑Vietnamese etymology. As noted earlier, these words are products of patterned phonetic interchanges that parallel sound changes already attested in Sino‑Vietnamese loan strata. Their induced shifts were not random but became predominant within Vietnamese phonology. The following illustrations will demonstrate the validity of these established sound change patterns.

For their plausibility, these examples are best understood as corollaries: Vietnamese body‑part terms that correspond etymologically to Chinese cognates.

  • 頭 tóu → đầu 'head'
  • 腦 nǎo → não 'brain'
  • 髮 fà → tóc 'hair'
  • 目 mù → mắt 'eye' [cf. 眼 yǎn → nhìn 'look']
  • 瞳 tóng → tròng 'eyeball'
  • 面 miàn → mặt 'face'
  • 顁 dìng → trán 'forehead'
  • 眉 méi → mày 'eyebrow'
  • 眉毛 méimáo → mi 'eyelash'
  • 齡 líng → răng 'tooth' [by association with 牙 yá 'ivory' → SV ngà]
  • 頷 hàn → cằm 'chin'
  • 含 hán → hàm 'jaw'
  • 肉 ròu → nướu 'gum'
  • 蟲 chóng → sâu 'cavity' [cf. 牙蟲 yáchóng → sâurăng 'cavity']
  • 犬 quǎn → khểnh 'canine' [cf. 犬牙 quǎnyá → răngkhểnh 'canine']
  • 吻 wěn → mồm 'mouth' (hence miệng) [cf. 吻 wěn → hôn 'kiss']
  • 鬚 xū → râu 'beard'
  • 翁 wēng → lông 'hair'
  • 膚 fū → da 'skin' [by association with 皮 pí → SV bì]
  • 手 shǒu → tay 'hand'
  • 手板 shǒubǎn → bàntay 'palm'
  • 胳膊 gēbó → cánhtay 'arm' [by association]
  • 胳臂 gēbì → cánhtay 'arm' [by association]
  • 胳膊肘兒 gēbózhǒur → cùichỏ 'elbow' [by contraction]
  • 背 bèi → vai 'shoulder' [by innovation]
  • 喉 hóu → cổ 'throat'
  • 舌 shé → lưỡi 'tongue' [by association with 脷 lěi]
  • 喉嚨 hóulóng → cổhọng 'throat'
  • 肺 fèi → phổi 'lung'
  • 臆 yì → ngực 'chest'
  • 心 xīn → tim 'heart'
  • 肝 gān → gan 'liver'
  • 腎 shèn → thận 'kidney'
  • 腰 yāo → eo 'waist'
  • 腚 dìng → đít 'buttocks'
  • 屁 pì → địt 'fart'
  • 胸 xiōng → hông 'hip' [by innovation]
  • 胃 wèi → dạ 'stomach' [cf. 胃子 wèizi → dạdày 'stomach']
  • 脾 pí → tỳ 'spleen'
  • 腹 fù → bụng 'belly'
  • 腿 tuǐ → đùi 'lap'
  • 腳 jiǎo → giò 'leg'
  • 足 zú → chân 'foot'
  • 脛 jìng → cẳng 'leg'
  • 趼 jiǎn → móng 'fingernail' [by association]
  • 腳板 jiǎobǎn → bànchân 'sole'

Below are parallel glosses carrying the same semantic traits. For example:

  • 手板 shǒubǎn → bàntay 'palm' (literally 'a panel of the hand')
  • 腳板 jiǎobǎn → bànchân 'sole' (literally 'a panel of the foot')
  • 腳脖 jiǎobó → cổchân 'ankle' (literally 'the neck of the foot')

By contrast, modern concepts are clearly identifiable as Chinese loanwords, such as:

  • 肺勞 fèiláo → laophổi 'tuberculosis' (modern M 肺結核 fèijiéhé)
  • 肝炎 gānyán → viêmgan 'hepatitis'
  • 氣喘 qìchuǎn → henxuyễn 'asthma'
  • 頭腦 tóunǎo → đầunậu 'ringleader'
  • 頂撞 dǐngzhuàng → chạmtrán 'head‑on'

II) Mechanisms of corollary borrowing

  • Performance factors: slips of the tongue, misperception, playful re‑analysis.
  • Competence factors: imperfect transmission of phonological rules, leading to hybrid forms.
  • Child language acquisition: systematic substitutions that reshape borrowed syllables.

According to the so‑called “six‑strike rule,” once a correspondence is attested in more than six consistent pairs, it can no longer be dismissed as irregular but must be recognized as a rule of sound change. Since most body‑part terms already demonstrate such cognacy, there is no reason why răng 'tooth' should be treated as an exception. Indeed, as Tsu‑lin Mei has suggested (see What Makes Chinese So Vietnamese? - Appendix G), máu 'blood' < 衁 huàng may itself have Austroasiatic origins, showing that Vietnamese body‑part vocabulary reflects both inherited and Sinitic layers.

A. Case studies

    1. "RĂNG":

    Phonological correspondences

    • răng ← 牙 yá 'tooth': SV nha, Mand. yá, Cant. ngah, Hai. gheh [ M 牙 yá, yǎ, yà < MC ŋa < OC *ŋra: || ¶ /y‑ ~ r‑/ ]. Cf. 牙 yá: SV ngà 'ivory'; 笌 yá: VS măng 'bamboo shoot'; 萌芽 méngyá 'germ' → SV manhnha, VS mầmmống, VS giá 'sprout'. Also: 齡 líng (linh) [ Vh @ QT 齡 líng < MC lɛjŋ < OC *reːŋ ]

    (i) Initial interchanges { y‑ ~ ng‑, r‑ }

  • 齖 yá → răng 'tooth'
  • 劜 yà → rặng 'exert'
  • 勜 yà → ráng 'exert oneself'
  • 崖 yá → rặng(núi) 'mountain range' [cf. 嶺 líng: SV lĩnh 'ridge']
  • 砑 yà → ràng 'wrap up'
  • 掗 yà → ràng 'attach'
  • 蚜 yà → rọm 'aphid'
  • 啞 yā → ràm (onomatopoeic 'whining sound')
  • 耀 yáo → rạng 'glowing'
  • 煆 yā → rực 'raging fire'
  • 椏 yā → nhành 'forking branch'
  • 枒 yá → vành 'rim' [also dừa 'coconut']
  • 襾 yà → vung 'lid, cover'
  • 婭 yà → ấy (address term between sons‑in‑law)
  • 押 yā → giải 'detain in custody'
  • 笌 yá → măng 'bamboo shoot'
  • 羊 yáng → dê 'goat'
  • 焱 yàn → rang 'hot'
  • 吆 yāo → rao 'shout'
  • 隐 yǐn → riêng [as in 隐私 yǐnsī → riêngtư 'private']
  • 硬 yìng → rắn 'sturdy'
  • 蝇 yíng → ruồi (nhặng) 'flies'
  • 元 yuán → ngươn (surname) [cf. 阮 ruǎn → Nguyễn]
  • 月 yuè → giăng 'moon'
  • 曰 yuè → rằng 'said'
  • 鉞 yuè → rựa 'axe'
  • 夭夭 yāoyāo → rậmrạp 'bushy'
  • 悒悒 yìyì → rayrứt 'uneasy'

    Dialect note: In Vietnam's southwestern Rạchgiá subdialect, speakers often substitute /g‑/ for /r‑/, e.g., găng for răng and gỗ for rỗ.
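To see how the Vietnamese reflexes in list (i) group by initial, one can tally their orthographic onsets. The following is a minimal sketch, assuming NFC-normalized quốc ngữ spelling; the `ONSETS` inventory and the `onset` helper are illustrative names introduced here, and spelling is only a rough proxy for phonemic initials.

```python
from collections import Counter

# Orthographic onsets of modern Vietnamese, longest first so that "ngh"
# is tried before "ng", and "ng" before "n".
ONSETS = [
    "ngh",
    "ng", "nh", "gh", "gi", "kh", "ph", "th", "tr", "ch", "qu",
    "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
    "t", "v", "x",
]


def onset(word):
    """Return the orthographic onset of a syllable, or "" if vowel-initial."""
    lowered = word.lower()
    for candidate in ONSETS:
        if lowered.startswith(candidate):
            return candidate
    return ""


# A few reflexes copied from list (i); the selection is illustrative.
reflexes = ["răng", "rặng", "ráng", "nhành", "vành", "giải", "ngươn", "rằng"]

# Group the reflexes by onset to see which initials dominate the set.
print(Counter(onset(w) for w in reflexes).most_common())
```

The same tally can be run over the full list to see what share of the proposed cognates actually begin with r‑ rather than with nh‑, v‑, gi‑, or ng‑.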

    (ii) Final interchanges { ‑a(e), ‑Ø ~ ‑an, ‑ang }

  • 打 dǎ → đánh (cf. quánh /wajŋ5/) 'strike'
  • 嗎 mà → mắng 'scold'
  • 得 dé → đặng (< VS được, Hai. /dak8/) 'got, able to get' [possibly also from 行 xíng 'okay!']
  • 俄 é → Nga 'Russia'
  • 鵝 é → ngang, ngỗng (SV nga) 'goose'
  • 蛇 shé → rắn (SV xà) 'snake'
  • 炸 zhà → rán 'fry'
  • 耀 yáo → rạng 'illuminate'

    Compound evidence: The relationship between 牙 yá (SV nha) for both ngà 'ivory' and răng 'tooth' may also involve 齖 yá 'tooth' or 齡 líng (SV linh 'instar'). This distinction is fossilized in disyllabic compounds:

  • 牙齒 yáchǐ → răngcỏ 'teeth'
  • 犬牙 quǎnyá → răngkhểnh 'canine'
  • 牙肉 yáròu → nướurăng [cf. lợirăng → lợi 'gum']
  • 牙蟲 yáchóng → sâurăng 'cavity'
  • 咬牙 yǎoyá → nghiếnrăng 'grind teeth'
  • 假牙 jiǎyá → rănggiả 'false teeth'
  • 牙痛 yátòng → đaurăng 'toothache'
  • 牙床 yáchuáng → hàmrăng 'tooth bed'

    Induction:  The hypothesis is that SV ngà 'ivory' and VS răng 'tooth' are variants of the same root, alongside 牙 yá, 齒 chǐ, 齖 yá, and 齡 líng, all functioning as doublets in Chinese. This interpretation is debated. Tsu‑lin Mei, for example, argued that 牙 yá represents the sole phonetic value, evolving into SV ngà 'ivory', and affirmed its Austroasiatic origin (see What Makes Chinese So Vietnamese? - Appendices).

    In other words, following our corollary approach, the reconstruction issue may rest on the sound value of OC /*ŋrya:/, which appears cognate with ngà and may have been derived from an earlier form of răng, or conversely, răng could have developed from ngà. Meanwhile, 齡 líng (SV linh) and 齖 yá (SV nha) represent later developments, where 牙 yá ultimately supplanted its derivative 齖 yá in common usage.

    Per Tsu-lin Mei in https://www.people.cornell.edu/pages/tm17/paper459.htm,

    Some Min dialects still employ 牙齿 in the sense of tooth. The common word for tooth in Amoy is simply k’i. Foochow has nai3 which is a fusion of ŋɑ plus k’i, i.e. 牙齿. This strongly suggests that in Min the real old word for ‘tooth’ is 齿 as in Amoy, the implication being that this was still the colloquial word for ‘tooth’ well into Han when Fukien was first settled by the Chinese. The Japanese use 齿 as kanji to write /ha/ ‘tooth’ in their language; 牙 rarely occurs. Both these facts provide supplementary evidence for the thesis that the use of ya as the general word for ‘tooth’ was a relatively late development.

    In a note published in BSOAS, vol. 18, Walter Simon proposed that Tibetan so ‘tooth’ and Chinese yá 牙 (OC *ng*) are cognates, thus reviving a view once expressed by Sten Konow. Simon’s entire argument was based upon historical phonology; he tried to show

    a) OC had consonant clusters of the type sng- and C-, (b) by reconstructing 牙 as sng* > zng > nga and 邪 as zˠ* > z**, one can affirm He Shen’s view that 邪 has 牙 as its phonetic, and (c) Chinese sng* can then be related to a Proto-Tibetan *sngwa and Burmese swa:>θwa:.

    Our etymology for yá ‘tooth’ implies a rejection of Simon’s view; if yá is borrowed from Austroasiatic languages, then the question of Sino-Tibetan comparison simply does not arise. Alternately, if our theory is accepted, there is no reason to adopt Simon’s analysis; ya is clearly a word of relatively late origin, and the fact that 邪 has 牙 as its phonetic can be explained by assuming that the z- of 邪 resulted from the palatalization of an earlier g-.*

    Even with these arguments, some readers may hesitate to associate 牙 yá directly with răng. Nevertheless, the same reasoning extends to other contested items in the same semantic category, the uppermost parts of the vertebrate body, in addition to the human anatomical terms already discussed. These include:

  • 首 shǒu → sọ 'cranium' [/sh‑/ ~ /s‑/]
  • 面 miàn → mặt 'face' [/‑n/ ~ /‑t/]
  • 頂 dǐng → trán 'forehead' [/t‑/ ~ /tr‑/]
  • 眉 méi → mày 'eyebrow' [/m‑/ ~ /m‑/]
  • 目 mù → mắt 'eye' [/m‑/ ~ /m‑/]
  • 耷 dā → tai 'ear' [by association with 耳朵 ěrduō → VS lỗtai; cf. 洱 ěr (Cant. /lej6/) | ¶ /l‑/ ~ /t‑/]
  • 髮 fà → tóc 'hair' [/b‑/ ~ /t‑/; cf. SV phát /fat7/]
  • 鼻 bí → mũi 'nose' [/b‑/ ~ /m‑/]
  • 頰 jiá → má 'cheeks' [/j‑/ ~ /m‑/]
  • 嘴 zuǐ → môi 'lips' [/z‑/ ~ /m‑/]
  • 吻 wěn → mồm 'mouth' (hence miệng); also 吻 wěn → hôn 'kiss'
  • 頷 hàn → cằm 'chin' [/h‑/ ~ /k‑/]
  • 含 hán → hàm 'jaw' [/h‑/ ~ /h‑/]

2. "MẶT":

For 面 miàn (SV diện 'face') → VS mặt, the correspondence requires an explanation of how the final became /‑t/. The pattern here is /‑n ~ ‑t/, part of a broader set of correspondences in which finals /‑Ø/, /‑n/, /‑ng/ regularly shift to /‑t/ or /‑k/ in Vietnamese.

This type of final change is well attested in both Sino‑Vietnamese and Sinitic‑Vietnamese strata, many of which trace back to the 7th and 8th tone categories of Old Chinese. This is also one of the key reasons why Vietnamese historical phonology must be analyzed within an eight‑tone framework.
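The eight-tone framework mentioned above can be illustrated by deriving a tone number from the spelling of a syllable. This is a rough sketch under stated assumptions: the numbering (1 ngang, 2 huyền, 3 hỏi, 4 ngã, 5 sắc, 6 nặng, 7 checked sắc, 8 checked nặng) is inferred from transcriptions used in this article such as /fat7/, /wajŋ5/, and /wa3/ and may not match the author's full scheme, and the helper names `tone_name` and `eight_tone` are hypothetical.

```python
import unicodedata

# Combining marks that encode tone in Vietnamese orthography (visible after NFD).
TONE_MARKS = {
    "\u0300": "huyen",  # grave accent
    "\u0301": "sac",    # acute accent
    "\u0303": "nga",    # tilde
    "\u0309": "hoi",    # hook above
    "\u0323": "nang",   # dot below
}

# Assumed numbering for syllables that do not end in a stop.
OPEN_NUMBER = {"ngang": 1, "huyen": 2, "hoi": 3, "nga": 4, "sac": 5, "nang": 6}

# Orthographic stop finals that mark a checked syllable.
STOP_FINALS = ("p", "t", "c", "ch")


def tone_name(syllable):
    """Name the tone by its diacritic; no tone mark means ngang."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"


def eight_tone(syllable):
    """Map a syllable to one of eight tone categories (assumed numbering)."""
    name = tone_name(syllable)
    if syllable.lower().endswith(STOP_FINALS) and name in ("sac", "nang"):
        return 7 if name == "sac" else 8
    return OPEN_NUMBER[name]


for word in ["phát", "mặt", "quánh", "quả"]:
    print(word, eight_tone(word))
```

Under this numbering, mặt falls in the eighth category, which is the reflex the discussion of 面 miàn is concerned with.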

A prime example is 面 miàn itself. In compounds such as 面子 miànzi (SV diệntử > #VS cáimặt, 'the face'), the expected reflex would have developed into a form like /mie~t8/. From this pathway we arrive at the vernacular expression 沒面(子) méimiàn(zǐ) → VS mấtmặt 'lose face'.

Other representative examples include:

  • 吃 chī → VS ăn (SV ngật) 'eat' [cf. 乙 yǐ → SV ất]
  • 咽 yàn → VS nuốt 'swallow'
  • 粉 fén → VS bột 'flour'
  • 分 fēn → VS phút 'minute'
  • 淡 dàn → VS lạt 'insipid'
  • 晕 yùn → VS ngất 'faint, pass out'
  • 麦 mài → SV mạch 'wheat'
  • 脈 mài → SV mạch 'vein'
  • 滅 miè → SV diệt 'eliminate'
  • 目 mù → VS mắt 'eye'
  • 默 mò → SV mặc 'tact, silence'
  • 忙 máng → VS mắc 'busy'
  • 亡 wáng → VS mất 'to lose, pass away'
  • 密 mì → SV mật 'dense, secret'
  • 木 mù → SV mộc 'wood'
  • 没 mò → VS một 'loss, one'

Compound forms reinforce the same pattern:

  • 面孔 miànkǒng → VS khuônmặt 'face'
  • 面貌 miànmào → VS mặtmày 'countenance'
  • 前面 qiánmiàn → VS mặttrước 'front'
  • 後面 hòumiàn → VS mặtsau 'back'
  • 下面 xiàmiàn → VS mặtdưới 'bottom'
  • 側面 cèmiàn → VS mặttrái 'side view'
  • 表面 biǎomiàn → VS bềmặt 'surface'
  • 面對 miànduì → VS đốimặt 'facing'

From this evidence, we can safely posit a regular correspondence miàn ~ mặt.

Transition to other domains  –  Leaving aside the body‑part vocabulary, we may now turn to other semantic fields. Consider "cá" ('fish'). One must ask: is it plausible that coastal peoples, whose livelihood depended on fishing, would have borrowed such a fundamental word from inland, horse‑mounted Chinese speakers? The evidence suggests otherwise. Cá has long been a staple term not only for southern coastal communities but also for populations along the Yangtze River, the third longest river in the world, where fishing was equally central to daily life.

3. "CÁ":

    We have 魚 yú = cá = SV ngư 'fish'.

    • M 魚 yú < MC ŋɨə̆ < OC *ŋa 

    • According to Starostin: for *ŋh‑ cf. Xiamen hi2, Chaozhou hy2.

    • Protoform: ŋ(j)a. Meaning: 'fish'.

    • Cognates: Chinese 魚 ŋha 'fish'; Tibetan ɳa 'fish'; Burmese ŋah 'fish'; Lolo‑Burmese ŋhax; Kachin ŋa3 'fish'; Lushei ŋha 'fish'; Kiranti ŋjə 'fish'.

    • Comparative notes: Proto‑Garo tàrŋa; Bodo ŋa ~ na; Dimasa na; Chepang ŋa ~ nya; Tsangla ŋa; Moshang ŋa; Namsangia ŋa; Kham ŋa:ɬ; Kaike ŋa:; Trung ŋa1‑pla<ʔ1.

Observation: It is not difficult to see the denasalized velar shift from OC ŋh‑ to VS k‑ (cá). See Appendices for further discussion of the etymology of cá, which is also intriguingly connected to the history of the words ketchup and catsup.

    (i) Etymological pathway:
    • 魚 yú 'fish' < OC *ŋa

    • Reflexes: VS cá; SV ngư

    • Pattern: ŋ‑ > MC ŋjw‑ > SV ngư; ŋ‑ > VS k‑.

    • The alternation /ng‑ ~ k‑/ is common in laryngeal sound changes, often mediated by g‑, gh‑, kh‑.

    • Parallel example: 鷄 jī → VS gà 'chicken'.

    (ii) Expressions and compounds with cá
  • 打魚 dǎyú → đánhcá 'net fishing'
  • 釣魚 diàoyú → câucá 'fishing'
  • 撈魚 lāoyú → lướicá 'net fishing'
  • 捕魚 bǔyú → bắtcá 'catch fish'
  • 魚刺 yúcì → xươngcá 'fish bone'
  • 咸魚 xiányú → cámặn 'salted fish'
  • 煎魚 jiānyú → cáchiên 'fried fish'
  • 魚腥 yúxīng → tanhcá 'fishy'
  • 脯魚 fǔyú → khôcá 'fish jerky'
  • 鯨魚 jīngyú → kìnhngư 'whale'
  • 如魚得水 rúyúdéshuǐ → nhưcágặpnước 'like a fish back in water'
  • 大魚吃小魚 dàyúchīxiǎoyú → cálớnnuốtcábé 'big fish devour small fish'

Meanwhile, the use of Sino‑Vietnamese ngư (漁) in place of cá  –  the latter often regarded as the more "purely Vietnamese" word  –  is entirely natural in Vietnamese. Examples include:

  • 漁船 yúchuán → ngưthuyền 'fishing boat'
  • 漁港 yúgǎng → ngưcảng 'fishermen’s wharf'
  • 漁夫 yúfū → ngưphủ 'fisherman'
  • 漁民 yúmín → ngưdân 'fishermen'
  • 漁翁 yúwēng → ngưông 'fisherman'
  • 鷸蚌相爭漁翁得利 yùbàngxiāngzhēng, yúwēngdélì → dẽtrai giànhnhau, ngưôngđắclợi 'the fisherman profits when the mussel and the snipe fight'

Interestingly, unlike English terms such as salmon or sturgeon, which stand alone, both Vietnamese and Chinese require the morpheme cá‑ or ‑魚 to form semantically complete names for specific fish. Without this element, the word is ambiguous. For example:

  • 魚婢 yúbì → cábóng 'small carp'
  • 墨魚 mòyú → cámực 'cuttlefish'
  • 紅魚 hóngyú → cáhồng 'snapper'
  • 鮭魚 guīyú → cáhồi 'salmon'
  • 京魚 jīngyú → cákình 'whale'
  • 鱘魚 xúnyú → cátầm 'sturgeon'
  • 鮐魚 táiyú → cáthu 'mackerel'
  • 鮪魚 wěiyú → cángừ 'tuna, horse mackerel'

It is often claimed that many of these Sinitic‑Vietnamese cá compounds are loanwords from Chinese. While some certainly are, the majority must be uniquely Vietnamese, inherited from ancient times, since the Việt‑Mường peoples lived along the shoreline and relied heavily on fishing.

The broader point is that "fish" concepts and lexicons are so deeply intertwined between Chinese and Vietnamese that no other Mon‑Khmer languages exhibit a comparable density of overlap. If this is the case, the direction of influence may even be reversed: although Austroasiatic languages are often mapped as “fish‑centered” in Southeast Asia, the etymology of cá does not fit neatly into the Austroasiatic or Mon‑Khmer picture, despite the oceanic geography.

4. "GẠO":

(i) Historical interpretation  –  When we turn to basic vocabulary for the main agricultural staple of the southern region  – 'rice'  –  the case of gạo is particularly instructive.

We have 稻 dào 'rice' corresponding to VS gạo and SV đạo  [ M 稻 dào < MC daw < OC *l'uːʔ | MC reading: 效開一上皓定 | Starostin: Vietnamese lúa is an archaic loanword; regular Sino‑Vietnamese is đạo. Protoform: ly:wH (~ɫ‑). Chinese 稻 lhu:ʔ 'rice, paddy'; Burmese luh 'grain, Panicum paspalum'; Kachin c^jəkhrau1 'paddy ready for husking'; Kiranti lV 'millet'. Per Schuessler: MC dâu < OC gləwʔ or mləwʔ ]

For a historical linguist, it is not difficult to see why both gạo and lúa could be variants of 稻 dào (SV đạo). It is likely that this was a Yue loanword into Chinese, originating in the southern regions where rice cultivation began. Maspero (1952) lists Vietnamese and Thai cognates in this connection.

In Daic, the form /khou3/ encompasses all three concepts – 'paddy' (unhusked rice), 'husked rice', and 'cooked rice'. In contrast, Mon‑Khmer has only /sro/, which has been suggested as a cognate for VS lúa 'paddy'. Vietnamese, however, distinguishes three separate terms that can be associated with those of Chinese:

  • lúa 'paddy' [ M 來 lái, lài, lāi (lai, lãi) < MC ləj < OC *mrɯːɡ ]
  • gạo 'husked rice' [ M 稻 dào < MC daw < OC *l'uːʔ ]
  • cơm 'cooked rice' [ M 粓 (泔) gān < MC kam < OC *kaːm | Wiktionary: Etymologically, from Proto-Vietic *kəːm ("cooked rice"); cognate with Arem kʌːm. According to Ferlus, a loan from Chinese 泔 (OC *kaːm, "water from washing rice; kitchen slops") (SV: cam). Semantically, compare Thai ข้าว (kâao, "rice; meal"), Khmer បាយ (baay, "rice; meal"), Chinese 飯 / 饭 (fàn, "rice; meal"), Korean 밥 (bap, "rice; meal"), Japanese ご飯 (gohan, "rice; meal"). ]

This lexical differentiation reflects the centrality of rice cultivation in Vietnamese culture. The key point is that these forms all evolved from the same root, diverging phonetically but retaining closely related meanings – similar to the case of betel nuts and leaves: 檳榔 bīnláng (ancient Annamese blau) vs. VS trầu vs. cau.

(ii) Sound correspondences  –  The interchange { /d‑/, /t‑/ ~ /g‑/ } is relatively rare in Chinese – Vietnamese correspondences. This does not mean that 稻 dào (SV đạo) must be cognate only with VS lúa on the basis of the { d‑ ~ l‑ } pattern posited by Starostin. Despite irregularities, we can still identify a set of cognates governed by the same sound change rule (at least six attestations):

  • 導 dào →  'to coach'
  • 倒 dǎo → gục 'collapse'
  • 陡 dǒu → gồ 'precipitous'
  • 逗 dòu → ghẹo 'tease'
  • 凸 tū → gồ 'protruding'
  • 佗 tuó →  'that fellow'
  • 駝背 tuóbèi → gùlưng 'hunchback'
  • 大膽 dàdǎn → cảgan 'dare to'
  • 託付 tuòfù → gởigấm 'entrust'
  • 陶器 táoqì → đồgốm 'pottery'

This evidence suggests that gạo, lúa, and SV đạo are not isolated anomalies but part of a broader pattern of correspondences linking Vietnamese and Chinese agricultural vocabulary.

5. "ĐẤT":

We have 土 tǔ (soil): 'thổ', 'độ', 'đỗ' (SV) [ M 土 tǔ, dù (thổ, độ, đỗ) < MC thʰɔ, duo < OC *l̥ʰaːʔ, *l'aːʔ (Li Fang-Kuei : OC *dagx ) | FQ 他魯 | MC reading 遇合一上姥透 | According to Starostin: MC tho < OC *tha:ʔ (Note the final -ʔ). Also used for *d(h)a:ʔ (MC do, Pek. dù) roots of mulberry tree.]

The sound change fits the initial pattern { ¶ /t- ~ đ-/ }:

  • 突 tù → đột 'suddenly'
  • 圖 tú → đồ 'drawing'
  • 吐 tù → thổ 'vomit'
  • 唐 táng → đường 'path'
  • 談 tán → đàm 'talk'
  • 壇 tán → đàn 'platform'
  • 腿 tuǐ → đùi 'lap'
  • 痛 tòng → đau 'pain'
  • 頭 tóu → đầu 'head'
  • 踏 tǎ → đạp 'tread'
  • 條 tiáo → điều 'article'
  • 點 diǎn → điểm 'point'
  • 毒 dú → độc 'poisonous'
  • 督 dù → đốc 'urge'
  • 櫝 dú → tủ 'cabinet'
  • 讀 dú → đọc 'read'

(i) Final correspondences {/‑Ø/ ~ /‑t/}

  • 必 bì → tất 'inevitable'
  • 室 shì → thất 'chamber'
  • 七 qī → thất 'seven'
  • 漆 qī → tất 'lacquer'
  • 疾 jí → tật 'illness'
  • 悉 xì → tất 'entire'
  • 乞 qí → khất 'beg'
  • 不 bù → bất 'not'
  • 畢 bì → tốt 'graduate'
  • 卒 zú → tốt 'private, soldier'
  • 燒 shāo → đốt 'burn'
  • 忽 hù → hốt 'neglect'
  • 突 tù → đột 'sudden'

(ii) Compounds with đất:

  • 土地 tǔdì → đấtđai 'land'
  • 地帶 dìdài → đấtđai 'stretch of land'
  • 土鼊 tǔpì → bọđất 'beetle'
  • 領土 lǐngtǔ → mảnhđất 'territory'
  • 地面 dìmiàn → mặtđất 'earth’s surface'
  • 地塊 dìkuài → cụcđất 'piece of soil'
  • 塊地 kuàidì → khoảngđất 'piece of land' [ Another case of  "binoms": 一 塊地 yī kuàidì 'one piece of land']
  • 地域 dìyù → vùngđất 'region'
  • 田地 tiándì → ruộngđất 'farming land'
  • 地球 dìqiú → quảđất 'globe'

Note on doublets  –  In Chinese, 地 dì (SV địa, VS đất, 'earth') is a later derivative and doublet of 土 tǔ. This strengthens the case for tǔ = đất, as seen in compounds like 土地 tǔdì. Doublet forms are common in Chinese, having evolved from different sources  –  for example, 首 shǒu and 頭 tóu both correspond to đầu 'head'.   

6. "ĐỐT":

For 燒 shāo (SV thiêu) ~ VS đốt 'to burn' [M 燒 shāo, shào < MC ɕiaw < OC *hŋjew, *hŋjaws | ¶ /sh‑ ~ đ‑/], it is not difficult to see that initials sh‑, th‑ (or even s‑) can yield VS đ‑. This alternation is evident in several reflexes of 燒 shāo:

  • 發燒 fāshāo (SV phátthiêu) → VS phátsốt 'have a fever'
  • 燒香 shāoxiāng → VS thắpnhang, thắphương 'burn incense' (equivalent to đốtnhang, đốthương)

Through reduplication and localization, 燒+燒 shāo+shāo also generated the compound thiêuđốt in Vietnamese. A parallel development can be seen with 少 shǎo (SV thiếu) ~ VS sót, yielding thiếusót 'shortage', equivalent to 缺少 quēshǎo.

In addition to the pattern { th‑ ~ đ‑ } that links 土 tǔ with VS đất, the case of 燒 shāo (SV 'thiêu' ~> VS đốt) suggests that /sh‑/ > /th‑/ may itself have evolved from an older /đ‑/. The Nôm form đốt may in fact predate thiêu, since the /th‑/ initial is confined to the Sino‑Vietnamese layer (a Middle Chinese reflex), whereas Old Chinese and Archaic Chinese already had a voiced /d‑/ initial, traces of which survive in dialects such as Hainanese and Amoy.

    Other examples of the pattern /sh‑ ~ đ‑/:

    • 生 shēng → đẻ 'give birth' (cf. Hainanese /te1/)
    • 深 shēn → đậm 'dark' (SV thâm)
    • 首 shǒu → đầu 'head' (SV thủ; doublet of 頭 tóu → SV đầu)
    • 盛 shèng → đựng 'contain' (SV thịnh)
    • 世 shì → đời 'life' (SV thế)
    • 石 shí → đá 'stone' (SV thạch) [cf. 石 dàn → tạ 'unit of weight']
    • 水 shuǐ → nước 'water' (SV thuỷ) [cf. Viet‑Mường đák 'water'; cf. 踏 tà → đạp 'tread']

This evidence shows that the alternation /sh‑ ~ đ‑/ is not an isolated anomaly but part of a broader pattern of correspondences linking Sino‑Vietnamese and vernacular Vietnamese strata.

7. "LỬA":

We have 火 huǒ 'fire' → VS lửa, SV hoả [M 火 huǒ, huō < MC hwa < OC *qʰʷaːlʔ ], which illustrates the pattern { ¶ /h(w)‑/ ~ /l‑/ }:

  • 話 huà → lời 'spoken word'
  • 混 hún → lộn 'confused'
  • 宏 hóng → lớn 'large'
  • 很 hěn → lắm 'much'
  • 灣 wān → loan 'bay'
  • 大伙 dàhuǒ → cảlũ 'the whole group'
  • 同夥 tónghuǒ → đồngloã 'accomplice'
  • 裸體 luǒtǐ → loãthể 'naked' [cf. phonetic stem 果 guǒ: SV quả /wa3/]

Compounds with 火 huǒ:

    Numerous disyllabic compounds built on 火 huǒ are also preserved in Vietnamese, often with both SV and vernacular forms in parallel use:

    • 火車 huǒchē → xelửa (SV hoảxa) 'train'
    • 火箭 huǒjiàn → tênlửa (SV hoảtiễn) 'rocket'
    • 救火 jiùhuǒ → chữalửa (SV cứuhoả) 'firefighting'
    • 火燒 huǒshāo → lửacháy (SV hoảthiêu) 'burn'

    As with ngư 'fish', the SV forms with hoả remain in frequent use alongside their vernacular counterparts, demonstrating the layered coexistence of Sino‑Vietnamese and native vocabulary in core semantic domains.

    8. "CON":

    We have VS con ~ 子 zǐ 'child, son' (SV tử).

    • M 子 zī, zǐ, zì, zí, zi, cí (tử, tý) < MC tsɨ < OC *ʔslɯʔ | According to Starostin: meanings include 'child, son, daughter, young person; prince; a polite substitute for "you"'. Also read ʔslɯʔs, MC tsɨ, Mand. zì 'to treat as a son'. Cf. 字 *zlɯs 'to breed'. The character is also used for a homonymous word *ʔslɯʔ 'the first of the Earthly Branches' (SV tý). Cf. Dialects: Cantonese 仔 /zei3/ 'son'. In Fuzhou (Fukienese) represented as 囝 kiaŋ (M jiǎn); in Xiamen (Amoy) /kẽ/; in Hainanese /ke1/, all phonetically close to VS con. This suggests that con may derive from Austroasiatic kiã 'son, child', or may be a cognate with 子 zǐ.

    (i) Compounds and affixal usage –  The lexeme appears in numerous Chinese compounds, where it functions as an affix with extended meanings:

    • 父子 fùzǐ → bốcon 'father and son'
    • 母子 mǔzǐ → mẹcon 'mother and son'
    • 子孫 zǐsūn → concháu 'children and grandchildren'
    • 孩子 háizi → concái 'children'
    • 幼子 yòuzǐ → connhỏ 'child'
    • 長子 zhǎngzǐ → contrưởng 'eldest son'
    • 棋子 qízǐ → concờ 'checker piece'
    • 刀子 dāozi → condao 'knife'
    • 猴子 hóuzi → conkhỉ 'monkey'

    (ii) Sound correspondence  –  For the interchange { ¶ /C‑/ ~ /K‑/ }, we find further examples:

    • 存 cún → VS còn 'exist'
    • 擦 cā → VS  'rub'
    • 餐 cān → VS cơm 'meal'

    9. "SAO": 

    We have VS sao ~ 星 xīng 'star' (SV tinh, VS sao, tạnh, 'star', 'clear sky after rain') [ M 星 xīng < MC seŋ < OC *sleːŋ | MC reading: 梗開四平青心 | FQ 桑經 | ZYYY: sijəŋ1 ]

    Dialectal reflexes:

    • Hainanese: se11 [cf. 生 shēng → VS đẻ, Hai. /te1/]
    • Hankou: ʂin11
    • Sichuan: ʂin11
    • Yangzhou: ʂĩ11
    • Chaozhou: sin11
    • Changsha: sin11
    • Shuangfeng: ʂin11, ʂiõ11
    • Nanchang: ʂin11, ʂiaŋ11

    Note: What remains problematic in relating VS sao to 星 xīng (SV tinh) is the absence of a rounded final. This parallels the case of 痛 tòng (SV thống) ~ VS đau 'pain', where the expected rounded coda is also lacking.

    10. "LÁ":

    We have 葉 yè 'leaf' (SV diệp) [ M 葉 yè, dié, shè, xiè < MC jiap, ɕiap < OC *leb, *hljeb | Comparative evidence: Tibetan ldeb 'leaf, sheet'; Burmese ɑhlap 'petal'; Kachin lap2 'leaf'; Lushei le:p 'bud'; Lepcha lop 'leaf'; Rawang ʂɑ lap 'leaf' (used to wrap dumplings); Trung ljəp1 'leaf'; Bahing lab. This cluster shows that many Tibeto‑Burman languages preserve a form close to lá. In effect, there are well over one hundred words that register the pattern OC l‑ > MC j‑. Cf. 聿 yù → 律 lǜ (illustrating the same shift). ]

    The pattern { ¶ /y‑/ ~ /l‑/ }:

    Hence, more broadly { /l‑/ ~ /y‑/, /v‑/, /r‑/ } in both Vietnamese and Chinese. Numerous examples illustrate this interchange, especially where OC l‑ alternates with M /l‑/ ~ /y‑/:

    • 藥 yào (SV dược 'medicine') ~ 樂 lè (SV lạc 'happy') [¶ /y‑/ ~ /l‑/]
    • 葉 yè → lá 'leaf'
    • 搖 yáo → lay 'shake'
    • 腰 yāo → lưng 'lower back'
    • 異 yì → lạ 'strange'
    • 陰 yīn → lồn 'female genital'
    • 蠅 yíng → lằng 'bluebottle'
    • 游 yóu → lội 'swim'
    • 籬 lí → dậu 'hedge'
    • 冽 liè → rét 'chill'
    • 離 lí → rời 'leave'
    • 落 luò → rơi 'drop'

    11. "UỐNG":

    We have VS uống ~ 飲 yǐn 'drink' (SV ẩm) [also VS  /jo1/ | M 飲 (飮) yǐn < MC ʔjim, ʔɯim < OC *qrɯmʔ, qrɯms ]

    a. Broader context of correspondences  –  Beyond the internal relationships among Chinese dialects themselves  –  many of which reflect the heavy influx of Han (Ancient Chinese) and Middle Chinese strata layered over aboriginal lexicons in Cantonese, Fukienese, and others  –  Vietnamese shows remarkable closeness with several Chinese dialects. This closeness often suggests possible kinship rather than mere borrowing, as attested by basic words such as:

    • 魚 yú → VS cá, SV ngư 'fish'
    • 葉 yè → VS lá, SV diệp 'leaf'
    • 面 miàn → VS mặt, SV diện 'face'
    • 飲 yǐn → VS uống, SV ẩm 'drink'

    Their etymological resemblance is in fact stronger than the parallels often proposed within Mon‑Khmer or Sino‑Tibetan.

    The sound change patterns and interchanges outlined above provide only a general picture, without regard to precise spatial or temporal specifics. With deeper research, more detailed postulations can be made.

    b. Sound change rules: Nguyễn Ngọc San’s rules (1993: 154 – 160)

    Nguyễn Ngọc San (NNS) summarized ancient sound changes that should be considered as rules:

    i. Initial /ch‑/ existed before the split of Vietnamese and Mường. Words with /tr‑/ appeared later, reflecting clusters /bl‑/, /tl‑/ that shifted in the 17th century. Many fundamental words for utensils, kinship, tools, animals, and insects preserve /ch‑/, not /tr‑/, and correspond to Chinese forms with /zh‑/, /z‑/, /sh‑/, /j‑/ in Mandarin, for example,

    • 帚 zhǒu → VS chổi 'broom'
    • 樽 zūn → VS chai 'bottle'
    • 姊 zǐ → VS chị 'older sister'
    • 侄 zhí → VS cháu 'nephew'
    • 叔 shū → VS chú 'paternal uncle'
    • 走 zǒu → VS chạy 'run'
    • 棹 zhào → VS chèo 'oar'
    • 煎 jiān → VS chiên 'fry'
    • 鼠 shǔ → VS chuột 'rat'

    ii. Pre‑Hán‑Việt forms are often older than Hán‑Việt (Sino‑Vietnamese) forms, e.g.,

    • chay (pre‑SV) → SV trai 齋 zhāi 'vegan'
    • chày → SV trì 遲 chí 'slow'
    • chém → SV trảm 斬 zhǎn 'cut off'
    • chén → SV trản 盞 zhǎn 'bowl'
    • chè → SV trà 茶 chá 'tea' (cf. Hainanese /dje/, Chaozhou /te/)
    • chừa → SV trừ 除 chú 'exclude'
    • chứa → SV trữ 儲 chǔ 'store'
    • chuyền → SV truyền 傳 chuán 'transmit'
    • chuyện → SV truyện 傳 zhuàn 'story'

    These pre‑Hán‑Việt forms /ch- ~ tr-/ still reflect Old Chinese, possibly of Yue origin.

    iii. Consonantal clusters /bl‑/, /tl‑/ (pre‑17th century) shifted into /tr‑/, /gi‑/, or /l‑/. Thus, lexical doublets with /tr‑/, /gi‑/, or /l‑/ should historically be spelled with /ch‑/ and /gi‑/. Examples:

    • trời ~ giời
    • trầu ~ giầu
    • trăng ~ giăng
    • trùn ~ giun
    • trôn ~ lồn
    • trũng ~ lũng

    iv. Pre‑glottal /ʔ‑/ (before the 12th century) shifted into /d‑/ and /nh‑/. Doublets confirm this:

    • dăn ~ nhăn
    • dặng ~ nhặng
    • dơ ~ nhơ
    • dỡ ~ nhỡ
    • dồi ~ nhồi
    • dức ~ nhức

    v. Palatalization during the Viet‑Mường split: /ch‑/ [ʨ] > /gi‑/ [z‑]. Doublets attest this:

    • cha ~ già
    • chi ~ gì
    • chói ~ giọi
    • chuỳ ~ giùi
    • chừ ~ giờ
    • chủng ~ giống

    vi. Other palatalization remnants appear in doublets with systematic sound changes:

    • /ch‑/ [ʨ ~ x‑/s‑]: 
      • chẻ ~ xẻ, xé
      • chiên ~ xiên
      • chòm ~ xóm
      • chen ~ xen
      • chếch ~ xếch
      • chao ~ xào
    • /đ‑/ [d ~ d‑/j-]: 
      • đã(cơn) ~ dã(cơn)
      • đứt ~ dứt
      • đao ~ dao
      • đập ~ dập
      • đình ~ dừng
      • đướn ~ dưới
      • đạy(học) ~ dạy(học)
      • đun(đẩy) ~ dun(dẩy)
      • (chỉnh)đốn ~ dọn(dẹp)
      • (cây)đa ~ (cây)da
    • Initials /gi‑/ and /ch‑/
    NNS noted that Vietnamese words with lower‑registered tones are likely of pure vernacular origin.

    • SV words with present initial /gi‑/ [z] derive from earlier /ch‑/ [ʨ].
    • Hence, SV words with /gi‑/ carry upper tones: gia, giá, giả, gian, gián, giản, giang, giáng, giảng, giam, giám, giảm.
    • SV words with present initial /ch‑/ come from voiceless MC /tɕ’/ (章 chương series).
    • These also carry upper tones: chu, chú, chủ, chương, chướng, chưởng, chân, chấn, chẩn, chi, chí, chỉ, chư, chử, chích.
    • Similarly, SV words with kh‑ from MC /k’/ carry upper tones: khai, khái, khải, kha, khắc, khâm, khí, khi, khiếp, khuyển, khánh, khuyết, khoáng, khoa, khoái, khủng, khứ, khúc.

    The correspondences between Vietnamese and Chinese  –  illustrated by uống ~ 飲 yǐn and many other basic words  –  point to a depth of relationship that goes beyond simple borrowing. The systematic sound changes documented by Nguyễn Ngọc San demonstrate how pre‑Hán‑Việt, Hán‑Việt, and vernacular layers interweave, leaving behind doublets and phonological patterns that still shape Vietnamese today.

    Interestingly, most of the examples cited by Nguyễn Ngọc San further strengthen the Vietnamese ~ Chinese correspondences, as with 'đ-' /d-/ ~ 'd-' /j-/ in their affiliated etyma, e.g., 過癮 guòyǐn ~ #đã(cơn) ~ #dã(cơn) (where 癮 yǐn is localized as 'dã' and 過 guò as 'cơn'), or, especially, 教 jiāo (giáo > đạo > đạy > dạy, 'teach'). Identifying the matching Chinese cognates for the remaining words cited above is left as an exercise for future historical linguists of Vietnamese in the next worksheets.

    vii. Short vowels and tonal registers:

    The alternation of long vowels into short ones in Vietnamese is attested in certain Central subdialects:

    • Long /e/ in Quảngngãi and Bìnhđịnh,

    • Long /o/ in Nghệtĩnh.

    These deviant pronunciations are not part of modern standard Vietnamese.

    Nguyễn Ngọc San suggested that Sino‑Vietnamese words can often be distinguished from vernacular Vietnamese simply by their phonological appearance (Nguyễn Ngọc San. Ibid. p. 158).

    viii. Voiceless Middle Chinese and tone registers:

    When Middle Chinese (MC) words with voiceless initials entered the SV stock, they evolved into forms with upper‑registered tones, not the lower‑registered tones (marked in Quốc ngữ with /~/ and /./).

    • SV words beginning with the glottal /ʔ‑/ (not transcribed in modern Quốc ngữ) – e.g., an, anh, ang, ong, ông, ếch – carry upper‑registered tones: a, ả, á, an, án, ám, ung, úng, ủng, ôn, ổn, âm, ấm, ẩm.

    • By contrast, words with lower‑registered tones are considered native Vietnamese.

    ix. Initials /gi‑/ and /ch‑/:

    NNS noted that Vietnamese words with lower‑registered tones are likely of pure vernacular origin.

    x. Voiced initials and lower tones: 

    Nguyễn Ngọc San also established that SV words with initials /m‑/, /n‑/, /nh‑/, /ng‑/, /l‑/ derive from voiced MC initials:

    • /m‑/ (明 minh), /n‑/ (泥 nê), /ɲ‑/ (日 nhật), /ŋ‑/ (疑 nghi), /l‑/ (來 lai).

    • Other voiced initials: /d‑/ (定 định), /mj‑/ (敏 mẫn, 明 minh div. II), /v‑/ from /w‑/ (雲 vân), /miw/ (微 vi).

    Because these were voiced, their SV reflexes also carry lower‑registered tones. Yet some show both upper and lower registers, for example:

      • viên ~ vườn
      • nương ~ nàng
      • nguyên ~ nguồn
      • ma ~ mè
      • lâm ~ lầm
      • lô ~ lò
      • văn ~ vằn
      • nam ~ nồm
      • linh ~ lành
      • du ~ dầu
      • di ~ dời

      etc.

    In short, those Sino-Vietnamese words with the initials /m-/, /n-/, /nh-/, /ng-/, /l-/, /v-/, /d-/ all carry either the level upper tone /一/ or the lower‑registered tones /~/ and /./:

    • m-
      • mao, mão, mạo, mi, mĩ, mị, ma, mã, mạ, mô, mỗ, mộ, mai, mãi, mại, môi, mỗi, etc.
      • Exception: miến, miếu
    • n-
      • nao, não, nạo, nô, nỗ, nộ, niêm, niệm, niên, nịch, etc.
      • Exception: nùng, náo, niết
    • nh-
      • nhi, nhĩ, nhị, nhân, nhẫn, nhận, như, nhữ, nhụ, nhung, nhũng, nhụng, nhiệm, nhiệt, nhuận, nhan, nhãn, nhạn, etc.
      • Exception: nhất, nhiếp, nhuế [ cf. VS một, 'one' ]
    • ng-
      • nga, ngã, ngại, ngãi, ngoa, ngôn, ngưỡng, nghĩa, ngữ, nguyện, ngọc, etc.
      • Exception: ngải
    • l-
      • lao, lão, lạo, lai, lãi, lại, lung, lãng, lạng, lâm, lẫm, liêu, liễu, liệu, lê, lễ, lệ, lô, lỗ, lộ, luật, lịch, etc.
      • Exception: lý, lánh
    • v-
      • vi, vĩ, vị, viên, viễn, viện, vinh, vĩnh, vịnh, vu, vũ, vụ, vãn, vạn, vong, võng, vọng, etc.
      • Exception: vấn
    • d-
      • di, dĩ, dị, dung, dũng, dụng, diên, diễn, diện, dục, do, duyệt, etc.
      • Exception: vấn

    Notes: The tonal rule above, namely that Sino-Vietnamese words with the initials m-, nh-, v-, l-, d-, ng- carry the level upper tone /一/ or the lower‑registered tones /~/ and /./, can be remembered through the Vietnamese mnemonic clause "Mình nhớ viết là dấu ngã" (Nguyễn Tài Cẩn. 2000). All other words are written with the tones /`/, /ʔ/, /./


    Table 1 - Tonal sound change rules show a systematic division

    • Upper tones: SV words from voiceless MC initials (/ch‑/, /kh‑/, /ʔ‑/).
    • Lower tones: SV words from voiced MC initials (/m‑/, /n‑/, /nh‑/, /ng‑/, /l‑/, /v‑/, /d‑/).
    • Mnemonic (SV): Mình nhớ viết là dấu ngã.
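
    The tonal division summarized above lends itself to a quick mechanical check. What follows is a rough sketch under stated assumptions rather than a tool from NNS or Nguyễn Tài Cẩn: it reads the tone of a Quốc ngữ syllable from its diacritic and tests whether a syllable with one of the sonorant‑type initials carries the level tone, ngã, or nặng, as the rule predicts. The function names and the `EXPECTED_TONES` set are illustrative choices; the initial list follows the paragraph above.

```python
import unicodedata

# Combining marks that encode the five marked tones of Quốc ngữ
TONE_MARKS = {
    "\u0300": "huyền",  # grave
    "\u0301": "sắc",    # acute
    "\u0309": "hỏi",    # hook above
    "\u0303": "ngã",    # tilde
    "\u0323": "nặng",   # dot below
}

# Initials covered by the rule; longer spellings first. "đ" is a separate
# letter and is deliberately not matched by "d".
RULE_INITIALS = ("ngh", "ng", "nh", "m", "n", "l", "v", "d")

# Tones the rule predicts for those initials (assumption taken from the text)
EXPECTED_TONES = {"ngang", "ngã", "nặng"}


def tone_of(syllable):
    """Return the tone name of one syllable; unmarked syllables are 'ngang'."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"


def follows_rule(syllable):
    """True/False if the rule applies and is met/violated; None if it does not apply."""
    s = syllable.lower()
    if not s.startswith(RULE_INITIALS):
        return None
    return tone_of(s) in EXPECTED_TONES


for word in ["mão", "nhật", "lễ", "miếu", "vấn", "công"]:
    print(word, tone_of(word), follows_rule(word))
```

    Run on the lists above, such a check flags exactly the kind of items recorded as exceptions (miếu, vấn, nhất), while syllables with other initials, such as công, fall outside the rule and return None.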


    c. Pulleyblank's rules on Middle Chinese finals and reflexes

    i. Introduction - Edwin G. Pulleyblank’s reconstruction of Early Middle Chinese (EMC) is distinctive for its complex finals, which combine medials and codas. Two of the most revealing are /‑wŋ/ (rounded medial + velar nasal) and /‑wkp/ (rounded medial + velar stop cluster). These finals explain why Sino‑Vietnamese and southern Chinese dialects preserve contrasts that Mandarin has largely merged. (2)

    ii. The final /‑wŋ/:

    • Structure: /‑w‑/ + /‑ŋ/
    • Rhyme groups: 東 (tung), 鍾 (chung)
    • Tone: level or rising (平, 上)
    • Reflexes:
      • Sino‑Vietnamese: ‑ông, ‑ung, ‑uông
      • Cantonese: ‑ung
      • Mandarin: ‑ong
      • Hokkien/Min: ‑ong
    • Examples:
      • 工 (koŋ) → SV công, Cantonese gung1, Mandarin gōng, Hokkien kang/khong
      • 公 (koŋ) → SV công, Cantonese gung1, Mandarin gōng, Hokkien kong
      • 風 (pʰuwŋ) → SV phong, Cantonese fung1, Mandarin fēng, Hokkien hong

    iii. The final /‑wkp/

    • Structure: /‑w‑/ + /‑k/ + /‑p/ (complex stop coda)
    • Rhyme groups: 屋 (uk), 緝 (ip)
    • Tone: entering (入聲)
    • Reflexes:
      • Sino‑Vietnamese: ‑ục, ‑uốc, ‑iệp
      • Cantonese: ‑uk, ‑ip
      • Mandarin: merged into ‑u or ‑i endings (no entering tone preserved)
      • Hokkien/Min: ‑ok, ‑ip
    • Examples:
      • 局 (gjuk) → SV cục, Cantonese guk6, Mandarin jú, Hokkien kiok/kiuk
      • 業 (ŋjɛwkp) → SV nghiệp, Cantonese jip6, Mandarin yè, Hokkien giap


    Table 2 - Dialectal final comparanda
    with Pulleyblank's Early Middle Chinese (EMC)

    Character   EMC       Final     Sino‑Vietnamese   Cantonese   Mandarin   Hokkien/Min
    工          koŋ       /‑wŋ/     công              gung1       gōng       kong/khong
    公          koŋ       /‑wŋ/     công              gung1       gōng       kong
    風          pʰuwŋ     /‑wŋ/     phong             fung1       fēng       hong
    局          gjuk      /‑wkp/    cục               guk6        jú         kiok/kiuk
    業          ŋjɛwkp    /‑wkp/    nghiệp            jip6        yè         giap

    Notes: Pulleyblank’s recognition of /‑wŋ/ and /‑wkp/ as distinct finals provides a framework that:

    • Explains rounded vowels in Sino‑Vietnamese and southern dialects.
      Example: 工 / 公: Both reconstructed as koŋ with final /‑wŋ/, yielding SV công. Their meanings diverge: 工 ‘work, craft’ vs. 公 ‘public, official’.
      Example: 風: Classic example of /‑wŋ/ → SV phong, with nasal coda preserved.

    • Accounts for nasal vs. stop codas and their tonal consequences (level vs. entering).
      Example: 局: /‑wkp/ final explains SV cục, with entering tone preserved.

    • Shows cross‑linguistic consistency: Sino‑Vietnamese, Cantonese, and Hokkien preserve contrasts that Mandarin has simplified.
      Example: 業: Complex medial + /‑wkp/ final → SV nghiệp, Cantonese jip6, Hokkien giap.
      Example: 局: /‑wkp/ final explains SV cục, Cantonese guk6, and Hokkien kio̍k / kia̍k / ke̍k.

    This comparative view demonstrates how Pulleyblank’s reconstructions bridge medieval phonology with modern reflexes, offering a clear map of historical sound change across the Sinitic world. (P) 
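
    The point about stop codas can be made concrete with a small check. This is a minimal sketch, not part of Pulleyblank's framework: it strips tone marks and tone digits from each romanized form and asks whether the syllable still ends in a stop. The forms for Sino‑Vietnamese, Cantonese, and Hokkien come from the examples above; the Mandarin readings jú and yè are the standard pinyin for 局 and 業, supplied here for the comparison. The names `plain`, `has_stop_coda`, and `varieties_keeping_stop` are hypothetical.

```python
import unicodedata

# Spellings of a final stop in the romanizations used above
# (Quốc ngữ writes final /k/ as -c or -ch)
STOP_CODAS = ("p", "t", "k", "c", "ch")

# Entering-tone examples; Mandarin forms are standard pinyin, added for comparison
ROWS = {
    "局": {"SV": "cục", "Cantonese": "guk6", "Mandarin": "jú", "Hokkien": "kiok"},
    "業": {"SV": "nghiệp", "Cantonese": "jip6", "Mandarin": "yè", "Hokkien": "giap"},
}


def plain(form):
    """Lower-case a romanized form and drop diacritics and tone digits."""
    decomposed = unicodedata.normalize("NFD", form)
    kept = [
        c for c in decomposed
        if not unicodedata.combining(c) and not c.isdigit()
    ]
    return "".join(kept).lower()


def has_stop_coda(form):
    """True if the syllable, once stripped, ends in a stop consonant."""
    return plain(form).endswith(STOP_CODAS)


def varieties_keeping_stop(row):
    """Names of the varieties whose reflex still ends in a stop."""
    return [name for name, form in row.items() if has_stop_coda(form)]


for char, row in ROWS.items():
    print(char, varieties_keeping_stop(row))
```

    For both characters the check returns Sino‑Vietnamese, Cantonese, and Hokkien but not Mandarin, which matches the contrast drawn in the notes. A nasal‑final form such as công is correctly reported as having no stop coda.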

    III) Theoretical framing

    • Sturtevant’s paradox: regular sound change produces irregularity; analogy produces regularity.

    • Corollary forms exemplify this paradox – irregular borrowings become regularized through analogy.

    • Comparative parallels in other Sino‑Xenic languages (Japanese, Korean) reinforce the universality of this process.

    A. Words of unknown origin

    Understandably, unlike English, for which a dictionary such as Webster's gives etymologies for virtually every word, we will probably not find Chinese cognates for all Vietnamese words by applying the approaches and principles discussed here. Some rules, as cited above, apply to Sino-Vietnamese but not to Sinitic-Vietnamese. Apart from apparent loanwords from Khmer, such as "cápduồn" (ethnic lynching) or "hầmbàlằng" (mixed bag), many Vietnamese words are of questionable origin, with roots that sometimes look only dubiously Chinese.

    1. The layered lexicon of Vietnamese

    Vietnamese vocabulary is not monolithic. It contains four distinct strata:

    1. Sino‑Vietnamese (SV): Regular, rule‑governed borrowings from Middle Chinese.
    2. Sinitic‑Vietnamese (VS): Irregular, dialectal, or Yue‑substratal borrowings.
    3. Khmer loans: Clear Austroasiatic contributions.
    4. Unknown substratum: Words with no secure Chinese or Mon‑Khmer cognates, often in plants, body parts, or colloquial speech.

    2. Khmer loans (clear Austroasiatic layer)

    • cápduồn = ‘lynching’
    • hầmbàlằng = ‘mixed bag’
    → These are transparent Khmer borrowings, not Chinese.

    3. Dubious Chinese parallels (irregular Sinitic‑Vietnamese)

    • eo (‘waist’) vs. 腰 yāo (‘waist’) → plausible but irregular.
    • lưng (‘back’) ← 脊 jǐ (‘spine’) < 脊梁 jǐliáng (‘backbone’).
    • vác (‘to shoulder’) vs. 背 bèi (‘carry on back’).
    • vai (‘shoulder’) has no clear Chinese cognate (cf. 肩膀 jiānbǎng).
    • màngtang (‘temple’) vs. 太陽穴 tàiyángxué.
    • mỏác (‘skull top’) vs. 囟門 xìnmén.
    • cùichỏ (‘elbow’) vs. 胳膊肘 gēbozhǒu.
    → These resist both Mon‑Khmer and Chinese etymologies, suggesting a substratal layer.

    4. Anatomical vocabulary of speculative origin

    In effect, while words of the same contextual nature cannot be found in any Austroasiatic Mon-Khmer language, we can still cite an impressive list of questionable Vietnamese words of unidentified substratum: names of many tropical plants and fruits and of other non-cultural items such as insects and fish, e.g.,

    • nho ('grapes') [ Chinese 栲 kǎo (?) < MC kʰaw < OC *kʰluːʔ, cf. Mon-Khmer Pacoh nho, wild inedible berry. ]
    • thơm ('pineapple') vs. 鳳梨 fènglí
    • ổi ('guava') vs. 安石榴 ānshíliú
    • soài ('mango') vs. 檨 shē
    • mận ('wax apple') vs. 蓮霧 liánwù
    • khế ('starfruit') vs. 芅 yì or 萇 cháng
    → No secure Mon‑Khmer or Chinese cognates; likely indigenous or Yue substratum.

    However, by way of corollary, we may retain the following literary examples as a counterbalance, since many other Vietnamese words also appear in fixed idiomatic contexts and conceptual compounds where at least one syllable is plausibly cognate with Yue or Chinese forms. 

    For example, a striking case is the set of animal names in the zodiac as discussed in several instances: these may well be of southern Yue origin, though Chinese scholars have generally resisted such a hypothesis. Instead, their prevailing view  –  that Cantonese, Fukienese, and Wu dialects can be subsumed under the Sino‑Tibetan umbrella on the basis of their dominant Chinese glosses, while Vietnamese is excluded  –  has been widely accepted in Western linguistics, despite its shortcomings.

    • ngựa: 午 (SV ngọ), the seventh animal in the Chinese and Vietnamese zodiac, ‘horse’. Compare ancient Annamese bàngựa ("old horse lacking a herdsman", Đại Nam Quốc âm Thi tập by Nguyễn Trãi). This stands in contrast to Chinese 馬 mǎ (SV mã) (NNS 1993:163).
    • heo: 亥 (SV hợi), the twelfth zodiac animal, 'pig'. Compare vernacular lợn and the Chinese 腞 (豘) tún, dùn (SV độn).
    • Other zodiac animals include 'dê' 未 wèi (SV mùi, ‘goat/sheep’), and notably mèo ('cat') 卯 mǎo (SV mẹo) ~ 貓 māo (SV miêu). Ancient Chinese tradition replaced the cat with 兔 tù (SV thố, VS thỏ, 'rabbit'), likely for superstitious reasons. Yet Western scholarship has continued to accept the identification of 卯 mǎo with "rabbit".

    To set these in contrast, we may also cite culturally embedded Vietnamese terms whose syllables align with Chinese elements, even if irregularly:

    • đànbà: 婦道 fùdào ('woman'). Here 道 dào corresponds to đàn, and 婦 fù to bà (cf. 婆 pó 'old woman').
    • đànông: 乾道 qiándào ('man'). 乾 qián resonates with ông (cf. 公 gōng, 'duke, lord'), while 道 dào again parallels đàn.
    • congái: 嬌娃 jiāowá ('girl'). 嬌 jiāo approximates con/cô, 娃 wá aligns with gái.
    • contrai: 仔仔 zǐzǐ ('boy', Cantonese zai2zai2). 仔 zǐ ('child') parallels con, trai (cf. 公子 gōngzǐ 'young master').

    Other everyday expressions show similar layering of Chinese models with vernacular usage:

    • ăncơm 吃飯 chīfàn ('eat rice'), cf. 食飯 shífàn (xơicơm)
    • cơmnắm 飯糰 fàntuán ('rice ball')
    • ănmày 要飯 yàofàn ('beggar')
    • bữa 飯 fàn (‘meal’), extended to ban/buổi (‘time period’), yielding compounds such as:
      • bantrưa 白晝 báizhòu ('noon')
      • banngày 白日 báirì ('daytime')
      • banhôm 傍晚 bàngwǎn ('dusk')
      • banđêm 晚上 wǎnshàng ('night')
      • bankhuya 半夜 bànyè ('midnight')
    • mồhôi 冒汗 màohàn ('sweat')
    • buồngngủ 臥房 wòfáng ('bedroom'), where 臥 wò (SV ngoạ) is reinterpreted as 'ngủ'.
    • hôicủa 盜劫 dàojié ('rob, loot'), alongside VS trộmcắp, trộmcướp.

      B. Questionable words of Chinese origin

      Beyond the obvious loanwords from Chinese  –  those whose phonology and semantics clearly align  –  there remains a long list of Vietnamese words that appear suggestive of Chinese origin but whose status is uncertain. Many of these items, while resembling Chinese forms, may also be connected to Mon‑Khmer, or even to other regional sources such as Malay or Thai. This makes their classification as Chinese cognates problematic. (For detailed etymological discussion, see earlier chapters.)

      i. Numerals

      • một, hai, ba, bốn, năm: As previously noted, the numerals one through five may derive from Mon‑Khmer. However, Khmer lacks designated forms for six through ten, leaving the higher numerals in Vietnamese more difficult to trace.

      ii. Celestial terms

      • "blời" 日 rì → VS trời (‘sun’). The cluster /bl‑/ corresponds to /tr‑/. Such clusters, along with other /‑l‑/ glides, appear in texts from the 15th – 17th centuries, possibly reflecting Mường dialectal influence. Missionary activity in these regions during the Nguyễn dynasty may have preserved such forms. Phonologically, giời and ngày also align well with 日 rì.

      • "blăng" 月 yuè → VS trăng (‘moon’). Related alternants include giăng and tháng. These may have developed as variants of mặttrời (太陽 tàiyáng) and mặttrăng (月亮 yuèliàng), where /b‑/ assimilated to /m‑/ and vocalized as mặt. The element mặt itself may be of Chamic origin. Otherwise, forms like giời (日) and giăng (月) do not fit neatly into a sound‑change scheme, since /gi‑/ would have to correspond simultaneously to both r‑ and y‑. This suggests an older stratum of initials (nh‑, j‑, jh‑, ng‑), as reflected in SV nhật (日) and nguyệt (月).

      iii. Other questionable items

      These examples illustrate the complexity of Vietnamese etymology, where Chinese parallels exist but are not definitive:

      • ăn 唵 ǎn ('to eat') [MC ʔəm < OC qoːmʔ ].

      • tóc 髮 fà → SV phát ('hair') [note irregular /f‑ ~ t‑/ ].

      • tai 耷 dā, 耼 dān, 耽 dān ('ear' variants in Chinese); cf. VS lỗtai ('ear'), possibly from disyllabic change: 耳 ěr → lỗ + 朵 duǒ → tai.

      • trai 丁 dīng → SV đinh ('man'), perhaps linked to trống or 公 gōng.

      • gái 娃 wá ('woman'), possibly related to 母 mǔ ('mother'). In Archaic Chinese, 子 zǐ could mean both ‘boy’ and ‘girl’.

      • voi 為 wéi ('elephant' in archaic usage); cf. 豫 yù also glossed as 'elephant' in VS.

      • lúa 來 lái ('millets' in archaic usage; modern 'come'). May parallel 稻 dào (SV đạo, 'rice').

      • không ('no, not') vs. 空 kōng ('empty, nothing'). Despite the semantic overlap, không is unlikely to derive directly from 空. Historically, không is a late development from chẳng. Prior to the 16th century, không was not used as the antonym of có ('to have'); only chẳng served this role. Expressions such as 並非 bìngfēi ('it is not') or 並不(是) bìngbú(shì) may have influenced the contraction into chẳngphải, which later yielded không.

      Summary  –  This set of examples demonstrates the ambiguous zone of Vietnamese etymology: words that resemble Chinese forms but also show ties to Mon‑Khmer or other regional languages. Their irregular phonological correspondences, semantic shifts, and late attestations make them questionable as direct Chinese loans. They represent a fertile area for future research, where substratal influence, dialectal variation, and contact with multiple language families must all be considered.

      Conclusion

      This chapter demonstrates that the history of Vietnamese vocabulary is best understood through the lens of sound change. Regular Sino‑Vietnamese loans follow predictable phonological correspondences, while irregular Sinitic‑Vietnamese borrowings reveal the effects of imperfect transmission, dialectal variation, and substratal influence. Alongside these, Khmer and Austroasiatic contributions, as well as a substratum of uncertain or indigenous words, complete the picture. By tracing systematic shifts in initials, finals, and tones, and by applying analogical and corollary methods, we see how these layers of sound change interact to form the Vietnamese lexicon we know today. Recognizing this dynamic process helps readers appreciate both the richness and the complexity of Vietnamese lexical history.


      References

      • Alves, M. J. (2008). Sino‑Vietnamese grammatical vocabulary and sociolinguistic conditions for borrowing. Australian National University. Retrieved from https://openresearch-repository.anu.edu.au

      • Alves, M. J. (2017). Identifying early Sino‑Vietnamese vocabulary via linguistic, historical, archaeological, and ethnological data. Bulletin of Chinese Linguistics, 10(1), 1–28. https://doi.org/10.1163/2405478X-01001001

      • Baxter, W. H. (1992). A handbook of Old Chinese phonology. Berlin: Mouton de Gruyter.

      • Haudricourt, A. G. (1954). De l’origine des tons en vietnamien. Journal Asiatique, 242(1), 69–82.

      • Karlgren, B. (1945). The Book of Odes: Chinese text, transcription, and translation. Stockholm: Museum of Far Eastern Antiquities.

      • Lanneau, G. (2025). A re‑examination of regular, unique and unusual Sino‑Vietnamese initial features (Doctoral dissertation). University of Washington. Retrieved from https://digital.lib.washington.edu

      • Maspero, H. (1912). Études sur la phonétique historique de la langue annamite: Les initiales. Paris: Imprimerie Nationale.

      • Nguyễn Ngọc San. (1993). Từ Hán Việt. Hà Nội: Nhà xuất bản Giáo dục.

      • Nguyễn Tài Cẩn. (1979). Nguồn gốc và quá trình hình thành cách đọc Hán Việt. Hà Nội: Nhà xuất bản Khoa học Xã hội.

      • Phan, J. (2013). Lacquered words: The evolution of Vietnamese under Sinitic influences from the 1st century BCE through the 17th century CE (Doctoral dissertation). Cornell University. Retrieved from https://ecommons.cornell.edu

      • Pulleyblank, E. G. (1991). Lexicon of reconstructed pronunciation in Early Middle Chinese, Late Middle Chinese, and Early Mandarin. Vancouver: University of British Columbia Press.

      • Võ, K. H. (2021). Changes in Vietnamese language from globalization and localization. EJS, Thu Dau Mot University, 1(2), 45–56. https://doi.org/10.37550/tdmu.EJS/2021.02.204

      • Wikipedia contributors. (n.d.). Sino‑Vietnamese vocabulary. In Wikipedia. Retrieved November 19, 2025, from https://en.wikipedia.org/wiki/Sino-Vietnamese_vocabulary




      FOOTNOTES


      (1)^ The “six‑strike rule” is not found in the standard literature on Sino‑Vietnamese or Chinese historical phonology, and it is not a formal principle proposed by Karlgren, Pulleyblank, Baxter, Sagart, or other major figures in the field. It comes from the author's own earlier phrasing: if more than six consistent correspondences are observed, they should be treated as a rule rather than as an irregularity. In other words, the “six‑strike rule” is a heuristic, not a published framework.


      In historical linguistics more broadly, scholars use similar reasoning: once a sound correspondence is attested in multiple independent examples, it is treated as systematic rather than accidental. The specific threshold of "six" is not canonical; it is a rhetorical way of saying "once we have enough examples, we can posit a rule."
      In more standard historical‑linguistic terminology: "once a correspondence is attested across multiple lexical items, it should be treated as a regular sound change rather than irregular coincidence." The memorable "six‑strike" phrasing is retained here as a pedagogical device.

      (2)^ Pulleyblank, E.G. 1962. The Consonantal System of Old Chinese, Part II. Asia Major 9.
      Pulleyblank, E.G. 1984. Middle Chinese: A Study in Historical Phonology. Vancouver: University of British Columbia Press.
      Pulleyblank, E.G. 1991. Lexicon of Reconstructed Pronunciation in Early Middle Chinese, Late Middle Chinese, and Early Mandarin. Vancouver: University of British Columbia Press.