Origins and Crossroads
Vietnamese occupies a singular position in Asian historical linguistics. It is conventionally classified as Austroasiatic, yet its lexicon, phonology, and orthographic history reveal deep entanglements with Sinitic and Yue‑Taic. This paper argues that Vietnamese is not merely a case of heavy borrowing, but a layered palimpsest whose study reshapes our understanding of language contact, register stratification, and orthographic reform.
1. The problem of classification
From the 19th century onward, scholars have debated whether Vietnamese should be grouped with Mon‑Khmer, with Sinitic, or with Tai‑Kadai. Each classification captures part of the truth, but none alone suffices. Vietnamese is best understood as a convergence language: Austroasiatic inheritance, Yue‑Taic substratum, and Sinitic superstratum. This layered identity makes it a test case for how we define “language families” in the first place.
2. Comparative lexical evidence
The following extended table illustrates the stratification of basic vocabulary across multiple domains:
| Gloss | Sinitic‑Vietnamese | Sino‑Vietnamese | Chin. (MC/OC) | Mon‑Khmer | Proto‑Tai | Notes |
|---|---|---|---|---|---|---|
| water | nước < nác < đák | thuỷ | 水 shuǐ < MC ɕjwi < OC *qʰʷljilʔ (See * below) | Khmer tuk, Mon dok. From Proto-Vietic *ɗaːk (“water”), from Proto-Austroasiatic *ɗaːkʔ (“water”). Cognate with Nghệan/Hàtĩnh dialect nác, Muong đác, Nguôn đác, Khmer ទឹក (tik), Bahnar đak, Eastern Mnong dak, Central Nicobarese râk/dâk, Santali ᱫᱟᱜ (dak’), Sanskrit दक (daka). | *nam* | nước aligns phonetically with Tai *nam*; thuỷ is plausibly an interchange of /dak/. |
| fire | lửa | hoả | 火 huǒ < MC hwa < OC *qʰʷaːlʔ | Khmer phlɛŋ | *fai* | lửa is Austroasiatic; hoả is Sino‑Vietnamese; Tai *fai* shows parallel innovation. |
| mother | mẹ < mợ < vú < u | mẫu | 母 mǔ < MC məw < OC *mɯʔ | Khmer mday | *ma* | mẹ is Austroasiatic; mẫu is Sino‑Vietnamese; Tai *ma* shows universal root. |
| sky | trời | thiên | 天 tiān < MC tʰɛn < OC *qʰl'iːn | — | *hlɯi* | trời aligns with Tai; thiên is a learned borrowing. |
| tooth | răng | linh | 齡 líng < MC lɛjŋ < OC *reːŋ | Mon rang | *hnɯŋ* | Austroasiatic alignment stronger for răng; nha is literary. |
| head | đầu / tróc | thủ | 頭 tóu < MC dəw < OC *doːʔ | Khmer tpoal | *thaw* | Layering suggests vernacular doublets plus Sino‑Vietnamese register. |
* The Case of 'nước':
(1) The character 地 dì (‘earth’) in the Kangxi Dictionary also encompasses the variant 坔, which combines the phonophoric element 'dák' with the semantic radical 水 ‘water’. This structure parallels other characters such as 踏 tà (‘to tread’) and 泰 tài (‘great’), and ultimately conveys the meaning of ‘earth’ (土 thổ). The Kangxi entry under 土部·四 glosses 坔 as 《集韻》與 地 同: “Identical in meaning with 地.”
(2) The Old Chinese (OC) reconstruction *tujʔ, terminating in a glottal stop -ʔ or its variant -k, corresponds to the word ‘water’. In ancient Vietnamese, this was pronounced /dak/. Several Chinese characters, such as 踏 tà, tā (‘to tread’) and 沓 tà, dá (‘dense, layered’), feature the 水 radical and share this final -k or -ʔ in their OC forms. This phonological pattern supports the plausibility of a Vietnamese reflex /dak/, which, through regular sound change, yields /nak/ (nác) and, in turn, the modern Vietnamese word nước (‘water’).
(3) On the phonetic correspondence /sh- ~ n-/: Consider 說 shuō (‘to speak’) and 山 shān (‘mountain’). While most Sinitic dialects preserve an initial sibilant (s-), Hainanese notably pronounces 山 as /tui³/, indicating a shift from /s-/ to /t-/. This phonetic evolution mirrors the transformation of 山 into a form like noa, which subsequently becomes non, the Vietnamese term for ‘mountain’.
3. Semantic grids and register stratification
Vietnamese often maintains doublets: a native vernacular term and a Sino‑Vietnamese learned form. The grid below illustrates register stratification:
| Domain | Sinitic‑Vietnamese | Sino‑Vietnamese | Notes |
|---|---|---|---|
| mother | mẹ < mợ < mụ < u | mẫu, mô | vernacular vs. formal: 母 mǔ, mú, wǔ, wú < MC məw < OC *mɯʔ |
| heaven / day | trời < giời < ngày | nhật | vernacular vs. literary: 日 (𡆠) rì, mì < MC ȵit < OC *njiɡ |
| water | nước | thuỷ (See * above) | vernacular vs. literary: 水 shuǐ < MC ɕjwi < OC *qʰʷljilʔ |
| book | sách | thư | Cf. SV 'sách' 冊 (册) cè < MC tsjɐik < OC *tʂrēk, *shreːɡ |
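The grid above can be read as a small lookup structure: each gloss carries a vernacular form and a Sino‑Vietnamese form, and the register selects between them. The following is a minimal Python sketch under that reading. The `Doublet` class, the `DOUBLETS` table, and the register labels `"vernacular"` and `"literary"` are illustrative names introduced here, not terminology from the sources, and only the first form listed in each cell of the grid is recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doublet:
    """One row of the register grid: a gloss with its two competing forms."""

    gloss: str
    vernacular: str        # everyday spoken form
    sino_vietnamese: str   # learned or literary form

    def form(self, register: str) -> str:
        """Return the form for a register label (the labels are assumptions)."""
        if register == "vernacular":
            return self.vernacular
        if register == "literary":
            return self.sino_vietnamese
        raise ValueError(f"unknown register: {register!r}")


# Forms copied from the grid above (first form of each cell only).
DOUBLETS = {
    d.gloss: d
    for d in (
        Doublet("mother", "mẹ", "mẫu"),
        Doublet("heaven / day", "trời", "nhật"),
        Doublet("water", "nước", "thuỷ"),
        Doublet("book", "sách", "thư"),
    )
}

if __name__ == "__main__":
    for gloss, d in DOUBLETS.items():
        print(f"{gloss}: {d.form('vernacular')} / {d.form('literary')}")
```

A two-way split is a simplification: a fuller model might add a technical register or keep every variant listed in a cell.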
4. Orthographic layers
- 漢字 (Hántự): Classical Chinese script for administration and scholarship.
- chữNôm (字喃 zìNán): Adaptation of Chinese graphs for vernacular Vietnamese.
- Quốcngữ: Latin‑based script introduced in the 17th century, standardized in the 20th.
Division of Historical Periods in the Development of the Vietnamese language
| | Period | Languages and scripts in use | Time span |
|---|---|---|---|
| A | Proto-Vietnamese | 2 languages in use: Ancient Chinese (a vernacular Mandarin spoken by the ruling class) and Vietnamese; 1 writing script: Chinese | 8th and 9th centuries |
| B | Archaic Vietnamese | 2 languages in use: Ancient Chinese and Archaic Vietnamese (spoken by the ruling class); 1 writing script: Chinese | 10th, 11th, and 12th centuries |
| C | Ancient Vietnamese | 2 languages in use: Ancient Vietnamese and Classical Chinese; 2 writing scripts: Chinese and Chinese-based Nôm | 13th, 14th, 15th, and 16th centuries |
| D | Middle Vietnamese | 2 languages in use: Middle Vietnamese and Classical Written Chinese; 3 writing scripts: Chinese, Nôm, and the romanized national script Quốcngữ | 17th and 18th centuries and the first half of the 19th century |
| E | Early contemporary Vietnamese | 3 languages in use: French, Vietnamese, and Classical Written Chinese; 4 writing scripts: French, Chinese, Nôm, and Quốcngữ | French colonial period |
| F | Modern Vietnamese | 1 language in use: Vietnamese; 1 writing script: Quốcngữ | 1945 to the present |
Based on the formation of the Hán-Việt pronunciation of Middle Chinese, the Annam Dịchngữ (安南譯語, 'Translated Annamese Words'), and the Annamese-Latin-Portuguese dictionary by Alexandre de Rhodes (1651), H. Maspero devised a similar division into five periods:
A) Proto-Việt (prior to the 9th century)
B) Archaic Vietnamese: the 10th century (formation of the Hán-Việt)
C) Ancient Vietnamese: the 15th century (Annam Dịchngữ)
D) Middle Vietnamese: the 17th century (dictionary by A. de Rhodes, 1651)
E) Contemporary Vietnamese (19th century)
Source: Table by Nguyễn Tài Cẩn (1998, p. 8) quoted by Bùi Khánh-Thế.
5. Methodological implications
- Rigorous sound correspondences across strata.
- Semantic stability in core vocabulary.
- Register awareness (vernacular vs. literary vs. technical).
- Historical context: conquest, migration, and cultural prestige.
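The first criterion, recurring sound correspondences, lends itself to a simple tally. The sketch below is a hedged illustration in Python, not a method taken from the sources. It pairs the vernacular and Sino‑Vietnamese forms from the table in section 2, extracts each word's orthographic onset in Quốcngữ, and counts how often each onset pairing occurs. The helper names (`onset`, `tally_onsets`, `recurring`), the onset list, and the threshold of two attestations are all assumptions made for illustration; a real study would work from phonemic transcriptions and a far larger word list.

```python
import unicodedata
from collections import Counter

# Quốcngữ onsets written with more than one letter, longest first so that
# "ngh" is tried before "ng". The list is an illustrative assumption.
MULTI_LETTER_ONSETS = (
    "ngh", "ch", "gh", "gi", "kh", "ng", "nh", "ph", "qu", "th", "tr",
)


def onset(word: str) -> str:
    """Return the orthographic onset of a word, or "" if it starts with a vowel."""
    w = unicodedata.normalize("NFC", word.lower())
    for cluster in MULTI_LETTER_ONSETS:
        if w.startswith(cluster):
            return cluster
    first = w[:1]
    # Strip diacritics from the first letter to test whether it is a vowel.
    base = unicodedata.normalize("NFD", first)[:1]
    return "" if base in "aeiouy" else first


def tally_onsets(pairs):
    """Count each (onset of first form, onset of second form) pairing."""
    return Counter((onset(a), onset(b)) for a, b in pairs)


def recurring(tally, minimum=2):
    """Keep only pairings attested at least `minimum` times."""
    return {pair: n for pair, n in tally.items() if n >= minimum}


# (vernacular, Sino-Vietnamese) forms taken from the table in section 2.
PAIRS = [
    ("nước", "thuỷ"),
    ("lửa", "hoả"),
    ("mẹ", "mẫu"),
    ("trời", "thiên"),
    ("răng", "linh"),
    ("đầu", "thủ"),
]

if __name__ == "__main__":
    tally = tally_onsets(PAIRS)
    for (a, b), n in sorted(tally.items()):
        print(f"{a or '(none)'} ~ {b or '(none)'}: {n}")
    print("recurring:", recurring(tally))
```

With only these six pairs every pairing is attested once, so `recurring` returns an empty dictionary; the tally becomes informative only as the word list grows.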
6. Conclusion
Vietnamese matters in the Sinitic‑Austroasiatic debate because it refuses to fit neatly into one family. It is a frontier language, layered by Austroasiatic inheritance, Yue‑Taic substratum, and Sinitic superstratum. Its study forces us to rethink not only Vietnamese, but the very categories of “family,” “borrowing,” and “substratum” in historical linguistics.
- Vietnamese is a convergence language, not a simple daughter language.
- Lexical strata (vernacular, Sino‑Vietnamese, Yue‑Taic) coexist and stratify by register.
- Orthographic history (Hántự, chữNôm, Quốcngữ) mirrors lexical layering.
- Comparative method must integrate inheritance, substratum, and borrowing.
Footnotes
- Haudricourt, André‑Georges (1954). “De l’origine des tons en vietnamien.” Journal Asiatique.
- Schuessler, Axel (2007). ABC Etymological Dictionary of Old Chinese. University of Hawai‘i Press.
- Baxter, William H.; Sagart, Laurent (2014). Old Chinese: A New Reconstruction. Oxford University Press.
- Nguyễn, Tài Cẩn (1979). Nguồn gốc và Quá trình Hình thành Cách đọc Âm Hán Việt. NXB Khoa học Xã hội, TP HCM.