Between Austroasiatic Roots And Sino-Tibetan Layers
by dchph
Vietnamese linguistic identity cannot be understood in isolation from its long and complex interaction with Chinese. While the language retains a native Austroasiatic foundation, centuries of contact have layered it with Sinitic elements that shape its vocabulary, phonology, and literary traditions. The result is a hybrid system in which indigenous and borrowed forms coexist, often in complementary roles.
This article dismantles the prevailing myth of Vietnamese linguistic origin, the Mon‑Khmer hypothesis, and proposes a categorical realignment. Rather than tracing Vietnamese to a distant Austroasiatic ancestry, the chapter situates it within a continuum of deep entanglement with Sino‑Tibetan systems. It argues that the Vietnamese basic lexicon – terms for kinship, anatomy, natural elements, and daily life – is overwhelmingly cognate with Old and Middle Chinese etyma, not peripheral Mon‑Khmer roots.
I) Restating the core thesis
Reframing Vietnamese linguistic identity requires situating it within its long and intimate interaction with Chinese. The decisive shift is clear: Vietnamese is not merely a language peppered with borrowed Chinese words, but one structurally aligned with Sinitic syntax, phonology, and semantics. Western scholarship’s fixation on genetic lineage is critiqued here, likened to insisting that French must be Gaulish rather than recognizing it as a Romance language. In parallel, Vietnamese has evolved through centuries of contact, migration, and imperial integration, most visibly during the Ming occupation (1407-1427), when Chinese served as the language of administration and scholarship.
This paper challenges the Mon‑Khmer camp’s reliance on surface‑level wordlists that lack historical phonological reconstruction. By juxtaposing Nguyễn Ngọc San’s examples with plausible Chinese cognates, it demonstrates that many alleged Mon‑Khmer roots are more convincingly explained through Sinitic etymology. Comparative tables align Vietnamese terms with Mon‑Khmer, Tai‑Kadai, and Old Chinese equivalents, consistently showing that the strongest cognacy lies with Chinese.
In this light, the chapter marks a turning point in the Sinitic‑Vietnamese thesis. It shifts the inquiry from speculative ancestry to linguistic substance, asserting that Vietnamese belongs – structurally, semantically, and historically – within the Sinitic family. Annotated examples, etymological grids, and historical scaffolding lay the foundation for a paradigm shift in Vietnamese classification.
Crucially, the argument does not claim that Vietnamese directly evolved from a Sino‑Tibetan root. Rather, from a historical and anthropological perspective, Vietnamese forebears are believed to have originated in southern China. As shown in the preceding analysis, Vietnamese shares a vast body of basic vocabulary with Sino‑Tibetan etyma.
At a deeper level, the question is not which root a language once sprang from, but where it belongs categorically. What matters is how a language presents itself as a system, its defining traits integrated into a coherent whole. Identifying the true nature of Vietnamese means recognizing its resemblance to other Chinese languages, rather than metaphorically digging a trench in a tunnel that already blocks the northern light.
The theorization of a genetic link to the Austroasiatic Mon‑Khmer family, if it existed at all, belongs to a remote prehistory. Western scholars who remain attached to the Mon‑Khmer hypothesis might consider the case of French: speakers do not use Gaulish, but French, a Romance language, akin to Italian or Spanish. By analogy, Vietnamese is not an extant aboriginal Yue tongue, but a language that has long shared basic vocabulary with multiple roots, including Sino‑Tibetan, since antiquity.
II) Between Austroasiatic roots and Sino‑Tibetan layers
Vietnamese linguistic identity is best understood as a hybrid system, where indigenous Austroasiatic foundations intertwine with centuries of Sinitic influence to produce a language that embodies both continuity and transformation.
Research on the origin of Vietnamese that relies solely on analysis of basic cognates with a handful of Mon‑Khmer words cannot nullify the overwhelming commonalities between Chinese and Vietnamese across virtually all linguistic aspects, not limited to vocabulary alone. Specialists of Vietnamese may still pursue other paths by justifying cognates within a wider etymological spectrum that spans Sino‑Tibetan, Chinese, and Mon‑Khmer elements.
For the time being, whatever progress the Mon‑Khmer camp may achieve, their elements can only be regarded as a taxonomical scheme for Vietnamese unless the Sino‑Tibetan etymologies presented in this survey are formally acknowledged. What is at stake is not the nativity of the language at birth, whether by locality or anthropology, but the resulting product of a mixed stock in which Chinese origin is massive, including the basic stratum. The focus must be on the wholeness of Vietnamese as it appears today, not on genetic affinity inferred from a few basic words that happen to fit the Austroasiatic platform. With the same lexical substance, Mon‑Khmer elements have left a stronger imprint in the Mường dialects than in Vietnamese proper.
Terminologically, ancestral nativity may be designated as "the aboriginal," a concept parallel to "Yue" in Chinese historical records. Yet the term "Yue" cannot be extended to cover all the racial composition of indigenous speakers in Indo‑China, such as Austronesian and Austroasiatic peoples who were ancestors of the Chamic and Mon‑Khmer groups, now minorities in Vietnam. These groups inhabited the northern, western, central, and southern regions long before the Annamese advanced there. In other words, the direct descendants of those Southeast Asian groups are not necessarily related to the Kinh, the majority of modern Vietnam, who arose from later racial fusion with waves of migrants from the north over the last 2,200 years, rather than from Austronesian or Austroasiatic forebears who spread further south into the Malayan peninsula, the Indonesian archipelago, and the Pacific islands 8,000 to 6,000 years ago.
In modern Vietnam, with territories annexed as late as the 18th century from the extinct Champa and Khmer kingdoms, the Annamese continued to move south and became the dominant Kinh majority over minorities who had long controlled those lands. The Kinh population absorbed descendants of earlier northern immigrants and later settlers in the south, including Chamic, Khmer, and Ming "boatpeople" refugees from China's southern coast after the fall of the Ming Dynasty in the 17th century.
Throughout national development, contact with other languages had little impact on the basic vocabulary of modern Vietnamese. Many later fundamental words are only add‑ons to existing Sinitic forms, such as 陽 yáng ~ VS "nắng" (sunshine), 湯 tāng > VS "nóng" (hot liquid), 煬 yáng > VS "nung" (fuse), 貌 mào ~ VS "màu", 模 mó ~ VS "mẫu" (model), 姊 zǐ > VS "chị", 姐 jiě > "chế", 餅 bǐng > VS "bánh", etc. Local Mon‑Khmer loanwords should be treated in the same way: elements already present in the lower stratum, confined to small geographic pockets, encountered when the Annamese resettled and came into contact with native Mon‑Khmer speakers. Such patterns of admixture became more visible after the 18th century when southern territories were annexed.
Sinologists who study the formation of Cantonese and Hokkien recognize that Chinese elements grew atop earlier substrata in the very same way. This survey emphasizes the Sino‑Tibetan basic etyma within the Mandarin spectrum that contribute to that top layer. The point is underscored by the historical fact that, during the two decades from 1407 to 1427 when Vietnam was a province under Ming dominion, Chinese was taught as a living language and required for all imperial examinations. That legacy, in turn, permeates every aspect of modern Vietnamese. (1)
As for fundamental words, every people must have had a minimum set of basic vocabulary from the dawn of their existence. It is difficult to imagine borrowing for kinship, body parts, natural phenomena, or daily survival terms. While many Vietnamese basic words appear cognate with both Mon‑Khmer and Chinese, the relationship between Chinese and Vietnamese extends far beyond the basic stratum. The pressing issue is how to interpret this affinity: specialists often either ignore it, treating such words as "purely Vietnamese", or insist on unique cultural origins, denying that they are cultural loans. Examples include 標杆 biāogān > VS "câynêu" (New Year banner pole) or "bánhtét" (bánh Tết: 餅 bǐng + 節 jié). Recognizing 節 jié as SV "tiết" > VS "Tết" shows how one character carries the meanings of "season", "harvest", and "festival", and how the entering tone 陽入聲 yángrùshēng explains the final /‑t/ shared by Vietnamese and many Chinese dialects, whereas standard Mandarin has none.
For all such traits, Vietnamese and Chinese histories reinforce one another. Cultural affiliation is evident in ways absent from Mon‑Khmer or Tai‑Kadai. Yet some Vietnamese scholars still claim purity, arguing that their forefathers "twisted" pronunciations so Sino‑Vietnamese would no longer sound Chinese. Such views reflect patriotic nationalism more than linguistic reality. They overlook the historical fact that Vietnam was part of China for a thousand years until 939 A.D. and remained a vassal state in later periods. These historical highlights should remind us of the true identity of Vietnamese basic words, which are for the larger part cognate with Mandarin Chinese.
One historical linguist, Nguyễn Ngọc San (1993, pp. 105-120), acknowledged that Chinese records exist which allow certain Vietnamese basic words to be traced back to Old Chinese. Yet he sided with other scholars who argued that Vietnamese is a hybrid language, evolving from multiple sources built upon an Austroasiatic Mon‑Khmer stratum. The difficulty, however, is that no historical evidence has been provided to substantiate the claims for Austroasiatic Mon‑Khmer or Tai‑Kadai elements in the Vietnamese basic lexicon.
To compensate for the absence of historical reconstruction comparable to what is available in Chinese historical phonology, Nguyễn Ngọc San relied on modern wordlists from living Mon‑Khmer languages. He cited examples such as "chrohom" ~ Vietnamese "chồmhỗm" (squat), "choho" ~ Vietnamese "chòhõ" or "chànghãng" (stand), "rôsao" ~ "laoxao" (bustling), and "comhai" ~ "hơi" (breath). These parallels, however, are no more convincing than the surface similarities that appear between modern Putonghua and Vietnamese, such as 明兒 míngr ~ VS "mainầy" (tomorrow), 受不了 shòubùliăo ~ VS "chịukhôngnổi" (cannot stand it), or 了不起 liăobùqi ~ VS "nổibật" (outstanding).
Below are some of the samples listed by Nguyễn Ngọc San, which he claimed to be typical proofs that the Vietnamese vocabulary stock was built from different sources – primarily Mon‑Khmer, Daic (Tày‑Thái in his classification), and Old Chinese – though in many cases we can still identify plausible Chinese cognates.
Table 1 - Mon-Khmer roots (by Nguyễn Ngọc San)
| English meaning | Mon-Khmer | Vietnamese | Chinese cognates by dchph |
|---|---|---|---|
| braid | pul | búi, múi | **** 襆 pú |
| break | tưt | đứt, dứt, nứt | **** 斷 duàn |
| carry | păng | bưng | **** 捧 pěng |
| chest | ngức | ngực, ức | ****** 臆 yì |
| clip | thkiep | cặp, gắp | ****** 夾 jiá |
| dragon | tơluông | rồng, thuồngluồng, đuống (cầuđuống) (?) | ****** 龍 lóng |
| end | tot | chót (vót), tót | *** 卒 zú |
| flies | rui | ruồi, dòi | **** 蠅 yíng |
| gnaw | khăm | cắm, cắn, gặm | **** 啃 kěn |
| nest | t'ôh | ổ, tổ | **** 窩 wō |
| pinch | peo | béo, véo | *** 揑 niē |
| roll | kbên | bện, vấn, quấn | **** (編)圈 (biān)quăn |
| round | kvenh | quàng, vành, quanh | **** 環 huán |
| shake | tunl | đun, dun(dảy) | **** 動 dòng |
| slice, tear | cheek | chẻ, xé | **** 切 qiè (chẻ), **** 撕 sī (xé) |
| smear | blei | trây, giây, bây (bai bây) | *** 塗抹 túmó |
| smelly | sôui | thối, hôi | ****** 臭 chòu |
| snake | t'an | rắn | **** 虵 yé |
| sow | pon | bón, vón, vunh | **** 播 bō |
| stretch | chang | chạng, dạng (chân) | **** 張 zhāng |
| wall lizard | t'lan | thằnlằn, trăn (?) | **** 蝘蜓 yǎntíng |
| water | tưk | nắc, nước | *** 水 shuǐ (Cf. đák: 淂 dé, SV đắc) |
| wave | kvơ | quơ, vơ, huơ | ****** 揮 huī |
| with | pơơi | mới, với | **** 與 yǔ |
Table 2 - Tai-Kadai roots (by Nguyễn Ngọc San)
| English meaning | Tai-Kadai | Vietnamese | Chinese cognate by dchph |
|---|---|---|---|
| throw | quăng | quăng, văng | **** 扔 rēng |
| strike | phang | phạng, phang (đánh) | **** 搒 páng |
| stretch | chăng | chăng, giăng | **** 張 zhāng |
| short | cọt | cộc, cụt, ngủn | *** 短 duăn |
| recognize | nhin | nhận | **** 認 rèn |
| radiate | choả | toả, xoả | *** 射 shè |
| prick | chộc | chọc, xọc | *** 扎 zā |
| pounce | tup | đập, dập | *** 踏 tă |
| piece | pjêng | miếng, mãnh | **** 片 piàn |
| peel | poóc | bóc, vót | **** 剝 pō |
| here | nấy | nầy, đây | 茲 zī |
| grasp | pôôc | bốc, vốc | *** 抓 zhuā |
| graceful | rủngrỉnh | đủngđỉnh, dủngdỉnh | *** 婷婷 tíngtíng |
| father | pò | bố, bú, bọ | ****** 父 fù |
| empty | rồng | rỗng, trống | **** 空 kōng |
| carry | đeo | đeo, neo, đèo (bòng) | *** 戴 dài |
| ? | cọn | cọn, guồng | ? |
| ? | khẳn | khẳm, khẳn, hăm | ? |
Table 3 - Chinese roots (by Nguyễn Ngọc San)
| English meaning | Old Chinese | Vietnamese | Chinese cognate by dchph |
|---|---|---|---|
| board | pan | bản, ván | ****** 板 băn |
| break | pwo | bửa, phá, vỡ | **** 破 pò |
| decadence | toi | đồi (bại), tồi | **** 墮 duò |
| drive out | | khu (trục), khua, xua | ****** 驅 qū |
| embroider | suo | tú (cẩm), thùa thêu | ****** 繡 xiù |
| gaze | tăm | đăm, chiêm (ngưỡng), nom | **** 瞻 zhān |
| hard | k'u | khổ, khó (nhọc) | **** 苦 kǔ |
| hit home | tung | đúng, trúng | ****** 中 zhòng |
| last | tsuot | tốt (lính), chót (cuối), sót (đểlại) | **** 卒 zú |
| mend | pu | bổ, vá | ****** 補 bǔ |
| pursue | tweir | đuổi, truy | **** 追 zhuī |
| side | pjen | bê, biên, men, ven viền | **** 邊 biān |
| skillful | k'jiao | kháu, khéo, xảo | ****** 巧 qiǎo |
| sprinkle | sai | tưới, rưới, rảy | ****** 灑 să |
| then! | pyot | tất (cả), sốt, sất (không có gì sốt!), tuốt | ** 唄 bei! |
| wife | piwo | bụa, phụ, vợ | ****** 婦 fù |
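The contrast drawn earlier between surface-level wordlists and historical phonological reconstruction can be made concrete with a small tally. The sketch below, in Python, counts how often an initial consonant in the Old Chinese forms cited in Table 3 pairs with a given Vietnamese initial. The row selection, the helper names (`onset`, `tally_onsets`), and the list of Vietnamese digraphs are illustrative assumptions, not part of Nguyễn Ngọc San's data or method, and a recurring pairing is only a starting point for reconstruction, not proof of cognacy.

```python
from collections import Counter

# Selected rows from Table 3: (Old Chinese form as cited, Vietnamese forms).
# Parenthetical glosses from the table are omitted; the selection is illustrative.
TABLE_3_ROWS = [
    ("pan", ["bản", "ván"]),
    ("pwo", ["bửa", "phá", "vỡ"]),
    ("pu", ["bổ", "vá"]),
    ("piwo", ["bụa", "phụ", "vợ"]),
    ("tung", ["đúng", "trúng"]),
    ("tweir", ["đuổi", "truy"]),
]

# Vietnamese initials written with two letters (assumed list, not exhaustive).
DIGRAPHS = ("ph", "tr", "th", "kh", "ch", "nh", "ng", "gi", "qu")


def onset(word: str) -> str:
    """Return the initial consonant spelling: a known digraph or the first letter."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def tally_onsets(rows):
    """Count (Old Chinese initial, Vietnamese initial) pairings across rows."""
    counts = Counter()
    for old_chinese, vietnamese_forms in rows:
        for form in vietnamese_forms:
            counts[(old_chinese[0], onset(form))] += 1
    return counts


if __name__ == "__main__":
    # Print pairings from most to least frequent.
    for (oc_initial, vn_initial), n in tally_onsets(TABLE_3_ROWS).most_common():
        print(f"OC {oc_initial}- : Vietnamese {vn_initial}- ({n})")
```

On these rows the cited p‑ initial pairs repeatedly with Vietnamese b‑, v‑, and ph‑. Whether such recurring pairings reflect distinct historical layers would still need to be checked against reconstructed forms; the tally only shows where to look.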
III) Concluding point
It is no surprise that the Daic (Tai‑Kadai) forms appear less plausibly cognate with the Vietnamese etyma than the Chinese ones do. What else could one expect the language of a nation to have become after 1,000 years of foreign domination? A quick look at the tables above allows readers to draw their own conclusions.
A. Austroasiatic camp
To wrap up, the following arguments are representative of the rest of the fundamental vocabulary: no Sinitic borrowing erases the Austroasiatic bedrock – kinship and daily life remain native. For example, these three words (mẹ, ăn, uống) are indigenous Austroasiatic roots in Vietnamese, not Sino‑Vietnamese borrowings. They anchor kinship and everyday life vocabulary, contrasting with Sino‑Vietnamese layers that dominate abstract, political, and scholarly domains.
1. mẹ ('mother')
- AA etymology: From Proto‑Vietic *meːʔ ~ mɛːʔ, ultimately Proto‑Austroasiatic *meʔ.
- Cognates: Bahnar mĕ, Khmer mae, Mon mìˀ, Khasi mei.
- Note: This is a classic nursery‑type root across Austroasiatic, showing the widespread m‑ onset for kinship terms.
2. ăn ('eat')
- AA etymology: From Proto‑Vietic *ʔan, inherited from Proto‑Austroasiatic *ʔan/ʔaan.
- Cognates: Khmu ʔan, Proto‑Katuic *ʔoon, Bahnaric *ʔun.
- Note: The form is stable across Vietic and Austroasiatic branches, marking a core verb of daily life.
3. uống ('drink')
- AA etymology: From Proto‑Vietic *ʔuəŋ, traced to Proto‑Austroasiatic *ʔuŋ/ʔuək.
- Cognates: Found across Austroasiatic languages (e.g., cognate sets in Mon‑Khmer).
- Note: Like ăn, this verb is part of the basic Austroasiatic lexicon, later integrated into Vietnamese tonal phonology.
B. Chinese camp
Counter-argument: For nearly every proposed Austroasiatic root, there exist corresponding Sinitic overlays, often multiple layers, that complicate or challenge the attribution.
For nearly every item of fundamental vocabulary, plausible Chinese cognates emerge, often with multiple overlays across Old Chinese, Middle Chinese, and Sino‑Vietnamese strata. These correspondences show that Vietnamese is not simply sprinkled with borrowings but structurally aligned with Sinitic phonology, semantics, and syntax. The comparative evidence consistently demonstrates that the strongest cognacy lies with Chinese, underscoring the hybrid identity of Vietnamese as both rooted and layered.
No Austroasiatic proposal stands alone; Sinitic echoes resound in almost every root:
Table 4 - Austroasiatic vs. Sinitic‑Vietnamese vocabulary
| Semantic field | Austroasiatic | Chinese/Sino-Vietnamese | Gloss |
|---|---|---|---|
| Kinship | mẹ (Proto‑Austroasiatic *meʔ) | 母 mǔ (SV mẫu) | 'mother' |
| Nature | lá (laʔ in Khmu) | 葉 yè (SV diệp) | 'leaf' |
| Everyday life | ăn (Proto‑AA *ʔan/ʔaan) | 唵 ǎn (SV àm, VS ăn) | 'eat' |
| Everyday life | uống (Proto‑AA *ʔuŋ/ʔuək) | 飲 yǐn (SV ẩm) | 'drink' |
| Abstract | — | 學 xué (SV học) | 'study' |
| Political | — | 國 guó (SV quốc) | 'nation' |
| Philosophical | — | 哲學 zhéxué (SV triếthọc) | 'philosophy' |
and more with variants, derivatives, doublets, and triplets:
1. mẹ ('mother'): 母 mǔ, mú, wǔ, wú (mẫu, mô) < MC məw < OC *mɯʔ || Derivatives: (1) mẫu, (2) mô, (3) men, (4) mẻ, (5) mạ, (6) mệ, (7) mợ, (8) me, (9) mái, (10) cái, (11) vú, (12) u, (13) mụ, (14) mẹ
2. ăn ('eat'): 唵 ǎn (àm, ảm) < MC ʔəm < OC qoːmʔ
Etymology: Schuessler (2007) proposes this as an endoactive derivation from 喑 (OC qɯːm, ‘mute; silent’). Ultimately traceable to Proto‑Sino‑Tibetan *m‑ʔum ~ mum (‘to hold in mouth; to chew; to eat; to kiss’). Compare Tibetan ཨུམ um (‘kiss’) (STEDT; Benedict 1972; Coblin 1986; Schuessler 2007).
Synonyms and extensions:
- 餌 ěr (SV nhĩ, VS ăn, 'take a bite')
- 食 shí (SV thực, VS xơi, sực, 'eat')
- 吃 chī (VS xơi, ăn, 'eat'; cf. phonetic 乙 yǐ, SV ất)
- 輕咬 qīngyǎo (VS ănnhẹ, 'nibble')
- 用膳 yòngshàn (VS ăncơm, 'dining')
- 宴席 yànxí (SV yếntiệc, VS ăntiệc, 'feast')
- 挨打 ǎidǎ (VS ănđòn, 'get beaten')
- 贏錢 yíngqián (VS ăntiền, 'win a bet')
3. uống ('drink'): 飲 yǐn (SV ẩm) < MC ʔɯim, ʔjim < OC *qrɯmʔ, *qrɯms || Extensions: (1) dzô, (3) dô, (4) nhấm; further extensions include modern hớp 吸 xī (SV hấp, 'sip'), hớp 喝 hē (SV hát, 'drink'), nhấmnháp 攝入 shèrù (SV nhiếpnhập, 'consume'), 歃血為盟 shàxuèwéiméng (VS uốngmáuănthề, 'pledge by drinking blood').
In short, once cultural and historical extensions are taken into account, the figurative scope of each concept expands without limit – from kinship meals to idiomatic expressions of gain, loss, and social experience. As illustrated by the word ăn, these extensions unfold within a Sinitic framework, not an Austroasiatic one.
Conclusion
The paper critiques the Mon‑Khmer hypothesis, exposing its reliance on surface‑level wordlists that lack historical phonological reconstruction.
Vietnamese linguistic identity must be reframed through its deep interaction with Chinese. Rather than a language sprinkled with borrowed terms, Vietnamese is structurally aligned with Sinitic syntax, phonology, and semantics. Centuries of contact, migration, and imperial integration, most notably during the Ming occupation (1407-1427), embedded Chinese as the language of administration and scholarship. This challenges Western scholarship’s fixation on genetic lineage, showing that Vietnamese, like French in relation to Latin, belongs categorically to a family shaped by historical substance rather than speculative ancestry.
By presenting both the historical context and the linguistic evidence within this paper, readers can follow the argument without reference to prior studies and draw their own conclusions. The reframing highlights Vietnamese as a language forged through contact, adaptation, and resistance, a witness to cultural convergence and a voice of its own.
FOOTNOTES
(1)^ Suppression of Vietnamese culture
When the Ming invaded, all classical Vietnamese printing blocks, books, and materials were burned and suppressed. Vietnamese records such as gazettes, maps, and registers were ordered to be burned, save for one copy. This policy was strictly enforced by the Yongle Emperor. His command to the army in Vietnam in July 1406 is as follows:
兵入。除釋道經板經文不燬。外一切書板文字以至俚俗童蒙所習。如上大人丘乙已之類。片紙隻字悉皆燬之。其境內中國所立碑刻則存之。但是安南所立者悉壞之。一字不存。
"Once our army enter Annam (Vietnam currently), except Buddhist and Taoist text; all books and notes, including folklore and children book, should be burnt. The stelas erected by China should be protected carefully, while those erected by Annamese (Vietnamese currently), should be completely annihilated, do not spare even one character."
Yongle's command on 21 May 1407 read:
"I have repeatedly told you all to burn all Annamese books, including folklore and children books and the local stelas should be destroyed immediately upon sight. Recently I heard our soldiers hesitated and read those books before burning them. Most soldiers do not know how to read, if this policy is adapted widely, it will be a waste of our time. Now you have to strictly obey my previous command, and burn all local books upon sight, without hesitation."
For this reason almost no vernacular chữNôm texts survive from before the Ming invasion. Various ancient sites such as the Bao Minh pagoda were looted and destroyed. The Ming dynasty applied various Sinicization policies to spread more Chinese culture in the occupied nation.
Source: https://en.wikipedia.org/wiki/Fourth_Chinese_domination_of_Vietnam