Thursday, April 3, 2025

Chapter 9 - Similarity In Cross-Linguistic-Family Vocabularies Proves No Genetic Relation


 Executive summary

  1. The underlying stratum of basic vocabularies
    The Vietnamese core lexicon reflects a layered history of contact and inheritance. Beneath the later Sino‑Vietnamese overlay lies a substratum of indigenous and Mon‑Khmer elements, themselves intertwined with Yue and proto‑Vietic speech. These strata reveal that Vietnamese is not a simple offshoot of Mon‑Khmer, but a hybrid language shaped by multiple converging traditions in the Red River basin.

  2. Haudricourt's theory of tonal development
    Haudricourt’s 1954 proposal—that Vietnamese tones arose late from the loss of final consonants—remains influential, yet it requires re‑examination. Evidence from Old Chinese, Middle Chinese, and early Sino‑Vietnamese loans suggests that tonal categories were already present in Vietnamese centuries before the 12th century. Rather than a late innovation, tonality in Vietnamese developed in parallel with Chinese, reinforced by a millennium of close contact.

  3. Correspondences in basic vocabularies revisited
    A fresh look at basic vocabulary shows that many items long attributed to Mon‑Khmer substrata also display clear cognacy with Chinese and Sino‑Tibetan forms. Words such as chó ‘dog’, gà ‘chicken’, and lúa ‘paddy’ demonstrate that Vietnamese shares fundamental etyma with Chinese, often predating Middle Chinese. These correspondences challenge the Austroasiatic hypothesis and point instead to a deeper, shared ancestry within the Sino‑Tibetan–Yue continuum.


x X x


The preceding chapter has shown how deeply Vietnamese is rooted in the Yue–Sinitic continuum, with disyllabicity and inversion serving as diagnostic tools for etymology. Yet this is only one side of the story. Vietnamese also bears the imprint of Mon‑Khmer migrations, whose vocabulary and cultural practices penetrated the Red River basin and left a lasting substratum in the language. The present chapter turns to this Mon‑Khmer association, examining how these layers interacted with the Yue foundation and the later Sinitic overlay to produce the complex etymological mosaic we recognize today.

Vietnamese emerges not as a late‑tonalized Mon‑Khmer language, but as a language forged at the crossroads of Yue, Chinese, and Austroasiatic influences. Its tonal system and basic vocabulary testify to a long history of interaction and parallel development with Chinese, demanding a reassessment of entrenched theories about its genetic affiliation.

The Austroasiatic Mon‑Khmer theory of Vietnamese‑Khmer affiliation has gained recognition for drawing attention to genuine lexical parallels between Vietnamese and southern Mon‑Khmer languages. By emphasizing shared vocabulary items, particularly in the semantic domains of agriculture, kinship, and daily life, the framework has underscored the importance of Austroasiatic influence in shaping the Vietnamese lexicon. This perspective has also provided a counterweight to Sinitic‑centered narratives, reminding scholars that Vietnam’s linguistic heritage is layered and cannot be reduced solely to Chinese contact. In this sense, the Mon‑Khmer hypothesis has played a valuable role in broadening the scope of comparative research and situating Vietnamese within the wider Austroasiatic family.

At the same time, the Mon‑Khmer framework rests on a relatively narrow foundation, relying heavily on lexical comparisons drawn from southern vocabulary while giving less attention to phonological systems, morphosyntactic structures, and historical migration patterns. Such a selective approach risks overstating Khmer‑Vietnamese affinity while underrepresenting the influence of Yue substrata and Sinitic superstrata. For this reason, the theory warrants closer scrutiny and reevaluation within a broader historical and comparative context. A more comprehensive account of Vietnamese origins must integrate Austroasiatic evidence with Sinitic, Tai‑Kadai, and Yue elements, recognizing the complex interplay of contact, convergence, and inheritance that has shaped the language over millennia.

I) The underlying stratum of basic vocabularies

A central question in Vietnamese historical linguistics is whether the language should be regarded as a hybrid, layered through centuries of contact with neighboring peoples, or as a direct descendant of a single ancestral stock. One hypothesis posits that an ancestral root, which we may call Taic, gave rise to both the Yue and Daic linguistic families. These spread across southern China, including the northeastern Red River Basin of northern Vietnam, where aboriginal communities long cultivated irrigated rice.

Over time, large numbers of Mon‑Khmer speakers from what is now northern Cambodia and southern Laos resettled in the fertile delta. Their subsistence practices—hunting and shifting cultivation on dry fields—contrasted with the wet‑rice agriculture of the Yue. With them came Mon‑Khmer vocabulary, which penetrated the native Viet‑Muong speech and accounts for the many basic Mon‑Khmer words preserved in Vietnamese today. This ancient language was likely spoken by the people of the Phùng Nguyên culture and by the legendary subjects of the Hùng kings some three millennia ago. It would not have sounded like modern Vietnamese. (H)

The arrival of the Han in 111 B.C. profoundly altered this linguistic landscape. Old Chinese reshaped the Yue speech of the majority, leading to the split of the Viet‑Muong continuum into Mường and Vietic (early Annamese), both heavily infused with Chinese elements. From this entanglement of Yue, Mon‑Khmer, and Chinese influences, a distinct Sinitic‑based Vietnamese language began to crystallize by the 10th century.

Linguists who support the Austroasiatic hypothesis argue that Vietnamese descends from the Mon‑Khmer branch of the larger Austroasiatic family. They point to fossilized remnants in Mon‑Khmer that appear as substratal layers in Vietnamese, including a set of basic words that may have remained stable for as long as 15,000 years (Zachary Stieber, Ancient Languages Have Words in Common). (S)

The position advanced here, however, is that both Vietnamese and Mon‑Khmer may ultimately derive from a Yue‑related ancestral language, Taic. This proto‑language could have given rise not only to Tai‑Kadai (Ding Bangxin, 1977) (T) and Sinitic within the Sino‑Tibetan family, but also to Austronesian and Austroasiatic branches. (菲)

The debate over the Austroasiatic origin of Vietnamese has persisted for more than a century. The prevailing view still classifies Vietnamese as a Mon‑Khmer descendant, citing numerous basic words scattered across Mon‑Khmer languages and even some cognates with Munda. Yet the deeper picture suggests a more complex genealogy, rooted in the linguistic mosaic of southern China and the Red River basin. (A)

With respect to those Austroasiatic languages, Norman (1988) noted that they “are spoken over a vast geographic range: the Munda languages in north Western India, Khasi in Assam, Palaung-Wa and Mon in Burma, the Mon-Khmer languages in Indo-China, Vietnamese and Muong in Vietnam [...] and were once spoken much more widely in China.” (pp. 7-8)

Figure 1 – Kinship links between Vietnamese and other major linguistic families and their substrata

Sino-Tibetan Proto-Taic
Proto-Tibetan Proto-Chinese Yue Austroasiatic
Tibetan Archaic Chinese Proto-Vietic Proto-Daic Mon-Khmer
Old Chinese Vietic Proto-Muong Tai-Kadai Zhuang Yao
Ancient Chinese Proto-Vietmuong Muong, Chac, Arem, Ruc, etc. Daic Dong, Miao Mon-Khmer
Annamese
Middle Chinese Vietnamese Thai Shui, etc. Khmu, Riang, etc.
modern Chinese dialects Laotian, etc. etc. Bahnar  Hrê, etc.

Before we go on, it is worth mentioning that in the early 20th century many linguists followed the Prague School in analyzing phonemic systems and describing the phonology of languages. Its methods and procedures were simple and did not require learning the language, and the focus on such practice suggested that the methodology was scientific. A renowned linguist, Bloomfield, for example, was able to describe and analyze the Tagalog language solely on the basis of information provided by one informant (Indo-Pacific, Part II, Descriptive Linguistics, or Lingua 15, 1963, p. 515).

It is therefore unsurprising that many early proponents of the Austroasiatic Mon‑Khmer hypothesis worked within this framework, often without first‑hand command of the languages they studied. Their analyses relied heavily on data gathered from local informants who themselves lacked linguistic training. By the 1960s, a generation of what the author calls "summer‑camp linguists" (researchers funded by short‑term grants from institutions such as the U.S. National Endowment for the Arts) conducted brief field trips in South Vietnam. Few of them achieved real mastery of the languages under investigation. The deficiencies of this approach are visible in their published work, which is marred by orthographic inconsistencies, misspellings, typographical errors, and mismatched cognate pairings.

Even as late as 1991, when Parkin classified Vietnamese (of the Viet‑Muong branch) as Austroasiatic, he acknowledged that "considerable controversy has surrounded the problem of the affiliation of Vietnamese" (Parkin, 1991, p. 89). His acceptance of Haudricourt's and Shorto's position formed the basis of his classification. In effect, the Austroasiatic view of Vietnamese origins, grounded in a relatively small set of presumed Mon‑Khmer ~ Viet‑Muong cognates, became entrenched among leading scholars. Their students, in turn, built upon this inherited foundation, treating it as a springboard for further hypotheses rather than re‑examining its premises.

Readers will have noticed that many of the etyma cited in this paper demand not only solid linguistic training but also a kind of "linguistic feeling", an intuitive sense that comes only with first‑hand experience in the target language. Such sensitivity is essential for engaging in the necessary exercise of "guesswork" (W) which allows us to appreciate how and where words have evolved. As King observed, "this procedure [guesswork] is not guaranteed to lead infallibly to the correct form of an innovation. But progress in historical reconstruction has always come from making guesses—not wild and unsupported guesses but those credible by considerations of simplicity and naturalness. In any case, the historical linguist usually has very little to lose and much to gain from pressing his reconstruction to the utmost in the directions of simplicity and naturalness" (1969: 164).

On the one hand, when a theory first proposed by a few prominent scholars gained enough traction to appear convincing, it was soon repeated by later entrants to the field. Many of these newcomers were not specialists in Vietnamese linguistics but simply adopted the prevailing view and echoed what others were already saying. In the early twentieth century, this meant embracing the newly theorized Austroasiatic Mon‑Khmer affiliation of Vietnamese with languages of Southeast Asia.

On the other hand, unlike the immutable laws of physics or astronomy, empirical sciences such as anthropology, history, and historical linguistics are always subject to revision. Newcomers entering the study of Vietnamese linguistics should therefore resist the temptation to follow the same well‑trodden path. Instead, they should chart new directions. The approach advanced here to explore Sinitic‑Vietnamese etymology with disyllabicity as the central focus opens a fresh realm of inquiry. Sound changes in disyllabic formations cannot be reduced to one‑to‑one correspondences between isolated syllables, and this recognition provides a more accurate framework for reconstruction.

Yet most newcomers in Vietnamese historical linguistics have begun from the Austroasiatic Mon‑Khmer baseline, a non‑historical theory shaped by misconceptions about monosyllabicity versus disyllabicity. This reliance stemmed from misperception, misinterpretation, lack of proficiency in the target languages, and an uncritical acceptance of early research simply because it was produced by renowned specialists at the dawn of Vietnamese linguistics. Their work became the foundation for subsequent studies, all moving in the same direction. In the mid‑twentieth century, it became fashionable to debate tonogenesis, a discussion initiated by Henri Maspero and André Haudricourt. Unsurprisingly, their views were revisited in the latter half of the 20th century by scholars such as Barker, Parkin, and Thomas, whose Mon‑Khmer lexical data continue to be cited.

The challenge for new scholars is to resist the dogmatism that discourages deviation from established premises. True progress requires decisiveness and the spirit of novelty. The message here is clear: newcomers should not simply follow "pre‑set premises" that have grown stale and unproductive. The pioneering works that once seemed innovative, linking Vietnamese to Austroasiatic or Mon‑Khmer, gained popularity because they were new at the time. But they no longer offer fresh insight.

Ultimately, the Austroasiatic Mon‑Khmer theory benefited from this cycle of repetition. Parkin (1991) paraphrased Maspero's argument that the absence of tonality in Mon‑Khmer languages was contradicted by the presence of Thai vocabulary in Vietnamese, where tonal words were treated as cognates. Maspero also pointed to other peculiarities (p. 89), even while accepting Haudricourt's proposal of a Mon‑Khmer substratum. Haudricourt, however, directly challenged Maspero's key claims. As Thomas summarized, "Maspero's examples of Thai‑Vietnamese cognates [were reinterpreted as] general Southeast Asian vocabulary, with correspondences between Vietnamese tones and Mon‑Khmer final consonants." Thus, "Maspero's key argument, that tones cannot be acquired by a language previously lacking them, is rejected" (p. 90). Haudricourt's view remains the one generally accepted today. (See Haudricourt's theory of tonal development in the next section.)

Table 1 - Reliability of research published on the internet versus in print

Electronic information today is easily accessible through the internet, including sources such as Wikipedia or the Encyclopaedia Britannica. Yet no serious linguist should rely heavily on these media, or on similar platforms, as the foundation of academic research. Even in the case of these two "prestigious" sources, when a widely accepted theory changes, the corresponding entry may or may not be updated, and if it is, there is no guarantee that the revision will be timely. Once such changes occur, they are rarely tracked across the chain of non‑scholarly references that proliferate online (blogs, social media, emails, and other informal channels). A piece of misinformation that is five years old can still be encountered as if it were a new discovery by someone coming across it for the first time.

It often does not occur to novices in the field that what circulates online is not necessarily reliable scholarship. More often than not, it is only a summary of what has already been repeated elsewhere, not original academic work by serious researchers. This is precisely why books and peer‑reviewed publications in print continue to matter: they provide a stable, verifiable record that cannot be so easily altered or diluted by the churn of the internet.

The Austroasiatic position rests largely on the cognateness of basic vocabulary shared between Mon‑Khmer and Vietnamese. The question of tonality remains relevant here: although the Mon‑Khmer equivalents under examination are toneless, in many cases they correspond to Vietnamese etyma that also align with Chinese cognates within a tonal framework. From the author's perspective, certain Mon‑Khmer items, identified by Maspero as of Thai origin, may in fact be loanwords from Vietnamese, re‑packaged with a tonal substitute such as a glottal stop [ʔ] after the original tone was lost.

Interestingly, the same items in Mường subdialects can appear with tones. More broadly, the Mường language has retained its tonal system despite prolonged contact with neighboring Mon‑Khmer groups. This persistence suggests that tonality in Mường, and by extension in Vietnamese, cannot be dismissed as a superficial borrowing but reflects a deeper structural feature. It is worth recalling that Mường is classified within the same family as Vietnamese.

There are relatively few true cognates between Vietnamese and Mon‑Khmer basic vocabulary, and many of the Mon‑Khmer items cited as such rest on dubious etymological foundations. Beyond the lexical items already listed in this section, even the names of the twelve zodiac animals, namely chuột, trâu, cọp (hùm), mèo, rồng, rắn, ngựa, dê, khỉ (vượn), gà, chó, and heo, illustrate the problem. A handful of correspondences can be identified with more certainty, for example: Old Khmer /cnam/ ~ VS 'năm' (year) 年 nián; Old Khmer /cau/ ~ VS 'cháu' (nephew) 侄兒 zhír; Khmer /babuh/ ~ VS 'bọt' (bubble) 泡 pào.

At the same time, there are numerous fundamental Vietnamese words for which Mon‑Khmer provides no cognates at all. Examples include 蓮藕 lián'ǒu ~ VS 'ngósen' (lotus stem); 'đồng' 田 tián (paddy field) ~ VS 'ruộng'; and 'đồng' 銅 tóng (bronze) ~ VS 'thau'. These gaps underscore the limitations of the Mon‑Khmer hypothesis when applied to the Vietnamese core lexicon.

The possibility that many so‑called basic cognates in Mon‑Khmer are in fact Vietnamese loanwords supports a reverse logic to Maspero's claim about the non‑inheritance of tones—namely, that tones could not be acquired naturally or intuitively by speakers of non‑tonal languages. A parallel phenomenon can be observed in Japanese and Korean, where Chinese loanwords appear without tones, even though we know from historical records that they were borrowed directly from Chinese during the Tang Dynasty (618–907 A.D.).

With this in mind, let us examine the nature of the Thai, Mon‑Khmer, and Vietnamese basic vocabulary that undermines the postulations advanced by Maspero and Haudricourt, namely Maspero's thesis of Thai originality and Haudricourt's theory of tonogenesis. In what follows, the author will elaborate on each etymon, grouping them under a Sino‑Vietnamese label that accompanies each cited item. To begin, we may enumerate several Vietnamese words from Maspero's own examples (Études sur la Phonétique Historique de la Langue Annamite, 1912), which he classified as having a Mon‑Khmer substratum and Thai cognates. In each case, however, the author finds that they also display clear Chinese and Sino‑Tibetan correspondences:

A) Mon‑Khmer (items Maspero accepted as substratum in Vietnamese, following Haudricourt)

1. rừng 林 lín 'forest' (SV lâm)

  • Derivation: M 林 lín < MC lim < OC ɡ·rɯm. Cf. OC srɯm (SV sâm, VS rậm). Cantonese /lam4/.
  • Pattern: /l‑ ~ r‑/ parallels include 龍 lóng (SV long) ~ VS rồng 'dragon'; 蘢 lóng (SV long) ~ VS rậm 'dense'; 壟 lóng (SV long) ~ VS rẫy 'farming ridge'.
  • Cognates: Burmese rum 'dense'; Kachin diŋgram2 'forest'; Lushei ram 'forest' (Starostin). Shafer: Sino‑Tibetan Luśei ram (p. 67); Central Branch: Kukis r2am, Ngente, Haka ram (p. 230).
  • Mon‑Khmer parallels: Old Mon /grīp/, modern /gruip/; Danaw /pʿrɑ2bo4/; Riang White /priʔ/; Riang Black /prɪʔ/; Palaung /bréɪ2/; Wa /brɑʔ3/; Old Khmer /vraɪ/; Sakai /brɪ/; Besisi /ʾmbri/; Semang /těpɪʾ/; Srê /brɪ/; T'eng /brɪ/; K'mu /mprɪ/; Khasi /brɪ/; Mundari /bɪr/.
  • Wiktionary: Etymologically from Proto‑Sino‑Tibetan rəm 'jungle, forest, country, field' (STEDT ram). Cognate with 森 (OC srɯm 'forest'), Mizo ram 'forest, country', Karbi ram 'jungle'. Alternatively, an areal word (Schuessler 2007), shared with Khmer រាម riəm 'jungle along a stream', Old Khmer rām 'inundated forest', Mon ရာံ rèm 'copse'.

2. áo 衣 yī 'shirt' (SV y)

  • Derivation: M 衣 yī, yì < MC ʔiəi, ʔɨj < OC *qɯl, qɯls.
  • Notes: Starostin: 'clothes, garment, gown'. As a verb, also ʔjəj‑s, MC ʔyj (FQ 於既), Pek. 'to wear'. Sometimes conflated with 依 ʔjə.
  • Related form: 襖 ào (SV áo) 'coat'. Attested late (earliest in Shuowen Jiezi), possibly Austroasiatic in origin. Compare Proto‑Mon‑Khmer ʔaawʔ 'upper garment', whence VS áo, Mường ảo, Bahnar ao, Khmer អាវ ʼaaw, Pacoh ao.

3. chim 禽 qín 'bird' (SV cầm)

  • Derivation: M 禽 (擒) qín < MC gim < OC ɡrɯm. Tang reconstruction: ghyim.
  • Dialects: Cantonese kam4; Hakka kim2; Tc ʑin12; Ôc ʑiaŋ12; Shuangfeng ʑin12.
  • Classical sources: Shuowen defines 禽 as 'two‑footed creatures with feathers'; Kangxi cites multiple glosses, including 'bird and beast collectively'.
  • Guangyun: 禽 琴 巨金 羣 侵B, MC gi̯əm.
  • Starostin: Since Late Zhou, 禽 often used for 'wild bird(s)' ('something caught'), while 擒 is used for 'to catch, capture'.

4. lúa 來 lái 'unhusked rice' (SV lai)

  • Derivation: M 來 lái, lài, lāi < MC ləj < OC mrɯːɡ. Tang reconstruction: ləi.
  • Dialects: Cantonese lai4, loi4, loi6; Hakka loi2.
  • Shuowen: associates 來 with 麰 'barley/wheat'.
  • Starostin: Shijing OC rjəs. MinNan forms: Jianou lej2, Jianyang le2, Shaowu li2.
  • Wiktionary: 來 originally a pictogram of wheat, later borrowed for 'to come'. Related to 麥 (OC *mrɯːɡ 'wheat'). Cognate with Burmese လာ (la 'come'), Proto‑Vietic laːjʔ.
  • Notes: Vietnamese lúa is an archaic loan; regular Sino‑Vietnamese is đạo 稻. The irregular tonal development suggests a complex borrowing history, possibly involving an intermediate stage with ‑k > ‑ʔ.

5. ngày 日 rì 'day' (SV nhật)

  • Derivation: M 日 rì, mì < MC ȵit < OC njiɡ.
  • Dialects: Min forms—Xiamen tɕit8, lit8; Chaozhou zik8; Fuzhou nik8; Jianou ni8; Cantonese /jat8/, /jit8/.
  • Sino‑Tibetan parallels: OB nyi‑ (nyin); Dwags nyen‑te; Old Kukish k‑ni; Luśei, Meithlei ni; Burmish ńi‑; Loloic ńi; Akha nẵ¯; Ulu nie. Baric: Bodo ‑ni, Dimasa ‑nai, Atong ‑ni, etc.
  • Mon‑Khmer parallels (Luce): Old Mon /tŋey/, modern /tŋai/; Danaw /tsʿɪ1/; Riang White /sʿɤŋyiʔ/; Palaung /săŋɑ'i2/; Wa /ʃɪ4ŋɑiʔ3/; Old Khmer /tŋaɪ/; Sakai /těŋŋɪ/; Srê /ŋái/; K'mu /simyi/; Khasi /sngi/; War /juŋai/; Gadaba /sĩi/.
  • Vietnamese variants: VS giời 'sun' < trời 'heaven, sky'.

B) Thai (Vietnamese words of Thai origin as posited in Maspero's list)

    1. gà 雞 jī 'chicken' (SV kê)

    Derivation: M 鷄 jī < MC kiej < OC *ke: 
    Phonetic pattern: /j- ~ g-/.
    Examples: gàmái: 雞母 jīmǔ 'hen'; gàtrống: 雞公 jīgōng 'cock' (Cantonese, Minnan, including Hainanese). Also gàmẹ: 母雞 mǔjī 'hen'; gàcồ: 公雞 gōngjī 'cock'.
    Related correspondences: cf. jìn 近 (SV cận: gần), jì 記 (SV ký: ghi), jì 寄 (SV ký: gởi), jí 急 (SV cấp: gấp).

    2. vịt 鵯 bēi 'duck' (SV phi, thiết)

    Derivation: M 鴄 pī, pǐ (phất, tiết) < MC pjie < OC *pʰid, now considered obsolete in Sinitic and Sino-Xenic; the common word for "duck" in modern Sinitic is 鴨 (OC *qraːb).
    Wiktionary: Tai-Kadai: Proto-Tai *pitᴰ ('duck') > Thai เป็ด (bpèt), Lao ເປັດ (pet), Zhuang bit;
    Proto-Vietic *viːt ('duck') > Vietnamese vịt, for which Alves (2015) proposes a Tai origin;
    Sino-Tibetan: Miju kɹɑi³⁵ pit⁵⁵ ('duck'); Pela pjɛ̱t⁵⁵ ('duck'), Zaiwa pje̱t⁵⁵ ('duck'); Proto-Lolo-Burmese *baj¹/² ('duck') > Burmese ဘဲ (bhai:).
    Dialects: Cantonese 鵯 /bei1/, 鴄 /pat4/.

    3. gạo 稻 dào 'paddy', 'rice' (SV đạo)

    Derivation: M 稻 dào < MC daw < OC *l'uːʔ
    Etymology: Areal word (rice culture originated in the south). Often compared with Proto-Hmong-Mien *mbləu (“rice plant/paddy”), whence White Hmong nplej (Bodman, 1980). The relationship with similar-looking Mon-Khmer words is ambiguous (Schuessler, 2007). Ferlus (2010) proposes a connection to Proto-Austroasiatic *srɔ(ː)ʔ (“paddy”) (Sidwell's 2024 reconstruction; revised from Shorto's 2006 *sruʔ).
    Proto-Austroasiatic: *sroʔ (“taro”) (Sidwell's 2024 reconstruction; revised from Shorto's 2006 *t₂rawʔ), as the two plants share the same farming niche. 
    Viet. lúa is an archaic loanword; regular Sino-Viet. is đạo. Protoform: *ly:wH (~ l^-), Meaning: rice, grain, Chinese: 稻 *lhu:? (~L^h-) rice, paddy, Burmese: luh sp. of grain, Panicum paspalum, Kachin: c^@khrau1 paddy ready for husking. Kiranti: *lV 'millet'
    Alternative reconstructions: Sagart (2011) derives this word from 舀 (OC *lowʔ, *lu, *lo, “to scoop (hulled grain) from a mortar”). If so, since the Hmong-Mien comparandum only has the derived sense of “rice”, it would be borrowed from Chinese rather than the other way around. The native Min word 粙 may be a variant (Schuessler, 2007, apud Norman, p.c.). Schuessler: MC dâu < OC *gləwʔ or *mləwʔ. Starostin's identification of 稻 dào (SV 'đạo') with 'lúa' is cited above.

    4. cam 甘 gān 'sweet' (SV cam)

    Vietnamese equivalent: 'ngọt' @ '𩜌 yuē (SV ngạt)'.
    Derivation: M 甘 gān < MC kam < OC *ka:m; FQ 古三.
    Phonetic pattern: /g- ~ ng-/.
    Classical sources: Shuowen: 美也。从口含一。一,道也。凡甘之屬皆从甘。古三切; Kangxi glosses include 'beautiful, sweet; one of the five tastes', fruit name (俗作柑 'cam'), herbs, and idiomatic uses.
    Examples: 甘心 gānxīn (camtâm), 甘苦 gānkǔ (camkhổ), 甘泉 gānquán (camtuyền), 食不甘味 shí bù gān wèi ('ăn không thấy ngon'), 甘草 gāncǎo (camthảo).
    Note: Maspero related the "cam" doublets to Daic languages such as Thai Blanc, Thai, Laotian, Ahom, Shan, etc.

    5. cam 柑 gān 'orange' (SV cam)

    Derivation: M 柑 gān < MC kam < OC *ka:m.
    Gloss: Orange, Citrus nobilis (Han).
    Etymology: 甘 (OC *kaːm, “sweet”) (Wang, 1982); in light of the citrus fruit's southern origin, possibly connected with Austroasiatic; compare Proto-Austroasiatic *ŋaːm (Schuessler, 2007).

    6. cam 疳 gān 'infantile disease' (SV cam)

    Derivation: M 疳 gān (historically linked to M 甘 gān) < MC kam < OC *ka:m.
    Dialectal note: Hakka gam1.
    Classical sources: Kangxi: 疳 as pediatric disease from eating sweet things; detailed traditional medical descriptions.
    Example: 疳積 (gānjī, SV camtích, 'infantile disease').

    7. cả 價 jià 'price' (SV giá)

    Compound usage: 'giácả' 價格 (jiàgé, SV giácác, 'price').
    Derivation: 價 jià, jiè, jie < MC ka < OC *krajʔs; related 賈 jià, jiă, gǔ (giá, giả, cổ).
    Classical sources: Shuowen and Kangxi gloss 價 as 'value, price', with historical borrowing interplay between 價 and 賈; Guangyun gives 駕 古訝 for related readings.
    Note: Maspero did not associate 'cả' with 價 or the disyllabic 價格, and thus posited a Daic origin; however, 'giácả' is of Chinese origin in formation.

C) Old Chinese (Vietnamese words to which Maspero assigned a Thai origin)

      Maspero listed a number of Vietnamese words he believed to be of Thai origin. Haudricourt (1961: 51–52), however, argued that many of these are better understood as Old Chinese loans into both Vietnamese and Thai.

      1. chèo 棹 diáo 'to row' (SV trạo)

      Derivation: M 棹 (桌, 櫂) zhào, zhuō, zhuó (trạo, trác) < MC ɖaɨw  < OC *rdeːwɢs
      Starostin: originally written 櫂 (Late Zhou), reconstructable as ɬ(h)e:kʷ‑s. After Han, the reading shifted to d.(h)ie:\w (retroflex development in lateral hsieh‑sheng series), hence the later form 櫂 (attested since Jin).
      Later Han reading: ɬ(h)e:kʷ, MC ḍạuk, Mand. zhuo 'a kind of bowl, vessel'.
      Notes: VS chèo is colloquial; regular Sino‑Vietnamese is trạo.
      Austric: Thai ʔcɛ:w.A 'to row', Khmer ce:w 'row, oar', Mon tasu 'paddle': phonology suggests a very late (post-MC) borrowing from Chinese for all these forms.  

      2. bè 筏 fá 'raft' (SV phiệt, VS phà)

      Derivation: M 筏 fá < MC bʷiɐt, pwat < OC *pa:d, *bad. Cf. 'bắc' 艊 (舶) bó (SV bạc) < MC baɨjk < OC *bra:g. Ex. 船舶 chuánbó (thuyềnbè) 'ships'.
      Note: 舶 (bó, VS 'bắc', 'large oceangoing ship') Japanese: びゃく (byaku) 'large oceangoing ship'

      3. bánh 餅 bǐng 'bread, cake' (SV bính)

      Derivation: M 餅 bǐng < MC pjɛŋ < OC *peŋʔ
      Example: 白餅 báibǐng (VS bánhdày); cf. 包餅 bāobǐng (SV bòbía, from Teochew, literally 'bánhbao', 'lapxuong tapioca spring roll'); also 'bánhpía' ('beancake').
      Notes: Early Nôm attestations (Ngọc Nam Chỉ Âm, 16th c.) show alternation /baj2 ~ jaj2/.
      Descendants: 
      • → Khmer: បាញ់ (bañ, 'cake, pastry')
      • → Lao: ແປ້ງ (pǣng, 'flour; starch; powder')
      • → Thai: แป้ง (bpɛ̂ɛng, 'powder; flour; starch')
      • → Vietnamese: bánh ('pastry', 'cake', 'bread'), 'bánhpía' ('Suzhou-style mooncake')

      4. tiếng 聲 shēng 'sound, voice, word, speech, language' (SV thanh)

      Derivation: M 聲 shēng < MC ɕiajŋ < OC qʰjeŋ
      Dialects: Cant. ʃieŋ21; Hainanese tje1; Amoy sɨŋ11 (lit.), siã11; Chaozhou siã11; Fukienese siŋ11 (lit.)
      Classical sources: Shuowen defines 聲 as 'sound'; Kangxi cites multiple glosses including 'music', 'resonance', 'speech'.
      Examples: 聲張 shēngzhāng (VS lêntiếng, 'to voice'), 聲名 shēngmíng (VS danhtiếng, 'renown').
      Notes: VS tiếng reflects a colloquial development from the same root.

      5. đũa 箸 zhú 'chopstick' (SV trợ, chừ, trừ)

      Derivation: 箸 zhù, zhú, zhuó, zhuò < MC ɖɨə̆ < OC *tas, *das
      Dialects: Hainanese /du2/.
      Cultural note: Likely a Yue loan into Chinese. Chopsticks are tied to rice culture, which originated in the South (Hunan region). Northern Chinese, who did not cultivate rice early on, adopted the term later. To avoid the taboo 倒 dào (SV đảo, VS đổ 'overturn') in boat‑based cultures, southerners coined 筷, homophonous with 快 kuài (VS mau, 'fast'). However, it is also said that the taboo arose from homophony with 住 (zhù, 'stopping') in boatmen's language. The older word 箸 is still used in almost all Min dialects and sporadically in other topolects, such as Southern Wu topolects including Wenzhounese.

      6. nàng 娘 niáng 'miss', 'girl', 'she', 'mother' (SV nương)

      VS variants: ná, nạ, nường.
      Derivation: M 嬢 (娘) niáng < MC ɳɨaŋ < OC *naŋ 
      Dialects: Fukienese nuəŋ12; ZYYY niaŋ12; Amoy nĩu12; Chaozhou niẽ12; Shanghai niã32.
      Related: 妳 (SV nhĩ). In Beijing colloquial, 娘兒 niár 'mom'. It has been suggested that the word is a loan from Old Turkic anaŋ (“your mother”), from Proto-Turkic *ana ~ *eńe (“mother”) (whence Turkish ana and Uyghur ئانا (ana)) and *-iŋ (“second person singular possessive suffix”) (Vovin and McCraw, 2011).
      Notes: '' ancient sound to call '' ('mom'). § 'Phậtthuyết': 'Chẳng biếtơn áng ná.' VS nạ preserves the older sense 'mother'. 
      Descendants:
      • → Khmer: នាង (niəng, 'young woman; girl')
      • → Lao: ນາງ (nāng, 'woman; girl; lady; Mrs.')
      • → Thai: นาง (naang, 'woman; wife; female lover')
      • → Vietnamese: nàng ('lady; young woman; she')

      7. mèo 貓 māo 'cat' (SV miêu)

      Derivation: M 貓 (猫) māo, máo < MC miaw, maɨw < OC *mrew, *mreːw
      Related: 卯 mǎo (SV mão, VS mẹo).
      Example: 卯年 mǎonián ~ VS nămmèo or nămmão ('Year of the Cat').
      Note: In the Vietnamese zodiac, 卯 corresponds to the cat, not the rabbit (兔年 tùnián, SV Thốniên, VS nămThỏ). 卯年 (Mǎo year) is interpreted as the “Year of the Cat,” whereas in China it became the “Year of the Rabbit.” The confusion stems from the phonetic similarity between 貓 (māo, 'cat') and 卯 (mǎo), with 卯 functioning as a phonetic substitute. Because cats were considered inauspicious in Chinese belief, 'Year of the Cat' (貓歲 māosuì, SV miêutuế) was reinterpreted and misread as “Year of the Rabbit” (卯兔 mǎotù, SV mãothố).

D) Additional items (Haudricourt's claims of Austroasiatic loans in Thai)

      In addition to Maspero's cited examples, Haudricourt (1961) identified several more Vietnamese words that he described as Austroasiatic loans into Thai. Amusingly, each of these also shows clear cognacy with Chinese forms:

      1. bụng 腹 fù 'abdomen' (SV phục)

      Derivation: M 腹 fù < MC puwk < OC *pug
      Phonetic shifts: OC p‑ > VS b‑; M f‑ > VS b‑.
      Comparative data: Tibetan (W) ze‑a~bug 'maw, fourth stomach of ruminants'; Burmese pjəuk 'belly', 'stomach'; Lushei KC puk; Lepcha ta‑fuk, ta‑bak 'abdomen'; Kiranti ʔpo/k. Also Sho puk; Kham phu 'belly'; Gyarung tepok.
      Sino‑Tibetan: From Proto-Sino-Tibetan *d-puːk ('belly; vitals; hollow object; cave'); cognate with 𥨍 ('cave'), Tibetan ཕུགས (phugs, 'innermost parts'), Burmese ဗိုက် (buik, 'belly'; 'pregnancy'), အပေါက် (a.pauk, 'hole'), Chepang तुक् ('belly'; 'stomach'), Proto-Bodo-Garo *bi(ʔ)-buk ('guts'), Cogtse Situ /tə-pōk/ ('belly'), Brag-bar Situ /tə-vōk/ ('belly'), Proto-Tani *puk ('heart') (STEDT; Schuessler, 2007; Zhang, Jacques, and Lai, 2019).
      Also compare Austroasiatic words: Proto-Mon-Khmer *bo()k ('belly'), Khmer ពោះ (pŭəh, 'belly'), Vietnamese bụng ('belly') (Shorto, 2006; Schuessler, 2007).

      2. nghe 聽 tīng 'hear' (SV thính)

      Derivation: M 聽 (听) tìng, tīng < MC tʰɛjŋ < OC *l̥ʰeːŋ, *l̥ʰeːŋs
      Dialects: Hainanese /k'ɛ1/; Amoy thiɛŋ11, thiã11; Chaozhou thiã11.
      Sound correspondences: /t‑, d‑ ~ ng‑/, e.g. 停 tíng (SV đình) ~ VS ngừng ('pause'); 短 duǎn (SV đoản) ~ VS ngắn ('short').
      Notes: 聞 wén (VS nghe, 'hear') may underlie VS ngửi 'smell' as a later semantic development. Cf. 門 mén ~ VS ngõ ('gate').
      Example: 聽話 tīnghuà: nghelời ('obey'), 聽說 tīngshuō: nghenói ('hearsay'), 凝聽 níngtīng: nghengóng ('listening'), 聆聽 língtīng: lắngnghe ('listen attentively'), etc.

      3. cổ 'neck, dewlap' (SV hồ, SV cổ, cồ)

      Derivation: M 胡 hú < MC ɦɔ < OC *ga:
      Dialects: Cant. wu4; Hakka fu2. Tang reconstruction: /ho/, Proto-Tai *ɣo:ᴬ.
      Classical sources: Shuowen defines 胡 as 'dewlap of cattle'; Kangxi glosses include 'throat', 'neck', 'dewlap', 'longevity'.
      Proto-Vietic *koh ('throat'; 'neck'), from Proto-Austroasiatic *kɔːʔ ('neck') (Sidwell, 2024). Cognate with Tho (Cuối Chăm) kɔː⁵, Khmer ក (kɑɑ), Bahnar hơko, Mon ကံ.
      Comparative: Tibetan kru‑kru 'windpipe'; Kachin z^jəkhro1 'throat', 'gullet'.
      Notes: VS cổhọng ~ cuốnghọng 胡嚨 húlóng ~ 喉嚨 hóulóng ('throat'). Modern M 脖子 bózi corresponds to VS cáicổ. Vietnamese compounds like cổchân 'ankle' (lit. 'neck of the foot') reflect the same semantic extension.

      4. cằm 頷 hàn 'chin', 'jowl' (SV hàm, VS cằm, ngậm)

      Derivation: M 頷 hàn, ǎn, hán < MC ɦəm < OC *ɡɯːm, ɡɯːmʔ
      Etymology: From Proto-Sino-Tibetan *mV-qəm ('jaw'; 'chin'; 'molar') (STEDT under *gam). Bodman (1980) considers it to be the endoactive of 含 (OC *ɡɯːm, 'to hold in the mouth'), literally 'the thing that holds something in the mouth'. Starostin: glossed as 'chin', 'lower jaw' (Late Zhou). Within Chinese, cognate with 函 (OC *ɡuːm, *ɡruːm, 'to contain; box; letter') (Schuessler, 2007). 銜 (OC *ɡraːm, 'to carry in the mouth; horse's bit') is probably related. 
      Dialects: Amoy, Teochew am4.
      Notes: Modern M 下巴 xiàbā (SV hạba) is the standard Mandarin word for 'chin'. Vietnamese cằm may derive from a disyllabic MC form /xaba/ > /χamba/ > /kamba/ > /kamɓ/ > /kăm/ through epenthesis and labial conditioning.

      5. cà 茄 qié 'eggplant' (SV già, VS 'cà')

      Derivation: M 茄 qié < MC kaɨ, gɨa < OC *ga, *gal, *kra:l
      Dialects: Cant. khe12; Amoy khe11, kio12; Chaozhou kie12; Fuzhou kia11; Shanghai ka32.
      Etymology: Attested very rarely and late, earliest in the 59 BCE 'Slave's Contract' (《僮約》) by Wang Bao (王褒) (Wang et al., 2008): '二月春分,……別 茄 披蔥' ('In the second month of the year, the Spring Equinox […] separate and transplant seedlings of eggplant and scallion.'). Alves (2022) relates this to Proto-Vietic *gaː (whence Vietnamese cà), which he considers an early Chinese loanword. Per Starostin, the earliest meaning was 'lotus stalk' (OC kra:j, MC ka). The sense 'eggplant' is attested from Jin.
      Notes: The MC reading ga is exceptional and may be dialectal. Vietnamese is colloquial; regular SV is già. Likely a Yue loan into Chinese, since eggplant was not native to northern China. Compare 西紅柿 xīhóngshì and 番茄 fānqié ('foreign egg‑fruit') for later introductions like the tomato.

        Despite the extensive examples laid out above, both Maspero and Haudricourt overlooked the possibility that nearly all the cited items may trace back to Chinese cognates. Whichever of their views one adopts, and whatever kind of relationship the etyma might suggest, the core question remains unchanged: whether these words were borrowed from Chinese into Vietnamese, from Vietnamese into Chinese, or whether they stem from a shared ancestral source. This ambiguity persists across cases like lúa ~ gạo ~ dào 'paddy (rice)' and cà ~ qié 'eggplant', especially when viewed alongside other items such as đường ~ táng 'sugar', voi ~ wēi 'elephant', chuối ~ jiāo 'banana', dừa 'coconut', chó ~ gǒu 'dog', and sông ~ jiāng 'river'. These, and dozens of other foundational words, consistently point to a Yue substrate, and many of them also show cognacy with Austroasiatic and Austronesian forms, as noted by other scholars (see Luce's list below).

        In the case of Chinese and Vietnamese, whenever correspondences appear in their vocabularies, the likelihood is strong that they are related to one another rather than to any outside language. Their contact history stretches back more than 2,250 years BP, at least from the pre-Han period onward. Whether the direction of borrowing was from ancient Chinese into Vietnamese or the reverse, the relationship is evident in shared items such as the names of the twelve animals of the zodiac, which correspond to the Earthly Branches.

        The Chinese characters that represent these words today are later developments. Each is built on the structural pattern {radical + phonetic}, where the radical functions as the semantic indicator. They are not the original ideographs of the earliest stage, such as 火, 日, 刀. This fact opens the possibility that some basic nominals in Vietnamese may predate the script and reflect Yue or southern sources. As a matter of fact, the Annamese did not need to wait until the twelfth century to know how to pronounce intimate, everyday words with tones. On the contrary, the evidence suggests that many fundamental items may have been Yue loanwords into Chinese. Examples include:

        • 豆 (dòu, nồi, 'pot') [ the graph also serves as a phonetic loan for 'bean' (荳), a usage that still exists. ]

        • 弩 (nǔ, , 'crossbow') [ > VS nỏ M 弩 nǔ < MC nuo < OC *naːʔ. According to Starostin, Viet. is an archaic loanword; a later borrowing from the same source is Viet. nỏ. Standard Sino-Viet. is nỗ. In Chinese, 弩 is attested since Late Zhou (Zhouli). Already in Shujing appears 砮 *n(h)āʔ, *n(h)ā, MC nó, no, Mand. nǔ, Viet. nỗ 'flint arrowhead', likely the same root. For *nh- cf. Xiamen lɔ6, Jianou noŋ8. ]

        • 舟 (zhōu, ghe, 'boat') [ M 舟 zhōu (chiêu, châu, chu) < MC tɕɨu < OC *tjɯw, also compare 舠 dāo 'boat' and 刀 dāo 'knife'. The southern Jiangnan people were renowned for water navigation. ]

        • 舠 (dāo, tàu, 'boat') [ M 舠 dāo < MC taw < OC *ta:w. Cf. 刀 dāo 'knife'. According to Schuessler (2007), a loan from Proto-Mon-Khmer *ɗuuk ~ ɗuk 'boat, canoe', whence Khmer ទូក (tuuk) and Vietnamese nốc (< Proto-Vietic ɗoːk 'boat'). Possibly cognate with 輈 (OC tɯw 'trunk, pole'). Yang Xiong's Fangyan notes 舟 (OC tjɯw) was common in central and eastern China, while 船 (OC ɦljon) was used in the west. ]

        • 船 (chuán, thuyền, 'ship') [ > VS xuồng 'small boat'. Note 駕船 (jiàchuán, láithuyền, 'steer a boat', a cognate of chèothuyền), where the signific 馬 mă 'horse' reflects the nomadic north. In the south, however, water-savvy natives coined words with 掉 diáo 'to row' (SV trạo, VS chèo, cf. 櫂 zhào VS chèo 'oar'). ]

        • 井 (jǐng, giếng, 'well') [ M 井 jǐng < MC tsiajŋ < OC *skeŋʔ | Note: Wells might have been difficult to dig in the northwest, where proto-Chinese first arose. ]

        • 耕 (gēng, cày, 'plow') [ M 耕 (畊) gēng < MC kəɨjŋ < OC *kre:ŋ | cf. SV canh. Southern peoples excelled in wet-rice cultivation. ]

        • 種 (zhòng, trồng / zhǒng, giống, 'plant, seed, breed') [ M 種 zhǒng, zhòng, chóng (chủng, chúng, chùng) < MC tɕiowŋ < OC *tjoŋʔ, *tjoŋʔs | Cf. SV chủng. An Chi (2016, vol. II) even boldly suggested a link with trứng 'egg', though the proper form is 蛋 dàn (SV đản) ]

        • 銅 (tóng, thau, 'bronze') [ M  銅 tóng < MC dəwŋ < OC *do:ŋ | Cf. SV đồng 'copper'. The Yue were famed for bronze drums and advanced metallurgy. ]

        • 鋤 (jǔ, cuốc, 'hoe') [ M 鋤 chú, zhù, jǔ < MC dʐɨə̆ < OC *zra | Note: Advanced bronze work likely led to iron extraction and metallurgy as well. ]

        • 鋸 (jū, cưa, 'saw') [ M 鋸 jù, jū (cứ, cư) < MC kɨə̆ < OC *kas | Note: Attested in early Chinese texts (e.g. Shuowen, Hanshu), often paired with 刀 'knife' as 刀鋸. ]

          These examples, among others, suggest that many of the most basic Vietnamese words have deep roots in the Yue substratum, layered with borrowings and convergences across Chinese, Austroasiatic, and Austronesian spheres.

          The genetic affiliation between Chinese and Vietnamese basic words is further supported by the theory that the ancient Yue language contributed significantly to proto-Chinese. As the nomadic ancestors of the Chinese expanded east and south, it is plausible that they borrowed many words from the Yue, whom they regarded as southerners outside their cultural sphere. Around 5,000 years ago, when the so-called pre-Chinese were still nomads on horseback before the founding of the Xia Dynasty, the Yue had already mastered wet-rice cultivation, river navigation, and seafaring. Their influence likely extended southward with the dispersal of the Yue and related Austronesian peoples, with many words originating in what would later become South China.

          Early Yue tribesmen (百越 BaiYue, SV BáchViệt, 'Bod') cultivated the fertile lands along both banks of the Yangtze River, where the states of Shu 蜀, Chu 楚, Wu 吳, and Yue 越 later flourished. As populations expanded across regions before the Qin (秦, SV Tần, 'Chin') unified them into what became 'China', Yue loanwords naturally slipped into the speech of many communities. This influence is especially visible in the adoption of the Yue zodiac system of twelve animals, paired with the Earthly Branches: '子 zǐ, 丑 chǒu, 寅 yín, 卯 mǎo, 辰 chén, 巳 sì, 午 wǔ, 未 wèi, 申 shēn, 酉 yǒu, 戌 xū, 亥 hài'. These correspond to Vietnamese basic words for the same animals: chuột, trâu, cọp, mèo, rồng, rắn, chó, heo, and others.

          One may ask why the pre-Chinese or ancient Vietnamese, who already possessed their own words for these animals, would borrow them from another source. A likely explanation is that such borrowings served spiritual or ritual purposes, whether for the pre-Qin Chinese or for later Vietnamese. In fact, the entire set was reintroduced into ancient Vietic through Early Middle Chinese as the Sino-Vietnamese forms tý, sửu, dần, mẹo, thìn, tỵ, ngọ, mùi, thân, dậu, tuất, hợi, respectively.

          Similarly, these sounded more elevated or scholarly to the masses, much as modern Vietnamese still borrow Sino-Vietnamese terms for the Western Horoscope, e.g. Bạchdương (白羊 Băiyáng) for 'Aries', Kimngưu (金牛 Jīnníu) for 'Taurus'. Such names carry an academic aura precisely because they are less transparent to everyday speakers.

          For the early pre-Chinese, however, alternate pronunciations of the zodiac animals may not have sounded very different from their own words. Otherwise, they would not have needed to substitute 'cat' 卯 (VS mèo) and 'goat' 未 (VS /je1/) with 'rabbit' 兔 tù (VS thỏ) and 'sheep' 羊 yáng (VS ). These substitutions likely reflected cultural sentiment: the Chinese were superstitious about cats, while their northern culture centered on sheep-herding, in contrast to the southern reliance on water buffalo (丑 chǒu, VS trâu) and pigs (亥 hài, VS heo). The point remains that the twelve zodiac animals were cognates across both traditions in antiquity.

          The Sino-Vietnamese zodiac set, which made a round trip back into Vietnamese, illustrates the coexistence of at least two layers of nominals. This supports the hypothesis that many other basic Chinese words may have evolved from what Norman (1988:17) called "an already extinct foreign source," apart from the common etyma shared with Tibetan. That foreign source may have been the Yue substratum, which also shaped the Vietic language. It was from this base that the Yue (百越, 'Bod'), Chu (楚國), and Zhou (周朝) emerged some 3,000 years ago, possibly with contributions from Taic elements. The term "proto-Chinese," as used here, refers to the racially mixed groups who had not yet blended with all the indigenous peoples before their southern expansion.

          Regular lexical interchange is another indicator of affiliation. Many core words are cognate not only between Chinese and Vietnamese but also across Sino-Tibetan. Basic vocabulary does not appear exclusively in Mon-Khmer. For instance, 娘 (niáng, SV nương) corresponds to nàng ('girl') and nạ ('mother'), while 爹 (diē, SV giả) corresponds to both tía ('daddy') and cha ('father'). Such parallels raise the question: is Vietnamese truly a Mon-Khmer language?

          The Vietnamese words shared with Mon-Khmer are fewer, and their similarity may reflect cultural influence rather than genetic inheritance. The Khmer Kingdom was once a dominant power in Southeast Asia, and influence often flows from stronger to weaker states. Later, as the southern state of ĐạiViệt expanded, both Champa and Khmer were absorbed, and their linguistic elements blended into Vietnamese. This reflects a broader anthropological pattern: the dominant polity shapes the linguistic landscape. With annexed territories came new populations and speech forms, which merged with Vietnamese and evolved into a new entity.

          After the decline of Cambodia's ancient Khmer Empire, the Annamese realm, by contrast, expanded in size, ambition, and aggressiveness. Over the following millennium of sovereignty, Annam not only eradicated the Kingdom of Champa to its southern border but also absorbed much of the eastern flank of Cambodia's former territories.

          In today's Vietnam, as one travels further south, one encounters placenames such as Phanrang, Phanrí, Sóctrăng, and others that stand in contrast to the ancient Vietnamese toponyms of the far north. There, deeply rooted Sino-Vietnamese etyma have long been embedded in local names. For example, the prefix Kẻ- ('market', 'city') appears in Kẻchèm, interchangeable with today's SV Từliêm 慈廉 Cílián; in Kẻchợ ~ 市街 Shìjiē; in Kẻbảng ~ 棒街 Bàngjiē; and in Kẻon ~ 峴港 Xiàngăng. Similarly, Chằm- ('marsh') corresponds to 澤 zé (SV trạch), as in Chằm Dạtrạch ~ 夜澤 Yèzé or Chằmdơi ~ 蝠澤 Fúzé. These names reflect an older stratum of settlement and linguistic layering.

          In terms of racial composition, later migrants who resettled in the south inevitably intermarried with local populations, producing mixed descendants. This process mirrored earlier developments in the north, where the growth of both southern China and ancient Vietnam was marked by continual blending of peoples.

          As will be seen in later chapters, Vietnamese and Chinese share most of their basic vocabulary with Sino-Tibetan etymologies. Yet when scaled down, only a few dozen cognates overlap with Mon-Khmer, forming a small subset of a much larger union that includes possible Chinese affiliation. Many of the proposed Khmer-Vietnamese cognates may in fact derive from the same roots that also gave rise to ancient Chinese. With so many items in both Vietnamese and Chinese demonstrably cognate, the real question is whether these are cases of genetic affiliation within the same linguistic family or simply straightforward loanwords. Basic items such as 頭 (tóu, đầu, 'head'), 胡 (hú, cổ, 'neck'), 目 (mù, mắt, 'eye'), 翁 (wēng, ông, 'grandfather'), 婆 (pó, bà, 'grandmother'), 父 (fù, bố, 'father'), 母 (mǔ, mẹ, 'mother'), 兄 (xiōng, anh, 'older brother'), 姊 (zǐ, chị, 'older sister'), 妹 (mèi, em, 'younger sister'), 家 (jiā, nhà, 'home'), 戶 (hù, cửa, 'door'), and others are so fundamental that the ancient Annamese language could not have existed at all without them, which makes it unlikely that they were merely Chinese loanwords. If they were, the language would have to be considered a case of pidginization or even creolization, arising to meet the communicative needs of Chinese immigrants who followed in the wake of the Han conquest.

          It is more likely, however, that genetic affiliation was the true case. From the dawn of humanity, nothing is closer than kinship. In the deepest lexical stratum, we find a small number of words of mixed origin, including Austroasiatic Mon-Khmer and Sino-Tibetan stocks, or more precisely, cognates of roots yet to be fully identified. Given the spread of language contact across space and time, whether in wave-like or ripple-like patterns, the etyma listed above appear to have originated either within the Sino-Tibetan family or from common Taic-descendant forms, such as Yue languages (Cantonese, Fukienese, etc.) that emerged after the break-up of Taic into Tai-Kadai and Yue branches.

          This postulation suggests that Austroasiatic peoples themselves may have diverged from Taic aboriginals in southern China. Later, when new waves of mixed northern resettlers, such as Yue-mixed Han Chinese, moved south into ancient northern Vietnam, they displaced Muong and other indigenous groups, pushing them closer to Mon-Khmer speakers who had migrated from the southwest centuries earlier (Nguyễn Ngọc San 1993). Through such contact, basic words could have entered Vietnamese, especially since Muong minorities maintained constant interaction with Kinh lowlanders in trade and social life. Indeed, King Lê Lợi, who expelled the Ming occupiers after twenty years of harsh rule in the fifteenth century, was himself likely of Muong origin.

          Linguistically, this proposition cannot be dismissed. Many basic words appear in one Mon-Khmer language but are absent in others, while the same words are found in both Vietnamese and Chinese, traceable to earlier historical periods. The reverse scenario, deriving Vietnamese from Mon-Khmer alone, does not hold when considering the time frame of Khmer-Vietnamese cognates. The persistence of Mon-Khmer words in Vietnamese, after filtering out all Chinese-Vietnamese commonalities, suggests that what remains may stem from a mixed stock of indigenous and proto-Viet-Muong lexical seedlings. These remnants, preserved in Muong, reflect the shared heritage of Viet and Muong before their linguistic split, just as their speakers diverged biologically, some mixing with Han, others with Mon-Khmer. It is also possible that Viet-Muong words re-entered Mon-Khmer languages, since their speakers may have originally migrated into the Red River Delta from the southwest (Nguyễn Ngọc San 1993).

          The similarities between Chinese and Vietnamese are thus parallel, concurrent, and plausible, without requiring detailed discussion of shared features such as tonality and phonology. If we continue tracing beyond what Maspero and Haudricourt (1954) provided, through Old Chinese reconstructions and tonegenesis based on Annamese, further evidence emerges. As Shafer's Sino-Tibetan etymologies will show in the next chapter, many more Vietnamese words can be related to Chinese, often surfacing spontaneously in the mind of the researcher, confirming the depth of their historical connection.


           

          Figure 3 – View of the hypothesis of lexical interpolation of respective languages

          [Diagram not reproduced. Labels: Tibetan; Chinese; unknown extinct foreign elements before the Chinese; Zhuang, Miao, Yao, etc.; Vietnamese; Mường; Mon-Khmer]

           

          II) Haudricourt's theory of tonal development

          Haudricourt's hypothesis on the development of tones in Vietnamese holds that tonal distinctions arose from pitch changes conditioned by the nature of initial and final consonants. At the time, his proposal was strikingly innovative and has since become the foundational theory for explaining tonegenesis in a broader sense. Its impact on the study of Vietnamese tones has been profound and enduring, to the point that his view remains widely accepted among scholars. This section, however, challenges his assertion that Vietnamese tones were not fully established until the twelfth century. Haudricourt's postulation can only be sustained if the formation of tones in Vietnamese occurred simultaneously with parallel developments in Old Chinese, unfolding interactively and concurrently, as suggested by their cognate correspondences.

          On related ground, Mei Tsu-lin, in Tones and Prosody in Middle Chinese and the Origin of the Rising Tone (1970), argued that the rising tone (上聲) of Middle Chinese developed through the loss of a final glottal stop -ʔ, corresponding to the hỏi and ngã tones in Vietnamese and to similar tonal categories in modern Chinese dialects. During the seventh and eighth centuries, tonal distinctions were even employed to simulate the length contrast of Sanskrit, fitting neatly into the four-tone system of Middle Chinese in terms of pitch, contour, and duration, as described in a ninth-century Buddhist text. Moreover, rhyming evidence from the Book of Odes shows that Old Chinese words tended to cluster into three or four tonal categories, which later evolved directly into the four tones of Middle Chinese.

          "Argument from analogy is at best suggestive, and without testimony from more direct sources, the theory will remain as one of the many possibilities. Fortunately, three kinds of evidence can now be presented: modern dialects, Buddhist sources bearing upon Middle Chinese, and old Sino-Vietnamese loans."

          "Several dialects of the southeastern coastal area preserve a glottal stop in the rising tone, and the Buddhist sources indicate that the rising tone of Middle Chinese is high, short, and level. Our thesis, then, is that the final glottal stop of Old Chinese is retained intact in the coastal dialects and developed into a high and short syllable in Middle Chinese. We know from acoustic studies that a syllable is high and short if it ends in a voiceless stop, low and long if it ends in a voiced stop, and medium in pitch and duration if it is open [-Ø]. It is also reasonable to assume that when a final stop is lost, the tonal features are retained as reflexes. Therefore, if the final glottal stop (which is voiceless) indeed existed in Old Chinese, its descendant should have precisely the features we said the rising tone did have in the Middle Chinese."

          (Mei Tsu-lin, Tones and Prosody in Middle Chinese and the Origin of the Rising Tone, 1970.)

          Haudricourt emerged as the principal challenger to Maspero's 1916 theory of the non-inheritance of tones, which had been supported by Mon-Khmer non-tonal cognates. Prior to Haudricourt, the prevailing assumption was that tonal contrast could not be derived from non-tonal contrasts, and that tonality in Chinese was an intrinsic, original feature of the language. Even into the late twentieth century, Tung T'ung-ho (董同龢, "中國語音史", Zhōngguó Yǔyīn Shǐ, p. 183), as quoted by Mei Tsu-lin (1977), had also stated, "Ever since the beginning of the Chinese language, we not only distinguished tones, but possessed a tonal system not much different from the four tones of Middle Chinese."

          Haudricourt's 1954 study of Vietnamese tonegenesis overturned this view. He argued that the Chinese tonal system developed historically through the loss of certain final consonants. Specifically, the rising tone (上聲) of Middle Chinese, corresponding to the hỏi and ngã tones of Vietnamese, reflected an earlier final /-h/, itself a reflex of an original /-s/. Evidence for this comes from Chinese words borrowed into Vietnamese as early as the Han Dynasty, when the hỏi and ngã tones were still represented by /-s/. Examples include 義 *ŋrals > Viet. nghĩa (ngã tone) and 墓 *ma:gs > Viet. mả (hỏi tone).

          From this evidence and by analogy, Haudricourt further proposed that morphological derivation in Old Chinese involved alternation between a final /-s/ and its absence /-Ø/, which later gave rise to the departing tone (去聲) of Middle Chinese. For instance, he reconstructed /dâk/ 度 for the verbal form “to measure” (cf. SV độ) and /dâks/ for the nominal form "a measure" (cf. SV đạc). Similarly, /âk/ 惡 for the adjectival form "bad" (cf. SV ác) and /âks/ for the transitive verbal form "to dislike" (cf. SV ). Mei noted that the second member of these pairs falls into the departing tone category, and indeed in Sino-Vietnamese they are consistently departing tones. However, in Sinitic-Vietnamese vernacular forms such as đo and , they appear instead in the level tone (平聲).

          As for Haudricourt's hypothesis on the Vietnamese rising tones (hỏi and ngã), further discussion is required. Beyond the examples of 義 and 墓, additional cases may or may not conform to the paradigm he outlined, and these variations will be examined in subsequent sections.

          Haudricourt's proposal was further developed by Forrest (R.A.D. Forrest, 1960), who equated the reconstructed -s of Old Chinese with the -s suffix of Classical Tibetan. Pulleyblank (E.G. Pulleyblank, "The Consonantal System of Old Chinese, Part II," Asia Major 9, 1962, pp. 206–265) added further support by pointing to foreign words ending in -s whose Chinese transcriptions, dated to the third century A.D., show -ts > -s, which in his theory gave rise to the departing tone. In the same study, Pulleyblank also proposed antecedents for two other tones: two final consonants for the later level tone (平聲), and a final glottal stop -ʔ for the later rising tone (上聲). He argued that Old Chinese contained no open syllables [-Ø]. Having already reconstructed ɗ- and ɓ- as initials, he reasoned by symmetry that they could also occur in final position. Thus, what appears as an open syllable in Middle Chinese may in fact reflect one or the other of these Old Chinese finals, depending on whether the syllable shows contact with a velar or dental coda. Pulleyblank's connection of -ʔ to the later rising tone was based largely on analogy with Vietnamese, given the striking parallels between the tonal systems of Vietnamese and Chinese. The steady accumulation of evidence for the -s hypothesis suggests that such analogies may indeed be valid. Since the sắc (') and nặng (.) tones of Vietnamese developed through the loss of an earlier -ʔ, it is plausible that the Chinese rising tone was derived in the same way.

          In practical terms, the reverse logic has also been invoked: toneless words in Mon-Khmer languages that have tonal cognates in Vietnamese may in fact be Vietnamese loanwords. Haudricourt's theory of tonegenesis rested on the observation that certain Mon-Khmer finals correspond to specific tones in Vietnamese, for example, a final glottal stop [ʔ] correlating with the sixth tone. Yet his claim that Vietnamese only became fully tonal in the twelfth century is untenable. By the end of the tenth century, Tang Middle Chinese had already developed into an eight-tone system. For nearly a millennium, Annamese scholars used Mandarin as their medium of learning, just as scholars in other prefectures of the Middle Kingdom did. The pervasive presence of Tang Middle Chinese disyllabic vocabulary in Vietnamese further demonstrates that tonal development in Annam must have been complete by the time the country achieved full independence. Otherwise, Tang-style verse could not have been so widely appreciated and composed by Vietnamese scholars, a practice that continued as recently as the early 1970s.

          Whether or not Haudricourt's chronology is accepted, the evidence shows that Annamese words were already fully tonal by the thirteenth century. This is confirmed by the Annan Yiyu (安南譯語, Annam Dịchngữ, "Translation of Annamese"), a wordbook compiled by a Chinese envoy to Annam during the Yuan Dynasty, later translated into modern language by Wang Li (王力, 1997). Not only do contemporary Chinese loanwords in Vietnamese carry tones that align precisely with their Chinese counterparts, but older Sinitic-Vietnamese forms also display tonal correspondences traceable to Old Chinese, even before the system evolved into the eight tones of Middle Chinese. For example:

          1. OC *ɦljeds > VS 'thề' (the 2nd tone) > MC dʑiaj > SV 'thệ' (the 6th tone) > 誓 M shì ('vow'),
          2. OC *ŋʷans > VS 'nguyền' (the 2nd tone) > MC ŋuan > SV 'nguyện' (the 6th tone) > 願 M yuàn ('wish'),

          as well as pairs that contrast voiced and unvoiced initials, such as

          3. 'buồng' vs. 'phòng' 房 fáng ('room'),
          4. 'buồm' vs. 'phàm' 帆 fán ('sail'),
          5. 'bữa' vs. 'phạn' 飯 fàn ('meal'), etc.

          Taking into account the tonal factor, since both Vietnamese and Chinese are tonal languages, their pronunciations align closely in Sino-Vietnamese and portions of Sinitic-Vietnamese etyma, suggesting that many of these forms may have originated from common sources in antiquity.
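
          The tonal contrast in doublets such as thề ~ thệ and nguyền ~ nguyện can be checked mechanically. What follows is only a minimal sketch, not a method used in this chapter: it reads the tone of a syllable from its diacritic by Unicode decomposition. The helper names tone_of and strip_tone are hypothetical, and the tone numbering (1 ngang, 2 huyền, 3 hỏi, 4 ngã, 5 sắc, 6 nặng) is an assumption inferred from the "2nd tone" and "6th tone" labels in the examples above.

```python
import unicodedata

# Combining marks that write the five marked Vietnamese tones.
# The numbering is an assumption inferred from the examples in the text.
TONE_MARKS = {
    "\u0300": (2, "huyền"),  # grave accent
    "\u0309": (3, "hỏi"),    # hook above
    "\u0303": (4, "ngã"),    # tilde
    "\u0301": (5, "sắc"),    # acute accent
    "\u0323": (6, "nặng"),   # dot below
}


def tone_of(syllable: str) -> tuple[int, str]:
    """Return (number, name) of the tone carried by one syllable."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return (1, "ngang")  # an unmarked syllable carries the level tone


def strip_tone(syllable: str) -> str:
    """Remove the tone mark but keep vowel-quality marks (ê, ô, ơ, ă ...)."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


# VS ~ SV doublets quoted in the text above.
DOUBLETS = [("thề", "thệ", "誓"), ("nguyền", "nguyện", "願")]

for vs, sv, char in DOUBLETS:
    same_segments = strip_tone(vs) == strip_tone(sv)
    print(char, vs, tone_of(vs), sv, tone_of(sv), "same segments:", same_segments)
```

          In both pairs the consonants and vowels are identical once the tone mark is removed, so the vernacular and Sino-Vietnamese readings differ only in tone. The sketch shows nothing about chronology or the direction of borrowing; it merely makes the correspondence explicit.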

          The model {C(+tone) : V(+tone) : MK(-tone)} illustrates how the three language groups — Chinese, Vietnamese, and Mon-Khmer — developed concurrently, in contrast to the hypothesis advanced by Austroasiatic Mon-Khmer theorists. Such a scenario could have unfolded more than 2,200 years ago, when early Annamese already possessed two or more tones comparable to those of Old Chinese. This was long before Annamese took shape as a distinct language, layering Sinitic elements atop an indigenous substratum, after its ancestral Vietic branch had split from proto-Viet-Muong. By the time local Muong tribesmen in northeastern Vietnam retreated into the mountains under pressure from the Han invasion of 111 B.C., the speech of those Kinh who remained had already become highly Sinicized. In contrast, the Muong, who mingled with Mon-Khmer speakers in the southwestern highlands, maintained closer contact with Austroasiatic traditions.

          From an etymological perspective, when all elements are considered in historical context, the plausibility of Vietnamese and Chinese etyma being cognates expands to encompass a wide range of basic vocabulary. This includes items also found in Mon-Khmer languages, many of which have been classified as Austroasiatic lexicons, for example chết 死 sǐ 'die', máu 衁 huáng 'blood', ngà 牙 yá 'tusk', and others (see http://tlmei.com/tm17web/1976a_austroasiatics.pdf).

          On the Sinitic side, words sharing the same phonetic element eventually produced a profusion of homonyms in modern Mandarin, which today preserves only four tones after centuries of northern influence from Altaic languages such as Tartar and Manchu during their rule over parts or all of China. By contrast, the Annamese language mirrored the fuller tonal development of Southern Chinese lects such as Cantonese and Fukienese, both of which retained complete tonal systems.

          The millennium of Chinese colonial rule, from the Han through the Tang dynasties until the 10th century, provided ample time for the eight-tone system of Middle Chinese to become firmly embedded in Vietnamese, where it remains an inseparable feature. The crucial point is that Annamese did not wait until the 12th century to complete its tonal system, as Haudricourt suggested. Rather, the tonal shifts that unfolded from Old Chinese through Early Middle Chinese into Middle Chinese were adopted in Annam contemporaneously, just as they were in other prefectures of the Middle Kingdom.

          As a result, the eight-tone system in Vietnamese words of shared ancient roots allowed for precise differentiation of meaning among characters built on the same phonetic stem. For example, with the sound...

          1.  口 kǒu (SV khẩu, VS cửa, 'opening')

          • ca 哥 (gē, 'brother')
          • ca 歌 (gē, 'sing')
          • 個 (gè, 'each', VS cái)
          • các 各 (gè, 'every')
          • cáo 告 (gào, 'announce')
          • cao 高 (gāo, 'high')
          • cảo 稿 (gǎo, 'manuscript')
          • cẩu 狗 (gǒu, 'dog', VS cầy)
          • cổ 古 (gǔ, 'ancient', VS )
          • 姑 (gū, 'aunt')
          • cố 估 (gū, 'estimate')
          • cố 故 (gù, 'cause', VS cớ)
          • 句 (jù, 'sentence', VS câu)
          • cục 局 (jú, 'bending', VS cong)
          • 居 (jū, 'reside')
          • 何 (hé, 'how, which')
          • 河 (hé, 'river')
          • hồ 胡 (hú, 'neck', VS cổ)
          • hồ 湖 (hú, 'lake')
          • khả 可 (kě, 'able')
          • kha 珂 (kē, 'jade')
          • khắc 克 (kè, 'overcome')
          • khách 客 (kè, 'guest')
          • khấu 摳 (kòu, 'stingy', VS kẹo)
          • khấu 扣 (kòu, 'knock', VS )
          • khô 枯 (kū, 'dried')
          • khổ 苦 (kǔ, 'bitter')
          • khốc 哭 (kū, 'weep', VS khóc)
          • khốc 酷 (kù, 'brutal')

          2. 方 fāng (SV phương, VS vuông, vừa, mới, 'square', 'recently')

          • bàng 旁 (páng, 'side')
          • báng 謗 (bàng, 'slander')
          • biên 邊 (biān, 'border')
          • phòng 房 (fáng, 'room', VS buồng)
          • phóng 放 (fàng, 'release', VS buông)
          • phòng 防 (fáng, 'safeguard')
          • phỏng 仿 (fǎng, 'imitate')
          • phỏng 訪 (fǎng, 'visit', VS thăm)
          • phương 芳 (fāng, 'fragrant', VS thơm)
          • phường 坊 (fáng, 'quarter', VS hàng)

          3. 工 gōng (SV công, 'work')

          • cang 扛 (káng, 'carry', VS khiêng, gánh, gồng, cõng)
          • cang 缸 (gāng, 'vat', VS ảng)
          • công 功 (gōng, 'force')
          • công 攻 (gōng, 'assault')
          • cống 貢 (gòng, 'tribute')
          • củng 鞏 (gǒng, 'consolidate')
          • giang 江 (jiāng, 'river', VS sông)
          • hạng 項 (xiàng, 'nape, item', VS càng)
          • hồng 紅 (hóng, 'pink', VS hường)
          • hồng 虹 (hóng, 'rainbow', VS mống)
          • hồng 鴻 (hóng, 'swan', 'grand')
          • không 空 (kōng, 'empty', VS trống, rỗng)
          • khống 控 (kòng, 'control')
          • khủng 恐 (kǒng, 'terribly')
          • xoang 腔 (qiāng, 'hollow', 'accent', VS giọng)

          4. 共 gòng (SV cộng, VS cùng, cung, cũng, vòng, 'add', 'common', 'together')

          • cảng 港 (gǎng, 'seaport')
          • cung 供 (gōng, 'supply, offerings', VS cúng)
          • củng 拱 (gǒng, 'cup hands before the chest', VS vòng)
          • cung 恭 (gōng, 'respect')
          • hạng 巷 (xiàng, 'alley', VS hẻm)
          • hồng 烘 (hōng, 'heat by fire', VS hong, hâm, hầm)
          • hồng 洪 (hóng, 'flood')
          • hồng 哄 (hòng, 'clamor', 'coax', VS hống)

          Concluding Point - The essential point is that all of these fundamental words, and hundreds of others, must have been pronounced with tonal distinctions long before the 12th century. Without tonal differentiation, they would have collapsed into homonymy. This demonstrates that Annamese had already developed a full tonal system well before Haudricourt's proposed timeline, regardless of whether one considers it a Sinicized language.
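
          The homonymy argument can be illustrated with a small sketch. It takes a handful of the Sino-Vietnamese readings listed in this section and treats tone purely as the orthographic tone mark, which is a simplification for illustration and not a phonological reconstruction. The helper names `strip_tone` and `group_without_tones` are hypothetical. Vowel-quality marks (ô, ơ, ă and so on) are kept, so only the tonal contrast is removed.

```python
import unicodedata
from collections import defaultdict

# The five Vietnamese tone marks as combining characters:
# huyền (grave), sắc (acute), ngã (tilde), hỏi (hook above), nặng (dot below).
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def strip_tone(syllable: str) -> str:
    """Remove the tone mark but keep vowel-quality marks such as ô, ơ, ă."""
    decomposed = unicodedata.normalize("NFD", syllable)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


# Sino-Vietnamese readings copied from the lists in this section.
READINGS = [
    ("cao", "高"), ("cáo", "告"), ("cảo", "稿"),
    ("cổ", "古"), ("cố", "固"),
    ("khô", "枯"), ("khổ", "苦"),
    ("phòng", "房"), ("phóng", "放"), ("phỏng", "訪"),
    ("bàng", "旁"), ("báng", "謗"),
]


def group_without_tones(readings):
    """Group characters by their reading once tone marks are ignored."""
    groups = defaultdict(list)
    for reading, character in readings:
        groups[strip_tone(reading)].append(character)
    return dict(groups)


if __name__ == "__main__":
    for toneless, characters in group_without_tones(READINGS).items():
        print(toneless, "".join(characters))
```

          With the tone marks removed, the twelve readings in this sample fall into five toneless shapes, which is the kind of merger the paragraph above describes.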

          In addition, since Vietnamese, unlike the Mon-Khmer languages, developed a full tonal system, the argument can be reversed: loanwords from a tonal language such as Vietnamese, when transferred into toneless Mon-Khmer languages, would necessarily undergo morphemic innovation to compensate for the absence of tonal contrast. In such cases, pitch or intonational features may have been recruited to preserve distinctions, much as occurred with Chinese loanwords in other non-tonal languages like Japanese and Korean.

          Koichi Honda, in his study Tone Correspondences And Tonegenesis In the Vietic Family (Austroasiatic), approaches the entire issue from a Mon-Khmer perspective (emphasis by dchph):

          "Vietic is known as the only sub-group in the Austroasiatic (Mon-Khmer) language family for having tones. Due to the existence of the tones, Vietnamese (or Viet), one of the members of the Vietic family, has long been discussed in terms of its position to which it belongs. In 1912, Maspero grouped Vietnamese as a member of Tai (Thai) languages, mainly because of its tones. Haudricourt, on the other hand in 1954, claimed it belongs to Mon-Khmer family, due to the correspondence of basic words, and posited a hypothesis which is called "tonegenesis". It seems Haudricourt's hypothesis is widely accepted by linguists. However, his hypothesis has not been well attested due to the scarcely obtainable materials for the comparative method."

          And Honda summarizes Haudricourt's hypothesis as follows:

          "The Vietic language did not have tones in the first stage around the year A.D. O. The birth of tones dates back to the 6th century, when a 3-tone system was established, depending on the syllable ending types: (1) open and sonorant-ending syllables became level tone; (2) fricative-ending syllables created falling tone; and (3) stop and glottal stop-ending syllables created rising tone (phonemicising of the rhymes to tones). The third shift took place in the 12th century where 3 tones split into 6 tones depending on the initial consonants; voiced ones became lower series of tones accompanied by the devoicing of initial consonants (phonemicising of the voiced initial consonants to tones). The last stage has been continuing to now where the devoiced initial consonants became voiced without changing their tones (voicing)." (Honda, p. 3)
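
          The staged process in this summary can be restated as a small lookup, purely as a reading aid. The sketch assumes a simplified three-way classification of syllable endings and a binary voicing feature for the original initial. The function name and the ending labels are hypothetical. The tone names for the six-way split follow the pa/ba table cited from Honda (p. 2) in this section. It restates the hypothesis as summarized and is not evidence for or against it.

```python
# Second stage of the summary (6th century): syllable ending -> one of three tones.
ENDING_TO_CONTOUR = {
    "open_or_sonorant": "level",
    "fricative": "falling",
    "stop_or_glottal": "rising",
}

# Third stage of the summary (12th century): each contour splits in two
# depending on whether the original initial consonant was voiced.
SPLIT_BY_VOICING = {
    ("level", False): "ngang",
    ("level", True): "huyền",
    ("falling", False): "hỏi",
    ("falling", True): "ngã",
    ("rising", False): "sắc",
    ("rising", True): "nặng",
}


def predicted_tone(ending: str, voiced_initial: bool) -> str:
    """Return the modern tone that the summarized hypothesis predicts."""
    contour = ENDING_TO_CONTOUR[ending]
    return SPLIT_BY_VOICING[(contour, voiced_initial)]


if __name__ == "__main__":
    for ending in ENDING_TO_CONTOUR:
        for voiced in (False, True):
            print(ending, "voiced" if voiced else "voiceless",
                  "->", predicted_tone(ending, voiced))
```

          Because the lookup is strictly one-to-one, any etymon with reflexes in several tones falls outside what this scheme alone can predict.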

          Whether or not Old Chinese, under the same hypothesis outlined above, can be theorized to have developed into a four-tone system through such a phonemicizing process, it is crucial to note that prior to 111 B.C., before the Han Empire's annexation of the Nam Việt Kingdom, ancient Chinese loanwords, already complete with tonal distinctions, may have entered the earlier form of proto-Vietic. By the time Annam gained independence in the early 10th century, the region, having been a Chinese protectorate for nearly a millennium, had become a flourishing center of Tang-style rhymed poetry, second only to the Middle Kingdom itself. Without the already established eight-tone system inherited from Middle Chinese, it would have been impossible for Annamese poets of that era to compose Tang-inspired masterpieces without producing crippled imitations, lacking the tonal prerequisites essential to the tradition. (See Drake, F.S. ed. 1967. Symposium on Historical Archaeological and Linguistic Studies on Southern China, South-East Asia and the Hong Kong Region). In other words, Middle Chinese loanwords with their full eight-tone system must have entered Sino-Vietnamese well before the Middle Ages, already integrated into its two-register framework, so that eight tones were perceptually distinguished, even though only six are visibly marked in the later Vietnamese orthography, and "the register system is well reflected in the present day Vietnamese." (Honda, ibid, p 13)

          As for what Honda calls "specific irregular words in Viet", which he encountered in the course of his comparative work on the data at hand, he postulates that some other factors influenced voicing in Vietnamese in words where devoicing would be expected:

                                Arem     Ruc                 Muong   Viet
          #7 "chicken"          lakæ:    təlka:1, rəlka:1    ka      gà
          #35 "rice (husked)"   ŋkɔ:     təlkɔ:3, rəlkɔ:3    kaw.    gạo
          
          "Our expected tone for #7 and #35 are ngang and hỏi tones respectively, both of which belong to high register. However, contrary to our expectations, both of them have low register tones with voiced onsets. Since these two words are so closely related to their daily life, it is hard to believe that only two of them developed in a different course. There must be some other factors for this irregularity. Another factor in common to the above two words is the initial consonant cluster, [tk] or [rk]. For reference, Ferlus' reconstructed forms for the above words are #7 *r-ka: and #35 *r-ko:ʔ This is a supporting evidence where initial consonant (or consonant cluster) has something to do with the voicing.

          However, not all the reconstructed forms of Ferlus are reliable. Please look at the following example.

                          Arem      Ruc            Muong   Viet    Ferlus
          #85 "near"      -         təkiɲ1 or 2    xəɲ`    gə`n    *t-kiɲ
          #35 "sand"      təka:c    təka:c3        kac´    kát     *t-ka:c
          

          When I found word #85, I expected the initial consonant cluster *t-k is working in the same way as *r-k is doing. The expectation, however, was betrayed because of word #35. Word #85 is an old form of a loan word from Chinese called quasi-Sino-Vietnamese. Formal form of Sino-Vietnamese for this word is cận [kə.n]."

          (Honda, p 13)

          Could the so‑called “irregularities” and “faulty items” identified by Ferlus in fact be explained as Vietnamese loanwords into Mon‑Khmer languages? Terms such as gạo (from wet‑rice cultivation), gà (domesticated fowl), and cát (sand, as found in riverbeds or coastal shores) would not have been essential to the vocabulary of upland Mon‑Khmer montagnards. Perhaps the question itself would never have arisen if one accepted the straightforward analysis of Vietnamese–Chinese cognates for these etyma, as follows:

          • gà 雞 (jī, 'chicken', SV kê) [See elaboration on the etymology in the previous section.]
          • gạo 稻 (dào, 'paddy, rice', SV đạo) [ M 稻 dào < MC dɑw < OC *lhu:ʔ ~ ɫhu:ʔ (Schuessler: MC dâu < OC *gləwʔ or *mləwʔ). See elaboration in the previous section.]
          • gần 近 (jìn, 'near', SV cận, cấn, ký) [ M 近 jìn (cận, cấn, ký) < MC gɨn < OC *ɡɯnʔ, *ɡɯns |  According to Starostin, also *gərʔ‑s; MC gyn; Mand. jìn 'to come near, keep close to'. In Vietnamese cf. gần 'near, close; adjacent, beside'. For etymology cf. 幾 *kəj 'near' (an old ‑r/‑l variation?). § 雞 jī (SV kê) 'gà'; 記 jì (SV ký) 'ghi'; 寄 jì (SV ký) 'gởi'; 急 jí (SV cấp) 'gấp'. ¶ j‑ ~ c‑(k‑). Note: 近 jìn ~ SV ký ~ VS kề 'close by'.]
          • cát 沙 (shā, 'sand', SV sa) [ M 沙 shā, shà, suō (sa, sá) < MC ʂaɨ < OC *sraːl, *sraːls | ¶ /sh‑, j‑, q‑ ~ k‑/. Examples: 尚 shāng ~ VS còn 'still'; 插 chā ~ VS cài 'stick in'; 擦 cā ~ VS 'rub'; 笑 xiào ~ VS cười 'smile'; 吉 jí ~ VS cát 'luck'; 旗 qí ~ VS cờ 'flag'; 棋 qí ~ VS cờ 'checker'. Alternatively, it may be cognate with sạn 'pebble' or 砂 shā.]

          Additional considerations - Haudricourt's hypothesis is questionable for further reasons:
          1. As Honda (ibid.) has noted, the hypothesis is poorly attested due to the scarcity of reliable comparative materials.

          2. From the outset, the tonal table constructed by Haudricourt for comparison was flawed, diverging from the scheme traditionally employed in Chinese historical linguistics and widely adopted by subsequent philologists.

           It is not like this:

          1. (no mark, ngang)   3. ´ (sắc)    5. ʔ (hỏi)   7. ´ (sắc) -p, -t, -c, -ch
          2. ` (huyền)          4. . (nặng)   6. ~ (ngã)   8. . (nặng) -p, -t, -c, -ch

          (Sources: Norman. 1988, p. 55)

          but it should be in correct alignment like this:

          1. (no mark, ngang)   3. ʔ (hỏi)   5. ´ (sắc)    7. ´ (sắc) -p, -t, -c, -ch
          2. ` (huyền)          4. ~ (ngã)   6. . (nặng)   8. . (nặng) -p, -t, -c, -ch

          Hence, given Honda's summary that "(3) stop and glottal stop-ending syllables created rising tone (phonemicising of the rhymes to tones)" (Honda, ibid.), such a process should instead have given rise to the two departing tones (去聲) in Vietnamese, namely the high [´] and low [.] registers, as classified in the tonal categories of the second table. These categories, faithfully preserved in Chinese historical linguistics and in the classic rhyme books, illustrate how ancient tonal schemes were devised and interpreted, and how the two‑tiered register system evolved out of the earlier three‑ and four‑tone systems of Old Chinese.

          Although the discussion concerns the emergence of three tones in Old Chinese, the distributional order of the pitch scheme in Middle Chinese is crucial, for it reflects how Chinese philologists of antiquity classified tones according to their contours as they appeared in Middle Chinese. Strikingly, this framework does not appear in Haudricourt's tonal schema, revealing a surprising gap in his awareness of the already well‑established Middle Chinese tonal system, even though he was otherwise one of the foremost scholars of the field in the twentieth century.

          Moreover, regarding Honda's observation that "the last stage has been continuing to now where the devoiced initial consonants became voiced without changing their tones (voicing)" (Honda, p. 3), it should be noted that tonal change is not as rigid as Haudricourt's hypothesis suggests. Haudricourt attempted to correlate Vietnamese tonal registers directly with Old Chinese categories, for example, assigning sắc and nặng to shàngshēng (上聲 'rising') tones, and hỏi and ngã to qùshēng (去聲, 'departing') tones, and then reversing these correspondences for Middle Chinese. Such a mapping, however, oversimplifies the situation.

          In reality, both initial and final consonants in Vietnamese words are distributed across all tones in both registers, low and high. For clarity, rather than using the traditional notation of tones 1-4 in two registers, the system here numbers them sequentially from 1 through 8 in paired sets. This reflects how the eight Vietnamese tones are classified (see the second table above) and provides a more precise framework for the discussion that follows.
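
          The sequential numbering can be made concrete with a short sketch. It assumes that a syllable's tone class can be read from modern orthography alone, using the tone mark plus a final -p, -t, -c or -ch for the two entering tones. This is an illustrative simplification. The function name `tone_number` and the table names are hypothetical.

```python
import unicodedata

# Tone marks as combining characters, numbered as in the second table above.
TONE_BY_MARK = {
    "\u0300": 2,  # huyền
    "\u0309": 3,  # hỏi
    "\u0303": 4,  # ngã
    "\u0301": 5,  # sắc
    "\u0323": 6,  # nặng
}

# Traditional Chinese category for each numbered tone.
CATEGORY = {
    1: "陰平", 2: "陽平", 3: "陰上", 4: "陽上",
    5: "陰去", 6: "陽去", 7: "陰入", 8: "陽入",
}

STOP_FINALS = ("p", "t", "c", "ch")


def tone_number(syllable: str) -> int:
    """Return 1-8 for a Vietnamese syllable written in modern orthography."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    tone = 1  # no tone mark means ngang
    for ch in decomposed:
        if ch in TONE_BY_MARK:
            tone = TONE_BY_MARK[ch]
    # Base letters only, so the final consonant can be inspected.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if tone in (5, 6) and base.endswith(STOP_FINALS):
        tone += 2  # sắc or nặng on a stop-final syllable: entering tones 7 and 8
    return tone


if __name__ == "__main__":
    for word in ("ma", "mồ", "mả", "mộ", "mốc", "mộc"):
        n = tone_number(word)
        print(word, n, CATEGORY[n])
```

          Only six marks are written, but the check on the final consonant recovers the two entering categories, giving the eight classes used in the discussion that follows.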

          As several philologists have noted, Haudricourt tended to see only one‑to‑one correspondences between certain initial or final consonants in Mon‑Khmer words and specific tones in Vietnamese. This limitation arose, first, because many of the Vietnamese words of Old Chinese origin in his list were drawn from a relatively small set of attested lexical items; second, because the comparative materials available at the time were scarce and earlier surveys incomplete; and third, because he was likely unaware of the much larger body of Vietnamese words of Chinese origin that exhibit multiple tonal layers, effectively allomorphs. These forms fall into his so‑called "last stage" of tonal development, but in fact they may have undergone repeated cycles of change in the distant past, distancing themselves considerably from their original shapes.

          Moreover, one must also consider the possibility of loanwords moving in the opposite direction: Vietnamese items may have been borrowed into Mon‑Khmer languages, where speakers adapted them to fit a toneless phonological system, just as Chinese lexicons were adapted into Japanese and Korean, rather than the other way around.

          Before proceeding further, it is useful to recall Haudricourt's hypothesis on the development of the six tones of modern Vietnamese, summarized in the following table (Honda, p. 2):


          Haudricourt's hypothesis (1954)

          AD 0        6th century   12th century   Today
          toneless    3 tones       6 tones        6 tones
          pa          pa            pa             ba
          ba          ba            pà             bà
          pas, pah    pà            pả             bả
          bas, bah    bà            pã             bã
          paX, paʔ    pá            pá             bá
          baX, baʔ    bá            pạ             bạ

          and here is the premise for that supposition:

          "It was presupposed that the number of tones indicates chronological development of the Vietic family, i.e. from Arem (0 tone), Ruc (4 tones), Muong (5 tones) and Viet (6 tones). In the light of Haudricourt's hypothesis, Arem shows the first stage (AD.O), then Ruc just after the second stage (6th century), Muong around the third stage (12th century), and Viet the last stage (today)." (Honda, p. 4)

          Let us examine the oft‑cited example of mả (rising tone) [ mis‑cited as mã ] from Haudricourt, which he presented as a cognate of Chinese 墓 mù = SV mộ. The latter is embedded in the nặng register, the 6th tone (陽去 yángqù, 'low departing tone'). In fact, this etymon has evolved into several distinct Sinitic‑Vietnamese and vernacular Vietnamese forms, each pronounced with different tones that are explicitly cross‑tonal and cross‑staged.

          • 墓 mù 'grave' (SV mộ) [M 墓 mù < MC muo < OC *ma:gs. | Chinese dialects: Cantonese mou6, Hakka mu5, Amoy boŋ6, Chaozhou mo4, Fukienese muo5, muoŋ5. According to Starostin, Standard Sino‑Vietnamese is mộ. Cf. also probable borrowings: ma 'funeral', mồ 'tomb'. For mh‑ cf. Amoy boŋ6, Chaozhou mo4, Fuzhou muo5, muoŋ5. GSR: 0802 f. Cf. môđất (土墓 tǔmù, 'mound'), maquỷ (魔鬼 móguǐ, 'ghost'), and machay (墓祭 mùjì, 'funeral ceremony').]

           From this root we find the following Vietnamese reflexes:

          1. mô (thanh ngang, 1st tone, 陰平 yīnpíng 'high level'), e.g., môđất 土墓 tǔmù 'earth mound'
          2. ma (thanh ngang, 1st tone, 陰平 yīnpíng 'high level'), e.g., thama #墓地 mùdì 'graveyard', machay 墓祭 mùjì 'funeral ceremony'
          3. mồ (huyền, 2nd tone, 陽平 yángpíng 'low level')
          4. mả (hỏi, 3rd tone, 陰上 yīnshàng 'high rising')
          5. mã (ngã, 4th tone, 陽上 yángshàng 'low rising'); Haudricourt posited VS mã < OC mâg instead of #4 mả < OC *ma:gs and #7 SV mộ < MC muo
          6. mố (sắc, 5th tone, 陰去 yīnqù 'high departing'), meaning 'bumper' (Fr. butée)
          7. mộ (nặng, 6th tone, 陽去 yángqù 'low departing')
          8. mốc (sắc nhập, 7th tone, 陰入 yīnrù 'high entering'), cf. biamốc (墓碑 mùbēi, 'gravestone')
          9. mộc (nặng nhập, 8th tone, 陽入 yángrù 'low entering'), cf. biamộc (墓碑 mùbēi, 'gravestone')

          The form #(4) mả [mả], or even #(5) [ma4] (as cited by Haudricourt), is in fact a Sinitic‑Vietnamese reflex of the academic Sino‑Vietnamese word mộ [mo6], which is itself a clear cognate of Middle Chinese 墓 mù. In other words, the Vietnamese forms mồ, , and related variants all derive from the same root, or else represent Chinese loanwords that entered at later stages, possibly during Early Middle Chinese. These items may have been borrowed into Vietnamese at different historical periods, beginning as early as Old Chinese, with semantic development from 'mound' to mồ 'tomb'. Alternatively, they may share a common Yue source, perhaps reflected in a form like .

          As Bo Yang (1983, vols. 1–2) notes, in the early Han dynasty denoted an 'earth lump', and graves of that period were flat and level. From this evidence, two scenarios emerge:

          1. Tonal correspondence: A relationship between voiced and devoiced initials or finals could have given rise to the 4th tone. In this specific case, {ʔ ⇒ ~}, e.g., OC mâg > mã, as Haudricourt postulated. This fits within his system of consonantal alternation producing tonal categories.
          2. Later voicing stage: As Honda (ibid., p. 3) observed, "the last stage has been continuing to now where the devoiced initial consonants became voiced without changing their tones (voicing)." This accounts for other cases where tonal distribution reflects subsequent phonological developments rather than Haudricourt's rigid correspondences.

          In short, the aim here is to demonstrate that a single Chinese word could yield multiple reflexes in Vietnamese, each with different sounds and tones, regardless of whether the original Chinese form involved voiced or voiceless initials or finals (cf. Mei Tsu-lin's Tones And Prosody In Middle Chinese And the Origin of the Rising Tone. 1970.)

          Much like the case of 墓 mù discussed above, the following examples further illustrate how multiple tonal shifts could develop from a single Chinese etymon, giving rise to several distinct Vietnamese reflexes, patterns that Haudricourt may not have fully recognized. The extended list serves two key purposes: first, to demonstrate that divergent tones had already taken shape well before the 12th century, functioning to mark subtle semantic distinctions; and second, to affirm their overwhelmingly Sinitic origins, in contrast to the comparatively scant Mon‑Khmer etymons, by showing how all derivatives remain systematically connected with their Chinese historical roots phonetically, tonally, and semantically, a coherence that the Mon‑Khmer comparisons lack.

          • 母 mǔ (SV mẫu, mô) [ M 母  mǔ, mú, wǔ, wú < MC məw < OC *mɯʔ || Starostin: MC mʌw < OC *mǝ̄ʔ. For initial *m- cf. Min forms: Xiamen bo3, Chaozhou bo3, Fuzhou, Jianou mu3. | GSR: 0947 a-e || According to Thiều Chửu's Hán-Việt Dictionary: also SV 'mô': VS 'men', 'mẻ'. 母 mǔ ~ VS 'mái', 'cái' | § 海 hǎi (SV hải) | Example: 酵母 (jiàomǔ, VS menrượu), 母雞 (mǔjī, VS gàmái), 父母大王 Fùmǔ Dàwáng (VS Bốcái Ðạivương), 母系 (mǔxì, SV mẫuhệ), 繼母 (jìmǔ, VS mẹghẻ) ]
            1. VS men 'yeast' (thanhngang, the 1st tone, 陰平 yīnpíng, 'high level tone'),
            2. VS me 'mother' (the 1st tone),
            3. SV mô 'mold' (the 1st tone),
            4. SV mỳ 'venter' (the 2nd tone),
            5. VS mẻ 'female elder' (hỏi, 3rd tone, 陰上 yīnshàng 'high rising')
            6. SV mẫu 'mother' (ngã or the 4th, 陽上 yángshàng, 'low rising tone')
            7. VS mái 'female of animal' (sắc or the 5th, 陰去 yīnqù, 'high departing tone'),
            8. VS nái 'female of animal' (the 5th tone) [ ex. 'heonái' (母彘 mǔzhí, 'female pig') ],
            9. VS cái 'female' (the 5th tone) [ ex. 'con dại cái mang', literally translated, '子呆母忙 zǐ dài mǔ máng'. cf. 海 hăi (SV 'hải' /ha̰ːj/) VS 'khơi' ]
            10. VS mạ 'mother' (nặng or the 6th tone, 陽去 yángqù, 'low departing tone'),
            11. VS mệ 'mother' (the 6th tone),
            12. VS mẹ 'mother' (the 6th tone),
            13. VS mợ 'aunty' (the 6th tone)

          • 梅 méi (SV mai) [ M 梅 méi < MC moj < OC *mɯː | According to Starostin, Japanese apricot (Prunus mume), plum. Viet. 'me' has a narrowed meaning 'tamarind' (cf. Chin. 酸梅 'tamarind', lit. 'sour plum'). An older loanword is probably Viet. mơ 'apricot'. The regular Sino-Viet. reading is mai. For *m- cf. Min forms: Xiamen m2, Chaozhou bue2, Fuzhou muoi2, Jianou mo2. ] we have:
            1. VS mai 'plum' (thanhngang or the 1st tone, 陰平 yīnpíng 'high level tone')
            2. VS me 'tamarind' (thanhngang or the 1st tone, 陰平 yīnpíng 'high level tone')
            3. VS mơ 'apricot' (the 1st tone)
            4. VS muội 'salted dried plum' (nặng or the 6th tone 'low departing tone') [ ex. 'xímuội' 鹹梅 (xiánméi, 'preserved salty plum') vs. 'ômai' (烏梅 wūméi, 'black preserved salty plum') ]

          • 海 hǎi (SV hải) [M 海 hǎi < MC həj < OC hmlɯːʔ |  Etymological: § 母 mǔ > mái > mệ > mể ~ QT 海 hǎi ~ bể > biển. ¶ Sound correspondences: /h‑ ~ m‑/, /m‑ ~ b‑/. | Dialectal reflexes : Cant. hoi2, Hakka hoi3, Teochew hai2. Related forms: For khơi (open sea), note the alternation ¶ /k‑/ ~ /h‑/. Phonological note: 開 kāi (SV khai) in Hainanese is /k'uj1/, while in Cantonese it is /hoj1/.  Cf. 悔 huǐ (SV hối), 況 kuàng (SV huống). Examples: 出海 (chūhǎi, VS rakhơi), 海外 (hǎiwài, VS ngoàikhơi), Interchange "bể" ~ "biển": Compound words illustrate the sound‑change patterns:
            • 大海 (dàhǎi, VS bểcả, biểncả)
            • 苦海 (kǔhǎi, VS bểkhổ, khổhải)
            • 海浪 (hǎilàng, VS sóngbể)
            • 海口 (hǎikǒu, VS cửabể)
            • 海寇 (hǎikòu, VS cướpbể)
            • 海賊 (hǎizéi, VS giặcbể)
            • 海域 (hǎiyù, VS vùngbiển) ]
          1. VS khơi 'sea' (the 1st tone, 陰平 yīnpíng 'high level tone')
          2. SV hải 'sea' (hỏi, 3rd tone, 陰上 yīnshàng 'high rising')
          3. VS bể 'sea' (the 3rd tone),
          4. VS biển 'sea' (the 3rd tone),
            • Toponymic usage: 北海道 Běihǎidào (Hokkaidō) → SV Bắchảiđạo.

            • Comparative notes: Cf. 繁 (緐) fán, pó (SV phồnbàn), 敏 mǐn (SV mẫn), 每 měi (SV mỗi), 梅 méi (SV mai).

            • 溟 míng (SV minh) [ M 溟 míng, mǐng, mì (SV minh, mình, mịch) < MC mieŋ < OC *meːŋ, *meːŋʔ | Etymologically, it is thought by classical commentators to be the same word as 冥 (OC *meːŋ, 'dark', 'black(of water)') (likely in light of the parallelism with the unrelated 海 (OC *hmlɯːʔ, 'sea', 'ocean') < 晦 (OC *hmɯːs, 'dark')). Schuessler (2007) proposes that there's an outside chance this can be instead connected with Proto-Tibeto-Burman *mlik, whence Old Burmese မ္လစ် ('river'), Burmese မြစ် (mrac), Rakhine (mreik, 'sea'), Daai Chin (mlik (tui), 'big water, river, sea')]
            1. SV minh 'dark' (thanhngang, the 1st tone)
            2. VS mênh 'expanse' (thanhngang, the 1st tone)
            3. VS mưa 'drizzle' (thanhngang, the 1st tone)
            4. SV mình 'muddle' (huyền, the 2nd tone)
            5. VS mùng 'vast' (huyền, the 2nd tone)
            6. VS mờ 'indistinct' (huyền, the 2nd tone)
            7. VS bể 'sea' (hỏi, the 3rd tone)
            8. VS biển 'ocean' (hỏi, the 3rd tone)
            9. SV mịch 'dull' (nặngnhập, the 8th tone)
            10. VS mịt 'obscure' (nặngnhập, the 8th tone)
            • 放 fàng (SV phóng) [ M 放 fàng < MC pwoŋ < OC *paŋʔ, *paŋs | According to Starostin, to put away, put aside; neglect; banish. In Viet. cf. also a colloquial word: phỗng 'to take away, to carry away'. | ¶ /f- ~ b-/ : Ex. 房 (fáng, SV phòng, VS buồng, 'room') ], we have:
              1. VS bỏ 'discard' (hỏi or the 3rd tone, 陰上 yīnshàng 'high rising tone')
              2. VS phỗng 'take away' (ngã or the 4th tone)
              3. SV phóng 'release' (sắc or the 5th, 陰去 yīnqù 'high departing tone')
              4. VS buông 'let go' (thanhngang or the 1st, 陰平 yīnpíng 'high level tone')
              5. VS bắn 'shoot' (sắc or the 5th tone)
                • 會 huì (SV hội, cối) [ M 會 huì, kuài, guā, guài, huǐ, guì (hội, cối) < MC kwaj, ɦuɑi < OC *ko:bs, *go:bs ], we have:
            1. VS hay, 'aware' (thanhngang, the 1st tone)
            2. SV hồi 'festival' (huyền or the 2nd tone)
            3. VS hiểu 'understand' (hỏi or the 3rd tone); (cognate to or an alternation of the modern Mand. 曉 xiǎo 'know', 'understand', SV hiểu)
            4. VS đỗi, 'moment' (ngã or the 4th tone)
            5. VS sẽ, 'will' (ngã or the 4th tone)
            6. SV hội 'festival' (nặng or the 6th tone 'low departing tone')
            7. VS hụi 'loan' (nặng or the 6th tone, from Fukienese or Amoy)
            8. VS họp 'meeting' (nặngnhập or the 8th, 陽入 yángrù 'low entering tone')
            9. VS hẹn 'dating, appointment' (nặng or the 6th tone)
            • 賊 zéi (SV tặc) [ M 賊 zéi < MC dzək < OC *zɯːɡ | Possibly Sino-Tibetan; compare Tibetan ཇག (jag, 'robbery') (Coblin, 1986). Schuessler (2007) points out that a palatalized consonant in Tibetan does not usually correspond to an unpalatalized one in Chinese; instead, he compares it to Khmer ឆក់ (chɑk, 'to snatch; to steal');  Based on evidence from early loans from Chinese, e.g. Lakkia kjak⁸ ('bandit') and Rục kəcʌ́ːk ('bandit'), Baxter and Sagart (2014) reconstructs the Old Chinese with a *k preinitial.]
              1. VS chích 'burglar' (sắcnhập or the 7th tone, 陰入 yīnrù 'high entering tone'); as in đạochích: 盗賊 dàozéi
              2. VS cắp 'steal' (the 7th tone);  as in đánhcắp: 盗賊 dàozéi 'steal'
              3. SV tặc 'enemy' (nặngnhập or the 8th, 陽入 yángrù 'low entering tone'),
              4. VS giặc 'enemy' (the 8th tone),
            • 粉 fěn (SV phấn) [ M QT 粉 fěn, fèn < MC pun < OC *pɯnʔ | Dialects: Minnan, including Hainanese hun2, Amoy hun2, Chaozhou huŋ21, Fuzhou xuŋ2 | According to Starostin, the later (and usual) meaning is 'flour'. The word is also used in compounds meaning 'noodles', thus it seems possible that Viet. bún 'vermicelli' is an independent loan from the same source. | ¶ /f- ~ ph-, b-, v-/ ]
            1. VS phở 'noodle soup' (hỏi or the 3rd tone)
            2. SV phấn 'powder' (sắc or the 5th tone)
            3. SV phớn 'powder' (the 5th tone)
            4. VS bún 'rice vermicelli' (the 5th tone)
            5. VS bột 'flour' (nặngnhập or the 8th, 陽入 yángrù 'low entering tone')
            6. VS bụi 'dust' (nặng or the 6th tone) [ Probably associated with 灰 huì (SV muội) ]
            7. VS vụn 'shard' (the 6th tone)
            • 照 zhào (SV chiếu) [ M 照 zhào < MC tɕiɛu < OC *tjews ]
            1. VS soi 'look at the mirror' (thanhngang the 1st tone)
            2. VS noi 'follow' (the 1st tone)
            3. VS theo 'according to' (the 1st tone)
            4. SV chiếu 'reflect' (sắc or the 5th, 陰去 yīnqù 'high departing tone')
            5. VS chói 'reflect' (the 5th tone)
            6. VS chụp as in 'chụphình' 'take picture' (nặngnhập or the 8th tone)
            7. VS rọi 'shine' (nặng or the 6th tone)
            8. VS dọi 'shine' (nặng or the 6th tone)
            • 染 rǎn (SV nhiễm) [ M  染 rǎn, ràn < MC ȵiam < OC *njomʔ, *njoms | According to Starostin, 'be soft'. Somewhat later (since late Zhou) the character was also used for a homonymous *nam (~-emʔ) 'to dye, smear; ('dye' <) infect' (with a variant *namʔ-s, MC ɲe\m). Viet. nhiễm is a standard reading; there also exists a colloquial loan nhuộm 'to dye'. Coblin (1986) compares this to Tibetan ཉམས་པ (nyams pa, “be stained, tarnished, spoiled”); Pan (1987) also notes Proto-Tai *ɲuɔmᴬ as well as Vietnamese nhuộm, both meaning "to dye". Schuessler (2007) cites Downer (1986)'s opinion that form with 上 (shàng) tone is the verb, while form with 去 (qù) tone is the noun meaning "kind of cloth" (Lǐjì). ]
            1. VS lây 'contagious' (thanhngang or the 1st tone)
            2. VS sang 'spread a virus' (the 1st tone)
            3. VS lem 'stain' (the 1st tone)
            4. SV nhiễm 'contract a disease, habit' (ngã or the 4th tone)
            5. VS vẩn 'smear' (hỏi or the 3rd tone)
            6. VS nhuốm 'contract a disease' (sắc or the 5th, 陰去 yīnqù 'high departing tone')
            7. VS nhuộm 'dye' (nặng or the 6th tone)
            8. VS ruộm 'dye' (the 6th tone)
            9. VS nhẹm 'tarnish' (the 6th tone)
            10. VS mắc 'get sick' (sắcnhập or the 7th tone)
            • 深 shēn (SV thâm) [ M 深 shēn (thâm, thẩm) < MC ɕim < OC *hljum, *hljums | Etymology: Most of dialects read /sjəm1/ | ¶ /sh- ~ đ-/ : Ex. 燒 shāo (SV thiêu, VS đốt, 'burn'). Unger (1995) suggests that 深 (shēn) may have had an Old Chinese initial n‑, based on the phonetic evidence of 淰 (shěn), whose component 念 (niàn*) points in that direction. Schuessler (2007) reconstructs 深 as OC nhəm and proposes connections to several Tibeto‑Burman forms: Mizo hniam 'to be low, to sink into (land)', Burmese နိမ့် (nim.) 'low', Tangkhul Naga kʰənim 'to be humble', and Tibetan ནེམས་ (nems) 'sink a little, give way'. He traces these to Proto‑Tibeto‑Burman nem 'low', which STEDT further derives from Proto‑Sino‑Tibetan *s‑n(i/u)(ː)p/m ~ r/s‑nyap/m 'pinch, squeeze; press, oppress; submerge, sink, west, low, soft'. If so, 深 (OC nhəm) ultimately reflects a Proto‑Sino‑Tibetan root. Schuessler also notes the similarity between 深 (OC nhəm) and 沉 (OC d‑ləm), which he interprets as an areal etymon. In the Rites of Zhou, 深 (OC nhəms) 'depth' appears as a nominal derivation with the suffix ‑s, which yielded the departing tone (MC ɕiɪmH). By contrast, 淰 (OC nhəmʔ) 'to be startled and flee (of fish); to sink into the deep' (in the Liji) represents an endoactive derivative with ‑ʔ, which produced the rising tone (Mandarin shěn). ]
            1. SV thâm 'profound' (thanhngang or the 1st tone)
            2. VS sâm 'deep' (the 1st tone)
            3. VS sâu 'deep' (the 1st tone),
            4. VS sẫm 'dark' (ngã or the 4th tone)
            5. VS thẩm (hỏi or the 3rd tone)
            6. VS thắm 'profound' (sắc or 5th tone)
            7. VS đậm 'dark' (nặng or the 6th tone, 'low departing tone')
            8. VS sậm, 'dark' (the 6th tone)
            • 扛 káng (SV cang) [ M 扛 káng, gāng < MC kaɨwŋ < OC *kroːŋ | cf. cõng, gánh, gồng, chống: M 抗 kàng (SV kháng) 抗 kàng, káng < MC  kʰaŋ < OC *ɡaːŋ, *kʰaːŋs  ]
            1. SV cang 'carry' (thanhngang or the 1st tone, 'high level tone'),
            2. VS khiêng 'carry on one's shoulder' (the 1st tone)
            3. VS gồng 'to shoulder' (huyền or the 2nd tone 'low level tone')
            4. VS cõng 'carry on one's back' (ngã or the 4th tone 'low rising tone')
            5. VS gánh 'carry on one's shoulder' (sắc or the 5th, 陰去 yīnqù 'high departing tone'),

            • 蟲 chóng (SV trùng) [ M 蟲 chóng < MC ɖuwŋ < OC *l'uŋ, *l'uŋs | From Proto-Sino-Tibetan *djuŋ ('insect'; 'bug') (STEDT). According to Starostin: 'insect', 'small bird'. Used also for a homonymous *ɬhuŋ (nóng) 'be hot (of weather)'. Standard Sino-Viet. is trùng. ]
            1. VS giun 'earthworm' (thanhngang or the 1st tone, 'high level tone')
            2. VS sâu 'insect' (the 1st tone)
            3. SV trùng 'insect' (huyền or the 2nd tone, 'low level tone'),
            4. VS trùn 'earthworm' (the 2nd tone),
            5. VS sán 'worm' (sắc or the 5th, 陰去 yīnqù 'high departing tone')
            6. VS nóng 'hot' (the 5th tone)
            • 種 zhǒng (SV chủng) [ M 種 zhǒng, zhòng, chóng (chủng, chúng, chùng) < MC tɕiowŋ < OC *tjoŋʔ, *tjoŋs | Etymology: Sino-Tibetan 'Chepang' (tuŋʔ-, 'to plant'), दुङ् (duŋ, 'shoot; sprout'), दुङ्‌सा (duŋ-, 'to sprout; to grow'). Compare 腫 (OC *tjoŋʔ, 'to swell') and 踵 (OC *tjoŋʔ, 'heel'). Related to Proto-Vietic *k-coːŋʔ ('seed') (Vietnamese giống ('seed')), which is likely a loanword from Chinese (Wang, 1948). Pronunciation 2 (trồng, 'to sow'; 'to plant') is the exoactive derivation of pronunciation 1 (giống, 'seed'). According to Starostin: seeds; cereals. Also read *toŋʔ-s, MC couŋ (FQ 之用), Mand. zhòng 'to sow'. The word also means 'kind, sort, race' ( > 'seed'), which is reflected in a colloquial Viet. loanword (from another dialectal source) giống 'kind', 'sort'; 'race', 'breed', 'strain'. For this word, An Chi (ibid. 2016, Volume 2) boldly posited 'trứng' (egg), which ought instead to be 蛋 dàn (SV đản); this illustrates how Vietnamese scholars still tend to match words one by one on phonological grounds alone. ]
            1. VS trồng 'to plant' (huyền or the 2nd tone, 'low level tone', read 'zhòng' in Mandarin)
            2. VS dòng 'breed' (the 2nd tone)
            3. SV chủng 'type' (hỏi or the 3rd tone)
            4. VS giống 'strain' (sắc or the 5th tone, 陰去 yīnqù 'high departing tone')
            • 臭 chòu (SV xú, khứu) [ 臭 chòu, xìu (xú, khứu) < MC tɕʰiəu < OC *kʰljus | Etymology: Schuessler (2007) considers it to be cognate with 犨 (OC *kʰju, 'sound of an ox breathing') and connects it to Burmese ဟိုက် (huik, 'to pant'). Also compare 朽 (OC *qʰluʔ, 'to rot; to decay') (Baxter and Sagart, 2014). According to Starostin, in MC there also exists a reading xjəw (Mand. xiu) (Jiyun); it is interesting to note that standard Sino-Viet. renders it as khưu. These are most probably dialectal variants of the original *khiw-s which gave the standard MC reflex chjəw (note that Viet. thiu 'stale' is a colloquial reflex of the latter; the standard Sino-Viet. form is xú). ]
              1. VS hôi 'smelly' (thanhngang or the 1st tone)
              2. VS ôi 'rotten' (the 1st tone)
              3. SV khưu 'smell' (the 1st tone) 
              4. VS thiu 'stale' (the 1st tone) [ doublet 餿 sòu ]
              5. VS ngửi 'to smell' (hỏi or the 3rd tone) [ doublet of M xìu < MC xǝ̀w < OC *xus ]
              6. VS hửi 'to smell' (the 3rd tone)
              7. VS hủi 'rotten' (the 3rd tone)
              8. SV xú 'bad smell' (sắc or the 5th tone)
              9. SV khứu 'smelling sense' (the 5th tone)
              10. VS thối 'foul' (the 5th tone)
              11. VS thúi 'foul' (the 5th tone)
            • 按 àn (SV án) 'case' [ M 按 àn < MC ʔɒn < OC *ʔa:ns || Comments: cf. 案 àn (SV án, VS bàn, 'table'), '安心 ānxīn (VS yêntâm, 'not worry')' ~ '放心 fàngxīn (vữnglòng, 'feel reassured') | ¶ /Ø- ~ y-, nh-, b-,  f-, v-/ ]
              1. VS nhồi 'to stuff' (huyền or the 2nd tone)
              2. SV án 'press' (sắc or the 5th tone)
              3. VS ấn 'press' (the 5th tone)
              4. VS nhấn 'press' (the 5th tone)
              5. VS bấm 'press' (the 5th tone)
              6. VS nhận 'stuff' (nặng or the 6th tone)
              7. VS ịn 'press' (the 6th tone)
            • 利 lì (SV lợi) [ 利 lì (lợi, lị) < MC li < OC *rids | Dialects: Amoy li32 (lit.); lai32, Hai. lai32, Cant.: lei32 | Etymology: From Proto-Sino-Tibetan *ri:t ('to reap, scrape, shave, cut, sever') (STEDT, Schuessler, 2007); cognate to Mizo rîit ('to scrap with a hoe'), Western Gurung (wriqba, 'to scratch'), Burmese ရိတ် (rit, 'to cut', 'reap', 'mow', 'shave').]
              1. SV lị 'benefit' (nặng or the 6th tone, 'low departing tone')
              2. SV lợi 'advantage' (the 6th tone)
              3. VS lãi 'profit' (ngã or the 4th tone, 'low rising tone')
              4. VS lời 'profit', 'interest' (huyền or the 2nd tone, 'low level tone')
              5. VS lẽm 'sharp' (ngã or the 4th tone)
              6. VS lém 'witty' (sắc or the 5th tone)
            The list could be extended almost indefinitely; such cases are far too numerous to enumerate here.

            It should be noted that the preceding list contains only monosyllabic items. If each of these were placed into formations yielding disyllabic words, such as those etyma built with the morphemic syllable 海 hǎi exemplified above, the roster would likely expand to colossal proportions.

            From the deliberately extensive set of examples, three points stand out. First, the phenomenon of multiple sound changes radiating from a single root is not unique to Vietnamese but is equally attested in Chinese, where a single character may bear several tonal values within one dialect. Second, the correspondences between Old Chinese tones and those of Sino‑Vietnamese are varied and complex, extending far beyond the narrow set of Vietnamese words with the 3rd and 4th tones (hỏi and ngã) that Haudricourt hypothesized to derive from final ‑ʔ in certain Mon‑Khmer languages on a one‑to‑one basis. Third, and most importantly, if ancient Annamese tones had only reached full development by the 12th century, as Haudricourt proposed, it would be impossible to account for the one‑to‑many Chinese‑Vietnamese monosyllabic reflexes that had already emerged in much earlier periods.

            In general, tonal changes in the Sino‑Vietnamese lexicon have occurred both diachronically and synchronically. A single Chinese character or word could generate multiple Vietnamese reflexes, each accented with distinct tonal contours. The case of 墓 mù (grave) illustrates this: OC mâg > mã versus ma:gs > mả. Such variation is abundant throughout the examples already cited and elsewhere in this study. Moreover, tonal differentiation often carries subtle semantic distinctions. For instance, Vietnamese Phật for 佛 Fó (Buddha) coexists with Bụt, which also denotes 'Buddha' but shades toward the semantic field of Sino‑Vietnamese thánh (聖 shèng, 'saint') and thần (神 shén, 'god').

            In this light, Haudricourt's proposal that Vietnamese tones developed from an absence of tone to partial formation by the 12th century, and only later reached their present six [eight] tones, appears internally inconsistent for several reasons, some of which were summarized above and will be elaborated later. At this stage, it suffices to note that his theory rests on a limited set of correspondences between toneless Mon‑Khmer and Vietnamese basic vocabulary. Yet many of these fundamental words have demonstrable cognates in Chinese, a language that is highly tonal. Examples such as 口, 方, 母, and 海 show that Vietnamese pronunciations of Chinese characters, as preserved in Sino‑Vietnamese readings, can be verified against the definitions and fǎnqiè (反切) keys recorded in the sources compiled in the Kangxi Zidian (康熙字典). Consider 母, for example:

            【說文】蜀人 (dchph: ngườiThục) 謂 母 ("mẹ") 曰 姐 ("chế"),齊人 (ngườiTề) 謂 母 曰 嬭 ("nạ"),又 曰 㜷 ("mợ"), 吳人 (ngườiNgô) 曰 媒 ("mẹ")。 【眞臘 (Chânlạp) 風土記】呼 父 為 巴駞,呼 母 為 米 ("mẹ")。 方音 不同,皆 自 母 而 變。 又 乳母 亦 曰 母 ("mụ, vú")。 【越語 Việtngữ 】生 三 人,公 與 之 母 ("mẹ")。 又 禽獸 之 牝 皆 曰 母 ("mái")。

            English Translation: "Shuowen": The people of Shu (ngườiThục) called 'mother' jie ('chế'); the people of Qi (ngườiTề) called 'mother' nai ('nạ'); they also said mu ('mợ'); the people of Wu (ngườiNgô) said mei ('mẹ'). "Record of Customs of Zhenla": They called 'father' batuo, and 'mother' mi ('mẹ'). The regional pronunciations differ, all deriving from mu [M 母 mǔ, mú, wǔ, wú (mẫu, mô) < MC məw < OC *mɯʔ]. A wet nurse was also called mu ('mụ', 'vú'). "Yueyu": When three children were born, the Duke shared them with their 'mother' ('mẹ'). Among birds and beasts, the female was also called mu ('mái').

            This evidence underscores the tonal and semantic breadth of Sino‑Vietnamese correspondences, extending far beyond the narrow scope envisioned in Haudricourt's original hypothesis.

            The author maintains that there is no need to employ electronic instruments to measure tonal intensity when comparing tonal correspondences between Vietnamese and the Chinese dialects. Such similarities can be reliably assessed by the human ear alone. For example, Mandarin 入 rù /ʐu⁴/ is realized in Vietnamese as rú /zuː⁵/, in Sino‑Vietnamese as nhập /ɲəp⁸/, and in Cantonese as /jap⁶/. These forms clearly demonstrate cognate tonal values, without requiring mechanical validation of decibel precision. Vietnamese speakers, like other Taic peoples, are capable of reproducing 'Chinese' sounds and tones with remarkable accuracy, without perceptible deviation. Thus, Vietnamese rú aligns closely with Mandarin 入 rù /ʐu⁴/, while nhập /ɲəp⁸/ corresponds to Cantonese dập (the local rendering of /jap⁶/). In each case, the tonal contours and values remain consistent across the cognate forms, something that speakers of most non‑Sinitic languages would find difficult to replicate.

            One of the principal reasons for rejecting Haudricourt's hypothesis of Vietnamese tonegenesis is that the modern Vietnamese tonal system aligns so closely with the Middle Chinese tonal scheme. This is further reinforced by the nine inherited tones of modern Cantonese (formerly known as the Tang language, 唐話 T'ongwa), whose tonal formation must have been complete long before the 10th century, the very period marking Annam's political separation from China. The strict rhyming matrices of Tang poetry, still observed in Vietnamese Tang‑style verse, testify to this continuity (see Drake, ibid.; Sung Shee, ibid.). In modern Vietnamese, tones are organized into four categories across two registers (traditionally counted as eight, or six in modern orthography, which omits the two entering tones 入聲 Rùshēng). Strikingly, when compared with the nine tones of Cantonese, the degrees of tonal intensity are virtually identical, with the Cantonese ninth tone corresponding to Vietnamese syllables ending in ‑p, ‑t, ‑k, or ‑ch, spoken with level intonation.

            The question of whether tonality is acquired or inherited can be clarified by comparison. Japanese and Korean, like Vietnamese, borrowed vast numbers of Chinese words. Yet despite their deliberate adoption of Chinese vocabulary, neither language preserved tonal distinctions. Japan, for instance, began sending students to the Tang court in the 9th century, importing Chinese literary culture and loanwords with Kan‑on, Tō‑on, and Go‑on readings, but without tones (Bo Yang 1983). Similarly, Korean absorbed large numbers of Chinese words, especially from the Ming period, again without tones (An Chi, ibid., vol. 3, p. 284). These parallels demonstrate that tonality is not the product of acquisition but of inheritance. No matter how intensively a language borrows, tones cannot be artificially imposed. The situation is analogous to modern borrowings between Chinese and English: terms in biotechnology, high‑tech, or trade espionage retain their English phonology without tonal adaptation. Vietnamese words borrowed into English likewise appear toneless (Vietcong, Vietminh, Ho Chi Minh, Nu‑yen [Nguyen], bunmee, banhmi, aodai, pho, dong), further illustrating the point.

            To return to the main discussion: Austroasiatic Mon‑Khmer languages are toneless, while Chinese, Vietnamese, and Mường are tonal, as are related families such as Tai‑Kadai, Thai, Lao, and Hmong. The prevailing Austroasiatic view that many Vietnamese–Mon‑Khmer cognates originated from Mon‑Khmer underpins Haudricourt's theory of tonegenesis. He argued that Vietnamese was originally toneless, deriving from Mon‑Khmer, and only later developed tones. The author proposes the reverse: Vietnamese loanwords in Mon‑Khmer became toneless through the latter's contact with Mường dialects. Unable to reproduce embedded tones, Mon‑Khmer speakers stripped them away, compensating through other phonetic adjustments. Over time, this produced paradigms such as the tonal pattern {~ → ʔ}. This explanation is more consistent than assuming tones were generated ex nihilo in Vietnamese.

            Maspero, in fact, was correct in asserting that tonality cannot be acquired. Middle Vietnamese must already have had a full tonal system, given the long‑standing adoption of Chinese tonal words, especially from Middle Chinese during the Tang Dynasty, when Annamese officials in Chang'an spoke the same Mandarin as their Chinese counterparts. Sino‑Vietnamese pronunciations are simply variants of Tang phonology (Nguyen Tai Can 2000, ibid.). It is inconceivable that Annamese could have borrowed Middle Chinese words without their tones, since tones were morphemically embedded in every syllable. To accept Haudricourt's chronology would require believing that Middle Chinese loanwords entered Vietnamese stripped of tones, as in Japanese and Korean, an implausible scenario.

            Moreover, ancient Annamese words were already attested with eight tones. For example: VS vuông ('square') ~ 方 fāng → SV phương; VS buồng ('room') ~ 房 fáng → SV phòng; VS buông ('let go') ~ 放 fàng → SV phóng; and 仿 fǎng ('imitate') → SV phỏng. In earlier stages, labial‑dental initials /v‑/ and /f‑/ did not yet exist, so vuông may have been buông, and phóng related to bắn ('shoot'), just as buồng corresponds to phòng. The crucial point is that sounds and meanings were already differentiated by at least four tonal categories in Old Chinese: (1) level tone, (2) lower level tone, (3) departing tone, and (4) low departing tone. This is consistent with other cases such as 墓 (mô, mồ, mộ) and 母 (me, mẹ, mái).

            It is untenable to suggest that the phenomenon of tonality in Vietnamese arose as late as the 12th century. Such a claim is both illogical and inconsistent with the evidence. As previously noted, the highly Sinicized vocabulary of ancient Annamese already demonstrates that most Vietnamese words must have acquired tones contemporaneously with Old Chinese. During the Eastern Han (東漢, 25–220 A.D.), Viceroy Sĩ Nhiếp (士燮, 187–226 A.D.) established schools and promoted Han culture in Giao Chỉ Prefecture (交趾部), raising the study of Chinese to unprecedented prominence. This fact underscores that Han language learning had been firmly rooted in Annam for at least 250 years since the annexation of the Nanyue Kingdom in 111 B.C.

            If we were to strip away the tones from Sino‑Vietnamese words of Old Chinese origin, the resulting forms would collapse into homonymy, unlike the Kan‑on borrowings in Japanese, where tones were never adopted. For example, Japanese ichi for 一 /yi¹/ contrasts with Sino‑Vietnamese nhất and Cantonese jat⁵. The Vietnamese evidence instead points to the opposite process: tones were inherited directly from Old and Middle Chinese. Without them, semantic differentiation would have been severely weakened, as seen in words with Mandarin /yáng/ like dê (羊 yáng, SV dương, 'goat'), giông (颺 yáng, SV dương, 'windstorm'), rượng ( yáng, SV dương, 'aroused'), nắng (暘 yáng, SV dương, 'sunshine').

            The Japanese case further illustrates the weakness of Haudricourt's hypothesis. He proposed that tonality developed from morphemic contrasts, but Japanese and Korean, despite massive borrowing from Chinese, never acquired tones. By contrast, Vietnamese did, which suggests inheritance rather than innovation. Consider words such as mặc ('put on'), chồmhỗm ('squat'), and chànghãng ('stand with legs apart'), often claimed to be of Mon‑Khmer origin. Even if their roots lie in Mon‑Khmer forms like /pec/, /chrohom/, or /choho/, the tonal system in which they are embedded is distinctly Vietnamese. Internally, the tonal contrasts are obvious: mặc, mắc, mắt, mặt, etc. Even without diacritics, Vietnamese speakers instinctively supply tonal contours, much as Cantonese speakers do with their entering tones (thanhsắcnhập, thanhnặngnhập), peculiarly in syllables ending in /‑p/, /‑t/, /‑k/, or /‑ch/.

            Haudricourt also overlooked the many Vietnamese words carrying thanhsắc [5] and thanhnặng [6] that differentiate meaning in subtle ways, as in vs. mộ, me vs. mẹ, tan vs. tán. Therefore, his claim that Vietnamese only became fully tonal after the 12th century fails to capture the reality: tonality in Vietnamese developed in tandem with Chinese, from Old Chinese through Middle Chinese, as reflected in the phonological record. This trajectory is consistent with other southern Chinese languages that followed the same evolutionary path.

            Given that Annamese scholars had been studying official Mandarin continuously from the beginning of the Common Era until at least 939 A.D., it is inconceivable that their language lacked tones during this period. Ancient Annamese must already have possessed the full eight tonal categories of Middle Chinese long before the 10th century.

            There is nothing remarkable about an Asian linguist writing on topics in English linguistics. Yet when a Western linguist happens to publish something on Vietnamese, it is treated as a sensation. The situation is almost comical, reminiscent of Vietnamese or Chinese variety shows on VTV, CTV, or YouTube, where local audiences erupt with applause, cheers, and exaggerated delight whenever a Westerner, often an exchange student, steps on stage and delivers a few simple jokes in Vietnamese or Chinese.

            What explains the enthusiasm of many Vietnamese specialists for Haudricourt's theory? Part of the answer lies in a long‑standing cultural tendency: out of a sense of inferiority, Vietnamese intellectuals have often displayed excessive admiration for Westerners who demonstrate even modest fluency in Vietnamese. What is celebrated as a positive openness to Western scientific methods is, in practice, also a reflection of pride that their language has attracted attention from foreign scholars, an attitude shaped as much by a superiority complex as by deference.

            In reality, most Western linguists possessed only limited fluency in Vietnamese and relied primarily on the analysis of available data. Some achieved a degree of technical mastery through academic training, but they often lacked what might be called “linguistic feeling.” Imagine the scenario: a Western linguist publishes a proposal on Vietnamese linguistics, and immediately a chorus of local admirers responds with "wows" and "hoorays".

            Haudricourt's own listings, as cited by Shafer (1972) and discussed in Chapter 10 on Sino‑Tibetan etymologies, raise doubts about his ability to distinguish Sinitic‑Vietnamese from Sino‑Vietnamese forms. His confusion of VS (which should properly be mả) with SV mộ is a telling example. Yet his disciples repeated such errors uncritically, and the flaws went largely unnoticed because the hypothesis appeared to fit neatly into the intellectual climate of the time. Even today, some Western linguists continue to perpetuate the same mistakes as their predecessors (see Ding Bangxin, ibid., 1977, p. 263).

            Historically, by the 9th century Vietnam had already become a distinguished center outside China that produced prodigious poets skilled in composing Tang‑style stanzas. It is inconceivable that speakers of a supposedly toneless ancient vernacular could have fully appreciated or mastered such poetry (see Drake, F.S., ed. 1967, ibid.). Even today, many Vietnamese poets continue to compose in Tang‑style forms, and the practice of chanting these poems remains popular among the public, an appreciation that, ironically, is now far less common among the general populace in China itself.


            Was Vietnamese originally toneless, or did it possess two or three tonal distinctions—low and high—like certain Mon‑Khmer languages, as Haudricourt proposed for the 12th‑century timeframe? For those seeking a quick answer, the reality is that many Vietnamese speakers find it difficult to accept the notion that their tonally rich folksongs, fixed expressions, proverbs, and idioms, believed to date back to antiquity, were once toneless. Each Vietnamese word is intimately tied to a corresponding Sinitic‑Vietnamese etymon, making the idea of a toneless origin implausible (see the author's Hán‑Nôm etymology dictionary at http://vny2k.com/hannom/ for detailed entries and meanings). For example:

            • Bốcái Ðạivương;
            • Con dại cái mang;
            • Gậtđầu lắccái;
            • Chácđược củarẻ;
            • Chồng chúa vợ tôi;
            • Cõng rắn cắn gà nhà;
            • Giặc đến nhà đànbà phải đánh;
            • Ăn coi nồi, ngồi coi hướng;
            • Một miếng khi đói bằng một gói khi no;
            • Nghèo cho sạch, rách cho thơm;
            • Giấy rách phải giữ lấy lề;
            • Nghêdại chẳng hay cóc;
            • Tiên học lễ, hậu học văn;
            • Đi một đàng, học một sàng khôn;
            • Bỏ thì thương, vương thì tội;
            • Không mợ thì chợ vẫn đông;
            • Có mống tựnhiên lại có cây;
            • Rán đàngđông vừa trông vừa chạy;
            • Nựccười châuchấu đáxe;
            • Bàcon xa khôngbằng lánggiềng gần;
            • Thương cho roi cho vọt, ghét cho ngọt cho ngào;
            • Cảnh cũ non quê nhặt chốc mòng;
            • Quântử hãy lăm bền chí cũ;
            • Cổ tới nhẫnkim, sinh thời có hoá;
            • Thà làmquỷ nướcNam cònhơn làmvua nướcBắc;
            • Bầu ơi thươnglấy bí cùng, người trong một nước phải thươngnhau hoài;
            • Nhiễuđiều phủlấy giágương, người trong một nước phải thươngnhau cùng.

            The list could be extended with hundreds of similar folk‑styled expressions, whose rhyming stanzas and lyrical structures clearly indicate that they must have existed long before the 12th century, even if some examples were later recorded in 14th‑century works. In the centuries preceding that period, the Vietnamese not only absorbed the Chinese language of the mandarins from the Han through the Tang dynasties, the basis of Sino‑Vietnamese (Hán‑Việt), but also composed rhymed prose and poetry of notable literary elegance (Drake, F.S., ed. 1967, ibid.). Could such achievements have been possible if their speech were still toneless, like Khmer? If so, how would they have addressed their beloved Princess Huyền Trân (Huyềntrân Côngchúa)? How were historical names pronounced at that time?

            The author therefore challenges Austroasiatic Mon‑Khmer theorists to reconstruct even a fraction of these folk sayings, idioms, and proverbs using a hypothetical toneless lexicon. Many such Vietnamese expressions are demonstrably cognate with Chinese ones, for example 'lárụngvềcội' (落葉歸根 luòyèguīgēn, SV lạcdiệpquycăn, 'a falling leaf returns to the roots') or 'bỏlàngquêxưa' (背井離鄉 bèijǐnglíxiāng, SV bộitỉnhlyhương, 'tear oneself away from one's native place'). In short, it is inconceivable that ancient Vietnamese remained partially toneless until the 12th century.

            Lest any contemporary Vietnamese philologist forget, or fail to appreciate, the simple expressions cited above, they must first deepen their knowledge of Vietnamese before presuming to write anything serious about the language. More importantly, lest newcomers be led astray, we must guard against their retracing the worn paths of early pioneers in Vietnamese linguistics, that is, paths that obscure access to Sino‑Tibetan etymologies and the overwhelmingly Sinitic strata of modern Vietnamese.

            Turning specifically to Haudricourt's claim that the eight Vietnamese tones were not fully formed until the 12th century: such a view is highly improbable. It is difficult to imagine that the tonal system would have reached completion only two centuries after Vietnam had already secured independence. If his theory had instead placed tonal formation as far back as the 2nd century B.C. – when the 3rd and 4th tones in Old Chinese were beginning to evolve – it might have been more plausible. 

            In fact, the Vietnamese huyền [`] and sắc [´] tones predate their Sino‑Vietnamese first‑tone doublets, which means that by then the full set of tonal registers (1st through 4th, each in upper and lower registers) was already in place. Examples include:

            Early tone contrasts (pre‑Tang)

            Vietnamese pair | Tone value | Chin. | Pinyin | Meaning
            mồi (lower 1st = modern 2nd) vs. môi (upper 1st = modern 1st) | huyền vs. ngang | 煤 | méi | coal
            mối (upper 3rd = modern 5th) vs. môi (upper 1st) | sắc vs. ngang | 媒 | méi | go‑between
            dì (2nd) vs. di (1st) | huyền vs. ngang | 姨 | yí | aunt
            lấm (5th) vs. lâm (1st) | sắc vs. ngang | 淋 | lín | soak


            Established 8‑tone system (by end of Tang)


            Vietnamese pair | Tone value | Chin. | Pinyin | Meaning
            dâm (1st) vs. dầm (2nd) | ngang vs. huyền | 淫 | yín | wet
            mả (3rd) vs. mã (4th) | hỏi vs. ngã | 墓 | mù | tomb
            bố (5th) vs. phụ (6th) | sắc vs. nặng | 父 | fù | father
            mắt (7th) vs. mục (8th) | sắcnhập vs. nặngnhập | 目 | mù | eye
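
            The eight‑way tone numbering used in this chapter (1 ngang, 2 huyền, 3 hỏi, 4 ngã, 5 sắc, 6 nặng, 7 sắcnhập, 8 nặngnhập) can be illustrated with a minimal sketch. It is not part of the author's method: the function name `tone_number`, the two lookup tables, and the treatment of stop finals are assumptions made only for illustration. The sketch reads the tone diacritic of a single syllable and reclassifies sắc and nặng as the entering tones when the syllable ends in a stop.

```python
import unicodedata

# Combining tone marks of Vietnamese orthography, mapped to the tone
# numbers used in this chapter (assumed: 2 huyền, 3 hỏi, 4 ngã, 5 sắc, 6 nặng).
TONE_MARKS = {
    "\u0300": 2,  # grave accent: huyền
    "\u0309": 3,  # hook above: hỏi
    "\u0303": 4,  # tilde: ngã
    "\u0301": 5,  # acute accent: sắc
    "\u0323": 6,  # dot below: nặng
}

# Orthographic stop finals; "k" is listed only to cover non-standard spellings.
STOP_FINALS = ("p", "t", "c", "ch", "k")


def tone_number(syllable: str) -> int:
    """Return the tone number (1-8) of one Vietnamese syllable (hypothetical helper)."""
    # NFD splits each vowel into a base letter plus separate combining marks.
    text = unicodedata.normalize("NFD", syllable.strip().lower())
    number = 1  # no tone mark: ngang
    for char in text:
        if char in TONE_MARKS:
            number = TONE_MARKS[char]
            break
    # Stop-final syllables with sắc or nặng are counted as the entering
    # tones 7 (sắcnhập) and 8 (nặngnhập).
    if text.endswith(STOP_FINALS) and number in (5, 6):
        number += 2
    return number


if __name__ == "__main__":
    for word in ["dâm", "dầm", "mả", "mã", "bố", "phụ", "mắt", "mục"]:
        print(word, tone_number(word))
```

            Run on the eight words dâm, dầm, mả, mã, bố, phụ, mắt, mục, the sketch yields the numbers 1 through 8 in order. It classifies spelling only and says nothing about the historical origin of any tone.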


            Register splits

            Vietnamese pair | Register contrast | Chin. | Pinyin | Meaning
            buồng vs. phòng | voiced vs. voiceless initials | 房 | fáng | room
            đục vs. trọc | voiced vs. voiceless initials | 濁 | zhuó | murky

            On the broader question of whether languages are 'tonal' or 'non‑tonal', it is reasonable to assume that most human languages began with simple utterances and only later developed either tonal systems or complex consonant clusters (kl‑, kr‑, bl‑, etc.). This is evident in Daic languages of southern China or in tonal Tibetan of Lhasa (see Ding Bangxin, ibid., 1977, p. 263), as opposed to the monosyllabicity of Old Chinese or proto‑Vietic (l‑, s‑, tr‑, etc.), which later differentiated tonally in Sino‑Vietnamese and Middle Chinese. If Vietnamese had truly evolved from a purely toneless Mon‑Khmer root, then many inherited Mon‑Khmer words should have remained toneless. Yet in Vietnamese they are tonal, raising the question: why would tones have been added if they were unnecessary? The existence of tonal correspondences in scores of Mon‑Khmer–Vietnamese cognates suggests otherwise. It is inconceivable that Vietnamese folk songs, proverbs, and lyrical traditions that are so deeply tonal were ever performed in a flat, toneless manner.

            As argued earlier, Haudricourt's theory seems to invert the actual process: rather than Vietnamese developing tones late, it is Mon‑Khmer languages that lost them. His postulation excludes the multiple tonal contours of Vietnamese words that are clearly cognate with Chinese etyma. The sheer volume of Chinese‑origin vocabulary in Vietnamese at every historical stage is sufficient to show that his theory requires revision. Like other Yue‑descendant languages (Zhuang, Daic, etc.), Vietnamese tonality evolved in parallel with Chinese, beginning before the Han Dynasty and continuing through a millennium of colonization after 111 B.C. During this period, Vietnamese tones developed within the Chinese linguistic sphere, as reflected in etymological cognates across the region.

            From 111 B.C. to 939 A.D., Vietnam experienced long stretches of colonization punctuated by brief intervals of independence, mirroring the rise and fall of Chinese dynasties. This history makes clear that Vietnamese tonality was never an isolated phenomenon, but part of the same continuum as Old and Middle Chinese. If Haudricourt's hypothesis were correct, then all Mon‑Khmer languages should have become tonal as well. Yet only a few developed two or three tones, much like Old Chinese during the Han, with its basic level, high, and low intonations. Vietnamese, by contrast, had already inherited and elaborated a full tonal system long before the 12th century.


            In comparison, within less than three centuries of Spanish colonial rule in Latin America, the Indigenous peoples became minorities while racially mixed Spanish‑speaking populations emerged as the dominant majority. This historical reality is recalled here as a reminder intended to stir those idle, nationalistic minds into reconsidering their assumptions.

            In practice, most Mon‑Khmer speakers tend to neutralize or omit the tones embedded in Vietnamese loanwords when speaking, as can be heard in their Mon‑Khmer‑accented Vietnamese across the southernmost and western highland provinces of Vietnam. By contrast, the Kinh Vietnamese consistently apply tonalization to all foreign borrowings. This is most evident in loanwords from French and English, such as xìcăngđan ('scandal'), xìtăngđa ('standard'), or quánhtùtì ('one, two, three'). The same process naturally accounts for the tones assigned to Mon‑Khmer borrowings for local objects such as bòhóc ('prahok' fish paste) and sàrông ('sarong'), as well as to placenames like Buônmêthuột (Buon Ama Thuot), Đắklắk (Daklak), and Đàlạt (Dalat).

            If Mon‑Khmer and Vietnamese cognates truly shared the same linguistic root and phonological system, native speakers would have used them without modification, just as Thai, Lao, or Hmong speakers employ cognates within the Daic family without adding extraneous phonemic features. Put differently, if Vietnamese had originally been a toneless Mon‑Khmer‑type language, its speakers would not have needed to 'retrofit' tones onto loanwords in order to compensate for a lack of tonality. For this reason, Haudricourt's argument that Vietnamese tonal development was independent of Chinese influence must be rejected. If that were the case, Vietnamese speakers could just as easily have stripped tones from Chinese borrowings, as Japanese and Korean speakers did.

            At the same time, Vietnamese preserves Old Chinese loanwords that already exhibit the four tonal categories Haudricourt associated with a 12th‑century 'tonegenesis'. These may in fact represent ancient tonal strata preserved in Central Vietnamese dialects, particularly in the Thanhhoá-Thuậnhoá region and especially Nghệ an. For example, chimchóc ('birds') contains chóc, cognate with ancient 雀 què [cf. modern SV tước, VS sóc 'sparrow']. When the Trần Dynasty acquired southern Chamic territory through the marriage of Princess Huyền Trân to the King of Champa Chế Mân (Jaya Simhavarman III), Annamese settlers carried with them these fossilized four‑tone systems, which still survive in northern Central dialects. This helps explain why, like the royal name Chế Bồngnga (Cei Bunga), originally toneless Chamic placenames were fully tonalized in Vietnamese, e.g., Đànẵng, Quynhơn, Nhatrang.

            Further south, earlier Annamese resettlers also transmitted distinctive consonantal features that preserve keys to the Early Middle Chinese phonological system, including initials such as 知 zhī (SV trí), 徹 chè (SV triệt), 澄 chéng (SV trừng), 于 yú (SV vu), and 匣 xiá (SV hiệp) (see Ding Bangxin, ibid., 1977, pp. 266–269).

            Bernhard Karlgren, in his study Tones in Archaic Chinese (1960), analyzed the rimes of the Shijing Odes and reached the following conclusion about the Archaic Chinese of the pre-Han periods:

             "Archaic Chinese, like Ancient and Modern Chinese, had distinct tone classes. One of them corresponded to the p'ing-sheng of the Ancient Chinese, another to the shang-sheng, another to the k'ü-sheng and the last (words ending in p-, t-, k-) to the ju-sheng (..) [His conclusion that] Archaic Chinese had tone classes roughly corresponding to those in Ancient Chinese [..] that words figuring in the 'pure one-tone sets of rimes' in all probability belonged to the Arch. tone class corresponding to the Anc. class concerned (p'ing, shang, k'ü, ju). (p. 133)"

            We may therefore state with confidence that when the Han conquered ancient Annam in 111 B.C., the tonal system of Ancient Chinese profoundly shaped the local speech, leaving an imprint that endured well beyond 939 A.D. The Vietnamese did not need to wait until the 12th century to begin accentuating their language with tones; tonal influence was already deeply embedded. Indeed, Vietnamese tonality corresponds in every respect to the tonal categories of Chinese dialects.

            In conclusion, the sheer volume and vitality of Sino‑Vietnamese vocabulary (thousands of items actively used in everyday speech, not merely the etyma catalogued in dictionaries) provide sufficient evidence for this claim. Moreover, the vernaculars spoken by the common people in the Southern Han period, across the mountainous regions south of Lingnan from Jiaozhou (Giaochâu) to Guangzhou, must have been broadly similar (see Bo Yang, ibid., vols. 71–72, 1993). And one must ask: would any historical linguist seriously argue that the ancestors of today's nine‑toned Cantonese were speaking with only four or five incomplete tones as late as the 10th century?

            III) Correspondences in basic vocabularies revisited

            Western scientific methodology, by its very nature, is expected to yield correct theories most of the time; otherwise, it ceases to be scientific. Yet theories inevitably change and are eventually replaced, especially in a field as dynamic as linguistics. In the Vietnamese case, as the preceding discussion has shown, many early authors, though undeniably pioneers, often took shortcuts. They relied on the limited data available to them at the time, while avoiding the more demanding path that required rigorous study of both Chinese and Vietnamese in their historical and phonological dimensions (see Ding Bangxin, ibid., 1977, p. 263). (See more on the Vietnamese tonegenesis paper by Graham Thurgood,   http://www.csuchico.edu/~gthurgood/Papers/Vietnamese_tonegenesis.pdf - as of Jan. 2017)

            Haudricourt's theory of tonegenesis provided a convenient framework for Austroasiatic Mon‑Khmer theorists, who frequently cited his work. Yet, given the limitations of their time, and in light of the advances in Old Chinese and Sino‑Tibetan studies over the past sixty years, their conclusions now require serious re‑evaluation, if not outright revision. When addressing the genetic affiliation of Vietnamese with other Mon‑Khmer languages, their insufficient mastery of both Chinese and Vietnamese historical phonology led them to overlook, or fail to recognize, the deeper connections between Vietnamese and Chinese.

            For instance, among the words Haudricourt used in his illustrative examples, the case of chó 'dog' (Norman 1988) is revealing: it derives from Proto‑Miao‑Yao and is cognate with Chinese 狗 gǒu, demonstrating that Vietnamese and Chinese share basic vocabulary at the most fundamental stratum. Other parallels, such as 雞 jī ~ gà ('chicken'), 來 lái ~ lúa ('paddy'), 為 wéi ~ voi ('elephant'), and 熊 xióng ~ gấu ('bear'), further attest to this early relationship in the core lexicon.

            Haudricourt's argument about tonal development in Vietnamese also rested on the etymology of many such basic words, which is crucial for understanding the Sino‑Vietnamese lexical layer. As shown in the cases above, these words often have clear Chinese cognates. I will examine these issues in greater detail below, and expand the discussion in the following chapter on Sino‑Tibetan etymologies.

            To begin, let us revisit Haudricourt's basic word lists, focusing first on his examples from Khmu and Riang, two Mon‑Khmer languages, where words ending in a glottal stop [ʔ] correspond to Vietnamese words bearing the sắc or nặng tones (Norman 1988, pp. 55–56; 1991, p. 206).

            Việt    Khmu    Riang    Chinese correspondences suggested by dchph
            lá ('leaf')    hlaʔ    laʔ    葉 yè (leaf) (SV diệp) [ M 葉 yè, dié, shè, xiè < MC jiap, ɕiap < OC *leb, *hljeb | Note: The pattern OC /*l-/ ~ MC /j-/ is very common and surfaces in Mandarin as /j-/. Most of the Tibeto-Burman languages carry a sound near lá. For example, Tibetan: ldeb lá, tờ; Burmese: ɑhlap cánhhoa (floral petal); Kachin: lap2; Lushei: le:p búp; Lepcha: lop; Rawang: ʂɑ lap (used to wrap rice pastry); Trung: ljəp1 lá; Bahing: lab. (Shafer p.138; Benedict, p. 70.)

            Per Starostin, Proto-Austro-Asiatic: *la, Proto-Katuic: *la, Proto-Bahnaric: *la, Khmer: sla:, Proto-Pearic: *laʔ.N, Proto-Vietic: *laʔ, s-, Proto-Monic: *la:ʔ, Proto-Palaungic: *laʔ, Proto-Khmu: *laʔ, Khasi: sla-diŋ, Proto-Aslian: *sǝlaʔ, Proto-Viet-Muong: *laʔ, ʔ-, Thomon: la.343ʔ, Tum: la.212 ]
            gạo ('rice')    rənkoʔ    koʔ    稻 dào (SV đạo) [ Starostin posited this etymon as "lúa" (paddy) in Vietnamese. See also the etymology of "gạo" and "lúa" in previous sections. ]
            cá ('fish')    kaʔ    --    魚 yú (SV ngư) [ M 魚 yú < MC ŋɨə̆ < OC *ŋa | According to Starostin, ST fish. For *ŋh- cf. Xiamen hi2, Chaozhou hy2. | Protoform: *ŋ(j)a. Meaning: fish. Chinese: 魚 *ŋha fish. Tibetan: ɳa fish. Burmese: ŋah fish, LB *ŋhax. Kachin: ŋa3 fish. Lushei: ŋha fish, KC *ŋhɑ. Kiranti: *ŋjə. Comments: PG *tàrŋa; BG: Garo năk, Bodo ŋa ~ na, Dimasa na; Chepang ŋa ~ nya; Tsangla ŋa; Moshang ŋa'; Namsangia ŋa; Kham ŋa:ɬ; Kaike ŋa:; Trung ŋa1-plăʔ1. Simon 13; Sh. 36, 123, 407, 429; Ben. 47; Mat. 192; Luce 2. | OC *ŋh- ~ k- (ca-) || See Appendix M on the case of "ketchup" or "catsup", where "ke-" or "ca-" is "cá" ('fish'), while "-tsup" or "-tchup" is 汁 zhī ('sauce'), etymologically. ]
            chó ('dog')    soʔ    soʔ    狗 gǒu (SV cẩu) [ ~ VS 'cầy' | M 狗 gǒu < MC kjəw < OC *ko:ʔ | Note: Chinese 狗 gǒu (Proto-Viet **kro, Mon-Khmer *klu) might be a loanword from the Yue. cf. 犬 quǎn (SV khuyển) ~ VS 'cún' ('puppy'), which could be cognate with 狗 gǒu if both forms descended from the same source, either the Yue or the Sino-Tibetan languages. ]
            chí ('louse')    --    siʔ    虱 shī (SV siết, sắt) [ M 虱 (蝨) shī < MC ʂit < OC *srit || Note: Etymologically, from Proto-Sino-Tibetan *srik ('louse'). The case of /-ʔ/ ~ "sắc" is similar to "lá": 葉 (yè, SV diệp) ]
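The correspondence stated above (a final glottal stop in Khmu and Riang matching the Vietnamese sắc or nặng tone) can be checked mechanically against the five items in the list. The following is only a minimal sketch: the function names `viet_tone` and `check_row` are hypothetical helpers introduced here, the tone is read off the Unicode combining marks of the Vietnamese spelling, and the word forms are simply those cited in the list, with `None` standing for a form the list does not give.

```python
import unicodedata

# Combining marks that carry the Vietnamese tones once a word is NFD-decomposed.
# (Vowel-quality marks such as the circumflex, breve and horn are not listed.)
TONE_MARKS = {
    "\u0301": "sắc",    # acute accent
    "\u0323": "nặng",   # dot below
    "\u0309": "hỏi",    # hook above
    "\u0303": "ngã",    # tilde
    "\u0300": "huyền",  # grave accent
}

def viet_tone(word):
    """Return the tone name of a monosyllabic Vietnamese word (hypothetical helper)."""
    for ch in unicodedata.normalize("NFD", word):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"  # no tone mark: level tone

# (Vietnamese, Khmu, Riang) as cited in the list; None = form not given there.
ROWS = [
    ("lá", "hlaʔ", "laʔ"),
    ("gạo", "rənkoʔ", "koʔ"),
    ("cá", "kaʔ", None),
    ("chó", "soʔ", "soʔ"),
    ("chí", None, "siʔ"),
]

def check_row(viet, khmu, riang):
    """Return (tone, ok): ok is True when every attested Khmu/Riang form
    ends in a glottal stop and the Vietnamese word bears sắc or nặng."""
    attested = [form for form in (khmu, riang) if form is not None]
    all_glottal = all(form.endswith("ʔ") for form in attested)
    tone = viet_tone(viet)
    expected = (TONE_MARKS["\u0301"], TONE_MARKS["\u0323"])
    return tone, all_glottal and tone in expected

for row in ROWS:
    tone, ok = check_row(*row)
    print(row[0], tone, ok)
```

A check of this kind only confirms that the listed items are consistent with the stated pattern; it says nothing about the direction of change or about which etymology is correct.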

            Notes on the Chinese cognates:

            The shift from final /‑ʔ/ to /‑k/, /‑t/, or even to the sắc tone is hardly remarkable. In fact, in many central and southern Vietnamese dialects, mắt 'eye' is still pronounced as mắc (cf. SV mục). Similarly, mắt (目 mù) did not yield the Mandarin reading /mu5/, nor did cắt 割 (SV cát 'cut') become Mandarin gē, and there are hundreds of comparable cases. Thus, the much‑discussed Mon‑Khmer–Vietnamese correspondences involving final glottals are far less exceptional than sometimes claimed.

            Etymologically, several points are clear:

            1. Lá 葉 yè 'leaf': Vietnamese lá corresponds to Chinese 葉 yè, traced through AC *lhap < OC *lap < PC **lɒp. Cognates across Tibeto‑Burman languages preserve initial l‑ with minor semantic variation. In Khmu and Riang, which lack tones, the Vietnamese sắc tone was replaced by a final glottal stop /‑ʔ/.
            2. Gạo 稻 dào 'rice': The Chinese word 稻 is thought to be borrowed from a Yue‑type language, likely Austroasiatic, spoken by ancestors of present‑day minorities in southern China, of which the early Vietnamese were a part. As with lá, the Vietnamese nặng tone here corresponds to a glottal stop in Khmu and Riang. That is not the case, however, with lúa ('paddy'), which Starostin posits for this etymon and which is cognate with 稐 lǔn and 來 lái.
            3. Cá 魚 yú 'fish': Vietnamese cá is plausibly cognate with Old Chinese *ŋa. The shift from OC *ŋh‑ to Vietnamese /k‑/ is not difficult to explain, and the case parallels that of lá (SV diệp).
            4. Chó 狗 gǒu 'dog': According to Norman (1988), Chinese 狗 gǒu is an early loan from Proto‑Miao‑Yao klu (cf. Mon kle, written Mon kluiw). Vietnamese chó is also known as cầy. This, too, follows the same pattern as lá (SV diệp).

            As Tsu‑lin Mei and others have noted, these correspondences point not to isolated borrowings but to a deeper stratum of shared vocabulary linking Vietnamese with Chinese and neighboring language families.

              The Shuo-wen says 南越名犬#### “Nan-yüeh calls 'dog' *nôg **g.” This explanation occurs under the entry for ## which implies that the meaning “dog” is attached to this character. The first character of the compound probably represents a pre-syllable of some kind. Tuan Yü-ts'ai mentioned in his Commentary to the Shuo-wen that this word was still used in Kiangsu and Chekiang, but did not give any further detail.

              Karlgren gives **gas the OC value for ## (GSR 109 7h). At the time of the Shuo-wen (121 A.D.), -g had probably already disappeared; in Eastern Han poetry, MC open syllables (OC –b, -d, -g) seldom rhyme with stopped syllables (OC –p, -t, -k); in old Chinese loan words in Tai (specifically, the names for twelve earth's branches 地支 ti-chih), probably reflecting Han dynasty pronunciation, Proto-Tai –t corresponds to OC –d, but no trace can be found for –g. The proper value for our purpose is therefore **ô.

              This is the AA [Austroasiatic] word for “dog,” as the following list shows: “dog”: VN chó; Palaung shɔ:; Khum, Wa soʔ, Riang s'oʔ; Kat, Suk, Aak, Niahon, Lave có; Boloben, Sedang có; Curu, Crau ʃŏ; Huei, Sue, Hin, Cor sor; Sakai cho; Semang cû, co; Kharia sɔ'lɔʔ, ; Ju solok; Gutob, Pareng, Remo guso; Khasi ksew; Mon klüw; Old Mon clüw; Khmer chkɛ.

              The forms after VN represent almost all the major groups spoken in the Indo-China and Malay Peninsulas, as well as the Palaung-Wa, Khmer, and Mal groups. The proto-form for these languages appears to be soʔ or coʔ, preceded perhaps by k- (cf. Khasi, Gutob, etc.). On the basis of Mon, Haudricourt suggested that VN ch- < kl-.** But there is another possibility, namely, VN ch- < kc-; “to die” *kcət, VN chết, Kuy kacet, Kaseng sit. And even if VN ch- did come from kl-, this change must have occurred quite early, since in all the AA languages except Mon, the initial is either a sibilant fricative or affricate.

              Source: http://tlmei.com/tm17web/1976a_austroasiatics.pdf

              The essential point is that Vietnamese chó and Chinese 狗 gǒu stand in direct correspondence, traceable to more than 2,250 years BP, when the indigenous Yue peoples were already in contact with early Chinese. At the same time, Chinese 犬 quǎn (SV khuyển) served as the native term for 'dog.' [M 犬 quǎn < MC kʰʷen < OC *kʰʷeːnʔ. Note that 犬 quǎn and 狗 gǒu may be cognates or doublets, as noted by Tsu‑lin Mei following Tuan Yü‑ts'ai's commentary on the Shuowen, which records 犬 still in use in Jiangsu and Zhejiang.] Both words coexisted, with 狗 gǒu eventually becoming the more frequent form. This helps reconcile the evidence that many other basic words in Vietnamese and Chinese share common roots, whether from Old Chinese, Yue, or Austroasiatic sources, encompassing modern Dai, Zhuang, Miao, Yao, and related languages.

              Similarly, in other cited cases where final /‑s/ or /‑h/ corresponds to the Vietnamese hỏi or ngã tones, the Chinese‑Vietnamese correspondences listed below are less transparent and require closer scrutiny:


            Việt    Mon    Mnong    Chinese correspondences by dchph
            mũi ('nose')    muh    mǔh    鼻 bí (SV tỵ) [ M 鼻 bí (tị, tỵ) < MC biɪ < OC *blids | Note: Based on other solid Chinese ~ Vietnamese cognates for human body parts, we can posit the pattern ¶ /b- ~ m-/ for this item. (See footnotes below.) According to Pulleyblank, the Yuan and modern Mandarin readings as well as many other modern dialects, e.g., Taiyuan /piə'/, Amoy literary /pit/, imply E. /bjit/, L. /pɦjit/. | Etymology: The word derives from Proto‑Sino‑Tibetan bi 'nose' (cf. Nuosu ꅳꁖ hnap bbit 'nose; snot'). An alternative derivation traces it to Proto‑Sino‑Tibetan s‑brit 'sneeze; nose; swallow,' which is reflected in Tibetan སྦྲིད (sbrid 'sneeze'), though Chinese shows no trace of r in this root (Schuessler 2007).

            In several modern lects (including Mandarin, Gan, Jin, Wu, Xiang, and even the literary layer of some Min dialects) the word points to a form with final ‑t. Thus in Standard Mandarin it is pronounced bí, suggesting an old entering‑tone reflex, rather than bì, which would be expected from the Middle Chinese departing tone. This irregularity is explained either as an early northwestern loss of ‑s in the ‑ts cluster before final simplification (Baxter 1992), or as a dialectal shift from ‑s to ‑t (Pulleyblank 1998).

            Originally, 自 denoted 'nose' but later shifted to mean 'self', leaving 鼻 (OC blids) to carry the sense of 'nose'. Some scholars interpret 鼻 as depicting a nose (自) together with two lungs (畀), though oracle‑bone evidence shows 畀 representing an arrow rather than lungs. ]

            rễ ('root')    rɜh    ries    蒂 dì (SV đế) [ M 蒂 (蔕) dì, dài, zhài < MC tei < OC *te:ds | ¶ /d- ~ r-/ ]
            bảy ('seven')    tpah    poh    七 qī (SV thất) [ M 七 (柒) qī < MC tsʰit < OC *sn̥ʰid | Note: dialects such as M no longer retain the final /-t/ | Starostin's reconstruction: Protoform nit (s‑) 'seven'. Chinese: chit snhitʔ, Burmese: khu‑natɕ, Kachin: sjənit², Lushei: KC s‑Nis, Limbu: nu‑si, Proto‑Garo: ɲi(s), Garo: sni; Dimasa: sini, Rawang: sanit; Trung: sjə³‑ɲit¹, Kanauri: stiʂ, Mantshati: nyiz/‑i, Rgyarung: ʂnis, ʂnes, Namsangia: iŋit, Andro: sini. (Refs: Sh. 123, 134, 411, 429; Ben. 16; Mat. 203)
            (For further elaboration on this etymology, see Chapter Ten on Sino‑Tibetan etymologies.) ]

            Footnotes on the Chinese cognates:

            1. The Ancient Chinese sound of 鼻 bí for VS "mũi" is reconstructed by different linguists as biuzj (MC) < OC *bjiwer (Chou 1973), b'ji- (MC) < OC *b'òcd (Karlgren 1957), bi (MC) < OC *bjidh (Li 1971), bi (MC) < OC *bjcs (Schuessler 1987), phjì (MC) < OC *bjis (Pulleyblank 1991). While Chou's MC /biuzj/ is the closest to VS /muj4/ by way of /b-/ > /ʔɓ-/ > /m-/, any of the proposed sound changes above could have given rise to similar sounds in other Chinese dialects, for example, bei6 (Cantonese, Wenzhou dialects), pó (Xiamen and Chaozhou dialects) and p'ei6 (Fuzhou dialect). In SV, however, it became tị [tej6] (conditioned by -j-), in line with other irregular Sino-Vietnamese patterns ¶ /b-, p- ~ t-, th-/ for which the Kangxi dictionary gives no comparable fanqie spellings. Still, if it could become /bei6/, it could also be nasalized (fronted due to the original labial /b-/) to become /mej6/, giving rise to /mwoj6/ and then /mwoj4/ (fronted due to the rounding effect of the glide -w-). Compare the pattern /-ej/ ~ /-uj/ in the following:
              • 酸梅 suānméi 'salted dried plum' (VS xímuội ~ mechua, SV toanmai) [ cf. 梅 méi (SV mwoj6, mai), Chaozhou /bhuê5/ ¶ /m- ~ b-/ ],
              • 每 měi 'each' (VS mỗi, SV mỗi) [ M 每 měi < MC mɔj < OC *mjə:ʔ | Dialects: Cant. mui22, Amoy muĩ2, Chaozhou mue21, Fuzhou muei2. cf. 母 mǔ (SV mẫu, VS mẹ), Chaozhou /bho2/. ],
              • 妹 mèi 'younger sister' (VS em, SV muội) [ VS 'em' /ēim/ (contraction) <~ 妹妹 mèimei | M 妹 mèi < MC moj < OC *mhjə:ts < PC *mjət | According to Starostin, Burmese: mat husband's younger brother, younger sister's husband. Comments: Kham mama mother's younger brother. For *mh- cf. Xiamen be6, Chaozhou mue6, Fuzhou muoi5, Jianou mue ],
              • 魅 mèi 'obscure' (VS mờ, SV muội) [ M 魅 mèi < MC mɔj < OC *mjə:ts ],
              • 味 wèi 'smell' (VS mùi, SV vị) [ M 味 wèi < MC mʊj < OC *mjəts | FQ 無沸 | According to Starostin: Standard Sino-Viet. is vị. Since the Chinese word also means (in later times) 'interest', Viet. muồi 'interesting' may be traced back to the same source. For *m- cf. Xiamen, Chaozhou bi6, Fuzhou muoi6, Jianou mi6. | cf. 未 wèi (SV 'mùi'), 'mìchính' 味精 wèijīng (SV vịtinh) 'MSG' ],

                and the correspondence ¶ /b- ~ m-/ between Middle Chinese ~ Sino-Vietnamese and Mandarin ~ Vietnamese can also be found, as in

              • 疲 pí 'tired' (VS mệt, SV bì) [ M 疲 pí < MC be < OC *bhaj | ¶ b- ~ m- ],
              • 肥 féi 'fat' (VS mập, mỡ, phệ, phị, SV phì) [ M 肥 féi < MC bwyj < OC *bjəj | According to Starostin, ST be fat, rich. Viet. phệ is a colloquial reading (cf. also reduplicated: phềphệ); standard Sino-Viet. is phì (reduplicated: phìphị). ],
              • 秘 mì 'secret' (SV bí /bei5/) [ M 秘 bì < MC pi < OC *prits ],
              • 忙 máng 'busy' (VS bận, SV mang) [ M 忙 máng < MC mjəŋ < OC *ma:ŋ | Dialects: Amoy boŋ12 (lit.), baŋ12; Chaozhou maŋ12; Fuzhou mouŋ12; Shanghai mã32 ], and
              • 悶 mèn 'sad' (VS buồn, SV muộn) [ M 悶 mèn < MC mɔn < OC *mjə:ns | Dialects: Amoy bun32, Chaozhou buŋ32. According to Starostin, 悶 mèn means 'melancholy, sorrow', absent from Schuessler's dictionary, although attested already in Yijing. The character is also used (since L.Zhou) for *mjə:n, MC mon, Mand. mén 'to be stuffy, stifling, close, airless' (both readings may actually be related). cf. Viet. 'ngộp' (stuffy) ¶ m- ~ ŋ- ]

            2. The correspondence { 蒂 dì ~ VS rễ ~ SV đế } follows the pattern of

              • 婿 xù 'son-in-law' (VS rể ~ SV tế) [ M 婿 xù < MC siej < OC *sas. Also: *sēs (Zhou zyxlj p.256), Karlgren: OC *srir, TB *krwy | cf. MC *sa 胥. MC siej could be from OC *sēs. MC description 解開四去 ],
              • 鬚 xū 'beard' (VS râu ~ SV tu /tʊ1/) [ M 鬚 xū ~ 須 xū < MC ʂjʊ < OC *so | FQ 相俞 | ¶ x-, s- ~ r- : ex. 蛇 shé (SV xà) rắn 'snake', 縮 suō (SV thúc) rút 'shrink' ],
              • 縮 suō 'shrink' (VS rút ~ SV thu /thʊ1/) [ Also, VS 'co', 'thụt' | M 縮 suō < MC ʂʊk < OC *sruk ],
              • 菜 cài 'vegetable' (VS rau ~ SV thái /the3/) [ Also, SV 'thể', VS 'cải' | M 菜 cài < MC chɤj < OC *shjə:ʔs ],
              • 愁 chóu 'sad' (VS rầu ~ SV sầu /sʌw2/) [ cf. 秋 qiū (VS thu /thʊ1/) | M 愁 chóu < MC ʐjəw < OC *dhu | Dialects: Suzhou zoy12; Wenzhou zau12; Changsha cou12; Nanchang chɜu12; Cant. sʌu12 ],
              • (及)速 (jí)sù 'hasty' (VS (gấp)rút ~ SV (cấp)tốc) [ M 速 sù < MC suk < OC *so:k ]
            3. Etymology of bảy ('seven')

            Khmer lacks a native morpheme for 'seven.' The Vietnamese form may derive from Proto‑Vietic *pəs, ultimately traceable to Proto‑Mon‑Khmer *d₁puulh, *d₁puəlh, *d₁pəlh. Cognates include Bahnar tơpơh and Bolyu pei⁵⁵.

            Shafer's hypothesis: To account for Old Bodish bdun 'seven' (in contrast to s‑Nis in most Tibeto‑Burman languages and nwi in Karenic), Shafer proposed an original form sibdunis. With accentual variation (sibdúnis), this yielded O.B. bdun. Since Old Bodish disallowed clusters such as sbd‑, the initial consonant was dropped, as in other examples:
              • Sino‑Tibetan m‑lt'ei 'tongue' → O.B. ltśe
              • Sino‑Tibetan p‑l‑ŋa → O.B. lŋa
            With accent sibdunís, the development may have proceeded as sibunís > siwunís > sinwis (Karenic nwi), while sibdunís > sunís > s‑Nis in most Tibeto‑Burman languages. Metathesis frequently preserved consonants that would otherwise have been lost, particularly in Bodish dialects, and a similar process may explain the Karenic forms.

            Comparative forms
              • Siamese: tśěţ₃
              • Western Bodish (Sbalti): bdun
              • Burig: ŕdun
              • Kharao: tśă‑ri
              • Vietnamese: bảy
              • Sino‑Vietnamese: thất /t'ɐt7/, as in đệ thất 'seventh'
              • Vernacular Vietnamese: thứ bảy 'Saturday; seventh'

            Some of the Sinitic‑Vietnamese examples above may suggest that their sound changes were derived from Sino‑Vietnamese, itself a development from Middle Chinese. Yet the reverse scenario is at least as plausible: basic vocabulary is more likely to preserve deeper connections with Old Chinese—or even with proto‑Chinese—than with the later strata of Middle Chinese.



            Conclusion

            The evidence reviewed in this chapter makes clear that Vietnamese cannot be reduced to a late‑tonalized offshoot of Mon‑Khmer. The substratum of basic vocabulary, the tonal correspondences with Old and Middle Chinese, and the persistence of Sino‑Vietnamese etyma across centuries all point to a deeper and more complex history. Haudricourt’s theory of tonegenesis, while pioneering in its time, inverted the actual process: rather than Vietnamese acquiring tones belatedly, it is more likely that certain Mon‑Khmer languages lost them, while Vietnamese developed in parallel with Chinese within the broader Sino‑Tibetan–Yue continuum.

            The correspondences in basic vocabulary (chó ~ 狗 gǒu, gà ~ 雞 jī, lúa ~ 來 lái, voi ~ 為 wéi, gấu ~ 熊 xióng) demonstrate that Vietnamese shares fundamental etyma with Chinese and related languages at the deepest lexical stratum. These parallels cannot be explained away as late borrowings; they reflect long‑standing contact and shared inheritance.

            From the Han conquest in 111 B.C. through nearly a millennium of colonization, Vietnamese evolved within the Chinese linguistic sphere, absorbing, adapting, and elaborating tonal categories that remain central to its identity today. The tonal system of Vietnamese is thus not an isolated innovation but part of a regional continuum that includes Old Chinese, Middle Chinese, and the Yue‑Daic languages of southern China.

            In sum, the Vietnamese language emerges as a product of layered interaction: indigenous Vietic roots, Austroasiatic affiliations, and sustained Sinitic influence. Its tonal system and basic vocabulary testify to a history of convergence rather than divergence, demanding a reassessment of long‑standing assumptions about its genetic affiliation. Vietnamese is best understood not as a peripheral Mon‑Khmer language, but as a central participant in the Sino‑Tibetan–Yue nexus that shaped the linguistic landscape of East and Southeast Asia.

            ENDNOTES


            (H)^ "The Phùng Nguyên culture of Vietnam (c. 2,000 - 1,500 B.C. ) is a name given to a culture of the Bronze Age in Vietnam during the Hong Bang Dynasty which takes its name from an archeological site in Phùng Nguyên, 18 km (11 mi) east of Việt Trì discovered in 1958. It was during this period that rice cultivation was introduced into the Red River region from southern China. The most typical artifacts are pediform adzes of polished stone." Source (as of March 2018):   https://en.wikipedia.org/wiki/Phùng_Nguyên_culture

            (S)^ List of the 23 fundamental basic words identified, for which Vietnamese and Chinese cognates can all be plugged into place without much difficulty. Let us save this for worksheet practice at the end, and wait to see what the Austroasiatic Mon-Khmer camp comes up with.

            1. Thou:_____________ 
            2. Not:______________
            3. To give:___________ 
            4. Man/male:_________ 
            5. Mother:___________ 
            6. Bark:_____________ 
            7. Black:____________ 
            8. I:________________
            9. That:_____________ 
            10. We:______________
            11. Who:_____________ 
            12. This:_____________ 
            13. What:____________ 
            14. Ye:______________
            15. Old:_____________ 
            16. To hear:__________ 
            17. Hand:____________ 
            18. Fire:_____________ 
            19. To pull:___________ 
            20. To flow:__________ 
            21. Ashes:____________ 
            22. To spit:___________ 
            23. Worm:___________ 
            See "Ancient Languages Have Words in Common" by Zachary Stieber, Epoch Times (May 6, 2013).

            (T)^ As previously discussed, the Cantonese and Hokkien subdialects descended from the common ancestral Yue language are officially classified as belonging to the Sino-Tibetan language family.

            [Meanwhile,] the Tai–Kadai languages, also known as Daic, Kadai, Kradai, or Kra–Dai, are a language family of highly tonal languages found in southern China and Southeast Asia. They include Thai and Lao, the national languages of Thailand and Laos respectively. There are nearly 100 million speakers of these languages in the world. Ethnologue lists 95 languages in this family, with 62 of these being in the Tai branch.

            The diversity of the Tai–Kadai languages in southeastern China, especially in Guizhou and Hainan, suggests that this is close to their homeland. The Tai branch moved south into Southeast Asia only about a thousand years ago, founding the nations that later became Thailand and Laos in what had been Austroasiatic territory.

            [...] The Tai–Kadai languages were formerly considered to be part of the Sino-Tibetan family, but outside China they are now classified as an independent family. They contain large numbers of words that are similar in Sino-Tibetan languages. However, these are seldom found in all branches of the family, and do not include basic vocabulary, indicating that they are old loan words.

            Several Western scholars have presented suggestive evidence that Tai–Kadai is related to or a branch of the Austronesian language family. There are a number of possible cognates in the core vocabulary. Among proponents, there is yet no agreement as to whether they are a sister group to Austronesian in a family called Austro-Tai, a backmigration from Taiwan to the mainland, or a later migration from the Philippines to Hainan during the Austronesian expansion.

            The Austric proposal suggests a link between Austronesian and the Austroasiatic languages. Echoing part of Benedict's conception of Austric, who added Tai–Kadai and Hmong–Mien to the proposal, Kosaka (2002) argued specifically for a Miao–Dai family.

            In China, they are called Zhuang–Dong languages and are generally considered to be related to Sino-Tibetan languages along with the Miao–Yao languages. It is still a matter of discussion among Chinese scholars whether Kra languages such as Gelao, Qabiao, and Lachi can be included in Zhuang–Dong, since they lack the Sino-Tibetan similarities that are used to include other Zhuang–Dong languages in Sino-Tibetan.
            [...]
            Tai–Kadai consists of five well established branches, Hlai, Kra, Kam–Sui, Tai, and the Ong Be (Bê) language:
            - Ong Be (Hainan; Lin'gao (臨高) in Chinese)
            - Kra (called Kadai in Ethnologue and Gēyāng (仡央) in Chinese)
            - Kam–Sui (mainland China; Dong–Shui (侗水) in Chinese)
            - Hlai (Hainan; Li (黎) in Chinese)
            - Tai (southern China and Southeast Asia)
            Source (as of Jan. 2017): https://en.wikipedia.org/wiki/Tai%E2%80%93Kadai_languages

            (A)^ The modern Austroasiatic family is a broad grouping. The same label is also used in a narrower sense for a sub-family comprising only the Mon-Khmer languages, with Vietnamese and its Vietic siblings (e.g., Muong, Tha, Vung, Ruc) treated separately. On the view taken here, all of these go back to ancestral speeches that originated from a proposed ancient proto-Taic language, whose relatives "were once spoken much more widely in China" (Norman, ibid.). Their variants have been explicitly described as having diverged remotely from the Taic forms that gave birth to the Yue languages, which in turn gave rise to the contemporary languages now classed as Sino-Tibetan, such as the Cantonese and Hokkien dialects. That is how, nominally, the Yue languages have come to fit into a much larger picture. Note that "Vietnamese" and "Muong" are specifically not grouped with the Mon-Khmer languages (Norman, ibid.), which indicates that Norman was also aware of the problems in classifying them.

            (W)^ Lacking first-hand experience of modern Chinese, both standard and colloquial, and with it the near-native "linguistic feeling" for the target language that a specialist needs, they would never know the roots of many Vietnamese words such as:

            - 'đầunậu' (ring leader) 頭腦 tóunăo (SV đầunão),
            - 'dàydạn' (experienced) 經驗 jīngyàn (SV kinhnghiệm),
            - 'láibuôn' (merchant) 大販 dăipán (Cant. /tai2pan3/),
            - 'lẻtẻ' (trivial) 零星 língxīng (SV linhtinh, 'miscellaneous'),
            - 'ănnhậu' (social engagement) 應酬 yìngchóu (i.e., 'eat and drink'),
            - 'cụngly' (raise glasses and cheers) 碰盃 bèngbèi,
            - 'đừnghòng' (don't you ever) 甭想 péngxiăng,
            - 'luônluôn' (always) 牢牢 láoláo,
            - 'lạcloài' (solitude) 落落 luòluò [ ~ 失落 shīluò (SV thấtlạc) ],
            - 'đượclắm' (pretty good) 得來 délái,
            - 'đượclòng' (pretty good) 心得 xīndé,
            - 'giờgiấc' (time) 時間 shíjiān [ while 'thuở (thủa)' (a period of time), a contraction of phonetic sandhi of 時候 shíhòu (SV thờihậu) ],

            all of which match the usage and meanings of their Chinese counterparts exactly, not to mention the in-depth knowledge of Chinese historical phonology required to appreciate the roots of basic lexical items such as

            - 'chỉ' 線 xiàn (thread) and 'chỉ' 錢 (ancient monetary unit weighing approximately a tenth of the Chinese unit 兩 tael) [ cf. 錢 qián (SV tiền) 'money' ],
            - 'đường' 唐 táng (road), as opposed to 途 tú (SV đồ) and 道 dào (SV đạo),
            - 'lá' 葉 yè (leaf) [ the pattern /j-/ ~ */l-/ is very common in Chinese. ],
            - 'lúa' 來 lái (paddy, as opposed to 稻 dào 'gạo' rice) [ cf. 麥 mài (SV mạch) ],
            - 'cá' 魚 yú (fish) [ /ke-/ and /ca-/ in English 'ketchup' and 'catsup' is cognate to V 'cá' ],
            - 'sông' 江 jiāng (river) as opposed to 川 chuān (SV xuyên) [ cf. 水 shuǐ (SV thuỷ) 'water', another word for 'river' ],
            - 'mây' 霧 wù (cloud), as opposed to 雲 yún (SV vân),
            - 'mưa' 雨 yǔ (rain) [ the pattern /y-/ ~ /m-/ is very common in Chinese ~ Vietnamese. ],
            - 'nắng' 陽 yáng (sunshine) [ Who says there is no Chinese word for 'sunshine'? ],
            - 'cóng' 寒 hán (chilly) [ Hai. /kwɔ5/ ],
            - 'biển' 海 hăi (sea) as opposed to both VS 'bể' and 'khơi' [ SV 'hải', for 'khơi', cf. Cant. /hoj3/; it is not hard to associate the 2 related sounds. Ex. 海外 hăiwài V 'hảingoại' (overseas) vs. VS 'ngoàikhơi' (out in the seas) ],
            - 'bữa' 飯 fàn (Hainanese /buj2/ 'meal'), as opposed to SV 'buổi' (period of the day),
            - 'ăn' 唵 ăn (eat) [ cf. 吃 chī (cf. 乙 yǐ (SV ất)) ], as opposed to 'xơi' 食 shí (SV thực),
            - 'uống' 飲 yǐn (drink) as opposed to 'hớp' 喝 hè (SV hát) 'sip',
            - 'đi' 去 qù (go), as opposed to 走 zǒu (SV tẩu) 'run' for 'chạy',
            - 'đứng' 站 zhàn (stand),
            - 'ỉa' 屙 é (to poo), 'đái' 尿 niào (to pee, same as VS 'tiểu', connotatively 'urinate', cf. 尿尿 niàoniào 'điđái'),
            - 'ngủ' 臥 wò (lie down to rest, hence 'sleep', as opposed to 睡 shuì, connotatively 'somnus'),
            - 'đụ' 嫖 piáo (fuck, a derivative of VS 'đéo', colloquially 他媽 Tāma ('Your mother's fucker')),
            - 'đẻ' 生 shēng (Hainanese /te1/) 'give birth to', in addition to 'tái' (Hai. /ta5/) 'uncooked',
            - 'việc' 活 huó (work), as opposed to 務 wù (SV vụ), 役 yì (SV dịch),
            and a great number of other words cited in this paper. For the same reason, lacking first-hand experience of modern Vietnamese, the same authors will never recognize disyllabic words such as
            - 'đốivới' (with respect to) 至於 zhìyú, giving rise to 'đếnnổi' (to such a degree that), as opposed to 對於 duìyú,
            - 'vòmtrời' 重圓 chóngyuán (SV trùngviên 'sky vault') instead of 宇宙 yúzhōu (SV vũtrụ 'universe'),
            - 'gỏi' 膾 (鱠) kuài (SV khoái 'mince meat (fish) salad') instead of 'chopped meat or fish',
            - 'quà' 饋 kuì (SV quỹ 'gift') instead of 禮物 lǐwù,
            - 'cảirỗ' 菜蘭 càilán (Chinese broccoli) instead of 'cảilàn' or 'cảilan',
            - 'dưahấu' 塊瓜 kuàiguā (SV khốiqua 'watermelon') instead of 西瓜 xīguā,
            - 'ănmày' 要飯 yàofàn (beggar) instead of 乞丐 qǐgài,
            - 'thầymô' 巫師 wùshì as opposed to 'phùthuỷ' (shaman), etc.,

            all of which are cognates.

            (南)^ Yet all such events occurred much later and had less cross-cultural impact than what could have come from another indigenous kingdom, Nanzhao 南詔 (Namchiếu), which flourished between 649 and 902 during the Tang Dynasty and to which the western half of today's North Vietnam once belonged.

            (菲)^ Archaeological excavations suggest that the aboriginal peoples of Austronesian and Malay origin in the Philippines came from the South China region at a time when a land bridge across the sea to the Philippine islands still existed, during the glacial period (Bao Shi-Tian 鮑事天, "菲律賓 的 漢學 研究 (Sinological Study in the Philippines)", pp. 55, 67, in Sung Shee, et al., Symposium on the Sinological Study Over the World, Taipei, 1967).

            (M)^ See also the Chinese translation by Huang Xuan-Fan 黃宣範, "中古 漢語 聲調 與 上聲 的 起源" (in <中國 語言 學論集>, 1977, pp. 175-197).