Comparative Evidence from Disyllabic Sino‑Vietnamese
by dchph
Historical linguistics has long emphasized monosyllabic correspondences in Sino‑Vietnamese etymology. Yet the disyllabic domain – compounds, reduplicated forms, and paired morphemes – remains underexplored. This article proposes a new approach to disyllabic sound change, highlighting how Vietnamese and Chinese reveal systematic correspondences beyond the single syllable.
The most revealing insight of this chapter is the role of disyllabicity. Vietnamese compounds often invert the order of their Chinese models, as in bắtnạt # 欺負 (qīfù, 'bully') or hồnthiêng # 靈魂 (línghún, 'spirit'). This inversion is not accidental but systematic, reflecting a stage when syllable order was still fluid. By testing reversed orders, scholars can uncover hidden cognates, reconstruct plausible etyma, and explain otherwise opaque forms. This disyllabic approach reframes Vietnamese etymology: instead of treating the language as a borrower of isolated monosyllables, it recognizes Vietnamese as a polysyllabic system that reshaped Chinese inputs according to its own rhythm and logic.
Contrary to a long‑standing misconception in certain linguistic circles, neither Chinese nor Vietnamese is inherently monosyllabic. This belief has persisted largely because novices tend to accept and repeat what they have been told without examining the evidence. A simple survey of modern Vietnamese vocabulary, which closely parallels that of Chinese, reveals that both languages are dominated by disyllabic words, that is, lexical items composed of two syllables, consonant dominant as well. Strikingly, the majority of these are of Chinese origin. Such forms are variously referred to as lexical disyllabicity, disyllabicism, or simply disyllabics. In the following section, we will examine disyllabic words in detail, tracing their linguistic changes and the processes by which they have evolved.
The term is usually spelled disyllabic, with a single “s”. Where this
research deliberately writes dissyllabic, with “ss”, the spelling is meant
to underscore the central importance of disyllabicity in both Vietnamese
and Chinese. Recognizing this feature is a prerequisite for any serious
study of the two languages.
Metaphorically, they may be seen as growing
from a vast, ancient linguistic tree with a monosyllabic stem, its roots
sunk deep into fertile soil enriched by thick Sinitic layers atop an
indigenous substratum. Over time, its branches have become heavy with
dissyllabic leaves and dotted with polysyllabic fruits, each with
distinct textures, forms, and appearances.
Understanding this
natural evolutionary path is an unprecedented insight, one that can
guide researchers in identifying further etyma and tracing their origins
with greater precision.
Many prominent Sinologists of the 20th century – including Maspero (1912), Karlgren (1915), Haudricourt (1954), Wang Li (1956), Chang (1974), Denlinger (1979), and Vietnamese scholars such as Lê (1967), Nguyễn (1979), and Ðào (1983) – made extensive use of Chinese data to illuminate the etymology of Sinitic‑Vietnamese words of Chinese origin over the past two millennia. While they recognized an affinity, whether genetic or not, between Chinese and Vietnamese, their analyses were overwhelmingly confined to monosyllabic forms. As a result, many Sinitic‑Vietnamese etyma escaped notice. In reconstructing Middle Chinese (MC) phonology, they relied on Sino‑Vietnamese readings but often failed to identify cognates of the same root embedded in Sinitic‑Vietnamese vocabulary.
Consider 東 as an illustrative case. Pulleyblank (1984) reconstructed its Middle Chinese (MC) value as /*təwŋ/, corresponding to modern Mandarin /dong1/, within his Early Mandarin framework. This aligns with the Sino‑Vietnamese đông /dowŋʷ1/ [ɗəwŋ], a form belonging to a rare division class marked by a closed, rounded lip final /‑owŋʷ/. Pulleyblank, together with Li Fang‑Kuei (1971), was among the few scholars to recognize this distinctive Old Chinese articulation.
From such a reconstruction, it is possible to posit that words ending in /‑owŋʷ/, or even /‑owŋm/, could evolve into /‑ow/. Following the sound‑change pattern of clipping, one finds parallels in forms such as đau ('painful') from 痛 tòng, thau ('bronze') from 銅 tóng, and đỏ ('red') from 彤 tóng. The reverse pattern may also be observed, as in đường ('road') from 道 dào (SV đạo), which can be postulated with a /daw/ final. The author doubts that most renowned linguists have recognized this type of clipping interchange between /‑owŋʷ/ and /‑aw/.
In the realm of Old Chinese reconstruction, eminent Sinologists such as Bernhard Karlgren and Henri Maspero devoted much of their attention to tracing Chinese loanwords in Thai, Khmer, Japanese, and Korean. In doing so, they overlooked a crucial fact: during the millennium of imperial Chinese rule, the inhabitants of ancient Annam regularly articulated forms of Mandarin in everyday speech. At the official level, local administrators addressed the China‑appointed viceroy in Mandarin; in domestic and social contexts, native wives conversed with their Chinese husbands and children in a mixed Chinese–Annamese vernacular, which also served as a lingua franca among Annamese themselves in daily colonial life.
Phonologically, when Pulleyblank and Wang drew upon Sinitic‑Vietnamese and Sino‑Vietnamese material in their reconstructions of Old Chinese, they may have recognized that such Chinese elements were embedded in early Annamese, some of which later crystallized into the Sinitic‑Vietnamese stratum. Yet their engagement with Vietnamese was largely confined to the Sino‑Vietnamese (Hán‑Việt) stock, and their proficiency in the language was limited, evidenced by frequent misspellings in cited examples. This narrow scope became a methodological constraint, perpetuating the same one‑to‑one correspondence issues that have long characterized earlier Sino‑Tibetan theorization on Vietnamese.
Because they failed to appreciate the close phonemic proximity between Chinese and Vietnamese, these scholars were unable to detect plausible sound‑change variations beyond monosyllabic stems. Consequently, none identified the expanded potential of monosyllabic roots when they appear in disyllabic formations, a phenomenon central to the present study.
Table 1 - Monosyllabicity
"Monosyllabicity" (tínhđơnâmtiết 單音節性) refers to the predominance of single‑syllable words in a language’s vocabulary. In earlier periods, some Western linguists even equated this feature with linguistic "primitivity," likening it to the speech of so‑called "savage" tribal groups in remote Amazonian jungles. The author is unaware of any truly monosyllabic language existing on earth.
In the case of Vietnamese, those who have labeled it "monosyllabic" reveal a fundamental misunderstanding. They have never undertaken the basic numerical exercise of calculating how many possible consonant‑vowel combinations a language restricted to monosyllables could produce, given standard syllable structures – VC, V, CV, and CVC – combined with the eight tones, and all possible phonemic permutations (for example, tac, tap, tat, and so on).
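The numerical exercise mentioned above can be sketched in a few lines. The inventory sizes used here are placeholders chosen only for illustration, not figures taken from this study; only the four syllable shapes and the count of eight tones come from the text. The sketch also ignores phonotactic restrictions, so the result is an upper bound rather than an actual syllable count.

```python
def max_monosyllables(n_initials, n_nuclei, n_finals, n_tones):
    """Upper bound on distinct tone-bearing syllables of shape V, CV, VC, or CVC."""
    shapes = {
        "V": n_nuclei,
        "CV": n_initials * n_nuclei,
        "VC": n_nuclei * n_finals,
        "CVC": n_initials * n_nuclei * n_finals,
    }
    return sum(shapes.values()) * n_tones


# Placeholder inventory sizes (assumptions for illustration, not measured values).
EXAMPLE = dict(n_initials=20, n_nuclei=10, n_finals=8, n_tones=8)
print(max_monosyllables(**EXAMPLE))
```

Whatever inventory is plugged in, the ceiling is a finite number of syllables, which can then be compared with the size of the lexicon under study.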
In the author’s most recent count, the Sinitic‑Vietnamese lexical collection in the Han-Nom Etymology dictionary contains nearly 80,000 entries. Many of these monosyllabic forms occur with frequency rates ranging from 25% to well above 100%, and require replacement with polysyllabic forms (tínhđaâmtiết 多音節性) to resolve semantic ambiguity. For example:
- manh 單 dān ("single") → áomanh 單衣 dānyi ("sweater")
- manh 氓 máng ("folk") → lưumanh 流氓 líumáng ("hooligan")
I) The problem with monosyllabism
- Traditional reconstructions privilege monosyllables, leaving disyllabic forms treated as irregular or secondary.
- Vietnamese, however, abounds in disyllabic words (thànhphố, xinlỗi, cảlũ) whose etymological depth is obscured when analyzed only through monosyllabic lenses.
- Chinese likewise preserves disyllabic compounds (jiànliàng, tónghuǒ) that align with Vietnamese forms in meaning and phonology.
For newcomers to the field, it is worth beginning your exploration of Vietnamese etymology within the realm of polysyllabicity – that is, the vocabulary stock consisting of words with two or more morphemic syllables. Examples include bảvai ('shoulder'), bângkhuâng ('melancholy'), sựcnhớ ('suddenly remember'), lộnxàngầu ('chaotic'), tóctaibùxù ('uncombed hair'), mặtmàybíxị ('unhappy face'), and many others. You will quickly see that two‑syllable words dominate the vocabulary stock, a pattern identical to the lexical status of Chinese.
Following this path leads naturally into historical linguistics, revealing far more about Vietnamese than the oft‑repeated misconception, still found among some native specialists, that it is a monosyllabic language.
Vietnamese is, in fact, far more than a disyllabic language in the lexical sense; it can also coin words by syllabic affixation, a topic to be discussed later in relation to phonology, semantics, and syntax. This section introduces the dissyllabicity approach and explains its advantages over older, self‑constrained methods that focus solely on monosyllabic words and one‑to‑one correspondences as base units of investigation. Once the linguistic community recognizes the superiority of the dissyllabicity approach, it can serve as a framework for newcomers to identify more Sinitic‑Vietnamese etyma, whose syllabic components are mostly of Chinese origin. Research in either Vietnamese or Chinese historical linguistics cannot be complete without reference to the other.
As noted earlier, the two languages are closely related and intertwined not only in Yue and Sino‑Tibetan etymologies but also in historical phonology. Chinese loanwords make up a large portion of Vietnamese vocabulary and have generated both focused and extended derivatives as the language has developed into full polysyllabicity. Examples include xứsở ('birthplace'), hợppháp ('legal'), tửtế ('kindness'), and xàcừ ('tridacna').
The point is clear: Vietnamese is not 'monosyllabic'; it has evolved over three millennia into a sophisticated language on par with any modern tongue. Polysyllabically speaking, it stands alongside English, French, and others, though it is still, wrongly, written in a monosyllabic orthography. Reform toward a polysyllabic writing system – comparable to the way pinyin joins the syllables of a word when transcribing Chinese characters, or to the way the Korean script groups sounds into blocks – would place Vietnamese on truly equal footing with these languages. Such a shift would free specialists from outdated perceptions and methodologies, opening the Sinitic‑Vietnamese domain as a new frontier in historical linguistics.
II) The Approach
- Polysyllabic grouping: Treat disyllables as units rather than decomposed syllables.
- Nucleus‑based annotation: Track vowel nuclei across disyllabic pairs to reveal systematic shifts.
- Comparative grids: Map Vietnamese forms against Middle Chinese and Old Chinese reconstructions, noting doublets and semantic overlap.
- Diachronic layering: Recognize substratal Yue and Tai‑Kadai influence in disyllabic survivals.
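One way to hold such a comparative grid in code is sketched below. The `GridRow` record, its field names, and the helper `rows_sharing_syllable` are hypothetical, introduced only for illustration. The four rows reuse pairs cited in this article, and the reconstruction fields are left empty rather than filled with guessed values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GridRow:
    """One row of a comparative grid; the disyllable is kept as a single unit."""
    vietnamese: str                        # written in combined (polysyllabic) form
    chinese: str                           # Chinese characters
    pinyin: str                            # modern Mandarin reading
    middle_chinese: Optional[str] = None   # empty unless a reconstruction is cited
    old_chinese: Optional[str] = None      # empty unless a reconstruction is cited


GRID = [
    GridRow("xinlỗi", "見諒", "jiànliàng"),
    GridRow("cảlũ", "大伙", "dàhuǒ"),
    GridRow("đồngloã", "同夥", "tónghuǒ"),
    GridRow("ungthư", "癰疽", "yōngjū"),
]


def rows_sharing_syllable(grid, pinyin_syllable):
    """Return rows whose pinyin ends with the given syllable (a crude overlap test)."""
    return [row for row in grid if row.pinyin.endswith(pinyin_syllable)]


for row in rows_sharing_syllable(GRID, "huǒ"):
    print(row.vietnamese, row.chinese, row.pinyin)
```

Keeping each disyllable in one record, instead of two separate syllable entries, mirrors the first principle in the list: the pair is the unit of comparison.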
Modern Vietnamese clearly shows its dissyllabic nature, with numerous high‑frequency disyllabic words formed either from two word‑syllables or from morphemic‑syllables – the former being independent monosyllabic words used as syllables, the latter bound morphemes that cannot function alone.
Most disyllabic words, often direct Chinese loanwords, entered Vietnamese intact and later evolved within the language. Chinese syllables have also been used to coin new disyllabic forms in much the same way as in modern Chinese, resulting in many Vietnamese and Chinese terms being near‑mirror counterparts. These compounds may be formed from individual morphosyllables that are synonymous, parallel, opposite in meaning, or simply assembled from existing lexical material.
In the following examples, we see how tức|giận ('mad/angry'), thương|yêu ('affection/love'), and trước|tiên ('firstly/initially') illustrate parallel or symmetric formation; how chiềucao ('height') and caothấp ('ranks') show antonymous pairing; and how locally innovated forms such as chàohỏi ('greeting, hello') or xinlỗi ('apologize') demonstrate semantic extension beyond their original Chinese meanings.
a. Parallel or symmetric compounds
- tức|giận 氣憤 qìfèn ('mad/angry')
- thương|yêu 疼愛 téng'ài ('affection/love')
- trước|tiên 首先 shǒuxiān ('firstly/initially')
- kề|cận 切近 qièjìn ('by/near')
- đường|cái 街道 jiēdào ('road/street')
- đường|lộ 道路 dàolù ('road')
b. Opposite or antonymous compounds
For compounds formed from opposite, or antonymous, word‑syllables, as noted earlier, examples include 高低 gāodī ('high/low'), which in Vietnamese corresponds to chiềucao ('height') with the same connotation as độcao 高度 gāodù ('the height'). At the same time, 高低 gāodī is also associated with the existing form caothấp ('ranks') to denote hierarchical levels in a competition. Similarly, 大小 dàxiăo ('large/small') becomes kíchthước ('size'), associated with 尺寸 chǐcùn (SV xíchthốn), while the original 大小 dàxiăo has evolved further in Vietnamese as tonhỏ ('whisper'). In these cases, the resulting forms are modified disyllabic words shaped by local innovation.
c. Locally innovated or semantically extended forms
Similarly, many other words reuse existing vocabulary to develop new, locally modified meanings, presenting themselves as extended and renewed lexical items whose senses exist only in Vietnamese through phonological association. This can be seen in cases such as @ 招 zhāo ~ chào ('greet') 早 zăo, @ 呼 hu ~ hỏi ('ask') 問 wèn, @ 見 jiàn ~ xin ('request') 請 qǐng, and others, exemplified below:
- chàohỏi 打招呼 dăzhāohu ('greeting, hello') [@ 招 zhāo ~ chào 早 zăo; @ 呼 hu ~ hỏi 問 wèn]
- xinchào 見過 jiànguò ('greeting, hello') [@ 見 jiàn ~ xin 請 qǐng; @ 過 guò ~ chào 早 zăo; archaic usage]
- xinlỗi 見諒 jiànliàng ('apologize') [@ 見 jiàn ~ xin 請 qǐng; @ 諒 liàng ~ lỗi; archaic usage; cf. modern M 道歉 dàoxiàn; also cognate to VS xinlỗi via @ 歉 xiàn ~ xin 請 qǐng and @ 道 dào ~ lỗi 罪 zuì (SV tội) → VS lỗi ('wrongdoing')]
- thươnghại 傷害 shānghài ('sympathize') [opposed to 'injure'; cf. modern M 同情 tóngqíng or SV đồngtình ('sympathize'); alternatively VS thươngtình ('pity')]
- tửtế 仔細 zǐxī ('kindness') [opposed to VS tỉmỉ ('meticulous') in Mandarin; cf. modern M 細心 xīxīn]
- lịchsự 歷事 lìshì ('polite') [opposed to VS bặtthiệp ← lịchthiệp ← 渉歷 shèlì SV thiệplịch ('polite')]
- chùxị 主事 zhǔshì ('host') [opposed to cognate 主席 zhǔxí SV chủtịch ('chairman')]
- đànghoàng 堂皇 tánghuáng ('solemnly') [opposed to 'stately']
- sựcnhớ 想起 xiăngqǐ ('suddenly remember') [@ 想 xiăng ~ sực ('chợt' 突 tù); @ 起 qǐ ~ nhớ 記 jì (ký)]
- nặngnhẹ 輕重 qīngzhòng ('criticize') [opposed to 'weight']
- khốnnạn 困難 kùnnán ('wretched') [opposed to 'difficulty' and '混蛋 húndàn' in modern Mandarin]
- bànội 內婆 nèipó ('paternal grandmother') [opposed to modern M 奶奶 năinài ('grandmother'); cf. 內公 nèigōng ('grandfather') in parallel with 外婆 wàipó bàngoại ('maternal grandmother') and 外公 wàigōng ôngngoại ('maternal grandfather')]
- anhem 兄妹 xiōngmei ('siblings', literally 'older brother and younger sister') [cf. em ← emgái 妹妹 mèimei ('younger sister', by metathesis and contraction) and 俺 ăn (VS 'em', 'younger brother'), a first‑person self‑addressing pronoun in Northern Mandarin (Shanxi, Shandong, Liaoning, etc.)]
- cậunhỏ 小舅 xiăojìu ('little boy') [opposed to original meaning: wife's younger brother addressed by her husband]
- chúnhỏ 小叔 xiăoshù ('little boy') [opposed to original meaning: husband's younger brother addressed by his older brother's wife]
- cônhỏ 小姑 xiăogū ('little girl') [opposed to original meaning: husband's younger sister addressed by his older brother's wife]
- khoảngđường 途徑 tújìng ('route')
- cáibàn 案子 ànzi ('desk')
- cáighế 椅子 yǐzi ('chair')
- cámực 墨魚 mòyú (also covering modern M 魷魚 yóuyú, 'squid')
d. Other categories
Why do these details on disyllabic words matter in the study of Vietnamese etymology? They demonstrate that such modified forms originate from changes in semantic, phonological, or lexical aspects – including localized words built from the same Chinese material in disyllabic formation – which provide greater semantic precision than single monosyllabic words.
III) Illustrative Examples
- xinlỗi ↔ 見諒 jiànliàng: apology forms showing nucleus inversion and semantic equivalence.
- cảlũ ↔ 大伙 dàhuǒ: group terms with shared morpheme huǒ ‘companion’.
- đồngloã ↔ 同夥 tónghuǒ: accomplice terms, demonstrating disyllabic semantic pairing.
- ungthư ↔ 癰疽 yōngjū: medical vocabulary, showing substratal survival in disyllabic form.
From a semantic perspective, close examination of the previously cited examples reveals recurring sound‑change patterns that underpin the etyma of derived Vietnamese words with extended meanings. These often represent selective alternations among Chinese disyllabic equivalents.
To expand your Sinitic‑Vietnamese corpus, refresh your memory and consider the following additional examples:
1. Semantic patterns
Close examination of the examples above reveals consistent sound‑change patterns underlying the etyma of Vietnamese disyllabic words with extended meanings. Many are selective alternations of Chinese disyllabic equivalents, adapted semantically to local usage.
Examples:
- phànnàn: 抱怨 bàoyuàn (SV báooán, 'complain') [→ thanphiền ('complain') ← thanvan ('lament') ← 'than' 嘆 tàn + 'phiền' 煩 fán]
- dànhriêng: 限於 xiànyú (SV hạnvu, 'purposely reserved for') [dànhcho ('reserved for') ← 'be limited to'; extended meaning synonymous with SV giớihạn (界限 jièxiàn, 'set the boundary')]
- rànhmạch: 明白 míngbǎi (SV minhbạch, 'unequivocal') [sángtỏ ('understand, bright') ← M 明白 míngbǎi; M 明 míng < MC maiŋ < OC *mraŋ || See VS 'biết' ('know') <~ (clipping of 明白 míngbǎi), cf. Hainanese, Amoy /bat7/]
- ănhàng: 吃貨 chīhuò SV ngậthoá ('eat gluttonously') [also 'like to eat junk food'; extended meaning 'run contraband']
- ănchơi: 應酬 yìngchóu ('drinking and eating') [ănnhậu ('be invited to dinner') ← 'engage in social activities']
2. Phonological variations
Sound‑change articulation often produces multiple Vietnamese variants from a single Chinese source, sometimes with altered tone or final consonant.
- riêngtư: 隱私 yǐnsī SV ẩntư ('private') [variants: riêngtây, tưriêng]
- sànhvề: 善於 shànyú SV thiệnvu ('be good at') [variants: rànhvề, rànhrẽ, sànhsỏi, sỏivề, hayvề... given /sh‑ ~ s‑, r‑, l‑/; thiện > hiền > hay ]
- dùrằng: 雖然 suīrán SV tuynhiên ('although, even if, however') [variants: chodù, dùsao, mặcdù, tuyvậy, dẫurằng, dùlà, mặcdầu... given /s‑ ~ j-(d-)/, /r‑ ~ l‑, m/]
3. Lexical associations
Many disyllabic compounds are built from two monosyllabic words that are themselves variants of Chinese morphosyllables, linked by synonymy, association, or metathesis.
Examples:
- tức|giận ('angry'): Vietnamese variation of 生氣 shēngqì; tức ← 氣 qì, giận ← 恨 hèn; reversed word order compared to Chinese.
- trước|tiên ('firstly'): Cognate to 首先 shǒuxiān; trước ← 前 qián, tiên ← 先 xiān; association with đầu 首.
- cũ|kỹ ('old'): Reduplication; cũ ← 舊 jìu, kỹ as variant; parallels 陳舊 chénjiù.
- kề|cận ('nearby'): Differs from 切近 qièjìn; aligns with gầnkề, gầngũi; metathesis in local speech patterns.
4. Historical Development
Aside from older inherited forms like khủnglong 恐龍 ('dinosaur') or yểuđiệu 窈窕 ('graceful'), much of Vietnamese lexical dissyllabicity is a later development. It functions as a mechanism to reduce monosyllabic homonymy after tonal differentiation, a process that Chinese and Vietnamese share.
5. Shared mechanisms
Chinese and Vietnamese share internal characteristics in disyllabic formation: pairing homonymous morphosyllables with tonal variation to create precise meanings.
Example:
- hiếuthảo ('filial'): Cognate to 孝順 xiàoshùn (SV hiếuthuận); thảo originates from thuận /tʰwʌn6/, denasalized to /‑ảo/; meaning shifted to 'generous' in contexts like thảoăn ('share food generously').
- 順 (shùn) acts as a free‑floating affix, as in xuôigió 順風 ('tail wind'), suônsẻ 順利 ('smoothly').
Phonologically, for now, newcomers to this field should begin by accepting at face value the many regular interchanges between Chinese and Vietnamese, in both directions. These shifts often occur in clusters across syllables rather than isolated phoneme changes:
- /‑eng → ‑e/
- /‑ang → ‑ac/
- /‑ong → ‑aw/
- /‑k → ‑ng/
- /n‑ → đ‑/
- /‑n → ‑i, ‑t/
- /‑wan → ‑oi/
- /‑u → ‑ang/
Comparable sound‑change patterns follow logical linguistic rules: phonemic shifts occur within the realm of neighboring sounds sharing similar articulatory attributes.
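The interchanges listed above can be encoded as a small rule table and applied mechanically to generate candidate forms for manual checking. This is only a sketch: the rule list copies the eight patterns above, while the function name `candidates`, the toneless spelling of the input, and the choice to apply one rule at a time are assumptions made for illustration. The output is a set of hypotheses to test, not a list of attested Vietnamese words.

```python
# (position, source pattern, replacement) copied from the list above.
RULES = [
    ("final", "eng", "e"),
    ("final", "ang", "ac"),
    ("final", "ong", "aw"),
    ("final", "k", "ng"),
    ("initial", "n", "đ"),
    ("final", "n", "i"),
    ("final", "n", "t"),
    ("final", "wan", "oi"),
    ("final", "u", "ang"),
]


def candidates(syllable):
    """Apply each matching rule once and collect the resulting candidate forms.

    Simplification: initials are matched by plain prefix, so digraph initials
    such as ng- would need proper segmentation in a fuller treatment.
    """
    found = set()
    for position, source, target in RULES:
        if position == "final" and syllable.endswith(source):
            found.add(syllable[: len(syllable) - len(source)] + target)
        elif position == "initial" and syllable.startswith(source):
            found.add(target + syllable[len(source):])
    return found


for form in ("tong", "mang", "nan"):
    print(form, sorted(candidates(form)))
```

Applying rules one at a time keeps each candidate traceable to a single stated pattern; combining rules across both syllables of a disyllable would be the natural next step under the approach described here.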
Table 2 - Common sound‑change patterns of Chinese → Vietnamese in disyllabicity
| Chinese Mandarin | Vietnamese reflex | English | Sound‑change patterns |
|---|---|---|---|
| 生 shēng | đẻ | 'give birth' | /sh‑ ~ đ‑/; cf. Hainanese /te1/ |
| 忙 máng | mắc, bận | 'busy' | /m‑ ~ b‑/; /‑ang ~ ‑ak/ |
| 痛 tòng | đau | 'pain' | /t‑ ~ đ‑/; /‑ong ~ ‑aw/ |
| 尿 niào | đái, tiểu | 'urinate' | /n‑ ~ đ‑, t‑/ |
| 蒜 suàn | tỏi | 'garlic' | /s‑ ~ t‑/; /‑uan ~ ‑oi/ |
| 前 qián | trước | 'before' | /q‑ ~ tr‑/; /‑ian ~ ‑uok/; cf. Hai. /tai2/ |
| 幕 mù (SV mạc) | màn | 'curtain' | MC mak → VN /‑an/ |
| 高低 gāodī | chiềucao, caothấp | 'height', 'ranks' | Semantic extension; antonym pairing |
| 大小 dàxiăo | kíchthước, tonhỏ | 'size', 'whisper' | Semantic shift; antonym pairing |
| 無聊 wúliáo | côliêu (SV vôliễu) | 'in extreme depression' | /w‑ ~ c‑/; semantic narrowing |
| 緣分 yuánfèn | duyênnợ (SV duyênphận) | 'fate, lot' | /‑en ~ ‑iên/; semantic re‑association |
Historically, Chinese became increasingly disyllabic, likely stabilizing during the Tang Dynasty. Many Middle Chinese disyllabic words entered Vietnamese in batches, with all related sound clusters shifting within the paired syllables as a complete unit, not simply vowel‑to‑vowel or initial‑to‑initial changes, nor rigid one‑to‑one syllable correspondences. Some disyllabic loanwords also appear in reverse syntactic order compared to modern Mandarin, reflecting earlier Middle Chinese usage. Examples:
- bảođảm ~ 擔保 dānbăo ('guarantee'),
- liênquan ~ 關聯 guānlián ('related to'),
- thithố ~ 措施 cuòshī ('show'),
- vinhquang ~ 光榮 guāngróng ('glorious'),
- trángkiện ~ 健壯 jiànzhuàng ('strong').
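Testing for reversed order can be sketched as a simple comparison between a Vietnamese compound and the syllable‑by‑syllable Sino‑Vietnamese reading of its Chinese counterpart. The syllable splits and the SV readings in the table below are supplied here for illustration and should be checked against a dictionary; the function name `order_relation` is likewise hypothetical.

```python
def order_relation(vietnamese, sv_reading):
    """Compare two syllable pairs: 'same', 'reversed', or 'unrelated'."""
    if tuple(vietnamese) == tuple(sv_reading):
        return "same"
    if tuple(vietnamese) == tuple(reversed(sv_reading)):
        return "reversed"
    return "unrelated"


# Vietnamese compound split into syllables -> (Chinese form, SV reading per character).
# The SV readings are assumptions supplied for this sketch.
PAIRS = {
    ("bảo", "đảm"): ("擔保", ("đảm", "bảo")),
    ("thi", "thố"): ("措施", ("thố", "thi")),
    ("vinh", "quang"): ("光榮", ("quang", "vinh")),
    ("tráng", "kiện"): ("健壯", ("kiện", "tráng")),
}

for vn, (hanzi, sv) in PAIRS.items():
    print("".join(vn), hanzi, order_relation(vn, sv))
```

An exact string match only catches compounds that keep their Sino‑Vietnamese shape. For Sinitic‑Vietnamese forms whose syllables have shifted, the equality test would have to be replaced by a looser correspondence check.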
For researchers, attention to dissyllabicity is essential: sound‑change patterns in paired syllables are a key process in tracing Chinese roots of many Sinitic‑Vietnamese etyma. Both Vietnamese and Chinese are, in structural terms, disyllabic languages. Chinese is already classified as polysyllabic by major linguistic institutions worldwide (Chou 1982, p.106), and Vietnamese can be formally classed as disyllabic based on its word‑formation characteristics and shared commonalities with Chinese.
Only within this framework can a reliable system of sound‑change patterns be established. Without such recognition, one might overlook correspondences such as
- 無聊 wúliáo ('in extreme depression') → VS côliêu (SV vôliễu)
- 緣分 yuánfèn ('fate, lot by which couples are brought together') → VS duyênnợ (SV duyênphận).
Recognizing that each Chinese word‑syllable in a pair may shift to a different sound in Vietnamese has led to the formulation of this dissyllabicity approach, enabling the identification of over 20,000 Vietnamese etyma cognate with Chinese forms, from ancient to modern dialects, literary and vernacular, many long regarded by purists as indigenous Nôm or "pure" Vietnamese words. Examples include:
- cá 魚 yú ('fish') [cf. 魚汁 yúzhi ('catsup', 'anchovy sauce') from Amoy dialect; OC nga]
- chim 禽 qín ('bird'), chóc 雀 què ('bird') [→ chimchóc ('birds')]
- dưa 瓜 guā [→ dưahấu 奎瓜 kuìguā ('watermelon')]
- chả 炸 zhà ('ham') [→ chảgiò 炸肉 zhàròu ('fried spring roll')]
- lụa 肉 ròu ('meat') [→ chảlụa 炸肉 zhàròu ('boiled meat loaf'); also 肉 ròu ~ 縷 lǚ ('silk')]
- giò 肉 ròu ('spring roll') [cf. chảgiò above]
- rọi 肉 ròu ('meat') [→ barọi 肥肉 féiròu ('bacon')]
- ruốc 肉 ròu ('meat') [Northern Vietnamese]
- dồi 肉 ròu ('sausage') [Northern Vietnamese]
- mặn 咸 xián ('salty') [→ mắm 鹹 xián; mắmcá ← 咸魚 xiányú ('fermented anchovy')]
Because many lexical compounds derive their meanings from the pairing of syllables, their true form should be written in polysyllabic orthography, as implemented in this paper. Chinese disyllabic words retain their paired‑syllable attributes when transformed into Vietnamese, often with significant semantic and phonological shifts. For example,
- 氣 qì shifted from hơi (Cant. /hei1/, 'air, steam') as in 汽車 qìchē ('automobile', VS xehơi) to kiệt in keokiệt 小氣 xiăoqì ('stingy'), while 小 xiăo became keo, cognate to 摳 (kòu, VS kẹo, 'stingy').
- In 客氣 kèqì (SV kháchkhí, 'polite'), 氣 qì appears as sáo or khứa in kháchsáo, which evolved into kháchkhứa.
- In 生氣 shēngqì ('angry'), 氣 qì is tức ('angry'), while 生 shēng means sống ('live') or đẻ ('give birth'), the latter implying 'becoming angry'.
The magnitude of these sound changes is far‑reaching and multi‑layered. In analysis, disyllabic words are treated as single units, with syllabic portions – macro-syllabic changes – capable of altering their vocalic shells in ways quite different from their monosyllabic counterparts – micro-phonetic changes.
A single monosyllabic word can, in fact, have more than one pronunciation. The phonological constraints governing an independent monosyllable do not necessarily limit the range of sound changes that may affect it when embedded in a disyllabic formation, especially across languages, where internal forces such as speech habits or localization take over. Examples include: ‑子 zi (cái, con, cây, trái), ‑兒 ‑r (nhi, nhí, nhỏ), ‑者 zhe (kẻ, giả, gia, nhà). In other words, deviations in sound change can occur across the entire string of sounds in a disyllabic unit, producing results quite different from the stand‑alone monosyllabic form. This is a case of one syllable yielding many outcomes.
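The one‑to‑many relationship can be made explicit by indexing reflexes by source syllable and by the compound in which each occurs. The sketch below uses only pairings stated in this article for 氣 qì; the helper name `reflexes_by_context` and the tuple layout are assumptions for illustration.

```python
from collections import defaultdict

# (source syllable, Chinese compound, Vietnamese compound, reflex of the source syllable)
OBSERVATIONS = [
    ("氣", "小氣", "keokiệt", "kiệt"),
    ("氣", "客氣", "kháchsáo", "sáo"),
    ("氣", "客氣", "kháchkhứa", "khứa"),
    ("氣", "生氣", "tứcgiận", "tức"),
]


def reflexes_by_context(observations):
    """Group the reflexes of each source syllable by the Chinese compound they occur in."""
    index = defaultdict(lambda: defaultdict(set))
    for source, compound, _vietnamese, reflex in observations:
        index[source][compound].add(reflex)
    return index


index = reflexes_by_context(OBSERVATIONS)
for compound, reflexes in index["氣"].items():
    print(compound, sorted(reflexes))
```

Grouping by compound rather than by bare syllable keeps the disyllabic context attached to every reflex, which is the point of treating the pair as the unit of analysis.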
If Vietnamese is still regarded as a monosyllabic language, the underlying dynamics of sound change in the Sinitic‑Vietnamese dissyllabicity approach will never be fully appreciated. Once the rule of sound change is accepted, as illustrated in earlier examples, questions such as why /‑ư/ corresponds to /‑a/, /‑iê/ to /‑a/, /‑au/ ~ /‑ông/, /‑ong/ ~ /‑au/, /‑at/ ~ /‑an/, /‑an/ ~ /‑ôt/, /‑ai/ ~ /‑ua/ will no longer arise. Nor will there be insistence on rigid one‑to‑one correspondences such as /‑ia‑/ → /‑ươ‑/, /‑ng/ → /‑ng/, or /d‑/ → /n‑/. In reality, combinations of phonological changes – affecting initials, medials, and finals separately – can produce entirely new sounds in the target language, e.g.,
- MC 學 /ɦaɨwŋk/ 'study' (SV học) → M xué;
- MC 一 /ʔjit/ 'one' (SV nhất) → M yī /ji1/.
When certain Chinese loanwords entered Vietnamese, sound changes may already have occurred within Chinese itself, or they may have taken place later in Vietnamese. In either case, apart from synchronically irregular items, these changes operated within linguistic constraints, often influenced by cultural factors. Local speech habits, for example, yield 手板 shǒubăn → bàntay 'palm' instead of taybàn, or 母 mǔ 'mother' → mẹ /mɛ6/, which further evolved into mợ /mə6/ 'maternal uncle’s wife' – likely through contraction by dropping 舅 jìu 'maternal uncle' while retaining Middle Chinese features. In Chinese, 舅母 jìumǔ remains the disyllabic form, avoiding homonymy.
Cultural factors have facilitated selective borrowing and triggered sound changes. Even at the time of borrowing, many words followed established phonological patterns, continuing to evolve over time due to locality, social status, education, and historical context. Examples:
- 他 tā (SV tha, 'he, him') → nẫu, nó, họ;
- 我 wǒ (SV ngã, 'I, me') → tôi, tao, tui, tớ, qua;
- 咱 zá (VS ta, 'I, we, us'); 咱們 zánmen (VS chúngmình, 'we' inclusive);
- 我們 wǒmen (VS tụimình, 'we').
Borrowing from neighboring Mon‑Khmer languages is far less common, showing Vietnamese reluctance to adopt such vocabulary. Even in multi‑ethnic highland and southern provinces, indigenous placenames (Đắklắk, Kontum, Đàlạt, Pleiku, Sóctrăng, Càmau) have been “Vietnamized” with tonal accents, but the spoken language remains unaffected.
By contrast, Vietnamese readily imports Chinese words, often adding parallel forms of the same root with similar meanings. For example,
- 粉條 fěntiáo (VS phởtiếu → hủtiếu, as in hủtiếu Namvang, 'Phnom Penh‑style seafood noodles') uses fěntiáo for hủtiếu, while "Namvang" is a transliteration of the name of Cambodia’s capital.
- 麵條 miàntiáo ('wheat noodles') shifted to sợimiến ('mungbean vermicelli') as well as sợimì and mìsợi.
- 粉條 fěntiáo shifted to búntàu (implying 'Chinese vermicelli') → phởtiếu or sợiphở ('rice noodles').
- 麵 miàn 'wheat flour/noodles' became mì, as in bánhmì (from 麵包 miànbāo 'bread', with 包 bāo linked to bánh 餅 bǐng 'bread'), bộtmì (麵粉 miànfěn, 'wheat flour') and, of course, mìsợi (麵條 miàntiáo).
Other examples:
- 味精 wèijīng ('MSG') → vịtinh → mìchính;
- 水餃 shuíjiăo ('dumpling') → taivạc → quaivạc;
- 餛飩 húndùn ('wonton') → hoànhthánh → vằngthánh.
Semantic shift is common in all languages, but the Vietnamese preference for Chinese material is telling: if Mon‑Khmer were the root, such borrowing would be far less extensive and mostly on a one‑to‑one basis.
Historically, Vietnamese linguistic development paralleled Chinese for at least 1,200 years before the 10th century, and continued thereafter. Numerous Chinese lexical items entered via dialectal contact, and vice versa. Many Vietnamese cognates appear in the Kangxi Dictionary as dialect forms, e.g., mềm (面 miàn, 'soft'), ăn (唵 ǎn, 'eat'); others, like mèo (卯 mǎo, 'cat'), are absent because Chinese scholars reject the cognacy.
Loan doublets with different pronunciations often entered Vietnamese in different periods or from different dialects. They followed acquisitive models within a linguistic kinship boundary. Thus French bande ~ Vietnamese băng, French pot-au-feu ~ Vietnamese phở, and English cut ~ Vietnamese cắt are not cognate pairs, whereas Chinese 繃 béng [baŋ] and VS băng, 粉 fěn and VS phở, and 隔 gé [kat] and VS cắt are.
The dissyllabics approach rests on analysis of Old Chinese, Middle Chinese, dialectal, and Mandarin data, rationalizing semantic relevance and generalizing sound‑change processes that match hundreds of Vietnamese words in their polysyllabic shells, including tonal correspondences across Chinese dialects. By recognizing the disyllabic nature of both Vietnamese and Chinese, we focus less on isolated phonemic shifts (e.g., /s‑ ~ t‑/, /sh‑ ~ th‑/, /t‑ ~ d‑/) and more on the dynamic, synchronous process in which clusters of sounds change together as a unit that is capable of producing multiple Sinitic‑Vietnamese variants, each distinct from the monosyllabic equivalents of their component syllables.
Table 3 - Benefits of polysyllabicity
As a side note, it is worth emphasizing how efficiently the human brain processes polysyllabic structures, a phenomenon long recognized in the Western world. Cognitively, the chief advantage of combining written forms into polysyllabically linked blocks lies in the ability to absorb information more rapidly by perceiving entire conceptual units at once. This parallels the experience of reading in Latin‑based writing systems–particularly German–where speakers combine and capitalize noun strings to achieve the same effect. Comparable outcomes are also evident in other block‑writing systems such as Korean or Thai, though notably not in Chinese.
A similar principle is applied in practical contexts. In U.S. motorway signage, for instance, drivers can recognize street names more easily when presented in polysyllabic blocks. The City of San Francisco, for example, replaced all‑uppercase street name signs with capitalized‑letter formats nearly two decades ago, significantly improving legibility and recognition from a distance.
Acknowledging this cognitive process provides a foundation for the dissyllabicity approach advanced in this paper. Building on its basic concepts and general principles, this methodology enables the identification of a vast number of Vietnamese words of Chinese origin. One striking feature of Chinese polysyllabic words borrowed into Vietnamese is the extent to which their vocalism has undergone drastic sound changes, diverging sharply from the original pronunciation.
As the examples illustrate, writing disyllabic words in their true combined form is central to postulating Sinitic‑Vietnamese etyma. In polysyllabic formation, individual syllables frequently undergo dynamic phonological shifts–deformation, contraction, or assimilation–moving from one form to another according to patterned articulations. Symbolically, this can be represented as:
XX XX X XX X XX XX X XX...
Here, spaces mark word boundaries in combined forms, with each XX modeled on block characters, as is scientifically denoted in Korean. Adoption of a similar system is strongly recommended for Chinese as well.
Comparatively, these sound‑change patterns resemble the way Latin polysyllabic roots generated diverse forms across the Indo‑European languages. Their variations are easier to trace because they are transcribed in Latin, and even in Cyrillic or Greek, alphabets. For the Sinitic‑Vietnamese etyma, we adopt a similar process by treating them as phonetic clusters rather than as Chinese ideographic blocks, which can distort conceptual analysis.
Unconventionally, in Romanized transcription, Vietnamese disyllabic words in this paper are written in combining formation, just as Mandarin multisyllabic words are transcribed in pinyin. For example,
- 廢話 fèihuà ('nonsense') → VS baphải
- 大話 dàhuà ('pompous') → VS bahoa
- 溫馨 wēnxīn ('warm') → VS ấmcúng
- 溫水 wēnshuǐ ('hot water') → VS nướcnóng
- 溫泉 wēnquán ('hot spring') → VS suối(nước)nóng
- 開心 kāixīn ('pleased') → VS hàilòng, vuilòng
- 忍心 rěnxīn (SV nhẫntâm, 'cold‑heartedly') → VS đànhlòng
- 忍讓 rěnràng ('forbearing') → VS nhườngnhịn
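The combining convention used in the list above can be sketched mechanically. The snippet below is only an illustration, not the author's tool: the function name `combine_syllables` and the three-word `LEXICON` are hypothetical, and the sketch assumes that a list of known disyllabic words is already available. It joins two adjacent syllables whenever the pair is listed as a single word.

```python
import unicodedata


def nfc(s):
    # Normalize so that precomposed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", s)


def combine_syllables(text, lexicon):
    """Join adjacent syllables when the pair is listed as one word in `lexicon`."""
    words = {nfc(w) for w in lexicon}
    syllables = nfc(text).split()
    out = []
    i = 0
    while i < len(syllables):
        pair = "".join(syllables[i:i + 2])
        if i + 1 < len(syllables) and pair in words:
            out.append(pair)  # write the disyllabic word as one block
            i += 2
        else:
            out.append(syllables[i])  # leave the syllable as it is
            i += 1
    return " ".join(out)


# Hypothetical toy lexicon, taken from the examples listed above.
LEXICON = {"ấmcúng", "nướcnóng", "vuilòng"}

print(combine_syllables("uống nước nóng", LEXICON))
```

A greedy left-to-right pass like this handles only two-syllable words and would need a real dictionary to be useful; it is meant to show the convention, not to segment Vietnamese text reliably.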
We will continue to examine this phonetic phenomenon to understand why, in many cases, sound changes in disyllabic words are both phonologically and semantically distinct from their original roots. Seeing multiple derived morphs from the same syllable in different disyllabic forms helps reveal that sound patterns operate across the entire cluster, not as isolated syllables. Yet this same formation may confuse lay readers, leaving the impression that phonological variants of the same Chinese monosyllabic stem are ad hoc.
As with earlier examples of dissyllabicity, the following illustrations expand on how syllabic changes in combined forms create new meanings. Consider 廢話 fèihuà ('nonsense'). If we accept bahoa as cognate to fèihuà, then 廢 fèi ('waste') aligns with ba through the interchange /f‑/ ~ /b‑/, while +/hoa/ conveys the sense of 'nonsensical', but only in this morphemic form and context.
One may wonder how ba and fèi could be related. They are etymologically connected only within the disyllabic form 廢話 fèihuà. Vietnamese ba here has nothing to do with ba meaning 'three' or ba 'father'. The /ba/ of bahoa exists only within the phonetic shell of /fèihuà/. Monosyllabic 廢 fèi by itself corresponds to SV phế ('waste') and VS bỏ ('discard'). Thus, ba and hoa individually carry no lexical meaning when isolated; they function only as bound morphemes within the disyllabic structure, where they may carry a slight phonosemantic value.
In this way, /ba/ as a bound morph contributes to both baphải and bahoa, yielding two distinct concepts in different vocalic shells. Here, one plus one produces more than two: several new disyllabic words emerge from 'recycled material' combined with other morphemes. Breaking them into monosyllables misses the point, since the morphemes alone do not carry meaning.
With the same affix 廢 fèi, we see further developments:
- bỏphế 廢除 fèichú ('eradicate')
- bỏđi 廢棄 fèiqì ('abandon')
- đồbỏ 廢物 fèiwù ('trash')
- bỏhoang 荒廢 huāngfèi ('deserted')
Like ba, the Sinitic‑Vietnamese bỏ is not always tied directly to 廢 fèi. It arises through multiple sound‑change processes, especially in disyllabic words. Other examples include:
- bãibỏ 排除 páichú ('abolish')
- bỏphiếu 投票 tóupiào ('cast a ballot')
- bỏrơi 抛棄 pàoqì ('leave behind')
- bỏđi 放棄 fàngqì ('abandon')
- bỏqua 放過 fàngguò ('let go')
- bỏlỡ 錯過 cuòguò ('miss an opportunity')
- bỏmặc 不管 bùguăn ('do not care')
- bỏbê 不理 bùlǐ ('abandon')
- bỏphí 白費 báifèi ('to waste')
and phrases:
- bỏlỡ dịpmay = 放過 機會 fàngguò jīhuì ('miss an opportunity')
- bỏtiền (vô túi) 把錢 進入 口袋 里 bă qián jìnrù kǒudài lǐ ('put money into the pocket')
- bỏtiền ra mua 花錢 來 買 huāqián lái măi ('spend money to buy')
These shifts to bỏ reflect contextual innovation, involving not only phonological and semantic assimilation but also syntactic reshuffling, such as reversal of word order (đồbỏ, vứtbỏ, bỏhoang) to fit Vietnamese speech habits.
Similarly, in baphải 廢話 (fèihuà, SV phếthoại), 話 huà evolves into hoa, but how does it become phải? The sound‑change rule ¶ /hw‑ ~ fw‑/ applies, a common pattern in Cantonese and Fukienese compared to Middle Chinese or Mandarin. For example, 葩 pā (SV ba) ~ 花 huā (Cant. /fa1/). In disyllabic formation, /fwa/ could shift to [fai3]. Note also that 話 huà /hwa5/ in its monosyllabic form could evolve into nói ('talk', SV thoại). A parallel pattern ¶ /th‑ (sh‑) ~ n‑/ is seen in 水 shuǐ (SV thuỷ, 'water') → nước; Viet‑Muong /dák/ parallels M 踏 tà 'đạp', VS chà ('trample').
For the same reasons, sound changes can occur in a variety of other ways. For example:
- 開 kāi ~ mở ('open') [cf. Cant. /hoj1/; SV khai; Hai. /k'uj1/; Viet. khui; note pattern ¶ /k‑ (kh‑) ~ m‑ \ hw‑/]
- 口 kǒu ~ mỏ ('muzzle') [SV khẩu /kow3/, Cant. /how3/; cf. mồm 吻 (wěn, 'mouth', SV vẫn, VS hôn, hun, 'kiss'); note pattern ¶ /k‑ (kw‑, kh‑) ~ m‑ \ hw‑/]
- 底 dǐ ~ trệt ('street level') [Ex. 一樓一底 yī lóu yī dǐ → một lầu một trệt 'the street level and one upper floor']
- 快 kuài ~ mau ('fast'; a loan‑graph from the character meaning 'happy', SV vui) [SV khoái /k'waj5/, cf. Cant. /faj1/; note pattern ¶ /k‑ (kw‑, kh‑) ~ m‑ \ hw‑/]
- 點 diǎn, as in 快點 kuàidiǎn → maulên ('hurry up') [here 點 diǎn (SV điểm) shifts because ¶ /d‑ ~ l‑/]
Of course, lên herein does not mean 'ascend, go up, get on'; instead, it functions as a grammatical particle indicating a course of action, similar to 'up' in 'hurry up'. Phonologically, M /tjen3/ corresponds to Vietnamese /len1/, and etymologically both are cognate.
- 點 [tjen3] also yields tiếng ('hour'), châm ('ignite'), chấm ('dot, dip'), tí ('a bit'), điểm ('point'), đếm ('count'), etc. [all from M 點 diǎn, diàn, dian, zhān < MC tiɛm < OC *te:mʔ ]. Remarkably, the Vietnamese meanings match the full semantic range of 點 (diǎn) in Chinese dictionaries.
Separately, for the connotation of lên ('up'), compare:
- lênđây 上來 shànglái ('come up here').
Here 上 shàng corresponds to lên ('ascend'), while 來 lái is a grammatical particle equivalent to VS ‑đây, assimilated as an adverb of direction, cognate with 此 cǐ (SV thử) or 這 zhè (SV giả) meaning 'here'.
- 溫 wēn → ấm, but how does 馨 xīn become cúng? It is not the same as cúng 供 (gòng, SV cống, 'make offerings'), but a result of sound change. 馨 xīn is also pronounced xīng in pinyin, hinh /xejng1/ in SV [ 馨 xīng, xīn (hinh, hấn) < MC heŋ < OC *qʰeːŋ ]. The velar /x‑/ often shifts to labiovelar /kw‑/, /k'w‑/ in Chinese [cf. 慶, 磬, 罄, all qìng in Mandarin, khánh in Sino-Vietnamese]. In 馨香 xīnxiāng ('fragrance'), 馨 xīn aligns with thơm, associated with hương (香 xiāng, 'fragrant').
This illustrates the dissyllabicity approach: begin with a word in Vietnamese or Chinese, expand it into all plausible disyllabic cognates, then eliminate the unreliable to establish the most plausible etyma.
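The expand-then-eliminate procedure just described can be sketched as a small generate-and-filter routine. This is a hedged illustration under loud assumptions: the `REFLEXES` table is a hypothetical toy built only from correspondences cited earlier in this section (溫 ~ ấm, nóng; 水 ~ nước; 馨 ~ cúng, thơm), and filtering by attestation in a word list merely stands in for the elimination step, whose actual criteria (semantic fit, sound-change plausibility) are not formalized in the text.

```python
import unicodedata


def nfc(s):
    # Normalize so that precomposed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", s)


# Hypothetical toy table: Chinese character -> Vietnamese reflexes cited above.
REFLEXES = {
    "溫": ["ấm", "nóng"],
    "水": ["nước"],
    "馨": ["cúng", "thơm"],
}


def candidate_forms(compound):
    """Expand a two-character compound into candidate Vietnamese disyllables,
    in both the original and the reversed syllable order."""
    first, second = compound
    candidates = set()
    for a in REFLEXES.get(first, []):
        for b in REFLEXES.get(second, []):
            candidates.add(nfc(a + b))  # same order as the Chinese model
            candidates.add(nfc(b + a))  # reversed order
    return candidates


def attested(candidates, lexicon):
    """Keep only the candidates found in a list of attested Vietnamese words."""
    words = {nfc(w) for w in lexicon}
    return sorted(c for c in candidates if c in words)


# Hypothetical stand-in for a Vietnamese dictionary.
LEXICON = {"ấmcúng", "nướcnóng"}

print(attested(candidate_forms("溫水"), LEXICON))
print(attested(candidate_forms("溫馨"), LEXICON))
```

Generating both orders reflects the point made in this article that Vietnamese compounds often reverse their Chinese models, as 溫水 wēnshuǐ ~ nướcnóng does.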
Many examples show how the same morphemic syllable in derived disyllabic words evolves into bound morphemes or compounded morphs, which cannot stand alone but function only within polysyllabic formations.
For instance, 利 lì (SV lợi, lị) yields VS lời and lãi:
- 利息 lìxì (SV lợitức) → VS tiềnlời ('profit, interest rate')
- 利得 lìdé (SV lợiđắc) → VS đượclãi ('making money')
- 贏利 yínglì (SV anhlợi) → VS ănlời ('earning profits')
- 有利 yǒulì (SV hữulợi) → VS cólợi ('beneficial')
- 利事 lìshì (SV lợisự) → Cant. /lei2sei2/ → VS lìxì ('red‑enveloped money')
- 流利 líulì (SV lưulợi) → VS lưuloát ('fluently')
- 順利 shùnlì (SV thuậnlợi) → VS suônsẻ ('smoothly')
- 伶利 línglì (SV linhlợi) → VS lanhlẹ, lanhlẹn ('quick')
The same syllable thus produces lãi or lời as stand‑alone words, or morphemic derivatives like ‑sẻ, ‑loát, ‑lẹn. These may or may not carry meaning independently, depending on their associative strength. All are related: lời ~ lãi ~ lợi ~ lị. Note that lời and lãi were coined euphemistically to avoid the taboo of King Lê Lợi’s name (黎利, Lê Thái Tổ – 黎太祖 Lí Tàizǔ, 15th century).
Further illustration:
- lẹ ('quick') in 伶利 línglì (SV linhlợi) → VS lanhlẹ.
Compare mauchóng 敏捷 mǐnjié ('quickly'), a variation of chóngmau (盡快 jìnkuài, 'as fast as possible'), colloquially linked to 馬上 (mǎshàng, VS mauchóng, 'immediately', lit. 'on horseback'). Here 快 kuài (SV khoái) is the source of VS mau ('fast') as no other Chinese character directly matches /maw/. Interestingly, kuài also meant 'happy' (VS vui), showing the pattern ¶ /k‑ (kh‑) ~ wj‑, v‑/ \ hw‑.
These examples show that in disyllabic forms, either syllable can evolve into various Vietnamese sounds in other compounds. They also demonstrate metathesis (reversal of order), e.g., mauchóng vs. chóngmau, nhanhchóng vs. chóngvánh.
The novel disyllabic approach departs from the old focus on isolated monosyllables and the boring pattern of one‑to‑one cognates. For example, few specialists have posited bậnviệc with 忙活 mánghuó, or phànnàn with 抱怨 bàoyuàn. Many stop at 務 wù (SV vụ) or 役 yì (SV dịch), missing that VS việc ('work') is cognate with 活 huó (SV hoạt), too.
Thus, many Sinitic‑Vietnamese etyma have been overlooked due to the misconception of Vietnamese and Chinese as monosyllabic. Monosyllabicity is a primitive feature of proto‑languages, not of modern Vietnamese or Chinese. This misconception has hindered breakthroughs in Vietnamese etymology. The antithetical views presented here aim to correct this and open new approaches.
The dissyllabicity approach rests on two premises:
- Both modern Vietnamese and Chinese are fundamentally disyllabic, with a high percentage of two‑syllable words (see Chou Fa‑Kao, 1982).
- There exists a deep kinship between them, traceable through Tai > Yue > Dai and Tai > Chu > Han lineages.
The author's hypothesis began with instinct and suspicion of this distant genetic affinity, then was confirmed by systematically matching forms in disyllabic structures.
Intuitively, applying the dissyllabicity approach has already uncovered thousands of Vietnamese words of Chinese origin. This method has also enabled the author to identify certain basic words with a high degree of accuracy. To illustrate how this process works, consider the following examples:
1. chimchóc 禽獸 (qínshòu, SV cầmthú, 'birds')
- Also VS thúvật ~ convật for SV cầmthú ('animals').
- For chóc, often treated in modern Vietnamese as a reduplicative syllable, the evidence suggests it was originally an independent monosyllabic word. It is associated with 雀 què, qiăo, qiāo (SV tước).
- M 禽 qín < MC gim < OC *ghjəm. Dialects: Chaozhou ʑin12, Wenzhou ʑiaŋ12, Shuangfeng ʑin12.
- For chim ('bird'), cf. M 鳥 niăo (SV điểu) ~ Hai. /jiao2/.
- Starostin notes that 禽 was frequently used since Late Zhou with the meaning 'wild bird(s)' or 'something caught', while 擒 was used for 'to capture'. 獸 shòu < MC ʂjəw < OC *ʔjəwʔh.
Thus, 禽獸 qínshòu corresponds to thúvật or convật ('animals'). But chóc is a basic word synonymous with chim, preserved as a dialectal variant in Thanhhoá and Ninh bình, the region of the ancient capital Hoalư in the 10th century. Its survival there strengthens the case for chóc as an authentic monosyllabic root, likely cognate with Old Chinese forms.
2. chóc 雀 (què, qiăo, qiāo, SV tước, 'bird')
- M 雀 què, qiăo, qiāo < MC cjak < OC *tɕekw. | Used in compounds like chimchóc 禽雀 qínqiāo.
- The regular Sino‑Vietnamese reflex is tước, but chóc survives in Vietnamese as a true monosyllabic word.
- Starostin reconstructs tɕekʷ, noting 雀 was also used as a general name for small birds in early Chinese.
3. chảcá 炸魚 (zhàyú, SV tạcngư, 'fried fish cake')
- Literally 'fried fish'.
- M 炸 zhà < MC tɕak < OC *tɕra:ks; 魚 yú < MC ŋʊ < OC *ŋha.
- Vietnamese chả (boiled ground meat cake, 'ham') derives from 炸 zhà ('deep fry'). Semantically, it shifted from 'fry' to 'meat cake', but retains the original sense in chảcá 炸魚 ('fried fish').
- Cf. chảrươi 炸虲 ('fried worm'), chạotôm 炸蝦 ('fried shrimp cake').
- Vietnamese rán ('fry') is cognate with 煎 jiān ('fry'), hence both rán and chiên coexist.
- In the south, chả is used; in the north, giò (giòlụa, chảlụa 炸肉 zhàròu). Taiwanese usage: 紮肉 zhāròu ('boiled pork meatloaf'), 魚扎 yúzhā ('fish cake').
4. chảlụa ~ giòlụa 炸肉 (zhàròu, SV tạcnhục, 'boiled meatloaf')
- Literally 'fried meat'.
- M 肉 ròu < MC ȵuwk < OC *njuɡ
- lụa here is a phonetic variant of 肉 ròu (¶ /r‑ ~ l‑/), not related to 綢 chóu ('silk').
5. cậtruột 骨肉 gǔròu (SV cốtnhục, 'blood kinship')
- Literally 'bone and flesh'.
- 骨 gǔ aligns with cật; 肉 ròu aligns with ruột ('intestine'), extended metaphorically to 'kinship'.
- Cf. 親子 qīnzǐ ('conruột'), 親爹 qīndiè ('charuột'), 親母 qīnmǔ ('mẹruột').
6. barọi 肥肉 féiròu: SV phìnhục ('bacon')
- 肥 féi 'fat' → ba; 肉 ròu → rọi (voiced variant).
- Possibly linked to ba (三 sān, 'three'), innovated as 'three layers of meat' (thịtbarọi). Cf. 五花肉 wǔhuāròu ('streaky pork').
7. búnriêu 蟹粉 xiéfěn (SV giảiphấn, 'crab noodle soup')
- M 粉 fěn, fèn < MC pun < OC *pɯnʔ
- Vietnamese bún ('vermicelli') is likely an independent loan from 粉 | ¶ /f‑ ~ b‑/.
8. mắmriêu 鹹蟹 xiánxié (SV hàmgiải, 'salted crab sauce')
- 蟹 xié 'crab' → ghẹ, cua, cáy.
- M 蟹 (蠏) xiè, xiě, xié < MC ɦaɨj < OC *gre:ʔ
- Vietnamese riêu/rêu plausibly derives from 蟹 xié.
9. mắmruốc 鹹蝦 xiánxiā (SV hàmhà, 'shrimp paste')
- 蝦 xiā 'shrimp' → ruốc, tôm, tép.
- ¶ /x‑ (OC *ghr‑) ~ r‑/.
10. mắmcá 鹹魚 xiányú (SV hàmngư, 'salted fish paste')
- 鹹 xián 'salty' → mặn.
- Vietnamese mắm ('fish sauce') can be traced to 鹹 xián.
Synthesis: These examples demonstrate how the dissyllabicity approach reveals Vietnamese cognates of Chinese etyma, often hidden in bound morphemes or dialectal variants. Words like chóc, lụa, ruột, rọi, mắm, riêu, and ruốc show how phonological shifts, semantic extensions, and local innovations transformed Chinese monosyllables into Vietnamese disyllabic forms.
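The initial-consonant patterns marked with ¶ in the examples above can be held in a simple lookup table. The following is a minimal sketch only: the names `PATTERNS` and `matches_pattern` are hypothetical, the table contains just three of the cited patterns (/r‑ ~ l‑/, /f‑ ~ b‑/, /x‑ ~ r‑/), and the check is a naive prefix comparison on pinyin and Vietnamese spellings rather than a phonological analysis.

```python
# Three initial correspondences cited in the examples above,
# stored as (Mandarin pinyin initial, Vietnamese initial).
PATTERNS = {("r", "l"), ("f", "b"), ("x", "r")}


def matches_pattern(pinyin, vietnamese):
    """Return True if the two spellings begin with a listed pair of initials."""
    return any(
        pinyin.startswith(m) and vietnamese.startswith(v)
        for m, v in PATTERNS
    )


# Pairs proposed in the text: 肉 ròu ~ lụa, 粉 fěn ~ bún, 蝦 xiā ~ ruốc.
for pinyin, vietnamese in [("ròu", "lụa"), ("fěn", "bún"), ("xiā", "ruốc")]:
    print(pinyin, vietnamese, matches_pattern(pinyin, vietnamese))
```

A match here shows only that a proposed pair is consistent with a listed pattern; it says nothing about whether the etymology itself is correct.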
For the etymon mắm (the well‑known Vietnamese 'fish sauce'), we may posit a connection with 鹹 xián, originally cognate with mặn ('salt, salted'), to denote a staple that in fact has Chamic origins (see Nguyễn Ngọc San, 1993). For related items, riêu or rêu can plausibly be traced to 蟹 xié, which also underlies ghẹ, cáy, and cua (three Vietnamese words for different kinds of crabs; notably, ghẹ resembles small Alaskan king crabs). Likewise, bún ('vermicelli') derives from 粉 fěn, which also produced phở ('noodle'), bột ('flour'), phấn ('chalk'), and bụi ('dust'). We may also postulate ruốc as cognate with 蝦 xiā (VS tép, tôm, 'shrimp, prawn'), while another type of ruốc ('fried shredded meat jerky') reflects 肉 ròu, by dialectal variation in the south, alongside ruột, rọi, and lụa functioning as morphs in the compounds cited above.
In short, the dissyllabicity approach has enabled the identification of more than 20,000 so‑called "pure Vietnamese" etyma that are in fact cognate with Chinese forms, spanning ancient to modern dialects, both literary and vernacular. These Sinitic‑Vietnamese items include words long regarded by purists as indigenous Nôm, such as chim, chóc, chả, giò, cá, lụa, ruột, rọi, mặn, mắm, riêu (rêu), ghẹ, cua, cáy, ruốc, tôm, tép, bún, bột, phấn, bụi, phở, and others. Amusingly, some specialists have even proposed French or unrelated Chinese origins, for example, linking phở to French feu (as in pot‑au‑feu) or lụa to 綢 chóu ('silk'), but the disyllabicity analysis shows that Chinese 粉 fěn offers the more plausible Sinitic connection.
As shown, the discovery of a large body of authentic Chinese-Vietnamese cognates through syllabic association provides strong evidence of genetic kinship between the two languages. These etymological findings serve as building blocks for a framework of their historical affiliation and, at the very least, offer solid evidence accounting for over 95% of the Vietnamese lexicon of Sinitic origin.
Table 4 – The French do not speak French
Anthropologically, the development of Vietnamese can be examined in parallel with the ethnogenesis of the Kinh people, provided we accept that the origins of language are reflected in its most basic words. Such a perspective challenges long‑standing claims advanced by Austroasiatic Mon‑Khmer theorists. The Vietnamese Kinh are a mixed people, descended from Sinicized Yue migrants from southern China who resettled in the ancient land known in Vietnamese history as Vănlang. There, they intermarried with Daic indigenous populations. This process unfolded over at least two millennia and continued until the Han conquest of ancient Annam.
Linguistically, Vietnamese etyma of solid Chinese origin dominate, shaping both the sound system and structural character of the language as it exists today. To understand this affiliation, one may compare the relationship between Vietnamese and Chinese to that between English and its Greek, Latin, and French elements, as opposed to its Anglo‑Saxon base. All belong to the Indo‑European family, yet English vocabulary reveals striking contrasts: water vs. French eau, one vs. une, snow vs. neige, and so forth.
By way of analogy, it is worth recalling that the French today do not speak their ancestral Gaulish tongue, but rather a Latin‑derived language now called French. Readers should note, however, that this is not the case with Vietnamese, whose linguistic foundation remains deeply and enduringly tied to its Sinitic heritage.
In each polysyllabic Chinese word, composed of two or more syllables or morphemes represented by individual characters, every unit, regardless of its meaning, is associated with a morpheme that may appear under different phonetic shells, whether in monosyllabic or polysyllabic form.
For example, in VS bồhòn ('wingleaf soapberry'), we find parallels such as 無患 wúhuàn (SV vôhoạn), 苦患 kǔhuàn (Hainanese, SV khổhoạn), 油患 yóuhuàn (Sichuan, SV duhoạn), and 木患 mùhuàn (in Li Shizhen, 'Sapindus saponaria', SV mụchoạn). Here 患 huàn ('trouble') functioned as a loangraph (假借 jiăjiè) for 丸 wăn (SV hoàn, VS hòn 'ball‑shaped object') (An Chi 2016, Vol. 2, p. 154). This illustrates how syllabic presentations in Chinese characters may convey entirely different meanings, e.g., 患 vs. 丸, regardless of their written form. Such loangraphs, or 'internal loanwords', were often sound‑loans unrelated to the original semantic value, and by association they entered Vietnamese compounds as well.
In both languages, a morpheme typically coincides with a syllable, which can freely combine with others to form new words, regardless of its core meaning. For instance:
(i) on the Chinese side,
- 運氣 yùnqì: hênxui ('by luck')
- 起碼 qǐmă: ítra ('at least')
- 馬虎 măhǔ: qualoa ('carelessly')
- 馬上 măshàng: mauchóng ('quickly')
- 便宜 piányi: giábèo, rẽbèo ('cheap')
- 便秘 biànmì: táobón ('constipation') [<~ SV tiệnmật]
- 東西 dōngxī: đồđạc ('things')
- 東家 dōngjiā: chủnhà ('host')
- 聊天 liáotiān: tròchuyện ('chat')
- 無聊 wúliáo: lạtlẽo, nhạtnhẽo ('boring'), vôduyên ('nonsense')
- 陌生 mòshēng: lạlùng ('strange')
- 花生 huāshēng: đậuphụng ('peanut')
- 棒子 bàngzǐ: tráibắp ('corncob')
- 包米 bāomǐ: bắpmì ('corn kernel')
- 玉米 yùmǐ: ngôbắp ('corn')
- 玉丸 yùwăn: hòndái ('testicle') [cf. SV ngọchoàn in medical usage]
- 點心 diănxīn: dằnbụng ('snack') [~ VS lótlòng; SV điểmtâm 'breakfast']
- 點錢 diănqián: đếmtiền ('count money')
etc.
(ii) and here on the Vietnamese side,
- đườngmật 甜蜜 tiánmì ('sweetly') [@ 甜 tián (SV điềm) ~ 糖 táng 'sugar']
- dưahấu 塊瓜 kuàiguā ('watermelon') [modern M 西瓜 xīguā (SV tâyqua) → VS dưatây 'honeydew']
- thathiết 體貼 tǐtiè ('heartily')
- bênhvực 包庇 bāobì ('take side')
- bánhmì 麵包 miànbāo ('bread')
- làmviệc 幹活 gànhuó ('work')
- bậnviệc 忙活 mánghuó ('busy')
- cậtruột 骨肉 gǔròu ('blood kinship')
- chảgiò 炸肉 zhàròu ('fried spring roll')
- cẩuthả 苟且 gǒuqiě ('carelessly')
- nhưngmà 而且 érqiě ('but also')
- mứcđộ 幅度 fúdù ('extent')
- bứcvẽ 畫幅 huàfú ('a painting')
- đòngang 渡江 dùjiāng ('ferry boat')
- núisông 江山 jiāngshān ('country')
- trờinắng 太陽 tàiyáng ('sunshine')
- tạnhtrời 晴天 qíngtiān ('dry weather')
- banngày 白天 báitiān ('daylight') [<~ 白日 báirì]
- bồcâu 白鴿 báigē ('dove')
- ănbám 白吃 báichī ('live on others’ labor')
- vívon 比方 bǐfāng ('exemplify')
- thídụ 比喻 bǐyù ('for example')
- mồcôi 無根 wúgēn ('orphan')
- mùtịt 無知 wúzhī ('ignorance')
- lạtlẽo 無聊 wúliáo ('boring') ~ VS nhạtnhẽo
- lùmù 朦朧 ménglóng ('vague') ~ lờmờ (SV mônglung)
- bưngbít 蒙蔽 méngbì ('hoodwinking')
- vỡlòng 啓蒙 qǐméng ('pre‑schooling')
- hàilòng 開心 kāixīn ('pleased')
- vừalòng 滿心 mănxīn ('satisfied')
- vừaý 滿意 mănyì ('pleased')
- chấpnhất 在意 zàiyì ('to mind') ~ đểý
- hứngchịu 承受 chéngshòu ('undergo')
- chấpnhận 忍受 rěnshòu ('endure')
- rũimà 萬一 wànyī ('just in case')
- muônvàn 千萬 qiānwàn ('countlessly')
- ôngchủ 主公 zhǔgōng ('master') ~ chúacông
- gàtrống 公雞 gōngjī ('rooster') ~ gàcồ
- đànbà 婦道 fùdào ('woman')
- bàxã 媳婦 xífù ('wife, term of endearment')
- tiêupha 花銷 huāxiāo ('spend money')
- đồngbạc 銅版 tóngbăn ('dong, monetary unit')
- đồnghồ 銅壺 tónghú ('clock, watch')
- bánsỉ 批發 pīfā ('wholesale')
The same process can be extended to many other words, a task left open for specialists in Vietnamese to complete. Here are some suggestions:
- nóngnảy 衝動 chōngdòng ('hot temper')
- phơira 披露 pīlù ('expose')
- vấtvả 奔波 bēnbō ('struggling, hand‑to‑mouth') [~ VS tấttả, SV bônba]
- múarối 木偶戲 mù'ǒuxì ('puppetry')
- bắtđền 賠償 péicháng ('ask for compensation') [~ bắtthường]
- lánggiềng 鄰居 línjū ('neighbor') [~ hàngxóm]
- duyênnợ 緣份 yuánfèn ('marital encounter')
- yêuđương 愛戴 àidài ('love')
- conruột # 親子 qīnzǐ ('biological child')
- đạochích 盜賊 dàozéi ('burglar, thief') [~ trộmcắp]
- dêxòm 婬蟲 yínchóng ('lecherous') [~ quỹrâuxanh]
etc.
For the Chinese examples cited, any trained linguist knows that the ideographs involved often have little to do with the meanings they convey. Within a compound, each character frequently functions as nothing more than a sound unit, especially in the case of so‑called “internally borrowed characters” (假借 jiǎjiè), characters used primarily to coin new words by sound‑loan rather than by semantic value. The same principle applies on the Vietnamese side. A Chinese dictionary will show countless characters, including polysyllabic words, with multiple meanings; yet in many of the examples above, they are loan graphs employed simply to phoneticize or transcribe sounds for particular concepts.
IV) Implications
- Disyllabic sound change reveals systematic correspondences overlooked in monosyllabic analysis.
- Vietnamese emerges not as irregular but as structurally consistent when disyllables are treated as primary evidence.
- This approach reframes Sino‑Vietnamese etymology as a polysyllabic continuum, integrating substratal Yue influence with Sinitic layering.
When Chinese words were borrowed into and localized in Vietnamese, either one or both syllables of the compound could be re‑associated with Vietnamese words of similar sound and meaning. Amusingly, what emerges in Vietnamese is sometimes no longer what the original Chinese form signified. In other words, the Vietnamese reflex may not descend directly from the same Chinese root. Words of this type are innumerable.
Take 起 qǐ, which signifies 'rise' (SV khởi, VS dậy, nổi) [M 起 qǐ < MC kʰɨ < OC kʰɯʔ]. In actual usage, this morpheme readily acquires re‑assigned meanings shaped by the semantic context of speech. This is no longer a matter of metathesis, as when a new disyllabic word was originally coined – for example, in choosing between 興起 xīngqǐ, xìngqǐ or 起興 qǐxìng ('arousing') for VS nổihứng – but rather a case of adapting and extending an existing form with whatever is conveniently available in context. Illustrations include:
- 起床 qǐchuáng → VS ngủdậy ('wake up, rise') [@ 起 qǐ ~ ngủ, @ 床 chuáng ~ dậy]
- 起義 qǐyì → VS nổidậy ('rise against') [@ 起 qǐ ~ nổi, @ 義 yì ~ dậy]
Yet in other compounds, 起 qǐ associates with different sounds and concepts:
- 起碼 qǐmă → VS ítra ('at least')
- 起源 qǐyuán → VS bắtnguồn ('originate') [起 Cant. /hej3/ > /bej3/ > /bæt7/ > VN /bắt‑/]
- 起頭 qǐtóu → VS khởiđầu ('start')
- 起步 qǐbù → VS cấtbước ('take steps') [起 Cant. /hej3/ > /kej3/ > /kʌt7/ > VN /kất‑/]
- 興起 xìngqǐ → VS hứngchí ('excited') [起 qǐ > /cij5/] → cf. nổihứng, mừngrỡ
Similarly, consider 順 shùn (SV thuận)
[M QT 順 shùn < MC ʑwin < OC *ɢljuns (Schuessler: mljuəns)].
Examples include:
- 順利 shùnlì → VS suônsẻ, chótlọt ~ trótlọt ('smoothly')
- 順風 shùnfēng → VS xuôigió, thuậngió ('tail wind')
- 順水 shùnshuǐ → VS xuôidòng ('sail with the current')
- 順手 shùnshǒu → VS thuậntay, sẵntay, luônthể ('conveniently')
- 順便 shùnbiàn → VS luôntiện, sẵntiện ('conveniently')
- 孝順 xiàoshùn → VS hiếuthảo ('filial piety')
Thus, morphemic syllables like 起 and 順 are binding forms that have evolved into different sounds, meanings, and words in Vietnamese. Within Chinese itself, such morphemes are innumerable. By pursuing the dissyllabicity approach, nearly all Sinitic‑Vietnamese words can be traced back to Chinese equivalents or roots.
These etyma have long been overlooked due to the entrenched misconception that both Vietnamese and Chinese are fundamentally monosyllabic. This view has obscured recognition of the exponential sound changes that occur in disyllabic formations, where shifts may diverge from their monosyllabic equivalents. Phonologically, in ancient times both languages were likely monosyllabic, as languages generally evolve from simplicity to complexity. It is easier to confirm monosyllabicity in Chinese, given literary evidence from three millennia ago, than in Vietnamese, whose last written forms in Chinese characters date only to the early 20th century. For over a thousand years after independence, records of ancient Vietnam were largely compiled from Chinese bibliographies. Still, the basic words shared by both languages point to an early monosyllabic stage, with some items evolving into disyllabic forms to differentiate meaning, e.g., đầugối 膝蓋 xīgài ('knee'), cùichỏ 手肘 shǒuzhǒu ('elbow').
Modern Vietnamese orthography, however, disguises this reality. Most words are disyllabic in nature, yet still written as separate monosyllabic components. This convention has misled untrained readers scanning dictionaries, where thousands of disyllabic words appear disconnected in isolated syllables. The practice stems from the adaptation of Chinese characters into separate written forms, as reflected in early dictionaries such as Đại‑Nam Quấc‑âm Tự‑vị 大南 國音 字彙 (Dictionary of National Sounds of the Great Southern Kingdom, compiled by Huình Tịnh Của, late 19th century). There, each entry is listed character by character, e.g., 江 giang 'river', 山 san 'mountain', and compounds like 江山 giang san are treated as separate words, even though together they mean 'country', not merely 'rivers and mountains'. Such shortcomings reflect the limited linguistic training of early lexicographers.
Cognitively, this monosyllabic way of writing is even more misleading than the earlier use of hyphens (e.g., giang‑san, quốc‑gia), which remained common until the early 1970s. The persistence of this practice owes much to user convenience and educational neglect, both of which have contributed to the shortcomings of the modern national orthography. (See What Makes Chinese So Vietnamese? - Appendices)
In the past, many imperfect specialists on Vietnamese insisted on its supposed monosyllabicity. A representative view was expressed by Barker (1966, p. 10): "With the exception of certain compounds, reduplicative patterns, and loanwords, Vietnamese and Muong are both monosyllabic languages." If we were to take this paradigm seriously and apply it equally to English, particularly to its Anglo-Saxon component, then English too would appear monosyllabic in many respects, let alone Vietnamese.
Barker's statement, however, revealed the limits of his mastery of Vietnamese. Even today, Western specialists still confuse Sino‑Vietnamese with Sinitic‑Vietnamese words. In Barker’s time, linguists like him were often surrounded by 'Vietnamese admirers' from half‑trained linguistic circles, eager to celebrate a foreigner who could simply pronounce Vietnamese sounds. One can almost picture the scene: exclamations of "oh", "ah", "wow", "he can even speak Vietnamese!" Yet nobody marvels at the millions of Vietnamese who speak English fluently, or the hundreds who have authored books in that language. The rarity of Westerners who master Vietnamese has made their words seem disproportionately precious.
To be fair, Barker was a respected linguist of Southeast Asian languages, academically well‑equipped with methodologies and field experience. But in his Vietnamese study, his reliance on linguistically untrained informants and interpreters, combined with the application of "bookish" formulae, left him ill‑prepared. His conclusion that only "certain compounds, reduplicative patterns, and loanwords" break the supposed monosyllabic mold is enough to disqualify his authority in this specific field. For readers unfamiliar with Vietnamese, such a statement misleadingly suggests that only a handful of multisyllabic words exist. Nothing could be further from the truth.
More than three generations have passed since Barker's era, yet no major breakthrough has overturned the lingering influence of his view. Vietnamese learners still encounter dictionaries that present wordlists monosyllabically, a practice replicated across print and digital media. Novices are thus visually misled by an orthography that insists on listing syllables separately, treating them as characters 字 zì (chữ) rather than words 辭 cí (từ). By contrast, it would be unimaginable for students of Mandarin or Korean to treat Romanized monosyllables as words in their own right. Yet in Vietnamese linguistics, this outdated perspective persists.
It is true that many ancient and later disyllabic lexemes can be analyzed as combinations of monosyllabic elements, each of which may function as an affix in other compounds (cf. English homepage, website, logon, blogger, facebook, facetime). But many Vietnamese words formed in this way denote entirely new concepts. Analytically, they are not “compounds” but composite words: forms built from bound morphemes that cannot be broken down into independent syllables.
Some of the most basic Vietnamese anatomical terms illustrate this point. Words such as bànchân ('foot'), đầugối ('knee'), mắccá ('ankle'), cổtay ('wrist'), càngcổ ('neck'), bảvai ('shoulders'), cùichỏ ('elbow'), màngtang ('temple'), mỏác ('fontanel'), and chânmày ('eyebrow') are all disyllabic composites. Each is made up of bound morphemes that must appear together; neither syllable can stand alone. Conceptually, they function just like their English counterparts.
The original meanings of the individual syllables often diverge from the meaning of the composite. For example, in đầugối ('knee'), đầu (cf. C 頭 tóu 'head') and gối (cf. M 枕 zhěn 'pillow', as in 枕頭 zhěntóu) have nothing to do with the concept of 'knee'. The Chinese equivalent 膝蓋 xīgài conveys the meaning directly, but Vietnamese expresses it through a composite whose parts no longer transparently relate to the whole and whose syllables cannot stand alone. This example illustrates how Vietnamese and Chinese share cognate structures in fundamental vocabulary, though often through divergent semantic paths.
Beyond anatomy, countless other composite disyllabic words formed from bound morphemes exist across semantic domains: càunhàu ('growl'), cằnnhằn ('grumble'), bângkhuâng ('pensive'), bồihồi ('melancholy'), mồhôi ('sweat'), mồcôi ('orphan'), hàilòng ('pleased'), taitiếng ('infamous'), tạmbợ ('temporary'), tráchmóc ('reproach'), tuyệtvời ('wonderful'), tămhơi ('whereabouts').
Polysyllabic forms further demonstrate this creativity: cườimĩmchi ('shoot a smile'), tủmtỉmcười ('hide a smile'), mêtítthòlò ('fatally irresistible attraction'), nhảyđồngđổng ('jump up in protest'), bađồngbảyđổi ('change unpredictably'), hằnghàsasố ('innumerable'), lộntùngphèo ('turn upside down'), tuyệtcúmèo ('fabulous').
All of these examples underscore the same point: Vietnamese is not a monosyllabic language, but one rich in disyllabic and polysyllabic composites, whose bound morphemes function in ways comparable to those of any other language. (1)
Polysyllabically and morphemically speaking, even in the case of solid Sino‑Vietnamese words of verified Middle Chinese origin such as hiệntại 現在 xiànzài ('present'), phụnữ 婦女 fùnǚ ('woman'), or sơnhà 山河 shānhé ('country'), each lexical stem derived from a Chinese character – though capable of functioning as a syllable‑word – cannot be used independently as a free form in Vietnamese. Each must combine with another syllable to form a complete lexical item. For example, núivàng ('gold mountain') corresponding to SV kimsơn 金山 jīnshān cannot be arbitrarily recombined across classes, such as {SV sơn + VS vàng}, {VS núi + SV kim}, or {VS trèo + SV san}, to yield legitimate words. Such hybrid pairings are impermissible. The only exception arises when no “pure” Vietnamese equivalent exists to replace one of the syllables, as in bàntay 手板 shǒubǎn ('palm'), phấnviết 粉筆 fěnbǐ ('chalk'), or miến gà 雞麵 jīmiàn ('chicken noodle soup'), where elements like bàn 板, phấn 粉, or miến 麵 are Sino‑Vietnamese morphemes.
Remaining with the subject of polysyllabicity, it is worth stressing that if disyllabic and polysyllabic words were written in combining formation rather than as separate syllables in Vietnamese orthography, children would gain a stronger cognitive foundation for abstract thought. At the same time, foreign learners would acquire vocabulary more logically and efficiently, and even specialists such as Barker would be less likely to misinterpret Vietnamese as monosyllabic. Compare the difficulty ESL learners face with English phrasal verbs (keep up, go on, put up with, come on) versus the relative ease of acquiring polysyllabic composites (nevertheless, meanwhile, aforementioned, albeit, regarding, pan‑America, trans-Siberia), which parallel Vietnamese forms such as dùlà, trongkhiđó, kểtrên, dùthế, đốivới, xuyênMỹ, xuyênTâybálợiá. In short, reliance on the antiquated orthography of separated syllables is inadequate grounds for labeling Vietnamese as monosyllabic. (See Vietnamese2020 Writing Reform Proposal.)
On the question of polysyllabicity, prominent Vietnamese linguists such as Bùi Đức Tịnh (1966, p. 82), siding with Hồ Hữu Tường, rejected the notion of Vietnamese as monosyllabic. Both argued for its dissyllabic character, citing the high frequency of dissyllabic words in ordinary passages. Analogously, just as English draws heavily on Latin and Greek roots, the fact that Sino‑Vietnamese words constitute over 90 percent of entries in modern dictionaries is itself sufficient evidence of Vietnamese disyllabicity.
From a broader perspective, virtually all world languages are polysyllabic, and their orthographies reflect this nature – including recently romanized systems such as Hmong. French and English loanwords in Vietnamese also entered as indivisible polysyllabic units: French acide → axít, pourboire → buộcboa, auto → ôtô, compas → cômpa, toilette → toalét. To break them apart into a xít, buộc boa, ô tô, côm pa, toa lét, as Phan Hữu Dật (1998) and others did, is misguided. By contrast, younger generations have sensibly accepted complete loan packages as indivisible units: Washington, New York, Canada, dollar, visa, silicon, restroom, toilet, cellphone, smartphone, data, webpage, internet, monitor, computer, iPhone, Apple, rather than the earlier syllable‑by‑syllable renderings Hoa‑thịnh‑đốn, Nữu Ước, Gia‑nã‑đại, đô‑la, etc.
Other cultures have long recognized this cognitive principle. Koreans and Japanese, for instance, consistently write polysyllabic words in grouped formations, visually reinforcing their unity and efficiency. Their orthographies appear in patterns like XX XXX XX X XX XXX XX, and this structural clarity has arguably contributed to their innovative linguistic and technological achievements. Thai, Malay, and others follow similar practices. By contrast, modern Vietnamese orthography still separates dissyllabic words into two units, even though, as noted, each syllable alone may be semantically empty. If the same were attempted with Kanji, Romanized Korean, or Chinese pinyin, the fallacy of the monosyllabic view would be obvious.
Closer to home, all modern Chinese dialects are effectively disyllabic. The same holds true for Vietnamese. On the issue of Chinese polysyllabicity, Chou (1982, p. 106) cites Kennedy, de Francis, and Eugene Chin: "If we admit that words, not morphemes, are the construction material of Chinese, we cannot but admit that Chinese is polysyllabic. If we may use the majority rule here, we will have no trouble establishing the fact that Chinese is disyllabic." Indeed, by the majority rule, Chinese vocabulary is dominated by disyllabic words. The same principle applies to Vietnamese, given their structural similarities. Thus, every disyllabic Chinese loanword – and likewise French or English loans – in Vietnamese must be treated as polysyllabic, and should be written in combining formation, such as San Francisco, not Xan Phơ-ran-xít-xơ-cô.
Finally, recall our earlier postulate: phonologically, a single disyllabic Chinese word can evolve into multiple disyllabic forms in Vietnamese, including modern neologisms. For instance, the Chinese 三八 sānbā (SV tambát, originally used to mock women on March 8, International Women's Day, with the sense of "nonsense") has likely given rise to, or is at least associated with, a cluster of Vietnamese expressions for the same concept: tầmphào, tầmbậy, tầmbạ, bảláp, bảxàm, basạo, xàbát, xằngbậy, and others.
Not only can the one‑to‑many sound‑change rule be applied to disyllabic words, it also extends to monosyllabic forms. We have already seen solid cases where a single Chinese monosyllable gave rise to multiple Vietnamese reflexes. The problem with the "monosyllabicity camp" of linguists is that they typically search for only one Vietnamese equivalent for each Chinese character, treating both as strictly monosyllabic units. In most cases, they confine themselves to a one‑to‑one correspondence, forcing each Chinese etymon into a single Vietnamese match. They also tend to accept only those forms that fit neatly into their pre‑established sound patterns, for example, deriving 惃 kūn as VS con ('child'), or 亨 hēng as part of hên(h)+xui ('by luck') (An Chi 2016, Vol. 2, pp. 32, 113). In doing so, they overlook more plausible correspondences such as 子 zǐ (Fukienese /kẽ/ > VS con) or 運氣 yùnqì > VS hênxui.
In reality, the fact that a single Chinese character often has multiple pronunciations across dialects strongly supports the principle of one monosyllabic source yielding many disyllabic outcomes in Vietnamese. Both rules – one‑to‑many from monosyllables and from disyllables – operate in parallel. For example:
- Teochew 餅 /bẽ/ → VS bánhbẽn ~ bánhpía (Teochew‑style pastry) [bánh 餅 M bǐng + bẽn 餅 Teochew nasalized /bẽ/]
- Teochew 包餅 /bao1bẽ/ → VS bòbía (spring roll) [bò 包 bāo + bía 餅 /bẽ/]
- 麵包 miànbāo → VS bánhmì ('bread') [bánh 包 bāo + mì 麵 miàn]
- 包子 bāozǐ → VS bánhbao ('steamed dumpling') [bánh 餅 M bǐng + bao 包 bāo]
- 粽子 zōngzǐ → VS bánhchưng ('steamed glutinous rice cake') [bánh 餅 M bǐng + chưng 粽 zōng / 烝 zhēng 'steam']
Here 餅, 包, and 子 have each generated multiple disyllabic Vietnamese forms.
This analysis exposes the flaw in the old monosyllabicity approach. While multiple Vietnamese etyma can be extracted from the same Chinese root, earlier scholars restricted themselves to a rigid one‑to‑one mapping. Thus they could not reconcile forms like bánhpía or bánhbẽn, since their framework allowed only bánh = 餅 M bǐng (SV bính). They could not accept that disyllabic Vietnamese words might emerge from recombinations involving other morphemes such as 包 bāo. Note also that 麵包 miànbāo → bánhmì ('bread') reflects a French colonial product, but the word itself is cognate with Chinese 麵包, via 餅 bǐng > bánh and 麵 miàn > mì.
Only when Vietnamese is formally recognized as a disyllabic language, consisting primarily of two‑syllable words, like Chinese, can its sound‑change rules be properly understood. The situation is no different from Indo‑European (IE) languages, where words of the same root evolve differently across languages, and at least one syllable often diverges from the expected phonological pattern. Consider “police”: politi, polizei, policía, polizia, politie, polis, polisi, or even the old Vietnamese loans phúlít and cúlít (from French; see An Chi 2016, Vol. 2, p. 13), alongside the modern colloquial cốm for English cop. Historical linguists of IE are well aware of such complex sound shifts, which are widely accepted as normal.
What does this have to do with Vietnamese etyma? Orthodox theorists, clinging to monosyllabicity, assume that each Chinese character corresponds to only one Vietnamese equivalent. They reject one‑to‑many correspondences as "chaotic." Yet in reality, numerous Vietnamese forms undeniably derive from a single Chinese character, some even within the strict Middle Chinese → Sino‑Vietnamese framework. For example:
- 唐 Táng ('Tang', 'path') → SV Đường, đàng
- 元 yuán ('origin', 'beginning') → SV nguyên, ngươn; VS (tháng)giêng, ngọn, vị ('first, top, unit')
- 利 lì ('advantage', 'interest', 'benefit', 'profit') → SV lợi, lị; VS lì, lãi, lời
- 貴 guì ('precious') → SV quý, quới; VS mắc, đắt ('expensive')
- 度 dù ('measure') → SV độ; VS dò, đo, đạc, đức, tấm
- 拜 bài ('kowtow') → SV bái; VS vái, lạy, van
- 粉 fěn ('flour, noodle') → SV phấn; VS bún, bột, phở, bụi
Similarly, 場 chǎng (SV trường, tràng) has produced multiple Vietnamese outcomes:
- 劇場 jùchǎng (SV kịchtrường) → VS sânkhấu ('theatrical stage')
- 在場 zàichǎng (SV tạitrường) → VS tạichỗ, tạitrận ('on the spot', 'red‑handed')
- 試場 shìchǎng (SV thítrường) → VS trườngthi ('examination site')
- 戰場 zhànchǎng (SV chiếntrường) → VS chiếntrận, trậnchiến ('battle')
As a classifier, 場 chǎng also fuses into fixed Vietnamese compounds:
- 一場夢 yīchǎng mèng (SV nhất trường mộng) → VS một giấcmơ, giấcmộng, cơnmơ, cơnmộng ('a dream')
- 一場病 yīchǎng bìng (SV nhất trường bệnh) → VS một trậnbệnh, cơnbệnh ('illness')
- 一場戲 yīchǎng xì (SV nhất trường hí) → VS một tuồnghát, xuấthát ('a show')
- 一場空 yīchǎng kōng (SV nhất trường không) → VS cónhưkhông, córồikhông, một khoảngtrống ('emptiness')
Here the process of association reshaped the sound‑change continuum. Vietnamese reflexes of 場 chǎng were influenced by neighboring morphemes with similar sounds and meanings (e.g., 陣 zhèn SV trận, 齣 chù SV xuất), and by semantic transfer into new disyllabic compounds (e.g., cơn 'a bout, a thrust'). As Starostin notes, the Vietnamese variants of 場 chǎng were likely borrowed relatively late, perhaps from vernacular Mandarin after the Middle Chinese period, yielding forms such as sân, chỗ, xuất, giấc, dài, ruột (with /ch‑ ~ j‑/ < /*/‑ ~ *j‑/).
A) Analytical case study of sound shifts
Let us examine another case: the syllable‑word 匠 jiàng, which frequently attaches as an affix to different morphemic syllables and, through this process, can be posited as the Vietnamese equivalent thợ ('smith', 'artisan').
Example:
thợmộc [tʰə̰ːʔ˨˩ mə̰ʔwk˨˩] ~ 木匠 mùjiàng [/mu⁵¹⁻⁵³ ʨi̯ɑŋ⁵¹/] ('carpenter')
1. The compound thợmộc
Mandarin 木匠 mùjiàng ('carpenter') ~ Fukienese /ba̍k‑chhiūⁿ/.
SV mộc [mə̰ʔwk˨˩] < MC [mowk8] for 木 mù.
The Vietnamese thợ is unlikely to be a direct sound change from 匠 jiàng (SV tượng) /tɨə̰ʔŋ˨˩/. [ M 匠 jiàng < MC ʐjɑŋ < OC *ʐhaŋs | Note: Japanese "しょう" /shō/ with the correspondence /sh‑/ ~ /th‑/ and /-ō/, cf. 折 zhé (SV chiết), 逝 shì (SV thệ), 誓 shì (SV thệ) > VS thề ]. Rather, thợ is more plausibly a derivative from compounds X + 匠 jiàng.
2. The compound thợthầy
Expressions such as Không ra thợthầy gì cả! ("His skills are not even up to those of an apprentice, let alone a master!") or Nửa thầy, nửa thợ, nửa đườiươi! ("Not even half as good as a master, an apprentice, or even a chimpanzee!") illustrate the mocking sense of thợthầy. This compound (師 shī + 匠 jiàng) parallels thầytrò 師徒 shītú (SV sưđồ). Here thợ, in contrast to thầy 師 shī ('teacher', 'master'), aligns with 徒 tú ('apprentice', 'pupil', 'follower'), giving rise to the broader Vietnamese sense of thợ as 'apprentice, journeyman, artisan, smith'.
3. Thợ as a productive morpheme
Once established, thợ became a free morphemic syllable functioning as a prefix, combining with other elements to form compounds. In all these cases, the associated sound thợ has been imposed on other compounds:
- ngườithợ 人匠 rénjiàng ('artisan')
- thợthầy 師匠 shījiàng ('master')
- thợvẽ 畫匠 huàjiàng ('painter')
- thợvàng 金匠 jīnjiàng ('goldsmith')
- thợsơn 漆匠 qījiàng ('painter')
- thợđá 石匠 shíjiàng ('stone mason')
- thợđồng 銅匠 tóngjiàng ('coppersmith')
- thợsắt / thợthiết 鐵匠 tiějiàng ('blacksmith, tinsmith')
- thợgiày 鞋匠 xiéjiàng ('shoemaker')
- thợnề 泥水匠 níshuǐjiàng ('bricklayer')
- thợmài 磨光匠 móguāngjiàng ('grinder')
- thợkhoá 鎖匠 suǒjiàng ('locksmith')
- thợnhuộm 洗染匠 xǐrǎnjiàng ('dyer')
- thợin 印刷匠 yìnshuājiàng ('printer')
- thợngói 瓦匠 wǎjiàng ('tiler, bricklayer')
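The correspondence in this list is structural as well as phonetic: the Chinese compounds place 匠 last (modifier + head), whereas the Vietnamese forms place thợ first (head + modifier). The following minimal Python sketch illustrates that reordering with a hand-picked subset of the pairs above. The names HEAD, MODIFIER, and to_vietnamese are hypothetical, introduced only for illustration, and the sketch makes no claim about the actual historical pathway.

```python
import unicodedata

# Assumed lookup tables, copied from a subset of the list above.
HEAD = {"匠": "thợ"}
MODIFIER = {
    "木": "mộc",
    "金": "vàng",
    "石": "đá",
    "銅": "đồng",
    "鐵": "sắt",
    "鎖": "khoá",
    "瓦": "ngói",
    "泥水": "nề",
}


def to_vietnamese(compound):
    """Map a Chinese 'modifier + head' compound to a Vietnamese 'head + modifier' form.

    Returns None when the head or the modifier is not in the tables.
    """
    if len(compound) < 2:
        return None
    modifier, head = compound[:-1], compound[-1]
    if head not in HEAD or modifier not in MODIFIER:
        return None
    # The head moves to the front in the Vietnamese composite.
    return unicodedata.normalize("NFC", HEAD[head] + MODIFIER[modifier])


if __name__ == "__main__":
    for word in ["木匠", "鐵匠", "泥水匠", "畫師"]:
        print(word, "->", to_vietnamese(word))
```

Forms outside the tables, such as 畫師, simply return None; the sketch only covers the X + 匠 pattern.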
4. Modern extensions:
In modern usage, thợ has expanded into new domains, often paralleling Sinitic‑Vietnamese forms:
- thợtóc 髮師 fàshī ('hair stylist')
- thợvẽ 畫師 huàshī ('artist, painter')
- thợchụphình 攝影師 shèyǐngshī ('photographer')
- thợmay 裁縫師 cáifèngshī ('dressmaker')
Here 師 shī (SV sư, VS thầy) is effectively downgraded to the apprentice level of thợ 徒 tú. The key point is that Chinese elements come alive in Vietnamese through the Sinitic‑Vietnamese adaptation of thợ, where both 師 shī and 徒 tú are associated with 匠 jiàng.
5. Etymological considerations:
We, however, should not completely exclude the possibility that thợ derives directly from 匠 jiàng [ M /ʨi̯ɑŋ⁵¹/, 匠 jiàng < MC /dzɨaŋ/ < OC */sbaŋs/, Hokkien /chhiūⁿ/, Japanese /shō/]. The sound change may follow a pattern similar to:
MC /dzɨaŋ/ > SV tượng /tɨə̰ʔŋ/ > thượng /thɨə̰ʔŋ/ > VS thợ /thə̰/
Compare:
- 獎 jiǎng ('award') → SV tưởng /tɨə̰ʔŋ/ → VS thưởng /thɨə̰ʔŋ/
- 承 chéng → VS thừa /thɨə̰/
In the cases above, disyllabic words – or polysyllabic words more generally – often undergo processes of assimilation or association that govern sound change in compound formation. Irregular outcomes can be attributed to natural phonetic phenomena whereby one or more syllables in a compound may be deformed, corrupted, dropped, contracted, transposed, or otherwise altered. Such changes can transform the original syllable into a new phonological shell, sometimes beyond recognition.
6. Assimilation and association in polysyllabic forms
- 攝影師 shèyǐngshī ('photographer') → VS thợnhiếpảnh
Also yields thợchụphình by associating 攝影 shèyǐng with 照相 zhàoxiàng ('take pictures')
- 相 xiàng assimilated with 形 xíng (SV hình)
- 影 yǐng (SV ảnh) reinterpreted with sound shift as bóng ('shadow, reflection')
From this transformation arose a cluster of synonymous forms: thợchụpảnh, thợchớpảnh, thợchụpbóng, thợchớpbóng. All of these variants align with the polysyllabic model of 攝影師 shèyǐngshī. Their emergence illustrates how synchronic association of similar sounds or meanings, or both, can generate new lexical items. Each form remains interconnected within the broader transformation process, including reversals or reassignments of elements such as 師 shī.
7. Innovation of loanwords
In recent years, there has been a trend of importing entirely new Sino-Vietnamese words from Chinese and using them without further modification, at least for now, such as:
- namthần: 男神 (nánshén, 'Mr. Perfect')
- ngọcnữ: 玉女 (yùnǚ, 'Miss Pretty')
- giaođãi: 交待 (jiāodài, 'instruct')
Throughout its lexical development, nevertheless, Vietnamese has consistently demonstrated a tendency to innovate with loanwords. In the past, speakers did not merely reproduce the original sounds but often reshaped them, whether consciously or unconsciously.
As a result, a single lexical root, once subjected to sound change, can generate multiple Vietnamese variants. Over time, this process enriches the recipient language by expanding its borrowed vocabulary, layering new sound forms onto the original stem and extending their meanings. For instance:
- 天 tiān: SV thiên ('heaven'); VS trời ('the Almighty'); giời ('sky'); trán ('forehead')
- 心 xīn: VS tim ('physical heart'); SV tâm; VS lòng ('spiritual heart, inner feelings')
- 戶 hù: SV hộ ('household'); VS cửa ('door'); ngõ ('gate')
- 主 zhǔ: SV chủ ('master'); VS Chúa ('Lord')
- 注 zhù: SV chú; VS chua ('annotate'); chảy ('flow')
- 生 shēng: SV sanh; SV sinh; VS sống ('live, raw'); VS đẻ ('give birth'); tái ('raw meat').
- 回 huí: SV hồi ('return'); VS về ('come back'); quay ('turn')
- 會 huì: SV hội ('fair'); VS họp ('meeting'); VS hẹn ('date'); hiểu ('understand'); hay ('aware'), hụi ('loan'); hồi ('time'); sẽ ('will'); VS đỗi ('moment')
- 沖 chōng: SV xung, trùng; VS dội ('pour water'); sôi ('boil'); xông ('charge'); xấn ('dash'); tông ('collide'); đụng ('collide'); đường ('road'); sang ('print photo'); xối ('wash out').
- 粉 fěn: SV phấn ('powder', 'chalk'); VS phở ('noodle'); bún ('vermicelli'); bột ('flour'); bụi ('dust')
- 鏡 jìng: SV kính; VS kiếng; gương ('mirror', 'eyeglasses').
- 機 jī: SV cơ; VS cửi ('shuttle'); dịp ('opportunity'); máy ('machine'); máycửi ('loom').
8. Disyllabic examples
- 太陽 tàiyáng: SV tháidương ('the sun'); VS mặttrời; trờinắng; màngtang ('temple')
- 月亮 yuèliàng: VS ánhtrăng; trăngsáng; nàngtrăng; mặttrăng
- 機會 jīhuì: VS cơmay; dịpmay; cơhội; códịp; SV cơhội ('opportunity')
- 問答 wèndá: SV vấnđáp; VS hỏiđáp ('Q&A')
- 問題 wèntí: SV vấnđề; VS thắcmắc ('question')
- 聽寫 tīngxiě: VS ngheviết ('dictation'); SV chínhtả ('spelling')
- 過去 guòqù: SV quákhứ ('past'); VS quađi; điqua; đãqua ('pass by')
- 邊界 biānjiè: SV biêngiới ('border'); VS bờcõi ('frontier')
- 鹹魚 xiányú: VS cámặn ('salty fish'); mắmcá ('fish paste'); nướcmắm ('fish sauce')
etc.
While most words retain their original forms and associated meanings, some evolve by differentiating meanings through either older pronunciations or newer articulations. This process produces subtle shifts in both sound and semantics, sometimes even leading to the emergence of new written characters. However, this does not mean that the majority of Chinese loanwords in Vietnamese necessarily become richer by generating multiple variants. In fact, the reverse is also true: many Vietnamese lexemes consolidate multiple Chinese sources under a single form, uniting different sounds and meanings "under one roof", much like the phenomenon of loangraphs in Chinese itself.
B) Cases of many‑to‑one mappings (Chinese → Vietnamese):
- sợ ('fear, dread, scared'): 嚇 xià, 怕 pà, 怵 chù, 恄 xì, 悚 sǒng, 愳 jù, 懼 jù, 慴 zhé (懾), 怯 qiè
- đời ('life, generation'): 世 shì, 代 dài, 輩 bèi, 生 shēng
  - ex. 人生 rénshēng → đờingười ('life'); cf. đẻ ('give birth'), tái ('raw, uncooked'), sống ('unripe')
- chết ('death, die, pass away'): 死 sǐ, 折 zhé, 逝 shì, 殛 jí, 殊 shū, 陟 zhì
- mưa ('rain, drizzle, shower'): 溟 míng, 雨 yǔ, 霂 mù, 霡 mò
- mây ('cloud, fog, haze'): 蔓 mán, 雲 yún, 霨 wèi, 霧 wù, 霾 mái
- nhà ('house, family, -ist, dynasty'): 屋 wū, 家 jiā (SV gia), 者 zhě (SV giả), 朝 cháo
- đường ('path, road, route'): 唐 táng, 道 dào, 途 tú, 沖 chòng
- việc ('work, task, duty'): 活 huó (SV hoạt), 作 zuò (SV tác), 役 yì (SV dịch), 務 wù (SV vụ)
- xanh ('green, blue, azure'): 倉 cāng (SV xanh), 滄 cāng (SV xanh), 蒼 cāng (SV xanh), 青 qīng (SV thanh), 清 qīng, 葱 cōng (SV song)
- đỏ ('red, burgundy'): 丹 dān (SV đơn), 彤 tóng (SV đồng), 朱 zhū, 絑 zhū, 赭 zhě
- trường ('school, campus'): 場 cháng, 堂 táng (SV đường), 庠 xiáng (SV tường), 校 xiào
- tàu ('boat, ship'): 刀 dāo (SV đao), 舠 dāo, 槽 cáo (SV tào), 舟 zhōu, 艚 cáo (SV tào), 艇 tǐng
- cho ('give, allow'): 給 jǐ, 準 zhǔn, 許 xǔ, 賜 cì, 贈 zèng
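The one‑to‑many and many‑to‑one mappings are two views of the same set of correspondences. As a rough illustration, the Python sketch below stores a few of the pairs proposed in this article as a one‑to‑many table and inverts it into a many‑to‑one index. The names ONE_TO_MANY, invert, and sources_of are hypothetical, and the data are a small subset chosen for brevity; the code only rearranges the correspondences claimed here and does not validate them.

```python
import unicodedata
from collections import defaultdict

# Assumed sample: Chinese character -> Vietnamese reflexes proposed in this article.
ONE_TO_MANY = {
    "唐": ["đường", "đàng"],
    "沖": ["dội", "sôi", "xông", "đụng", "đường"],
    "生": ["sinh", "sống", "đẻ", "tái", "đời"],
    "世": ["đời"],
    "代": ["đời"],
    "道": ["đường"],
}


def nfc(text):
    """Normalize to NFC so visually identical Vietnamese strings compare equal."""
    return unicodedata.normalize("NFC", text)


def invert(one_to_many):
    """Turn {source: [reflexes]} into {reflex: [sources]}, keeping insertion order."""
    many_to_one = defaultdict(list)
    for source, reflexes in one_to_many.items():
        for reflex in reflexes:
            key = nfc(reflex)
            if source not in many_to_one[key]:
                many_to_one[key].append(source)
    return dict(many_to_one)


def sources_of(index, form):
    """Look up the Chinese sources recorded for one Vietnamese form."""
    return index.get(nfc(form), [])


if __name__ == "__main__":
    index = invert(ONE_TO_MANY)
    for form, sources in index.items():
        if len(sources) > 1:
            print(form, "<-", ", ".join(sources))
```

With this sample, only đường and đời have more than one recorded source, matching their entries in the list above.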
C) Disyllabic parallels: word‑concepts grouped by sound/meaning
- 同感 tónggǎn → thôngcảm ('sympathy') vs. 同情 tóngqíng → đồngtình ('sympathy')
- 幫忙 bāngmáng → bênhvực ('side with') vs. 包庇 bāobì → bảobọc ('support')
- 混蛋 húndàn → khốnnạn ('wretch, bastard') vs. 困難 kùnnán ('in difficulties')
- 堂皇 tánghuáng → đànghoàng, đườnghoàng ('stately, magnificent') vs. 端莊 duānzhuāng → đoantrang ('dignified, demure')
- 遊蕩 yóudàng → duđảng ('loaf about, loiter') vs. 流氓 liúmáng → lưumanh ('hoodlum, hooligan')
- 有錢 yǒuqián → giàusang ('affluent') vs. 富有 fùyǒu → giàucó ('rich')
- 高尚 gāoshàng → caosang ('noble') vs. 高望 gāowàng → caovọng ('socially high class')
At the same time, some sound changes evolve into innovative words that become independent of their original forms. This developmental path is common across languages. For example, in English:
- albeit < 'all be it' ('though')
- morning < 'morn' < Old English morgen
- evening < æfnung (from the verb æfnian, 'grow toward night')
Following the analogy of evening, the form morn eventually developed into morning.
All of the above illustrate how Vietnamese doublets and cognates, like those of other languages, develop many‑to‑one and one‑to‑many mappings, semantic shifts, and idiomatic innovations. Conceptually, this aligns with what Adam Makkai described under the notion of "lexemic idiom" and "lexeme" in his Pragmo‑Ecological Grammar (PEG): Toward a New Synthesis of Linguistics and Anthropology (1978), where idiomatic expressions such as "Emperor of Japan," "old wife," "hot potatoes," and "red herring" acquire metaphorical meanings far removed from their literal origins.
"[..]the participant morphemes once had (and in other environments still have) separate lexemic status with separate sememic realizates, and these past (or elsewhere still active) meanings have a definite shining-through effect, suffusing the meaning of these lexemic idioms with the old, suppressed, literal meanings. The denotatum in each case is primary or lexical meaning, and the TRANSLUCENT CONNOTATUM is the original literal meaning of the form. What makes lexical idioms unusual is that they, therefore, have two meanings simultaneously, i.e., the REFLECTING DENOTATUM together with the meaning TRANSLUCENT CONNOTATUM. Whether the language has a heavy morpheme reinvestment ratio or not in its lexeme inventory becomes an interesting typological question, but there is little doubt that there are any real languages that do not somehow utilize morpheme reinvestment in the building of new lexemes."
In Vietnamese specifically, the author characterizes this phenomenon of association as assimilative sandhi, that is, the sandhi process of assimilation or association. It represents a common form of phonological, and therefore lexical, "reinvestment". Such developments are best understood as natural products of time, arising without deliberate human intervention.
To illustrate, consider the lexeme 待 dài (‘wait’) and its associatory variations under this process. Note how the sound of dài shifts and aligns with related Vietnamese forms:
待 dài → SV đãi, VS đợi (‘wait’) [M 待 dài, dāi < MC dəj < OC *dɯːʔ | FQ 徒亥 (→ VS đợi) \ 亥 hài ~ SV hợi ]
When incorporated into disyllabic compounds, the articulation of 待 dài changes further, producing new idiomatic forms:
等待 děngdài → SV đẵngđãi, VS chờđợi ('wait for') [Here 等 děng is reinterpreted as ‘đón’ and assimilated with 待 dài ‘đợi’, creating a semantic doublet. Compare the sound‑change patterns of 寺 sì → SV tự ~ VS chùa, and 承 chéng → SV thừa ~ VS đằng.]
期待 qídài → SV kỳđãi, VS chờđón ('expect') [M 期 qī, jī, qí, qǐ (kỳ, kì, ki) < MC gɨ < OC *kɯ, *gɯ. In Vietnamese, 期待 qídài is cognate with chờđợi ('wait for'), and in Chinese it means more like 'expect'.]
Thus, in Vietnamese, 等待 děngdài and 期待 qídài appear to have exchanged meanings.
Meanwhile, alongside its Chinese semantic variance, 待 dài also develops the sense of 'treat' in Vietnamese:
對待 duìdài → SV đốiđãi → VS đốixử ('treat')
待承 dàichéng → SV đãithừa, VS đãiđằng ('entertain', 'treat with a feast')
接待 jiēdài → SV tiếpđãi → VS tiếpđón ('reception, to greet')
Here 待 dài associates with 'xử' 處 chǔ ('handle'), or with 'đón' ('receive'), depending on context.
D) Lexical association as innovation
Beyond natural sandhi, Vietnamese also employs conscious lexical association to coin new compounds from existing lexemes. This process produces:
1. Modern innovations
- táichế 再製 zàizhì ('recycle') [modern M 回收 huíshōu]
- bấmnút 按紐 ànniǔ ('press/click a button')
- mạnglưới 網絡 wǎngluò ('network, computer network')
- viênchức 職員 zhíyuán ('civil servant, officer')
- trangmạng 網頁 wǎngyè ('web page')
- tinnhắn 短信 duǎnxìn ('text message')
2. Hybrid borrowings
- bánhbao 餅+包 (‘dumpling’)
- bòbía 包餅 (‘spring roll’, Teochew style)
- tủlạnh 冷+櫝 (‘refrigerator’)
- thangmáy 梯+機 (‘elevator’)
- thangcuốn 梯+捲 (‘escalator’)
- xếhộp 盒+車 (‘automobile sedan’)
- nhuliệu 柔+料 (‘software’)
- phầncứng 份+剛 (‘hardware’)
- trangnhà 張+家 (‘homepage’)
- liênmạng 聯+網 (‘internet’)
3. Extended idiomatic compounds
- trànggiangđạihải 長江+大海 (‘lengthy writing’)
- vòngvotamquốc 三國演義 (‘beat around the bush’)
- rượuchè 酒+茶 (‘alcoholic, drinking party’)
- cờbạc 棋+博 (‘gamble’)
- côngnhânviên 公+人員 (‘civil servant’) ~ côngchức 公+職
- toàán 座+案 (‘court’)
- quantoà 官+座 (‘judge’)
- ratoà 出庭 (‘appear in court’)
For those who deny that both Chinese and Vietnamese are fundamentally disyllabic languages, such developments are difficult to explain. A monosyllabic Vietnamese word, originally cognate with a single Chinese character, can evolve into multiple disyllabic forms with varied sound changes and meanings. Only by recognizing Vietnamese as a disyllabic language, like Chinese and indeed most languages, can we account for these transformations.
These changes do not strictly follow the predictable phonetic rules of historical sound change. Instead, they obey their own principle: associative sandhi. In this process, sound shifts are driven by semantic association and compound formation, not just phonological inheritance. The longer and more complex the multisyllabic form, the more drastic the changes tend to be.
Sound change may occur with or without human intervention, but locality and time play decisive roles, especially in cases of prolonged historical contact. The longer the contact, the greater the cumulative effect. To reconstruct the historical pronunciations of Sino‑Vietnamese lexicons, particularly those uncommon Vietnamese transcriptions of Chinese characters, scholars rely on the Fǎnqiè (反切, VS Phiênthiết) spelling method. This system, preserved in sources such as the Kangxi Dictionary (康熙字典, SV Khanghi Tựđiển), Guangyun (廣韻, SV Quảngvận), and Tangyun (唐韻, SV Đườngvận), provides explicit phonetic instructions for how a character was read. Without such guides, even specialists would be unable to pronounce many forms with accuracy.
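The basic mechanics of a fǎnqiè spelling can be sketched in a few lines of Python: the reading takes its initial from the first (upper) speller and its final, with tone, from the second (lower) speller. The sketch below operates on Vietnamese romanized syllables and reproduces the 徒亥 spelling cited in this article for 待, using the readings 徒 đồ and 亥 hợi. It is a deliberately simplified illustration under stated assumptions: the onset inventory ONSETS and the functions split_syllable and fanqie are hypothetical helpers, and the tone‑register adjustments applied in real fǎnqiè readings are ignored.

```python
import unicodedata

# Assumed inventory of Vietnamese orthographic onsets, longest first,
# so that "ngh" is tried before "ng" and "n".
ONSETS = [
    "ngh",
    "ch", "gh", "gi", "kh", "ng", "nh", "ph", "qu", "th", "tr",
    "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s", "t", "v", "x",
]


def split_syllable(syllable):
    """Split a romanized syllable into (onset, rime); the tone mark stays on the rime."""
    s = unicodedata.normalize("NFC", syllable.lower())
    for onset in ONSETS:
        # Require a non-empty rime after the onset.
        if s.startswith(onset) and len(s) > len(onset):
            return onset, s[len(onset):]
    return "", s  # vowel-initial syllable


def fanqie(upper, lower):
    """Combine the onset of the upper speller with the rime (and tone) of the lower one."""
    onset, _ = split_syllable(upper)
    _, rime = split_syllable(lower)
    return onset + rime


if __name__ == "__main__":
    # 徒 (đồ) + 亥 (hợi), the spelling given for 待 in this article
    print(fanqie("đồ", "hợi"))  # đợi
```

Splitting on orthographic onsets is only a convenience for the romanized forms; a fuller treatment would work from reconstructed Middle Chinese initials and finals.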
For common characters embedded in daily speech, however, their forms emerge naturally as part of the language itself. Examples include 起 qǐ → SV khởi, 順 shùn → SV thuận, and 場 chǎng → SV trường. Lexicographers usually started from the sounds of such common characters to decipher the unknown ones.
As more examples are amassed, the picture becomes increasingly complex. Many words derived from Old Chinese and Middle Chinese exhibit multiple lexical and phonological developments. Without first grasping the principle of dissyllabicity outlined above, such variation can appear confusing.
Consider, for instance, the following cases:
[ M 沖 chōng, chòng (xung, trùng) < MC ɖuwŋ < OC *duŋ ]
Derived Vietnamese forms include:
- sang ('develop/print photo'), as in sanghình, sangảnh (沖印 chōngyìn)
- xối ('wash out')
- dội ('pour water on')
- sôi ('boil up')
- xông ('charge')
- xấn ('dash against')
- tông ('collide')
- đụng ('collide')
- đường ('public road')
[ M 大 (太) dà, duò, dài, dǎi, tài (đại, thái) < MC daj, da < OC *da:d, *da:ds ]
This root yields a wide range of Vietnamese forms:
- dàipǎn 大販 → láibuôn ('merchant')
- dàifu 大夫 → đạiphu ('minister, high official'; modern M. 'physician')
- dàge 大哥 → đạica ('big brother')
- dàdǎn 大膽 → togan, cảgan ('daring')
- dàshēng 大聲 → totiếng ('raise one's voice')
- dàyǔ 大雨 → mưato ('heavy rain')
- dàxiōng 大兄 → anhcả ('elder brother')
- dàjiě 大姐 → chịcả ('elder sister')
- dàhǎi 大海 → bểcả ('big ocean')
- dàhuǒ 大夥 → cảlũ ('the whole group')
- dàjiā 大家 → tấtcả ('everyone')
- dàjiāng 大江 → sôngcả ('large river')
- dàyì 大意 → sơý ('inattentive')
- dàhuà 大話 → tàolao ('talk nonsense')
- dàyuè 大月 → thángđủ ('full lunar month')
- dà'ài 大礙 → đángngại ('formidable')
- pángdà 龐大 → khổnglồ ('enormous')
- hóngdà 宏大 → tolớn ('great')
- lǎodà 老大 → thằnglớn ('eldest son')
- dàgézi 大格子 → tocon ('big body')
- lǎodàbùshǎo 老大不少 → lớnđầu ('grown‑up')
- dàhóngdàliáng 大宏大量 → tấmlòngđạilượng ('magnanimous')
[M 海 (𣴴, 𣳠) hǎi < MC həj < OC *hmlɯːʔ ]
Vietnamese developments:
- biển → bể (‘sea’)
- khơi (‘open sea’)
Examples:
- 大海 dàhǎi → biểncả ('big sea')
- 苦海 kǔhǎi → bểkhổ ('sea of suffering')
- 海浪 hǎilàng → sóngbể ('sea wave')
- 海口 hǎikǒu → cửabể ('seaport')
- 海寇 hǎikòu → cướpbể ('sea pirate') [modern hảitặc 海賊 hǎizéi → VS giặcbể]
- 出海 chūhǎi → rakhơi ('put out to sea')
- 外海 wàihǎi → ngoàikhơi, ngànkhơi ('open seas')
E) Many‑to‑one correspondences
In cases of the many‑to‑one model from Chinese to Vietnamese, it is evident that a single Vietnamese word, whether monosyllabic or disyllabic, may correspond to multiple Chinese sources, depending on context. Such examples directly challenge the outdated monosyllabicity viewpoint, which assumes that sound change must be restricted to one‑to‑one correspondences. Instead, these cases demonstrate the dynamic and fluid nature of sound change, for example:
Examples:
1. 'cho'
- cho 給 jǐ, gěi → SV cấp ('give')
- cho(phép) 準 zhǔn → SV chuẩn ('allow')
- cho 許 xǔ → SV hứa ('allow')
- cho 賜 cì → SV tứ ('present with')
- cho 贈 zèng → SV tặng ('give a gift')
Compounds with cho
- chodầu 雖然 suīrán ('although')
- chonên 所以 suǒyǐ ('therefore')
- chotới 直到 zhídào ('until')
- chotiền 捐錢 juānqián ('donation')
- dànhcho 專用 zhuānyòng ('specialized for')
- khiếncho 引起 yǐnqǐ ('cause')
2. 'làm'
- làm 幹 gàn → SV cán ('do, work')
- làm 辦 bàn → SV bạn ('handle')
- làm 弄 nòng → SV lộng ('make')
- làm 令 lìng → SV lệnh ('cause') [ Ex. 令人驚訝. Lìng rén jīngyá. (Làm ngườita kinhngạc. 'It caused surprise to everybody.') ]
Compounds with làm
- làmruộng 耕田 gēngtián → SV canhđiền ('to farm')
- làmcàn 蠻干 mángàn → SV mancán ('foolhardy')
- làmơn 頒恩 bān'ēn → SV banân ('bestow')
- làmdốc 排架子 báijiàzi → SV bàigiátử ('pretend')
- làmgương 旁樣 pángyāng → SV bàngnhan ('exemplify')
- làmphiền 勞煩 láofán → SV lạophiền ('please help')
- làmăn 生意 shēngyì → SV sinhý ('make a living')
- làmviệc 幹活 gànhuó → SV cánhoạt ('work')
- làmthinh 安靜 ānjìng → SV antịnh ('keep quiet')
- làmkhôngkịp 來不及 láibùjí → SV laibấtcập ('cannot make it')
- làmlại 再來 zàilái → SV táilai ('try again')
- làmcông 勞工 láogōng → SV laocông ('to labor')
- làmlụng 勞動 láodòng → SV laođộng ('to labor')
- làmquan 當官 dāngguān → SV đángquan ('be an official')
- làmlính 當兵 dāngbīng → SV đángbinh ('be a soldier')
- làmchủ 當家 dāngjiā → SV đánggia ('be the boss')
- làmtóc 理髮 lǐfà → SV líphát ('hairdo')
- làmtiền 賺錢 zhuànqián → SV chuyếntiền ('make money')
- làmtiền 勒索 lèsuǒ → SV lặctác ('extortion')
These examples demonstrate that Vietnamese sound change is not confined to rigid one‑to‑one correspondences. Rather, it reflects dynamic processes of association, assimilation, and reinvestment, producing both many‑to‑one and one‑to‑many mappings. The longer and more complex the multisyllabic forms, the more drastic the changes tend to be.
With a measure of linguistic common sense, one can readily accept certain implicit mechanisms, illustrated by generalized Vietnamese làm (cf. English 'do', 'make', 'work', 'perform'), that underlie the sound changes giving rise to the lexical variants discussed above. Forms such as 弄 nòng, 幹 gàn, and 當 dāng are easily recognized as cognates, linked by phonological relations across languages and reflecting shared roots.
It is also important to recognize that in Chinese a single concept‑word (whether lexeme, morpheme, allophone, or doublet) may be represented by several characters, which can be transcribed or pronounced similarly or differently depending on time period and locality. For example, 作 zuò (SV tác) and 做 zuò (SV tố) both mean 'do' or 'make'. In such cases, the multiplicity of forms in Chinese itself, and their Vietnamese reflexes, require careful analysis to understand how sound changes unfolded.
a) Illustrative cases:
- phong 風 fēng ('wind') → giông, gió
- 颱風 táifēng → giôngtố ('typhoon')
- 暴風 bàofēng → bãogiông, gióbão ('storm')
- 風雨 fēngyǔ → giómưa, mưagió ('rainstorm')
- 蜂 fēng → ong ('bee') [ = 螉 wēng → ong, a doublet of the same root for 'bee'. ]
- gong 公 gōng ('male, public, baron') → công, cồ, ông, trống
- cf. 翁 wēng ('elder, hair') → ông
- 母 mǔ ('mother') → mẫu, mẹ, mợ, mái

In these cases (ong, cồ, công, ông, trống), the multiple forms can be recognized as doublets, phonologically related variants of the same root.
b) Less transparent associations:
Not all cases are so straightforward. For instance:
- 健康 jiànkāng (SV kiệnkhang, kiệnkhương) → VS sứckhoẻ ('health', 'healthy')
- 健 jiàn → SV kiện → reinterpreted as sức ('strength') via associative sandhi (cf. 力 lì → SV lực). Middle Chinese: kɛnH (Baxter), kɛnH (Pulleyblank), with departing tone and velar initial.
- 康 kāng (MC kʰɑŋ, Pulleyblank kʰaŋ) → SV khang, variant khương /kʰjɨəŋ1/ → evolved into khoẻ ('strong, well')
- Pathway: /kʰjɨəŋ1/ (khương) → /kʰaŋ1/ → /kʰwəɒn5/ (khoắn), followed by the shift kʰw‑ → w‑ → m‑, giving /majŋ6/ (mạnh)
c) Comparative table
| Chin. | Mandarin | Sino-Viet. | Sinitic-Viet. | MC (Baxter) | MC (Pulleyblank) | OC (Baxter-Sagart) | OC (Zhengzhang) |
|---|---|---|---|---|---|---|---|
| 健 | jiàn | kiện | khoẻ | kɛnH | kɛnH | [k]ˤi[n]-s | [k]ˤi[n]-s |
| 壯 | zhuàng | tráng | khoắn / mạnh | tsrjangH | tsrjangH | [ts]ˤroŋ-s | ʔs-toŋ-s |
The comparative evidence we’ve examined, from jiànkāng 健康 → sứckhoẻ to jiànzhuàng 健壯 → khoẻkhoắn, illustrates how Vietnamese reflexes emerge not through rigid one‑to‑one correspondences, but through dynamic associative sandhi, semantic reanalysis, and phonological reinvestment.
- 健 jiàn consistently aligns with sức or khoẻ, reflecting its semantic field of 'strength, health'.
- 康 kāng and 壯 zhuàng interact with each other in Vietnamese, producing khoẻ, khoắn, and even mạnh, showing how reduplication and sound shifts (kʰw‑ → w‑ → m‑) generate new forms.
- Reconstructions across Middle Chinese (Baxter, Pulleyblank) and Old Chinese (Baxter–Sagart, Zhengzhang) confirm the plausibility of these transformations, grounding Vietnamese developments in well‑attested phonological histories.
Etymologically, the initial kh- /kʰ-/ of the second syllable may have been 'sandhized' with the final /‑n/ of the first syllable /jiàn/. The first syllable can be identified with sức (cf. 力 lì, SV lực 'strength'), while the second, kāng 康 (with an alternate, probably older SV form khương /kʰjɨəŋ1/), developed into khoẻ ('strong').
An alternative possibility is that sứckhoẻ arose as an innovation from 力氣 lìqì (SV lựckhí, 'power, stamina'), which could yield sứckhoẻ or hơisức ('strength'), implying 'having the strength to do heavy tasks' and thus extending to the general sense of 'being healthy'. In this scenario, 力氣 lìqì denotes primarily 'strength, stamina', whereas 健康 jiànkāng conveys the broader meaning of both 'health' and 'healthy'.
A further comparison can be made with the compound khoẻmạnh. If we separate the two elements, khoẻ + mạnh, they could be reconstructed as 壯 zhuàng + 猛 měng ('strong' + 'powerful'), or in reverse order as mạnhkhoẻ corresponding to 猛壯 měngzhuàng (SV mãnhtráng). Yet these Chinese forms, while semantically close, emphasize 'energetic' or 'powerful' rather than the more specific sense of 'health' conveyed by 健康 jiànkāng.
Moreover, the concept of khoẻ must have existed independently in Vietnamese, as seen in the everyday greeting Chào, có khoẻ không? ('Hello, are you well?'), which parallels the modern Chinese expression 早, 你好? ('Hello, how are you?'). Here, khoẻ aligns with 好 hǎo (SV hảo), suggesting a pre‑existing Vietnamese word later reinforced by Sinitic influence.
Applying the principle of associative sandhi further postulates the development of 健壯 jiànzhuàng (SV trángkiện, 'feeling fit, well') → khoẻkhoắn. In this case, 健 jiàn (SV kiện) corresponds to khoẻ, while 壯 zhuàng can be associated with 康 kāng (SV khang, khương), producing either the reduplicative khoắn or the variant mạnh. The phonological pathway may be reconstructed as: /kʰjɨəŋ1/ (khương) → /kʰaŋ1/ → /kʰwəɒn5/ (khoắn) → /majŋ6/ (mạnh), with the shift kʰw‑ > w‑ > m‑.
Parallel cases (cho, làm, phong, gong, etc.) reinforce the principle that Vietnamese often maps many Chinese sources to one Vietnamese form, or one Chinese source to multiple Vietnamese outcomes, depending on context.
In sum, these patterns strengthen the dissyllabicity hypothesis: Vietnamese, like Chinese, is fundamentally polysyllabic in its lexical organization. The "mystical morphs" in compounds (mĩm in mĩmcười, thútthít in khócthútthít, bạt in bạtmạng) are not anomalies but natural outcomes of this system.
Vietnamese sound change is best understood not as a static set of correspondences, but as a living system of associative sandhi, where phonological shifts, semantic layering, and cultural adaptation converge. This framework allows us to trace Vietnamese etyma back to their Sinitic roots with greater clarity, while also appreciating the uniquely Vietnamese innovations that emerged along the way.
d) Other examples of associative development:
- bắtđầu 劈頭 pītóu ('start')
- bắtcóc 綁架 bǎngjià ('kidnap')
- bắtđền 賠償 péicháng ('demand compensation')
- bắtnạt 撥弄 bōnòng ('order about')
- mĩmcười 含笑 hánxiào ('smile')
- khócthútthít 哭泣 kūqì ('weep')
- lỗtai 耳朵 ěrduo ('ear')
- bạttai 巴掌 bāzhǎng ('spank')
- bạtmạng 拼命 pīnmìng ('risk one’s life')
- đồngbạc 銅板 tóngbǎn ('monetary unit')
- đitiền 隨錢 suíqián ('monetary gift')
- thầytrò 師徒 shītú ('teacher and students')
- họctrò 學子 xuézǐ ('student')
- trườnghọc 學堂 xuétáng ('school')
- nhẹnhàng 輕輕 qīngqīng ('slightly')
- chungquanh 周圍 zhōuwéi ('around')
- thôinôi 周年 zhōunián ('anniversary')
- nhiềunăm 有年 yǒunián ('many years')
- mấynămnay 近年來 jìnniánlái ('in recent years')
- hoahồng 花紅 huāhóng ('commission')
It is a rule of thumb that phonetic sandhi processes occur only within disyllabic formations. To reinforce this postulation, let us examine several additional unique examples:
Table 13 - Disyllabic Sandhi Examples
| Chin. | Sino-Vietnamese | Sinitic-Vietnamese | Meaning | Notes |
|---|---|---|---|---|
| 垃圾 lāji | lạpcấp | rác | 'trash' | Likely from rácrưới, rácrưởi, rácrến. Sandhi: ra- (l‑ ~ r‑) + ‑c (j‑ ~ k‑). Cf. 伊拉克 Yīlākē → Irắc. |
| 毫無 háowú | hảovô | không | 'no, not' | Via hông < khônghề. Shows contraction and semantic narrowing. |
| 天啊 Tiānna | thiêna | Trờiơi | 'My Lord' | Fusion of 天 tiān + 啊 ā. |
| 甭 béng | bằng | đừng | 'don't' | From 不用 bùyòng ('do not'). |
| 別 bié | biệt | chớ | 'do not' | From 可別 kěbié ('do not'). 可 kě ~ có (有 yǒu). Originally 不要 bùyào → 別 bié. |
The table brings out several distinct kinds of sandhi process and associative shift:
- Phonological mergers (e.g., ra- + ‑c → rác).
- Semantic contractions (e.g., háowú → không).
- Exclamatory fusions (e.g., Tiānna → Trờiơi).
- Colloquial reductions (e.g., béng → đừng).
- Reanalyses (e.g., bié → chớ).
Taken together, these cases reinforce the dissyllabicity principle. Vietnamese monosyllabic forms such as bắt, thầy, thợ, trò, trường, hàm, ngậm, mĩm, tai, tay, quanh, thôi, nôi, năm, đồng, bạc, nhiều, and disyllabic forms such as bạtmạng or bạttai, can be securely traced to Chinese sources through associative sandhi and phonological reinvestment. This framework also clarifies the role of "mystical morphs" in polysyllabic composites (mĩm in mĩmcười, thútthít in khócthútthít, or bạt in bạtmạng), showing them not as anomalies but as natural outcomes of dynamic sound change.
Of course, Chinese → Sinitic‑Vietnamese sound changes sometimes occur beyond the strict constraints of formal linguistic rules. It is unnecessary, however, within the scope of this paper, to enumerate every possible rule of sound change for each Chinese character that yields a Sinitic‑Vietnamese morphemic syllable, including those that intersect with Vietnamese phonology. Most readers can readily grasp the mechanisms behind the examples already cited, sound changes that are both plausible and intuitively recognizable. For instance: bīng 兵 → lính ('soldier'), bēi 盃 → ly ('glass'), bài 拜 → lạy ('kowtow'), dǎ 打 → đánh ('strike'), yǐn 飲 → uống ('drink'), yóu 游 → bơi ('swim'), yòu 柚 → bưởi ('pomelo'), bǐrú 比如 → vínhư ('for example'), and so forth.
In virtually all Sino‑Vietnamese readings, the pronunciation keys, namely 反切 fǎnqiè (FQ), correspond closely to the phonetic descriptions preserved in classical rhyme books and dictionaries such as 廣韻 Guǎngyùn and the 康熙字典 Kāngxī Zìdiǎn. These attestations confirm the reliability of the system. For example: xié 鞋 → SV hài ('shoes'), kū 哭 → SV khốc ('weep'), bǐng 偋 → SV sính ('betroth'), chéng 承 → SV thừa ('inherit'), among many others.
Critics may continue to mock the author’s supposed ignorance of Western historical‑comparative methodologies, which are indeed effective for Indo‑European generalities but ill‑suited to the peculiarities and irregularities of Chinese and Vietnamese. Ironically, these same critics insist on classifying Chinese and Vietnamese as "isolated monosyllabic languages", thereby overlooking the fact that Western rules of sound change cannot be applied to patterns involving more than one syllable.
It remains customary, of course, for a system of well‑established phonological rules to guide specialists in analyzing and explaining the phenomena of sound change. Historical background is essential for understanding how words became homonyms within the Chinese phonological system. Many Mandarin words that share a Middle Chinese origin still retain distinct reflexes in Sino‑Vietnamese, as well as in Cantonese and Hokkien, and these match the phonological spellings recorded in the Kāngxī Zìdiǎn. For example, the morpheme yi [i], pronounced with four different tones in Mandarin, corresponds to a wide range of Sino‑Vietnamese forms: nhất, nghĩa (also ngãi), nghệ, nghị, y, dịch, duệ, dị, dĩ, dì, etc., representing 一, 義, 藝, 議, 醫, 易, 裔, 異, 以, 姨, respectively.
Table 14 sets out these correspondences for the morpheme yi [i], showing how one Mandarin syllable with multiple tones corresponds to a wide range of Sino‑Vietnamese (SV) outcomes. Middle Chinese (Baxter, Pulleyblank) and Old Chinese (Baxter–Sagart, Zhengzhang) reconstructions are included to show the historical depth of each reflex.
Table 14 - The Morpheme yi [i] Across Chinese and Sino‑Vietnamese
| Chin. | Sino-Vietnamese | Meaning | MC (Baxter) | MC (Pulleyblank) | OC (Baxter-Sagart) | OC (Zhengzhang) |
|---|---|---|---|---|---|---|
| 一 yī | nhất | 'one' | ʔit | ʔit | ʔit | ʔit |
| 義 yì | nghĩa | 'righteousness' | ngjijH | ngjieH | ŋ(r)aj-s | ŋi̯e-s |
| 藝 yì | nghệ | 'art, skill' | ngjejH | ngieiH | ŋ(r)at-s | ŋi̯at-s |
| 議 yì | nghị | 'discuss, deliberate' | ngjijH | ngjieH | ŋ(r)aj-s | ŋi̯e-s |
| 醫 yī | y | 'medicine, doctor' | ʔij | ʔi | ʔij | ʔi |
| 易 yì | dịch | 'change, exchange' | jek | jek | lek | lek |
| 裔 yì | duệ | 'descendant' | jejH | jieH | ljats | ljat-s |
| 異 yì | dị | 'different' | jiH | jiH | ljəʔ-s | ljɯ-s |
| 以 yǐ | dĩ | 'use, take' | yiX | yiX | ʔijʔ | ʔiʔ |
| 姨 yí | dì | 'aunt' | ji | ji | ljə | ljɯ |
- A single Mandarin syllable yi [i] corresponds to many more Sino‑Vietnamese outcomes than the ten listed here, depending on tone, historical layer, and semantic differentiation.
- Middle Chinese reconstructions (Baxter, Pulleyblank) show tonal and initial distinctions that later collapsed in Mandarin but were preserved in Sino-Vietnamese.
- Old Chinese reconstructions (Baxter-Sagart, Zhengzhang) reveal deeper roots, often with complex onsets (ŋ‑, lj‑, ʔ‑) that explain the diversity of Sino-Vietnamese reflexes (nh‑, ngh‑, d‑, y‑).
- This multiplicity demonstrates how Vietnamese disyllabicity and associative sandhi interact with Middle-Chinese phonological history to produce a rich array of forms.
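The collapse described in Table 14 can be checked mechanically: once tone marks are removed, all ten Mandarin readings reduce to the same syllable, while the Sino‑Vietnamese readings stay distinct. The sketch below is a minimal illustration in Python; the pinyin is entered in its standard form, and the helper names `strip_tone` and `reflexes_by_syllable` are assumptions made for this example.

```python
import unicodedata
from collections import defaultdict

# (character, Mandarin pinyin, Sino-Vietnamese reading) rows from Table 14.
TABLE_14 = [
    ("一", "yī", "nhất"), ("義", "yì", "nghĩa"), ("藝", "yì", "nghệ"),
    ("議", "yì", "nghị"), ("醫", "yī", "y"), ("易", "yì", "dịch"),
    ("裔", "yì", "duệ"), ("異", "yì", "dị"), ("以", "yǐ", "dĩ"),
    ("姨", "yí", "dì"),
]

def strip_tone(syllable):
    """Drop combining tone marks, e.g. 'yǐ' -> 'yi'.

    Note: this would also strip the diaeresis of 'ü'; that is harmless
    for the data used here, which contains no such vowel.
    """
    decomposed = unicodedata.normalize("NFD", syllable)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def reflexes_by_syllable(rows):
    """Group Sino-Vietnamese readings under the toneless Mandarin syllable."""
    grouped = defaultdict(list)
    for _char, pinyin, sv in rows:
        grouped[strip_tone(pinyin)].append(sv)
    return dict(grouped)

grouped = reflexes_by_syllable(TABLE_14)
toned_spellings = {pinyin for _char, pinyin, _sv in TABLE_14}

print(len(toned_spellings), "toned spellings ->", list(grouped))
print(len(set(grouped["yi"])), "distinct SV readings:", ", ".join(grouped["yi"]))
```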
Etymologically, many sound changes within the Chinese phonological system can be reconciled when compared against the Middle Chinese (MC) and Old Chinese (OC) sound systems. Sinologists such as Wang Li and Bernhard Karlgren also compared these with Chinese loanwords in Vietnamese, Korean, and Japanese to reconstruct earlier stages of Chinese. For illustration, consider the first three [i] readings discussed above:
- nhất 一 yī [i1] ('one') [ M 一 yī, yí, yì, yāo < MC ʔjit < OC *qliɡ ] → SV nhất
- nghĩa ~ ngãi 義 yì [i4] ('righteousness') [ M 義 yì < MC ŋjiə̆ < OC *ŋrals | According to Starostin, 'be right, righteous, proper'; derived from 宜 ŋaj. Vietnamese nghĩa preserves an archaic reading (late Han a‑vocalism, but with loss of final ‑j), while SV ngãi /ŋaj4/ and Quảng Nghĩa dialect nghiẽ /ŋie4/ retain the final. Cf. Chaozhou ŋi4, Fuzhou ŋie6. ]
- nghệ 藝 yì [i4] ('arts', 'skill') [ M 藝 yì < MC ŋjaj < OC *ŋeds | Starostin: 'to plant, cultivate; skill'. MC unusually preserves ŋ before j. Vietnamese also has the colloquial nghề. Cf. Xiamen ge6, Chaozhou goi6, Fuzhou ŋie6. Possible cognates: VS nghề ('profession'), nghệ ('turmeric'), ngãi ('turmeric'), tỉa ('to plant'), gieo ('to sow'). ]
Linguistic literature abounds with such radical changes. As King (1969: 109, 111) observes, "loss of segments is an almost commonplace kind of historical development: Greek lost its final stops, Germanic lost word‑final consonants and vowels under certain conditions." Mandarin exemplifies this process, having undergone extensive loss of phonological finals under the influence of the accented speech of the Altaic and Turkic peoples who ruled northern China over the centuries before 1911: the rulers of the Yan State, the Liao Dynasty, the Jurchen (金 Jin), the Mongols, and the Manchus.
Concrete cases of loss of initials and finals in Mandarin are well documented. While the details of this complex process are not to be addressed here, it suffices to note that the contracted forms seen in the 360 four‑toned [i] syllables arose from the dropping of archaic initials and endings during diachronic sound change.
Ancient rhyme books such as Guǎngyùn (廣韻) and Zhōngyuán Yīnyùn (中原音韻) provide ample evidence of these gradual changes. Yet synchronically, the same patterns emerge across Northeastern Mandarin, Wu, and Southwestern Mandarin (e.g., Sichuanese), in contrast to southern dialects such as Cantonese and Hokkien, which preserve ancient initials and finals (/d‑/, /ŋ‑/, /‑p/, /‑t/, /‑k/, /‑m/, etc.). From the early centuries of the last millennium through the Mongol Yuan Dynasty (13th century) and later the Manchu Qing Dynasty (17th–20th centuries), Mandarin remained in close contact with the languages of Altaic peoples (Turks, Tartars, Jurchen, Mongols, Manchus). These northern non‑Han languages left a profound impact on Early Mandarin, contributing to contraction, omission, corruption, and loss of initials, medials, and finals (Bo Yang 1983; Zhou 1991). In other words, descendants of the ancient Yan (燕 Yên) still rule Beijing today.
Although it may appear straightforward to chart sound change patterns by systematically tabulating ancient and modern forms, in practice this is a painstaking task. Often the intermediate steps are opaque, making it difficult to reconstruct the precise pathways. For newcomers, it is sometimes best to accept the outcomes at face value. For example, Vietnamese học /hawk͡p̚˧˨ʔ/ ('study') derives from MC ɦaɨwŋk < OC *ɡruːɡ, etc., fitting into a diachronic continuum that leads to Mandarin xué [ɕyɛ2]:
- học ('study') 學 xué < EM /xjaw/ < MC ha:wk < OC ɣɶ:kʷ [ Rule: final /‑k/ conditioned by /‑w‑/ → /‑kʷ/ | Starostin: MC ɣauk < OC ghrūk. Pulleyblank: LM xɦja:wk < EM ɣaɨwk. The Vietnamese /‑kw/ in học parallels Cantonese /hɔk8/, though Cantonese has lost the labialization. ]
- tiết ('blood') 血 xiě, xiè, xuè (SV huyết) [ < MC hwet < OC *qʰʷiːɡ | Starostin: Vietnamese also has tiết 'animal blood', an archaic loan (with t- regularly representing OC *s-, which was already lost in MC). ]
Similarly:
- khóc ('weep') 哭 kū [ < MC kʰəwk < OC *ŋ̥ʰoːɡ | SV khốc /kʰəwk͡p̚˦˥/ preserves the archaic final; VS khóc /kʰawk͡p̚˦˥/ reflects the same development. Dialectal parallels: Yangzhou /khɔʔ4/, Suzhou /khoʔ41/, Cantonese /huk41/, Amoy /khoʔk41/, Chaozhou /khok41/, etc. ]
- khóc 泣 qì (SV khấp) [ < MC kʰɯip < OC *kʰrɯb | Wiktionary: phono-semantic compound, OC *kʰrɯb: semantic 氵 ('water') + phonetic 立 (OC *rɯb). The character originally meant 'tears' and, by extension, came to represent the act of crying. | Ex. 喪家同泣報. Sāngjiā tóng qìbào. (Tanggia đồng khấpbáo.) 'The bereaved family tearfully announces the death.' ]
These examples show that sound change is dynamic and diverse, affecting not only single syllables but also entire multisyllabic strings, as seen in disyllabic forms discussed earlier.
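The loss of final stops in Mandarin, against their retention in Sino‑Vietnamese, can be illustrated with a simple spelling test. The following Python sketch is an orthographic approximation rather than a phonological analysis: it assumes that Vietnamese final -p, -t, -c and -ch spell syllable‑final stops, and the helper names are hypothetical. The readings are those cited in this section.

```python
import unicodedata

# Vietnamese spellings of syllable-final stops: -p, -t, and -c/-ch (for /k/).
STOP_FINALS = ("p", "t", "c", "ch")

# (character, Mandarin pinyin, Sino-Vietnamese reading), taken from the text.
PAIRS = [
    ("學", "xué", "học"),
    ("泣", "qì", "khấp"),
    ("血", "xuè", "huyết"),
    ("一", "yī", "nhất"),
    ("易", "yì", "dịch"),
    ("義", "yì", "nghĩa"),
]

def strip_marks(text):
    """Remove combining diacritics (tone and vowel-quality marks)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

def has_final_stop(sv_reading):
    """True if the Sino-Vietnamese spelling ends in a stop consonant."""
    return strip_marks(sv_reading).endswith(STOP_FINALS)

def is_open_syllable(pinyin):
    """True if the Mandarin syllable ends in a vowel letter."""
    return strip_marks(pinyin)[-1] in "aeiou"

# Characters whose final stop survives in SV but not in Mandarin.
lost_in_mandarin = [
    char for char, pinyin, sv in PAIRS
    if has_final_stop(sv) and is_open_syllable(pinyin)
]
print(lost_in_mandarin)
```

Of the six items, only 義 (SV nghĩa) fails the test, since its Sino‑Vietnamese reading has no final stop to begin with.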
One of the most striking features of syntactic adaptation in Vietnamese is the reversal of compound word order to align with Vietnamese speech habits. This reflects the [noun + adjective (modifier)] order of Old Chinese grammar, in which the second element modifies the first, as opposed to the [adjective + noun] order of modern Chinese. Yet both patterns coexist in Vietnamese. For example:
- mắtkiếng (VS mắtkính) for 目鏡 mùjìng (Hai. /mat7keng1/, 'eye‑glasses'), paralleling the Old Chinese [noun + modifier] order.
- kiếngmắt (VS gươngmắt), reflecting the modern Chinese [modifier + noun] order.
This phenomenon of lexical re‑arrangement – metathesis or inversion, that is, the transposition of sounds or letters in a word – has had a profound effect on the formation of disyllabic compounds, determining which syllable comes first. Many such words, especially Chinese loanwords, were originally composed of two lexical elements. When introduced into Vietnamese, speakers either retained the original order or reversed it to suit local grammar. The relative looseness of these paired syllables allowed for such fluidity, particularly during the period from the Jin (晉) through the Tang (唐) dynasties, when large numbers of both literary and colloquial words entered Vietnamese. Over time, one form often stabilized as the standard, while alternate variants persisted in parallel.
Examples of such alternation include:
- thơdại # ngâythơ 幼稚 (yòuzhì, SV ấutrỉ # VS trẻdại, 'childish')
- sôngnúi # nonsông 江山 (jiāngshān, cf. 山河 shānhé, SV sơnhà, 'country')
- nhànước # nướcnhà 國家 (guójiā, SV quốcgia, 'government' vs. 'nation')
- hoamắt # mắthoa 眼花 (yǎnhuā, 'dazzling vision')
- trườnghọc # họcđường 學堂 (xuétáng, 'school')
- chợbúa # phốchợ 市鋪 (shìpù, SV thịphố # phốthị, 'marketplace')
- bảođảm # đảmbảo 擔保 (dānbǎo, 'guarantee')
- âmthanh # thanhâm 聲音 (shēngyīn, 'sound')
- lạnhcóng # cónglạnh 寒冷 (hánlěng, 'chilly')
- tráicây # câytrái 果實 (guǒshí, 'fruit')
- nhãnlồng # longnhãn 龍眼 (lóngyǎn, 'longan')
Readers who do not yet recognize the transposition should either suspend judgment or accept the stated propositions as working premises, to be used as a springboard for further inquiry. This is, after all, the natural process of human learning.
Such lexical shuttling – anastrophe or inversion – often places semantic weight on the modified element rather than the modifier. For instance, 罪惡 zuì’è (SV tộiác 'crime') semantically corresponds more closely to 惡罪 èzuì 'evil‑crime', yet both orders coexist in the lexicon.
The logical outcome of this disyllabic treatment is clear: when in doubt, one should test the reversed order of syllables. This principle reflects the fact that many sound changes occurred before the final stabilization of disyllabic forms in either Chinese or Vietnamese. By applying this inversion 'trick', we can often reconstruct plausible etyma and establish credible connections between Vietnamese and Chinese cognates.
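The inversion test can be sketched as a simple procedure: read the Chinese compound character by character, then compare the Vietnamese form against both the Chinese order and its reverse. The Python below is a minimal illustration under stated assumptions: `SV_READING` is a hypothetical mini‑lexicon holding only the readings needed for three compounds cited above, and `match_order` is a name invented for this example. It handles only literal Sino‑Vietnamese readings; vernacular forms such as nhãnlồng would additionally require sound‑correspondence rules that are not modelled here.

```python
import unicodedata

# Hypothetical mini-lexicon: Sino-Vietnamese reading of each character.
SV_READING = {
    "擔": "đảm", "保": "bảo",
    "聲": "thanh", "音": "âm",
    "龍": "long", "眼": "nhãn",
}

def nfc(text):
    """Normalize so that visually identical Vietnamese strings compare equal."""
    return unicodedata.normalize("NFC", text)

def match_order(viet_syllables, hanzi):
    """Compare a Vietnamese compound with a Chinese one, syllable by syllable.

    Returns 'same' if the syllables follow the Chinese order, 'reversed'
    if they follow the opposite order, and None if neither order matches.
    """
    viet = [nfc(s) for s in viet_syllables]
    chinese_order = [nfc(SV_READING[ch]) for ch in hanzi]
    if viet == chinese_order:
        return "same"
    if viet == chinese_order[::-1]:
        return "reversed"
    return None

for syllables, hanzi in [
    (("đảm", "bảo"), "擔保"),
    (("bảo", "đảm"), "擔保"),
    (("âm", "thanh"), "聲音"),
    (("long", "nhãn"), "龍眼"),
]:
    print("".join(syllables), hanzi, match_order(syllables, hanzi))
```

A result of `None` does not rule out a relationship; it only means that neither order matches under the literal readings supplied.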
The Vietnamese lexicon is full of compounds that seem, at first glance, oddly inverted. Yet this inversion is not random: it reflects a deep historical process in which borrowed syllables were still fluid, their order unsettled, and speakers chose whichever arrangement best suited local grammar and rhythm.
Take bắtnạt # 欺負 (qīfù, SV khiphụ, 'bully'). The Chinese compound places 欺 'to deceive' before 負 'to bear', but Vietnamese speakers flipped the order, letting bắt carry the weight of action while nạt sharpens the sense of intimidation. The result is a form that feels native, even though its roots are unmistakably Sinitic.
The same pattern appears in thầymô # 巫師 (wūshī, SV usư, 'sorcerer'). Here, the Chinese order is 巫 'shaman' + 師 'master'. Vietnamese, however, foregrounds thầy 'teacher, master' and lets mô carry the shamanic nuance. The inversion not only naturalizes the compound but also aligns it with the Vietnamese cultural schema of thầy as a figure of authority.
Other cases are subtler but no less telling. khônlanh # 靈巧 (língqiǎo, SV linhxảo, 'witty') reverses the Chinese order, giving primacy to khôn 'clever' while letting lanh 'quick, nimble' follow. hồnthiêng # 靈魂 (línghún, SV linhhồn, 'spirit') likewise inverts the Chinese sequence, foregrounding hồn 'soul' and letting thiêng 'sacred' qualify it.
Even everyday kinship terms bear this mark of inversion. bàxã # 媳婦 (xífù, 'wife') and ôngxã # 相公 (xiànggōng, SV tướngcông, 'husband') both re‑order the Chinese elements, mapping them onto Vietnamese kinship vocabulary in ways that feel natural to the ear. Similarly, ôngchủ # 主公 (zhǔgōng, SV chúacông, 'master') places ông first, in keeping with Vietnamese address norms.
Geographic and cultural compounds show the same play. nonsông # 江山 (jiāngshān, SV giangsơn, 'nation') reverses the Chinese order, yielding the familiar Vietnamese pairing sông núi 'rivers and mountains'. yêuthương # 疼愛 (téng’ài, SV đôngái, 'love') likewise reshuffles the Chinese sequence, foregrounding yêu 'love' and letting thương 'affection' follow.
Even the mundane đườngcái # 街道 (jiēdào, SV cáiđạo, 'road') and phốchợ # 市舖 (shìpù, SV thịphố, 'market') reveal the same logic: Vietnamese speakers instinctively re‑ordered the syllables to fit local patterns of emphasis. And in ôngnghè # 衙門 (yámén, SV nhamôn, 'civil servant'), the inversion is almost playful, mapping 門 'gate' onto ông and 衙 'office' onto nghè, producing a form that is both intelligible and culturally resonant.
The pattern is unmistakable, and it reflects a principle rather than chaos: when a compound entered Vietnamese, its syllables were not fixed in stone. They could be reversed, re‑weighted, and re‑aligned until they fit the cadence of Vietnamese speech. Inversion was thus a strategy of naturalization, making foreign compounds feel native by aligning them with local rhythm, semantics, and cultural schemas. For the historical linguist, this "inversion trick" is more than a curiosity; it is a diagnostic tool. When a Vietnamese form resists explanation, try reversing the syllables. More often than not, the hidden cognate will surface.
In sum, our exploration of disyllabicity demonstrates that many Vietnamese words, long overlooked by scholars, can only be properly understood through this lens. By tracing how Vietnamese disyllabic forms diverged from their Chinese roots, we gain a powerful methodological tool.
The renewed recognition of Vietnamese as a fundamentally disyllabic language establishes a new polysyllabic approach to etymology. Many peculiar sound changes from Chinese into Vietnamese occurred only under such conditions. This approach, long overdue, challenges the entrenched but mistaken notion of Vietnamese as a purely monosyllabic language. (2)
Conclusion
Exploring disyllabic sound change opens a new frontier in comparative linguistics. By treating disyllables as coherent units, we uncover patterns of phonological shift and semantic pairing that reshape our understanding of Vietnamese origins. This approach not only enriches etymological analysis but also strengthens the case for a shared Sinitic-Yue heritage embedded in the Vietnamese lexicon.
The evidence presented in this chapter shows that Vietnamese cannot be understood apart from the wider Sinitic-Yue continuum. From the earliest strata of loanwords to the fluid alternations of disyllabic compounds, the language reveals centuries of contact, adaptation, and inversion.
Several points stand out. Vietnamese compounds often invert the order of their Chinese models, a process that naturalized foreign forms into local rhythm and semantics. This inversion is not random but systematic, and it provides a diagnostic tool for reconstructing etyma. The lexicon is layered: Sino‑Vietnamese readings, vernacular Sinitic‑Vietnamese forms, and substratal elements from Mon‑Khmer, Chamic, and Tai. Yet the Sinitic layers dominate, and their depth suggests not mere borrowing but shared inheritance from a Yue substrate. Borrowed compounds were not only phonologically reshaped but semantically re‑weighted to align with Vietnamese cultural schemas, as in ôngchủ, bàxã, or sôngnúi.
Historically, the great influx of Chinese vocabulary coincided with political domination and migration from the Jin through Tang dynasties. Stabilization came only gradually, leaving behind doublets, alternates, and inversions that still mark the lexicon today. The methodological lesson is clear: when a form resists explanation, one should test the reversed order of syllables. This simple procedure often uncovers hidden cognates and reconstructs plausible etymologies.
Taken together, these findings support the view that Vietnamese and Chinese share a common Yue foundation, later overlaid by successive waves of Sinitic influence. Vietnamese is not simply a Mon‑Khmer language with heavy Chinese borrowings, nor merely a Sino‑xenic reflex. It is a Yue‑based language, refracted through centuries of contact, conquest, and cultural negotiation.
Recognizing Vietnamese as fundamentally polysyllabic and disyllabic in structure opens a new path for etymological research. It allows us to explain sound changes, semantic shifts, and lexical alternations that have long puzzled scholars. More importantly, it reframes Vietnamese not as a peripheral tongue under Chinese shadow, but as a central witness to the shared Yue heritage of southern China and northern Vietnam.
This chapter establishes the methodological springboard: to study Vietnamese etymology, one must embrace polysyllabicity, test inversion, and situate the lexicon within the Yue–Sinitic continuum. Only then can we recover the true historical depth of the language and its people.
FOOTNOTES
(1)^ The following Chinese cognates are posited for these Vietnamese words:
(i) dissyllabicity:
- đầugối # 膝蓋 xīgài (knee),
- mắccá 踝節 huáijié (ankle) [ Also: 踝骨 huáigǔ ],
- cổchân # 腳脖 jiǎobó (ankle),
- càngcổ # 脖頸 bójǐng (back of the neck),
- bảvai # 肩膀 jiānbǎng (shoulders),
- cùichỏ 胳膊肘 gēbozhǒu (elbow),
- màngtang 太陽穴 tàiyángxué (temple),
- mỏác # 囟門 xìnmén (fontanel),
- chânmày 眉尖 méijiān (eyebrow),
- càunhàu 僝僽 chánzhòu (growl),
- cằnnhằn 埋怨 mányuàn (grumble),
- bângkhuâng 彷徨 pánghuáng (pensive),
- bồihồi 徘徊 páihuái (melancholy),
- mồhôi 冒汗 màohàn (sweat),
- mồcôi 無辜 wúgū (orphan),
- hàilòng 開心 kāixīn (pleased),
- taitiếng 丟臉 diūliǎn (infamous),
- tạmbợ 暫時 zànshí (temporary),
- tráchmóc 折磨 zhémó (reproach),
- tuyệtvời 絕妙 juémiào (wonderful),
- tămhơi 音信 yīnxìn (whereabouts),
- cườimĩmchi 笑眯眯 xiàomīmī (crack a smile),
- tủmtỉmcười 偷偷笑 tōutōuxiào (hide a smile),
- mêtítthòlò 迷離糊塗 mílíhútú (irresistible),
- nhảyđồngđổng # 蹦蹦跳 bèngbèngtiào (jump up in protest),
- bađồngbảyđổi 說三道四 shuōsāndàosì (unpredictably),
- lộntùngphèo ® 亂七八糟 luànqībāzāo (upside down),
- tuyệtcúmèo ® 妙不可言 miàobùkěyán (fabulous),
- hằnghàsasố 恆河沙數 hénghéshāshù (innumerable),
- hiệndiện 現在 xiànzài (present),
- phụnữ 婦女 fùnǚ (woman),
- sơnhà 山河 shānhé (country),
etc.
(ii) polysyllabicity:
(iii) Sino-Vietnamese equivalents:
(2)^ Once the dissyllabic nature of Vietnamese is acknowledged, it follows almost inevitably that the writing system must also change. A new orthography, provisionally called Việtngữ 2020 or Vietnamese2020 (as outlined in the Vietnamese2020 Writing Reform Proposal), offers such a path. In this paper the author has consistently written Vietnamese dissyllabic words in combined forms to demonstrate the principle. The reasoning is straightforward: many of these words cannot be split into isolated syllables, since each syllable functions as a bound morpheme, a composite element that only gains full meaning when paired with its partner. The numerous examples presented here illustrate this point. The hope is that a truly polysyllabic orthography will prove to be the most faithful way of writing Vietnamese, and that it may, within our lifetime, gain wide acceptance.