Tuesday, November 18, 2025

Sino-Vietnamese vs. Sinitic-Vietnamese words

Phonology, Semantics, Syntax: How Vietnamese Localized Chinese Roots

by dchph




The Sino‑Vietnamese layer, comprising roughly two‑thirds of the lexicon, represents a systematic borrowing from Middle Chinese. These words were adapted into Vietnamese phonology and tonal categories, stabilizing after the tenth century. Their structure preserves Tang‑era rhyming and tonal matrices, which explains why Tang poetry remains accessible in Vietnamese literary circles. Much like Latin in English, Sino‑Vietnamese provides a learned, elevated register that is instantly recognizable to native speakers.


Beyond this formal stratum lies a class of Sinitic‑Vietnamese words that have been fully localized. Some predate Middle Chinese borrowings, while others are innovative variants that diverge semantically or phonetically from their sources. Functioning as naturalized elements of the vernacular, they are often indistinguishable from native words. Together with the Sino‑Vietnamese layer, they demonstrate the depth of Chinese influence on Vietnamese, while also highlighting the creative processes of adaptation, localization, and semantic shift that define the language's unique identity.

This study revisits the Sino‑Vietnamese vocabulary stratum, focusing on basic words, many of which are cognate with Sinitic‑Vietnamese forms derived from Old Chinese. The evidence shows that Vietnamese did not simply adopt Chinese roots; it localized them through phonological adaptation, semantic layering, and syntactic restructuring.

Examples include:

  • 父 fù (bố, 'father') vs. 爹 diē (tía → cha, 'dad, daddy')

  • 中 zhōng (trung → trong, 'inside') → trúng ('hit in the right spot')

  • 得 dé (đắc → được, 'positive passive marker') vs. 被 bèi (bị, 'negative passive marker')

  • 當 dāng, dàng, dǎng (đang → đán → đương → tưởng, 'is being', 'act', 'thought', 'pawn')

These cases demonstrate how Vietnamese reshaped Chinese elements into its own grammatical system, while still preserving the cognate backbone.

I) Review of Sino-Vietnamese words

Beyond the most elementary Sinitic stratum, the pervasiveness of Sino‑Vietnamese vocabulary becomes immediately clear. Select almost any word in a Vietnamese paragraph, or even in a short sentence, and the odds are high that it derives from Sino‑Vietnamese, at least in an older form. In longer sentences, most items cannot be replaced by hypothetical native equivalents without losing nuance or precision.

This dominance is borne out quantitatively. Chinese loanwords, known as Hán‑Việt or Sino‑Vietnamese, account for roughly sixty‑five percent of the Vietnamese lexicon. They are spelled and pronounced in a distinctively Vietnamese manner, a system that stabilized after the tenth century, when courtly Mandarin ceased to function as a living language in Annam. Their roots, however, lie firmly in Middle Chinese. As Bernhard Karlgren observed (MFEA, Bulletin 22, 1954, p. 216), the Sino‑Vietnamese layer had formed into a relatively complete system by the end of the Tang Dynasty. Phonologically, these words align closely with Middle Chinese, with its twenty‑odd initials and more than three hundred finals. Crucially, the eight Vietnamese tones correspond neatly to Tang‑era tonal categories, preserving the rhyming structures and tonal matrices described in works such as the Tangyun and Guangyun (Nguyễn Tài Cẩn, 1979).

This correspondence explains why Tang poetry remains accessible and aesthetically resonant in Vietnamese literary circles. Even today, poets compose regulated verse in Tang style, adhering to strict tonal and rhyming conventions. By contrast, modern Chinese poetry has largely lost this connection, since Mandarin's tonal system no longer matches Tang‑era prosody. Cantonese, like Vietnamese, preserves many of these archaic phonological features, which accounts for their striking similarities.

Much as Latin‑derived vocabulary functions in English, Sino‑Vietnamese words are instantly recognizable to Vietnamese speakers without specialized training. Their sound changes generally follow consistent phonological rules, producing a harmonious system of correspondences. Yet not all forms align perfectly. Dialectal variation, phonetic erosion, and performance factors have introduced irregularities, creating one‑to‑many relationships between Chinese sources and Vietnamese outcomes.

Despite such divergences, Sino‑Vietnamese words, whether in their original or extended grammaticalized forms, remain indispensable to the structure and expression of Vietnamese. Examples include:

  • 學 xué – học ('study')
  • 文 wén – văn ('literature')
  • 字 zì – chữ ('word')
  • 詩 shī – thi ('poetry')
  • 樂 yuè – nhạc ('music')
  • 練 liàn – luyện ('practice')
  • 福 fú – phước ('luck')
  • 公 gōng – công ('public')
  • 私 sī – tư ('private')
  • 錢 qián – tiền ('money')
  • 男 nán – nam ('male')
  • 女 nǚ – nữ ('female')
  • 婦女 fùnǚ – phụnữ ('woman')
  • 青年 qīngnián – thanhniên ('youth')
  • 祖國 zǔguó – tổquốc ('nation')
  • 江山 jiāngshān – giangsan ('country')
  • 家庭 jiātíng – giađình ('family')

The same pattern extends to modern concepts, many of which entered Vietnamese via Sino‑Japanese mediation during the late nineteenth and early twentieth centuries (Wang Li et al. 1956, p. 9). Representative examples include:

  • 政府 zhèngfǔ – chínhphủ ('government')
  • 自由 zìyóu – tựdo ('liberty')
  • 資本 zīběn – tưbản ('capital')
  • 投資 tóuzī – đầutư ('investment')
  • 經濟 jīngjì – kinhtế ('economics')
  • 階級 jiējí – giaicấp ('social class')
  • 心理學 xīnlǐxué – tâmlýhọc ('psychology')
  • 文人 wénrén – vănnhân ('literati')
  • 學者 xuézhě – họcgiả ('scholar')
  • 教堂 jiàotáng – giáođường ('church')
  • 大學 dàxué – đạihọc ('university')
  • 哲學 zhéxué – triếthọc ('philosophy')
  • 意識 yìshì – ýthức ('consciousness')
  • 相對 xiāngduì – tươngđối ('relative')  
  • 絕對 juéduì – tuyệtđối ('absolute')

The same principle applies when examining compounds derived from Chinese core syllabic stems, which in turn generate entire families of Vietnamese expressions. Representative examples include:

  • 再三 zàisān – haiba ('twice or thrice')
  • 三番兩次 sānfānliǎngcì – nămlầnbảylượt ('so many times')
  • 一而再, 再而三 yī'érzài, zài'érsān – mộtrồihai, hairồiba ('again and again')
  • 主日 Zhǔrì – Chủnhật ('Sunday') [also VS Chúanhật 'Day of God']
  • 周二 zhōu'èr – Thứhai ('Monday' in Vietnamese; Chinese 'Tuesday')
  • 周三 zhōusān – Thứba ('Tuesday' in Vietnamese; Chinese 'Wednesday')

Even more revealing are cases where Vietnamese diverges from the expected Sino‑Vietnamese reflex. For instance, 周年 zhōunián, which in literary usage denotes 'anniversary', survives in the vernacular as thôinôi, literally 'the time when a baby leaves the cradle'. This appears to be a corrupted form, created by popular mispronunciation of the learned word in earlier times. A related case is đầytháng, the 'baby's first‑month shower'. This is not equivalent to modern Mandarin 滿月 mǎnyuè ('full moon'), but rather reflects a contextual shift: the Vietnamese expression derives from the mother's full‑month recovery period, a practice still central in both Vietnamese and Chinese culture and known as điởcử or 坐月子 zuòyuèzi ('one‑month confinement after childbirth'). The Vietnamese form đầytháng is thus both phonetically plausible and culturally grounded.

Such examples could be multiplied. The semantic fields of 'birthday' (生日 shēngrì → sanhnhật), 'age' (歲數 suìshù → sốtuổi), or the cyclical designations of the zodiac – 'Year of the Goat' (屬羊 shǔyáng → tuổiDê), 'Year of the Rooster' (屬雞 shǔjī → tuổiGà) – all demonstrate the same pattern. Across domains both ancient and modern, Sino‑Vietnamese etyma and their vernacular counterparts stand side by side, matching one another with remarkable fidelity and continuing to shape the expressive resources of the Vietnamese language.

In Chinese, virtually every morphemic syllable – each character in a disyllabic formation – can in principle be used independently as a complete word. For this reason, Chinese and Japanese specialists often classify such items as disyllabic words, or binoms. In Vietnamese, however, many of the syllables that make up Sino‑Vietnamese binoms are not free to stand alone. They appear only in fixed combinations, bound to one another in ways that limit their independent use. The situation is comparable to English borrowings from Latin or Greek: words such as sociologist, geology, librarian, intersection, missionaries, or psychology contain recognizable elements like socio‑, geo‑, lib‑, inter‑, or psych‑, but these cannot be deployed in isolation as ordinary words.

In most cases, the Chinese syllabic morphemes that entered Vietnamese underwent further processes of transformation – innovation, adaptation, and localization – that reshaped them into a distinct lexical class. It is this class, conventionally termed Sinitic‑Vietnamese, that will occupy our attention in the following section.

II) Localized Sinitic-Vietnamese words

Let us review a random selection of Sinitic‑Vietnamese words.

  • buồng 房 fáng 'room'
  • đũa 箸 zhù 'chopsticks'
  • thìa 匙 chí 'spoon'
  • ăn 唵 ǎn 'eat' [cf. modern M 吃 chī 'eat']
  • uống 飲 yǐn 'drink' [cf. modern M 喝 hē 'drink']
  • đái 尿 niào 'urinate'
  • ỉa 屙 ē 'defecate' [also VS ốm 'ill']
  • đẻ 生 shēng 'give birth'
  • chạy 走 zǒu 'run'
  • đìa 池 chí 'pool'
  • bốmẹ 父母 fùmǔ 'parents'
  • chúbác 叔伯 shūbó 'uncles'
  • chịem 姊妹 jiěmèi 'sisters'
  • anhchị 兄姐 xiōngjiě 'siblings'
  • anhem 兄弟 xiōngdì 'brothers' [ancient VS anhtam]
  • cậumợ 舅母 jiùmǔ 'uncle and aunty'
  • buồngngủ 臥房 wòfáng 'bedroom'
  • bưngbít 矇蔽 méngbì 'hoodwink'
  • bênhvực 包庇 bāobì 'take side'
  • ngànhnghề 行業 hángyè 'profession'
  • trướctiên 首先 shǒuxiān 'firstly'
  • thươngyêu 疼愛 téng'ài 'loving'
  • dandíu 有染 yǒurǎn 'have an affair with'
  • thùhằn 仇恨 chóuhèn 'hatred'
  • tứcgiận 生氣 shēngqì 'anger'
  • chờđợi 期待 qídài 'expecting'
  • sânkhấu 劇場 jùchǎng 'stage'
  • trườnghọc 學堂 xuétáng 'school'
  • tầmbậy 三八 sānbā 'nonsense'
  • nóixàm 瞎說 xiāshuō 'talk nonsense'
  • giôngbão 暴風 bàofēng 'rainstorm'
  • đấtđai 土地 tǔdì 'land'
  • chốitừ 推辭 tuīcí 'refuse'
  • rútlui 退走 tuìzǒu 'withdraw'
  • lẽsống 理想 lǐxiǎng 'ideal'
  • căngthẳng 緊張 jǐnzhāng 'tense'
  • riêngtư 隱私 yǐnsī 'privacy'
  • chửimắng 咒罵 zhòumà 'scolding'
  • trongsạch 清潔 qīngjié 'clean, pure'
  • banngày 白日 báirì 'daytime'
  • bantrưa 白晝 báizhòu 'noon time'
  • chạngvạng 旁晚 pángwǎn 'dusk'
  • tốităm 黑暗 hēi'àn 'darkness'
  • quêhương 家鄉 jiāxiāng 'homeland'
  • lánggiềng 鄰居 línjū 'neighbor'
  • bầubạn 陪伴 péibàn 'accompany'
  • xơitái 吃生 chīshēng 'eat raw'
  • đánhcá 打魚 dǎyú 'fishing'
  • đánhbạc 賭博 dǔbó 'gambling'
  • ănthua 輸贏 shūyíng 'competing'
  • suônsẻ 順利 shùnlì 'smoothly'
  • hiếuthảo 孝順 xiàoshùn 'filial piety'
  • sẵnsàng 現成 xiànchéng 'ready'
  • bồihồi 徘徊 páihuái 'melancholy'
  • bắtcóc 綁架 bǎngjià 'kidnap'
  • hòhẹn 約會 yuēhuì 'dating'
  • tháovác 操持 cāochí 'manage'
  • côngcuộc 工作 gōngzuò 'task'
  • xinlỗi 道歉 dàoqiàn 'apologize'
  • xinchào 見過 jiànguò 'greeting'
  • tiềncủa 錢財 qiáncái 'wealth'
  • vốnliếng 本錢 běnqián 'capital'
  • đitiền 隨錢 suíqián 'give monetary gift'
  • cógiá 好價 hǎojià 'high‑priced goods'
  • củacải 財產 cáichǎn 'property'
  • đánhcắp 打劫 dǎjié 'rob'
  • ngâythơ 幼稚 yòuzhì 'naive'
  • khônlanh 靈巧 língqiǎo 'quick, intelligent'
  • lanhlợi 伶俐 línglì 'witty'
    and the wordlist can go on and on.

Like Sino‑Vietnamese, the class of Sinitic‑Vietnamese words also consists of Chinese loanwords rather than cognates of basic etyma. What distinguishes them is that they have been thoroughly localized, that is, completely Vietnamized in form and function. Some may even predate the Middle Chinese stratum. A case in point is VS buồngngủ, from 臥房 wòfáng (SV ngoạphòng), the source of later Sino‑Vietnamese formations such as 臥龍君 Wòlóngjūn (Ngoạlongquân, the title of King Lê Long Đỉnh). Many such items are full variations or modifications of original words, sometimes retaining their earlier meanings, sometimes diverging into new semantic territory.

In other instances, they appear as variants that have evolved from an original form, reshaped through different spellings and pronunciations as though coined from fresh material. Examples include lịchsự, from 歷事 lìshì ('experience'), now meaning 'polite', and tửtế, from 仔細 zǐxì ('meticulous'), now meaning 'kindness'. Their development is analogous to English pairs such as familial and familiar (cf. 慣 guàn, SV 'quán' versus VS quen 'accustomed' versus cưng 'pampered'), infant and infantile (兒 ér for VS nhí versus nhỏ), or road and route (路 lù for SV lộ versus lối). The semantic drift and nuance resemble that seen in English contrasts like coffee and café, blond and blondie, aerospace and airspace, grand and grandiose, entrance and entry, serpent and serpentine. Their phonological reshaping, meanwhile, recalls the multiple adaptations of foreign loanwords in English: pho, banhmi, chowmein, sushi, burrito, taco, kowtow, typhoon, kindergarten, wagon, vendor, agent, bourse, rendezvous, accord, regard, guard, résumé, exposé, mercy, pardon, à la carte, en masse, and many others.

The number of common Vietnamese words of Chinese origin extends far beyond the illustrative examples cited here. The purpose of the list reviewed above is to give readers a sense of the magnitude of Chinese influence on Vietnamese vocabulary, which surpasses the scope of the basic substratal lexicon. Even when compared with the more than four hundred fundamental Sino‑Tibetan cognates discussed in the previous chapter, the Sinitic layer is broader and more pervasive. Apart from later loanwords, a significant portion of fundamental vocabulary may have originated or evolved from shared roots – those of the aboriginal Taic peoples, often identified with the native Yue (百越 or BáchViệt or 'Bod'; see Lacouperie 1887) – in parallel with Chinese. These are native or aboriginal words, traceable to non‑Han languages spoken in southern China, now classified by comparative linguists as Austroasiatic. Vietnamese versions of many such indigenous forms have been repeatedly cited throughout this research:

  • sông 江 jiāng 'river'
  • 弩 nǔ 'crossbow' [cf. 拏 ná like 拿 ná (SV nã) VS lấy]
  • đường 糖 táng 'sugar'
  • dừa 椰 yē 'coconut'
  • chuối 蕉 jiāo 'banana'
  • xoài 檨 shē 'mango'
  • bưởi 柚 yòu 'pomelo'
  • chanh 橙 chéng 'lemon' [cf. modern Mandarin 檸檬 níngméng < Eng. 'lemon'; 橙 chéng denotes a citrus, SV camsành]
  • trầu 檳榔 bīngláng 'betel areca' [Note the interchange between Mandarin /bīngl-/ and ancient Viet‑Chamic /bl-/ > Vietnamese /tr-/]
  • mít 波羅蜜 bōluómì 'jackfruit'
  • sầuriêng 榴蓮 liúlián 'durian'

along with many others, such as 'lúa' 來 lái (paddy) ~ 'gạo' 稻 dào (rice), 'chó' 狗 gǒu (dog), 'cọp' 虎 hǔ (tiger), 'voi' 為 wéi (elephant), and 'gấu' 熊 xióng (bear), as cited in the previous chapter.

In any case, the list could be extended indefinitely; it is by no means confined to the examples already cited. The essential point is that no comparable analysis can be carried out with any of the Mon‑Khmer languages. Nor do they exhibit the peculiar linguistic traits that Vietnamese and Chinese share in morphology, phonetics, tonal organization, metaphorical idioms, and distinctive expressions. The two languages also converge in much of their grammar, including the use of classifiers, grammatical markers, prepositions, conjunctions, and other fine structural details. None of the Mon‑Khmer languages – apart from a few features that certain Miao dialects display, if one accepts Forrest's classification of them as Mon‑Khmer – shows even the slightest trace of such distinctive correspondences.

Conclusion

The magnitude of this influence cannot be overstated: Chinese loanwords permeate every register of Vietnamese, from the most elevated literary diction to the most ordinary household vocabulary. At the same time, the persistence of native Austroasiatic forms ensures that Vietnamese remains anchored in its own heritage. The interplay of these two forces – indigenous substratum and Sinitic superstratum – defines the language's character and explains its unique position in East and Southeast Asia.

  • Vietnamese and Chinese correspondences are neither incidental nor superficial.
  • Evidence points to a continuum of contact, adaptation, and integration.
  • Vietnamese identity emerges as hybrid yet distinctive: absorbing Chinese elements on a massive scale, reshaping them through its own phonology, tonal system, and cultural imagination.
  • Result: a lexicon where learned binoms coexist with vernacularized compounds, Tang‑era tonal matrices still resonate, and everyday expressions carry traces of Yue, Han, and local innovation alike.

This paper thus closes with a reframing of Vietnamese linguistic identity: not as a passive recipient of Chinese influence, but as an active participant in a centuries‑long process of borrowing, localization, and creative renewal.


References

  • Alves, Mark J. Identifying Early Sino‑Vietnamese Vocabulary via Linguistic, Historical, Archaeological, and Ethnological Data. 2017.

  • Alves, Mark J. Sino‑Vietnamese Grammatical Vocabulary and Sociolinguistic Conditions for Borrowing. 2005.

  • Karlgren, Bernhard. “The Sino‑Vietnamese Layer.” Bulletin of the Museum of Far Eastern Antiquities 22 (1954): 216.

  • Nguyễn Tài Cẩn. Ngữ âm lịch sử tiếng Việt [Historical Phonology of Vietnamese]. Hà Nội: Nhà xuất bản Khoa học Xã hội, 1979.

  • Wang Li et al. Hànyǔ Shǐgào [A Draft History of the Chinese Language]. Beijing: Zhonghua Shuju, 1956.

  • "Sino‑Vietnamese Vocabulary". Wikipedia. Accessed 2025.