Saturday, April 5, 2025

Chapter 7 - Hypothesis of Common Yue Origin of Vietnamese and Chinese

Executive Summary:

  1. Historical background

    Vietnamese history is inseparable from the long arc of Yue–Han contact. From the Red River Delta through successive dynasties, waves of migration, colonization, and cultural exchange layered the language with Sinitic elements. Archaeological finds such as Đông Sơn drums, genetic studies of Yue‑Han admixture, and the persistence of Yue cultural practices all point to a shared foundation between southern China and northern Vietnam. Over centuries, Vietnamese identity emerged not as an isolated branch but as a composite, shaped by Yue substrata, Han overlay, and later southward expansion into Chamic and Khmer territories.

  2. Core matter of Vietnamese etymology

    The etymological core of Vietnamese is not a single inheritance but a stratified lexicon. Sino‑Vietnamese readings, vernacular Sinitic‑Vietnamese forms, and native substrata coexist, often in doublets or alternates. What appears as 'pure Vietnamese' frequently reveals deeper connections: chim 'bird' with Mon‑Khmer parallels, cá 'fish' with Austroasiatic cognates, thỏ 'hare' with Chinese roots. The integrity of the language lies not in purity but in the interplay of these strata. Vietnamese etymology must therefore be studied as a layered system, where cultural history and linguistic borrowing are inseparable.

  3. Chinese and the Vietnamese basic vocabulary stock

    The Vietnamese lexicon is saturated with Chinese forms, from high literary registers to everyday speech. Many basic words—terms for kinship, body parts, natural elements, and daily activities—show clear Sinitic correspondences. These entered through multiple channels: formal Sino‑Vietnamese readings, colloquial borrowings from Yue and Minnan dialects, and shared Yue inheritance. Doublets such as sôngnúi # nonsông 江山 (jiāngshān, 'country') or âmthanh # thanhâm 聲音 (shēngyīn, 'sound') illustrate how both orders and multiple sources were absorbed. Without these Sinitic layers, modern Vietnamese would be stripped of much of its expressive power.

  4. A new disyllabic sound change approach to be explored

    The most revealing insight of this chapter is the role of disyllabicity. Vietnamese compounds often invert the order of their Chinese models, as in bắtnạt # 欺負 (qīfù, 'bully') or hồnthiêng # 靈魂 (línghún, 'spirit'). This inversion is not accidental but systematic, reflecting a stage when syllable order was still fluid. By testing reversed orders, scholars can uncover hidden cognates, reconstruct plausible etyma, and explain otherwise opaque forms. This disyllabic approach reframes Vietnamese etymology: instead of treating the language as a borrower of isolated monosyllables, it recognizes Vietnamese as a polysyllabic system that reshaped Chinese inputs according to its own rhythm and logic.
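    The reversed-order test described in item 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this chapter: the function name order_relation and the hand-segmented syllable lists are hypothetical, and the only data used is the doublet âmthanh # thanhâm 聲音 from item 3.

```python
def order_relation(vietnamese, model):
    """Compare the syllable order of a Vietnamese compound with a model reading.

    Both arguments are lists of syllables (hypothetical hand segmentation).
    Returns "same", "reversed", or "unrelated".
    """
    if vietnamese == model:
        return "same"
    if vietnamese == list(reversed(model)):
        return "reversed"
    return "unrelated"


# Doublet from the summary: âmthanh # thanhâm 聲音 ('sound')
print(order_relation(["âm", "thanh"], ["thanh", "âm"]))  # reversed
print(order_relation(["thanh", "âm"], ["thanh", "âm"]))  # same
```

    The sketch compares syllables by exact spelling, so it only detects inversion between forms that are already identical syllable by syllable. Pairs such as bắtnạt # 欺負 (qīfù) would first need a sound-correspondence comparison, which is deliberately left out here.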

x X x


This study applies a comparative-historical linguistic approach to examine the hypothesis of a common Yue origin for significant portions of the Vietnamese and Chinese lexicons. Historical records, archaeological evidence, and phonological reconstruction are brought together to trace lexical strata across time and space. The primary dataset consists of Sinitic-Vietnamese vocabulary, encompassing both formal Sino-Vietnamese readings and colloquial Vietnamese forms. Supplementary data include Mon-Khmer basic cognates whose relationship to a Yue origin remains unresolved. Lexical items are drawn from historical texts, modern usage, and dialectal sources in both Vietnamese and Chinese.

Sound correspondences between Vietnamese and Chinese forms are identified, with attention to regular shifts, loanword adaptation patterns, and the retention of archaic phonemes. Special focus is given to features preserved in Vietnamese that have been lost or altered in modern Chinese dialects. Lexical meaning is examined diachronically, noting cases of semantic retention, narrowing, broadening, or shift, and cultural context is considered, especially for terms tied to indigenous practices, folk concepts, and ceremonial vocabulary.

Linguistic data are situated within the historical timeline of Yue-Han contact, including periods of colonization, migration, and trade. Archaeological evidence such as inscriptions and artifacts is used to corroborate linguistic findings. Where possible, the study assesses whether a given lexical item is more plausibly a Vietnamese loan into Chinese, a Chinese loan into Vietnamese, or a shared inheritance from Yue, weighing phonological conservatism, semantic distribution, and historical plausibility. By combining philological rigor with comparative reconstruction, the analysis aims to clarify the depth and nature of Yue influence on Vietnamese and to situate that influence within the broader Sino-Tibetan and Mon-Khmer linguistic landscape.

Perpetually situated within China’s cultural and political orbit, Vietnam has navigated a delicate balance, acting as a subordinate state while striving to preserve its sovereignty. Unlike Japan or South Korea, which have decisively stepped out from China’s civilizational shadow, Vietnam has remained closely bound within its gravitational pull. These Sino-centric dynamics form the foundational framework for the hypothesis asserting the Chinese origin of the lexical stratum in Vietnamese commonly referred to as Sinitic-Vietnamese (VS). Across Vietnam’s long and turbulent history, the specter of Chinese invasion has recurred with a rhythm as familiar as seasonal illness, shaping a persistent backdrop of geopolitical tension between the two nations. Each Vietnamese generation has regarded these threats as tangible and pressing, particularly during periods when the northern empire expanded beyond its bounds.

Having established the historical and methodological foundations, the discussion now turns to the corpus itself. The following datasets present the lexical evidence on which this study’s arguments rest, organized to reveal both the depth and breadth of Sinitic-Vietnamese strata. Each entry is drawn from the primary and supplementary sources described above, and is aligned with its Modern Mandarin form, Middle Chinese reconstructions (Baxter, Pulleyblank), and Old Chinese reconstructions (Baxter-Sagart, Zhengzhang Shangfang), among others. Where relevant, parallels in other Chinese dialects, Mon-Khmer languages, and Sino-Tibetan languages are noted to illuminate phonological correspondences and semantic relationships.

The corpus is arranged to allow the reader to trace individual items across time and space, from their earliest attested forms to their modern reflexes in Vietnamese. This structure makes it possible to observe regular sound changes, loanword adaptation patterns, and the retention of archaic features. It also highlights semantic developments, including narrowing, broadening, or shift, and situates each item within its cultural and historical context.

By moving directly from the conceptual framework into the data, the reader can see how the comparative-historical approach operates in practice. The tables and annotations that follow are intended not only to document the lexical material but also to demonstrate the analytical process, and show how each item contributes to the larger picture of Yue influence on Vietnamese and the interplay between inherited vocabulary and loanwords.


I) Historical background

Table 1 – TIMELINE OF VIETNAM'S HISTORY*


For most of its history, the geographical boundary of present-day Vietnam covered three ethnically distinct nations: a Vietnamese nation, a Cham nation, and a part of the Khmer Empire.

The Viet nation originated in the Red River Delta in present-day Northern Vietnam and expanded over its history to the current boundary. It went through many name changes, with Văn Lang being used the longest. Below is a summary of names:

Period | Country Name | Time Frame | Boundary
BaiYue (Prehistoric Yue tribes) | | 2879-2524 B.C. | Stretching from the near bank of the Yangtze River to the southernmost area now called Quảng Trị, adjacent to Champa Kingdom, including the Yunnan, Kweichow, Hunan, Kwangsi and Kwangtung provinces of China.
Hồngbàng Dynasty | Vănlang | 2524-258 B.C. | Bordered to the east by the East Sea, to the west by Ba Thục (today Sichuan), to the north by Dongting Lake (Hunan), and to the south by Lake Tôn (Champa). The Red River Delta is the home of the LạcViệt culture.
Thục Dynasty | ÂuLạc | 257-207 B.C. | Red River delta and its adjoining north and west mountain regions.
Triệu Dynasty | NamViệt | 207-111 B.C. | ÂuLạc, Guangdong, and Guangxi.
Han Domination | Giaochỉ (Jiaozhi) | 111 B.C.-39 AD | Present-day north and north-central of Vietnam (southern border expanded down to the Ma River and Ca River delta), Guangdong, and Guangxi.
Trưng Sisters | Lĩnhnam | 40-43 | Present-day north and north-central of Vietnam (southern border expanded down to the Ma River and Ca River delta).
Han to Eastern Wu Domination | Giaochỉ | 43-229 | Present-day north and north-central of Vietnam (southern border expanded down to the Ma River and Ca River delta), Guangdong, and Guangxi.
Eastern Wu to Liang Domination | Giaochâu (Jiaozhou) | 229-544 | Same as above.
Anterior Lý Dynasty | Vạnxuân | 544-602 | Same as above.
Sui Domination | Giaochâu | 602-618 | Same as above.
Tang Domination | Annam | 618-866 | Same as above.
Tang Domination, Autonomy (Khúc family, Dương Đình Nghệ, and Kiều Công Tiễn), Ngô Dynasty | Tĩnh Hảiquân | 866-967 | Same as above.
Đinh, Anterior Lê and Lý Dynasty | ĐạicồViệt | 968-1054 | Same as above.
Lý and Trần Dynasty | ĐạiViệt | 1054-1400 | Southern border expanded down to present-day Huế area.
Hồ Dynasty | Đạingu | 1400–1407 | Same as above.
Ming Domination and Posterior Trần Dynasty | Giaochỉ | 1407–1427 | Same as above.
Lê, Mạc, TrịnhNguyễn Lords, Tâysơn Dynasty, Nguyễn Dynasty | ĐạiViệt | 1428-1804 | Gradually expanded to the boundary of present-day Vietnam.
Nguyễn Dynasty | Việtnam | 1804–1839 | Present-day Vietnam plus some occupied territories in Laos and Cambodia.
Nguyễn Dynasty | Đạinam | 1839–1887 | Same as above.
Nguyễn Dynasty and French Protectorate | French Indochina, consisting of Cochinchina (southern Vietnam), Annam (central Vietnam), Tonkin (northern Vietnam), Cambodia, and Laos | 1887–1945 | Present-day Vietnam, Laos, and Cambodia.
Republican Era | Việt Nam (with variances such as Democratic Republic, State of Vietnam, Republic of Vietnam, Socialist Republic) | Democratic Republic of Vietnam (1945–1976 in North Vietnam); State of Vietnam (1949–1955); Republic of Vietnam (1955–1975 in South Vietnam); Socialist Republic of Vietnam (1976–present) | Present-day Vietnam.

Almost all Vietnamese dynasties are named after the king's family name, unlike the Chinese dynasties, whose names are dictated by the dynasty founders and often used as the country's name.

The Hồngbàng Dynasty was a dynasty of the LạcViệt nation before recorded history. The Thục, Triệu, Anterior Lý, Ngô, Đinh, Anterior Lê, Lý, Trần, Hồ, Lê, Mạc, Tâysơn, and Nguyễn are usually regarded by historians as formal dynasties. Nguyễn Huệ's "Tâysơn Dynasty" is rather a name created by historians to avoid confusion with Nguyễn Ánh's Nguyễn Dynasty.


Historically, both prior to and following China’s domination of Annam, Chinese immigrants undeniably introduced Confucian culture into Vietnamese society. In Vietnam, numerous facets of daily life, ranging from customs and traditions to family names and place names, mirror their Chinese counterparts, often to the point of replication. Chinese culture has long been held in high esteem and practiced with such rigidity that changes, whether beneficial or detrimental, have occurred gradually. This stands in contrast to Japan and Korea, where Chinese festivals and holidays were decisively abandoned in favor of localized cultural identities.

For example, while the Lunar New Year Festival (Tết) and the Mid-Autumn Festival (Tết Trung Thu) have largely disappeared from public life in Korea and Japan, they continue to be passionately celebrated in Vietnam. These festivities are deeply intertwined with ancestral tomb-clearing rituals observed during the Spring and Winter Solstices, practices that remain firmly rooted in the Vietnamese cultural psyche.

However, other Chinese-origin festivals such as Tết Nguyên Tiêu (元宵節 'Lantern Festival') and Tết Đoan Ngọ (端午節 'Dragon Boat Festival') have seen a decline in observance, particularly due to the disruptions caused by prolonged warfare, most notably during the final decades of the 20th century. Despite this decline, there are clear signs of a cultural resurgence: these traditions have begun to reemerge in the early 21st century with renewed vigor and public enthusiasm.

In the wake of the 1979 border conflict with China, a wave of nationalism emerged in Vietnam. During this period, authorities attempted to recalibrate the national Lunar New Year Festival by aligning it with the Vietnamese lunar calendar in such a way that it would precede China’s celebration by one month. The result was that in early 1985 Vietnamese citizens ended up celebrating Tết twice, once in January according to the revised calendar and again in February in sync with China’s traditional date. Ironically, the second celebration was even more elaborate and joyous than the first. To appreciate the cultural magnitude of Tết in Vietnam, one need only consider that its duration and significance rival the combined festivities of Thanksgiving, Christmas, and New Year in the West.

Table 2 – Computing the Vietnamese lunar calendar

"1985 is one of the few years where Vietnamese and Chinese calendars differ significantly: the Vietnamese New Year was 1 month earlier than the Chinese one. The reason can be detected from the above table (informatik.uni-leipzig.de). The Winter Solstice 1984 falls on 12/21/1984 Hanoi time, but on 12/22/1984 Beijing time, the same day as the New Moon. The month 11 of the Chinese year must contain the Winter Solstice, so it is not the month from 11/23/1984 to 12/21/1984 like in the Vietnamese calendar, but the one starting 12/22/1984. Consequently, the subsequent months (12, 1,...) also start about one month later than the corresponding months of the Vietnamese calendar. While New Year in Vietnam falls on 1/21/1985, it is on 2/20/1985 in China. The two calendars agree again after a leap month is inserted to the Vietnamese calendar (month from 3/21/1985 to 4/19/1985, as seen above). Also, in year 1984 the Chinese lunar month from 11/23/1984 to 12/21/1984 is the first lunar month after Winter Solstice 1983 that does not contain a Major Term and is therefore a leap month."

In the 21st century "there are 3 years where the Lunar New Year begins at different dates in Vietnam and in China. In 2007 the Vietnamese New Year is on 2/17/2007, the Chinese one on 2/18/2007. In 2030 the dates are 2/2/2030 and 2/3/2030, and in 2053 they are 2/18/2053 and 2/19/2053."

Source: http://www.informatik.uni-leipzig.de/~duc/amlich/calrules_en.html
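
The time-zone effect described in the quoted passage can be checked with a short Python sketch. It rests on two assumptions that are not taken from the source: the solstice instant is set to an illustrative 16:23 UTC on 21 December 1984 (any instant between 16:00 and 17:00 UTC gives the same two dates), and "Hanoi time" and "Beijing time" are modeled as the fixed offsets UTC+7 and UTC+8. The helper name local_date is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed, illustrative instant of the December 1984 solstice (not from the source).
SOLSTICE_UTC = datetime(1984, 12, 21, 16, 23, tzinfo=timezone.utc)

# Assumed fixed offsets standing in for "Hanoi time" and "Beijing time".
HANOI = timezone(timedelta(hours=7))
BEIJING = timezone(timedelta(hours=8))


def local_date(instant, tz):
    """Return the civil date on which an instant falls in a given time zone."""
    return instant.astimezone(tz).date()


print(local_date(SOLSTICE_UTC, HANOI))    # 1984-12-21
print(local_date(SOLSTICE_UTC, BEIJING))  # 1984-12-22
```

Because the same astronomical event lands on two different civil dates, the rule that month 11 must contain the Winter Solstice assigns it to different lunar months in the two calendars, which is the one-month gap the passage describes.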

Over time, Vietnamese cultural and linguistic elements have decisively supplanted the indigenous Chamic and Khmer characteristics that once defined the central territories of the now-vanished Champa and Khmer kingdoms—regions annexed by Việtnam in the relatively recent past. Strikingly, many of their placenames are direct duplicates of those found in the old Middle Kingdom.

With the exception of Vietnam’s northwestern provinces—where native toponyms reflect the languages of indigenous minority groups who remain the demographic majority—most other regions have undergone sweeping renaming. Late resettlers from northern Vietnam, including migrants from southern regions of China, gradually replaced local placenames from north to south. This transformation extends from the current northern central territory at the 16th parallel all the way to the southernmost tip of Càmau Province, facing the Gulf of Thailand, spanning more than 3,260 kilometers of coastline.

Illustrative examples include

  • Tháinguyên (Tàiyuán 太原) 
  • Sơntây (Shānxī 山西) 
  • Hànội (Hénèi 河內) 
  • Hànam (Hénán 河南) 
  • Hàbắc (Héběi 河北) 
  • Hàđông (Hédōng 河東) 
  • Hàtây (Héxī 河西) 
  • Trùngkhánh (Chóngqìng 重慶) 
  • Tràngan (Cháng’ān 長安) — also known as Trườngyên, both used for the 10th-century capital during the Lý Dynasty and still applied to Hànội in early 20th-century usage 
  • Bắcninh and Tâyninh (Běiníng 北寧 ‘Pacified North’ and Xīníng 西寧 ‘Pacified West’), paralleling Xīníng in Qīnghǎi and contrasting with Nánníng 南寧 ‘Pacified South’ in Guǎngxī
  • Thuậnhoá (Shùnhuà 順化)
  • Quảngnam (Guǎngnán 廣南 ‘Greater South’), in contrast to Guǎngdōng 廣東 ‘Greater East’ and Guǎngxī 廣西 ‘Greater West’

Even ostensibly native Vietnamese terms often reveal Sino-Vietnamese roots. For example:

  • Kẻchợ (Jīngchéng 京城, SV kinhđô), meaning 'Capital', comparable to Japanese Keijō
  • Đànẵng, historically transcribed as Xiàngǎng 峴港 (SV Hiệncảng), which was pronounced Kẻon by Fukienese and Hainanese communities as early as the 18th century

Beyond major cities, countless townships and villages bear names constructed from Sino–Vietnamese elements, such as

  • Hoàihương (Huáixiāng 懷鄉) 
  • Bồngsơn (Péngshān 蓬山) 
  • Bìnhtân (Píngjīn 平津)
  • Longan (Lóng’ān 隆安) 
  • Gianghĩa (Jiāyì 嘉義) 
  • Longxuyên (Lóngchuān 龍川)

This naming convention mirrors colonial practices in the United States, where English placenames were transplanted to the East Coast, e.g., New England, New York, New Hampshire.

The number of Chinese placenames used in Vietnam is virtually incalculable. In addition to inherited names, new ones have been coined using Sino-Vietnamese vocabulary, particularly in territories acquired from the Champa and Khmer kingdoms as recently as the 18th century. These names evoke a nostalgic familiarity with HánViệt tradition while still retaining traces of aboriginal identity. Examples include Quynhơn, Nhatrang, Phanrang, Sóctrăng, each reflecting the southward expansion of early Vietnamese migrants over the past few centuries.




Figure 1 – Map of the ancient Kingdom of Champa (Campadesa - 2nd to 18th century)


The territory of Champa, depicted in green, lay along the coast of present-day southern Vietnam. To the north (in yellow) lay ĐạiViệt; to the west (in blue), Angkor.
(Source: https://en.wikipedia.org/wiki/Champa)

Archaeological excavations in Việtnam have unearthed bronze drums, most notably the Đôngsơn and Ngọclữ types, buried deep beneath thick layers of earth. These artifacts reflect a highly advanced metallurgical tradition, yet the modern Vietnamese inhabitants living atop these layers had no prior knowledge of their existence. Nevertheless, they claim descent from the creators of these drums. The decorative motifs etched onto the surface and rim depicting wooden boats and long-feathered birds are attributed to the LạcViệt 鵅越 (LuóYuè) people and closely resemble designs found on bronze drums still used by the Zhuang ethnic group, known in Vietnamese as Nùng.

The Zhuang are the largest minority in southern China’s Quảngtây Autonomous Region, numbering over 18 million, not including those residing in Vietnam’s northern highlands. Unlike the forgotten relics buried in Vietnamese soil, Zhuang communities continue to use these drums in ritual and ceremonial contexts, preserving a cultural continuity that suggests direct descent from the original artisans. In contrast, the self-identified Yue descendants, namely, the Vietnamese, appear disconnected from the spiritual and technical heritage embedded in these artifacts.

This disconnect may stem from centuries of warfare that fractured aboriginal cultural links. The survival of bronze drums in Vietnam is likely due to their burial, which spared them from the Han Dynasty’s widespread melt-down policies. Despite the enduring bronze drum subculture in China South, which evidences Yue cultural roots in ancient Annam, prolonged Chinese rule contributed to the extinction of Yue metallurgical knowledge. Following Han domination, waves of Chinese immigrants accelerated the colonization of Annam.

Vietnamese archaeologists have claimed ownership of these relics, asserting they belong to native Yue ancestors who once inhabited northern Việtnam. However, the Bronze Age predates the emergence of the Annamese, an ethnic amalgam of Yue and Han. Thus, the terms Annamese or Vietnamese denote evolving indigenous identities shaped by successive Han migrations. This mirrors how the term Sinitic came to represent the broader concept of “Chinese.”

Claims by overzealous Vietnamese nationalist scholars regarding artifact origins in southern Vietnam are historically tenuous. The assertion that these relics were created by "Vietnamese ancestors" is unfounded, as the region was only annexed in the late 18th century. Cultural artifacts found there belonged to the ancient Chamic and Khmer civilizations. As recently as five centuries ago, the border of ĐạiViệt ended at present-day Thanhhoá Province. The Chamic kingdoms to the south had ruled the area for over a millennium under hereditary monarchies. Only after their decline in the 13th century did Annamese dynasties begin territorial expansion. From that point, Kinh settlers migrated en masse beyond Thuậnhoá Province, where the capital Huế was established in the early 19th century. By then, Việtnam had extended its reach to the southern tip of Càmau Province, facing the Gulf of Thailand.

Anthropological evidence from the past six decades supports the hypothesis that Taic aboriginals, ancestors of the Yue, intermixed with nomadic Tibetan-origin peoples to form the proto-Chinese population. These pre-Sinitic groups migrated from infertile northwestern regions toward the fertile lands of China South as early as 4,000 years ago (see Shifan Peng, 1987). The terms Dai and Tai are often used interchangeably, with Taiwanese scholars in the 1960s using 臺 for Tai (Ding Bangxin, 1977) and 傣 for Dai.

Over millennia, facing invasions by northern Tartarian horsemen, various Yue tribes (BáchViệt 百越) fled southward, eventually reaching the Indo-Chinese peninsula, including northeastern Myanmar, southeastern India, Thailand, and Càmau Cape (formerly Ttœ̆kkhmau, 'Black Ink'), Cambodia. This migratory pattern aligns with Austroasiatic linguistic theories, linking Yue ancestry to groups like the Munda, Mon-Khmer, and Chamic branches of Austroasiatic and Austronesian languages. Shared lexical items, such as Khmer numbers 1–5 or Chamic demonstratives, support this connection.

Historically, the Khmer and Chamic peoples founded two of Southeast Asia’s most powerful kingdoms: the Khmer Empire and the Champa Kingdom. Genetic studies link the ancient Chamic people to the Li ethnic group of Hainan Island. These Li may not have known that their southern cousins built a kingdom lasting over 1,600 years, from the 2nd century to 1832, recorded in Chinese history as Lâmấp (林邑) and Chiêmthành (占婆國).

As for Austroasiatic linguistic influence on Vietnamese, its roots lie deep in aboriginal substrata. Basic lexical remnants, such as 'cá' (fish) from OC */nga/ and 'mắt' (eye) from OC */mukw8/, appear across regional languages: Khmer /ka:/, Proto-Austroasiatic /*ka/, and Malay 'mata'. These shared forms suggest that Khmer and Chamic lexicons are embedded in Vietnamese, classified under the Austroasiatic Mon-Khmer family. Yet this phenomenon likely reflects the result of geographic contact and lexical diffusion, rather than direct lineage.

For example, Chinese zodiac animal names were first adopted by pre-Qin-Han peoples and later integrated into Vietnamese with localized forms: SV 'tý' 子 zǐ (VS chuột), 'sửu' 丑 chǒu (trâu), 'dần' 寅 yín (cọp = chằn), 'mão' 卯 mǎo (mẹo = mèo), and so on (see An Chi, Rong chơi Miền Chữ nghĩa, 2016, Vol.1, pp. 80–86, 159–183).

Ethnically speaking, it is unsurprising that in 21st-century Vietnam, whose population officially comprises 54 recognized ethnic groups, any citizen can trace their ancestry to one of these groups. This is especially plausible when considering historical and geographic factors. For instance, native archaeologists born in Sahuỳnh, a region once part of the Champa kingdom and annexed by the Trần Dynasty in the 13th century, may proudly assert that the cultural artifacts of the Sahuỳnh Civilization unearthed in their homeland were created by their ancestors, regardless of whether those ancestors were originally indigenous.

However, from a strict national perspective, it is inaccurate to claim that the creators of these artifacts were truly ancestors of the Vietnamese people. The Kinh majority, now 85.32% of the population and distinct from the ethnic minorities, settled near these archaeological sites only gradually over the course of nearly 1,000 years, following a southward migratory trajectory from present-day Thanhhoá Province. This long historical movement complicates any direct ancestral claims and underscores the layered and composite nature of Vietnamese identity.

The core issue discussed above pertains specifically to Vietnamese nationals identified as of Kinh ethnicity in the most recent census, including those whose family lineage traces back to ancestors born and raised in regions where archaeological artifacts have been discovered. Such claims are only justifiable within a historical timeline that aligns with the full annexation of lands following the decline of the Champa kingdom or Khmer empire. Individuals of Chamic or Khmer descent may rightly assert ancestral ties to these artifacts. The author himself was born in the Champa region in central Vietnam, as were his parents; however, cultural relics excavated from that area do not belong to long-deceased native artisans from his paternal line. Considering the age of these artifacts, it would be historically inaccurate to claim descent from their original creators, especially when the author's paternal ancestors migrated from China as recently as the 19th century.

As previously emphasized, the Kinh people, whose early ancestors formed the demographic majority, emerged from a genetic amalgam symbolized here as {4Y6Z8H}, representing grafted Yue-Han lineages rooted in Taic ancestry. Anthropologically, Han dynasty Chinese, descendants of ancient Chu and other Yue states, invaded Annam and intermingled with indigenous populations in the Red River Basin, many of whom also traced their origins to China South. Over the past 900 years, Vietnamese forebears replicated this migratory pattern, gradually expanding southward. The final leg of this journey brought the Kinh majority to Càmau Cape at the southern tip of the Indochinese peninsula. Thus, the rightful cultural ownership of artifacts found in central and southern Vietnam depends heavily on the time period in which those relics were created, represented here as {4Y6Z8H+CMK} versus {CMK}.

When discussing "roots", we return to the biological foundation, genomes that manifest in physical traits such as appearance and complexion. To the untrained eye, even Western observers may struggle to distinguish Vietnamese individuals from Chinese in mixed groups, such as second-generation students in American institutions. Similarly, from a linguistic standpoint, while Westerners may easily differentiate Vietnamese from Mon-Khmer languages, they often find it difficult to distinguish between Cantonese and Vietnamese speakers. This is due to the fact that Vietnamese functions more as a Sino-xenic language, Yue-based but infused with extensive Sinitic elements, than as a Mon-Khmer tongue.

It is no secret that Vietnamese shares a substantial portion of its vocabulary with Chinese, more so than with any other linguistic source. These shared features stem from deep historical imprints left by ancient Chinese forms and regional dialects. Why do their speech patterns resemble each other so closely? Evidence suggests a biological connection. In terms of genetic affiliation with neighboring populations in China South, advances in DNA biotechnology are poised to help anthropologists uncover more precise genetic data regarding the Vietnamese people’s composition, herein symbolized as {4Y6Z8H+CMK}.

On the genetic side, new scientific studies are now readily available online. The abstract quoted in the textbox below, from https://www.ncbi.nlm.nih.gov/, is one example.

Table 3 - HLA-DR and -DQB1 DNA polymorphisms in a Vietnamese Kinh population from Hanoi.


Vu-Trieu A, Djoulah S, Tran-Thi C, Nguyen-Thanh T[sic], Le Monnier De Gouville I, Hors J, Sanchez-Mazas A.
Source: Department of Immunology and Physiopathology, Medical College of Hanoi, Vietnam.

Abstract:

We report here the DNA polymerase chain reaction sequence-specific oligonucleotide (PCR-SSO) typing of the HLA-DR B1, B3, B4, B5 and DQB1 loci for a sample of 103 Vietnamese Kinh from Hanoi, and compare their allele and haplotype frequencies to other East Asiatic and Oceanian populations studied during the 11th and 12th International HLA Workshops. The Kinh exhibit some very high-frequency alleles both at DRB1 (1202, which has been confirmed by DNA sequencing, and 0901) and DQB1 (0301, 03032, 0501) loci, which make them one of the most homogeneous population tested so far for HLA class II in East Asia. Three haplotypes account for almost 50% of the total haplotype frequencies in the Vietnamese. The most frequent haplotype is HLA-DRB1*1202-DRB3*0301-DQB1*0301 (28%), which is also predominant in Southern Chinese, Micronesians and Javanese. On the other hand, DRB1*1201 (frequent in the Pacific) is virtually absent in the Vietnamese. The second most frequent haplotype is DRB1*0901-DRB4*01011-DQB1*03032 (14%), which is also commonly observed in Chinese populations from different origins, but with a different accessory chain (DRB4*0301) in most ethnic groups. Genetic distances computed for a set of Asiatic and Oceanian populations tested for DRB1 and DQB1 and their significance indicate that the Vietnamese are close to the Thai, and to the Chinese from different locations. These results, which are in agreement with archaeological and linguistic evidence, contribute to a better understanding of the origin of the Vietnamese population, which has until now not been clear.
PMID:9442802[PubMed - indexed for MEDLINE]

Source: http://www.ncbi.nlm.nih.gov/pubmed/9442802

Since the first chapter, the author has gone to great lengths to substantiate the hypothesis that today's Vietnamese Kinh population derives from a mixed stock, as does their language, a result of the proto-Chinese moving into China South from the southwest hundreds of years before the Western Han period (206 B.C.). Over the following centuries, the new racially mixed populace continued to emigrate southward on a larger scale into what is now northern Vietnam, part of which was later annexed into the Han empire. The prehistoric archaeological evidence is as follows:

Table 4 - Affinities of the Mạnbắc people to later early Metal Age Đôngsơn Vietnamese


The excavation of the Man Bac site (c. 3800–3500 years BP) in Ninh Binh Province, Northern Vietnam, yielded a large mortuary assemblage. A total of 31 inhumations were recovered during the 2004–2005 excavation. Multivariate comparisons using cranial and dental metrics demonstrated close affinities of the Man Bac people to later early Metal Age Dong Son Vietnamese and early and modern samples from southern China including the Neolithic to Western Han period samples from the Yangtze Basin. In contrast, large morphological gaps were found between the Man Bac people, except for a single individual, and the other earlier prehistoric Vietnamese samples represented by Hoabinhian and early Neolithic Bac Son and Da But cultural contexts. These findings suggest the initial appearance of immigrants in northern Vietnam, who were biologically related to pre- or early historic population stocks in northern or eastern peripheral areas, including Southern China. The Man Bac skeletons support the ‘two-layer’ hypothesis in discussions pertaining to the population history of Southeast Asia. 
(See Morphometric affinity of the late Neolithic human remains from Man Bac, Ninh Binh Province, Vietnam: key skeletons with which to debate the ‘two layer’ hypothesis, co-authored by Hirofumi MATSUMURA, Marc F. OXENHAM, Yukio DODO, Kate DOMETT, Nguyen Kim THUY, Nguyen Lan CUONG, Nguyen Kim DUNG, Damien HUFFER, Mariko YAMAGATA (2007) at http://www.jstage.jst.go.jp/article/ase/116/2/135/_pdf)

The findings above support not only the theory behind the formation of Vietnam’s dominant Kinh population, but also reflect broader ethnological patterns observed across China South and Southeast Asia. Over thousands of years, successive waves of Chinese migrants, some of Altaic origin from China North, including eastern Hakka groups, migrated southward, gradually displacing native populations in regions such as Mạnbắc of Ninhbình Province.

Interestingly, among these early resettlers, one of the most picturesque sites in the province where water and mountains converge was named 'Tràngan', a name echoing Chang’an (長安), the ancient capital of imperial China. This makes Tràngan the second location in Vietnam, alongside Hanoi, to bear a name directly inspired by the historical heart of the Middle Kingdom. (It is worth noting that the original Chang’an is now known as Xi’an, located in Shaanxi Province, China, and served as the capital during the Tang dynasty.)

This migratory process may have occurred at various points over the past 3,800 years. According to prevailing hypotheses, the racial composition of today’s Vietnamese Kinh people reflects a blend of displaced Chinese migrants from both China North and China South. The latter are considered descendants of the BaiYue 百越, or BáchViệt, collectively known as the Yue 粵 (also 越, 鉞 in ancient records), encompassing ethnic groups such as Dai 傣 (VS Tày), Zhuang 壯 (Bouxcueng, VS Nùng), Tong 垌, Shui 水, Maonan 毛南 (Môn), Miao 苗 (Mèo, Hmong), and other southern minorities.

The ancestors of the Yue people are believed to have descended from the Taic people (原始 傣族) prior to 3000 B.C. These Taic groups were not only progenitors of the modern Dai people, now found in Yunnan, Guangxi, Thailand, Laos, and northern Vietnam, but also populated early states such as Zhou 周朝, Chu 楚國, and the pre-Qin-Han polities. Over generations, these populations intermingled with other ethnic groups during the Warring States period, and later with Qin 秦 and Western Han 西漢 peoples, forming the Han majority and the broader Chinese national identity. Many of their descendants, including native minorities, still reside in these regions today (see Xu Liting, 1981).

Before the rise of the Yue, especially those of Chu and NamViet polities, and Altaic Turkic groups, the Taic people were dominant across vast territories in China South. Their lands stretched along both banks of the Yangtze River, extending eastward to the East China Sea and southward to provinces such as Yunnan, Sichuan, Guangxi, Guizhou, Hubei, Hunan, Jiangxi, Guangdong, Fujian, and Jiangsu, including northern Vietnam.

Populations of the Eastern Zhou period (770-221 B.C.) are thought to be racially mixed, combining Taic ancestry with earlier Shang Dynasty (1600-1050 B.C.) and Western Zhou (1046-771 B.C.) peoples. These groups may have emerged from interactions between Tibetan nomads and proto-Taic communities during the Xia Dynasty. The Qin State (778-222 B.C.), the most powerful among the Warring States, absorbed many of these populations. Continuous warfare with Yin invaders and other rival states disrupted agricultural life for Yue communities along the southern Yangtze, forcing many to flee southward in search of safety and sustenance.

Terrien de Lacouperie (1887), in The Languages of China Before the Chinese, examined the linguistic heritage of pre-Chinese races. He cited Mencius (孟子 Mengzi, 4th century B.C.), who noted the distinct shrillness of the Chu language compared to that of Qi (齊 of Shandong). In the Zuozhuan (左傳) chronicle (663 B.C.), a Chu child named 'Tou-wutu', whose name combined words for "suckling" ('Tou' or 'nou') and "tiger" ('wutu' 於虎兔), was saved and nursed by a tigress, later becoming Tze-wen, a minister of Chu. The terms "tou" and "wutu" reflect Taic-Shan vocabulary: "dut" in Siamese means "suckle", and "htso", "tso", or "su" refer to "tiger". In Vietnamese, similar expressions exist: 'cọp đút', 'hùm đút', or 'hổ đút'. Though these forms have decayed over time, they persist in the Tchungkia dialects of Jiangxi (江西), in ancient Chu proper, which resemble Taic-Shan speech "to such an extent that Siamese-speaking travelers could without much difficulty understand it." The Erya (爾雅) dictionary contains 928 regional loanwords, many transcribed from Taic-Chu languages using Chinese homonyms. These linguistic remnants suggest that Taic and Yue languages predate Chinese.

During the rise and fall of the Qin Empire (221-206 B.C.), many Yue natives were absorbed into the unified state. The new entity known as either Qin, Chin, Chine, or Chinese, comprised racially mixed populations from conquered northeastern and southeastern territories, roughly equivalent to half of modern China.

After Qin’s collapse, the Chu State was defeated by Liu Bang, the Viceroy of Hanzhong, who founded the Han Empire (206 B.C.). The term "Han" derives from this lineage. Although the Han Chinese identity formally emerged after 206 B.C., it is applied retroactively to earlier populations.

Following the Han conquest of the NanYue Kingdom in 111 B.C., colonization of Annam (Giaochâu Prefecture) continued through successive dynasties until 939 A.D. The Sinicization of native Yue peoples, ancestors of today’s Guangxi and Guangdong inhabitants, including those of Giaochỉ Prefecture, accelerated. The formation of the Middle Kingdom can be viewed as the birth of a "Chinese Union".

Ironically, the same expansionist model later applied by Chinese dynasties was replicated by Annamese monarchs. From their base in the Red River Basin, they expanded westward and southward, encountering Austronesian and Austroasiatic speakers, namely, Chamic and Mon-Khmer peoples. Highland minorities who resisted assimilation were labeled Mọi, or "barbarians", echoing the Chinese term Man (蠻 SV Man) for non-Han peoples.

This raises a historical contradiction: if Mon-Khmer groups shared racial ties with Vietnamese Kinh (京), why were they subjected to widespread discrimination? In contrast, northern minorities of Dai or Yue origin received relatively fair treatment. Chinese immigrants, culturally distinct, were often treated favorably (華) and typically assimilated into Vietnamese society within one or two generations.

Today, wherever Vietnamese communities exist globally, individuals can attest to the subtle but pervasive influence of Chinese heritage in Vietnamese identity. It is no surprise, then, that what is "Vietnamese" often feels unmistakably "Chinese."

By now, it is clear that after Annam was established as a prefecture of China, it was administered in much the same way as Fujian and Guangdong provinces. This arrangement persisted until its formal independence in 939. The name 'Vietnam' did not appear in international usage until as late as 1920. The ancient Annamese entity emerged through waves of immigration from China South, under the expansive reach of the Middle Kingdom.

Initially, many of the newcomers to Annam were long-march soldiers, exhausted from continuous campaigns of conquest and pacification. These were followed by émigrés, including large numbers of disgraced political exiles and their families, who had been purged by volatile dynasties they once served (Bo Yang, 1983-1993) (南). Alongside them came a broader influx of resettlers: impoverished peasants fleeing war, famine, and oppression in their native provinces. For most, the journey south was one-way; few ever returned to their homeland.

As previously noted, the decision to settle permanently in what is now northern Vietnam may have stemmed from a deeply rooted migratory impulse. Many of these men married, or were married into, local indigenous families, forming new kinship ties with native wives. This pattern of integration mirrors similar cases observed in modern Taiwan, where migration and intermarriage have likewise shaped cultural identity.

Anyone familiar with Chinese history knows that Chinese immigrants almost never return to their native birth villages. This "melting pot" dynamic also supports the observation that virtually all Vietnamese bear Chinese surnames.

Geographically, for the same reason, most of the placenames where earlier settlers and the Kinh have lived correspond directly to names of places in China. Examples include Sơntây (山西 Shānxī), Hànội (河內 Hénèi), Hàđông (河東 Hédōng), Hànam (河南 Hénán), Thuậnhoá (順化 Shùnhuà), Quảngnam (廣南 Guăngnán), Tâyninh (西寧 Xīníng), Bìnhtân (平津 Píngjīn), Longan (隆安 Lóng'ān), and Gianghĩa (嘉義 Jiāyì), among others.

Over many years and generations, those early immigrants, whether long resettled from regions further north or arriving after the Han Chinese expansion to the south, were fully assimilated into the highly Sinicized Annamese society. This was especially true for southern Chinese groups such as Hakka (客家 Hẹ), Cantonese (廣東 Quảngđông), Hainanese (海南 Hảinam), Fukienese (福建 Phúckiến), and Tchiewchow (潮州 Triềuchâu). Their assimilation may have occurred slowly but steadily, one generation at a time, each eventually merging into the melting pot of racially mixed people identified in official census records as the Kinh ethnicity.

Together, they form the Vietnamese nationals alongside more than 50 other major ethnic groups, such as the Miao (苗 Hmong, or 'Mèo'), Zhuang (壯 Nùng), Dai (傣 Tày), Tai Noir (黑傣 Tháiđen), Tai Blanc (白傣 Tháitrắng), Chamic (占婆 Chămpa), and Khmer (高棉 Caomiên).

Chinese immigrants brought with them their culture and dialects, which infused fresh colloquial elements into the early Vietic language, supplementing the more prestigious Mandarin lingua franca spoken by ruling officials. This process gave rise to both Sinitic-Vietnamese and Sino-Vietnamese. While there are studies analyzing the transformation of Middle Chinese phonology into Sino-Vietnamese and reconstructing its possible sound system (Nguyễn Tài Cẩn, 1979), there is still no comprehensive research on how both official Mandarin and its vernacular forms penetrated ancient Annamese languages, leading to the development of Sinitic-Vietnamese vocabulary after a millennium of Han domination.

This long process shaped both the semantic and phonological aspects of Sinitic-Vietnamese words of Chinese origin. In modern Vietnamese, there are disyllabic forms and common expressions that can be traced to early Mandarin, for example:

  • 'bậnviệc' = 忙活 mánghuó (busy)
  • 'bưngbít' = 蒙蔽 méngbì (hoodwink)
  • 'mắcbịnh' = 犯病 fànbìng (get sick)
  • 'ănmày' = 要飯 yàofàn (beggar)

The presence of these etyma suggests the deep influence of ancient Chinese on Vietnamese by the end of the Tang Dynasty (907 A.D.). Etymologically, Sinitic-Vietnamese cognates with basic words in Amoy (廈門 Xiàmén) and Cantonese indicate a shared aboriginal linguistic substratum, likely of Taic origin. Examples include Hainanese and Amoy /bat7/ 'biết' (know), /kẽ/ 'con' (child), /suã/ 'soài' (mango), and Cantonese /t'aj3/ 'thấy' (see), /lei2/ 'lưỡi' (tongue), /o5/ 'ỉa' (poo).

Vietnamese grammar also reveals a distinctive local word order, probably inherited from the aboriginal Yue language, in which adjectives (modifiers) follow nouns (the modified). For example, 'gàcồ' 雞公 (rooster). This reverse word order, similar to that of the Zhuang (Nùng) and Dai (Tày), is a clear inheritance from the original Taic speech. Later Old Chinese grammatical forms of this type can still be found in some southern Chinese dialects, such as Cantonese, Amoy, and Hainanese (/kaj1kong1/).

Chinese academic institutions officially classify these latter linguistic groups as Chinese dialects because their large Sinitic vocabularies and other linguistic features, including grammar and tonality, are largely on par, both semantically and phonologically, with those of Chinese. (門)

While Cantonese and Fukienese speakers within their own regions continued to be increasingly Sinicized under the influence of the Sino-sphere across successive Chinese dynasties from ancient times to the present, the Annamese people and their independent state, by contrast, have for the past 1,200 years charted their own course and forged a distinct identity. They were fortunate to avoid the fate of their ancient Fukienese and Cantonese neighbors, who became fully Sinicized under northern rulers without even realizing it. As a result, despite sharing certain parallels in historical development during antiquity, the Chinese language never fully supplanted the core syntactic structure of the Vietnamese language. This is why Vietnamese did not become a Chinese dialect, even though the integration of northern Chinese diasporas, such as colonial officers and their accompanying military forces, into the Annamese majority took place over many centuries.

Cultural factors, such as a shared foundation in Confucian values, undoubtedly eased the integration of Chinese immigrants into the Annamese social fabric, which readily absorbed new settlers. Linguistic similarities in the host country further accelerated their assimilation into their new homeland. Without understanding this anthropological accelerant of how the children of Chinese immigrants became members of the Kinh majority, we cannot fully grasp the true nature of both the origins of the Vietnamese language and the identity of its speakers.

By contrast, Chinese immigrants who settled in other Asian countries often experienced a different process of adaptation, one heavily dependent on the generosity and tolerance of the host nation. In many cases, their assimilation into the host culture, language, and society was far less complete than in Vietnam. Nowhere else in Austroasiatic- or Austronesian-speaking regions can we observe a process quite like that in Vietnam. For example, in Malaysia, Indonesia, and even in Confucian societies such as Japan and Korea, Chinese minorities, descendants of immigrants who arrived many generations ago, have often faced persistent discrimination.

In fact, demographic statistics show that in Southeast Asian countries such as Malaysia and Indonesia, the proportion of ethnic Chinese is significantly higher than in Vietnam, where it stood at a mere 1% in the official 2009 census. This striking fact strongly supports the conclusion that Chinese immigrants in Vietnam have been almost entirely assimilated. It raises the question: "What has actually happened to so many of those earlier Chinese immigrants in Vietnam, now that they seem to have disappeared from census data?" The evident answer is, "They have already become an integrated part of the Kinh populace."

Figure 5 – Vietnam's territorial expansion


Map of Vietnam showing the conquest of the south (the Namtiến, 1069-1757).

Orange: Before the 11th century. Yellow: 11th century. Light Green: 15th century. Dark Green: 16th century. Purple: 18th century. Laichâu and Điệnbiên (the Northwest): 19th century.
(Source: https://en.wikipedia.org/wiki/History_of_Vietnam)

In the 21st century, a troubling trend has emerged in Vietnam, marked by both the growth and ethnic segregation of non-immigrant Chinese workers—a development that recalls unsettling parallels from the past. Over the last two decades, new Chinese resettlements have appeared to accommodate recently arrived laborers, most of them poor migrants from remote villages in China who often overstay their work visas. These communities, resembling new 'Chinatowns,' have sprung up across Vietnam, typically clustered around Chinese-invested overseas economic zones, many of which operate under 50- or 90-year leases for mines, forests, and land. The anti-China riots of mid-May 2014, which spread across hundreds of Chinese-owned factories in Vietnam and resulted in widespread arson and 26 deaths, were speculated by some to have been staged events orchestrated by Chinese agents to escalate tensions and justify more aggressive actions in Vietnam. This episode echoed earlier events, most notably the expulsion of the Hoa ethnic community, which ultimately contributed to the outbreak of the Sino-Vietnam border war in 1979.




The history of the formation of Vietnam’s Kinh people is essentially the story of the integration of Chinese immigrants into Vietnamese society, both in the past and in the present. This view is supported by historical evidence, beginning in the northern regions of today’s Vietnam, where racially mixed tribes descended from various branches of the ancient Yue had already existed. The same process of integration continued as the country expanded southward, and this is reflected in the physical appearance of populations across different regions.

Northern Vietnamese today resemble southern Chinese more closely than the earlier Annamese who migrated further south after the 13th century. Those settlers intermarried with indigenous groups theorized to be of Polynesian and Malay origin within the Austronesian stock, as well as with Mon-Khmer populations belonging to the Austroasiatic stock.


The integration of early Chinese immigrants into Annam began in the aftermath of the Qin invasion, when troops marched southward. This was followed by the arrival of Han colonial administrators and their infantry, and later by successive waves of Chinese migrants drawn to Annam’s fertile lands and rich vegetation. These newcomers merged with earlier settlers and their descendants, gradually forming part of the Kinh majority in ancient Vietnamese society. Alongside this demographic transformation, the colonial authorities introduced Mandarin as the lingua franca. Over time, it evolved in tandem with local speech, producing a hybrid language reflected in the extensive presence of both Sinitic-Vietnamese and Sino-Vietnamese vocabulary.

Historically, the annals of China's Annam Prefecture are sparse. Chinese records occasionally mention rebellions and their suppression, but they provide little detail about local uprisings or decisive battles that ultimately led to Annam's independence (see Bo Yang, 1992–93, volumes 52–67). Meanwhile, Annam itself lacked comprehensive historical records prior to the 10th century. Genealogical chronicles of notable families were rare, vague, or of limited historical value. As a result, anthropologists seeking to study the origins of the Vietnamese must often rely on Chinese sources, sometimes discovered only incidentally through citations in unrelated works. Even though Annam produced literary figures of note, such contributions were often dismissed as trivialities in Chinese records. Yet during the Tang Dynasty, Annam was a prosperous region, sending some of its most gifted individuals to serve at the imperial court in Chang’an, where they attained high office. At the same time, Bắcninh Province, next to Vietnam’s own Chang’an, became renowned as the source of hundreds of local women who were sent to the Tang court every year, where they were expected to contribute to the continuation of Chinese royal lineages.

Linguistically, the Sino-Tibetan hypothesis of the Vietnamese language is more persuasive than Austroasiatic or Austronesian theories, largely because historical records support a Sinitic foundation. Nearly every word can be traced to a root. If Chinese philologists can reconstruct Old Chinese, the same methods could be applied to Ancient Vietic and Middle Vietnamese, since Vietnam’s history prior to the 10th century was closely tied to China’s. For example, An Chi (2016, Vol. 1, pp. 177-180) argued that the Vietnamese word 'vượn' (monkey) derives from 申 shēn = 猿 yuán, while 'khỉ' comes from 狐 hú = 猴 hóu. Thus, prior to 939, the language of the early Annamese people can be understood through Sino-Tibetan etymologies, in addition to the Sinitic-Vietnamese lexicon derived from official and vernacular Mandarin as well as regional Chinese dialects.

Analogies from later history illustrate how linguistic and cultural development parallels political events. First, during French colonization of Indochina (1861-1954), a new intelligentsia emerged, including Vietnam’s last monarch, Bảo Đại, who spoke French more fluently than his native tongue. Second, in terms of racial mixing, during the short decade from 1965 to 1975 the presence of American soldiers in South Vietnam, then a country of fewer than 22 million people, resulted in nearly 50,000 Amerasian children, roughly one for every 440 inhabitants. Third, according to kyotoreview.org, by 2022 more than 133,000 Vietnamese women had married Taiwanese husbands. One might ask: given Taiwan’s population of about 23.6 million, how many mixed-race children have been born into these families? By extension, how many racially mixed Vietnamese were born during the thousand years of Chinese colonization?

In this sense, Taiwan today may be seen as a parallel case to Vietnam, and if projected back in time to 939, an independent Taiwan could well resemble Vietnam’s historical trajectory a millennium later.

The analogy of Taiwan is useful in illustrating the process of Sinicization in ancient Annam, which began with a comparatively small population—estimated at around 900,000 inhabitants according to the earliest Han records shortly after 111 B.C. It is important to note that Annam had to absorb a far larger influx of Chinese soldiers, numbering in the hundreds of thousands, who advanced steadily southward from the Qin and Han dynasties over the course of a thousand years beginning in the 2nd century B.C. As Chinese colonists established firm footholds in Annam, additional immigrants from the mainland followed, much like the later pattern observed on the island of Formosa.

Anyone other than a staunch Vietnamese nationalist can readily recognize the reasoning behind the affirmation of Chinese admixture with earlier resettlers in ancient Annam. The origins of Sinitic-Vietnamese etyma can likewise be explained through this analogy. Comparable cases are found elsewhere in the world. For instance, the French no longer speak their ancestral Gaulish tongue but instead use a Romance language of Latin origin, closely related to Italian and Portuguese. Bulgarian, too, is a hybrid language heavily shaped by loanwords. Beyond Europe, in Central and South America, the legacy of Spanish colonization demonstrates both genetic and linguistic transformation: Spanish conquistadors intermingled with indigenous populations, producing entirely new societies within less than four centuries. Conquest, in these cases, profoundly altered the racial and cultural composition of the populace, as is evident today.

In Vietnam, most early Chinese immigrants blended seamlessly into the Kinh majority. However, many later arrivals over the past four centuries retained their distinct Chinese identity if they so chose. These migrants came largely from the southern provinces of Guangxi, Guangdong (Canton), and Fujian ('Hokkien'), and were categorized into groups such as Tchiewchow (Chaozhou, Teochow), Cantonese, Hakka, Hainanese, and Hokkienese. A particularly significant wave, known as the Minhhương (明鄉), arrived after the fall of the Ming dynasty: Ming loyalists and their descendants who fled the Manchu conquest. They were resettled in the vast, sparsely populated southernmost regions of Vietnam under unique geo-historical circumstances. Among them, the Tchiewchow formed the majority of latecomers, and over time they were thoroughly absorbed into Vietnamese society. This assimilation is evident in the localized variants of Chinese surnames: Hoàng vs. Huỳnh (黃), Vũ vs. Võ (武), Hàn vs. Hàng (韓), Lưu vs. Lều (劉), and many others.

In effect, these surnames represent nearly all Vietnamese family names of Chinese origin, serving as living proof of their descent from a broader Chinese lineage inherited across generations. Ask a Vietnamese today about their surname, for example, Trần vs. Chen (陳) or Trương vs. Zhang (張), and three or four out of ten may still be able to trace their genealogy back to Chinese roots. (See Appendix I.)

The continuous southward migration of Chinese immigrants eventually reshaped the composition of the Kinh ethnicity in Vietnam, just as it transformed the Vietnamese language through the gradual layering of vast stocks of Chinese vocabulary. The emergence of Vietnamese as we know it may also have been the result of the forceful imposition of Chinese as a lingua franca during the centuries when Annam functioned as a prefecture of China. Inevitably, Chinese influence permeated every aspect of Vietnamese, leaving permanent marks across the linguistic spectrum, from the most basic stratum (distinguishable from core indigenous remnants of proto-Taic origin) to the elevated scholarly lexicon. These words remain in widespread use in daily life today. Indeed, if all Chinese-derived vocabulary were removed from modern Vietnamese, it would be nearly impossible to form a complete intelligible sentence; at best, speech would sound rigid and archaic, resembling the classical Chinese style of wenyanwen (文言文), especially since even grammatical function words (虛詞 xūcí, 'hưtừ'), including prepositions, are of Chinese origin.

Vietnam’s path to statehood unfolded under successive names: NamViệt, Annam, Giaochỉ, Giaochâu, ĐạiNgu, ĐạicồViệt, ĐạiViệt, ĐạiNam, and eventually Việtnam. The evolution of its people and language is increasingly corroborated by archaeological and prehistorical discoveries along the routes traversed by their ancestors. In antiquity, transportation and communication were extremely difficult, yet the trajectory of Vietnam’s national development parallels that of other societies worldwide, such as those in South America, South Africa, Singapore, and Taiwan.

The case of Taiwan offers a particularly illuminating comparison. Imperial China took the island from the Dutch in the 17th century; some three centuries later, in 1949, Kuomintang rulers and soldiers fleeing the mainland established their exile government there. To consolidate power, they imposed authoritarian rule on the island’s indigenous peoples, whose fate is comparable to that of the Mường minority in ancient Vietnam and who gradually became a marginalized minority within their ancestral homeland. By 2024, Taiwan’s population had reached approximately 23.6 million, shaped by waves of mainland resettlers and the enforced adoption of Mandarin as the national language. If we project this scenario back to the 2nd century B.C., when Jiaozhi (交趾) had a population of roughly 900,000 according to Han census data, and imagine Taiwan with only about one twenty-sixth of its current population surviving as an independent nation into the 21st century, its trajectory would closely resemble that of Vietnam, both in terms of its people and its language, given the communication limitations of the time.

Colonial dynamics further accelerated the process of assimilation in Annam, producing what can be described as more than a millennium of self-inflicted Sinicization. This term underscores the fact that Annamese monarchs voluntarily adopted the Chinese despotic model, including its linguistic framework. Linguistic adoption was inevitable, as Vietnam relied on the Chinese writing system until the early 20th century. Alongside this, Confucianism, Taoism, and Buddhism profoundly shaped Vietnamese culture and belief systems, later blending with indigenous traditions in locally developed religions such as Caodaism and Hoahaoism, which combined ancestral worship with imported doctrines.

Efforts to establish a distinct national writing system emerged in the 15th century, as seen in works like Phậtthuyết… (Doctrine of Buddhism on…). The creation of Nôm characters (ChữNôm, 𡨸喃, from 字南 ZìNán, SV TựNam) adapted Chinese ideographs to represent native sounds, including local place names and Sinitic-Vietnamese variants of Chinese words. Nôm literature flourished from the 16th century onward. By the late 19th century, however, under French colonial rule, Quốcngữ, the romanized orthography devised by Western missionaries, gradually replaced Chinese characters, enforced both by colonial decrees and national consensus.

Modern Vietnamese orthography thus reflects three major lexical strata:

  • HánViệt 漢越 (HànYuè / Sino-Vietnamese, SV): the largest body of vocabulary derived from Chinese.
  • HánNôm 漢喃 (HànNán / Sinitic-Vietnamese, VS): lexicons of Chinese origin adapted into Vietnamese usage.
  • Native substrata: words from Daic, Chamic, Austroasiatic Mon-Khmer, and other sources, including later loanwords.

The first two, of Chinese origin, dominate the Vietnamese lexicon. Put differently, had Vietnam remained a prefecture of China throughout its history, adding another 1,200 years of direct rule, Vietnamese would almost certainly be classified today as a Sino-Tibetan language, akin to Cantonese or Fukienese. Just as Latin and Greek elements are indispensable to English, Sinitic elements form the very essence of modern Vietnamese. Based on solid linguistic evidence, we can identify clear commonalities between archaic Chinese forms and their Vietnamese equivalents, including a wide range of pre-Sino-Vietnamese (Tiền-HánViệt) forms rooted in proto-Vietic speech. Altogether, hundreds of Old Chinese forms have, over the centuries, found their way into Vietnamese, shaping it in profound and enduring ways.

II) Core matter of Vietnamese etymology

Cao Xuân Hạo (2001), a renowned contemporary Vietnamese cultural and linguistic scholar, in his article Tiếng Việt là Tiếng Mãlai? (Could Vietnamese be of Malay origin?), argues that most words regarded as original – từ thuầnViệt – are in fact not aboriginally pure. From his perspective, in linguistics there is no such thing as absolute purity. He emphasizes that it matters little whether Vietnamese is classified as having Chinese, Thai, Mon-Khmer, or Austroasiatic cognates; the central issue remains the same. As he illustrates, basic words such as chim ‘bird’ (Mon-Khmer origin), vịt ‘duck’ (Thai origin), cá ‘fish’ (Austroasiatic origin), and thỏ ‘hare’ (Chinese origin) are all still considered ‘pure Vietnamese’ (p. 90).

As has been suggested repeatedly, the story of nation formation is more significant than the question of the precise origin of its people or their ancestral speech. What matters in any language is its integrity as a whole, not the provenance of a handful of basic words. The holistic character of Vietnamese in its present form carries more weight than debates over whether its core lexicon is Austroasiatic or Austronesian, claims that remain inconclusive or speculative. With the discovery of Sino-Tibetan etyma in the Vietnamese basic lexical stock, readers may reconsider which linguistic family Vietnamese most appropriately belongs to (see Chapter 10 on Sino-Tibetan etymologies).

Etymologically, Chinese words entered Vietnamese through borrowings from several dialects, especially Yue 粵 (Cantonese) and Minnan 閩南 (Fukienese, Tchiewchow, Hainanese, etc.), in addition to items already shared in the common lexical pool. For example: Fukienese /kẽ/ ~ VS con 'child'; Teochow /yẽo/ ~ VS dê 'goat'; Hainanese /bat7/ ~ VS biết 'know'. Dialectal variants of the same root, reintroduced by immigrants from different regions of southern China, are recognized as doublets, a phenomenon common in Chinese itself. In this general context, ‘Chinese’ refers broadly to the Sinitic family, encompassing Mandarin and other dialects. Such synonymic layering accounts for the heavy vernacular influence of Chinese lects on Vietnamese across historical stages. For instance, the concept of 'cold' appears in multiple forms: 寒 (hán SV hàn) vs. Hainanese /kwa2/ → VS cóng ‘freezing cold’; alongside VS giá, rét, and lạnh, cognate with Chinese qī (淒), liè (冽), and lěng (冷), respectively, with usage preferences differing between northern and southern speakers.

Later, doublets with identical pronunciations may have been popularized by prominent literati, facilitating their adoption. Some characters were retained in colloquial speech, while others were replaced by forms from different dialectal sources, leading to differentiated contexts. For example, cộ 'carriage' could be written 車, 檋, 輂, 輁, or 梮. The total number of Chinese glyphs is estimated at around 74,900, while the Kangxi Dictionary records about 50,000 entries, many of them dialectal variants or extinct doublets.

The integration of Old Chinese loanwords into early Vietic was driven by powerful social forces. On one hand, the common people, often illiterate, played a crucial role: sentries, village chiefs, market vendors, artisans, laborers, and especially native wives married (sometimes voluntarily, sometimes by decree of the emperor) to Han soldiers or officials. This explains why so many scholarly Sino-Vietnamese words entered everyday usage. On the other hand, Middle Chinese loanwords from the Tang Dynasty, though originally part of the scholarly court language (Mandarin of the time), also filtered into daily life, much as they did in Cantonese. Without this vernacular adoption, Middle Chinese words in Sino-Vietnamese form could never have achieved such widespread and frequent usage in modern Vietnamese.

In contemporary times, many have witnessed accounts, sometimes dramatized in media, of Vietnamese women, often from impoverished backgrounds, who once served as maids to French colonialists or companions to American soldiers in the recent past. This pattern has continued into the 21st century, now taking the form of Vietnamese women seeking to escape poverty through marriage to men in Singapore, Taiwan, China, and South Korea. Looking back historically, the influence of such unions on local vernaculars has been significant, to the point that new dialects could emerge for practical communication, just as ancient Annamese once did. In a similar way, 'Taiwanese' developed as a distinct speech form. At its core, all of this unfolded for economic reasons.

The author observes that the contextual manner of Vietnamese speech bears notable similarities to northern Mandarin, as illustrated in several examples throughout this survey. Linguistically, this situation resembles that of contemporary Vietnamese brides and their Taiwanese husbands today. Looking further back, during the Qin and Han periods, native brides likely spoke a form of ‘pidginized’ Ancient Chinese in order to communicate with their 'outlander' husbands. Unlike the foot soldiers stationed in southern China, however, the long-march cavalrymen from northern China, by virtue of their higher social status, may have contributed more of their own northern dialectal vocabulary into local communal speech, while simultaneously adopting individual words from the indigenous language.

In either case, these newcomers were adapting to new environments, often with the intention of permanent resettlement. Children born into privileged families of the ruling class were more likely to receive formal education and, as adults, to participate in local governance. Their literary usage, derived from the mainstream Mandarin linguistic stock, was reinforced by the vernacular lingua franca of the imperial court and official correspondence. Within only a few generations, many of these families might fall into decline, a common phenomenon in both China and Vietnam, yet some individuals continued to pursue scholarship, serving as teachers and eventually assimilating into the Kinh majority. These were the people who had once spoken the language of mandarins and officials across successive dynasties.

This process helps explain why so many Middle Chinese scholarly terms remain embedded in modern Vietnamese and are used in everyday contexts. Examples include sínhlễ (聘禮 pìnlǐ, ‘betrothal’), vuquy (于歸 yúguī, ‘bridal nuptial ceremony’), kínhtrọng (敬重 jìngzhòng, ‘respect’), or ẩmthực (飲食 yǐnshí, ‘food and drink’). These words are indispensable in Vietnamese today. At the same time, learned scholars of the past spoke 'Annamese' at home, already infused with numerous early Sinitic-Vietnamese lexical items, some predating the Sino-Vietnamese layer of Middle Chinese, as well as newly coined forms derived from Old Chinese materials such as syllabic stems, roots, and affixes. For example, VS chủxị ('host') < VS chủtiệc < SV chủtịch < 主席 M zhǔxí < MC /tɕiozjek/.

The process of linguistic localization continually accelerated the integration of loanwords into the mainstream Vietnamese lexicon. This mechanism functioned much like the rapid adoption of computer jargon and texting slang in modern times. Such phenomena are common across languages, including those of the Indo-European family, as demonstrated in well-documented cases such as Albanian or Haitian French.

Over time, colloquial Annamese developed into regional subdialects in the north, central, and south. These variants blended indigenous elements with Chinese dialectal influences to varying degrees. For example:

  • ăn 唵 vs. xơi 食 shí / 吃 chī (eat)
  • uống 飲 yǐn vs. hớp 喝 hē (drink)
  • buồn 悶 mèn vs. phiền 煩 fán (sorrow)
  • khoái 快 kuài vs. vui 娛 yú (joyful)
  • lùn 短 duǎn vs. thấp 低 dī (short)
  • bảnh 昺 bǐng vs. sáng 亮 liàng (bright)
  • mơi 明 míng vs. mai 明ㄦ mír (tomorrow)
  • mới 萌 méng vs. xịn 新 xīn (new)
  • cũ 舊 jiù vs. cổ 古 gǔ (ancient)
  • heo 亥 hài vs. lợn 腞 dùn (pig)
  • cọp 虎 hǔ (SV hổ) vs. hùm 甝 hán (tiger)

Such borrowings and their variants enriched the Sinitic-Vietnamese vocabulary. Over the last two millennia, this process also differentiated homonyms through tonal distinctions and fostered polysyllabicity, particularly disyllabic forms, a trend that continues today.

As these loanwords matured, they became fully localized, appearing in both inseparable compounds and independent forms. Remarkably, the transformation from Chinese to Vietnamese was generally smooth, requiring little intervention from the intelligentsia, who often dismissed colloquial mispronunciations as vulgar. For instance, the disyllabic kinhkhủng (驚恐 jīngkǒng, 'terrifying') evolved to convey both terrifying and, more recently, terrific. In modern usage, khủng alone has come to mean 'terrific', a semantic development absent in contemporary Chinese.

Before the 17th century, literary works from earlier periods required extensive annotation for modern readers to grasp their vocabulary. This reflects the rapid pace of linguistic modernization and the phonological shifts that occurred independently of colloquial speech. At one stage, two literary registers coexisted: Classical Chinese (Wenyanwen 文言文), e.g., SV niên for 年 nián, and the spoken vernacular, e.g., VS năm (𢆥), which remained unrecorded until the emergence of ChữNôm in the 12th century. Nôm adapted Chinese characters to transcribe local phonetics, bridging the gap between written and spoken forms.

Meanwhile, Yue and MinNan dialects (Cantonese and Hokkienese) underwent a different trajectory. After the fall of the NanYue Kingdom in 111 B.C., their speakers remained within the Sino-sphere and, under sustained Sinicization, their languages were absorbed into the Sino-Tibetan family. Earlier Yue substrata were gradually buried beneath layers of Chinese superstrata, though vestiges survive in Vietnamese: 戶 hù = cửa (door), 胡 hú = cổ (neck).

The development of ancient Annamese parallels cultural continuities such as the Đôngsơn and Ngọclữ bronze drum traditions, which trace back to Phùngnguyên culture. Yue entities existed long before Han colonization (111 B.C.-939 A.D.), and remnants of Sahuỳnh, ÓcEo, and Khmer civilizations in the south predated Annamese expansion there. In essence, Taic-Yue elements preceded the emergence of Taic-Han influences, which later recombined with Yue substrata to form what became known as Annamese.

With modernization, phonological change slowed, particularly after the adoption of the Romanized Vietnamese script (Quốcngữ) in the early 20th century, first imposed by the French colonial administration. Today, mass communication, internet access, and rapid transportation have further reduced regional accent gaps and pronunciation discrepancies.

In the modern era, scientific and technological vocabulary has entered Vietnamese largely through Japanese (via Chinese mediation), French, and later English. The influx of new terms continues at a rapid pace, alongside creative acronyms, abbreviations, and texting shorthand. Examples include khủng for kinhkhủng (terrific), ko for không (no), or playful text forms like Hum ni là sn of e, dc gì hit for Hôm nay là sinh nhật của em, đâu có gì hết. Proposals such as Bùi Hiền’s (2018) Tiếq Việt reform, introducing letters like F, W, J, and Z, further illustrate this trend.

Ultimately, the essence of Vietnamese lies in its holistic structure. Both Sino-Vietnamese and Sinitic-Vietnamese elements form the living core of the language. Just as English cannot be understood without its Latin and Greek components, Vietnamese cannot be properly classified by isolating a handful of Austroasiatic residues. To crown Mon-Khmer as the defining feature of Vietnamese on the basis of a few lexical survivals would be misleading. The Sinitic elements are not peripheral; they are the very substance of the language as it exists today.

Table 5 - Cases of Mon-Khmer ~ Vietnamese cognates analogous to ethnic cooking, more than a salad bowl

Specialty dishes often change slightly as they move from one place to another. For instance, a bowl of 'hủtiếu Namvang' (Phnom Penh–style noodle soup) or 'cơmgà Hảinam' (Hainan–style chicken rice) prepared in Saigon can taste even better than in their original homelands of Cambodia or China. In many cookbooks, these recipes are sometimes adjusted by Western chefs of ethnic cuisine, who add extra spice to suit new culinary trends in Western Asian cooking schools. This is metaphorically comparable to Austroasiatic influences in linguistics: Vietnamese dishes that began as Mon-Khmer or Hainanese creations are now infused with local Vietnamese ingredients to satisfy local palates. Similarly, a bowl of 'phở' (beef noodle soup) in Lyon, France, does not taste the same as what a discerning diner might experience in Saigon or California. Historically, however, French beef stocks contributed to the invention of 'phở', which was then cooked with Vietnamese anchovy sauce (though the word 'phở' does not derive from 'pot-au-feu'). In fact, the essence of Vietnamese 'phở' (粉 fěn) lies in its blend of cinnamon and anise—spices long used in Chinese cooking and the very condiments that once drew European colonialists to Asia. At the same time, it was Westerners who eventually introduced the Latin alphabet into the Vietnamese language.

It is crucial not to overlook prominent linguistic features, and newcomers to the field should avoid retracing well‑worn paths that merely reaffirm a small set of Mon‑Khmer cognates in Vietnamese. Such limited evidence cannot substantiate an Austroasiatic origin for the language, since historical linguistics must ultimately be anchored in history itself. A strictly prehistoric Austroasiatic approach risks collapsing into the outdated notion that all languages descend from a single common source. This framework strips Vietnamese of its historical grounding and distorts its overall balance: more than 90 percent of its roughly 420 fundamental items can be traced to Sino‑Tibetan etymologies (see Chapter 10), layered over only about 10 percent of the hypothesized Austroasiatic Mon‑Khmer base. In contrast, the holistic perspective advanced here emphasizes that it is the totality of the language—not a narrow subset—that truly matters.
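To make the proportions cited above concrete, the following minimal sketch converts the percentages into item counts. It treats "roughly 420" and "more than 90 percent" as exact figures, which is an assumption made only for illustration; the function name split_counts is likewise hypothetical and not part of the original study.

```python
def split_counts(total_items: int, percent: int) -> tuple[int, int]:
    """Split a word list into (share, remainder) using integer arithmetic."""
    share = total_items * percent // 100
    return share, total_items - share

# Figures from the text, rounded to exact values for illustration only.
TOTAL_ITEMS = 420           # "roughly 420 fundamental items"
SINO_TIBETAN_PERCENT = 90   # "more than 90 percent"

sino_tibetan, remainder = split_counts(TOTAL_ITEMS, SINO_TIBETAN_PERCENT)
print(sino_tibetan, remainder)  # 378 42
```

On these rounded figures, the Sino‑Tibetan layer would amount to about 378 items and the remaining portion to about 42.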

Moreover, proponents of the Austroasiatic Mon‑Khmer hypothesis have often overlooked the fact that many of the fundamental words they cite also occur as cross‑cognates in Chinese or other Sino‑Tibetan etymologies within the same basic lexical domains. Although these overlaps are well documented, they were excluded from percentage counts, for example 'mắt' (目 mù, 'eye'), 'lúa' (來 lái, 'paddy'), 'chim' (禽 qín, 'bird'), 'vịt' (鴄 pī, 'duck'), 'cá' (魚 yú, 'fish'), and 'cọp' (虎 hǔ, 'tiger'), likely due to limited awareness of the deeply intertwined histories and cultures of China and Vietnam when the hypothesis was first advanced. Such crossover phenomena, in which basic Mon‑Khmer words also appear in both Vietnamese and Chinese, may have been observed by Austroasiatic theorists, but they were interpreted narrowly through the lens of genetic affiliation rather than situated within a broader historical‑linguistic framework.

In a parallel development, the neglect of cultural and historical dimensions—such as the intimate kinship terms for parents, siblings, and other close relatives (to be examined in detail later)—may be compared to the unresolved puzzle of how bronze drum relics came to be discovered in the distant Indonesian archipelago. Strikingly, no comparable bronze artifacts have ever been unearthed within the former territories of the ancient Khmer Kingdom. It is implausible to suggest that these drums were transported by sea as tributes from later Chamic Muslims on pilgrimage to Indonesia’s Islamic centers, since the chronology does not align with the early period of the Han invasion of Annam, when the Champa Kingdom was still firmly rooted in Hindu religion and culture.

Archaeological evidence reinforces this point: no bronze artifacts have been found within the boundaries of the former Champa territories, nor within the aboriginal regions inhabited by the proto‑Chamic Li minority (黎族) in the Tongzha mountainous autonomous area of southern Hainan Island, China, from which the prehistoric ancestors of the Cham people in Central Vietnam had emigrated.

Food for thought: One of China’s ethnic minorities, the Li people (黎族), trace their origins to the LuoYue branch of the ancient BaiYue. They are concentrated on Hainan Island, with a population of about 1.1 million. Their main livelihood is agriculture, supplemented by hunting and handicrafts. They speak the Li language, which belongs to the Li branch of the Tai–Kadai group within the Sino‑Tibetan language family, and most are also fluent in Chinese. Their oral literature is especially rich, with myths and folk songs occupying an important place. They practice ancestor worship and believe in spirits and deities; because of their reliance on farming and hunting, they place particular emphasis on the worship of mountain spirits.

From an anthropological perspective, no cultural artifacts comparable to bronze relics have been discovered linking these three regions, nor does any written history support the claim that Austroasiatic Mon‑Khmer peoples formed the aboriginal base for constructing a linguistic‑historical profile.

Thus, the Austroasiatic Mon‑Khmer explanation remains only a tentative hypothesis, especially when weighed against the more than 420 Sino‑Tibetan etyma identified in the Vietnamese core lexicon, which provide a far stronger foundation for classification.

Despite well‑intentioned nationalism, some zealous Vietnamese enthusiasts attribute archaeological discoveries from southwestern Mon‑Khmer sites to the cultural heritage of their own forefathers. However, in serious academic research, such claims cannot be sustained in both directions at once.

By similar reasoning, an important linguistic question arises: how could several Mon‑Khmer basic words have entered Vietnamese by the end of the 12th century, when the country’s southern frontier still extended no further than Thanhhoá in present‑day north‑central Vietnam? Historical records indicate that it was only in the 13th century and afterward, well after the emergence of ĐạiViệt (大越, the Great Viet) as a consolidated state, that its people began their southward expansion, ultimately conquering and erasing the 1,600‑year‑old Champa Kingdom and annexing nearly one‑third of the territory east of the former Khmer realm. For those etyma whose origins remain uncertain, the answer must be sought through deeper historical inquiry. Linguistic questions cannot stand alone; they must be corroborated by written history.

For instance, history also records that the ancient Khmer Kingdom lost its western lands to Siam (modern Thailand), then inhabited by the Dai (傣) aboriginal people. The Thai are descendants of the Dai, who themselves descended from the Taic peoples that once ruled the Chu State (楚國) in ancient China. Cognates of their basic vocabulary are preserved in the Erya (爾雅) dictionary (De Lacouperie [1887] 1963).

From an analytical perspective, one can observe clear etymological connections between Thai and Vietnamese basic words, as noted by Haudricourt in the early 20th century. Examples include Vietnamese gạo and Thai ข้าว /kʰâːw/ (稻 dào, "rice"), or Vietnamese gà and Thai ไก่ /kài/ (雞 jī, "chicken"). Such cognates point to a shared Taic‑Yue heritage that predates the later southward expansion of ĐạiViệt.

These two Vietnamese and Thai cognates appear to date back to a period preceding the split of the hypothetical Taic-Yue (傣越) linguistic family into the Dai (傣, 臺 Tai, as discussed by Ding Bangxin 1977: 36–45) and Yue (越) branches in southern China. Between these developments lies a clear Khmer gap, underscoring the historical reality that Mon‑Khmer basic words entered Vietnamese only in regions where Mon‑Khmer speech communities existed as isoglosses.

Austroasiatic Mon‑Khmer theorists have tended to treat non‑Mon‑Khmer basic words as separate cases, dismissing any fundamental items not cognate with Austroasiatic languages. Their assumption has been that Vietnamese vocabulary must derive from Mon‑Khmer, rather than the reverse. Consequently, they have cast an Austroasiatic blanket over the lexicon, labeling all Sinitic‑Vietnamese words that do not fit the Mon‑Khmer stock as mere Chinese loanwords. Yet if we set aside the term loanword, it becomes clear that many of these items are not borrowed at all. Instead, they display structural and phonological affinities with Cantonese and Fukienese dialects, both classified by Chinese institutes as Sino‑Tibetan, or with other Sino‑Tibetan etymologies, none of which are rooted in Austroasiatic Mon‑Khmer (see Chapter 10 on Sino‑Tibetan Etymologies).

It has long been observed that stronger and more advanced societies exert influence over weaker ones. History affirms this principle, most clearly in the Han dynasty’s domination of Vietnam, enforced through harsh measures such as General Ma Yuan’s (馬援, Mã Viện) destruction and melting down of bronze drums during the early years of conquest. By the same reasoning, certain unverifiable basic words in Vietnamese may trace their origins to the Khmer Empire, which between the 9th and 13th centuries rose to extraordinary power in Southeast Asia. The scale of its dominance remains visible today in the monumental ruins of Angkor Wat and Angkor Thom; they are vast citadels and palaces that continue to inspire awe across the world.

In recent years, as reported by the BBC, laser scanning of Cambodia's tropical jungle has revealed walled cities of the ancient Khmer Kingdom that had stayed hidden deep beneath dense layers of rainforest, and more are expected to be discovered in the years to come.

By that same period, however, the young Annam State had only just emerged from its long submergence in the Chinese sphere, in which everything was permeated with Chinese elements, culturally and even racially. It has been postulated that only at a very late period did the Mon-Khmer isoglosses spread and come into contact with Middle Vietnamese, when ancient Annam still lay north of the 16th parallel. In fact, ancient Vietnam remained a vassal state of China long after its independence and has readily succumbed to Chinese power down to the present day.

In the study of lexical development, etyma that can be traced to verifiable roots within a span of roughly 1,000 to 1,500 years are most plausibly loanwords, transmitted through contact, especially among neighboring languages. In the case of Vietnamese, the evidence points overwhelmingly toward Chinese influence, a consequence of Vietnam's long history as a prefecture under Chinese rule. Geographically, prior to this extended period of contact, the indigenous LạcViệt people were located further north in the Red River Basin, while to the south lay the thousand‑year‑old Kingdom of Champa, positioned between ancient Annam and the Khmer Empire.

Rather than addressing the racial mixture of populations in mainland China, Austroasiatic Mon‑Khmer theorists have attempted to explain the cognateness of Chinese and Vietnamese basic words through cultural causality, arguing that whatever China possessed, Vietnam must also have acquired. Yet the reverse is not true: China does not share certain uniquely Vietnamese cultural elements. A striking example is nướcmắm ("fish sauce"): its modern Chinese equivalent, 魚露 (yúlù), is a loan translation whose morphemic structure parallels 魚汁 yúzhī ("fish extract"), deriving from Fukienese, a Minnan dialect. In fact, this semantic pathway even contributed to the English word "catsup" or "ketchup", as previously noted. (See Michael Barris's Ketchup's Chinese origins a sticky subject for US foodies.) Interestingly, in Vietnamese nướcmắm, the order of the morphemes is reversed compared to 鹹液 (xiányè, SV hàmdịch, Cant. /ham2jik8/), suggesting that the original etymon may have referred to anchovy sauce, a condiment long consumed by peoples of Yue origin.

The Austroasiatic Mon‑Khmer theorists, therefore, could not avoid acknowledging that Vietnamese basic cognates reflect not only Mon‑Khmer but also Chinese influence as fundamental to the formation of the modern language. Confronted with the overwhelming influx of Chinese vocabulary, however, they stopped short of classifying Vietnamese as a mixed language. Instead, they treated the approximately 98 percent of its lexicon cognate with Chinese as mere loanwords, while attributing the remaining fraction, less than 2 percent, to Austroasiatic roots. On this basis, they reclassified Vietnamese under a sub‑branch of Mon‑Khmer, rather than considering the reverse possibility.

This reclassification appears to have been largely a matter of convenience. The label "Austroasiatic Mon‑Khmer" was employed to account for basic words shared between Vietnamese and the Mường dialects. Yet Vietnamese has never been the source from which other Mon‑Khmer languages diverged. What was overlooked is the possibility that ancient Taic or Yue languages may represent the deeper ancestral roots of multiple linguistic families—including Chinese and Mon‑Khmer—depending on the perspective adopted. The present author emphasizes this latter view, grounded in the linguistic peculiarities of Vietnamese, particularly its pervasive Chinese elements.

The difficulty lies in the fact that Austroasiatic specialists lacked historical evidence to support their hypothesis. Their arguments rested almost entirely on linguistic mechanics, citing a handful of Mon‑Khmer basic words that may be regarded as etymological relics, preserved mainly among mountain minorities who contributed only marginally to the genetic and linguistic makeup of the Kinh majority. In reality, the Vietnamese have long represented the ancient Yue, historically known as the "Yue of the South", that is, "Vietnam", the final stronghold of the Yue. Thus, Vietnamese historical linguistics must be understood as the study of the Yue language, which was closely related to the ethnic languages of southern China, not to the Mon‑Khmer languages of the Indochinese peninsula. Mon‑Khmer could only be considered part of the Yue continuum if it could be demonstrated that Mon‑Khmer groups migrated southward from China into Indochina centuries before the rise of ancient Annam, rather than originating from the southwest or northwest in the direction of the Munda isoglosses.

From an etymological perspective, geographical proximity and cultural contact naturally facilitated lexical exchange. Just as Mon‑Khmer words entered Vietnamese, practical Vietnamese terms could also have diffused into Mon‑Khmer languages. Many of these words likely reached Vietnamese through the Mường, whose genetic and linguistic closeness to the Vietnamese is undeniable. The Mường, in turn, maintained contact with Mon and Khmer speakers after the Viet‑Mường split, which coincided with the arrival of the Han Chinese. In this sense, the Mon‑Khmer elements in Vietnamese can be understood as linguistic inheritances transmitted through Mường speech, functioning as a buffer between Vietnamese and Mon‑Khmer lexical strata.

From a broader perspective, the author concedes that this interpretation remains a minority view, and his lone voice insufficiently persuasive to overturn the entrenched Austroasiatic Mon‑Khmer hypothesis, which continues to be refined and upheld by successive generations of linguists. It is apparent that new entrants to the field often follow the path established by their predecessors, constructing their arguments upon the same old foundational premises. In the debate over tonality, for example, many have embraced Haudricourt’s theory that ancient Annamese was originally toneless due to its Mon‑Khmer origin, and that his model of tonogenesis explains the transformation of atonal Vietnamese into a tonal language rather than attributing its tonal system to the extensive infusion of Old and Middle Chinese vocabulary.

Theoretically, tones should arise as a natural feature of language, as Haudricourt argued in the case of ancient Annamese. Yet it is clear that spoken languages cannot artificially acquire such attributes; one cannot simply “add” tones to convert a toneless language into a tonal one. In practice, Vietnamese speakers instinctively applied tones even to early French loanwords—such as cờlê, mỏlết, bíttết, bơsữa, and càphê—thereby integrating them seamlessly into the tonal framework of Vietnamese. This demonstrates that tonality is intrinsic to the language. Haudricourt’s hypothesis of Vietnamese tone genesis is therefore untenable: according to his model, Vietnamese only became fully tonal in the 12th century through internal phonological transformations rather than prolonged contact with Chinese, an explanation that is logically inconsistent.

Admittedly, in their earliest stages, all human languages may have originated as toneless or monosyllabic sounds. That, however, is not the central issue here. From a Sinitic perspective, it is more plausible that proto‑Yue or early Taic languages developed tonal distinctions by intonating proto‑consonantal pitches into four tones. For instance, 恐龍 kǒnglóng (SV khủnglong, "dinosaur") can be reconstructed from /klong/, where the disyllabicization of the complex consonantal initial /kl‑/ helps account for the structural similarities.

It is evident that tones did not accompany Chinese loanwords in unrelated non‑tonal languages, nor did those languages subsequently develop tonality. The clearest examples are the toneless Chinese borrowings in Japanese and Korean. In both cases, Chinese loanwords were stripped of their tonal distinctions. During the Tang dynasty (618–907), Japan and Korea systematically borrowed a substantial body of Chinese vocabulary. Remarkably, they also devised ways to extract phonemic values from Chinese characters to create their own national writing systems, a revolutionary departure from the Sino‑centric mindset. Yet, despite this innovation, the intrinsic structures of Japanese and Korean prevented them from accommodating both tones and semantics simultaneously. As a result, they continued to pronounce Chinese loanwords without tonal distinctions. For instance, to Korean ears, 防火 fánghuǒ ("prevent fire") and 放火 fànghuǒ ("set fire") are both rendered as banghwa. By contrast, modern Vietnamese speakers still distinguish phònghoả and phónghoả as two entirely opposite concepts. In other words, every Vietnamese word is morphemized with one or more of the eight tonal categories inherited from Middle Chinese.
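The contrast between phònghoả and phónghoả can be illustrated at the level of spelling with a minimal sketch. It assumes Quốcngữ orthography as a stand-in for pronunciation and uses a hypothetical helper, strip_tones, that removes only the five tone marks while keeping vowel-quality diacritics. It is an orthographic illustration, not a phonological model, and it says nothing about how Korean actually adapted these words.

```python
import unicodedata

# Combining marks that encode tone in Quốcngữ spelling.
# Vowel-quality marks (circumflex, breve, horn) are deliberately excluded.
TONE_MARKS = {
    "\u0300",  # grave      (huyền)
    "\u0301",  # acute      (sắc)
    "\u0303",  # tilde      (ngã)
    "\u0309",  # hook above (hỏi)
    "\u0323",  # dot below  (nặng)
}

def strip_tones(word: str) -> str:
    """Remove tone marks only, keeping vowel-quality diacritics."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)

pair = ("phònghoả", "phónghoả")  # 防火 vs. 放火, as cited in the text

print(pair[0] == pair[1])                            # False
print(strip_tones(pair[0]) == strip_tones(pair[1]))  # True
print(strip_tones(pair[0]))                          # phonghoa
```

Once the tone marks are removed, the two spellings collapse into a single form, much as the two Mandarin words collapse into one toneless rendering in the Korean example.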

Ironically, modern Putonghua (Mandarin), said to descend directly from Middle Chinese, now retains only four tones. This suggests that northern Chinese populations, many of Manchurian or Altaic origin, were themselves unable to preserve the full tonal system of Middle Chinese, which comprised four tones in two registers (often counted as eight in total). If even they could not maintain the original tonal distinctions, it is implausible to expect speakers of Mon‑Khmer languages, who may struggle to differentiate subtle tonal contrasts, to have done so. The difficulty is evident when Western learners of Mandarin attempt to distinguish the four tones in simple syllables such as ma1, ma2, ma3, and ma4. Austroasiatic theorists, therefore, undermine their own position by overlooking the profound historical impact of both Vietnamese and Chinese tonal systems on the development of Vietnamese.

Academically, the Austroasiatic Mon‑Khmer hypothesis is not the only explanatory path. The portion of Mon‑Khmer cognates in Vietnamese is both scant and of marginal significance. Many of these supposed cognates can be traced instead to Chinese origins, for example, chồmhỗm (“squat”) /chromom/ vs. 犬坐 quánzuò, or chòhõ (“stand at ease with legs apart”) /choho/ (đứng chànghảng) vs. 伸站 shēnzhàn.

A more fruitful approach to Vietnamese etymology requires a new methodology grounded in two pillars: first, the extensive body of Sino‑Tibetan basic vocabulary, over 420 fundamental items by recent counts, demonstrating common roots, as illustrated in Shafer’s monumental Sino‑Tibetan study (1972); and second, the growing body of evidence for cognateness between Chinese and Vietnamese etyma, with undeniable correspondences across virtually every linguistic category.

Initially, the author, too, accepted the Austroasiatic Mon‑Khmer theory, as did author Bình Nguyên Lộc (1972). His conviction rested largely on its wide acceptance and the numerical dominance of its adherents over Sino‑Tibetan proponents, compounded by his own lack of familiarity with Mon‑Khmer. Over time, however, his research in Vietnamese historical linguistics led him to reconsider. He observed that Austroasiatic specialists often employed an umbrella approach, drawing conclusions solely from the Mon‑Khmer elements in Mường sub‑dialects and then extending these to Vietnamese as a whole. This reasoning, however, fails to account for discrepancies, such as the Khmer numerals one through five, which are frequently cited but remain problematic.

The comparative wordlists themselves raise questions. Austroasiatic linguists, trained in Western methods, often relied heavily on local informants or interpreters during fieldwork, individuals who may not have fully understood linguistic principles, let alone comparative or historical linguistics. By contrast, when I began analyzing Shafer’s Sino‑Tibetan listings (1972) alongside reconstructions of Old Chinese phonology by Karlgren, Schuessler, Wang Li, Zhou Fagao, Nguyễn Tài Cẩn, and others, I found that Sinitic‑Vietnamese fundamental vocabulary shared far more commonality with Sino‑Tibetan than with Austroasiatic Mon‑Khmer.

Over the years, I have collected examples from both Chinese classics and modern Chinese media that demonstrate how deeply classical Chinese and vernacular Mandarin have permeated Vietnamese. Interestingly, modern Vietnamese preserves certain lexical usages and expressions absent even in Cantonese or Minnan dialects. For instance, while Cantonese uses /fajng1kao1/ and Hainanese /k’waj5majk8/ for "sleep", VS employs ngủ, a cognate of 臥 wò (SV ngoạ).

Additional Sinitic‑Vietnamese etyma further illustrate the close integration of Vietnamese daily speech with Chinese vernacular forms, independent of the classical written tradition once dominant in Annam. Examples include:

  • 何故 hégù ("how come") → cớsao
  • 為啥 wèishá ("why") → vìsao
  • 卸罪 xièzuì ("to blame") → đổlỗi
  • 賴他 lài tā ("because of him") → tạinó

and

  • 幹活 gànhuó ("to work") → làmviệc
  • 忙活 mánghuó ("to be busy") → bậnviệc
  • 生活 shēnghuó ("life") → cuộcsống
  • 勤勞 qínláo ("diligent") → làmsiêng
  • 勞動 láodòng ("labor") → làmlụng
  • 再來 zàilái ("do it again") → làmlại
  • 上來 shànglái ("come up here") → lênđây
  • 離近 líjìn ("come closer") → lạigần
  • 離開 líkāi ("leave") → rờikhỏi

    etc.

Taken together, these examples demonstrate that Vietnamese is far more deeply and systematically connected to the Sinitic tradition than the Austroasiatic Mon‑Khmer hypothesis allows.

The cited samples above represent new findings prepared by the author. They may be regarded as a long‑overdue breakthrough in Vietnamese historical linguistics, one that Austroasiatic theorists, despite decades of effort, have not been able to produce. Recall that Austroasiatic specialists have often exaggerated the significance of a handful of "basic words", many of which were collected during field trips with the assistance of local guides from Mon‑Khmer minority communities, often under the sponsorship of short‑term institutes.

Although the broader linguistic community has not embraced the Sino‑Tibetan hypothesis of Vietnamese, Austroasiatic theorists have effectively confined the language within a Mon‑Khmer framework for so long that they have hindered serious progress in Vietnamese etymology for decades. As a result, new researchers entering the field often rely solely on academic coursework and then imitate their predecessors, conducting field surveys, asking questions of local informants, and repeating outdated methods. These antiquated approaches have produced little of substance over the past sixty years, digging deeper trenches without uncovering new insights, and widening the gap that others hesitate to cross.

In truth, the same methodological weaknesses can afflict any camp. A lack of expertise in related languages, such as Chinese dialects or Mon‑Khmer isoglosses, can lead to flawed conclusions. Etymological research requires not only access to resources but also intellectual rigor, analytical precision, and creative insight. It is not enough to manipulate tools, apply mechanical rules, or tabulate word lists to justify the legitimacy of a few shared items. Among more than 20,000 Sinitic‑Vietnamese words in daily use, how many Mon‑Khmer cognates can truly be identified as fundamental? Do those few cognates constitute the essence of the language? Vietnamese speakers can communicate effectively without them, which underscores their marginal role.

No one can master all the relevant languages: Thai, Zhuang, Mon, Khmer, Mandarin, Cantonese, Hainanese, Fukienese, Vietnamese, and others. Students may pass examinations in historical linguistics through aptitude and training, but specialists often reach conclusions based on limited knowledge. Those who begin from flawed premises risk perpetuating errors and passing on misinformation, unless corrected by well‑informed experts.

This research introduces a novel approach to Sinitic‑Vietnamese etymology, offering students a methodology for tracing Vietnamese etyma that align with Sino‑Tibetan and Sinitic roots, comparable to dialectal forms across the Chinese linguistic sphere. The work is original, first appearing online some twenty years ago, and has since been cited as supporting evidence by others in the field.

Comparative historical linguistics demands more than cleverness; it requires sensitivity to cultural identity and linguistic belonging. Nationalism often shapes perceptions: for example, descendants of Chinese immigrants in Vietnam may identify fully as Vietnamese, just as Taiwanese or Singaporeans of Chinese descent embrace their local national identities. Such factors can cloud academic judgment.

The purpose of this research is not to "prove" a Sino‑Tibetan genetic origin for Vietnamese, nor to deny the existence of Mon‑Khmer cognates. Rather, it seeks to introduce new data and methodologies to re‑evaluate thousands of Vietnamese words with demonstrable Chinese roots. These tools can also reassess words previously classified by earlier linguists in both camps.

The lexical commonalities are undeniable. Austroasiatic theorists have highlighted overlaps with Mon‑Khmer, but they cannot account for the breadth of basic vocabulary that clearly aligns with Chinese: nạ = 娘 niáng ("mother"), bố = 父 fù ("father"), mẹ = 母 mǔ ("mother"), xơi = 食 shí ("eat"), ăn = 唵 ǎn ("eat"), uống = 飲 yǐn ("drink"), ngủ = 臥 wò ("sleep"), mắt = 目 mù ("eye"), đầu = 頭 tóu ("head"), sọ = 首 shǒu ("cranium"), ngực = 臆 yì ("chest"), phổi = 肺 fèi ("lung"), bụng = 腹 fù ("stomach"), gạo = 稻 dào ("rice"), chim = 禽 qín ("bird"), cá = 魚 yú ("fish"), lửa = 火 huǒ ("fire"), lá = 葉 yè ("leaf"), nhà = 家 jiā ("home"), lợn = 豚 tún ("pig"), săn = 田 tián ("hunt"), and many others, not to mention etyma whose connection is less obvious.

Unsurprisingly, Vietnamese linguists in the Mon‑Khmer camp will defend their position vigorously. Yet this research demonstrates that the Mon‑Khmer basic words cited, about 170 items, pale in comparison to the vast body of Vietnamese words with Chinese origins. If fundamental vocabulary is defined as roughly 200 items, as is widely accepted in linguistics, then the same logic used by Austroasiatic theorists would classify Vietnamese as Sino‑Tibetan.

Cross‑linguistic classification supports this view. Cantonese and Minnan, two major Chinese dialect groups, are classified within the Sinitic branch of Sino‑Tibetan primarily on the basis of shared Sinitic stock, not merely basic vocabulary. By the same reasoning, Vietnamese should be considered alongside them. All of these languages descend from Yue lineages within the Taic‑Yue continuum, with Sino‑Tibetan affinities. Theories of proto‑Taic and pre‑Sinitic contact prior to the Zhou Dynasty (1122–256 BCE) further support this affiliation.

While there is no need to rush to reclassify Vietnamese as Sinitic, the evidence shows that its basic lexicon is plausibly cognate with Sino‑Tibetan. Moreover, Sinitic‑Vietnamese words share not only vocabulary but also structural and phonological traits, including subtle features characteristic of closely related languages.

The following sections in the table below will present striking linguistic similarities in greater detail, with further examples and elaborations to reinforce these points.

Table 6 – Core matter of Vietnamese etymology

  1. The tonal system

    Vietnamese employs an 8‑tone system, often described as 6 in modern orthography, which does not account for the two 'entering tones' (thanhnhập, 入聲 rùshēng). This system corresponds almost perfectly to the Middle Chinese tonal scheme of four tones in two registers. Vietnamese tones can be mapped directly onto those of modern Cantonese and Minnan dialects, and they also allow Tang poetry to be recited in full accordance with its strict rules of tonal melody and rhyming syllabic finals (Xu Liting 1982:219).

    Mandarin today preserves only 4 tones, yet their pitch values align closely with Vietnamese homonyms. Other southern Chinese dialects, especially Minnan and Yue, retain between 7 and 10 tones. Cantonese, for example, distinguishes 9 tonal categories: ma1, ma2, ma3, ma4, ma5, ma6, mak7, mak8, mak9, which correspond to Vietnamese ma, mà, mả, mã, má, mạ, mác (mát, máp), mạc (mạt, mạp), mac (mat, map).

    Comparable tonal values are also found in minority languages of southern China, such as Zhuang, Dai, and Miao, spoken across Yunnan, Guizhou, Guangxi, Hunan, Guangdong, and Fujian provinces. In these cases, each tone carries nearly the same phonetic value as its equivalent in Chinese dialects, suggesting an inherent and shared tonal system that points to a common linguistic family.

    By contrast, Mon‑Khmer languages lack such complexity. Their intonation patterns are roughly equivalent to only two Vietnamese tones, /ma/ and /mà/, underscoring the fundamental difference between the tonal systems of Vietnamese and those of Mon‑Khmer.
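The "four tones in two registers" scheme described above can be made concrete with a small sketch. It reads the tone mark off a Vietnamese syllable and returns a (category, register) pair. The assignment table follows the conventional pairing used for Sino-Vietnamese readings (ngang/huyền for level, hỏi/ngã for rising, sắc/nặng for departing, and sắc/nặng on stop-final syllables for entering); that table is an assumption of this sketch and is not spelled out in the text, and the function names are illustrative only.

```python
import unicodedata

# Combining marks that spell the five marked tones in Vietnamese orthography.
TONE_MARKS = {
    "\u0300": "huyen",  # grave accent
    "\u0309": "hoi",    # hook above
    "\u0303": "nga",    # tilde
    "\u0301": "sac",    # acute accent
    "\u0323": "nang",   # dot below
}

# Assumed conventional pairing: orthographic tone -> (category, register).
# Categories: level (平), rising (上), departing (去), entering (入).
CATEGORY = {
    "ngang": ("level", "upper"),
    "huyen": ("level", "lower"),
    "hoi": ("rising", "upper"),
    "nga": ("rising", "lower"),
    "sac": ("departing", "upper"),
    "nang": ("departing", "lower"),
}

STOP_FINALS = ("ch", "c", "p", "t")


def orthographic_tone(syllable: str) -> str:
    """Return the tone name encoded by the diacritic (unmarked = ngang)."""
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"


def tone_category(syllable: str) -> tuple:
    """Map a syllable onto one of the eight (category, register) slots."""
    tone = orthographic_tone(syllable)
    if syllable.lower().endswith(STOP_FINALS):
        # Stop-final syllables form the two entering tones; this sketch
        # treats sắc as the upper register and anything else as lower.
        return ("entering", "upper" if tone == "sac" else "lower")
    return CATEGORY[tone]


if __name__ == "__main__":
    for s in ["ma", "mà", "mả", "mã", "má", "mạ", "mác", "mạc"]:
        print(s, tone_category(s))
```

Run on the series ma, mà, mả, mã, má, mạ, mác, mạc, the sketch fills all eight slots exactly once, which is the sense in which the six orthographic tones plus two entering tones make an eight-tone system.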

  2. Main sentence structure:

    Both Chinese and Vietnamese share the basic {Subject + Verb + Object} (SVO) pattern:

    • 我 愛 小燕! (Wǒ ài Xiǎoyàn!) → Tôi yêu Tiểu-Yến! 'I love Xiaoyan'.

    Variants include object‑fronting:

    • 飯 我 吃了. (Fàn wǒ chī le.) → Cơm tôi ăn rồi. 'Meal, I already ate.'

    • 這 本書 我 看 了. (Zhè běnshū wǒ kàn le.) → Quyểnsách này tôi xem rồi! 'This book I have already read'.

    • 把 水果 帶 過來 請客. (Bǎ shuǐguǒ dài guòlái qǐngkè.) → Bưng tráicây đem quađây mời khách. 'Bring the fruits over here to treat our guests'.

    Dual subject:

    • 小燕 她 愛我. (Xiǎoyàn tā ài wǒ) → Tiểu Yến nó yêu tôi. 'Xiaoyan she loves me'.

    As the final example shows, both languages allow a doubled subject, as in 'Tiểu Yến nó...'. Aside from constructions involving direct and indirect objects, however, neither language permits an S+V+OO structure in which the object is expressed redundantly. At the same time, both languages allow the omission of either the subject or the object when the referent is contextually understood. This feature is of particular significance in comparative linguistics, as it highlights a structural parallel that serves as an important diagnostic criterion for establishing genetic or areal relationships among languages within the same family.


  3. "Isolate" construction

    Both Chinese and Vietnamese lack inflectional affixes to mark grammatical functions or syntactic relations, unlike Indo‑European languages. Instead, they rely on fixed lexical items to form stative, copulative, passive, active transitive, and qualificative constructions.

    Grammatical function words

    | Vietnamese | Chinese | Pinyin | English meaning |
    |---|---|---|---|
    | không | 不 | bù | 'negation' |
    | có | 有 | yǒu | 'there is' / 'exist' |
    | là (thì, SV thị) | 是 | shì | 'to be' |
    | bị | 被 | bèi | 'passive' (in Chinese also active, in Vietnamese only passive) |
    | được | 得 | dé | 'active/resultative' |
    | nó thôngminh | 她聰明 | tā cōngmíng | 'she is intelligent' |
    | cóphải | 是否 | shìfǒu | 'is it…?' |
    | cóphảilà | 是不是 | shìbùshì | 'is that…?' |
    | không (final particle) | 不 / 否 | bù / fǒu | '…isn’t it?', '…don’t you?' |


    Morphemic syllables functioning like affixes

    These morphemic forms operate in parallel across both languages because Vietnamese not only borrowed entire Chinese allomorphic sets but also employed them to construct polysyllabic words with identical semantic and structural properties.

    Vietnamese Chinese Pinyin English meaning
    hoanhỏ 花兒 (huār) 'flower'
    mainày 明兒 (míngr) 'tomorrow'
    họcgiả 學者 (xuézhě) 'scholar'
    tácgiả 作者 (zuòzhě) 'author'
    vôlễ 無禮 (wúlǐ) 'impolite'
    vôhiệu 無效 (wúxiào) 'ineffective'
    phithường 非常 (fēicháng) 'extraordinary'
    phichínhnghĩa 非正義 (fēizhèngyì) 'injustice'
    casĩ 歌手 (gēshǒu) 'singer' (手 shǒu ~ 士 shì)
    hoạsĩ 畫家 (huàjiā) 'painter' (家 jiā ~ 士 shì)
    nhàthơ 詩人 (shīrén) 'poet' (人 rén ~ nhà 家 jiā)

    This table highlights how:

    • Both languages use fixed lexical items instead of inflectional affixes.
    • Morphemic syllables act like affixes, producing polysyllabic compounds with parallel semantic structures.
    • Vietnamese systematically integrated Chinese morphemic sets into its own lexicon.

  4. Syllabic structure:

    The basic lexical building block in Vietnamese follows the pattern [initial + middle + final], most often CVC (consonant + vowel + consonant). Vietnamese, like Chinese, favors consonant‑initial syllables (with relatively few vowel‑initial words). These syllables share simple consonants without clusters, such as /k/, /c/, /t/, /ʈ/, /n/, /ŋ/, /ɲ/, etc. Medial glides like ‑w‑ and ‑j‑ are also common. For example:

    • xoang [swaːŋ˧˧] vs. 腔 (qiāng, MC /hɑŋ⁵⁵/)
    • hương [hɨəŋ˧˧] vs. 香 (xiāng, MC /hœːŋ⁵⁵/)

    Sinitic‑Vietnamese vocabulary remains especially close to Middle Chinese, particularly in finals that evolved from Old Chinese endings such as /‑wng/ and /‑wk/.
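The [initial + middle + final] pattern can be illustrated with a rough orthography-based splitter. This is a minimal sketch, not a phonological analysis: it works on spelling only, it leaves any medial glide inside the vowel portion, and the onset and coda inventories listed in the code are assumptions made for the illustration.

```python
import unicodedata

# Combining marks for the five marked tones (vowel-quality marks are kept).
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}

# Assumed orthographic inventories, multi-letter spellings first so that
# "ngh" is tried before "ng" and "n". "\u0111" is the letter đ.
ONSETS = ("ngh", "ch", "gh", "gi", "kh", "ng", "nh", "ph", "qu", "th", "tr",
          "b", "c", "d", "\u0111", "g", "h", "k", "l", "m", "n", "p", "r",
          "s", "t", "v", "x")
CODAS = ("ch", "ng", "nh", "c", "m", "n", "p", "t")


def strip_tone(syllable: str) -> str:
    """Remove the tone mark but keep marks such as circumflex, breve, horn."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def split_syllable(syllable: str) -> tuple:
    """Split a written syllable into (initial, vowel portion, final)."""
    s = strip_tone(syllable)
    onset = next((o for o in ONSETS if s.startswith(o)), "")
    rest = s[len(onset):]
    coda = next((c for c in CODAS if rest.endswith(c)), "")
    nucleus = rest[:len(rest) - len(coda)] if coda else rest
    return onset, nucleus, coda


if __name__ == "__main__":
    for word in ["thống", "quốc", "xoang", "hương", "ân"]:
        print(word, split_syllable(word))
```

In this sketch xoang comes out as x + oa + ng, with the medial glide carried in the vowel portion, while ân shows the less common vowel-initial shape with an empty initial.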

Syllabic correspondences

Viet. IPA MC Mandarin Chinese Meaning
thống [tʰəwŋ˧˥] /thowng5/ tòng /tʊŋ⁵⁵/ 'pain'
đông [ɗəwŋ˧˧] /downg1/ dōng /tong1/ 'east'
cốc [kəwk˧˥] /kowk7/ gǔ /kʊk̚⁵/ 'cereal'
tốc [təwk˧˥] /towk7/ sù /su4/ 'fast'
quốc [kwək̚˧˥] /kwok/ guó /kuɔ³⁵/ 'nation'
mục [muk̚˧˥] /muwk/ mù /mu⁵¹/ 'eye'
bạch [ɓaɪk̚˧˥] /baek/ bái /pai³⁵/ 'white'
lực [lɨk̚˧˥] /liwk/ lì /li⁵¹/ 'strength'
học [hɔk̚˧˥] /haewk/ xué /ɕyɛ³⁵/ 'study'
quách [kwaːk̚˧˥] /kwæk/ guō /kuɔ⁵⁵/ 'outer wall'

This expanded table illustrates:

  • The CVC pattern is dominant in both Vietnamese and Middle Chinese.
  • Final consonants like ‑ng, ‑k, ‑c, ‑ch are preserved in Vietnamese, reflecting Middle Chinese phonology.
  • Mandarin often simplifies or shifts these finals, but the historical link remains visible.

Table 7 - Finals across Vietnamese, Middle Chinese, Mandarin, Cantonese, and Minnan

| Endings | Vietnamese | MC | Mandarin | Cantonese | Hokkien | Chin. | Meaning |
|---|---|---|---|---|---|---|---|
| ‑ng | thống [tʰəwŋ˧˥], đông [ɗəwŋ˧˧] | /‑wng/ | tòng 痛 /tʊŋ⁵⁵/, dōng 東 /toŋ⁵⁵/ | tung3 痛, dung1 東 | thàng 痛, tang 東 | 痛, 東 | 'pain', 'east' |
| ‑k | cốc [kəwk̚˧˥], tốc [təwk̚˧˥], quốc [kwək̚˧˥] | /‑wk/, /‑ok/ | gǔ 榖 /ku³⁵/, sù 速 /su⁵¹/, guó 國 /kuɔ³⁵/ | guk1 榖, cuk1 速, gwok3 國 | kok 榖, chok 速, kok 國 | 榖, 速, 國 | 'cereal', 'fast', 'nation' |
| ‑t | mật [mət̚˧˥], quật [kwət̚˧˥] | /‑t/ | mì 蜜 /mi⁵¹/, qū 屈 /tɕʰy⁵⁵/ | mat6 蜜, wat1 屈 | bat 蜜, oat 屈 | 蜜, 屈 | 'honey', 'bend' |
| ‑p | hợp [həp̚˧˥], thập [tʰəp̚˧˥] | /‑p/ | hé 合 /xɤ³⁵/, shí 十 /ʂɨ³⁵/ | hap6 合, sap6 十 | hap 合, chap 十 | 合, 十 | 'combine', 'ten' |
| ‑m | tâm [təm˧˥], nam [nam˧˥] | /‑m/ | xīn 心 /ɕin⁵⁵/, nán 南 /nan³⁵/ | sam1 心, naam4 南 | sim 心, lam 南 | 心, 南 | 'heart', 'south' |
| ‑n | ân [ən˧˥], sơn [səːn˧˥] | /‑n/ | ēn 恩 /ən⁵⁵/, shān 山 /ʂan⁵⁵/ | jan1 恩, saan1 山 | in 恩, soaⁿ 山 | 恩, 山 | 'grace', 'mountain' |

 

Key insights
  • Vietnamese and Cantonese both preserve final stops (‑p, ‑t, ‑k) and nasals (‑m, ‑n, ‑ng), making them closer to Middle Chinese than Mandarin.
  • Hokkien also retains these finals, showing strong parallels with Vietnamese.
  • Mandarin has lost most final stops, simplifying them into open vowels or tonal changes.
  • This preservation explains why Vietnamese and southern Chinese dialects (Cantonese, Hokkien) are especially valuable for reconstructing Tang‑era rhyme schemes and tonal systems.
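The point about final stops can be checked mechanically against the forms cited in Table 7. The sketch below is only an illustration under stated assumptions: it inspects the romanized spelling rather than the pronunciation, it treats Vietnamese ‑c and ‑ch as spellings of a final /k/, and the helper name final_stop is hypothetical.

```python
import unicodedata


def final_stop(form: str):
    """Return 'p', 't' or 'k' if the romanized syllable ends in a stop, else None."""
    # Decompose, then keep ASCII letters only: drops tone marks and tone digits.
    base = "".join(
        ch for ch in unicodedata.normalize("NFD", form.lower())
        if ch.isascii() and ch.isalpha()
    )
    if base.endswith(("ch", "c", "k")):
        return "k"  # assumption: Vietnamese -c / -ch spell a final /k/
    if base.endswith(("p", "t")):
        return base[-1]
    return None


# (Sino-Vietnamese, Cantonese, Mandarin) forms taken from Table 7.
ROWS = [
    ("cốc", "guk1", "gǔ"),
    ("tốc", "cuk1", "sù"),
    ("quốc", "gwok3", "guó"),
    ("mật", "mat6", "mì"),
    ("quật", "wat1", "qū"),
    ("hợp", "hap6", "hé"),
    ("thập", "sap6", "shí"),
]

if __name__ == "__main__":
    for viet, canto, mandarin in ROWS:
        print(viet, final_stop(viet), "|", canto, final_stop(canto),
              "|", mandarin, final_stop(mandarin))
```

For every row, the Vietnamese and Cantonese forms end in the same stop while the Mandarin form ends in none, which is the pattern the key insights describe.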

  1. Basic vocabulary stock

    This commonality is most evident in the basic vocabulary stock, where the shared etyma correspond in every lexical aspect, word by word.

    Core vocabulary stock across Vietnamese, Chinese, and reconstruction systems

    | Viet. | Chinese | Pinyin | Pulleyblank MC | Baxter–Sagart MC | Zhengzhang MC | Baxter–Sagart OC | Meaning |
    |---|---|---|---|---|---|---|---|
    | nạ | 娘 | niáng | nɨaŋ | nɨaŋ | njaŋ | nraŋ | 'mother' |
    | tía | 爹 | diē | tja | tja | tja | *jaʔ | 'dad' |
    | bố | 父 | fù | bjuH | bjuH | bjuH | pəʔ-s | 'father' |
    | xơi | 食 | shí | ʑiək | d͡ʑiɪk | ɕiək | l̥ek | 'eat (meal)' |
    | ăn | 唵 | ǎn | ʔomX | ʔomX | ʔomX | ʔˤəmʔ | 'eat' |
    | ngủ | 臥 | wò | ŋwaH | ŋwaH | ŋwaH | ŋˤwaʔ | 'sleep' |
    | xem | 瞧 | qiáo | dzew | dzew | dzew | dzew | 'look' |
    | mắt | 目 | mù | mjuk | mjuwk | mjuk | mˤuk | 'eye' |
    | đầu | 頭 | tóu | duw | duw | duw | lˤu | 'head' |
    | ngực | 臆 | yì | ʔik | ʔik | ʔik | ʔik | 'chest' |
    | phổi | 肺 | fèi | piajH | pɨajH | piajH | pˤi[t]-s | 'lung' |
    | gạo | 稻 | dào | dawH | dawH | dawH | lˤuʔ-s | 'rice' |
    | cá | 魚 | yú | ŋjo | ŋjo | ŋjo | ŋa | 'fish' |
    | lửa | 火 | huǒ | xwaX | xwaX | xwaX | qʰˤajʔ | 'fire' |
    | lá | 葉 | yè | jep | jep | jep | lˤap | 'leaf' |
    | nhà | 家 | jiā | ka | kra | kra | kra | 'home' |
    | lợn | 豚 | tún | dwin | dwon | dwin | lˤu[n] | 'pig' |
    | trồng | 種 | zhòng | tsyuwngX | t͡ɕjuwngX | tsyuwŋX | toŋʔ | 'cultivate' |
    | săn | 田 | tián | den | den | den | l̥ˤiŋ | 'hunt' |

    Key insights

    • Vietnamese reflexes often preserve final consonants (‑p, ‑t, ‑k, ‑m, ‑n, ‑ng) that are visible in Middle Chinese and traceable back to Old Chinese.

    • Pulleyblank, Baxter–Sagart, and Zhengzhang differ in notation, but all show the same underlying structure.

    • Old Chinese reconstructions reveal deeper roots: for example, bố (父) traces back to OC pəʔ-s, while mắt (目) reflects OC mˤuk.

    • This demonstrates how Vietnamese, through its Sino‑Vietnamese layer, preserves archaic phonological features that connect directly to Old Chinese.

    • Across all three systems, the finals (‑p, ‑t, ‑k, ‑m, ‑n, ‑ng) are consistently preserved, which explains why Vietnamese reflexes remain so close to Middle Chinese.


  2. Shares of dialectal origin

    Many everyday Vietnamese words and expressions show clear correspondences with southern Chinese dialects (Cantonese, Hainanese, Fukienese/Minnan), reflecting centuries of contact and borrowing.

    | Vietnamese | Chinese | Pinyin / Dialect | Dialectal Source | English meaning |
    |---|---|---|---|---|
    | đúngrồi | 中了 | zhòngle | Mandarin | 'correct' |
    | đượcrồi | 得了 | déle | Mandarin | 'that’s okay!' |
    | luônluôn | 老老 / 牢牢 | láoláo | Mandarin | 'always' |
    | ngàymai | 明兒 | míngr | Mandarin (colloquial) | 'tomorrow' |
    | nóichuyện | 聊天 | liáotiān | Mandarin | 'talk' |
    | ngầu |  | niú | Mandarin (slang) | 'hefty, cool' |
    | đánhcá | 打魚 | dǎyú | Mandarin | 'net fishing' |
    | gàcồ / gàtrống | 雞公 | jīgōng | Hainanese, Fukienese | 'rooster' |
    | gàmái | 雞母 | jīmǔ | Hainanese, Fukienese | 'hen' |
    | mắtkiếng | 目鏡 | mùjìng | Hainanese | 'eye‑glasses' |
    | biết |  | /bat1/ | Hainanese, Fukienese | 'know' |
    | soài |  | shē (Fukienese /suã/) | Fukienese (Minnan) | 'mango' |
    | con |  | /kiaŋ/, /kiã/, /kẽ/ | Fukien (Fuzhou) | 'son, child' |
    | chạy | 走 | /zau2/ | Cantonese | 'run' |
    | xơi | 食 | shí /ʂʐ̩³⁵/ | Mandarin / Cantonese | 'eat (meal)' |
    | uống | 飲 | /jam3/ | Cantonese | 'drink' |

    Key observations

    • Mandarin colloquialisms contributed forms like đúngrồi, đượcrồi, ngàymai.
    • Southern dialects (Hainanese, Hokkien, Cantonese) provided many everyday terms: gàcồ, gàmái, mắtkiếng, soài, con, chạy, uống.
    • These borrowings highlight the layered dialectal influence on Vietnamese, beyond the classical Sino‑Vietnamese stratum tied to Middle Chinese.

  4. Disyllabicity:

    Vietnamese vocabulary is overwhelmingly disyllabic, paralleling Chinese compounds. These forms are frequent in daily usage and often show either direct equivalence or creative semantic composition.

    Common disyllabic vocabulary

    Vietnamese Chinese Pinyin English meaning
    siêngnăng 勤勉 qínmiǎn 'industrious'
    làmsiêng 勤勞 qínláo 'hardworking'
    nonsông 江山 jiāngshān 'country' (lit. 'river + mountain')
    ánhmắt 目光 mùguāng 'the look'
    ánhnắng 陽光 yángguāng 'sunlight'
    giàucó 富有 fùyǒu 'wealthy'


    Peculiar Semantic Compositions

    | Viet. | Chinese | Pinyin | Baxter–Sagart MC | Literal meaning | Semantic sense |
    |---|---|---|---|---|---|
    | bàntay | 手板 | shǒubǎn | /*ʑuwX panX/ | 'panel of the palm' | 'hand' |
    | cổchân | 腳脖子 | jiǎobózi | /*kjak pwot/ | 'neck of the foot' | 'ankle' |
    | khuônmặt | 面孔 | miànkǒng | /*menH khuwngX/ | 'frame of a face' | 'face' |
    | dướiquê | 鄉下 | xiāngxià | /*qʰjaŋ hæH/ | '(down there in the) countryside' | 'countryside' |
    | đoáhoa | 花朵 | huāduǒ | /*xwa taX/ | '(a stem of) flower' | 'flower' |


    Key observations

    • High frequency of disyllables: Vietnamese, like Chinese, relies heavily on two‑syllable compounds.
    • Semantic creativity: Many compounds are built from vivid metaphors (cổchân = 'neck of the foot').
    • Reverse morpheme order: Some Vietnamese disyllables invert the order compared to Chinese, reflecting local syntactic patterns at the time the disyllabic words were formed, likely during the Tang Dynasty.

  5. Morphemic syllable, a building unit to coin new words:

    Morphemic syllable compounds across Vietnamese, Chinese, and reconstruction systems

    | Viet. | Chinese | Pinyin | Pulleyblank MC | Baxter-Sagart MC | Zhengzhang MC | Baxter-Sagart OC | Meaning |
    |---|---|---|---|---|---|---|---|
    | bồihồi | 徘徊 | páihuái | baj hwaj | bɨaj hwaj | bɨaj hwaj | bˤəj ɡʷˤəj | 'melancholy; hesitation' |
    | yêuđương | 愛戴 | àidài | ʔaj tajH | ʔajH tajH | ʔaj tajH | qˤəʔ-s lˤəks | 'love' |
    | khổsở | 苦楚 | kǔchǔ | khuX tshoX | kʰuX t͡ʂʰoX | kʰuX t͡ʂʰoX | kʰˤaʔ tʰraʔ | 'hardship' |
    | mắcbệnh | 犯病 | fànbìng | bjiamH biajŋH | bjomH biajŋH | bjomH biajŋH | bom-s breŋ-s | 'to be sick' |
    | bắtcóc | 綁架 | bǎngjià | paŋX kaeH | paŋX kraeH | paŋX kraeH | pˤaŋʔ kraʔ-s | 'kidnap' |
    | cẩuthả | 苟且 | gǒuqiě | kuwX tshjaX | kuwX t͡ɕʰjaX | kuwX t͡ɕʰjaX | kˤoʔ tʰjaʔ | 'sloppy; careless' |

    Key observations

    • Pulleyblank: Compact, practical transcription.
    • Baxter-Sagart MC: Detailed, with tone categories (X, H) and vowel quality.
    • Zhengzhang: Similar to Baxter–Sagart but with slightly different phonetic assumptions.
    • Old Chinese (Baxter-Sagart OC): Pushes the etyma further back, often reconstructing laryngeals, uvulars, and clusters that explain later MC reflexes.
    • Vietnamese forms often mirror MC phonology, but their semantic structures are preserved from Chinese compounds.

  1. Syllabic parallel compounds (in synonymous / antonymous / reduplicative forms)
        Both Vietnamese and Chinese contain a large number of monosyllabic words, many of which are homonyms due to the limited set of possible syllable structures {(C) + V + (C)}. In Chinese alone, nearly 80,000 characters have accumulated within this phonetic framework.

        To reduce ambiguity, both languages developed a strong tendency toward compounding. This involves combining two monosyllables—often synonyms, antonyms, or semantically related morphemes—into disyllabic words. Vietnamese also makes extensive use of reduplication and parallel morphemic compounds, closely mirroring Chinese strategies.

    Synonymous / Antonymous compounds

    Vietnamese Chinese Pinyin Literal meaning Semantic sense
    đấtđai 土地 tǔdì 'soil + land' land
    thươngyêu 疼愛 téng'ài 'affection + love' love
    buồnrầu 愁悶 chóumèn 'sad + sorrowful' sorrow
    chịuđựng 承受 chéngshòu 'take + accept' endure
    tìmkiếm 尋找 xúnzhǎo 'seek + search' search
    chimchóc 禽雀 qínquè 'fowls + birds' birds
    caothấp 高低 gāodī 'high + low' contrast in height
    trêndưới 上下 shàngxià 'above + below' positional relation

    Reduplicative disyllabics


    Vietnamese Chinese Pinyin English meaning
    liênmiên 連綿 liánmián continuous
    mongmanh 渺茫 miǎománg slim, faint
    lôithôi 囉嗦 luōsuō verbose
    dễdàng 容易 róngyì easily
    lòngthòng 籠統 lǒngtǒng long-winded, loose

    Morphemic Parallel Compounds


    Vietnamese Chinese Pinyin Literal meaning Semantic sense
    cayđắng 辛苦 xīnkǔ 'spicy hot + bitter' hardship, suffering
    lạigần 離近 líjìn 'far + near' get closer
    dìghẻ 姨姨 yíyí 'aunt + aunt' stepmother

    Key Observations

    • Monosyllabic overload: Both languages face heavy homonymy due to limited syllable inventories.
    • Compounding as resolution: Disyllabic compounds reduce ambiguity and enrich expression.
    • Semantic strategies: Compounds may be built from synonyms, antonyms, reduplication, or parallel morphemes.
    • Vietnamese-Chinese parallels: The structural similarity reflects deep historical borrowing and adaptation, especially during the Tang period.

  3. Similarities in colloquial, and idiomatic expressions:

    Both languages, by their intrinsic nature, exhibit distinctive attributes across many dimensions—particularly in dialectal variation and colloquial usage. These parallels are evident in expressions that surface in everyday speech, such as:

      • tạitôi (賴我 làiwǒ, 'because of me')
      • vìsao (為啥 wèishá, 'how come')
      • làmviệc (幹活 gànhuó, 'work')
      • chồmhổm (犬坐 quǎnzuò, 'squat')
      • răngkhểnh (犬牙 quǎnyá, 'canine')
      • saocứ (總是 zǒngshì, 'how come')
      • tấtcả (大家 dàjiā, 'everybody')
      • mauchóng (馬上 mǎshàng, 'immediately')
      • ítra (起碼 qǐmǎ, 'at least')
      • trờinắng (太陽 tàiyáng, 'sunshine')
      • đâunào (那裡 nàlǐ, 'where')
      • đểý (在意 zàiyì, 'to mind')
      • Tênnàythậttếu. (這個人挺逗 zhègèréntǐngdòu, 'this person is really funny')

      Culturally embedded idioms further illustrate the shared conceptual framework:

        • uốngnướcnhớnguồn (飲水思源 yǐnshuǐsīyuán, 'drink water and remember its source')
        • lárụngvềcội (葉落歸根 yèluòguīgēn, 'a fallen leaf returns to its root')
        • ếchngồiđáygiếng (井蛙之見 jǐngwāzhījiàn, 'a frog’s view from the bottom of a well')
        • sưtửHàđông (河東獅子 Hédōngshīzǐ, 'tiger wife')

        Classifiers and their function as pronouns:

        Grammatical and functional classifiers in both Vietnamese and Chinese serve to specify objects, facts, or instances. Typically positioned before nouns, they may also function independently as pronouns. Their usage is virtually parallel in both languages, exemplified by:

          • cái (個 gè, 'a unit of')
          • chiếc (隻 zhī, 'a piece of')
          • đôi (對 duì, 'a pair of')
          • con (子 zǐ, 'a head of')
          • cuốn (卷 juǎn, 'a roll of')
          •  (把 bǎ, 'a bunch of')
          • chìa (匙 chí, 'a stick of')
          • trang (張 zhāng, 'a sheet of')
          • trận (陣 zhèn, 'an instance of')
          • cục (塊 kuài, 'a lump of')
          • miếng (片 piàn, 'a slice of')
          • cơn (場 chǎng, 'a round of')
          • chuyện (件 jiàn, 'a matter of')
          • ván (盤 pán, 'a game of')
          • cuộc (局 jú, 'a round of')
          • bữa (飯 fàn, 'a meal')

        Semantically and syntactically, these classifiers often pair statically with specific lexical items, forming tightly bound units that convey precise categorical meaning. Each classifier anchors a distinct semantic realm—whether quantifying animate beings, abstract events, or physical objects.

        Moreover, phonosemantic patterns in Vietnamese suggest that initial consonants such as /b-/, /f-/, /ph-/, and their derivatives /x-/, /gi-/, /z-/ often evoke imagery of gliding, swelling, or airy movement. This is reflected in expressions like:

          • phậpphồng (彭彭 péngpéng, 'erratic heartbeat')
          • bềnhbồng (泛泛 fànfàn, 'floating and drifting')
          • phấtphới (飄飄 piāopiāo, 'wavering')
          • phầnphật (翩翩 piānpiān, 'flutter')
      These examples illustrate not only lexical convergence but also shared cognitive metaphors embedded in sound symbolism across both languages.



      1. Particles:

        Grammatical particles are typically appended to the end of a sentence to convey directionality, emotional tone, or the speaker’s attitude toward a given state of affairs. These particles function similarly in both Vietnamese and Chinese, often serving pragmatic or modal roles. Examples include:

          • đây as in Lênđây! (上來 Shànglái, 'Come up here!')
          • đi as in Vềđi. (回去 Huíqù, 'Go home.')
          • ơi as in Trờiơi! (天啊 Tiānna, 'My Lord!')
          • nè as in Tôilàynè. (是 我 呢 Shì wǒ ne, 'It's me.')
          • nhanhé as in Tôi ăn nha. (我吃啦 Wǒ chī lā, 'I eat now.')
          • rồi as in Chạykhôngnổinữa rồi! (走不了了呢! Zǒu bù liǎo le ne!, 'I cannot walk anymore!')

        These particles, though often monosyllabic, carry nuanced semantic weight and are integral to the expressive rhythm of both languages. Their placement and usage reflect shared syntactic tendencies and pragmatic functions across the Sinitic-Vietnamese continuum.

      2. Prepositions and Conjunctions

        Virtually all Vietnamese prepositions and conjunctions trace their origin to Chinese functional words, known as 虛辭 xūcí (hưtừ). Their semantic roles and syntactic behavior are nearly identical in both languages, forming a shared grammatical substrate. Examples include:

          • và (和 hé, 'and')
          • với (與 yǔ, 'with')
          • từ (自 zì, 'from')
          • nếu (若 ruò, 'if')
          • vì (為 wèi, 'because')
          • nhưngmà (然而 rán’ér, 'but')
          • vìthế (於是 yúshì, 'therefore')
          • dođó (所以 suǒyǐ, 'hence')
          • dùrằng (雖然 suīrán, 'although')
          • dovì (bởivì) (由於 yóuyú, 'due to')

        These functional elements not only mirror each other in form and meaning but also reflect a deep historical convergence in syntactic logic. Their consistent pairing across both languages underscores the embeddedness of Chinese grammatical architecture within Vietnamese discourse.

      3. Grammatical markers:

        Grammatical markers in Chinese are lexical units that fulfill syntactic functions by framing, fossilizing, or abstracting fixed expressions into stative or nominalized forms. Many of these have evolved into stand-alone words denoting conditions, circumstances, or abstract states of affairs. Their origins lie in classical Chinese (文言文 wényánwén), and they remained in active usage across both languages well into the early 20th century.

        Over time, these markers came to represent syntactic units that encode grammatical abstraction, often stative, circumstantial, or nominal in nature. Most are vestiges of classical constructions, preserved and recontextualized in Vietnamese usage. Examples include:

          • sựchuẩnbị (有所準備 yǒusuǒzhǔnbèi, 'a state of being prepared')
          • cáigọilà (所謂 suǒwèi, 'the so-called')
          • cáitôicó (~>củatôi) (我所有 wǒsuǒyǒu, '(of) mine')
          • cáiviệcnólàm (他所作所爲 tāsuǒzuòsuǒwéi, 'what he has done')
          • ởtrong (其中 qízhōng, 'among')
          • cáikhác (其他 qítā, 'other')

        These markers reflect a shared grammatical architecture rooted in classical syntax and fossilized constructions. Vietnamese equivalents retain identical functional roles, often mirroring Chinese word order and semantic scope.

        Syntactic alignment also extends to pronoun formation and grammatical sequencing. For instance:

          • đằngnầy (~>đằngấy) (我等 wǒděng, 'we all' > 'thou')
          • Chúngmình (~> chúngta) (咱們 zánmen, 'we all')

        In essence, virtually every grammatical feature found in Chinese can be identified in Vietnamese and its derivatives—and vice versa. This includes not only lexical items and particles but also structural conventions and word order, underscoring the deep historical entanglement between the two linguistic systems.

      5. Analytical convergence of intimate linguistic features

        Analytically, these linguistic features represent a uniquely intimate and culturally embedded stratum, peculiar to languages of close affiliation that share internal grammatical traits, phonosemantic patterns, and historical depth. Beyond structural parallels, cultural and emotional nuances are delicately encoded within lexical items and expressions.

        For example:

        • mẹruột (親媽 qīnmā, 'natural mother')
        • charuột (親爹 qīndiē, 'natural father')
        • mẹghẻ (繼母 jìmǔ, 'stepmother')
        • chaghẻ (繼爹 jìdiē, 'stepfather')

        These kinship terms reflect not only semantic equivalence but also shared cultural sentiment and familial hierarchy embedded in both languages.

        Furthermore, expressive pragmatics—especially in profanity or emotional outbursts—reveal striking phonetic and semantic convergence. For instance, when a Chinese speaker exclaims Tāmà! (他媽, 'his mother'), the utterance closely mirrors the Vietnamese Đụmá, both in phonetic contour and emotional force. This parallel is reinforced by:

        • Cantonese 屌 diu (/tjew3/) ≈ Vietnamese đéo or đụ (Semantic equivalent: English "Fuck you!")

        Such examples illustrate not only lexical overlap but also a shared expressive logic rooted in colloquial speech and emotional immediacy. The convergence of form, function, and sentiment across these expressions underscores the deep interconnectivity between Vietnamese and Chinese, linguistically, culturally, and historically.

      With minimal premeditation and moderate effort, one can often render complete sentences from Chinese into Vietnamese on a near word-for-word basis, wherein each lexical item mirrors its counterpart with striking textual connotation. Clauses and phrases are likewise constructed with parallel syntactic architecture and rhetorical texture, allowing for seamless transposition between the two languages.

      This phenomenon is especially evident in the translation of Chinese classics and martial arts novels, such as Romance of the Three Kingdoms, Water Margin by the Ming-era author Shi Nai’an (Thị Nại-Am), Romance of an Archer by Hong Kong’s Jin Yong (Kim Dung), and the works of Gu Long (Cổ Long). Since the early 20th century, these texts have been translated into Vietnamese using a method that preserves their original structure and diction. For native-born speakers, the archaic style poses no barrier to comprehension; rather, it evokes a literary elegance akin to the stylized English of Hamlet or other Shakespearean works.

      Modern Vietnamese readers continue to engage with these Chinese classics effortlessly, generation after generation, despite the dense layering of Chinese semantic, syntactic, and lexical features embedded in their Sino-Vietnamese transliterations. These texts, while richly evocative for native audiences, remain challenging for foreign learners of Vietnamese, who must exert considerable effort to master the linguistic intricacies and historical registers encoded within.

      In short, if a Vietnamese speaker recognizes approximately 3,000 individual Chinese characters, they can read virtually any classical or modern Chinese literary work with remarkable ease—an ability unmatched by speakers of other languages without considerable effort. This is due to the fact that each character (字) in Chinese corresponds to numerous disyllabic words in Vietnamese, many of which are already embedded in the speaker’s core vocabulary by the time they complete middle school education.

      For example, a Westerner learning modern Chinese must treat the following as six distinct lexical items:

        • guó 國 ('state')
        • jiā 家 ('home')
        • guójiā 國家 ('nation')
        • fù 婦 ('wife')
        • nǚ 女 ('female')
        • fùnǚ 婦女 ('woman')

      A Vietnamese reader familiar with quốc, gia, quốcgia, phụ, nữ, and phụnữ, by contrast, has already internalized these as cohesive units through Sino-Vietnamese vocabulary.
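
      The point can be sketched in a few lines of Python. The mapping holds only the four character readings cited above; the names READINGS and compose are hypothetical, introduced here purely for illustration, and the solid spelling of compounds follows this document's own convention.

```python
# Sino-Vietnamese readings for the four characters cited above.
READINGS = {"國": "quốc", "家": "gia", "婦": "phụ", "女": "nữ"}


def compose(compound: str) -> str:
    """Spell a Chinese compound by joining its per-character SV readings.

    Raises KeyError for any character missing from READINGS.
    """
    return "".join(READINGS[ch] for ch in compound)


for word in ("國家", "婦女"):
    print(word, "->", compose(word))
```

      Once the single-character readings are known, the compounds come at no extra cost, which is the sense in which the six items collapse into four for a Vietnamese reader.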

      When Vietnamese is set beside Chinese dialects such as Mandarin, Yue (粵), and Minnan (閩南), the divergence among those dialects resembles the difference between Vietnamese and Mandarin itself. Despite Vietnam’s political independence since the 10th century, the language absorbed extensive Sinicized elements over two millennia of cultural and administrative contact.

      The early Annamese vernacular, spoken by the indigenous masses including Mường groups and local wives of Chinese soldiers, gradually evolved from a Yue-based substratum. Dialects such as Cantonese, Fukienese, and Teochew likely originated from the same aboriginal root, with successive Sinitic layers accumulating over time, namely, Old Chinese, Ancient Chinese, Middle Chinese, and Mandarin. These layers were reinforced by the court language brought by mandarins, embedding Sinitic structures into Vietnamese society for over 2,000 years.

      Had Vietnam not secured its sovereignty in the 10th century, its linguistic fate might have mirrored that of Cantonese or Fukienese, languages still considered Chinese dialects by national linguistic institutions.

      Even today, Vietnamese retains lexical items on par with those found in both Mandarin and Cantonese. However, semantic discrepancies between Mandarin and Cantonese often reveal subtle shifts. Consider the following comparative examples.

      Table 8 - Lexical parallels and semantic drift

      Concept | Cantonese | Mandarin | Sino-Vietnamese | Sinitic-Vietnamese
      where | /pin5dou2/ | 那裏 nàlǐ | nalí | nơiđâu, nơiấy, nơiđó
      sleep | /fajng1kao1/ | 睡覺 shuìjiào | thuỵgiác | giấcngủ, đingủ
      eat | /sək8/ | 吃 chī / 食 shí | ngật / thực | xơi, ăn
      drink | /jam3/ | 喝 hē / 飲 yǐn | hát / ẩm | uống, hớp, húp
      urinate | /o5niew2/ | 尿 niào | niếu | tiểu, đái, điđái
      tired | /kwuj2/ | 累 lèi | luỵ | mỏi
      see | /tʌj3/ | 見 jiàn | kiến | thấy
      descend | /lɔt8/ | 下 xià | hạ | xuống
      take | /lɔ3/ | 拿 ná | | lấy
      go | /hoj1/ | 去 qù | khứ | đi
      run | /zau2/ | 走 zǒu / 跑 pǎo | tẩu / bào | chạy

      These examples illustrate not only phonological and semantic alignment but also the shared etymological depth across dialects and Vietnamese. In some cases, Vietnamese forms diverge from Mandarin yet remain closer to Yue or Minnan pronunciations, suggesting a layered borrowing process shaped by regional contact and dynastic influence.

      Between Vietnamese and Chinese, grammatical discrepancies are minimal, chiefly limited to syntactic word order. Vietnamese typically follows a reversed structure, such as {noun + adjective}, whereas Chinese prefers {adjective + noun}. Despite this inversion, the deeper convergence lies not in syntax but in etymology, where the lexical roots of Vietnamese and Chinese reveal profound historical entanglement, far deeper than proponents of Austroasiatic Mon-Khmer theory might concede.

      Etymologically, the following examples demonstrate that it is possible to formulate rules of derivation from a core word-concept, generating plausible cognates across both Chinese and Sinitic-Vietnamese domains. Though restrictive in scope, the principle holds: if a majority of etyma within a semantic category exhibit phonological and contextual alignment, they likely share a common origin, even when phonetic discrepancies suggest otherwise.

      The generalization is as follows: if lexical items carry consistent etymological traits, exhibit parallel phonological peculiarities, and encode similar contextual connotations, they may plausibly derive from the same root, most often as loanwords from Chinese.

      The illustrated list below will further demonstrate how lexical transformation occurs through two complementary methodologies:

      1. Semantic analogy approach – identifying conceptual parallels across languages
      2. Disyllabic sound change approach – tracing phonological shifts across cognate forms

      Together, these frameworks enable the identification of candidate patterns for sound shifts and semantic evolution, topics to be explored in greater detail as the analysis progresses.

      • 'đầu' (head) 頭 tóu – 'sọ' (cranium) 首 shǒu,
      • 'mặt' (face) 面 miàn – 'mày' (eyebrow) 眉 méi,
      • 'mắt' (eye) 目 mù – 'mũi' (nose) 鼻 bí,
      • 'gan' (liver) 肝 gān – 'ruột' (intestines) 腸 cháng,
      • 'sống' (live) 生 shēng – 'chết' (die) 死 sǐ,
      • 'ăn' (eat) 唵 ǎn – 'uống' (drink) 飲 yǐn,
      • 'khóc' (weep) 哭 kū – 'cười' (laugh) 笑 xiào,
      • 'đi' (walk) 去 qù – 'đứng' (stand) 站 zhàn,
      • 'chạy' (run) 走 zǒu – 'nhảy' (jump) 跳 tiào,
      • 'nặng' (heavy) 重 zhòng – 'nhẹ' (light) 輕 qīng,
      • 'cao' (high) 高 gāo – 'thấp' (low) 底 dǐ,
      • 'dài' (long) 長 cháng – 'ngắn' (short) 短 duǎn,
      • 'lạnh' (cold) 冷 lěng – 'nóng' (hot) 燙 tàng,
      • 'hay' (good) 好 hǎo – 'dở' (bad) 亞 yà,
      • 'buồn' (sad) 悶 mèn – 'vui' (happy) 快 kuài,
      • 'gần' (near) 近 jìn – 'xa' (far) 遐 xiá,
      • 'trước' (before) 前 qián – 'sau' (after) 後 hòu,
      • 'cũ' (old) 舊 jiù – 'mới' (new) 萌 méng,
      • 'đắng' (bitter) 辛 xīn – 'cay' (spicy hot) 苦 kǔ (SV khổ, 'bitter') [the meanings are switched here], etc.

      The postulation above is grounded in rational methodology, aiming to relate Sino-Vietnamese and Sinitic-Vietnamese lexicons through a parallel disyllabic approach. This framework enables the identification, analysis, and extraction of monosyllabic base forms from synonymous or compound Vietnamese expressions, such as chàilưới, xecộ, cậumợ, chúbác, and others. By applying this approach, we uncover reliable traces of sound change and semantic shift from Chinese into Vietnamese, even when such correspondences appear unconventional. These patterns have long been posited by Vietnamese linguistic specialists using traditional comparative methods.

      For example, the following associations illustrate semantic analogy and phonological convergence:

      • voi ~ 為 wéi ('elephant')
      • lúa ~ 來 lái ('unhusked rice grain')
      • gạo ~ 稻 dào ('rice') [cf. lúa, per Starostin]
      • nắng <~ trờinắng ~ 太陽 tàiyáng ('sunshine')

      In the same vein, indigenous lexicons embedded in the Chinese and Vietnamese zodiac systems reveal deep etymological ties. Each animal name in the Vietnamese cycle corresponds to a Chinese character and phonosemantic root:

      Vietnamese zodiac | Chinese zodiac | Sino-Vietnamese | Sinitic-Vietnamese | English meaning / Notes
      tý | 子 zǐ | tý | chuột | 'rat'; cf. 鼠 shǔ (SV thử)
      sửu | 丑 chǒu | sửu | trâu | 'ox'; cf. 牛 niú (SV ngưu, VS 'trâu')
      dần | 寅 yín | dần | cọp | cf. 虎 hǔ (SV hổ), also hùm, 甝 hán
      mẹo | 卯 mǎo (replaced by 兔 tù, SV 'thố') | mão | mèo | cf. 貓 māo (SV miêu); no Vietnamese speaker ever calls this sign 兔 tù (SV 'thố', VS 'thỏ')
      thìn | 辰 chén | thìn | rồng | cf. 龍 lóng (SV long)
      tỵ | 巳 sì | tỵ | rắn | cf. 蛇 shé (SV xà); 巳 as snake pictograph
      ngọ | 午 wǔ | ngọ | ngựa | perfect cognate
      mùi | 未 wèi | vị | dê | cf. 羊 yáng (SV dương), phonetic shift /j-/ ~ /d-/
      thân | 申 shēn | thân | khỉ | cf. 猴 hóu (SV hầu), 猢 hú (SV hồ), 猻 sūn (SV tôn), 猿 yuán (SV viên, VS vượn)
      dậu | 酉 yǒu | dậu | gà | cf. 雞 jī (SV kê)
      tuất | 戌 xū | tuất | chó | cf. 狗 gǒu (SV cẩu), 犬 quǎn (VS cún)
      hợi | 亥 hài | hợi | heo | cf. 腞 tùn, also lợn (Northern dialect)


      These correspondences affirm that mẹo 卯 (mǎo) must align with mèo ('cat'), not thỏ 兔 tù ('hare'). The origin of the twelve zodiac animals, in fact, likely stems from Southern Yue traditions, later adopted and codified by proto-Chinese civilizations.

      This etymological framework, combining semantic analogy and disyllabic sound change, offers a robust methodology for tracing lexical evolution and identifying cognate patterns across Sinitic and Vietnamese strata. 

      From a series of solid etyma within a shared semantic category, it becomes methodologically plausible to induce parallel derivations for other Vietnamese words. Consider the case of Tết, which can be etymologically traced to 節 jié (SV tiết), as in:

      • tiếtxuân (春節 Chūnjié, 'Spring Festival')
      • ănTết (過節 guòjié, 'celebrate the Spring Festival')
      • TếtNguyênđán (元旦節 Yuándànjié, 'New Year Festival')
      • TếtĐoanngọ (端午節 Duānwǔjié, 'Late Spring Festival')
      • TếtTrungthu (中秋節 Zhōngqiūjié, 'Mid-Autumn Festival')

      This analysis supports the identification of Tết as a direct cognate of 節 jié, with ănTết emerging from 過節 guòjié (SV quátiết). While 過 guò in Mandarin is /kwo4/ ('to pass'), SV quá /wa5/ semantically aligns with ăn. In this context, ăn /ɐn/ evolves beyond its literal meaning of 'to eat' and functions as a prefix denoting 'celebration', 'participation', or 'engagement'.

      Thus, 過 guò is reinterpreted as ăn, not merely through phonetic resemblance but via semantic elevation: ănTết becomes a conceptual equivalent of 'feasting', 'rejoicing', or 'partaking in festivities'. This transformation exemplifies what the author terms the principle of the sandhi process of association, wherein a Chinese disyllabic compound converges upon a Vietnamese prefix, forming new lexical constructions:

      * 過節 guòjié → ănTết \ M guò /wa5/ → ăn → ăn- / [ xz / x > y > Ø- ]

      Once ăn- assumes prefixal status, it becomes a productive morpheme capable of generating extended meanings such as 'to take in', 'to engage in', or 'to undergo'. This semantic expansion is evident in compounds like:

      • ăntấtniên (過小年 guòxiǎonián, 'feast before New Year')
      • ăntiệc (宴席 yànxí, 'banquet')
      • ăncưới (酒席 jiǔxí, 'wedding feast')
      • ănmừng (慶祝 qìngzhù, 'celebrate') [@ 祝 zhù ~ 食 shí, 吃 chī]

      and further:

      • ănmặc (衣食 yīshí, 'lifestyle')
      • ănuống (飲食 yǐnshí, SV ẩmthực, 'diet')
      • ănngon (吃香 chīxiāng, 'enjoy delicious food')
      • ănnói (言語 yányǔ, 'manner of speech')
      • ănhiếp (威脅 wēixié, 'bully')
      • ăntiền (贏錢 yíngqián, 'win money') / (要錢 yàoqián, 'extort money')
      • ănmày (要飯 yàofàn, 'beg')
      • ănhàng (吃貨 chīhuò, 'glutton / smuggler')
      • ănđòn (挨打 áidǎ, 'get beaten')
      • ăncắp (竊案 qiè’àn, 'steal')
      • ănbám (白吃 báichī, 'live off others')
      • ănnhậu (應酬 yìngchóu, 'social drinking')
      • ănthua (輸贏 shūyíng, 'compete / gamble')

      As a suffix, ăn continues to evolve:

      • làmăn (生意 shēngyì, 'do business')
      • đồăn, thứcăn (食物 shíwù, 'food')
      • ngánăn, biếngăn, nhịnăn (厭食 yànshí, 'anorexia')
      • thamăn, hamăn (貪吃 tānchī, 'gluttony')
      • háuăn (好食 hàoshí, 'appetite')
      • cóăn (有錢鑽 yǒuqiánzuàn, 'make money')
      • ănnóibợmtrợn (胡說霸道 húshuōbàdào, 'talk nonsense')

      Phonologically, the prefix ăn- aligns with initial consonants such as y-, w-, sh-, ch-, j-, suggesting a sandhi-driven convergence. This pattern reflects a broader morphophonemic shift where ăn absorbs and reinterprets elements from Chinese disyllabic compounds.
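
      As a rough check on this claim, the initials can be tallied over the ăn- compounds listed above. The sketch below is illustrative only: PAIRS and initial are hypothetical names, the pinyin is copied from the lists (with tone-mark placement normalized), only the first reading is used where two are given (ăntiền), and pairing ăn- with the first Chinese syllable is a simplifying assumption rather than part of the author's method.

```python
import unicodedata
from collections import Counter

# (Vietnamese compound, first syllable of the Chinese compound cited for it).
# ASSUMPTION: ăn- is matched against the FIRST Chinese syllable in every case.
PAIRS = [
    ("ăntấtniên", "guò"), ("ăntiệc", "yàn"), ("ăncưới", "jiǔ"),
    ("ănmừng", "qìng"), ("ănmặc", "yī"), ("ănuống", "yǐn"),
    ("ănngon", "chī"), ("ănnói", "yán"), ("ănhiếp", "wēi"),
    ("ăntiền", "yíng"), ("ănmày", "yào"), ("ănhàng", "chī"),
    ("ănđòn", "ái"), ("ăncắp", "qiè"), ("ănbám", "bái"),
    ("ănnhậu", "yìng"), ("ănthua", "shū"),
]


def initial(syllable: str) -> str:
    """Return the pinyin initial of one syllable, or 'Ø' for a zero initial."""
    # Decompose and drop tone marks so that 'ǐ' becomes 'i', and so on.
    plain = "".join(
        ch for ch in unicodedata.normalize("NFD", syllable.lower())
        if not unicodedata.combining(ch)
    )
    if plain[:2] in ("zh", "ch", "sh"):
        return plain[:2]
    return "Ø" if plain[0] in "aeiou" else plain[0]


tally = Counter(initial(pinyin) for _, pinyin in PAIRS)
for onset, count in tally.most_common():
    print(f"{onset}-: {count}")
```

      On these seventeen pairs y- accounts for seven cases, ch- and q- for two each, and g-, j-, w-, sh-, b- and the zero initial for one each. Such a count only summarizes the listed examples; it does not by itself establish a sound correspondence.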

      Etymologically, both 吃 chī and VS ăn have their own pathways of development. The character 吃 (chī) has complex phonological roots from 喫:

      • 喫 chī, jī, jí (吃; SV ngật, cật) < MC kjit < OC *kɯd
      • 喫 chǐ < MC kʰek < OC *ŋ̥ʰeːɡ
      • 唵 ǎn (SV àm, ảm) < MC ʔəm < OC *qoːmʔ
      • 奄 yǎn, yān (SV yễm) < MC ʔɜm < OC *ʔramʔ

      Starostin notes that 喫 originally meant 'stammer', later evolving to 'eat', 'drink', 'swallow'. The phonetic stem 乙 yì (SV ất) and the {-t ~ -n} ending suggest a plausible pathway to ăn. Thus, ăn may derive from 唵 ǎn, reinforcing its semantic and phonological legitimacy as a cognate. 

      The author’s hypothesis posits ăn- as a conceptual umbrella encompassing a wide array of Vietnamese words derived from Chinese etyma. While not all derivations may be universally accepted, the majority exhibit compelling phonosemantic and structural parallels. This approach challenges rigid Austroasiatic Mon-Khmer frameworks and invites a reevaluation of Vietnamese etymology through a Sinitic lens, one that embraces sandhi, semantic drift, and morphemic innovation.


      III) Chinese and the Vietnamese basic vocabulary stock

      It comes as no surprise that a substantial portion of Vietnamese basic vocabulary—at the very least—appears to share common linguistic roots with Chinese, as previously enumerated. This convergence stems from prolonged and intimate contact between the two cultures, dating back at least 1,500 years prior to the documented onset of Chinese linguistic influence during the Warring States period (403–221 B.C.), as recorded in Chinese historical annals.

      In fact, a significant number of culturally embedded Vietnamese words are demonstrably cognate with ancient Chinese etyma, including but not limited to:

      • "nạ" 娘 niáng (old Vietnamese "mother") [ Doublet: 奶 nǎi (SV 'nãi') ],
      • "nhà" 家 jiā (home) (SV gia) [ <= M 家 jiā, gū, jie (gia, cô) < MC kaɨ < OC *kra: | jiā | ¶ /ji- ~ nh-/ ],
      • "chén" 盞 zhǎn (bowl) (SV tràn) [ M 盞 zhǎn (tràn, trản) < MC can < OC *tsjre:nʔ ],
      • "mâm" 盤 pán (tray) (SV bàn) [ M 盤 pán < MC bwan < OC *ba:n | cf. 'bàn' 案 àn (SV án) 'table' ],
      • "bát", "tô", "tộ" 砵 bō (large bowl) (SV bát) [ M 砵 (鉢, 缽, 盋) bō < MC bo < OC *po:d, *pa:t | Cant. /but3/, Hẹ /bat7/ | According to Starostin, bowl. A Sanskrit loanword ( < Skr. pa:tra), attested | ¶ /b- ~ t-, -n ~ -t/ ],
      • "đũa" 箸 zhú (chopsticks) (SV trợ, chừ, trừ) [ M 箸 zhù, zhú, zhuó, zhuò (SV trợ, chừ, trừ) < MC ɖɨə̆ < OC *tas, *das | cf. Hainanese: /duə2/],
      • "thìa", "chìa" 匙 chí (spoon) (SV thi, chuỷ) [ M 匙 (㔭) chí, shī (thi, chuỷ) < MC tʂe < OC *dje ],
      • "cằm" 含 hán (chin) [ Also: VS cắn, ngậm, mĩm, hàm (AC 'chin') | M QT 含 hán, hàn < MC ɦəm < OC *ɡɯːm | Cf.  Tibetan PC **kɒ:m (VS cắn 'bite') > Tibetan: agam (VS gặm 'gnaw'), akham (VS cắn) | Note: phonetic stem 今 jīn (SV kim) Cant. gam1 ],
      • "sọ" 首 shǒu (cranium, head) (SV thủ) [ M 首 shǒu (VS đầu) ~ 頭 tóu (SV đầu) | M QT 首 shǒu, shòu (thủ, thú) < MC ɕiu ,ɕuw < OC *hljuʔ, *hljus | According to Starostin, cf. perhaps also Viet. sọ 'cranium, skull'. ],
      • "mắt" 目 mù (eye) (SV mục) [ M 目 mù < MC muwk < OC *mug | cf. Hainanese /mat7/ ],
      • "bếp" 庖 páo (kitchen) (SV bào) [ M 庖 páo < MC baɨw < OC *bruː],
      • "tấmcám" 糝糠 sănkāng (broken rice chaff) (SV tầmkhang) [ M QT 糝 sǎn, sān, shēn < MC tsham, səm < OC *sluːmʔ || QT 糠 kāng < MC kʰaːŋ  < OC *kʰlaːŋ |  According to Starostin, 糝 săn: rice (in grain); to add rice to gruel (L.Zhou). Standard Sino-Viet. is tầm. Protoform: *chya:mH, Meaning: gruel, soup. Chinese: 糝 *ʂjə:mʔ rice gruel with meat. Tibetan: a~z|/am (resp.) soup. Lushei: côm mix (any liquid food) with cooked rice. ],
      • "canh" 羹 gēng (broth, soup) [ M 羹 gēng, láng (canh, hành, lanh) < MC kaɨjŋ < OC *kraŋ | Note: as opposed to modern M 湯 tāng which is also a loanword in Vietnamese 'thang' for both 'noodle soup' 湯粉 tāngfěn 'búnthang' and 'medicinal brew' 藥湯 yàotāng 'thangthuốc' ],
      • "bàn" 案 àn (table) (SV án) [ M QT 案 àn < MC ʔan < OC *qaːns | ¶ /-Øn ~ ban/ | Note: towards the end of Zhou Dynasty: OC *ʔa:ns, cf. 'ấn', 'bấm', 'nhấn' 按 àn (to press) || Ex. 香案 xiāng'àn (VS bànhương(án)) 'altar', 案稱 ànchèng (VS bàncân) 'scale', 案子 ànzi (VS cáibàn) 'table' ],
      • "ghế" 椅 yǐ (chair) (SV ỷ) [ M 椅 yǐ, yī, yì (y, ỷ) < MC ʔe, ʔɯiɛ < OC *ʔaj,*qral, *qralʔ | Note: doublet 几 jǐ (stool) (SV kỷ, kỉ), M 几 jǐ < MC ki < OC *krjəʔ | cf. 几倚 jǐyǐ (VS 'ghếdựa') 'reclining chair' ],
      • "tủ" 櫝 dú (cupboard) (SV độc) [ M 櫝 (匵) dú < MC duk < OC *l'o:g],
      • "guốc" 屐 jī (wooden sandals) (SV kịch) [ M 屐 jī < MC ɡɯiak < OC *ɡreɡ | Dialect: Cant. kek6, Hakka. kiak8 ], etc.

      • including other early Old Chinese and Mandarin compounds, for example,

      • "TânMão" 辛卯 XīnMǎo (Year of the Cat) [ NOT Chinese 'Year of the Hare'; also, 'NămMão' ~ 'NămMèo' 卯年 Mǎonián (SV Mãoniên) 'Year of the Cat' ],
      • "TânHợi" 辛亥 XīnHài (Year of the Boar) [ Also, 'NămHợi' 亥年 Hàinián (SV Hợiniên), 'Year of the Pig' ],
      • "thángchạp" #臘月 làyuè (the 12th month of the year in lunar calendar),
      • "ăntấtniên" 過小年 guòxiǎonián (Feast on the week before the Lunar New Year Eve),
      • "bâygiờ" #者番 zhěfān (right now),
      • "vuquy" 于歸 yúguī (bridal wedding ceremony),
      • "sínhlễ" 聘禮 pìnlǐ (betrothal dowry for the bride),
      • "thànhphố" 城舖 chéngpǔ (city),
      • "chợbúa" 市舖 shìpǔ (~ #phốchợ) (market),
      • "khaigiảng" 開講 kāijiăng (start a school year) [ modern M 'beginning a lecture' ],
      • "thơmộng" 詩夢 shīmèng (romantic),
      • "đườngcái" #街道 jiēdào (road),
      • "đòngang" 渡江 dùjiāng (riverboat),
      • "xinlỗi" 見諒 jiànliàng (apology) [ where @ 見 jiàn ~ 'xin' and @ 諒 liàng ~ 'lỗi'. If that is the case, it could possibly be a cognate of 道歉 dàoqiàn \ @ 道 dào ~ 'lỗi', @ 歉 qiàn ~ 'xin' (apologize) | Note: while modern M # 對不起 duìbùqǐ > VS 'xinbỏqua' (Pardon me!), in ancient Chinese there exist well over 15 ways to 'apologize', say, 謝罪 xièzuì (SV 'tạtội', VS 'xinlỗi'), 請罪 qǐngzuì (VS 'xinlỗi'), etc., that is, any one of them could be a doublet or etymon of 'xinlỗi'. ],
      • "cảlũ" 大伙 dàhuǒ (the whole group) [ M 大 (太) dà, duò, dài, dăi, tài (đại, thái) < MC daj, da < OC *da:d, *da:ds || M 伙 huǒ < MC ɦwa < OC *qʰʷaːlʔ ||  cf. 火 huǒ (VS lửa) ],
      • "đồngloã" 同夥 tónghuǒ (accomplice) [ M 夥 huǒ, huài (khoả, khoã, hoả) < MC ɦwaɨj, ɦwa < OC *qʰloːlʔ, *qʰloːls, *ɡloːlʔ || cf. 火 huǒ (VS lửa) ],
      • "ungthư" 癰疽 yōngjū (cancer) [ ~ VS 'ungthối', 'ươngthối' (spoiled) | M 癰疽 yōngjū (ungthư) | M 癰 yōng < MC ʔoʊŋ < OC *ʔoŋ ],
      • "chồmhỗm" 犬坐 quǎnzuò (squat) [ M 犬 quǎn < MC kʰʷen < OC *kʰʷeːnʔ || 坐 zuò < MC ʑwʌ < OC *dzuaj, *zoːlʔ, *zoːls ],
      and the list goes on, etc.

      Note that listings like those above are so numerous as to seem inexhaustible. While some of these words remain common in contemporary Vietnamese, the corresponding forms in modern Mandarin may already have become obsolete, old-fashioned, or rare, even though they still convey the same denotation; compare, for instance, 活 huó (SV hoạt) vs. 務 wù (SV vụ) for 'việc' (task), and 睡 shuì (SV thuỵ) vs. 臥 wò (SV ngoạ) for 'ngủ' (sleep). That is, some descend from Middle Chinese while others come from vernacular Mandarin.

      The list grows far denser once we include the older literary words that both Chinese and Vietnamese still use today. In this cultural context specifically, Vietnamese appears to have adopted most such words from Chinese rather than inherited them from a common root; in other words, they are Chinese loanwords, for example:

      • "thánggiêng" #正月 zhèngyuè (January) [ ~ #元月 yuányuè ],
      • "TếtĐoanngọ" 端午節 duānwǔjié (Late Spring Festival),
      • "thángchạp" #臘月 làyuè (the 12th month [in the Lunar calendar]),
      • "cúngTáoquân" 祭灶君 jìZàojūn (sacrificial offerings to send the Kitchen God to pay homage to the Emperor of Heaven),
      • "ăntấtniên" 過小年 guòxiǎonián (feast on the week of "cúngTáoquân" starting from the 23rd day of the 12th month of the Lunar Calendar),
      • "Tết" 節 jié (Spring Festival),
      • "xinchào" 見過 jiànguò (hello),
      • "giãbiệt" 辭別 cíbié (good-bye),
      • "kháchsáo" 客套 kètào (polite),
      • "caosang" 高尚 gāoshàng (high class) [ SV 'caothượng' (noble)],
      • "bợmtrợn" 霸道 bàdào (high-handed),

      • and early Mandarin, such as,

      • "dulịch" 遊歷 yóulì (travel),
      • "kháchsạn" 客棧 kèzhàn (hotel),
      • "duhọc" 遊學 yóuxué (study abroad),
      • "lydị" 離異 líyì (divorce),

      • as well as a great number of fundamentally basic words which originated from the same roots.

        Vietnamese | Mandarin | MC (Baxter) | MC (Pulleyblank) | OC (Baxter-Sagart) | OC (Zhengzhang) | Gloss & Notes
        cha, tía | 爹 diē | / | ta:j | tˤra | | Father. Tía matches modern M diē closely; cha reflects older MC reading linked to 知 zhī phonetic series.
        mẹ, mợ | 母 mǔ | mjɨuX | muwX | mjaʔ | mʷəʔ | Mother. Multiple VS variants (mái, cái, mệ, mạ, mợ). Retains OC final glottal stop.
        chị | 姊 zǐ / jiě | tsjɨX | tsɨX | ɕjəjʔ | ʑeʔ | Older sister. Shares root with 姐 jiě; SV tỷ, tỉ.
        em | 妹 mèi | mjejH | mjejH | mˤi[t]-s | mˤi[t]-s | Younger sister. VS em may be contraction of 妹妹 mèimèi.
        anh | 兄 xiōng | xjʉŋ | xjɨwŋ | smraŋ | s.mˤraŋ | Older brother. SV huynh retains aspirated onset lost in VS anh.
        em | 俺 ǎn | ʔəm | ʔəm | ʔamʔ | ʔamʔ | Younger brother/self-reference. Northeastern Mandarin colloquial form.
        lửa | 火 huǒ | xwɑX | xwaX | qʷʰajʔ | qʷʰaʔ | Fire. VS onset shift /hw- ~ l-/.
        lá | 葉 yè | jep | jep | lap | lap | Leaf. Cognate forms in Tibeto-Burman (Tib. ldeb, Burm. ɑhlap).
        đất | 土 tǔ | thuX | thuX | thaʔ | tʰaʔ | Soil, earth. Retains OC final glottal stop.
        cúng | 供 gòng | kuŋ | kuŋ | kˤoŋ | koŋ | Offer to spirits. SV cống in formal register.
        giỗ | 祭 jì | tsjejH | tsjejH | tsˤi[t]-s | tsˤi[t]-s | Ancestor memorial ceremony.
        xơi | 食 shí | ʑik | ʑik | ljək | lək | Eat. SV thực. VS xơi colloquial; cf. Cant. /sjək8/.
        uống | 飲 yǐn | ʔimX | ʔimX | ʔjəmʔ | ʔəmʔ | Drink. SV ẩm.
        bú | 哺 bǔ | buX | buX | paʔ | paʔ | Breastfeed, suck. SV bộ.
        thịt | 膱 zhí | tɕɨk | tɕɨk | tjək | tək | Meat. SV thức. Shares phonetic stem 戠.
        lúa | 來 lái | ljəj | laj | rˤə | | Unhusked rice. Starostin links to 稻 dào in OC.
        chả | 炸 zhà | tsræH | tsræH | tsˤrak-s | tsˤrak-s | Fried meatloaf; in VS, boiled ham (chả lụa).
        gỏi | 膾 kuài | khwajH | khwajH | kʰˤwat-s | kʰˤwat-s | Minced meat salad.
        ruột | 乙 yǐ | ʔit | ʔit | ʔit | ʔit | Intestine (arch.).
        tôm | 鰕 xiā | ɣæ | ɣæ | [ɢ]ˤra | [ɢ]ra | Prawn.
        tép | 蝦 xiā | ha | ha | [g]ˤra | [g]ra | Shrimp.
        gà | 雞 jī | kej | kej | ke | ke | Chicken. SV kê.
        mèo | 貓 māo | maw | maw | mrhaw | mrew | Cat. SV miêu. VS mèo colloquial.
        chuột | 鼠 shǔ | ʂjoX | ʂjoX | hljaʔ | hlaʔ | Mouse, rat. SV thử.
        trâu | 牛 niú | ŋjuw | ŋjuw | ŋʷə | ŋʷə | Water buffalo. SV ngưu.
        diều | 鷂 yào | jewH | jewH | ɢʷˤew-s | ɢʷew-s | Kite. SV ngao.
        chó | 狗 gǒu | kuwX | kuwX | koʔ | koʔ | Dog. SV cẩu.
        cọp | 虎 hǔ | xuX | xuX | qʰˤaʔ | qʰaʔ | Tiger. SV hổ.
        gấu | 熊 xióng | ɣjuŋ | ɣjuŋ | ɢʷʰəm | ɢʷəm | Bear. SV hùng.
        voi | 為 wéi | we | we | waj | waj | Elephant (in VS usage). OC gloss in Shuowen: ‘female monkey’; semantic shift in VS.


        and the list continues on for many other fundamental derivatives with extended meanings,

      • "ruốc" 肉 ròu (meat) [ SV 'nhục' = VS "thịt" 膱 zhí (SV thức) | M 肉 ròu < MC ȵuwk < OC *njuɡ ] as opposed to "ruốc" 蟹 xiè (small long-legged crab), which has its own variants:
        • "ghẹ" 蟹 xiè (small long-legged crab), and variants
        • "riêu (rêu)" 蟹 xiè (baby crab),
        • "cua" 蟹 xiè (crab),
        • "cáy" 蟹 xiè (crab),
      • "bún" 粉 fěn (vermicelli) [ M 粉 fěn, fèn < MC pun < OC *pɯnʔ | cf. 粉條 fěntiáo: (1) búntàu (mung bean vermicelli), (2) phởtiếu (rice noodles), (3) hủtiếu (rice noodles) ], and its variants:
        • "bột" 粉 fěn (flour),
        • "phở" 粉 fěn (noodle),
        • "phấn" 粉 fěn (chalk),
        • "bụi" 粉 fěn (dust),
      as more of them will be enumerated later in this survey.

      Many fundamental items in Vietnamese display simultaneous traces of shared origin and identifiable loanword status in both Vietnamese and Chinese. It is reasonable to postulate that a portion of these terms evolved from the same root, particularly where they are basic vocabulary items unlikely to have been borrowed, or which predate the introduction of loanwords. For a term to be treated as a loanword, its provenance must be clearly traceable to Chinese or “Yue” sources, in either direction. While Austroasiatic theorists have conducted studies attributing certain Vietnamese words to Mon–Khmer origins, this paper treats them as of "Yue" origin, for example: sông 江 jiāng ('river'), dừa 椰 yē ('coconut'), chuối 蕉 jiāo ('banana').

      Beyond these basic words of shared root, there is no doubt that Vietnamese has, since ancient times, acquired numerous Chinese words of a similarly fundamental nature. Historically, this can be interpreted as the result of an active process of adoption: terms were picked up by native wives of Chinese foot soldiers stationed in the Annamese colony and taught by local scholars in village schools, and the process continued long after Vietnam’s independence from China. Carved wooden tablets unearthed in Vietnam in the late 1970s, dating from the Ming occupation of Annam (1407 to 1427), preserve late Sinitic-Vietnamese usages (Nguyễn Tài Cẩn, 1979). These inscriptions provide tangible evidence that Ming-era Chinese lexicons were still being actively adopted.

      It is probable that many of these Chinese words entered Vietnamese through forms of vernacular Mandarin, a recurring phenomenon that persists today. For example, Sino-Vietnamese usage adapted from modern Chinese, especially in the Northern dialect, expanded steadily after the division of Vietnam into North and South in 1954, and continued long after reunification in April 1975. This process solidified a new set of modern Sinitic–Vietnamese and Sino–Vietnamese terms semantically closer to contemporary Chinese, such as SV khẩntrương 緊張 jǐnzhāng ('urgent') used casually in the milder sense of 'quickly', contrasting with the more intense Chinese meaning equivalent to VS căngthẳng ('strained'); SV đảmbảo 擔保 dānbǎo ('guarantee') instead of bảođảm in Southern usage; and other similar loanwords such as SV sựcố 事故 shìgù ('incident'), SV đạocụ 道具 dàojù ('stage prop'), khaisân 開場 kāichǎng ('start a show'), giaođãi 交待 jiāodài ('instruction').

      Beyond new loanwords or deviations from older usage, Chinese linguistic influence on Vietnamese has been a continuous process into the modern era, producing up-to-date colloquial and specialized terms such as:

      • chuồn 滾 gǔn ('get out') [cf. VS cút, partially cognate to cútđi 去去 qùqù]
      • lặn 溜 liù ('slip out')
      • khôngdámđâu 不敢當 bùgǎndāng ('it’s not so')
      • nóichuyện 聊天 liáotiān ('talk')
      • bahoa 大話 dàhuà ('pompous')
      • baphải 廢話 fèihuà ('be all mouth')
      • ẩutả 苟且 gǒuqiě ('inattentively')
      • bạtmạng 拼命 pīnmìng ('recklessly')
      • bênhvực 包庇 bāobì ('take sides')
      • bậnviệc 忙活 mánghuó ('busy')
      • dêxồm 婬蟲 yínchóng ('lecherous')
      • phaocâu 屁股 pìgǔ ('chicken’s butt' as a delicacy)
      • tiếtcanh 血羹 xuěgēng ('concentrated duck blood broth')
      • suỷcảo 水餃 shuǐjiǎo ('dumpling')
      • mìchính 味精 wèijīng (SV vịtinh, 'monosodium glutamate')
      • tầmbậy 三八 sānbā ('nonsense'; derogatory term ridiculing women on International Women’s Day, March 8)
      • biểutình 表情 biǎoqíng ('demonstration') [cf. modern M 遊行 yóuxíng (SV duhành, VS diễuhành)]
      • xơitái 吃生 chīshēng ('eat raw'; figuratively 'to butcher') [cf. 生 shēng; Hai. /te1/ 'đẻ'; Cant. 食生 /shejk8 shang1/]
      • chếtyểu 夭折 yāozhé ('die young')
      • trúnggió 中風 zhòngfēng ('stroke'; folk concept)
      • ôngchủ 主公 zhǔgōng ('master') [cf. 老闆 lǎobǎn]
      • tàixế 司機 sījī ('chauffeur')
      • láixe 駕車 jiàchē ('drive a car') [ Also: 'driver' | cf. chạyxe]
      • ngânquỹ 銀櫃 yínguì ('fund') [in place of 基金 jījīn]
      • băngtần 頻道 píndào ('TV channel')

          These examples illustrate that the Chinese–Vietnamese lexical relationship is not a static historical residue but an ongoing, adaptive process, continually refreshed by contemporary contact and usage.

          The emergence of many words in this list is open to speculation, given the long history of bustling cross-border movement between China and Vietnam via both land routes and sea passages. Until 1949, these borders remained effectively open, with no entry or exit visas required on either side, even under the watch of the French colonial administration. For example, in 1945, following the end of World War II, large numbers of Chiang Kai-shek’s Kuomintang troops entered northern Vietnam to disarm Japanese forces that had surrendered but remained in place after the withdrawal of the Chinese army.

          This sustained freedom of movement undoubtedly injected vernacular forms from southern Chinese dialects, particularly those of Yunnan, Guangxi, and Guangdong, into the Vietnamese vocabulary. Phonetically, many of these borrowings resemble twisted or imitated forms of "accented Mandarin", or at best, a kind of pidginization by the general populace, reflecting deviations in both modern Vietnamese pronunciation and orthography. For instance, 國 guó appears in Sino-Vietnamese as quốc /wəwk˧˥/, but also as quấc /kwəwk˧˥/, while the native Vietnamese is nước /nɨək˧˥/ (‘country’), and Quanhoả imitates the sound of 官話 Guānhuà (SV Quanthoại). Even today, close observation of daily life along the border regions of both countries offers ample evidence of such linguistic interplay.

          As for words in the basic stratum that seem cognate in both Mon-Khmer and Chinese, the commonality noted here is provisional, pending more extensive research into the genetic linguistic relationship between Chinese and Vietnamese. The present study’s aim is to establish a lexically meaningful connection between the two languages by exploring and elaborating on the significance of the many Vietnamese words, across all linguistic categories, that appear to have Chinese roots, some traceable to centuries before the first infantrymen of the Han Empire set foot on NamViệt 南越 NánYuè soil.  


          Figure 7 - King Triệu Đà's Mausoleum



          View from the rear entrance of King Zhao Tuo's Mausoleum in Guangzhou City, Guangdong Province (Source: photo by dchph - 4/2015)

          Mausoleum of the kings of the ancient Kingdom of NamViệt 南越王國 NánYuè Wángguó, whose capital, Phiênngung 番禺 Pānyú, was located in Guangzhou 廣州, now the provincial capital of China’s Guangdong Province. At its height, the kingdom also encompassed part of what is now northeastern Vietnam, homeland of the aboriginal ancestors of the ancient LuoYue (LạcViệt) people, or pre-Annamese, who had migrated from the regions surrounding Dongtinghu 洞庭湖 (Độngđìnhhồ) in present-day Hunan Province, China. (H)

          In this study, questions of linguistic affiliation emerge naturally from the integration of historical overview and linguistic evidence, with particular attention to the layering of lexical strata. The Sinitic-Vietnamese investigation is framed as an effort to trace lines of kinship directly from Yue to both Vietnamese and Chinese, in reciprocal directions. This framework takes into account the historical amalgamation of Yue populations from the ancient states of southern China with Han colonists in ancient Annam.

          The extensive body of Sinitic-Vietnamese vocabulary, together with a subset of Mon-Khmer basic cognates whose ultimate derivation from ancient Yue languages remains unresolved, highlights the need to examine Vietnamese in its holistic form. From a comparative standpoint, the structural and lexical characteristics of Vietnamese align more closely with Chinese than with any language within the Mon–Khmer subfamily. In fact, the degree of affinity between Vietnamese and Chinese surpasses that observed between many Chinese dialects and other branches of the Sino–Tibetan family, including Tibetan. This perspective positions Vietnamese not merely as a recipient of Chinese influence, but as a language whose historical development reflects deep and sustained contact with both Yue and Han linguistic traditions.

          IV) A new dissyllabic sound change approach to be explored

          Contrary to a long‑standing misconception in certain linguistic circles, neither Chinese nor Vietnamese is inherently monosyllabic. This belief has persisted largely because novices tend to accept and repeat what they have been told without examining the evidence. A simple survey of modern Vietnamese vocabulary, which closely parallels that of Chinese, reveals that both languages are dominated by disyllabic words, that is, lexical items composed of two syllables. Strikingly, the majority of these are of Chinese origin. Such forms are variously referred to as lexical disyllabicity, disyllabicism, or simply disyllabics. In the following section, we will examine disyllabic words in detail, tracing their linguistic changes and the processes by which they have evolved.

          The term dissyllabic is more commonly spelled with a single “s”; where this research deliberately writes it with “ss”, the aim is to underscore the central importance of disyllabicity in both Vietnamese and Chinese. Recognizing this feature is a prerequisite for any serious study of the two languages.

          Metaphorically, the two languages may be seen as growing from a vast, ancient linguistic tree with a monosyllabic stem, its roots sunk deep into fertile soil enriched by thick Sinitic layers atop an indigenous substratum. Over time, its branches have become heavy with dissyllabic leaves and dotted with polysyllabic fruits, each with distinct textures, forms, and appearances.

          Understanding this natural evolutionary path is an unprecedented insight, one that can guide researchers in identifying further etyma and tracing their origins with greater precision.


          Many prominent Sinologists of the 20th century — including Maspero (1912), Karlgren (1915), Haudricourt (1954), Wang Li (1956), Chang (1974), Denlinger (1979), and Vietnamese scholars such as Lê (1967), Nguyễn (1979), and Ðào (1983) — made extensive use of Chinese data to illuminate the etymology of Sinitic‑Vietnamese words of Chinese origin over the past two millennia. While they recognized an affinity, whether genetic or not, between Chinese and Vietnamese, their analyses were overwhelmingly confined to monosyllabic forms. As a result, many Sinitic‑Vietnamese etyma escaped notice. In reconstructing Middle Chinese (MC) phonology, they relied on Sino‑Vietnamese readings but often failed to identify cognates of the same root embedded in Sinitic‑Vietnamese vocabulary.

          Consider 東 as an illustrative case. Pulleyblank (1984) reconstructed its Middle Chinese (MC) value as /*təwŋ/, corresponding to modern Mandarin /dong1/, within his Early Mandarin framework. This aligns with the Sino‑Vietnamese đông /doŋʷ1/ [ɗəwŋ], a form belonging to a rare division class marked by a closed, rounded lip final /‑owŋʷ/. Pulleyblank, together with Li Fang‑Kuei (1971), was among the few scholars to recognize this distinctive Old Chinese articulation.

          From such a reconstruction, it is possible to posit that words ending in /‑owŋʷ/, or even /‑owŋm/, could evolve into /‑ow/. Following the sound‑change pattern of clipping, one finds parallels in forms such as đau ('painful') from 痛 tòng, thau ('bronze') from 銅 tóng, and đỏ ('red') from 彤 tóng. The reverse pattern may also be observed, as in đường ('road') from 道 dào (SV đạo), which can be postulated with a /daw/ final. The author doubts that most renowned linguists have recognized this type of clipping interchange between /‑owŋʷ/ and /‑aw/.

          In the realm of Old Chinese reconstruction, eminent Sinologists such as Bernhard Karlgren and Henri Maspero devoted much of their attention to tracing Chinese loanwords in Thai, Khmer, Japanese, and Korean. In doing so, they overlooked a crucial fact: during the millennium of imperial Chinese rule, the inhabitants of ancient Annam regularly articulated forms of Mandarin in everyday speech. At the official level, local administrators addressed the China‑appointed viceroy in Mandarin; in domestic and social contexts, native wives conversed with their Chinese husbands and children in a mixed Chinese–Annamese vernacular, which also served as a lingua franca among Annamese themselves in daily colonial life.

          Phonologically, when Pulleyblank and Wang drew upon Sinitic‑Vietnamese and Sino‑Vietnamese material in their reconstructions of Old Chinese, they may have recognized that such Chinese elements were embedded in early Annamese, some of which later crystallized into the Sinitic‑Vietnamese stratum. Yet their engagement with Vietnamese was largely confined to the Sino‑Vietnamese (Hán‑Việt) stock, and their proficiency in the language was limited, evidenced by frequent misspellings in cited examples. This narrow scope became a methodological constraint, perpetuating the same one‑to‑one correspondence issues that have long characterized earlier Sino‑Tibetan theorization on Vietnamese.

          Because they failed to appreciate the close phonemic proximity between Chinese and Vietnamese, these scholars were unable to detect plausible sound‑change variations beyond monosyllabic stems. Consequently, none identified the expanded potential of monosyllabic roots when they appear in disyllabic formations, a phenomenon central to the present study.


          Table 9 - Monosyllabicity

          "Monosyllabicity" (tínhđơnâmtiết 單音節性) refers to the predominance of single‑syllable words in a language’s vocabulary. In earlier periods, some Western linguists even equated this feature with linguistic "primitivity," likening it to the speech of so‑called "savage" tribal groups in remote Amazonian jungles. The author is unaware of any truly monosyllabic language existing on earth.

          In the case of Vietnamese, those who have labeled it "monosyllabic" reveal a fundamental misunderstanding. They have never undertaken the basic numerical exercise of calculating how many possible consonant‑vowel combinations a language restricted to monosyllables could produce, given standard syllable structures — VC, V, CV, and CVC — combined with the eight tones, and all possible phonemic permutations (for example, tac, tap, tat, and so on).
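The numerical exercise described above can be sketched in a few lines. This is only an illustration: the four syllable shapes and the eight tones come from the text, while the inventory sizes used in the example (20 initials, 12 vowels, 8 finals) and the function name are placeholder assumptions, not measured values for Vietnamese.

```python
# Rough upper bound on distinct monosyllables for the shapes V, VC, CV, CVC.
# The inventory sizes passed in are illustrative assumptions, not real data.

def max_monosyllables(n_initials: int, n_vowels: int, n_finals: int,
                      n_tones: int = 8) -> int:
    """Count every initial/vowel/final/tone combination over the four shapes."""
    v = n_vowels                             # V
    vc = n_vowels * n_finals                 # VC
    cv = n_initials * n_vowels               # CV
    cvc = n_initials * n_vowels * n_finals   # CVC
    return (v + vc + cv + cvc) * n_tones


if __name__ == "__main__":
    # Hypothetical inventory: 20 initials, 12 vowels, 8 finals, 8 tones.
    print(max_monosyllables(20, 12, 8))
```

With these assumed sizes the ceiling is 18,144 candidate syllables, and that figure overcounts, since it ignores every phonotactic restriction on which combinations actually occur.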

          In the author’s most recent count, the Sinitic‑Vietnamese lexical collection in the Han-Nom Etymology dictionary contains nearly 80,000 entries. Many of these monosyllabic forms occur with frequency rates ranging from 25% to well above 100%, and require replacement with polysyllabic forms (tínhđaâmtiết 多音節性) to resolve semantic ambiguity. For example:

          • manh 單 dān ("single") → áomanh 單衣 dānyī ("sweater")

          • manh 氓 máng ("folk") → lưumanh 流氓 liúmáng ("hooligan")

          Such polysyllabic expansion is not merely stylistic; it is a functional necessity for clarity in communication. 


          For newcomers to the field, it is worth beginning your exploration of Vietnamese etymology within the realm of polysyllabicity — that is, the vocabulary stock consisting of words with two or more morphemic syllables. Examples include bảvai ('shoulder'), bângkhuâng ('melancholy'), sựcnhớ ('suddenly remember'), lộnxàngầu ('chaotic'), tóctaibùxù ('uncombed hair'), mặtmàybíxị ('unhappy face'), and many others. You will quickly see that two‑syllable words dominate the vocabulary stock, a pattern identical to the lexical status of Chinese.

          Following this path leads naturally into historical linguistics, revealing far more about Vietnamese than the oft‑repeated misconception, still found among some native specialists, that it is a monosyllabic language.

          Vietnamese is, in fact, far more than a disyllabic language in the lexical sense; it also has the capacity to coin words by attaching affix‑like syllables, a topic to be discussed later in relation to phonology, semantics, and syntax. This section introduces the dissyllabicity approach and explains its advantages over older, self‑constrained methods that focus solely on monosyllabic words and one‑to‑one correspondences as base units of investigation. Once the linguistic community recognizes the superiority of the dissyllabicity approach, it can serve as a framework for newcomers to identify more Sinitic‑Vietnamese etyma, whose syllabic components are mostly of Chinese origin. Research in either Vietnamese or Chinese historical linguistics cannot be complete without reference to the other.

          As noted earlier, the two languages are closely related and intertwined not only in Yue and Sino‑Tibetan etymologies but also in historical phonology. Chinese loanwords make up a large portion of Vietnamese vocabulary and have generated both focused and extended derivatives as the language has developed into full polysyllabicity. Examples include xứsở ('birthplace'), hợppháp ('legal'), tửtế ('kindness'), and xàcừ ('tridacna').

          The point is clear: Vietnamese is not 'monosyllabic'; over three millennia it has evolved into a sophisticated language on par with any modern tongue. In terms of polysyllabicity it stands alongside English, French, and others, though it is still, wrongly, written in a monosyllabic orthography. Reform toward a polysyllabic writing system, comparable to the way pinyin joins syllables when transcribing Chinese character blocks or to the way Korean script groups letters into blocks, would place Vietnamese on truly equal footing with these languages. Such a shift would free specialists from outdated perceptions and methodologies, opening the Sinitic‑Vietnamese domain as a new frontier in historical linguistics.

          Modern Vietnamese clearly shows its dissyllabic nature, with numerous high‑frequency disyllabic words formed either from two word‑syllables or from morphemic‑syllables — the former being independent monosyllabic words used as syllables, the latter bound morphemes that cannot function alone.

          Most disyllabic words, often direct Chinese loanwords, entered Vietnamese intact and later evolved within the language. Chinese syllables have also been used to coin new disyllabic forms in much the same way as in modern Chinese, resulting in many Vietnamese and Chinese terms being near‑mirror counterparts. These compounds may be formed from individual morphosyllables that are synonymous, parallel, opposite in meaning, or simply assembled from existing lexical material.

          In the following examples, we see how tức|giận ('mad/angry'), thương|yêu ('affection/love'), and trước|tiên ('firstly/initially') illustrate parallel or symmetric formation; how chiềucao ('height') and caothấp ('ranks') show antonymous pairing; and how locally innovated forms such as chàohỏi ('greeting, hello') or xinlỗi ('apologize') demonstrate semantic extension beyond their original Chinese meanings.

          a. Parallel or symmetric compounds

            • tức|giận 氣憤 qìfèn ('mad/angry')
            • thương|yêu 疼愛 téng'ài ('affection/love')
            • trước|tiên 首先 shǒuxiān ('firstly/initially')
            • kề|cận 切近 qièjìn ('by/near')
            • đường|cái 街道 jiēdào ('road/street')
            • đường|lộ 道路 dàolù ('road')

          b. Opposite or antonymous compounds

          For compounds formed from opposite, or antonymous, word‑syllables, as noted earlier, examples include 高低 gāodī ('high/low'), which in Vietnamese corresponds to chiềucao ('height') with the same connotation as độcao 高度 gāodù ('the height'). At the same time, 高低 gāodī is also associated with the existing form caothấp ('ranks') to denote hierarchical levels in a competition. Similarly, 大小 dàxiăo ('large/small') becomes kíchthước ('size'), associated with 尺寸 chǐcùn (SV xíchthốn), while the original 大小 dàxiăo has evolved further in Vietnamese as tonhỏ ('whisper'). In these cases, the resulting forms are modified disyllabic words shaped by local innovation.

          c. Locally innovated or semantically extended forms

          Similarly, many other words reuse existing vocabulary to develop new, locally modified meanings, presenting themselves as extended and renewed lexical items whose senses exist only in Vietnamese through phonological association. This can be seen in cases such as @ 招 zhāo ~ chào ('greet') 早 zăo, @ 呼 hu ~ hỏi ('ask') 問 wèn, @ 見 jiàn ~ xin ('request') 請 qǐng, and others, exemplified below:

            • chàohỏi 打招呼 dăzhāohu ('greeting, hello') [@ 招 zhāo ~ chào 早 zăo; @ 呼 hu ~ hỏi 問 wèn]
            • xinchào 見過 jiànguò ('greeting, hello') [@ 見 jiàn ~ xin 請 qǐng; @ 過 guò ~ chào 早 zăo; archaic usage]
            • xinlỗi 見諒 jiànliàng ('apologize') [@ 見 jiàn ~ xin 請 qǐng; @ 諒 liàng ~ lỗi; archaic usage; cf. modern M 道歉 dàoqiàn; also cognate to VS xinlỗi via @ 歉 qiàn ~ xin 請 qǐng and @ 道 dào ~ lỗi 罪 zuì (SV tội) → VS lỗi ('wrongdoing')]
            • thươnghại 傷害 shānghài ('sympathize') [opposed to 'injure'; cf. modern M 同情 tóngqíng or SV đồngtình ('sympathize'); alternatively VS thươngtình ('pity')]
            • tửtế 仔細 zǐxì ('kindness') [opposed to VS tỉmỉ ('meticulous') in Mandarin; cf. modern M 細心 xìxīn]
            • lịchsự 歷事 lìshì ('polite') [opposed to VS bặtthiệp ← lịchthiệp ← 渉歷 shèlì SV thiệplịch ('polite')]
            • chùxị 主事 zhǔshì ('host') [opposed to cognate 主席 zhǔxí SV chủtịch ('chairman')]
            • đànghoàng 堂皇 tánghuáng ('solemnly') [opposed to 'stately']
            • sựcnhớ 想起 xiăngqǐ ('suddenly remember') [@ 想 xiăng ~ sực ('chợt' 突 tù); @ 起 qǐ ~ nhớ 記 jì (ký)]
            • nặngnhẹ 輕重 qīngzhòng ('criticize') [opposed to 'weight']
            • khốnnạn 困難 kùnnán ('wretched') [opposed to 'difficulty' and '混蛋 húndàn' in modern Mandarin]
            • bànội 內婆 nèipó ('paternal grandmother') [opposed to modern M 奶奶 năinai ('grandmother'); cf. 內公 nèigōng ('grandfather') in parallel with 外婆 wàipó bàngoại ('maternal grandmother') and 外公 wàigōng ôngngoại ('maternal grandfather')]
            • anhem 兄妹 xiōngmèi ('siblings', literally 'older brother and younger sister') [cf. em ← emgái 妹妹 mèimei ('younger sister', by metathesis and contraction) and 俺 ăn (VS 'em', 'younger brother'), a first‑person self‑addressing pronoun in Northern Mandarin (Shanxi, Shandong, Liaoning, etc.)]
            • cậunhỏ 小舅 xiăojiù ('little boy') [opposed to original meaning: wife's younger brother addressed by her husband]
            • chúnhỏ 小叔 xiăoshū ('little boy') [opposed to original meaning: husband's younger brother addressed by his older brother's wife]
            • cônhỏ 小姑 xiăogū ('little girl') [opposed to original meaning: husband's younger sister addressed by his older brother's wife]

              d. Other categories

              • khoảngđường 途徑 tújìng (route),
              • cáibàn 案子 ànzi (desk),
              • cáighế 椅子 yǐzi (chair),
              • cámực 墨魚 mòyú (cuttlefish; also used to cover modern M 魷魚 yóuyú 'squid'),

                etc.

            Why do these details on disyllabic words matter in the study of Vietnamese etymology? They demonstrate that such modified forms originate from changes in semantic, phonological, or lexical aspects — including localized words built from the same Chinese material in disyllabic formation — which provide greater semantic precision than single monosyllabic words.

            From a semantic perspective, close examination of the previously cited examples reveals recurring sound‑change patterns that underpin the etyma of derived Vietnamese words with extended meanings. These often represent selective alternations among Chinese disyllabic equivalents.

            To expand your Sinitic‑Vietnamese corpus, refresh your memory and consider the following additional examples:

            1. Semantic patterns

            Close examination of the examples above reveals consistent sound‑change patterns underlying the etyma of Vietnamese disyllabic words with extended meanings. Many are selective alternations of Chinese disyllabic equivalents, adapted semantically to local usage.

            Examples:

            • phànnàn: 抱怨 bàoyuàn (SV báooán, 'complain') [→ thanphiền ('complain') ← thanvan ('lament') ← 'than' 嘆 tàn + 'phiền' 煩 fán]
            • dànhriêng: 限於 xiànyú (SV hạnvu, 'purposely reserved for') [dànhcho ('reserved for') ← 'be limited to'; extended meaning synonymous with SV giớihạn (界限 jièxiàn, 'set the boundary')]
            • rànhmạch: 明白 míngbai (SV minhbạch, 'unequivocal') [sángtỏ ('understand, bright') ← M 明白 míngbai; M 明 míng < MC maiŋ < OC *mraŋ || See VS 'biết' ('know') <~ (clipping of 明白 míngbai), cf. Hainanese, Amoy /bat7/]
            • ănhàng: 吃貨 chīhuò SV ngậthoá ('eat gluttonously') [also 'like to eat junk food'; extended meaning 'run contraband']
            • ănchơi: 應酬 yìngchóu ('drinking and eating') [ănnhậu ('be invited to dinner') ← 'engage in social activities']

            2. Phonological variations

            Sound‑change articulation often produces multiple Vietnamese variants from a single Chinese source, sometimes with altered tone or final consonant.

            Examples:
            • riêngtư: 隱私 yǐnsī SV ẩntư ('private') [variants: riêngtây, tưriêng]
            • sànhvề: 善於 shànyú SV thiệnvu ('be good at') [variants: rànhvề, rànhrẽ, sànhsỏi, sỏivề, hayvề... given /sh‑ ~ s‑, r‑, l‑/; thiện > hiền > hay]
            • dùrằng: 雖然 suīrán SV tuynhiên ('although, even if, however') [variants: chodù, dùsao, mặcdù, tuyvậy, dẫurằng, dùlà, mặcdầu... given /s‑ ~ j-(d-)/, /r‑ ~ l‑, m/]

            3. Lexical associations

            Many disyllabic compounds are built from two monosyllabic words that are themselves variants of Chinese morphosyllables, linked by synonymy, association, or metathesis.

            Examples:

            • tức|giận ('angry'): Vietnamese variation of 生氣 shēngqì; tức ← 氣 qì, giận ← 恨 hèn; reversed word order compared to Chinese.
            • trước|tiên ('firstly'): Cognate to 首先 shǒuxiān; trước ← 前 qián, tiên ← 先 xiān; association with đầu 首.
            • cũ|kỹ ('old'): Reduplication; cũ ← 舊 jiù, kỹ as variant; parallels 陳舊 chénjiù.
            • kề|cận ('nearby'): Differs from 切近 qièjìn; aligns with gầnkề, gầngũi; metathesis in local speech patterns.

            4. Historical Development

              Aside from older inherited forms like khủnglong 恐龍 ('dinosaur') or yểuđiệu 窈窕 ('graceful'), much of Vietnamese lexical dissyllabicity is a later development. It functions as a mechanism to reduce monosyllabic homonymy after tonal differentiation, a process distinctive to Chinese and Vietnamese alike.

              5. Shared mechanisms

              Chinese and Vietnamese share internal characteristics in disyllabic formation: pairing homonymous morphosyllables with tonal variation to create precise meanings.

              Example:

              • hiếuthảo ('filial'): Cognate to 孝順 xiàoshùn (SV hiếuthuận); thảo originates from thuận /tʰwʌn6/, denasalized to /‑ảo/; meaning shifted to 'generous' in contexts like thảoăn ('share food generously').
              • 順 (shùn) acts as a free‑floating affix, as in xuôigió 順風 ('tail wind'), suônsẻ 順利 ('smoothly').

              Phonologically, for now, newcomers to this field should begin by accepting at face value the many regular interchanges between Chinese and Vietnamese, in both directions. These shifts often occur in clusters across syllables rather than isolated phoneme changes:

                  • /‑eng → ‑e/
                  • /‑ang → ‑ac/
                  • /‑ong → ‑aw/
                  • /‑k → ‑ng/
                  • /n‑ → đ‑/
                  • /‑n → ‑i, ‑t/
                  • /‑wan → ‑oi/
                  • /‑u → ‑ang/

               Comparable sound‑change patterns follow logical linguistic rules: phonemic shifts occur within the realm of neighboring sounds sharing similar articulatory attributes.
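The interchanges listed above can be written down as a small rule table and applied mechanically. The sketch below is a toy string rewriter, not a phonological model: it encodes only the eight patterns listed, treats them as plain suffix or prefix substitutions on a toneless romanized syllable, and will over‑generate. The names `FINAL_RULES`, `INITIAL_RULES`, and `candidates` are hypothetical, introduced here for illustration.

```python
# Toy encoding of the listed interchanges as (source, [targets]) pairs.
FINAL_RULES = [
    ("eng", ["e"]),
    ("ang", ["ac"]),
    ("ong", ["aw"]),
    ("k", ["ng"]),
    ("n", ["i", "t"]),
    ("wan", ["oi"]),
    ("u", ["ang"]),
]
INITIAL_RULES = [
    ("n", ["đ"]),
]


def candidates(syllable: str) -> set[str]:
    """Return every form reachable by applying exactly one listed rule."""
    out: set[str] = set()
    for src, targets in FINAL_RULES:
        if syllable.endswith(src):
            stem = syllable[: -len(src)]
            out.update(stem + t for t in targets)
    for src, targets in INITIAL_RULES:
        if syllable.startswith(src):
            rest = syllable[len(src):]
            out.update(t + rest for t in targets)
    return out


if __name__ == "__main__":
    for s in ("tong", "swan", "niu"):
        print(s, sorted(candidates(s)))
```

Each candidate is only a hypothesis to be checked against attested Vietnamese forms and meanings; the rule table says nothing about which outcome a given word actually took.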

              Table 10 - Common sound‑change patterns from Chinese to Vietnamese in disyllabicity

              Chinese | Mandarin | Vietnamese reflex | English | Sound‑change patterns
              生 | shēng | đẻ | 'give birth' | /sh‑ ~ đ‑/; cf. Hainanese /te1/
              忙 | máng | mắc, bận | 'busy' | /m‑ ~ b‑/; /‑ang ~ ‑ak/
              痛 | tòng | đau | 'pain' | /t‑ ~ đ‑/; /‑ong ~ ‑aw/
              尿 | niào | đái, tiểu | 'urinate' | /n‑ ~ đ‑, t‑/
              蒜 | suàn | tỏi | 'garlic' | /s‑ ~ t‑/; /‑uan ~ ‑oi/
              前 | qián | trước | 'before' | /q‑ ~ tr‑/; /‑ian ~ ‑uok/; cf. Hai. /tai2/
              幕 | mù (SV mạc) | màn | 'curtain' | MC mak → VN /‑an/
              高低 | gāodī | chiềucao, caothấp | 'height', 'ranks' | Semantic extension; antonym pairing
              大小 | dàxiăo | kíchthước, tonhỏ | 'size', 'whisper' | Semantic shift; antonym pairing
              無聊 | wúliáo | côliêu (SV vôliễu) | 'in extreme depression' | /w‑ ~ c‑/; semantic narrowing
              緣分 | yuánfèn | duyênnợ (SV duyênphận) | 'fate, lot' | /‑en ~ ‑iên/; semantic re‑association

              Historically, Chinese became increasingly disyllabic, likely stabilizing during the Tang Dynasty. Many Middle Chinese disyllabic words entered Vietnamese in batches, with all related sound clusters shifting within the paired syllables as a complete unit, not simply vowel‑to‑vowel or initial‑to‑initial changes, nor rigid one‑to‑one syllable correspondences. Some disyllabic loanwords also appear in reverse syntactic order compared to modern Mandarin, reflecting earlier Middle Chinese usage. Examples: 

                • bảođảm ~ 擔保 dānbăo ('guarantee'),
                • liênquan ~ 聯關 liánguān ('related to'),
                • thithố ~ 措施 cuòshī ('show'),
                • vinhquang ~ 光榮 guāngróng ('glorious'),
                • trángkiện ~ 健壯 jiànzhuàng ('strong').
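The reversed‑order relationship in these examples can be checked mechanically. The sketch below assumes a syllable‑by‑character alignment (for instance 擔 ~ đảm, 保 ~ bảo) that is inferred from the forms listed rather than stated in the text; liênquan ~ 聯關 is left out because, as cited, its syllables are not reversed. The names `CHINESE_ORDER` and `is_reversed` are hypothetical.

```python
# Vietnamese compound -> its two syllables arranged in the order of the
# cited Chinese compound. The alignment is an assumption made for
# illustration, inferred from the listed forms.
CHINESE_ORDER = {
    "bảođảm": ("đảm", "bảo"),         # 擔保
    "thithố": ("thố", "thi"),         # 措施
    "vinhquang": ("quang", "vinh"),   # 光榮
    "trángkiện": ("kiện", "tráng"),   # 健壯
}


def is_reversed(vietnamese: str, chinese_order: tuple[str, str]) -> bool:
    """True if the Vietnamese form is the Chinese-order syllables swapped."""
    first, second = chinese_order
    return vietnamese == second + first


if __name__ == "__main__":
    for word, order in CHINESE_ORDER.items():
        print(word, is_reversed(word, order))
```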

              For researchers, attention to dissyllabicity is essential: sound‑change patterns in paired syllables are a key process in tracing Chinese roots of many Sinitic‑Vietnamese etyma. Both Vietnamese and Chinese are, in structural terms, disyllabic languages. Chinese is already classified as polysyllabic by major linguistic institutions worldwide (Chou 1982, p.106), and Vietnamese can be formally classed as disyllabic based on its word‑formation characteristics and shared commonalities with Chinese.

              Only within this framework can a reliable system of sound‑change patterns be established. Without such recognition, one might overlook correspondences such as 

                • 無聊 wúliáo ('in extreme depression') → VS côliêu (SV vôliễu)
                • 緣分 yuánfèn ('fate, lot by which couples are brought together') → VS duyênnợ (SV duyênphận).

              Recognizing that each Chinese word‑syllable in a pair may shift to a different sound in Vietnamese has led to the formulation of this dissyllabicity approach, enabling the identification of over 20,000 Vietnamese etyma cognate with Chinese forms, from ancient to modern dialects, literary and vernacular, many long regarded by purists as indigenous Nôm or "pure" Vietnamese words. Examples include:

              • cá 魚 yú ('fish') [cf. 魚汁 yúzhi ('catsup', 'anchovy sauce') from Amoy dialect; OC nga]
              • chim 禽 qín ('bird'), chóc 雀 què ('bird') [→ chimchóc ('birds')]
              • dưa 瓜 guā [→ dưahấu 奎瓜 kuìguā ('watermelon')]
              • chả 炸 zhà ('ham') [→ chảgiò 炸肉 zhàròu ('fried spring roll')]
              • lụa 肉 ròu ('meat') [→ chảlụa 炸肉 zhàròu ('boiled meat loaf'); also 肉 ròu ~ 縷 lǚ ('silk')]
              • giò 肉 ròu ('spring roll') [cf. chảgiò above]
              • rọi 肉 ròu ('meat') [→ barọi 肥肉 féiròu ('bacon')]
              • ruốc 肉 ròu ('meat') [Northern Vietnamese]
              • dồi 肉 ròu ('sausage') [Northern Vietnamese]
              • mặn 咸 xián ('salty') [→ mắm 鹹 xián; mắmcá ← 咸魚 xiányú ('fermented anchovy')]

              Because many lexical compounds derive their meanings from the pairing of syllables, their true form should be written in polysyllabic orthography, as implemented in this paper. Chinese disyllabic words retain their paired‑syllable attributes when transformed into Vietnamese, often with significant semantic and phonological shifts. For example:

                • 氣 qì shifted from hơi (Cant. /hei1/, 'air, steam') as in 汽車 qìchē ('automobile', VS xehơi) to kiệt in keokiệt 小氣 xiăoqì ('stingy'), while 小 xiăo became keo, cognate to 摳 (kōu, VS kẹo, 'stingy').
                • In 客氣 kèqì (SV kháchkhí, 'polite'), 氣 qì appears as sáo or khứa in kháchsáo, which evolved into kháchkhứa.
                • In 生氣 shēngqì ('angry'), 氣 qì is tức ('angry'), while 生 shēng means sống ('live') or đẻ ('give birth'), the latter implying 'becoming angry'.

              The magnitude of these sound changes is far‑reaching and multi‑layered. In analysis, disyllabic words are treated as single units, with syllabic portions — macro-syllabic changes — capable of altering their vocalic shells in ways quite different from their monosyllabic counterparts — micro-phonetic changes.

              A single monosyllabic word can, in fact, have more than one pronunciation. The phonological constraints governing an independent monosyllable do not necessarily limit the range of sound changes that may affect it when embedded in a disyllabic formation, especially across languages, where internal forces such as speech habits or localization take over. Examples include: ‑子 zi (cái, con, cây, trái), ‑兒 ‑r (nhi, nhí, nhỏ), ‑者 zhe (kẻ, giả, gia, nhà). In other words, deviations in sound change can occur across the entire string of sounds in a disyllabic unit, producing results quite different from the stand‑alone monosyllabic form. This is a case of one syllable yielding many outcomes.

              If Vietnamese is still regarded as a monosyllabic language, the underlying dynamics of sound change in the Sinitic‑Vietnamese dissyllabicity approach will never be fully appreciated. Once the rule of sound change is accepted, as illustrated in earlier examples, questions such as why /‑ư/ corresponds to /‑a/, /‑iê/ to /‑a/, /‑au/ ~ /‑ông/, /‑ong/ ~ /‑au/, /‑at/ ~ /‑an/, /‑an/ ~ /‑ôt/, /‑ai/ ~ /‑ua/ will no longer arise. Nor will there be insistence on rigid one‑to‑one correspondences such as /‑ia‑/ → /‑ươ‑/, /‑ng/ → /‑ng/, or /d‑/ → /n‑/. In reality, combinations of phonological changes — affecting initials, medials, and finals separately — can produce entirely new sounds in the target language, e.g., 

                • MC 學 /ɦaɨwŋk/ 'study' (SV học) → M xué;
                • MC 一 /ʔjit/ 'one' (SV nhất) → M yī /ji1/.

              When certain Chinese loanwords entered Vietnamese, sound changes may already have occurred within Chinese itself, or they may have taken place later in Vietnamese. In either case, apart from synchronically irregular items, these changes operated within linguistic constraints, often influenced by cultural factors. Local speech habits, for example, yield 手板 shǒubăn → bàntay 'palm' instead of taybàn, or 母 mǔ 'mother' → mẹ /mɛ6/, which further evolved into mợ /mə6/ 'maternal uncle’s wife', likely through contraction by dropping 舅 jiù 'maternal uncle' while retaining Middle Chinese features. In Chinese, 舅母 jiùmǔ remains the disyllabic form, avoiding homonymy.

              Cultural factors have facilitated selective borrowing and triggered sound changes. Even at the time of borrowing, many words followed established phonological patterns, continuing to evolve over time due to locality, social status, education, and historical context. Examples:

                • 他 tā (SV tha, 'he, him') → nẫu, họ
                • 我 wǒ (SV ngã, 'I, me') → tôi, tao, tui, tớ, qua
                • 咱 zán (VS ta, 'I, we, us'); 咱們 zánmen (VS chúngmình, 'we' inclusive)
                • 我們 wǒmen (VS tụimình, 'we')

              Borrowing from neighboring Mon‑Khmer languages is far less common, showing Vietnamese reluctance to adopt such vocabulary. Even in multi‑ethnic highland and southern provinces, indigenous placenames (Đắklắk, Kontum, Đàlạt, Pleiku, Sóctrăng, Càmau) have been “Vietnamized” with tonal accents, but the spoken language remains unaffected.

              By contrast, Vietnamese readily imports Chinese words, often adding parallel forms of the same root with similar meanings. For example, 

                • 粉條 fěntiáo (VS phởtiếu, hủtiếu, as in hủtiếu Namvang, 'Phnom Penh‑style seafood noodles') uses fěntiáo for hủtiếu, while "Namvang" is a transliteration of the name of Cambodia’s capital.
                • 麵條 miàntiáo 'wheat noodles' shifted to sợimiến 'mungbean vermicelli' as well as sợimì and mìsợi.
                • 粉條 fěntiáo also yielded búntàu (implied 'Chinese vermicelli'), phởtiếu, and sợiphở 'rice noodles'.
                • 麵 miàn 'wheat flour/noodles' became mì, as in bánhmì (from 麵包 miànbāo 'bread', with 包 bāo linked to bánh 餅 bǐng 'bread'), bộtmì (麵粉 miànfěn, 'wheat flour') and, of course, mìsợi (麵條 miàntiáo).

                  Other examples: 

                • 味精 wèijīng ('MSG') → vịtinh, mìchính
                • 水餃 shuǐjiăo ('dumpling') → taivạc → quaivạc
                • 餛飩 húndùn ('wonton') → hoànhthánh, vằngthánh

              Semantic shift is common in all languages, but the Vietnamese preference for Chinese material is telling: if Mon‑Khmer were the root, such borrowing would be far less extensive and would occur mostly on a one-to-one basis.

              Historically, Vietnamese linguistic development paralleled Chinese for at least 1,200 years before the 10th century, and continued thereafter. Numerous Chinese lexical items entered via dialectal contact, and vice versa. Many Vietnamese cognates appear in the Kangxi Dictionary as dialect forms, e.g., mềm (面 miàn, 'soft'), ăn (唵 ǎn, 'eat'); others, like mèo (卯 mǎo, 'cat'), are absent because Chinese scholars reject the cognacy.

              Loan doublets with different pronunciations often entered Vietnamese in different periods or from different dialects. They followed acquisitive models within a linguistic kinship boundary. Thus French bande ~ Vietnamese băng, French pot-au-feu ~ phở, and English cut ~ Vietnamese cắt are not cognate pairs, whereas Chinese 繃 bēng [baŋ] and VS băng, 粉 fěn and VS phở, and 隔 gé [kat] and VS cắt are.

              The dissyllabics approach rests on analysis of Old Chinese, Middle Chinese, dialectal, and Mandarin data, rationalizing semantic relevance and generalizing sound‑change processes that match hundreds of Vietnamese words in their polysyllabic shells, including tonal correspondences across Chinese dialects. By recognizing the disyllabic nature of both Vietnamese and Chinese, we focus less on isolated phonemic shifts (e.g., /s‑ ~ t‑/, /sh‑ ~ th‑/, /t‑ ~ d‑/) and more on the dynamic, synchronous process in which clusters of sounds change together as a unit that is capable of producing multiple Sinitic‑Vietnamese variants, each distinct from the monosyllabic equivalents of their component syllables. 

              Table 11 - Benefits of polysyllabicity

              As a side note, it is worth emphasizing how efficiently the human brain processes polysyllabic structures, a phenomenon long recognized in the Western world. Cognitively, the chief advantage of combining written forms into polysyllabically linked blocks lies in the ability to absorb information more rapidly by perceiving entire conceptual units at once. This parallels the experience of reading in Latin‑based writing systems—particularly German—where speakers combine and capitalize noun strings to achieve the same effect. Comparable outcomes are also evident in other block‑writing systems such as Korean or Thai, though notably not in Chinese.

              A similar principle is applied in practical contexts. In U.S. motorway signage, for instance, drivers can recognize street names more easily when presented in polysyllabic blocks. The City of San Francisco, for example, replaced all‑uppercase street name signs with capitalized‑letter formats nearly two decades ago, significantly improving legibility and recognition from a distance.

              Acknowledging this cognitive process provides a foundation for the dissyllabicity approach advanced in this paper. Building on its basic concepts and general principles, this methodology enables the identification of a vast number of Vietnamese words of Chinese origin. One striking feature of Chinese polysyllabic words borrowed into Vietnamese is the extent to which their vocalism has undergone drastic sound changes, diverging sharply from the original pronunciation.

              As the examples illustrate, writing disyllabic words in their true combined form is central to postulating Sinitic‑Vietnamese etyma. In polysyllabic formation, individual syllables frequently undergo dynamic phonological shifts—deformation, contraction, or assimilation—moving from one form to another according to patterned articulations. Symbolically, this can be represented as:

              XX XX X XX X XX XX X XX...

              Here, spaces mark word boundaries in combined forms, with each XX modeled on block characters, as is scientifically denoted in Korean. Adoption of a similar system is strongly recommended for Chinese as well.

                       

              Comparatively, these sound‑change patterns resemble the way Latin polysyllabic roots generated diverse forms across the Indo‑European languages. Their variations are easier to trace because they are transcribed in the Latin alphabet, or in Cyrillic or Greek. For the Sinitic‑Vietnamese etyma, we adopt a similar process by treating them as phonetic clusters rather than as Chinese ideographic blocks, which can distort conceptual analysis.

              Unconventionally, in Romanized transcription, Vietnamese disyllabic words in this paper are written in combining formation, just as Mandarin multisyllabic words are transcribed in pinyin. For example,

                  • 廢話 fèihuà ('nonsense') → VS baphải
                  • 大話 dàhuà ('pompous') → VS bahoa
                  • 溫馨 wēnxīn ('warm') → VS ấmcúng
                  • 溫水 wēnshuǐ ('hot water') → VS nướcnóng
                  • 溫泉 wēnquán ('hot spring') → VS suối(nước)nóng
                  • 開心 kāixīn ('pleased') → VS hàilòng, vuilòng
                  • 忍心 rěnxīn (SV nhẫntâm, 'cold‑heartedly') → VS đànhlòng
                  • 忍讓 rěnràng ('forbearing') → VS nhườngnhịn

              We will continue to examine this phonetic phenomenon to understand why, in many cases, sound changes in disyllabic words are both phonologically and semantically distinct from their original roots. Seeing multiple derived morphs from the same syllable in different disyllabic forms helps reveal that sound patterns operate across the entire cluster, not as isolated syllables. Yet this same formation may confuse lay readers, leaving the impression that phonological variants of the same Chinese monosyllabic stem are ad hoc.

              As with earlier examples of dissyllabicity, the following illustrations expand on how syllabic changes in combined forms create new meanings. Consider 廢話 fèihuà ('nonsense'). If we accept bahoa as cognate to fèihuà, then 廢 fèi ('waste') aligns with ba through the interchange /f‑/ ~ /b‑/, while +/hoa/ conveys the sense of 'nonsensical', but only in this morphemic form and context.

              One may wonder how ba and fèi could be related. They are etymologically connected only within the disyllabic form 廢話 fèihuà. Vietnamese ba here has nothing to do with ba meaning 'three' or ba 'father'. The /ba/ of bahoa exists only within the phonetic shell of /fèihuà/. Monosyllabic 廢 fèi by itself corresponds to SV phế ('waste') and VS bỏ ('discard'). Thus, ba and hoa individually carry no lexical meaning when isolated; they function only as bound morphemes within the disyllabic structure, where each may carry at most a slight phonosemantic value.

              In this way, /ba/ as a bound morph contributes to both baphải and bahoa, yielding two distinct concepts in different vocalic shells. Here, one plus one produces more than two: several new disyllabic words emerge from 'recycled material' combined with other morphemes. Breaking them into monosyllables misses the point, since the morphemes alone do not carry meaning.

              With the same affix 廢 fèi, we see further developments:

                • bỏphế 廢除 fèichú ('eradicate')
                • bỏđi 廢棄 fèiqì ('abandon')
                • đồbỏ 廢物 fèiwù ('trash')
                • bỏhoang 荒廢 huāngfèi ('deserted')

              Like ba, the Sinitic‑Vietnamese bỏ is not always tied directly to 廢 fèi. It arises through multiple sound‑change processes, especially in disyllabic words. Other examples include:

                • bãibỏ 排除 páichú ('abolish')
                • bỏphiếu 投票 tóupiào ('cast a ballot')
                • bỏrơi 抛棄 pāoqì ('leave behind')
                • bỏđi 放棄 fàngqì ('abandon')
                • bỏqua 放過 fàngguò ('let go')
                • bỏlỡ 錯過 cuòguò ('miss an opportunity')
                • bỏmặc 不管 bùguăn ('do not care')
                • bỏbê 不理 bùlǐ ('abandon')
                • bỏphí 白費 báifèi ('to waste')

              and phrases:

              bỏlỡ dịpmay 放過 機會 fàngguò jīhuì ('miss an opportunity')
              bỏtiền (vô túi) 把錢 進入 口袋 里 bă qián jìnrù kǒudài lǐ ('put money into the pocket')
              bỏtiền ra mua 花錢 來 買 huāqián lái măi ('spend money to buy')

              These shifts to bỏ reflect contextual innovation, involving not only phonological and semantic assimilation but also syntactic reshuffling, such as reversal of word order (đồbỏ, vứtbỏ, bỏhoang) to fit Vietnamese speech habits.

              Similarly, in baphải 廢話 (fèihuà, SV phếthoại), 話 huà evolves into hoa, but how does it become phải? The sound‑change rule ¶ /hw‑ ~ fw‑/ applies, a common pattern in Cantonese and Fukienese compared to Middle Chinese or Mandarin. For example, 葩 pā (SV ba) ~ 花 huā (Cant. /fa1/). In disyllabic formation, /fwa/ could shift to [fai3]. Note also that 話 huà /hwa5/ in its monosyllabic form could evolve into nói ('talk', SV thoại). A parallel pattern ¶ /th‑ (sh‑) ~ n‑/ is seen in 水 shuǐ (SV thuỷ, 'water') → nước; Viet‑Muong /dák/ parallels M 踏 tà 'đạp', VS chà ('trample').

              For the same reasons, sound changes can occur in a variety of other ways. For example:

              開 kāi ~ mở ('open') [cf. Cant. /hoj1/; SV khai; Hai. /k'uj1/; Viet. khui; note pattern ¶ /k‑ (kh‑) ~ m‑ \ hw‑/]
              口 kǒu ~ mỏ ('muzzle') [SV khẩu /kow3/, Cant. /how3/; cf. mồm 吻 (wěn, 'mouth', SV vẫn, VS hôn, hun, 'kiss'); note pattern ¶ /k‑ (kw‑, kh‑) ~ m‑ \ hw‑/]
              底 dǐ ~ trệt ('street level') [Ex. 一樓一底 yī lóu yī dǐ → một lầu một trệt 'the street level and one upper floor']
              快 kuài ~ mau ('fast'; a loan‑graph from the character meaning 'happy', SV vui) [SV khoái /k'waj5/, cf. Cant. /faj1/; note pattern ¶ /k‑ (kw‑, kh‑) ~ m‑ \ hw‑/]
              點 diǎn, as in 快點 kuàidiǎn → maulên ('hurry up') [here 點 diǎn (SV điểm) shifts because ¶ /d‑ ~ l‑/]

              Of course, lên herein does not mean 'ascend, go up, get on'; instead, it functions as a grammatical particle indicating a course of action, similar to 'up' in 'hurry up'. Phonologically, M /tjen3/ corresponds to Vietnamese /len1/, and etymologically both are cognate.

                • 點 [tjen3] also yields tiếng ('hour'), châm ('ignite'), chấm ('dot, dip'), ('a bit'), điểm ('point'), đếm ('count'), etc. [all from M 點 diǎn, diàn, dian, zhān < MC tiɛm < OC *te:mʔ ]. Remarkably, the Vietnamese meanings match the full semantic range of 點 (diǎn) in Chinese dictionaries.

              Separately, for the connotation of lên ('up'), compare:

              lênđây 上來 shànglái ('come up here').

              Here 上 shàng corresponds to lên ('ascend'), while 來 lái is a grammatical particle equivalent to VS ‑đây, assimilated as an adverb of direction, cognate with 此 cǐ (SV thử) or 這 zhè (SV giả) meaning 'here'.

                • 溫 wēn → ấm, but how does 馨 xīn become cúng? It is not the same as cúng 供 (gòng, SV cống, 'make offerings'), but a result of sound change. 馨 xīn is also pronounced xīng in Mandarin and is read hinh /xejng1/ in SV [ 馨 xīng, xīn (hinh, hấn) < MC heŋ < OC *qʰeːŋ ]. The velar /x‑/ often shifts to labiovelar /kw‑/, /k'w‑/ in Chinese [cf. 慶, 磬, 罄, all qìng in Mandarin, khánh in Sino-Vietnamese]. In 馨香 xīnxiāng ('fragrance'), 馨 xīn aligns with thơm, associated with hương (香 xiāng, 'fragrant').

              This illustrates the dissyllabicity approach: begin with a word in Vietnamese or Chinese, expand it into all plausible disyllabic cognates, then eliminate the unreliable to establish the most plausible etyma.
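              The two steps just described (expand, then eliminate) can be sketched in code. What follows is only an illustration in Python, not the tooling actually used for this study. The correspondence table holds a handful of initial‑consonant patterns cited in this section, the mini‑lexicon is hypothetical, and the function names `expand` and `eliminate` are invented for the example. Real work would also have to weigh finals, tones, and dialect readings, which this sketch ignores.

```python
# Illustrative subset of initial correspondences (Mandarin initial -> Vietnamese initials).
# "f", "k", and "d" follow patterns cited in the text; "h" -> "ph" is a simplified
# stand-in for the /hw- ~ fw-/ pattern. This is an assumption-laden toy table.
PATTERNS = {
    "f": {"b"},
    "k": {"m"},
    "d": {"l"},
    "h": {"h", "ph"},
}

# Hypothetical mini-lexicon: (characters, toneless pinyin syllables, gloss).
CHINESE_COMPOUNDS = [
    ("廢話", ("fei", "hua"), "nonsense"),
    ("廢物", ("fei", "wu"), "trash"),
    ("快點", ("kuai", "dian"), "hurry up"),
]


def mandarin_initial(syllable):
    """Return the longest initial in PATTERNS that prefixes the syllable, or ''."""
    matches = [initial for initial in PATTERNS if syllable.startswith(initial)]
    return max(matches, key=len) if matches else ""


def sounds_match(viet_syllables, pinyin_syllables):
    """Check every syllable pair against the correspondence table."""
    if len(viet_syllables) != len(pinyin_syllables):
        return False
    for viet, pinyin in zip(viet_syllables, pinyin_syllables):
        allowed = PATTERNS.get(mandarin_initial(pinyin), ())
        if not any(viet.startswith(initial) for initial in allowed):
            return False
    return True


def expand(viet_syllables):
    """Step 1: collect every compound whose sound shell could match."""
    return [entry for entry in CHINESE_COMPOUNDS
            if sounds_match(viet_syllables, entry[1])]


def eliminate(candidates, gloss):
    """Step 2: drop candidates whose meaning does not fit."""
    return [chars for chars, _, meaning in candidates if meaning == gloss]
```

              For instance, `eliminate(expand(("ba", "phải")), "nonsense")` keeps only 廢話, because both syllables pass the sound test and the gloss agrees. The whole compound is tested as a unit, so a syllable such as ba is never matched on its own.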

              Many examples show how the same morphemic syllable in derived disyllabic words evolves into bound morphemes or compounded morphs, which cannot stand alone but function only within polysyllabic formations.

              For instance, 利 lì (SV lợi, lị) yields VS lời and lãi:

                • 利息 lìxī (SV lợitức) → VS tiềnlời ('profit, interest rate')
                • 利得 lìdé (SV lợiđắc) → VS đượclãi ('making money')
                • 贏利 yínglì (SV anhlợi) → VS ănlời ('earning profits')
                • 有利 yǒulì (SV hữulợi) → VS cólợi ('beneficial')
                • 利事 lìshì (SV lợisự) → Cant. /lei2sei2/ → VS lìxì ('red‑enveloped money')
                • 流利 liúlì (SV lưulợi) → VS lưuloát ('fluently')
                • 順利 shùnlì (SV thuậnlợi) → VS suônsẻ ('smoothly')
                • 伶利 línglì (SV linhlợi) → VS lanhlẹ, lanhlẹn ('quick')

              The same syllable thus produces lãi or lời as stand‑alone words, or morphemic derivatives like ‑sẻ, ‑loát, ‑lẹn. These may or may not carry meaning independently, depending on their associative strength. All are related: lời ~ lãi ~ lợi ~ lị. Note that lời and lãi were coined euphemistically to avoid the taboo of King Lê Lợi’s name (黎利, i.e., Lê Thái Tổ 黎太祖 Lí Tàizǔ, 15th century).
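              The one‑to‑many relation between 利 lì and its Vietnamese reflexes can be laid out as a small data structure. The Python sketch below is purely illustrative: the third field of each record (which part of the Vietnamese form answers to 利) is read off from the discussion above and is entered by hand, and the name `by_reflex` is invented for the example.

```python
from collections import defaultdict

# (Chinese compound, Vietnamese form, reflex of 利 in that form).
# The reflex column is a manual alignment taken from the surrounding text.
LI_COMPOUNDS = [
    ("利息", "tiềnlời", "lời"),
    ("利得", "đượclãi", "lãi"),
    ("贏利", "ănlời", "lời"),
    ("有利", "cólợi", "lợi"),
    ("流利", "lưuloát", "loát"),
    ("順利", "suônsẻ", "sẻ"),
    ("伶利", "lanhlẹn", "lẹn"),
]

# Group the compounds by the shape that 利 takes inside each Vietnamese word.
by_reflex = defaultdict(list)
for compound, _vietnamese, reflex in LI_COMPOUNDS:
    by_reflex[reflex].append(compound)

for reflex, compounds in by_reflex.items():
    print(reflex, "<-", ", ".join(compounds))
```

              Grouping by reflex rather than by character makes the point visually: a single Chinese syllable fans out into several Vietnamese shapes, and each shape is tied to the particular compounds in which it occurs.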

              Further illustration: lẹ ('quick') in 伶利 línglì (SV linhlợi) → VS lanhlẹ. Compare mauchóng 敏捷 mǐnjié ('quickly'), a variation of chóngmau (盡快 jìnkuài, 'as fast as possible'), colloquially linked to 馬上 (mǎshàng, 'immediately', lit. 'on horseback'). Here 快 kuài (SV khoái) is the source of VS mau ('fast'), though no other Chinese character directly matches /maw/. Interestingly, kuài also meant 'happy' (VS vui), showing the pattern ¶ /k‑ (kh‑) ~ wj‑, v‑ \ hw‑/.

              These examples show that in disyllabic forms, either syllable can evolve into various Vietnamese sounds in other compounds. They also demonstrate metathesis (reversal of order), e.g., mauchóng vs. chóngmau, nhanhchóng vs. chóngvánh.

              The novel disyllabic approach departs from the old focus on isolated monosyllables and the boring pattern of one‑to‑one cognates. For example, few specialists have posited bậnviệc with 忙活 mánghuó, or phànnàn with 抱怨 bàoyuàn. Many stop at 務 wù (SV vụ) or 役 yì (SV dịch), missing that VS việc ('work') is cognate with 活 huó (SV hoạt).

              Thus, many Sinitic‑Vietnamese etyma have been overlooked due to the misconception of Vietnamese and Chinese as monosyllabic. Monosyllabicity is a primitive feature of proto‑languages, not of modern Vietnamese or Chinese. This misconception has hindered breakthroughs in Vietnamese etymology. The antithetical views presented here aim to correct this and open new approaches.

              The dissyllabicity approach rests on two premises:

              • Both modern Vietnamese and Chinese are fundamentally disyllabic, with a high percentage of two‑syllable words (see Chou Fa‑Kao, 1982).
              • There exists a deep kinship between them, traceable through Tai > Yue > Dai and Tai > Chu > Han lineages.

              The author’s hypothesis began with instinct and suspicion of this distant genetic affinity, then was confirmed by systematically matching forms in disyllabic structures.

              Intuitively, applying the dissyllabicity approach has already uncovered thousands of Vietnamese words of Chinese origin. This method has also enabled the author to identify certain basic words with a high degree of accuracy. To illustrate how this process works, consider the following examples:

              1. chimchóc 禽獸 (qínshòu, SV cầmthú, 'birds')

                • Also VS thúvật ~ convật for SV cầmthú ('animals').
                • For chóc, often treated in modern Vietnamese as a reduplicative syllable, the evidence suggests it was originally an independent monosyllabic word. It is associated with 雀 què, qiăo, qiāo (SV tước).
                • M 禽 qín < MC gim < OC *ghjəm. Dialects: Chaozhou ʑin12, Wenzhou ʑiaŋ12, Shuangfeng ʑin12.
                • For chim ('bird'), cf. M 鳥 niăo (SV điểu) ~ Hai. /jiao2/. 
                • Starostin notes that 禽 was frequently used since Late Zhou with the meaning 'wild bird(s)' or 'something caught', while 擒 was used for 'to capture'. 獸 shòu < MC ʂjəw < OC *ʔjəwʔh.

              Thus, 禽獸 qínshòu corresponds to thúvật or convật ('animals'). But chóc is a basic word synonymous with chim, preserved as a dialectal variant in Thanhhoá and Ninh bình, the region of the ancient capital Hoalư in the 10th century. Its survival there strengthens the case for chóc as an authentic monosyllabic root, likely cognate with Old Chinese forms.

              2. chóc 雀 (què, qiăo, qiāo, SV tước, 'bird')

                • M 雀 què, qiăo, qiāo < MC cjak < OC *tɕekw. | Used in compounds like chimchóc 禽雀 qínqiāo.
                • The regular Sino‑Vietnamese reflex is tước, but chóc survives in Vietnamese as a true monosyllabic word. 
                • Starostin reconstructs tɕekʷ, noting 雀 was also used as a general name for small birds in early Chinese.

              3. chảcá 炸魚 (zhàyú, SV tạcngư, 'fried fish cake')

                • Literally 'fried fish'.
                • M 炸 zhà < MC tɕak < OC *tɕra:ks; 魚 yú < MC ŋʊ < OC *ŋha.
                • Vietnamese chả (boiled ground meat cake, 'ham') derives from 炸 zhà ('deep fry'). Semantically, it shifted from 'fry' to 'meat cake', but retains the original sense in chảcá 炸魚 ('fried fish').
                • Cf. chảrươi 炸虲 ('fried worm'), chạotôm 炸蝦 ('fried shrimp cake').
                • Vietnamese rán ('fry') is cognate with 煎 jiān ('fry'), hence both rán and chiên coexist.
                • In the south, chả is used; in the north, giò (giòlụa, chảlụa 炸肉 zhàròu). Taiwanese usage: 紮肉 zhāròu ('boiled pork meatloaf'), 魚扎 yúzhā ('fish cake').

              4. chảlụa ~ giòlụa 炸肉 (zhàròu, SV tạcnhục, 'boiled meatloaf')

                • Literally 'fried meat'. 
                • 肉 ròu < MC ȵuwk < OC *njuɡ
                • lụa here is a phonetic variant of 肉 ròu (¶ /r‑ ~ l‑/), not related to 綢 chóu ('silk').

              5. cậtruột 骨肉 gǔròu (SV cốtnhục, 'blood kinship')

                • Literally 'bone and flesh'.
                • 骨 gǔ aligns with cật; 肉 ròu aligns with ruột ('intestine'), extended metaphorically to 'kinship'.
                • Cf. 親子 qīnzǐ ('conruột'), 親爹 qīndiē ('charuột'), 親母 qīnmǔ ('mẹruột').

              6. barọi 肥肉 féiròu (SV phìnhục, 'bacon')

                • 肥 féi 'fat' → ba; 肉 ròu → rọi (voiced variant).
                • Possibly linked to ba (三 sān, 'three'), innovated as 'three layers of meat' (thịtbarọi). Cf. 五花肉 wǔhuāròu ('streaky pork').

              7. búnriêu 蟹粉 xiéfěn (SV giảiphấn, 'crab noodle soup')

                • M 粉 fěn, fèn < MC pun < OC *pɯnʔ
                • Vietnamese bún ('vermicelli') is likely an independent loan from 粉 | ¶ /f‑ ~ b‑/.

              8. mắmriêu 鹹蟹 xiánxié (SV hàmgiải, 'salted crab sauce')

                • 蟹 xié 'crab' → ghẹ, cua, cáy.
                • 蟹 (蠏) xiè, xiě, xié < MC ɦaɨj < OC *gre:ʔ
                • Vietnamese riêu/rêu plausibly derives from 蟹 xié.

              9. mắmruốc 鹹蝦 xiánxiā (SV hàmhà, 'shrimp paste')

                • 蝦 xiā 'shrimp' → ruốc, tôm, tép.
                • ¶ /x‑ (OC *ghr‑) ~ r‑/.

              10. mắmcá 鹹魚 xiányú (SV hàmngư, 'salted fish paste')

                • 鹹 xián 'salty' → mặn.
                • Vietnamese mắm ('fish sauce') can be traced to 鹹 xián.

              Synthesis: These examples demonstrate how the dissyllabicity approach reveals Vietnamese cognates of Chinese etyma, often hidden in bound morphemes or dialectal variants. Words like chóc, lụa, ruột, rọi, mắm, riêu, and ruốc show how phonological shifts, semantic extensions, and local innovations transformed Chinese monosyllables into Vietnamese disyllabic forms.

              For the etymon mắm (the well‑known Vietnamese 'fish sauce'), we may posit a connection with 鹹 xián, originally cognate with mặn ('salt, salted'), to denote a staple that in fact has Chamic origins (see Nguyễn Ngọc San, 1993). For related items, riêu or rêu can plausibly be traced to 蟹 xié, which also underlies ghẹ, cáy, and cua (three Vietnamese words for different kinds of crabs; notably, ghẹ resembles small Alaskan king crabs). Likewise, bún ('vermicelli') derives from 粉 fěn, which also produced phở ('noodle'), bột ('flour'), phấn ('chalk'), and bụi ('dust'). We may also postulate ruốc as cognate with 蝦 xiā (VS tép, tôm, 'shrimp, prawn'), while another type of ruốc ('fried shredded meat jerky') reflects 肉 ròu, by dialectal variation in the south, alongside ruột, rọi, and lụa functioning as morphs in the compounds cited above.

              In short, the dissyllabicity approach has enabled the identification of more than 20,000 so‑called "pure Vietnamese" etyma that are in fact cognate with Chinese forms, spanning ancient to modern dialects, both literary and vernacular. These Sinitic‑Vietnamese items include words long regarded by purists as indigenous Nôm, such as chim, chóc, chả, giò, , lụa, ruột, rọi, mặn, mắm, riêu (rêu), ghẹ, cua, cáy, ruốc, tôm, tép, bún, bột, phấn, bụi, phở, and others. Amusingly, some specialists have even proposed French or unrelated Chinese origins, for example, linking phở to French feu (as in pot‑au‑feu) or lụa to 綢 chóu ('silk'), but the dissyllabicity analysis demonstrates more plausible Sinitic connections.

              As shown, the discovery of a large body of authentic Chinese-Vietnamese cognates through syllabic association provides strong evidence of genetic kinship between the two languages. These etymological findings serve as building blocks for a framework of their historical affiliation and, at the very least, offer solid evidence accounting for over 95% of the Vietnamese lexicon of Sinitic origin.

              Table 12 – The French Do Not Speak French

              Anthropologically, the development of Vietnamese can be examined in parallel with the ethnogenesis of the Kinh people, provided we accept that the origins of language are reflected in its most basic words. Such a perspective challenges long‑standing claims advanced by Austroasiatic Mon‑Khmer theorists. The Vietnamese Kinh are a mixed people, descended from Sinicized Yue migrants from southern China who resettled in the ancient land known in Vietnamese history as Vănlang. There, they intermarried with Daic indigenous populations. This process unfolded over at least two millennia and continued until the Han conquest of ancient Annam.

              Linguistically, Vietnamese etyma of solid Chinese origin dominate, shaping both the sound system and structural character of the language as it exists today. To understand this affiliation, one may compare the relationship between Vietnamese and Chinese to that between English and its Greek, Latin, and French elements, as opposed to its Anglo‑Saxon base. All belong to the Indo‑European family, yet English vocabulary reveals striking contrasts: water vs. French eau, one vs. une, snow vs. neige, and so forth.

              By way of analogy, it is worth recalling that the French today do not speak their ancestral Gaulish tongue, but rather a Latin‑derived language now called French. Readers should note, however, that this is not the case with Vietnamese, whose linguistic foundation remains deeply and enduringly tied to its Sinitic heritage.

              In each polysyllabic Chinese word, composed of two or more syllables or morphemes represented by individual characters, every unit, regardless of its meaning,  is associated with a morpheme that may appear under different phonetic shells, whether in monosyllabic or polysyllabic form.

              For example, in VS bồhòn ('wingleaf soapberry'), we find parallels such as 無患 wúhuàn (SV vôhoạn), 苦患 kǔhuàn (Hainanese, SV khổhoạn), 油患 yóuhuàn (Sichuan, SV duhoạn), and 木患 mùhuàn (in Li Shizhen, 'Sapindus saponaria', SV mụchoạn). Here 患 huàn ('trouble') functioned as a loangraph (假借 jiăjiè) for 丸 wán (SV hoàn, VS hòn 'ball‑shaped object') (An Chi 2016, Vol. 2, p. 154). This illustrates how syllabic presentations in Chinese characters may convey entirely different meanings, e.g., 患 vs. 丸, regardless of their written form. Such loangraphs, or 'internal loanwords', were often sound‑loans unrelated to the original semantic value, and by association they entered Vietnamese compounds as well.

              In both languages, a morpheme typically coincides with a syllable, which can freely combine with others to form new words, regardless of its core meaning. For instance:

              (i) on the Chinese side,
              • 運氣 yùnqì: hênxui ('by luck')
              • 起碼 qǐmă: ítra ('at least')
              • 馬虎 măhǔ: qualoa ('carelessly')
              • 馬上 măshàng: mauchóng ('quickly')
              • 便宜 piányi: giábèo, rẻbèo ('cheap')
              • 便秘 biànmì: táobón ('constipation') [<~ SV tiệnmật]
              • 東西 dōngxī: đồđạc ('things')
              • 東家 dōngjiā: chủnhà ('host')
              • 聊天 liáotiān: tròchuyện ('chat')
              • 無聊 wúliáo: lạtlẽo, nhạtnhẽo ('boring'), vôduyên ('nonsense')
              • 陌生 mòshēng: lạlùng ('strange')
              • 花生 huāshēng: đậuphụng ('peanut')
              • 棒子 bàngzǐ: tráibắp ('corncob')
              • 包米 bāomǐ: bắpmì ('corn kernel')
              • 玉米 yùmǐ: ngôbắp ('corn')
              • 玉丸 yùwán: hòndái ('testicle') [cf. SV ngọchoàn in medical usage]
              • 點心 diănxīn: dằnbụng ('snack') [~ VS lótlòng; SV điểmtâm 'breakfast']
              • 點錢 diănqián: đếmtiền ('count money')
              etc.

              (ii) and here on the Vietnamese side,

              • đườngmật 甜蜜 tiánmì ('sweetly') [@ 甜 tián (SV điềm) ~ 糖 táng 'sugar']
              • dưahấu 塊瓜 kuàiguā ('watermelon') [modern M 西瓜 xīguā (SV tâyqua) → VS dưatây 'honeydew']
              • thathiết 體貼 tǐtiē ('heartily')
              • bênhvực 包庇 bāobì ('take side')
              • bánhmì 麵包 miànbāo ('bread')
              • làmviệc 幹活 gànhuó ('work')
              • bậnviệc 忙活 mánghuó ('busy')
              • cậtruột 骨肉 gǔròu ('blood kinship')
              • chảgiò 炸肉 zhàròu ('fried spring roll')
              • cẩuthả 苟且 gǒuqiě ('carelessly')
              • nhưngmà 而且 érqiě ('but also')
              • mứcđộ 幅度 fúdù ('extent')
              • bứcvẽ 畫幅 huàfú ('a painting')
              • đòngang 渡江 dùjiāng ('ferry boat')
              • núisông 江山 jiāngshān ('country')
              • trờinắng 太陽 tàiyáng ('sunshine')
              • tạnhtrời 晴天 qíngtiān ('dry weather')
              • banngày 白天 báitiān ('daylight') [<~ 白日 báirì]
              • bồcâu 白鴿 báigē ('dove')
              • ănbám 白吃 báichī ('live on others’ labor')
              • vívon 比方 bǐfāng ('exemplify')
              • thídụ 比喻 bǐyù ('for example')
              • mồcôi 無根 wúgēn ('orphan')
              • mùtịt 無知 wúzhī ('ignorance')
              • lạtlẽo 無聊 wúliáo ('boring') ~ VS nhạtnhẽo
              • lùmù 朦朧 ménglóng ('vague') ~ lờmờ (SV mônglung)
              • bưngbít 蒙蔽 méngbì ('hoodwinking')
              • vỡlòng 啓蒙 qǐméng ('pre‑schooling')
              • hàilòng 開心 kāixīn ('pleased')
              • vừalòng 滿心 mănxīn ('satisfied')
              • vừaý 滿意 mănyì ('pleased')
              • chấpnhất 在意 zàiyì ('to mind') ~ đểý
              • hứngchịu 承受 chéngshòu ('undergo')
              • chấpnhận 忍受 rěnshòu ('endure')
              • rũimà 萬一 wànyī ('just in case')
              • muônvàn 千萬 qiānwàn ('countlessly')
              • ôngchủ 主公 zhǔgōng ('master') ~ chúacông
              • gàtrống 公雞 gōngjī ('rooster') ~ gàcồ
              • đànbà 婦道 fùdào ('woman')
              • bàxã 媳婦 xífù ('wife, term of endearment')
              • tiêupha 花銷 huāxiāo ('spend money') 
              • đồngbạc 銅版 tóngbăn ('dong, monetary unit')
              • đồnghồ 銅壺 tónghú ('clock, watch')
              • bánsỉ 批發 pīfā ('wholesale')

                The same process can be extended to many other words, which specialists in Vietnamese are invited to supply. Here are some suggestions:
              • nóngnảy 衝動 chōngdòng ('hot temper')
              • phơira 披露 pīlù ('expose')
              • vấtvả 奔波 bēnbō ('struggling, hand‑to‑mouth') [~ VS tấttả, SV bônba]
              • múarối 木偶戲 mù'ǒuxì ('puppetry')
              • bắtđền 賠償 péicháng ('ask for compensation') [~ bắtthường]
              • lánggiềng 鄰居 línjū ('neighbor') [~ hàngxóm]
              • duyênnợ 緣份 yuánfèn ('marital encounter')
              • yêuđương 愛戴 àidài ('love')
              • conruột # 親子 qīnzǐ ('biological child')
              • đạochích 盜賊 dàozéi ('burglar, thief') [~ trộmcắp]
              • dêxòm 婬蟲 yínchóng ('lecherous') [~ quỷrâuxanh]
                etc.

              For the Chinese examples cited, any trained linguist knows that the ideographs involved often have little to do with the meanings they convey. Within a compound, each character frequently functions as nothing more than a sound unit, especially in the case of so‑called “internally borrowed characters” (假借 jiǎjiè), characters used primarily to coin new words by sound‑loan rather than by semantic value. The same principle applies on the Vietnamese side. A Chinese dictionary will show countless characters, including polysyllabic words, with multiple meanings; yet in many of the examples above, they are loan graphs employed simply to phoneticize or transcribe sounds for particular concepts.

              When Chinese words were borrowed into and localized in Vietnamese, either one or both syllables of the compound could be re‑associated with Vietnamese words of similar sound and meaning. Amusingly, what emerges in Vietnamese is sometimes no longer what the original Chinese form signified. In other words, the Vietnamese reflex may not descend directly from the same Chinese root. Words of this type are innumerable.

              Take 起 qǐ, which signifies 'rise' (SV khởi, VS dậy, nổi) [M 起 qǐ < MC kʰɨ < OC *kʰɯʔ]. In actual usage, this morpheme readily acquires re‑assigned meanings shaped by the semantic context of speech. This is no longer a matter of metathesis, as when a new disyllabic word was originally coined (for example, in choosing between 興起 xīngqǐ, xìngqǐ or 起興 qǐxìng ('arousing') for VS nổihứng), but rather a case of adapting and extending an existing form with whatever is conveniently available in context. Illustrations include:

              • 起床 qǐchuáng → VS ngủdậy ('wake up, rise') [@ 起 qǐ ~ ngủ, @ 床 chuáng ~ dậy]

              • 起義 qǐyì → VS nổidậy ('rise against') [@ 起 qǐ ~ nổi, @ 義 yì ~ dậy]

              Yet in other compounds, 起 qǐ associates with different sounds and concepts:

              • 起碼 qǐmă → VS ítra ('at least')

              • 起源 qǐyuán → VS bắtnguồn ('originate') [起 Cant. /hej3/ > /bej3/ > /bæt7/ > VN /bắt‑/]

              • 起頭 qǐtóu → VS khởiđầu ('start')

              • 起步 qǐbù → VS cấtbước ('take steps') [起 Cant. /hej3/ > /kej3/ > /kʌt7/ > VN /kất‑/]

              • 興起 xìngqǐ → VS hứngchí ('excited') [起 qǐ > /cij5/] → cf. nổihứng, mừngrỡ

              Similarly, consider 順 shùn (SV thuận) [M QT 順 shùn < MC ʑwin < OC *ɢljuns (Schuessler: mljuəns)].

              Examples include:

              • 順利 shùnlì → VS suônsẻ, chótlọt ~ trótlọt ('smoothly')

              • 順風 shùnfēng → VS xuôigió, thuậngió ('tail wind')

              • 順水 shùnshuǐ → VS xuôidòng ('sail with the current')

              • 順手 shùnshǒu → VS thuậntay, sẵntay, luônthể ('conveniently')

              • 順便 shùnbiàn → VS luôntiện, sẵntiện ('conveniently')

              • 孝順 xiàoshùn → VS hiếuthảo ('filial piety')

              Thus, morphemic syllables like 起 and 順 are binding forms that have evolved into different sounds, meanings, and words in Vietnamese. Within Chinese itself, such morphemes are innumerable. By pursuing the dissyllabicity approach, nearly all Sinitic‑Vietnamese words can be traced back to Chinese equivalents or roots.

              These etyma have long been overlooked due to the entrenched misconception that both Vietnamese and Chinese are fundamentally monosyllabic. This view has obscured recognition of the exponential sound changes that occur in disyllabic formations, where shifts may diverge from their monosyllabic equivalents. Phonologically, in ancient times both languages were likely monosyllabic, as languages generally evolve from simplicity to complexity. It is easier to confirm monosyllabicity in Chinese, given literary evidence from three millennia ago, than in Vietnamese, whose last written forms in Chinese characters date only to the early 20th century. For over a thousand years after independence, records of ancient Vietnam were largely compiled from Chinese bibliographies. Still, the basic words shared by both languages point to an early monosyllabic stage, with some items evolving into disyllabic forms to differentiate meaning, e.g., đầugối 膝蓋 xīgài ('knee'), cùichỏ 手肘 shǒuzhǒu ('elbow').

              Modern Vietnamese orthography, however, disguises this reality. Most words are disyllabic in nature, yet still written as separate monosyllabic components. This convention has misled untrained readers scanning dictionaries, where thousands of disyllabic words appear disconnected in isolated syllables. The practice stems from the adaptation of Chinese characters into separate written forms, as reflected in early dictionaries such as Đại‑Nam Quấc‑âm Tự‑vị 大南 國音 字彙 (Dictionary of National Sounds of the Great Southern Kingdom, compiled by Huình Tịnh Của, late 19th century). There, each entry is listed character by character, e.g., 江 giang 'river', 山 san 'mountain', and compounds like 江山 giang san are treated as separate words, even though together they mean 'country', not merely 'rivers and mountains'. Such shortcomings reflect the limited linguistic training of early lexicographers.

              Cognitively, this monosyllabic way of writing is even more misleading than the earlier use of hyphens (e.g., giang‑san, quốc‑gia), which remained common until the early 1970s. The persistence of this practice owes much to user convenience and educational neglect, both of which have contributed to the shortcomings of the modern national orthography. (See Appendices)
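              The difference between the three conventions (separate syllables, hyphenated forms, and combined forms) can be made concrete with a short Python sketch. It is an illustration only: the three‑entry lexicon is hypothetical, the function name `join_disyllables` is invented here, and a real converter would need a full dictionary of disyllabic words together with rules for ambiguous syllable sequences.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical mini-lexicon of disyllabic words, stored as syllable pairs.
DISYLLABIC = {
    (nfc(first), nfc(second))
    for first, second in [("giang", "san"), ("quốc", "gia"), ("đầu", "gối")]
}


def join_disyllables(text):
    """Rewrite separately written or hyphenated syllables in combined form."""
    # Treat ASCII hyphens and non-breaking hyphens (U+2011) like spaces.
    cleaned = nfc(text).replace("-", " ").replace("\u2011", " ")
    syllables = cleaned.split()
    words, i = [], 0
    while i < len(syllables):
        pair = tuple(s.lower() for s in syllables[i:i + 2])
        if pair in DISYLLABIC:
            # Known disyllabic word: write both syllables as one block.
            words.append(syllables[i] + syllables[i + 1])
            i += 2
        else:
            words.append(syllables[i])
            i += 1
    return " ".join(words)
```

              Under these assumptions, both `join_disyllables("giang san")` and `join_disyllables("giang-san")` return `giangsan`, while a sequence not in the lexicon is left as written.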

              In the past, many specialists with an incomplete command of Vietnamese insisted on its supposed monosyllabicity. A representative view was expressed by Barker (1966, p. 10): "With the exception of certain compounds, reduplicative patterns, and loanwords, Vietnamese and Muong are both monosyllabic languages." If we took this paradigm seriously and applied it equally to English, setting aside its compounds and loanwords and keeping only the Anglo‑Saxon component, then English too would appear monosyllabic in many respects.

              Barker's statement, however, revealed the limits of his mastery of Vietnamese. Even today, Western specialists still confuse Sino‑Vietnamese with Sinitic‑Vietnamese words. In Barker’s time, linguists like him were often surrounded by 'Vietnamese admirers' from half‑trained linguistic circles, eager to celebrate a foreigner who could simply pronounce Vietnamese sounds. One can almost picture the scene: exclamations of "oh", "ah", "wow", "he can even speak Vietnamese!" Yet nobody marvels at the millions of Vietnamese who speak English fluently, or the hundreds who have authored books in that language. The rarity of Westerners who master Vietnamese has made their words seem disproportionately precious.

              To be fair, Barker was a respected linguist of Southeast Asian languages, academically well‑equipped with methodologies and field experience. But in his Vietnamese study, his reliance on linguistically untrained informants and interpreters, combined with the application of "bookish" formulae, left him ill‑prepared. His conclusion that only "certain compounds, reduplicative patterns, and loanwords" break the supposed monosyllabic mold is enough to disqualify his authority in this specific field. For readers unfamiliar with Vietnamese, such a statement misleadingly suggests that only a handful of multisyllabic words exist. Nothing could be further from the truth.

              More than three generations have passed since Barker's era, yet no major breakthrough has overturned the lingering influence of his view. Vietnamese learners still encounter dictionaries that present wordlists monosyllabically, a practice replicated across print and digital media. Novices are thus visually misled by an orthography that insists on listing syllables separately, treating them as characters 字 (chữ) rather than words 辭 (từ). By contrast, it would be unimaginable for students of Mandarin or Korean to treat Romanized monosyllables as words in their own right. Yet in Vietnamese linguistics, this outdated perspective persists.

              It is true that many ancient and later disyllabic lexemes can be analyzed as combinations of monosyllabic elements, each of which may function as an affix in other compounds (cf. English homepage, website, logon, blogger, facebook, facetime). But many Vietnamese words formed in this way denote entirely new concepts. Analytically, they are not “compounds” but composite words: forms built from bound morphemes that cannot be broken down into independent syllables.

              Some of the most basic Vietnamese anatomical terms illustrate this point. Words such as bànchân ('foot'), đầugối ('knee'), mắccá ('ankle'), cổtay ('wrist'), càngcổ ('neck'), bảvai ('shoulders'), cùichỏ ('elbow'), màngtang ('temple'), mỏác ('fontanel'), and chânmày ('eyebrow') are all disyllabic composites. Each is made up of bound morphemes that must appear together; neither syllable can stand alone. Conceptually, they function just like their English counterparts.

              The original meanings of the individual syllables often diverge from the meaning of the composite. For example, in đầugối ('knee'), đầu (cf. C 頭 tóu 'head') and gối (cf. M 枕 zhěn 'pillow', as in 枕頭 zhěntóu) have nothing to do with the concept of 'knee'. The Chinese equivalent 膝蓋 xīgài conveys the meaning directly, but Vietnamese expresses it through a composite whose parts no longer transparently relate to the whole and each syllable cannot stand alone. This example illustrates how Vietnamese and Chinese share cognate structures in fundamental vocabulary, though often through divergent semantic paths.

              Beyond anatomy, countless other composite disyllabic words formed from bound morphemes exist across semantic domains: càunhàu ('growl'), cằnnhằn ('grumble'), bângkhuâng ('pensive'), bồihồi ('melancholy'), mồhôi ('sweat'), mồcôi ('orphan'), hàilòng ('pleased'), taitiếng ('infamous'), tạmbợ ('temporary'), tráchmóc ('reproach'), tuyệtvời ('wonderful'), tămhơi ('whereabouts').

              Polysyllabic forms further demonstrate this creativity: cườimĩmchi ('shoot a smile'), tủmtỉmcười ('hide a smile'), mêtítthòlò ('fatally irresistible attraction'), nhảyđồngđổng ('jump up in protest'), bađồngbảyđổi ('change unpredictably'), hằnghàsasố ('innumerable'), lộntùngphèo ('turn upside down'), tuyệtcúmèo ('fabulous').

              All of these examples underscore the same point: Vietnamese is not a monosyllabic language, but one rich in disyllabic and polysyllabic composites, whose bound morphemes function in ways comparable to those of any other language. (平)

              Polysyllabically and morphemically speaking, even in the case of solid Sino‑Vietnamese words of verified Middle Chinese origin such as hiệntại 現在 xiànzài ('present'), phụnữ 婦女 fùnǚ ('woman'), or sơnhà 山河 shānhé ('country'), each lexical stem derived from a Chinese character, though capable of functioning as a syllable‑word, cannot be used independently as a free form in Vietnamese. Each must combine with another syllable to form a complete lexical item. For example, núivàng ('gold mountain') corresponding to SV kimsơn 金山 jīnshān cannot be arbitrarily recombined across classes, such as {SV sơn + VS vàng}, {VS núi + SV kim}, or {VS trèo + SV san}, to yield legitimate words. Such hybrid pairings are impermissible. The only exception arises when no “pure” Vietnamese equivalent exists to replace one of the syllables, as in bàntay 手板 shǒubǎn ('palm'), phấnviết 粉筆 fěnbǐ ('chalk'), or miến gà 雞麵 jīmiàn ('chicken noodle soup'), where elements like bàn 板, phấn 粉, or miến 麵 are Sino‑Vietnamese morphemes.

              Remaining with the subject of polysyllabicity, it is worth stressing that if disyllabic and polysyllabic words were written in combining formation rather than as separate syllables in Vietnamese orthography, children would gain a stronger cognitive foundation for abstract thought. At the same time, foreign learners would acquire vocabulary more logically and efficiently, and even specialists such as Barker would be less likely to misinterpret Vietnamese as monosyllabic. Compare the difficulty ESL learners face with English phrasal verbs (keep up, go on, put up with, come on) versus the relative ease of acquiring polysyllabic composites (nevertheless, meanwhile, aforementioned, albeit, regarding, pan‑America, trans-Siberia), which parallel Vietnamese forms such as dùlà, trongkhiđó, kểtrên, dùthế, đốivới, xuyênMỹ, xuyênTâybálợiá. In short, reliance on the antiquated orthography of separated syllables is inadequate grounds for labeling Vietnamese as monosyllabic. (See Vietnamese2020 Writing Reform Proposal.)
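
As a rough illustration of what writing in combining formation would involve mechanically, the sketch below joins separately written syllables into single words by greedy longest match against a word list. The function name `combine_syllables` and the tiny `LEXICON` are hypothetical and built only from composites cited in this chapter; a real converter would need a full dictionary and a way to resolve ambiguous segmentations.

```python
# Hypothetical mini-lexicon: composites cited in this chapter, stored as syllable tuples.
LEXICON = {
    ("đầu", "gối"),          # 'knee'
    ("bàn", "chân"),         # 'foot'
    ("mồ", "hôi"),           # 'sweat'
    ("trong", "khi", "đó"),  # 'meanwhile'
}
MAX_LEN = max(len(entry) for entry in LEXICON)


def combine_syllables(text: str) -> str:
    """Join space-separated syllables into combined-form words (greedy longest match)."""
    syllables = text.split()
    out = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, then shorter ones (minimum two syllables).
        for n in range(min(MAX_LEN, len(syllables) - i), 1, -1):
            chunk = tuple(s.lower() for s in syllables[i:i + n])
            if chunk in LEXICON:
                out.append("".join(syllables[i:i + n]))
                i += n
                break
        else:
            # No multi-syllable entry starts here: keep the syllable unchanged.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(combine_syllables("đầu gối"))
print(combine_syllables("trong khi đó mồ hôi"))
```

Greedy matching is chosen here only for brevity; it treats the lexicon as the sole authority on word boundaries, which mirrors the chapter's point that the word, not the syllable, is the unit a dictionary should list.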

              On the question of polysyllabicity, prominent Vietnamese linguists such as Bùi Đức Tịnh (1966, p. 82), siding with Hồ Hữu Tường, rejected the notion of Vietnamese as monosyllabic. Both argued for its disyllabic character, citing the high frequency of disyllabic words in ordinary passages. Analogously, just as English draws heavily on Latin and Greek roots, the fact that Sino‑Vietnamese words constitute over 90 percent of entries in modern dictionaries is itself sufficient evidence of Vietnamese disyllabicity.

              From a broader perspective, virtually all world languages are polysyllabic, and their orthographies reflect this nature, including recently romanized systems such as Hmong. French and English loanwords in Vietnamese also entered as indivisible polysyllabic units: French acide → axít, pourboire → buộcboa, auto → ôtô, compas → cômpa, toilette → toalét. To break them apart into a xít, buộc boa, ô tô, côm pa, toa lét, as Phan Hữu Dật (1998) and others did, is misguided. By contrast, younger generations have sensibly accepted complete loan packages as indivisible units: Washington, New York, Canada, dollar, visa, silicon, restroom, toilet, cellphone, smartphone, data, webpage, internet, monitor, computer, iPhone, Apple, rather than the earlier renderings Hoa‑thịnh‑đốn, Nữu Ước, Gia‑nã‑đại, đô‑la, etc.

              Other cultures have long recognized this cognitive principle. Koreans and Japanese, for instance, consistently write polysyllabic words in grouped formations, visually reinforcing their unity and efficiency. Their orthographies appear in patterns like XX XXX XX X XX XXX XX, and this structural clarity has arguably contributed to their innovative linguistic and technological achievements. Thai, Malay, and others follow similar practices. By contrast, modern Vietnamese orthography still separates dissyllabic words into two units, even though, as noted, each syllable alone may be semantically empty. If the same were attempted with Kanji, Romanized Korean, or Chinese pinyin, the fallacy of the monosyllabic view would be obvious.

              Closer to home, all modern Chinese dialects are effectively disyllabic. The same holds true for Vietnamese. On the issue of Chinese polysyllabicity, Chou (1982, p. 106) cites Kennedy, DeFrancis, and Eugene Chin: "If we admit that words, not morphemes, are the construction material of Chinese, we cannot but admit that Chinese is polysyllabic. If we may use the majority rule here, we will have no trouble establishing the fact that Chinese is disyllabic." Indeed, by the majority rule, Chinese vocabulary is dominated by disyllabic words. The same principle applies to Vietnamese, given their structural similarities. Thus, every disyllabic Chinese loanword in Vietnamese, and likewise every French or English loan, must be treated as polysyllabic and should be written in combining formation, such as San Francisco, not Xan Phơ-ran-xít-xơ-cô.

              Finally, recall our earlier postulate: phonologically, a single disyllabic Chinese word can evolve into multiple disyllabic forms in Vietnamese, including modern neologisms. For instance, the Chinese 三八 sānbā (SV tambát, originally used to mock women on March 8, International Women's Day, with the sense of "nonsense") has likely given rise to, or is at least associated with, a cluster of Vietnamese expressions for the same concept: tầmphào, tầmbậy, tầmbạ, bảláp, bảxàm, basạo, xàbát, xằngbậy, and others.

              Not only can the one‑to‑many sound‑change rule be applied to disyllabic words, it also extends to monosyllabic forms. We have already seen solid cases where a single Chinese monosyllable gave rise to multiple Vietnamese reflexes. The problem with the "monosyllabicity camp" of linguists is that they typically search for only one Vietnamese equivalent for each Chinese character, treating both as strictly monosyllabic units. In most cases, they confine themselves to a one‑to‑one correspondence, forcing each Chinese etymon into a single Vietnamese match. They also tend to accept only those forms that fit neatly into their pre‑established sound patterns, for example, deriving 惃 kūn as VS con ('child'), or 亨 hēng as part of hên(h)+xui ('by luck') (An Chi 2016, Vol. 2, pp. 32, 113). In doing so, they overlook more plausible correspondences such as 子 (Fukienese /kẽ/ > VS con) or 運氣 yùnqì > VS hênxui.

              In reality, the fact that a single Chinese character often has multiple pronunciations across dialects strongly supports the principle of one monosyllabic source yielding many disyllabic outcomes in Vietnamese. Both rules — one‑to‑many from monosyllables and from disyllables — operate in parallel. For example:

              • Teochew 餅 /bẽ/ → VS bánhbẽn ~ bánhpía (Teochew‑style pastry) [bánh 餅 M bǐng + bẽn 餅 Teochew nasalized /bẽ/]

              • Teochew 包餅 /bao1bẽ/ → VS bòbía (spring roll) [bò 包 bāo + bía 餅 /bẽ/]

              • 麵包 miànbāo → VS bánhmì ('bread') [bánh 包 bāo + mì 麵 miàn]

              • 包子 bāozǐ → VS bánhbao ('steamed dumpling') [bánh 餅 M bǐng + bao 包 bāo]

              • 粽子 zōngzǐ → VS bánhchưng ('steamed glutinous rice cake') [bánh 餅 M bǐng + chưng 粽 zōng / 烝 zhēng 'steam']

              Here 餅, 包, and 子 have each generated multiple disyllabic Vietnamese forms.
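
The bracketed analyses above can be tabulated to make the one-to-many pattern visible. The following is a minimal sketch, not a claim about how such data should be stored: the names `ANALYSES` and `reflexes_by_character` are hypothetical, and the syllable-to-character assignments are simply copied from the glosses in the list (with bò and mì read off the word forms bòbía and bánhmì).

```python
from collections import defaultdict

# Each Vietnamese word is paired with its (syllable, source character) analysis,
# as given in the bracketed glosses of the list above.
ANALYSES = {
    "bánhbẽn": [("bánh", "餅"), ("bẽn", "餅")],
    "bòbía": [("bò", "包"), ("bía", "餅")],
    "bánhmì": [("bánh", "包"), ("mì", "麵")],
    "bánhbao": [("bánh", "餅"), ("bao", "包")],
    "bánhchưng": [("bánh", "餅"), ("chưng", "粽")],
}


def reflexes_by_character(analyses):
    """Group the distinct Vietnamese syllables attributed to each Chinese character."""
    grouped = defaultdict(set)
    for parts in analyses.values():
        for syllable, character in parts:
            grouped[character].add(syllable)
    return dict(grouped)


for character, syllables in reflexes_by_character(ANALYSES).items():
    print(character, sorted(syllables))
```

Under these glosses, 餅 and 包 each surface as three different syllables, which is exactly what a strict one-character-one-reading table cannot represent.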

              This analysis exposes the flaw in the old monosyllabicity approach. While multiple Vietnamese etyma can be extracted from the same Chinese root, earlier scholars restricted themselves to a rigid one‑to‑one mapping. Thus they could not reconcile forms like bánhpía or bánhbẽn, since their framework allowed only bánh = 餅 M bǐng (SV bính). They could not accept that disyllabic Vietnamese words might emerge from recombinations involving other morphemes such as 包 bāo. Note also that 麵包 miànbāo → bánhmì ('bread') reflects a French colonial product, but the word itself is cognate with Chinese 麵包, via 餅 bǐng > bánh and 麵 miàn > mì.

              Only when Vietnamese is formally recognized as a disyllabic language, consisting primarily of two‑syllable words, like Chinese, can its sound‑change rules be properly understood. The situation is no different from Indo‑European (IE) languages, where words of the same root evolve differently across languages, and at least one syllable often diverges from the expected phonological pattern. Consider “police”: politi, polizei, policía, polizia, polite, polis, polisi, or even the old Vietnamese loans phúlít and cúlít (from French; see An Chi 2016, Vol. 2, p. 13), alongside the modern colloquial cốm for English cop. Historical linguists of IE are well aware of such complex sound shifts, which are widely accepted as normal.

              What does this have to do with Vietnamese etyma? Orthodox theorists, clinging to monosyllabicity, assume that each Chinese character corresponds to only one Vietnamese equivalent. They reject one‑to‑many correspondences as "chaotic." Yet in reality, numerous Vietnamese forms undeniably derive from a single Chinese character, some even within the strict Middle Chinese → Sino‑Vietnamese framework. For example:

              • 唐  Táng ('Tang', 'path') → SV Đường, đàng

              • 元 yuán ('origin', 'beginning') → SV nguyên, ngươn; VS (tháng)giêng, ngọn, vị ('first, top, unit')

              • 利  lì ('advantage', 'interest', 'benefit', 'profit') → SV lợi, lị; VS lì, lãi, lời

              • 貴 guì ('precious') → SV quý, quới; VS mắc, đắt ('expensive')

              • 度  ('measure') → SV độ; VS đođạcđứctấm

              • 拜 bài ('kowtow') → SV bái; VS vái, lạy, van

              • 粉 fěn ('flour, noodle') → SV phấn; VS bún, bột, phở, bụi

              Similarly, 場 chǎng (SV trường, tràng) has produced multiple Vietnamese outcomes:

              • 劇場 jùchǎng (SV kịchtrường) → VS sânkhấu ('theatrical stage')

              • 在場 zàichǎng (SV tạitrường) → VS tạichỗ, tạitrận ('on the spot', 'red‑handed')

              • 試場 shìchǎng (SV thítrường) → VS trườngthi ('examination site')

              • 戰場 zhànchǎng (SV chiếntrường) → VS chiếntrận, trậnchiến ('battle')

              As a classifier, 場 chǎng also fuses into fixed Vietnamese compounds:

              • 一場夢 yīchǎng mèng (SV nhất trường mộng) → VS một giấcmơ, giấcmộng, cơnmơ, cơnmộng ('a dream')

              • 一場病 yīchǎng bìng (SV nhất trường bệnh) → VS một trậnbệnh, cơnbệnh ('illness')

              • 一場戲 yīchǎng xì (SV nhất trường hí) → VS một tuồnghát, xuấthát ('a show')

              • 一場空 yīchǎng kōng (SV nhất trường không) → VS cónhưkhông, córồikhông, một khoảngtrống ('emptiness')

              Here the process of association reshaped the sound‑change continuum. Vietnamese reflexes of 場 chǎng were influenced by neighboring morphemes with similar sounds and meanings (e.g., 陣 zhèn SV trận, 齣 chù SV xuất), and by semantic transfer into new disyllabic compounds (e.g., cơn 'a bout, a thrust'). As Starostin notes, the Vietnamese variants of 場 chǎng were likely borrowed relatively late, perhaps from vernacular Mandarin after the Middle Chinese period, yielding forms such as sân, chỗ, xuất, giấc, dài, ruột (with /ch‑ ~ j‑/ < /*/‑ ~ *j‑/).

              A) Analytical case study of sound shifts

              Let us examine another case: the syllable‑word 匠 jiàng, which frequently attaches as an affix to different morphemic syllables and, through this process, can be posited as the Vietnamese equivalent thợ ('smith', 'artisan').

              Example:

              1. The compound thợmộc
              thợmộc [tʰə̰ːʔ˨˩ mə̰ʔwk˨˩] ~ 木匠 mùjiàng [/mu⁵¹⁻⁵³ ʨi̯ɑŋ⁵¹/] ('carpenter')
              Mandarin 木匠 mùjiàng ('carpenter') ~ Fukienese /ba̍k‑chhiūⁿ/.
              SV mộc [mə̰ʔwk˨˩] < MC [mowk8] for 木 .

              The Vietnamese thợ is unlikely to be a direct sound change from 匠 jiàng (SV tượng) /tɨə̰ʔŋ˨˩/. [ M 匠 jiàng < MC ʐjɑŋ < OC *ʐhaŋs | Note: Japanese "しょう" /shō/ with the correspondence /sh‑/ ~ /th‑/ and /-ō/, cf. 折 zhé (SV chiết), 逝 shì (SV thệ), 誓 shì (SV thệ) > VS thề ]. Rather, thợ is more plausibly a derivative from compounds X + 匠 jiàng.

              2. The compound thợthầy

              Expressions such as Không ra thợthầy gì cả! ("His skills are not even up to those of an apprentice, let alone a master!") or Nửa thầy, nửa thợ, nửa đườiươi! ("Not even half as good as a master, an apprentice, or even a chimpanzee!") illustrate the mocking sense of thợthầy. This compound (師 shī + 匠 jiàng) parallels thầytrò 師徒 shītú (SV sưđồ). Here thợ, in contrast to thầy 師 shī ('teacher', 'master'), aligns with 徒 tú ('apprentice', 'pupil', 'follower'), giving rise to the broader Vietnamese sense of thợ as 'apprentice, journeyman, artisan, smith'.

              3. Thợ as a productive morpheme

              Once established, thợ became a free morphemic syllable functioning as a prefix, combining with other elements to form compounds. In all these cases, the associated sound thợ has been imposed on other compounds:

                • ngườithợ 人匠 rénjiàng ('artisan')
                • thợthầy 師匠 shījiàng ('master')
                • thợvẽ 畫匠 huàjiàng ('painter')
                • thợvàng 金匠 jīnjiàng ('goldsmith')
                • thợsơn 漆匠 qījiàng ('painter')
                • thợđá 石匠 shíjiàng ('stone mason')
                • thợđồng 銅匠 tóngjiàng ('coppersmith')
                • thợsắt / thợthiết 鐵匠 tiějiàng ('blacksmith, tinsmith')
                • thợgiày 鞋匠 xiéjiàng ('shoemaker')
                • thợnề 泥水匠 níshuǐjiàng ('bricklayer')
                • thợmài 磨光匠 móguāngjiàng ('grinder')
                • thợkhoá 鎖匠 suǒjiàng ('locksmith')
                • thợnhuộm 洗染匠 xǐrǎnjiàng ('dyer')
                • thợin 印刷匠 yìnshuājiàng ('printer')
                • thợngói 瓦匠 wǎjiàng ('tiler, bricklayer')

              4. Modern extensions:

              In modern usage, thợ has expanded into new domains, often paralleling Sinitic‑Vietnamese forms:

                • thợtóc 髮師 fàshī ('hair stylist')
                • thợvẽ 畫師 huàshī ('artist, painter')
                • thợchụphình 攝影師 shèyǐngshī ('photographer')
                • thợmay 裁縫師 cáifèngshī ('dressmaker')

              Here 師 shī (SV sư, VS thầy) is effectively downgraded to the apprentice level of thợ. The key point is that Chinese elements come alive in Vietnamese through the Sinitic‑Vietnamese adaptation of thợ, where both 師 shī and 徒 tú are associated with 匠 jiàng.

              5. Etymological considerations:

              We, however, should not completely exclude the possibility that thợ derives directly from 匠 jiàng [ M /ʨi̯ɑŋ⁵¹/, 匠 jiàng < MC /dzɨaŋ/ < OC */sbaŋs/, Hokkien /chhiūⁿ/, Japanese /shō/]. The sound change may follow a pattern similar to:

              MC /dzɨaŋ/ > SV tượng /tɨə̰ʔŋ/ > thượng /thɨə̰ʔŋ/ > VS thợ /thə̰/

              Compare:

                • 獎 jiǎng ('award') → SV tưởng /tɨə̰ʔŋ/ → VS thưởng /thɨə̰ʔŋ/
                • 承 chéng → VS thừa /thɨə̰/

              In the cases above, disyllabic words (or polysyllabic words more generally) often undergo processes of assimilation or association that govern sound change in compound formation. Irregular outcomes can be attributed to natural phonetic phenomena whereby one or more syllables in a compound may be deformed, corrupted, dropped, contracted, transposed, or otherwise altered. Such changes can transform the original syllable into a new phonological shell, sometimes beyond recognition.

              6. Assimilation and association in polysyllabic forms

              • 攝影師 shèyǐngshī ('photographer') → VS thợnhiếpảnh
              Also yields thợchụphình by associating 攝影 shèyǐng with 照相 zhàoxiàng ('take pictures')
                • 相 xiàng assimilated with 形 xíng (SV hình)
                • 影 yǐng (SV ảnh) reinterpreted with sound shift as bóng ('shadow, reflection')

              From this transformation arose a cluster of synonymous forms: thợchụpảnh, thợchớpảnh, thợchụpbóng, thợchớpbóng. All of these variants align with the polysyllabic model of 攝影師 shèyǐngshī. Their emergence illustrates how synchronic association of similar sounds or meanings, or both, can generate new lexical items. Each form remains interconnected within the broader transformation process, including reversals or reassignments of elements such as 師 shī.

              7. Innovation of loanwords

              In recent years, there has been a trend to import whole new Sino‑Vietnamese words from Chinese and use them without further modification, at least for now, such as:

                • namthần 男神 (nánshén, 'Mr. Perfect')
                • ngọcnữ 玉女 (yùnǚ, 'Miss Pretty')
                • giaođãi 交待 (jiāodài, 'instruct')

              Throughout its lexical development, nevertheless, Vietnamese has consistently demonstrated a tendency to innovate with loanwords. In the past, speakers did not merely reproduce the original sounds but often reshaped them, whether consciously or unconsciously.

              As a result, a single lexical root, once subjected to sound change, can generate multiple Vietnamese variants. Over time, this process enriches the recipient language by expanding its borrowed vocabulary, layering new sound forms onto the original stem and extending their meanings. For instance:

                • 天 tiān: SV thiên ('heaven'); VS trời ('the Almighty'); giời ('sky'); trán ('forehead')
                • 心 xīn: VS tim ('physical heart'); SV tâm; VS lòng ('spiritual heart, inner feelings')
                • 戶 : SV hộ ('household'); VS cửa ('door'); ngõ ('gate')
                • 主 zhǔ: SV chủ ('master'); VS Chúa ('Lord')
                • 注 zhù: SV chú; VS chua ('annotate'); chảy ('flow')
                • 生 shēng: SV sanh; SV sinh; VS sống ('live, raw'); VS đẻ ('give birth'); tái ('raw meat').
                • 回 huí: SV hồi ('return'); VS về ('come back'); quay ('turn')
                • 會 huì: SV hội ('fair'); VS họp ('meeting'); VS hẹn ('date'); hiểu ('understand'); hay ('aware'), hụi ('loan'); hồi ('time'); sẽ ('will'); VS đỗi ('moment') 
                • 沖 chōng: SV xung, trùng; VS dội ('pour water'); sôi ('boil'); xông ('charge'); xấn ('dash'); tông ('collide'); đụng ('collide'); đường ('road'); sang ('print photo'); xối ('wash out').
                • 粉 fěn: SV phấn ('powder', 'chalk'); VS phở ('noodle'); bún ('vermicelli'); bột ('flour'); bụi ('dust')
                • 鏡 jìng: SV kính; VS kiếng, gương ('mirror', 'eyeglasses').
                • 機 jī: SV cơ; VS cửi ('shuttle'); dịp ('opportunity'); máy ('machine'); máycửi ('loom').

              8. Disyllabic examples

                • 太陽 tàiyáng: SV tháidương ('the sun'); VS mặttrời, trờinắng, màngtang ('temple')
                • 月亮 yuèliàng: VS ánhtrăng, trăngsáng, nàngtrăng, mặttrăng
                • 機會 jīhuì: VS cơmay, dịpmay, cơhội, códịp; SV cơhội ('opportunity')
                • 問答 wèndá: SV vấnđáp; VS hỏiđáp ('Q&A')
                • 問題 wèntí: SV vấnđề; VS thắcmắc ('question')
                • 聽寫 tīngxiě: VS ngheviết ('dictation'); SV chínhtả ('spelling')
                • 過去 guòqù: SV quákhứ ('past'); VS quađi, điqua, đãqua ('pass by')
                • 邊界 biānjiè: SV biêngiới ('border'); VS bờcõi ('frontier')
                • 鹹魚 xiányú: VS cámặn ('salty fish'); mắmcá ('fish paste'); nướcmắm ('fish sauce')

                etc.

                While most words retain their original forms and associated meanings, some evolve by differentiating meanings through either older pronunciations or newer articulations. This process produces subtle shifts in both sound and semantics, sometimes even leading to the emergence of new written characters. However, this does not mean that the majority of Chinese loanwords in Vietnamese necessarily become richer by generating multiple variants. In fact, the reverse is also true: many Vietnamese lexemes consolidate multiple Chinese sources under a single form, uniting different sounds and meanings "under one roof", much like the phenomenon of loangraphs in Chinese itself.

              B) Cases of many‑to‑one mappings (Chinese → Vietnamese):

              • sợ ('fear, dread, scared'): 嚇 xià, 怕 , 怵 chù, 恄 , 悚 sǒng, 愳 , 懼 , 慴 zhé (懾), 怯 qiè
              • đời ('life, generation'): 世 shì, 代 dài, 輩 bèi, 生 shēng

              • ex. 人生 rénshēng → đờingười ('life'); cf. đẻ ('give birth'), tái ('raw, uncooked'), sống ('unripe')
              • chết ('death, die, pass away'): 死 , 折 zhé, 逝 shì, 殛 , 殊 shū, 陟 zhì
              • mưa ('rain, drizzle, shower'): 溟 míng, 雨 , 霂 , 霡 
              • mây ('cloud, fog, haze'): 蔓 mán, 雲 yún, 霨 wèi, 霧 , 霾 măi
              • nhà ('house, family, -ist, dynasty'): 屋 , 家 jiā (SV gia), 者 zhě (SV giả), 朝 cháo
              • đường ('path, road, route'): 唐 táng, 道 dào, 途 , 沖 chòng
              • việc ('work, task, duty'): 活 huó (SV hoạt), 作 zuò (SV tác), 役  (SV dịch), 務  (SV vụ)
              • xanh ('green, blue, azure'): 倉 cāng (SV xanh), 滄 cāng (SV xanh), 蒼 cāng (SV xanh), 青 qīng (SV thanh), 清 qīng, 葱 cōng (SV song)
              • đỏ ('red, burgundy'): 丹 dān (SV đơn), 彤 tóng (SV đồng), 朱 zhū, 絑 zhū, 赭 zhé
              • trường ('school, campus'): 場 cháng, 堂 táng (SV đường), 庠 xiáng (SV tường), 校 xiào
              • tàu ('boat, ship'): 刀 dāo (SV đao), 舠 dāo, 槽 cáo (SV tào), 舟 zhōu, 艚 cáo (SV tào), 艇 tǐng
              • cho ('give, allow'): 給 , 準 zhǔ, 許 , 賜 , 贈 zèng
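
Taken together, the one-to-many lists earlier in this section and the many-to-one list here describe a many-to-many relation between Chinese characters and Vietnamese forms. The sketch below is only an illustration under that reading: the names `ONE_TO_MANY`, `MANY_TO_ONE`, `build_relation`, `fan_out`, and `fan_in` are hypothetical, and the data are a small sample copied from the entries for 生, 沖, đời, and đường in this chapter.

```python
from collections import defaultdict

# Character -> Vietnamese reflexes (sample from the one-to-many list).
ONE_TO_MANY = {
    "生": ["sanh", "sinh", "sống", "đẻ", "tái"],
    "沖": ["xung", "trùng", "dội", "sôi", "xông", "xấn",
           "tông", "đụng", "đường", "sang", "xối"],
}
# Vietnamese word -> candidate Chinese sources (sample from list B).
MANY_TO_ONE = {
    "đời": ["世", "代", "輩", "生"],
    "đường": ["唐", "道", "途", "沖"],
}


def build_relation():
    """Merge both lists into one set of (character, Vietnamese form) pairs."""
    pairs = set()
    for character, words in ONE_TO_MANY.items():
        pairs.update((character, word) for word in words)
    for word, characters in MANY_TO_ONE.items():
        pairs.update((character, word) for character in characters)
    return pairs


def fan_out(pairs):
    """Character -> set of Vietnamese forms linked to it."""
    index = defaultdict(set)
    for character, word in pairs:
        index[character].add(word)
    return dict(index)


def fan_in(pairs):
    """Vietnamese form -> set of characters linked to it."""
    index = defaultdict(set)
    for character, word in pairs:
        index[word].add(character)
    return dict(index)


relation = build_relation()
print(sorted(fan_out(relation)["生"]))
print(sorted(fan_in(relation)["đường"]))
```

Storing the correspondences as pairs, rather than as a table keyed on the character alone, is what lets 沖 → đường appear once while still counting toward both directions.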

              C) Disyllabic parallels: word‑concepts grouped by sound/meaning

              • 同感 tónggǎn → thôngcảm ('sympathy') vs. 同情 tóngqíng → đồngtình ('sympathy')
              • 幫忙 bāngmáng → bênhvực ('side with') vs. 包庇 bāobì → bảobọc ('support')
              • 混蛋 húndàn → khốnnạn ('wretch, bastard') vs. 困難 kùnnán ('in difficulties')
              • 堂皇 tánghuáng → đànghoàng, đườnghoàng ('stately, magnificent') vs. 端莊 duānzhuāng → đoantrang ('dignified, demure')
              • 遊蕩 yóudàng → duđảng ('loaf about, loiter') vs. 流氓 liúmáng → lưumanh ('hoodlum, hooligan')
              • 有錢 yǒuqián → giàusang ('affluent') vs. 富有 fùyǒu → giàucó ('rich')
              • 高尚 gāoshàng → caosang ('noble') vs. 高望 gāowàng → caovọng ('socially high class')

              At the same time, some sound changes evolve into innovative words that become independent of their original forms. This developmental path is common across languages. For example, in English:

                • albeit < 'all be it' ('though')
                • morning < 'morn' < Old English morgen
                • evening < æfnung (from the verb æfnian, 'grow toward night')

              Following the analogy of evening, the form morn eventually developed into morning.

              All of the above illustrate how Vietnamese doublets and cognates, like other languages, develop many‑to‑one and one‑to‑many mappings, semantic shifts, and idiomatic innovations. Conceptually, this aligns with what Adam Makkai described under the notion of "lexemic idiom" and "lexeme" in his Pragmo‑Ecological Grammar (PEG): Toward a New Synthesis of Linguistics and Anthropology (1978), where idiomatic expressions such as "Emperor of Japan," "old wife," "hot potatoes," and "red herring" acquire metaphorical meanings far removed from their literal origins.

              "[..]the participant morphemes once had (and in other environments still have) separate lexemic status with separate sememic realizates, and these past (or elsewhere still active) meanings have a definite shining-through effect, suffusing the meaning of these lexemic idioms with the old, suppressed, literal meanings. The denotatum in each case is primary or lexical meaning, and the TRANSLUCENT CONNOTATUM is the original literal meaning of the form. What makes lexical idioms unusual is that they, therefore, have two meanings simultaneously, i.e., the REFLECTING DENOTATUM together with the meaning TRANSLUCENT CONNOTATUM. Whether the language has a heavy morpheme reinvestment ratio or not in its lexeme inventory becomes an interesting typological question, but there is little doubt that there are any real languages that do not somehow utilize morpheme reinvestment in the building of new lexemes. "

              In Vietnamese specifically, the author characterizes this phenomenon of association as assimilative sandhi, that is, the sandhi process of assimilation or association. It represents a common form of phonological, and therefore lexical, "reinvestment". Such developments are best understood as natural products of time, arising without deliberate human intervention.

              To illustrate, consider the lexeme 待 dài (‘wait’) and its associatory variations under this process. Note how the sound of dài shifts and aligns with related Vietnamese forms:

              待 dài → SV đãi, VS đợi (‘wait’) [M  待 dài, dāi < MC dəj < OC *dɯːʔ | FQ 徒亥 (→ VS đợi) \ 亥 hài ~ SV hợi ]

              When incorporated into disyllabic compounds, the articulation of 待 dài changes further, producing new idiomatic forms:

              等待 děngdài → SV đẵngđãi, VS chờđợi ('wait for') [Here 等 děng is reinterpreted as ‘đón’ and assimilated with 待 dài ‘đợi’, creating a semantic doublet. Compare the sound‑change patterns of 寺  → SV tự ~ VS chùa, and 承 chéng → SV thừa ~ VS đằng.]
              期待 qídài → SV kỳđãi, VS chờđón ('expect') [M 期 qī, jī, qí, qǐ (kỳ, kì, ki) < MC gɨ < OC *kɯ, *gɯ. In Vietnamese, 期待 qídài is cognate with chờđợi ('wait for'), and in Chinese it means more like 'expect'.]

              Thus, in Vietnamese, 等待 děngdài and 期待 qídài appear to have exchanged meanings.

              Meanwhile, alongside its Chinese semantic variance, 待 dài also develops the sense of 'treat' in Vietnamese:

              對待 duìdài → SV đốiđãi → VS đốixử ('treat')
              待承 dàichéng → SV đãithừa, VS đãiđằng ('entertain', 'treat with a feast')
              接待 jiēdài → SV tiếpđãi → VS tiếpđón ('reception, to greet')

              Here 待 dài associates with 'xử' 處 chǔ ('handle'), or with 'đón' ('receive'), depending on context.

              D) Lexical association as innovation

              Beyond natural sandhi, Vietnamese also employs conscious lexical association to coin new compounds from existing lexemes. This process produces:

              1. Modern innovations
              • táichế 再製 zàizhì ('recycle') [modern M 回收 huíshōu]
              • bấmnút 按紐 ànniǔ ('press/click a button')
              • mạnglưới 網絡 wǎngluò ('network, computer network')
              • viênchức 職員 zhíyuán ('civil servant, officer')
              • trangmạng 網頁 wǎngyè ('web page')
              • tinnhắn 短信 duǎnxìn ('text message')
                2. Hybrid borrowings
                • bánhbao 餅+包 (‘dumpling’)
                • bòbía 包餅 (‘spring roll’, Teochew style)
                • tủlạnh 冷+櫝 (‘refrigerator’)
                • thangmáy 梯+機 (‘elevator’)
                • thangcuốn 梯+捲 (‘escalator’)
                • xếhộp 盒+車 (‘automobile sedan’)
                • nhuliệu 柔+料 (‘software’)
                • phầncứng 份+剛 (‘hardware’)
                • trangnhà 張+家 (‘homepage’)
                • liênmạng 聯+網 (‘internet’)
                3. Extended idiomatic compounds
                • trànggiangđạihải 長江+大海 (‘lengthy writing’)
                • vòngvotamquốc 三國演義 (‘beat around the bush’)
                • rượuchè 酒+茶 (‘alcoholic, drinking party’)
                • cờbạc 棋+博 (‘gamble’)
                • côngnhânviên 公+人員 (‘civil servant’) ~ côngchức 公+職
                • toàán 座+案 (‘court’)
                • quantoà 官+座 (‘judge’)
                • ratoà 出庭 (‘appear in court’)

                For those who deny that both Chinese and Vietnamese are fundamentally disyllabic languages, such developments are difficult to explain. A monosyllabic Vietnamese word, originally cognate with a single Chinese character, can evolve into multiple disyllabic forms with varied sound changes and meanings. Only by recognizing Vietnamese as a disyllabic language—like Chinese and indeed most languages—can we account for these transformations.

                These changes do not strictly follow the predictable phonetic rules of historical sound change. Instead, they obey their own principle: associative sandhi. In this process, sound shifts are driven by semantic association and compound formation, not just phonological inheritance. The longer and more complex the multisyllabic form, the more drastic the changes tend to be. 

                Sound change may occur with or without human intervention, but locality and time play decisive roles, especially in cases of prolonged historical contact. The longer the contact, the greater the cumulative effect. To reconstruct the historical pronunciations of Sino‑Vietnamese lexical items, particularly uncommon Vietnamese transcriptions of Chinese characters, scholars rely on the Fǎnqiè (反切, SV Phiênthiết) spelling method. This system, preserved in sources such as the Kangxi Dictionary (康熙字典, SV Khanghi Tựđiển), Guangyun (廣韻, SV Quảngvận), and Tangyun (唐韻, SV Đườngvận), spells the reading of each character with two other characters: the first supplies the initial, the second the final and tone. Without such guides, even specialists would be unable to pronounce many forms with accuracy.
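
                As a rough illustration of how a fǎnqiè formula is read, the sketch below joins the initial of the upper speller to the final of the lower speller. The example 東 = 德紅切 is the textbook case from the Guangyun, and the romanized values (tok + huwng → tuwng) follow Baxter's Middle Chinese transcription. The READINGS table and the function name fanqie are assumptions made for this illustration only; they do not come from any of the dictionaries cited here.

```python
# Minimal sketch of reading a fanqie formula (hypothetical helper and data).
# Each reading is stored pre-split as (initial, final); the values follow
# Baxter's Middle Chinese transcription for the textbook example 東 = 德紅切.
READINGS = {
    "德": ("t", "ok"),     # upper speller: contributes the initial
    "紅": ("h", "uwng"),   # lower speller: contributes the final (and tone)
}


def fanqie(upper: str, lower: str) -> str:
    """Combine the initial of `upper` with the final of `lower`."""
    initial, _ = READINGS[upper]
    _, final = READINGS[lower]
    return initial + final


print(fanqie("德", "紅"))  # tuwng, the Middle Chinese reading of 東
```

                Real fǎnqiè spellings also encode tone and rhyme category, which this toy table leaves out.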

                For common characters embedded in daily speech, however, their forms emerge naturally as part of the language itself. Examples include 起 qǐ → SV khởi, 順 shùn → SV thuận, and 場 chăng → SV trường. Lexicographers usually started from the sounds of such common characters to decipher the unknown ones.

                As more examples are amassed, the picture becomes increasingly complex. Many words derived from Old Chinese and Middle Chinese exhibit multiple lexical and phonological developments. Without first grasping the principle of dissyllabicity outlined above, such variation can appear confusing. 

                Consider, for instance, the following cases:

                  1. 沖 chōng, chòng (SV xung, trùng)

                  [ M 沖 chōng, chòng (xung, trùng) < MC ɖuwŋ < OC *duŋ ]
                  Derived Vietnamese forms include:

                    • sang ('develop/print photo'), as in sanghìnhsangảnh (沖印 chōngyìn)
                    • xối ('wash out')
                    • dội ('pour water on')
                    • sôi ('boil up')
                    • xông ('charge')
                    • xấn ('dash against')
                    • tông ('collide')
                    • đụng ('collide')
                    • đường ('public road')

                  2. 大 dà ('big, elder') → SV đại

                  [ M 大 (太) dà, duò, dài, dăi, tài (đại, thái) < MC daj, da < OC *da:d, *da:ds ]
                  This root yields a wide range of Vietnamese forms:

                    • dàipǎn 大販 → láibuôn ('merchant')
                    • dàifu 大夫 → đạiphu ('minister, high official'; modern M. 'physician')
                    • dàge 大哥 → đạica ('big brother')
                    • dàdăn 大膽 → togạn, cảgan ('daring')
                    • dàshēng 大聲 → totiếng ('raise one's voice')
                    • dàyǔ 大雨 → mưato ('heavy rain')
                    • dàxiōng 大兄 → anhcả ('elder brother')
                    • dàjiě 大姐 → chịcả ('elder sister')
                    • dàhăi 大海 → bểcả ('big ocean')
                    • dàhuǒ 大夥 → cảlũ ('the whole group')
                    • dàjiā 大家 → tấtcả ('everyone')
                    • dàjiāng 大江 → sôngcả ('large river')
                    • dàyì 大意 → sơý ('inattentive')
                    • dàhuà 大話 → tàolao ('talk nonsense')
                    • dàyuè 大月 → thángđủ ('full lunar month')
                    • dà'ài 大礙 → đángngại ('formidable')
                    • pángdà 龐大 → khổnglồ ('enormous')
                    • hóngdà 宏大 → tolớn ('great')
                    • lăodà 老大 → thằnglớn ('eldest son')
                    • dàgézi 大格子 → tocon ('big body')
                    • lăotàbùshăo 老大不少 → lớnđầu ('grown‑up')
                    • dàhóngdàliáng 大宏大量 → tấmlòngđạilượng ('magnanimous')

                  3. 海 hǎi ('sea') → SV hải

                  [M 海 (𣴴, 𣳠) hǎi < MC həj < OC *hmlɯːʔ ]

                  Vietnamese developments:

                • biển → bể (‘sea’)
                • khơi (‘open sea’)
                Examples:
                    • 大海 dàhǎi → biểncả ('big sea')
                    • 苦海 kǔhǎi → bểkhổ ('sea of suffering')
                    • 海浪 hǎilàng → sóngbể ('sea wave')
                    • 海口 hǎikǒu → cửabể ('seaport')
                    • 海寇 hǎikòu → cướpbể ('sea pirate') [modern hảitặc 海賊 hǎizéi → VS giặcbể]
                    • 出海 chūhǎi → rakhơi ('put out to sea')
                    • 外海 wàihǎi → ngoàikhơi, ngànkhơi ('open seas')

                E) Many‑to‑one correspondences

                In the many‑to‑one model from Chinese to Vietnamese, a single Vietnamese word, whether monosyllabic or disyllabic, may correspond to multiple Chinese sources, depending on context. Such examples directly challenge the outdated monosyllabicity viewpoint, which assumes that sound change must be restricted to one‑to‑one correspondences. Instead, these cases demonstrate the dynamic and fluid nature of sound change.

                  Examples:

                  1. 'cho'

                    • cho 給 jǐ, gěi → SV cấp ('give')
                    • cho(phép) 準 zhǔn → SV chuẩn ('allow')
                    • cho 許 xǔ → SV hứa ('allow')
                    • cho 賜 cì → SV tứ ('present with')
                    • cho 贈 zèng → SV tặng ('give a gift')

                    Compounds with cho

                    • chodầu 雖然 suīrán ('although')
                    • chonên 所以 suǒyǐ ('therefore')
                    • chotới 直到 zhídào ('until')
                    • chotiền 捐錢 juānqián ('donation')
                    • dànhcho 專用 zhuānyòng ('specialized for')
                    • khiếncho 引起 yǐnqǐ ('cause')

                  2. 'làm'
                    • làm 幹 gàn → SV cán ('do, work')
                    • làm 辦 bàn → SV bạn ('handle')
                    • làm 弄 nòng → SV lộng ('make')
                    • làm 令 lìng → SV lệnh ('cause') [ Ex. 令人驚訝. Lìng rén jīngyà. (Làm ngườita kinhngạc. 'It surprised everybody.') ]
                    • làmruộng 耕田 gēngtián → SV canhđiền ('to farm')
                    • làmcàn 蠻干 mángàn → SV mancán ('foolhardy')
                    • làmơn 頒恩 bān'ēn → SV banân ('bestow')
                    • làmdốc 排架子 báijiàzi → SV bàigiátử ('pretend')
                    • làmgương 旁樣 pángyāng → SV bàngnhan ('exemplify')
                    • làmphiền 勞煩 láofán → SV lạophiền ('please help')
                    • làmăn 生意 shēngyì → SV sinhý ('make a living')
                    • làmviệc 幹活 gànhuó → SV cánhoạt ('work')
                    • làmthinh 安靜 ānjìng → SV antịnh ('keep quiet')
                    • làmkhôngkịp 來不及 láibùjí → SV laibấtcập ('cannot make it')
                    • làmlại 再來 zàilái → SV táilai ('try again')
                    • làmcông 勞工 láogōng → SV laocông ('to labor')
                    • làmlụng 勞動 láodòng → SV laođộng ('to labor')
                    • làmquan 當官 dāngguān → SV đươngquan ('be an official')
                    • làmlính 當兵 dāngbīng → SV đươngbinh ('be a soldier')
                    • làmchủ 當家 dāngjiā → SV đươnggia ('be the boss')
                    • làmtóc 理髮 lǐfá → SV líphát ('hairdo')
                    • làmtiền 賺錢 zhuànqián → SV chuyếntiền ('make money')
                    • làmtiền 勒索 lèsuǒ → SV lặctác ('extortion')

                  And beyond these, idiomatic extensions continue: làmtình, làmnư, làmtàng, làmkhó, làmbộ, etc.

                     These examples demonstrate that Vietnamese sound change is not confined to rigid one‑to‑one correspondences. Rather, it reflects dynamic processes of association, assimilation, and reinvestment, producing both many‑to‑one and one‑to‑many mappings. The longer and more complex the multisyllabic forms, the more drastic the changes tend to be.
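
                The many‑to‑one and one‑to‑many mappings described here can be pictured as two indexes built from the same list of pairs. The sketch below is only a schematic aid: the pairs are drawn from the cho examples and from the syllables that render 大 in the compounds listed earlier (đại, to, cả, lớn), and the names PAIRS and build_indexes are hypothetical.

```python
from collections import defaultdict

# (Chinese source, Vietnamese form) pairs taken from the examples in this section.
PAIRS = [
    ("給", "cho"), ("準", "cho"), ("許", "cho"), ("賜", "cho"), ("贈", "cho"),
    ("大", "đại"), ("大", "to"), ("大", "cả"), ("大", "lớn"),
]


def build_indexes(pairs):
    """Return (Vietnamese -> Chinese sources, Chinese -> Vietnamese forms)."""
    viet_to_chin = defaultdict(list)
    chin_to_viet = defaultdict(list)
    for chin, viet in pairs:
        viet_to_chin[viet].append(chin)
        chin_to_viet[chin].append(viet)
    return dict(viet_to_chin), dict(chin_to_viet)


viet_to_chin, chin_to_viet = build_indexes(PAIRS)
print(viet_to_chin["cho"])  # many-to-one: five Chinese sources for one form
print(chin_to_viet["大"])   # one-to-many: four Vietnamese forms for one source
```

                Keeping both indexes side by side makes it easy to see that neither direction is restricted to a single correspondence.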

                With a measure of linguistic common sense, one can readily accept certain implicit mechanisms, illustrated by generalized Vietnamese làm (cf. English 'do', 'make', 'work', 'perform'), that underlie the sound changes giving rise to the lexical variants discussed above. Forms such as 弄 nòng, 幹 gàn, and 當 dāng are easily recognized as cognates, linked by phonological relations across languages and reflecting shared roots.

                It is also important to recognize that in Chinese a single concept‑word (whether lexeme, morpheme, allophone, or doublet) may be represented by several characters, which can be transcribed or pronounced similarly or differently depending on time period and locality. For example, 作 zuò (SV tác) and 做 zuò (SV tố) both mean 'do' or 'make'. In such cases, the multiplicity of forms in Chinese itself, and their Vietnamese reflexes, require careful analysis to understand how sound changes unfolded.

                a) Illustrative cases:

                • phong 風 fēng ('wind') → giông, gió

                  • 颱風 táifēng → giôngtố ('typhoon')
                  • 暴風 bàofēng → bãogiông, gióbão ('storm')
                  • 風雨 fēngyǔ → giómưa, mưagió ('rainstorm')
                  • 蜂 fēng → ong ('bee') [ = 螉 wēng → ong, a doublet of the same root for 'bee'. ]
                • gong 公 gōng ('male, public, baron') → công, cồ, ông, trống

                  • cf. 翁 wēng ('elder, hair') → ông

                  • 母 mǔ ('mother') → mẫu, mẹ, mợ, mái

                In these cases—ong, cồ, công, ông, trống—the multiple forms can be recognized as doublets, phonologically related variants of the same root.

                b) Less transparent associations:

                Not all cases are so straightforward. For instance:

                • 健康 jiànkāng (SV kiệnkhang, kiệnkhương) → VS sứckhoẻ ('health', 'healthy')

                  It is plausible that jiànkāng 健康 itself gave rise to the Vietnamese sứckhoẻ ('health'), and this seems the most likely explanation.
                  Vietnamese reflex: sức (via associative sandhi, cf. 力 lì → SV lực)
                  Middle Chinese (Baxter): kɛnH, (Pulleyblank): kɛnH (departing tone, velar initial)
                  • 健 jiàn → SV kiện → reinterpreted as sức ('strength')
                  • 康 kāng → SV khang / khương → evolved into khoẻ ('strong, well')
                  Phonological Pathway for khoẻ:
                  • 康 kāng (MC kʰɑŋ, Pulleyblank kʰaŋ): SV khang → variant khương /kʰjɨəŋ1/
                  • → /kʰaŋ1/ → /kʰwəɒn5/ (khoắn)
                  • Shift: kʰw‑ → w‑ → m‑ → /majŋ6/ (mạnh)

                c) Comparative table

                Chin. | Mandarin | Sino-Viet. | Sinitic-Viet. | MC (Baxter) | MC (Pulleyblank) | OC (Baxter-Sagart) | OC (Zhengzhang)
                健 | jiàn | kiện | khoẻ | kɛnH | kɛnH | [k]ˤi[n]-s | [k]ˤi[n]-s
                壯 | zhuàng | tráng | khoắn / mạnh | tsrjangH | tsrjangH | [ts]ˤroŋ-s | ʔs-toŋ-s

                The comparative evidence we’ve examined, from jiànkāng 健康 → sứckhoẻ to jiànzhuàng 健壯 → khoẻkhoắn,  illustrates how Vietnamese reflexes emerge not through rigid one‑to‑one correspondences, but through dynamic associative sandhi, semantic reanalysis, and phonological reinvestment.

                • 健 jiàn consistently aligns with sức or khoẻ, reflecting its semantic field of 'strength, health'.

                • 康 kāng and 壯 zhuàng interact with each other in Vietnamese, producing khoẻkhoắn, and even mạnh, showing how reduplication and sound shifts (kʰw‑ → w‑ → m‑) generate new forms.

                • Reconstructions across Middle Chinese (Baxter, Pulleyblank) and Old Chinese (Baxter–Sagart, Zhengzhang) confirm the plausibility of these transformations, grounding Vietnamese developments in well‑attested phonological histories.

                Etymologically, the initial kh- /kʰ-/ of the second syllable may have been 'sandhized' with the final /‑n/ of the first syllable /jiàn/. The first syllable can be identified with sức (cf. 力 , SV lực 'strength'), while the second, kāng 康 (with an alternate, probably older SV form khương /kʰjɨəŋ1/), developed into khoẻ ('strong').

                An alternative possibility is that sứckhoẻ arose as an innovation from 力氣 lìqì (SV lựckhí, 'power, stamina'), which could yield sứckhoẻ or hơisức ('strength'), implying 'having the strength to do heavy tasks' and thus extending to the general sense of 'being healthy'. In this scenario, 力氣 lìqì denotes primarily 'strength, stamina', whereas 健康 jiànkāng conveys the broader meaning of both 'health' and 'healthy'.

                A further comparison can be made with the compound khoẻmạnh. If we separate the two elements, khoẻ + mạnh, they could be reconstructed as 壯 zhuàng + 猛 měng ('strong' + 'powerful'), or in reverse order as mạnhkhoẻ corresponding to 猛壯 měngzhuàng (SV mãnhtráng). Yet these Chinese forms, while semantically close, emphasize 'energetic' or 'powerful' rather than the more specific sense of 'health' conveyed by 健康 jiànkāng.

                Moreover, the concept of khoẻ must have existed independently in Vietnamese, as seen in the everyday greeting Chào, có khoẻ không? ('Hello, are you well?'), which parallels the modern Chinese expression 早, 你好? ('Hello, how are you?'). Here, khoẻ aligns with 好 hǎo (SV hảo), suggesting a pre‑existing Vietnamese word later reinforced by Sinitic influence.

                Applying the principle of associative sandhi further suggests the development 健壯 jiànzhuàng (SV trángkiện, 'feeling fit, well') → khoẻkhoắn. In this case, 健 jiàn (SV kiện) corresponds to khoẻ, while 壯 zhuàng can be associated with 康 kāng (SV khang, khương), producing either the reduplicative khoắn or the variant mạnh. The phonological pathway may be reconstructed as: /kʰjɨəŋ1/ (khương) → /kʰaŋ1/ → /kʰwəɒn5/ (khoắn) → /majŋ6/ (mạnh), with the shift kʰw‑ > w‑ > m‑.

                Parallel cases (cho, làm, phong, gong, etc.) reinforce the principle that Vietnamese often maps many Chinese sources to one Vietnamese form, or one Chinese source to multiple Vietnamese outcomes, depending on context.

                In sum, these patterns strengthen the dissyllabicity hypothesis: Vietnamese, like Chinese, is fundamentally polysyllabic in its lexical organization. The "mystical morphs" in compounds (mĩm in mĩmcười, thútthít in khócthútthít, bạt in bạtmạng) are not anomalies but natural outcomes of this system.

                Vietnamese sound change is best understood not as a static set of correspondences, but as a living system of associative sandhi, where phonological shifts, semantic layering, and cultural adaptation converge. This framework allows us to trace Vietnamese etyma back to their Sinitic roots with greater clarity, while also appreciating the uniquely Vietnamese innovations that emerged along the way.

                d) Other examples of associative development:

                • bắtđầu 劈頭 pītóu ('start')
                • bắtcóc 綁架 bǎngjià ('kidnap')
                • bắtđền 賠償 péicháng ('demand compensation')
                • bắtnạt 撥弄 bōnòng ('order about')
                • mĩmcười 含笑 hánxiào ('smile')
                • khócthútthít 哭泣 kūqì ('weep')
                • lỗtai 耳朵 ěrduō ('ear')
                • bạttai 巴掌 bāzhǎng ('spank')
                • bạtmạng 拼命 pīnmìng ('risk one’s life')
                • đồngbạc 銅板 tóngbǎn ('monetary unit')
                • đitiền 隨錢 suíqián ('monetary gift')
                • thầytrò 師徒 shītú ('teacher and students')
                • họctrò 學子 xuézǐ ('student')
                • trườnghọc 學堂 xuétáng ('school')
                • nhẹnhàng 輕輕 qīngqīng ('slightly')
                • chungquanh 周圍 zhōuwéi ('around')
                • thôinôi 周年 zhōunián ('anniversary')
                • nhiềunăm 有年 yǒunián ('many years')
                • mấynămnay 近年來 jìnniánlái ('in recent years')
                • hoahồng 花紅 huāhóng ('commission')

                It is a rule of thumb that phonetic sandhi processes occur only within disyllabic formations. To reinforce this postulation, let us examine several additional unique examples:

                Table 13 - Disyllabic Sandhi Examples

                Chin. | Sino-Vietnamese | Sinitic-Vietnamese | Meaning | Notes
                垃圾 lāji | lạpcấp | rác | 'trash' | Likely from rácrưới, rácrưởi, rácrến. Sandhi: ra- (l‑ ~ r‑) + ‑c (j‑ ~ k‑). Cf. 伊拉克 Yīlākè → Irắc.
                毫無 háowú | hảovô | không | 'no, not' | Via hông < khônghề. Shows contraction and semantic narrowing.
                天啊 Tiānna | thiêna | Trờiơi | 'My Lord' | Fusion of 天 tiān + 啊 ā.
                甭 béng | bằng | đừng | 'don't' | From 不用 bùyòng ('do not').
                別 bié | biệt | chớ | 'do not' | From 可別 kěbié ('do not'). 可 ~ (有 yǒu). Originally 不要 bùyào → 別 bié.

                The table brings out several kinds of sandhi process and associative shift:

                  • Phonological mergers (e.g., ra- + ‑c → rác).
                  • Semantic contractions (e.g., háowú → không).
                  • Exclamatory fusions (e.g., Tiānna → Trờiơi).
                  • Colloquial reductions (e.g., béng → đừng).
                  • Reanalyses (e.g., bié → chớ).

                Taken together, these cases reinforce the dissyllabicity principle. Vietnamese monosyllabic forms such as bắt, thầy, thợ, trò, trường, hàm, ngậm, mĩm, tai, tay, quanh, thôi, nôi, năm, đồng, bạc, nhiều, and disyllabic forms such as bạtmạng or bạttai, can be securely traced to Chinese sources through associative sandhi and phonological reinvestment. This framework also clarifies the role of “mystical morphs” in polysyllabic composites (mĩm in mĩmcười, thútthít in khócthútthít, or bạt in bạtmạng), showing them not as anomalies but as natural outcomes of dynamic sound change.

                Of course, Chinese → Sinitic‑Vietnamese sound changes sometimes occur beyond the strict constraints of formal linguistic rules. It is unnecessary, however, in the scope of this paper, to enumerate every possible rule of sound change for each Chinese character that yields a Sinitic‑Vietnamese morphemic syllable, including those that intersect with Vietnamese phonology. Most readers can readily grasp the mechanisms behind the examples already cited, sound changes that are both plausible and intuitively recognizable. For instance: bīng 兵 → lính ('soldier'), bēi 盃 → ly ('glass'), bài 拜 → lạy ('kowtow'), dǎ 打 → đánh ('strike'), yǐn 飲 → uống ('drink'), yóu 游 → bơi ('swim'), yóu 柚 → bưởi ('pomelo'), bǐrú 比如 → vínhư ('for example'), and so forth.

                In virtually all Sino‑Vietnamese readings, the pronunciation keys, namely 反切 fǎnqiè (FQ), correspond closely to the phonetic descriptions preserved in classical rhyme books and dictionaries such as 廣韻 Guǎngyùn and the 康熙字典 Kāngxī Zìdiǎn. These attestations confirm the reliability of the system. For example: xié 鞋 → SV hài ('shoes'), kū 哭 → SV khốc ('weep'), bǐng 偋 → SV sính ('betroth'), chéng 承 → SV thừa ('inherit'), among many others.

                Critics may continue to mock the author’s supposed ignorance of Western historical‑comparative methodologies, which are indeed effective for Indo‑European generalities but ill‑suited to the peculiarities and irregularities of Chinese and Vietnamese. Ironically, these same critics insist on classifying Chinese and Vietnamese as "isolated monosyllabic languages", thereby overlooking the fact that Western rules of sound change cannot be applied to patterns involving more than one syllable.

                It remains customary, of course, for a system of well‑established phonological rules to guide specialists in analyzing and explaining the phenomena of sound change. Historical background is essential for understanding how words became homonyms within the Chinese phonological system. Many words that have become homophones in Mandarin still retain distinct reflexes in Sino‑Vietnamese, as well as in Cantonese and Hokkien, and these match the phonological spellings recorded in the Kāngxī Zìdiǎn. For example, the morpheme yi [i], pronounced with four different tones in Mandarin, corresponds to a wide range of Sino‑Vietnamese forms: nhất, nghĩa, nghệ, ngãi, nghị, y, dịch, duệ, dị, dĩ, di, etc., representing 一, 義, 藝, 議, 醫, 易, 裔, 異, 以, 姨, respectively.

                Table 14 summarizes how the single Mandarin syllable yi [i], across its tones, corresponds to a wide range of Sino‑Vietnamese (SV) outcomes. Middle Chinese (Baxter, Pulleyblank) and Old Chinese (Baxter–Sagart, Zhengzhang) reconstructions are included to show the historical depth of each reflex.

                Table 14 - The Morpheme yi [i] Across Chinese and Sino‑Vietnamese

                Chin. | Sino-Vietnamese | Meaning | MC (Baxter) | MC (Pulleyblank) | OC (Baxter-Sagart) | OC (Zhengzhang)
                一 yī | nhất | 'one' | ʔit | ʔit | ʔit | ʔit
                義 yì | nghĩa | 'righteousness' | ngjijH | ngjieH | ŋ(r)aj-s | ŋi̯e-s
                藝 yì | nghệ | 'art, skill' | ngjejH | ngieiH | ŋ(r)at-s | ŋi̯at-s
                議 yì | nghị | 'discuss, deliberate' | ngjijH | ngjieH | ŋ(r)aj-s | ŋi̯e-s
                醫 yī | y | 'medicine, doctor' | ʔij | ʔi | ʔij | ʔi
                易 yì | dịch | 'change, exchange' | jek | jek | lek | lek
                裔 yì | duệ | 'descendant' | jejH | jieH | ljats | ljat-s
                異 yì | dị | 'different' | jiH | jiH | ljəʔ-s | ljɯ-s
                以 yǐ | dĩ | 'use, take' | yiX | yiX | ʔijʔ | ʔiʔ
                姨 yí | di | 'aunt' | ji | ji | ljə | ljɯ

                Key takeaways:
                • A single Mandarin syllable yi [i] corresponds to many more Sino‑Vietnamese outcomes than the ten listed here, depending on tone, historical layer, and semantic differentiation.

                • Middle Chinese reconstructions (Baxter, Pulleyblank) show tonal and initial distinctions that later collapsed in Mandarin but were preserved in Sino-Vietnamese.

                • Old Chinese reconstructions (Baxter-Sagart, Zhengzhang) reveal deeper roots, often with complex onsets (ŋ‑, lj‑, ʔ‑) that explain the diversity of Sino-Vietnamese reflexes (nh‑, ngh‑, d‑, y‑).

                • This multiplicity demonstrates how Vietnamese disyllabicity and associative sandhi interact with Middle-Chinese phonological history to produce a rich array of forms.

                Etymologically, many sound changes within the Chinese phonological system can be reconciled when compared against the Middle Chinese (MC) and Old Chinese (OC) sound systems. Sinologists such as Wang Li and Bernhard Karlgren also compared these with Chinese loanwords in Vietnamese, Korean, and Japanese to reconstruct earlier stages of Chinese. For illustration, consider the first three [i] readings discussed above:

                • nhất 一 yī [i1] ('one') [ M 一 yī, yí, yì, yāo < MC ʔjit < OC *qliɡ ] → SV nhất

                • nghĩa ~ ngãi 義 yì [i4] ('righteousness') [ M 義 yì < MC ŋjiə̆ < OC *ŋrals | According to Starostin, 'be right, righteous, proper'; derived from 宜 ŋaj. Vietnamese nghĩa preserves an archaic reading (late Han a‑vocalism, but with loss of final ‑j), while SV ngãi /ŋaj4/ and Quảng Nghĩa dialect nghiẽ /ŋie4/ retain the final. Cf. Chaozhou ŋi4, Fuzhou ŋie6. ]

                • nghệ 藝 yì [i4] ('arts', 'skill') [ M 藝 yì < MC ŋjaj < OC *ŋeds | Starostin: 'to plant, cultivate; skill'. MC unusually preserves ŋ before j. Vietnamese also has the colloquial nghề. Cf. Xiamen ge6, Chaozhou goi6, Fuzhou ŋie6. Possible cognates: VS nghề ('profession'), nghệ ('turmeric'), ngãi ('turmeric'), tỉa ('to plant'), gieo ('to sow'). ]

                Linguistic literature abounds with such radical changes. As King (1969: 109, 111) observes, "loss of segments is an almost commonplace kind of historical development: Greek lost its final stops, Germanic lost word‑final consonants and vowels under certain conditions." Mandarin exemplifies this process, having undergone extensive loss of phonological finals under the influence of the accented speech of the Altaic Turkic peoples who ruled northern China over the centuries prior to 1911: the Yan State, the Liao Dynasty, the Mongols, the Jurchen (金 Jin), and the Manchus.

                Concrete cases of loss of initials and finals in Mandarin are well documented. While the details of this complex process are not to be addressed here, it suffices to note that the contracted forms seen in the 360 four‑toned [i] syllables arose from the dropping of archaic initials and endings during diachronic sound change.

                Ancient rhyme books such as Guǎngyùn (廣韻) and Zhōngyuán Yīnyùn (中原音韻) provide ample evidence of these gradual changes. Yet synchronically, the same patterns emerge across Northeastern Mandarin, Wu, and Southwestern Mandarin (e.g., Sichuanese), in contrast to southern dialects such as Cantonese and Hokkien, which preserve ancient initials and finals (/d‑/, /ŋ‑/, /‑p/, /‑t/, /‑k/, /‑m/, etc.). From the early centuries of the last millennium through the Mongol Yuan Dynasty (13th century) and later the Manchu Qing Dynasty (17th–20th centuries), Mandarin remained in close contact with the Altaic languages of the Turks, Tartars, Jurchen, Mongols, and Manchus. These northern non‑Han languages left a profound impact on Early Mandarin, contributing to contraction, omission, corruption, and loss of initials, medials, and finals (Bo Yang 1983; Zhou 1991). In other words, descendants of the ancient Yan (Yên) still rule Beijing today.

                Although it may appear straightforward to chart sound change patterns by systematically tabulating ancient and modern forms, in practice this is a painstaking task. Often the intermediate steps are opaque, making it difficult to reconstruct the precise pathways. For newcomers, it is sometimes best to accept the outcomes at face value. For example, Vietnamese học /hawk͡p̚˧˨ʔ/ (‘study’) derives from MC ɦaɨwŋk < OC *ɡruːɡ, etc., fitting into a diachronic continuum that leads to Mandarin xué [ɕyɛ2]:

                • học ('study') 學 xué < EM /xjaw/ < MC ha:wk < OC ɣɶ:kʷ [ Rule: final /‑k/ conditioned by /‑w‑/ → /‑kʷ/ | Starostin: MC ɣauk < OC ghrūk. Pulleyblank: LM xɦja:wk < EM ɣaɨwk. The Vietnamese /‑kw/ in học parallels Cantonese /hɔk8/, though Cantonese has lost the labialization. ]

                • tiết ('blood') 血 xiě, xiè, xuè (SV huyết) [ < MC hwet < OC *qʰʷiːɡ | Starostin: Viet. also has tiết 'animal blood', an archaic loan (with t- regularly representing OC *s-, which was already lost in MC) ]

                Similarly:

                • khóc ('weep') 哭 kū [ < MC kʰəwk < OC *ŋ̥ʰoːɡ | SV khốc /kʰəwk͡p̚˦˥/ preserves the archaic final. VS khóc /kʰawk͡p̚˦˥/ reflects the same development. Dialectal parallels: Yangzhou /khɔʔ4/, Suzhou /khoʔ41/, Cantonese /huk41/, Amoy /khoʔk41/, Chaozhou /khok41/, etc. ]

                • khóc 泣 qì (SV khấp) [ < MC kʰɯip < OC *kʰrɯb | Wiktionary: Phono-semantic compound OC *kʰrɯb: semantic 氵 ("water") + phonetic 立 (OC *rɯb). The character originally meant 'tears' and by extension came to represent the act of 'crying'. | Ex. 喪家 同 泣報. Sāngjiā tóng qìbào. (Tanggia đồng khấpbáo.) 'The bereaved family tearfully announces the death.' ]

                These examples show that sound change is dynamic and diverse, affecting not only single syllables but also entire multisyllabic strings, as seen in disyllabic forms discussed earlier.

                One of the most striking features of syntactic adaptation in Vietnamese is the reversal of compound word order to align with Vietnamese speech habits. This reflects the [noun + adjective (modifier)] order of Old Chinese grammar, in which the second element modifies the first, as opposed to the [adjective + noun] order of modern Chinese. Yet both patterns coexist in Vietnamese. For example:

                • mắtkiếng (VS mắtkính) for 目鏡 mùjìng (Hai. /mat7keng1/, ‘eye‑glasses’), paralleling the Old Chinese [noun + modifier] order.

                • kiếngmắt (VS gươngmắt), reflecting the modern Chinese [modifier + noun] order.

                This phenomenon of lexical re‑arrangement — metathesis or inversion, that is, the transposition of sounds or letters in a word — has had a profound effect on the formation of disyllabic compounds, determining which syllable comes first. Many such words, especially Chinese loanwords, were originally composed of two lexical elements. When introduced into Vietnamese, speakers either retained the original order or reversed it to suit local grammar. The relative looseness of these paired syllables allowed for such fluidity, particularly during the period from the Jin (晉) through the Tang (唐) dynasties, when large numbers of both literary and colloquial words entered Vietnamese. Over time, one form often stabilized as the standard, while alternate variants persisted in parallel.

                Examples of such alternation include:

                • thơdại # ngâythơ 幼稚 (yòuzhì, SV ấutrỉ # VS trẻdại, 'childish')
                • sôngnúi # nonsông 江山 (jiāngshān, cf. 山河 shānhé, SV sơnhà, 'country')
                • nhànước # nướcnhà 國家 (guójiā, SV quốcgia, 'government' vs. 'nation')
                • hoamắt # mắthoa 眼花 (yǎnhuā, 'dazzling vision')
                • trườnghọc # họcđường 學堂 (xuétáng, 'school')
                • chợbúa # phốchợ 市鋪 (shìpǔ, SV thịphố # phốthị, 'marketplace')
                • bảođảm # đảmbảo 擔保 (dānbǎo, 'guarantee')
                • âmthanh # thanhâm 聲音 (shēngyīn, 'sound')
                • lạnhcóng # cónglạnh 寒冷 (hánlěng, 'chilly')
                • tráicây # câytrái 果實 (guǒshí, 'fruit')
                • nhãnlồng # longnhãn 龍眼 (lóngyǎn, 'longan')

                Readers who do not yet recognize the transposition should either suspend judgment or accept the stated propositions as working premises, to be used as a springboard for further inquiry. This is, after all, the natural process of human learning.

                Such lexical shuttling — anastrophe or inversion — often places semantic weight on the modified element rather than the modifier. For instance, 罪惡 zuì’è (SV tộiác 'crime') semantically corresponds more closely to 惡罪 èzuì 'evil‑crime', yet both orders coexist in the lexicon.

                The logical outcome of this disyllabic treatment is clear: when in doubt, one should test the reversed order of syllables. This principle reflects the fact that many sound changes occurred before the final stabilization of disyllabic forms in either Chinese or Vietnamese. By applying this inversion 'trick', we can often reconstruct plausible etyma and establish credible connections between Vietnamese and Chinese cognates.
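
                The inversion test can be sketched as a two-step lookup: try the compound in the order given, then in reversed order. The code below is a toy illustration under stated assumptions. The three-entry LEXICON holds only Sino‑Vietnamese pairings cited in the examples above, compounds are passed as syllable tuples because the text writes them solid, and the function name lookup_with_inversion is hypothetical.

```python
# Hypothetical mini-lexicon: Sino-Vietnamese compounds (as syllable tuples)
# mapped to the Chinese compounds they are paired with in the examples above.
LEXICON = {
    ("đảm", "bảo"): "擔保",
    ("thanh", "âm"): "聲音",
    ("long", "nhãn"): "龍眼",
}


def lookup_with_inversion(syllables, lexicon=LEXICON):
    """Try the given syllable order first, then the reversed order.

    Returns (chinese, was_inverted), or None if neither order is listed.
    """
    direct = tuple(syllables)
    if direct in lexicon:
        return lexicon[direct], False
    flipped = direct[::-1]
    if flipped in lexicon:
        return lexicon[flipped], True
    return None


print(lookup_with_inversion(("bảo", "đảm")))   # ('擔保', True)
print(lookup_with_inversion(("thanh", "âm")))  # ('聲音', False)
print(lookup_with_inversion(("mắt", "hoa")))   # None
```

                A match found only after flipping is a candidate for further checking, not proof of cognacy; the sound correspondences of each syllable still have to be argued separately.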

                The Vietnamese lexicon is full of compounds that seem, at first glance, oddly inverted. Yet this inversion is not random: it reflects a deep historical process in which borrowed syllables were still fluid, their order unsettled, and speakers chose whichever arrangement best suited local grammar and rhythm.

                Take bắtnạt # 欺負 (qīfù, SV khiphụ, 'bully'). The Chinese compound places 欺 'to deceive' before 負 'to bear', but Vietnamese speakers flipped the order, letting bắt carry the weight of action while nạt sharpens the sense of intimidation. The result is a form that feels native, even though its roots are unmistakably Sinitic.

                The same pattern appears in thầymô # 巫師 (wūshī, SV usư, 'sorcerer'). Here, the Chinese order is 巫 'shaman' + 師 'master'. Vietnamese, however, foregrounds thầy 'teacher, master' and lets mô carry the shamanic nuance. The inversion not only naturalizes the compound but also aligns it with the Vietnamese cultural schema of thầy as a figure of authority.

                Other cases are subtler but no less telling. khônlanh # 靈巧 (língqiáo, SV linhxảo, 'witty') reverses the Chinese order, giving primacy to khôn 'clever' while letting lanh 'quick, nimble' follow. hồnthiêng # 靈魂 (línghún, SV linhhồn, 'spirit') likewise inverts the Chinese sequence, foregrounding hồn 'soul' and letting thiêng 'sacred' qualify it.

                Even everyday kinship terms bear this mark of inversion. bàxã # 媳婦 (xífù, 'wife') and ôngxã # 相公 (xiànggōng, SV tướngcông, 'husband') both re‑order the Chinese elements, mapping them onto Vietnamese kinship vocabulary in ways that feel natural to the ear. Similarly, ôngchủ # 主公 (zhǔgōng, SV chúacông, 'master') places ông first, in keeping with Vietnamese address norms.

                Geographic and cultural compounds show the same play. nonsông # 江山 (jiāngshān, SV giangsơn, 'nation') reverses the Chinese order, yielding the familiar Vietnamese pairing sông núi 'rivers and mountains'. yêuthương # 疼愛 (téng’ài, SV đôngái, 'love') likewise reshuffles the Chinese sequence, foregrounding yêu 'love' and letting thương 'affection' follow.

                Even the mundane đườngcái # 街道 (jiēdào, SV cáiđạo, 'road') and phốchợ # 市舖 (shìpù, SV thịphố, 'market') reveal the same logic: Vietnamese speakers instinctively re‑ordered the syllables to fit local patterns of emphasis. And in ôngnghè # 衙門 (yámén, SV nhamôn, 'civil servant'), the inversion is almost playful, mapping 門 'gate' onto ông and 衙 'office' onto nghè, producing a form that is both intelligible and culturally resonant.

                The pattern is unmistakable: inversion was a strategy of naturalization. By flipping the order, Vietnamese speakers made foreign compounds feel native, aligning them with local rhythm, semantics, and cultural schemas. For the historical linguist, this is more than a curiosity — it is a method. When a Vietnamese form resists explanation, try reversing the syllables. More often than not, the hidden cognate will surface.

                What these examples show is not chaos but a principle: when a compound entered Vietnamese, its syllables were not fixed in stone. They could be reversed, re‑weighted, and re‑aligned until they fit the cadence of Vietnamese speech. The "inversion trick" is thus a diagnostic tool as much as a historical observation.
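
                The reversed-order test can be sketched as a small procedure. The Python fragment below is only an illustration under stated assumptions: the function name match_order and the table CORRESPONDENCES are hypothetical, and the table holds nothing but syllable-to-character pairings read off this chapter's own examples. It is not a real etymological lexicon and makes no claim about sound correspondences.

```python
import unicodedata
from typing import Optional, Sequence


def nfc(text: str) -> str:
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# ASSUMPTION: a toy table standing in for a real etymological lexicon.
# The pairings are read off the chapter's examples only:
# thầymô # 巫師, ôngchủ # 主公, nonsông # 江山, hồnthiêng # 靈魂.
CORRESPONDENCES = {
    nfc(syllable): character
    for syllable, character in {
        "thầy": "師", "mô": "巫",
        "ông": "公", "chủ": "主",
        "non": "山", "sông": "江",
        "hồn": "魂", "thiêng": "靈",
    }.items()
}


def match_order(syllables: Sequence[str], compound: str) -> Optional[str]:
    """Classify a Vietnamese form against a Chinese compound.

    Returns "direct" if the syllables map onto the compound in the same
    order, "inverted" if they map onto it in reverse order, and None if a
    syllable is missing from the table or neither order fits.
    """
    mapped = [CORRESPONDENCES.get(nfc(s)) for s in syllables]
    if any(character is None for character in mapped):
        return None
    if "".join(mapped) == compound:
        return "direct"
    if "".join(reversed(mapped)) == compound:
        return "inverted"
    return None


if __name__ == "__main__":
    # Forms and compounds quoted from the chapter.
    for form, compound in [
        (["thầy", "mô"], "巫師"),
        (["ông", "chủ"], "主公"),
        (["hồn", "thiêng"], "靈魂"),
    ]:
        print("".join(form), compound, match_order(form, compound))
```

                The check is deliberately mechanical: it only reports whether a known set of pairings lines up in direct or reversed order. Deciding whether a pairing is a genuine cognate remains a matter of phonological and semantic argument, which no lookup table can settle.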

                In sum, our exploration of disyllabicity demonstrates that many Vietnamese words, long overlooked by scholars, can only be properly understood through this lens. By tracing how Vietnamese disyllabic forms diverged from their Chinese roots, we gain a powerful methodological tool.

                The renewed recognition of Vietnamese as a fundamentally disyllabic language establishes a new polysyllabic approach to etymology. Many peculiar sound changes from Chinese into Vietnamese occurred only under such conditions. This approach, long overdue, challenges the entrenched but mistaken notion of Vietnamese as a purely monosyllabic language. (雙)

                Conclusion

                The evidence presented in this chapter shows that Vietnamese cannot be understood apart from the wider Sinitic‑Yue continuum. From the earliest strata of loanwords to the fluid alternations of disyllabic compounds, the language reveals centuries of contact, adaptation, and inversion.

                Several points stand out. Vietnamese compounds often invert the order of their Chinese models, a process that naturalized foreign forms into local rhythm and semantics. This inversion is not random but systematic, and it provides a diagnostic tool for reconstructing etyma. The lexicon is layered: Sino‑Vietnamese readings, vernacular Sinitic‑Vietnamese forms, and substratal elements from Mon‑Khmer, Chamic, and Tai. Yet the Sinitic layers dominate, and their depth suggests not mere borrowing but shared inheritance from a Yue substrate. Borrowed compounds were not only phonologically reshaped but semantically re‑weighted to align with Vietnamese cultural schemas, as in ôngchủ, bàxã, or sôngnúi.

                Historically, the great influx of Chinese vocabulary coincided with political domination and migration from the Jin through Tang dynasties. Stabilization came only gradually, leaving behind doublets, alternates, and inversions that still mark the lexicon today. The methodological lesson is clear: when a form resists explanation, one should test the reversed order of syllables. This simple procedure often uncovers hidden cognates and reconstructs plausible etymologies.

                Taken together, these findings support the view that Vietnamese and Chinese share a common Yue foundation, later overlaid by successive waves of Sinitic influence. Vietnamese is not simply a Mon‑Khmer language with heavy Chinese borrowings, nor merely a Sino‑xenic reflex. It is a Yue‑based language, refracted through centuries of contact, conquest, and cultural negotiation.

                Recognizing Vietnamese as fundamentally polysyllabic and disyllabic in structure opens a new path for etymological research. It allows us to explain sound changes, semantic shifts, and lexical alternations that have long puzzled scholars. More importantly, it reframes Vietnamese not as a peripheral tongue under Chinese shadow, but as a central witness to the shared Yue heritage of southern China and northern Vietnam.

                This chapter establishes the methodological springboard: to study Vietnamese etymology, one must embrace polysyllabicity, test inversion, and situate the lexicon within the Yue–Sinitic continuum. Only then can we recover the true historical depth of the language and its people.

                x X x

                ENDNOTES


                (南)^ Across the seventy‑two volumes of the Zizhi Tongjian (資治通鑑), Sīmă Guāng’s monumental chronicle of Chinese history from the Warring States period through the Five Dynasties, one recurring theme is the geography of exile. Again and again, disgraced officials, political rivals, and fallen elites were banished to the far south: the ancient Língnán (嶺南) region (today’s Guangxi, Hunan, and Guangdong), together with the northern reaches of Vietnam, once the NamViệt Kingdom (南越王國), and the island of Hainan. Exile, in other words, was not an occasional punishment but a structural feature of Chinese political life.

                (京)^ Throughout Vietnam’s history, the descendants of the racially mixed Annamese population, later known as Người Kinh, the Kinh, or Jing (‘the metropolitans’), established their early dominance in the Red River Delta before gradually migrating westward and southward into new settlements. In the process, they displaced or absorbed many indigenous groups, most notably the Mường and the Mèo (Hmong), along with smaller Daic‑origin communities such as the Tày, who today number over a million and continue to inhabit the remote northern highlands, including the more recently incorporated territories of Laichâu and Điệnbiên.

                As Annam expanded further south, the Kinh supplanted the Chamic populations along the coastal plains and pressed into the southwestern uplands traditionally occupied by Mon‑Khmer groups. Over time, these once‑dominant peoples became minorities within their own ancestral lands. The legacy of this displacement remains visible today. In the early years of the twenty‑first century, clashes erupted between Khmer‑descended communities and the Kinh majority, episodes that were at times tacitly tolerated or even directly sanctioned by the state. The resulting tensions forced many Khmer minorities to flee across the border into Cambodia, while hundreds eventually resettled in the United States in 2005.

                (華)^ From a historical perspective, one finds few instances of open racial conflict between Vietnamese and Chinese communities. Even resentment was generally directed not at long‑assimilated groups, those who had intermarried with the Kinh over centuries, but at more recent arrivals, often fresh from the boat, only two or three generations removed from China and still trying to maintain a distinct ethnic identity. Nevertheless, the degree of integration is high, as illustrated by the fact that many celebrated performers in contemporary Vietnam are of Hoa origin, a reality that contrasts sharply with the violent anti‑Chinese episodes that have recurred in other Southeast Asian countries such as Indonesia, the Philippines, or Malaysia.

                The large‑scale departure of Chinese minorities from Vietnam between 1979 and 1990 should be seen not as spontaneous hostility from the populace but as a politically orchestrated policy of the state. Most of those who left were relatively recent immigrants, with less than a century of settlement, while earlier arrivals had by then largely assimilated into Vietnamese society. Their expulsion was closely tied to the post‑1975 imposition of socialism following national reunification, which provided the government with grounds to confiscate private assets—factories, banks, and businesses—and was later intensified by the Sino‑Vietnamese conflicts of the late 1970s.

                In other words, there were no impromptu acts of violence initiated by ordinary Vietnamese against individual members of the Chinese minority. Even the 2014 riots, which damaged hundreds of Chinese‑owned factories in Vietnam, were directed symbolically at Beijing’s leadership in Zhongnanhai rather than at Chinese laborers working inside the country.

                (門) Had Vietnam not gained independence from China in the 10th century and instead remained a Chinese satellite state, protectorate, or province, modern linguists would likely have classified Vietnamese in the same way as the Cantonese and Fukienese dialects, that is, as a member of the Sino-Tibetan language family.

                (T) All Thai citations of individual words are quoted from Wiktionary.org.

                (苦)^ Vietnamese cay corresponds to 苦 kǔ (SV khổ), which combines with 辛 xīn (SV tân) to form cayđắng (辛苦 xīnkǔ, SV tânkhổ, ‘hardship’). The root of cay lies in Middle Chinese 苦 kǔ, with parallels such as 辛酸 xīnsuān (VS chuacay, ‘bitterness’).

                [ 苦 kǔ < MC khɔ < OC kha:ʔ. According to Starostin: ‘be bitter’. Also used for a homonymous *kha:ʔ ‘sow‑thistle’ (Sonchus oleraceus?). Vietnamese khó is a colloquial development, restricted to the sense ‘(bitter) > hard, difficult’, which also exists in Chinese. The regular Sino‑Vietnamese form is khổ. ]

                In modern Mandarin, ‘spicy hot’ is 辣 là (SV lạt), while 苦 kǔ (SV khổ) corresponds to Vietnamese cay. In archaic Chinese, however, 辣 là was closer to Vietnamese lạt ‘insipid, not salted’.

                [ 辣 là < MC ra:t < OC lat. FQ 盧達. According to Starostin: ‘bitter, not sweet’ (Tang). In Vietnamese cf. nhạt ‘insipid, not salted’, written with the same character and possibly a colloquial loan from the same source, though the nasalisation remains uncertain. For r‑ cf. Min forms: Xiamen luat8, luaʔ8, Chaozhou laʔ8, Fuzhou lak8, Jianou luoi8, Jianyang lue8, Shaowu lai6. ]

                For Chinese 辛 xīn ‘bitter’ (SV tân), the Vietnamese equivalent is đắng.

                [ 辛 xīn < MC sjin < OC sin. According to Starostin, also used for a homonymous *sin ‘be bitter, pungent, painful’. ]

                This cluster of correspondences illustrates a common phenomenon in historical linguistics: semantic shifting among archaic roots, where meanings oscillate between ‘bitter’, ‘spicy’, ‘painful’, and ‘insipid’.

                (H)^ Teeth from Chinese cave recast history of early human migration

                A trove of 47 fossil human teeth from a cave in southern China is rewriting the history of the early migration of our species out of Africa, indicating Homo sapiens trekked into Asia far earlier than previously known and much earlier than into Europe.

                [ Scientists have announced ] the discovery of teeth between 80,000 and 120,000 years old that they say provide the earliest evidence of fully modern humans outside Africa.

                The teeth from the Fuyan Cave site in Hunan Province's Daoxian County place our species in southern China 30,000 to 70,000 years earlier than in the eastern Mediterranean or Europe.

                "Until now, the majority of the scientific community thought that Homo sapiens was not present in Asia before 50,000 years ago," said paleoanthropologist Wu Liu of the Chinese Academy of Sciences' Institute of Vertebrate Paleontology and Paleoanthropology.

                Our species first appeared in East Africa about 200,000 years ago, then spread to other parts of the world, but the timing and location of these migrations has been unclear.

                University College London paleoanthropologist María Martinón-Torres said our species made it to southern China tens of thousands of years before colonizing Europe perhaps because of the entrenched presence of our hardy cousins, the Neanderthals, in Europe and the harsh, cold European climate.

                "This finding suggests that Homo sapiens is present in Asia much earlier than the classic, recent 'Out of Africa' hypothesis was suggesting: 50,000 years ago," Martinón-Torres said.

                Liu said the teeth are about twice as old as the earliest evidence for modern humans in Europe.

                "We hope our Daoxian human fossil discovery will make people understand that East Asia is one of the key areas for the study of the origin and evolution of modern humans," Liu said.

                Martinón-Torres said some migrations out of Africa have been labeled "failed dispersals." Fossils from Israeli caves indicate modern humans about 90,000 years ago reached "the gates of Europe," Martinón-Torres said, but "never managed to enter."

                It may have been hard to take over land Neanderthals had occupied for hundreds of thousands of years, Martinón-Torres said.

                "In addition, it is logical to think that dispersals toward the east were likely environmentally easier than moving toward the north, given the cold winters of Europe," Martinón-Torres said.

                Paleoanthropologist Xiujie Wu of the Institute of Vertebrate Paleontology and Paleoanthropology said the 47 teeth came from at least 13 individuals.

                The research appears in the journal Nature.

                (Reporting by Will Dunham; Editing by Sandra Maler) Source: http://www.reuters.com/article/2015/10/14/us-science-teeth-idUSKCN0S82CB20151014

                (平)^ The following Chinese cognates are proposed for these Vietnamese words:

                  (i) disyllabicity:

                  • đầugối #膝蓋 xīgài (knee),
                  • mắccá 踝節 huáijié (ankle) [ Also: 踝骨 huáigǔ ],
                  • cổchân #腳脖 jiăobó (ankle),
                  • càngcổ #脖頸 bójĭng (back of the neck),
                  • bảvai #肩膀 jiānbăng (shoulders),
                  • cùichỏ 胳膊肘 gēbozhǒu (elbow),
                  • màngtang 太陽穴 tàiyángxué (temple),
                  • mỏác #囟門 xìnmén (fontanel),
                  • chânmày 眉尖 méijiān (eyebrow),
                  • càunhàu 僝僽 chánzhòu (growl),
                  • cằnnhằn 埋怨 mányuàn (grumble),
                  • bângkhuâng 彷徨 pánghuáng (pensive),
                  • bồihồi 徘徊 páihuái (melancholy),
                  • mồhôi 冒汗 màohàn (sweat),
                  • mồcôi 無辜 wúgū (orphan),
                  • hàilòng 開心 kāixīn (pleased),
                  • taitiếng 丟臉 diūliăn (infamous),
                  • tạmbợ 暫時 zànshí (temporary),
                  • tráchmóc 折磨 zhémó (reproach),
                  • tuyệtvời 絕妙 juémiào (wonderful),
                  • tămhơi 音信 yīnxìn (whereabouts),

                  (ii) polysyllabicity:

                  • cườimĩmchi 笑眯眯 xiàomīmī (crack a smile),
                  • tủmtỉmcười 偷偷笑 tōutōuxiào (hide a smile),
                  • mêtítthòlò 迷離糊塗 mílíhútú (irresistible),
                  • nhảyđồngđổng #蹦蹦跳 bèngbèngtiào (jump up in protest),
                  • bađồngbảyđổi 說三道四 shuōsāndàosì (unpredictably),
                  • lộntùngphèo ® 亂七八糟 luànqībāzāo (upside down),
                  • tuyệtcúmèo ® 妙不可言 miàobùkěyán (fabulous),

                  and, of course, (iii) Sino-Vietnamese equivalents:

                  • hằnghàsasố 恆河沙數 hénghéshāshù (innumerable),
                  • hiệndiện 現在 xiànzài (present),
                  • phụnữ 婦女 fùnǚ (woman),
                  • sơnhà 山河 shānhé (country),
                    etc.

                (雙)^ Once the disyllabic nature of Vietnamese is acknowledged, it follows almost inevitably that the writing system must also change. A new orthography, provisionally called Việtngữ 2020 or Vietnamese2020 (as outlined in the Vietnamese2020 Writing Reform Proposal), offers such a path. In this paper the author has consistently written Vietnamese disyllabic words in combined forms to demonstrate the principle. The reasoning is straightforward: many of these words cannot be split into isolated syllables, since each syllable functions as a bound morpheme, a composite element that only gains full meaning when paired with its partner. The numerous examples presented here illustrate this point. The hope is that a truly polysyllabic orthography will prove to be the most faithful way of writing Vietnamese, and that it may, within our lifetime, gain wide acceptance.