bound morphemes = the smallest meaningful phonological units that are
bound together and usually appear in pairs to form composite words
C = Chinese in general (TiếngHán 漢語) (See also: tiếngTàu)
Cant. = Cantonese (TiếngQuảngđông 廣東方言)
cf., ss, or "§" = compare (sosánh)
character = mostly referring to a Chinese ideogram; also, a Roman letter
or an ideographic symbol (chữ, tự, mẫutự 字母, 漢字)
Chin., C., Chinese = Chinese in general (TiếngHán 漢語) (See also:
tiếngTàu)
Chin. dialects, Chinese dialects = 7 major Chinese dialects, including
sub-dialects (phươngngữHán, TiếngTàu 漢語方言)
Chaozhou (Chiewchow, Teochew) = a sub-dialect of Fukienese, also known as
Tchiewchow (tiếngTriều, tiếngTiều 潮州方言), also 'TiếngTàu'
China North = 華北 Huáběi (Hoabắc), regions north of the Yangtze River in
today's northern part of mainland China
China South = 華南 Huánán (Hoanam), regions south of the Yangtze River in
today's southern part of mainland China
composite word = two-syllable word composed of two bound morphemes,
neither of which can function fully as a word on its own (từkép, từ
songâmtiết)
compound word = two-syllable word that is composed of two words (từghép,
từ songâmtiết)
doublet = a Chinese character of the same root that appears in a different
form (từđồngnguyên 同源辭)
diachronic = concerning the historical development of a language through
time
Dai = T’ai, Tai, Tày, and sometimes Thai, languages (TiếngTày 傣語)
ex. = example (= td. 'thídụ')
dissyllabicity = dissyllabics, dissyllabism
dissyllabics = characteristics of a language based on its dominant
two-syllable words in its vocabulary (tínhsongâmtiết 雙音節性)
dissyllabism = dissyllabics, dissyllabicity
EM = Early Mandarin
EMC = Early Middle Chinese (TiếngHán Tiềntrungcổ 前中古漢語)
Fk = Fuzhou, Fukienese (Fújiàn) or Amoy (TiếngPhúckiến hay phươngngữ Hạmôn
厦門方言)
FQ (or Pt) = 'fǎnqiè' 反切 phiênthiết (initial and syllabic conjugation, a
Chinese lexical spelling system in classics)
Hai. = Hainanese, a sub-dialect of Fukienese or Amoy (TiếngHảinam
海南方言)
HN = Nôm words, same as VS, or Vietnamese words of Chinese origin (HánNôm
漢喃辭匯)
ideograph/ideogram = a written symbol in a writing system developed
from graphic representation (chữtượnghình 形像字母)
IPA = the International Phonetic Alphabet (Phiênâm Quốctế)
K, Kh. = Khmer or Cambodian (TiếngKhmer/TiếngCaomiên)
Kinh / NgườiKinh = literally "the metropolitans", or "the Kinh", meaning
the Vietnamese majority ethnic group living in the coastal lowlands as
opposed to "NgườiThượng" ("the Montagnards") which denotes minority ethnic
groups living in remote highlands in Vietnam (京族)
Latinized / Latinization: same as Romanized / Romanization (Latinhhoá
羅丁拼音)
loangraph = a loangraph in Chinese is a homophone conveying a different
meaning but using the same ideographic character (giảtá, 假借)
LZ = Late Zhou, L. Zhou (Cuối ĐờiChâu 周末)
M = Mandarin, QT (TiếngPhổthông, tiếngQuanthoại 普通話, 國語)
Malay = Malay linguistic affinity (Ngữchi Mãlai 馬來語支); National
language of Malaysia (TiếngMãlai 馬來語)
Mao-Nan = Mao-Nan language, a Mon-Khmer language spoken by the Mao-Nan
ethnic group in Southern China (TiếngMaonam 毛南語)
MC = Middle Chinese (TiếngHán Trungcổ 中古漢語)
MK = Mon-Khmer linguistic affinity (Ngữchi Mon-Khmer 猛高棉語支)
monosyllabicity = monosyllabics
monosyllabics = characteristics of a language based on its dominant
one-syllable words in its vocabulary (tínhđơnâmtiết 單音節性)
Mèo = Hmong 苗
Môn = Mon
monosyllabism = monosyllabics
N = Original Vietnamese, also old Chinese-based Vietnamese writing system
(từ Nôm, tiếngNôm hoặc từ thuần Việt 純喃辭匯, ChữNôm "字喃")
Nôm = Nôm characters of an old Chinese-character-based Vietnamese writing
system or, by extension, Nôm words, HN (HánNôm), Vietnamese words
of Chinese origin (HánNôm 漢喃辭匯)
Nùng = Zhuang language, same as Đồng, Tráng (TiếngNùng 莊語, 垌語)
OC = Old Chinese (TiếngHán Cổ 古漢語)
OV = Old Vietnamese form (TiếngViệt cổ / TiếngViệtMường cổ)
Pt = FQ 'fǎnqiè' 反切 phiênthiết (initial and syllabic conjugation, a
Chinese lexical spelling)
Pinyin = People's Republic of China's official Romanization transcription
system of Pǔtōnghuà (pinyin haylà bínhâm 拼音 -- phiênâm )
polysyllabicity = polysyllabics
polysyllabics = characteristics of a language based on its dominant
multi-syllable words in its vocabulary (tínhđaâmtiết 多音節性)
polysyllabism = polysyllabics
pre-SV = pre-Sino-Vietnamese (TiềnHánViệt 前漢越辭匯)
pro-C = proto-Chinese (TiếngHán Tiềnsử 前史漢語)
Putonghua, or Pǔtōnghuà = Official name of Mandarin (Tiếngphổthông haylà
Quanthoại 普通話/國語)
radical = basic Chinese ideographic root on which other characters are
built (tựcăn 字根)
Quốcngữ = Vietnamese national orthography
Romanized / Romanization: same as Latinized / Latinization (Latinhhoá
羅丁拼音)
synonymous compound = compound word that is composed of two synonymous
syllables or words (từghép đẳnglập, từkép đẳnglập, từsongâmtiết đẳnglập)
sandhi = change in the sound of a word under the influence of a preceding or
following sound
sandhi process of assimilation / association = same as the associative
sandhi process
synchronic = studying language as it exists at a certain point in time,
without considering its historical development
Sinicized = influenced, characterized, and/or identified by Chinese
elements (Hánhoá 漢化)
ss, or "§" = cf., compare (sosánh)
ST = Sino-Tibetan (HánTạng 漢藏語系)
SV = Sino-Vietnamese (HánViệt 漢越辭匯)
Tai, T'ai, Tày, Thái (see Dai)
Tchiewchow = a sub-dialect of Fukienese, also known as Chaozhou
(tiếngTriều, tiếngTiều 潮州方言), with variant spellings Tchewchow, Teochoew,
Teocheo, Chewchow, etc.
Thượng / NgườiThượng = See: Kinh/NgườiKinh
TiếngTàu = a colloquial term for the Chinese languages; the
term "Tàu" could have originated from Tần 'Qín 秦' or tiếngTiều 潮州方言
(từ "Tàu" cóthể do "Tần" hoặc tiếngTiều 潮州方言 màra.)
V, Viet. = Vietnamese (TiếngViệt 越南話)
Vh, Vh @, Việthoá = "Vietnamized", vernacular reflex of
VHh, VHh @, ViệtHánhoá = "Sino-Vietnamized", folk Sino-Vietnamese
"Vietnamized" = Characterized by the localization of loanwords to fit into
Vietnamese speech habit (Việthoá 越化), vernacular reflex of
VM = VietMuong or Việt-Mường form (TiếngViệtMường 越孟語)
VS = Sinitic-Vietnamese (HánNôm 漢喃辭匯), vernacular Vietnamese
Zhuang = the Zhuang language, same as Nùng, Đồng, Tráng (TiếngNùng 莊語,
垌語)
This chapter establishes the conceptual and methodological groundwork for
analyzing Sinitic-Vietnamese (VS)—a foundational stratum in Vietnamese
etymology and linguistic identity. VS denotes the deeply naturalized layer of
Chinese-derived vocabulary, shaped by sustained contact with northern Sinitic
lects and the broader Sino-Tibetan family. Through an interdisciplinary lens,
this study traces the linguistic evolution of Vietnamese, foregrounding its
dual inheritance: a Yue-descended substrate interwoven with Sinitic influence.
The VS stratum exemplifies the cumulative integration of Sinitic elements into
Vietnamese, forged through dynastic governance, cultural transmission, and
vernacular adaptation across centuries of Annamese history.
Sinitic-Vietnamese encompasses all Chinese-derived vocabulary that has
undergone localization within the Vietnamese linguistic environment. It
includes subsets such as Sino-Vietnamese (SV), rooted in Middle Chinese
phonology, which formed the backbone of administrative, literary, and
colloquial Vietnamese during the Han and Tang periods. Sino-Vietnamese is
not merely a historical residue; it is a living system of semantic and
phonological adaptation.
The chapter traces Sinitic-Vietnamese origins to the Yue aboriginals, pre-Han
inhabitants of southern China and northern Vietnam. Their linguistic
contributions to proto-Vietic and Tai-Kadai languages shaped the substrate
upon which Sinitic layers were later imposed. The term Việtnam itself, "Yue
people of the South", encapsulates this fusion of Yue and Han
cultural-linguistic heritage.
Sinitic elements entered Vietnamese during the Han colonial era (206 B.C.–24
A.D.), and were further enriched by Tang influence. These layers evolved into
functional registers, literary forms, and vernacular usage, culminating in
ChữNôm and later Quốcngữ, the Romanized national script. The chapter also
sketches Middle Chinese tonal systems and their role in shaping Vietnamese
phonology, emphasizing the position of Vietnamese as a Yue-descended yet
highly Sinicized language.
This chapter reexamines the etymological foundations of Vietnamese by
proposing Sino-Tibetan origins for a substantial portion of its lexicon,
challenging long-standing Austroasiatic Mon-Khmer classifications. Through
comparative phonological and semantic analysis, it uncovers Vietnamese
cognates with Old Chinese, many of which have been historically obscured or
misclassified.
These items suggest deep geographic and etymological ties between Vietnamese
and early Sinitic strata, particularly those shaped by Yue substratal
influence and Han expansion. The evidence supports a reevaluation of
Vietnamese's linguistic lineage, not as a peripheral Austroasiatic offshoot,
but as a Yue-descended, Sinitic-integrated language with complex tonal and
morphological inheritance.
Beyond language, the chapter notes cultural remnants such as the twelve-animal
zodiac system and agricultural terminologies that reflect Yue and Taic roots.
It also sketches the socio-political significance of linguistic change during
colonial and post-independence eras, showing how historical influences shaped
Vietnamese identity.
This research reframes regional relationships by comparing Sinitic-Vietnamese
etymology with Sino-Tibetan variants and with Old Chinese, Middle Chinese,
Mandarin, Cantonese, Hokkien, and other lects. The comparative framework
enables a more nuanced understanding of shared phonological, morphological,
and semantic features across the Sinitic-Yue continuum.
x X x
This introductory section aims to provide readers with a foundational
overview of the study. It introduces key concepts and engages with
illustrative examples of Sinitic-Vietnamese vocabulary, particularly
those whose etymologies, despite clear Sinitic or Sino-Tibetan (ST)
origins, have been misclassified as Mon-Khmer (MK). These examples serve
to highlight the methodological challenges and historical misattributions
that have shaped the field.
The discussion also extends into a new frontier of Vietnamese
historical linguistics: the identification of prominent
Sino‑Tibetan (漢藏 Hàn‑Zàng) etymological evidence, which is elaborated in
Chapter 10 - Parallels with the Sino-Tibetan Languages. Among the primary
objectives of this study is to establish a
structured methodology for investigating this discovery. The findings
reopen the long‑standing debate over whether Vietnamese should be
reclassified as a member of the Sino-Tibetan language family.
This chapter introduces the framework of Sinitic-Vietnamese as a
comprehensive approach to analyzing Chinese-derived vocabulary in
Vietnamese. Unlike the narrower category of Sino-Vietnamese, which
reflects formalized Middle Chinese phonology, the Sinitic-Vietnamese
domain encompasses both literary and vernacular adaptations shaped by
sustained linguistic contact. Vietnamese is situated within a Yue
substratum, and the chapter proposes a Sino-Tibetan affiliation based on
phonological and semantic evidence, challenging the conventional
Austroasiatic classification.
Polysyllabicity is introduced as a central methodological principle,
enabling the identification of layered etymologies and semantic and phonetic
shifts across registers.
Cultural domains including the zodiac, agricultural terminology, and
literary traditions demonstrate the enduring influence of Yue-Taic
heritage. Lexical and idiomatic examples such as "mẹo" (卯), "ngọ" (午),
"gà" (雞), "trống" (雄), "cồ" (公), "mái" (母), and the colloquial phrase
"Bấtkể ai nóigànóivịt, mình chỉ nói ngang." (不管 講雞講鴨, 我 只 講 鵝)
illustrate bidirectional transfer and deep-rooted cognates.
By integrating historical periodization, comparative linguistics, and
typographic precision, the chapter lays the foundation for a polysyllabic
annotated lexicon and a revised linguistic historiography. It advocates
for a reclassification of Vietnamese and a more nuanced understanding of
its Sinitic layers, with the goal of advancing methodological clarity and
scholarly accessibility.
I) Defining Sinitic-Vietnamese
In this paper, 'Sinitic-Vietnamese' designates not only a blend of
foundational items rooted in the Yue substrate, layered with Old Chinese
elements and further enriched by the "Sino-Vietnamese" layer of Middle
Chinese loanwords, but also lexical items derived from, or shared with,
northern Mandarin Chinese (M), introduced through processes of localization
and innovation by speakers within the colonial administration of Annam, in
present‑day northern Vietnam, for more than a millennium, from 111 B.C. to
939 A.D. The term also encompasses a distinct subset identified as
Sino‑Vietnamese (SV), whose phonological and semantic origins trace to
Middle Chinese (MC). Over those centuries, this class developed
under the administrative influence of officials serving various northern
Chinese imperial dynasties. Comparable to the Sinitic strata in southern
Chinese lects such as Cantonese and Fukienese (Hokkien), these elements
form a foundational layer of the modern Vietnamese lexicon.
Sinitic‑Vietnamese (VS) encompasses every lexical item of Chinese
origin that has been localized within the Vietnamese speech environment,
including:
Sino‑Vietnamese (SV): A codified subset rooted in Middle Chinese
phonology, functioning in Vietnamese much like Greco‑Latin loanwords in
English.
Pre‑Sino‑Vietnamese forms: Older loans from pre‑Qin and Han eras, many
with Old Chinese (OC) or Taic‑Yue origins.
Parallel forms: Doublets where one is formal‑literary and the other
colloquial‑vernacular, sometimes diverging in meaning.
The scope of Sinitic‑Vietnamese will include all mono‑ and disyllabic
words of Chinese origin, including those that resemble or sound like
Sino‑Vietnamese forms, except where 'Sino‑Vietnamese' applies specifically to words as exemplified
in a "Hán‑Việt từđiển" (Sino‑Vietnamese dictionary).
By convention, the term Sino‑Vietnamese (SV), or Hán‑Việt (漢越), is most
often used to refer to the systematic Vietnamese pronunciation of the
large body of Chinese vocabulary employed in modern Vietnamese. By
analogy, Sino‑Vietnamese words function much like Latin‑ or Greek‑derived
terms in English. The Vietnamese pronunciation in this context reflects
the consensus that Hán‑Việt words are those rendered with modern
Vietnamese phonological characteristics. In reality, they represent slight
variations of Middle Chinese sounds, which are believed to have been used
in the spoken language of the imperial court from the early colonial
period, paralleling the development of Cantonese in the same era.
Each lexical stratum carries its own developmental history. In contrast
to the term 'Sinitic', the term 'Yue', written alternatively in Chinese
Classics as 越, 粵, 戉, 鉞, among other forms, is used here to denote the
indigenous southern linguistic stratum, composed of core vocabulary, from
which the proto‑Vietic language evolved and upon which Sinitic-Vietnamese
was later imposed. Archaeological and
textual records suggest Yue communities pre‑date the ethnolinguistic
entity now called "Chinese" by millennia. Western labels like "Sinitic"
are scholarly shorthand; while imperfect, they aid accessibility in
comparative linguistics.
The use of prefixes such as 'Sino‑' or 'Sinitic‑' to denote the concept
of 'Chinese' in linguistic taxonomy should be understood as a matter of
scholarly convenience. These terms, frequently adopted by Sinologists,
serve as shorthand for a widely recognized label. In the historical
periods under discussion, however, the entity now called 'Chinese' had not
yet formed; it emerged only with the Qin Dynasty. Archaeological and textual evidence
shows that Yue communities predated the emergence of what would later be
called 'China', along with the linguistic features that came to define
it.
The term 'Chinese', nevertheless, is effective in this context because of
its broad recognition, whereas 'Yue' remains comparatively unfamiliar.
This usage thus represents a form of academic shorthand, employing
familiar terminology to efficiently reference earlier linguistic forms
recognizable to the scholarly community. Such naming conventions are
standard practice in historical linguistics. Substituting 'Việt'
or 'Jyut6' for it in a title would likely reduce accessibility and limit
broader scholarly engagement, even though the Yue element is precisely
'what makes Chinese so Vietnamese.'
Etymologically, many foundational Vietnamese words are currently
classified by historical linguists within the Austroasiatic Mon‑Khmer
(AA‑MK) subfamily, itself nested within the broader Austric linguistic
family. However, it is hypothesized that these core terms may instead
descend from a shared ancestral Yue root. This root is posited to derive
from an older Taic‑Yue substratum, a proto‑language complex that predates
and contributed to the formation of proto‑Vietic (the forebear of the
Việt‑Mường group) as well as other Daic languages. Elements of this
Taic‑Yue layer are also discernible in Chinese lects belonging to the
Sino‑Tibetan family, including Cantonese and Fukienese, suggesting a
deeper historical interconnection across the region.
In lexical practice, Sinitic‑Vietnamese and Sino‑Vietnamese function in
tandem. They complement each other across literary registers, from
classical texts to modern usage including everyday speech across diverse
social contexts. This functional parity underscores the intricate ways in
which Vietnamese is interwoven with 'Chinese', not only linguistically but
conceptually, contributing to the distinctively Vietnamese character of
Chinese‑derived vocabulary.
The scope of Sinitic‑Vietnamese sometimes extends loosely to include
other strata: forms traceable to Old Chinese (OC), also referred to as
Archaic Chinese (ArC) or Ancient Chinese (AC), and occasionally to Early
Middle Chinese (EMC) as well. It may also encompass the class of
"Tiền‑Hán‑Việt", or pre‑Sino‑Vietnamese loanwords from the pre-Qin and Han
eras, along with their Vietnamese variants, some of which may date back to
proto‑Chinese origins.
Such archaic forms belong to various pre‑Han linguistic stages,
representing ancestral precursors to OC in the pre‑Qin era, centuries
before present (B.P.). Over time, Sino‑Tibetan and Sinitic etyma
circulated bidirectionally between Chinese and ancient Vietnamese
lexicons, undergoing changes in both form and meaning, for example:
bụt, Phật, vãi: 佛 Fó (SV Phật) [ M 佛 Fó, fú, bó, bì (Phật, bột, phất, bất) < MC
but, phut < OC *bɯd || Note: Derived from 'Buddha' in Sanskrit, cf.
VS 'bụt' > SV 'Phật'. Cantonese: fat42, Wenzhou 溫州: vai42. In
Vietnamese, 'bụt' preceded the later equivalent of Buddha. ]: Buddha,
Buddhist, Buddhist monk.
The subtitle "An Introduction to Sinitic‑Vietnamese Studies"
originated as the title of the initial outline draft, first published
online in 2003. At the time, it served as a foundational guide to the
study of the Sinitic‑Vietnamese (VS) field, drawing upon available data
compiled by various authors in related disciplines, with a primary focus
on etyma. Since then, the scope of the survey has expanded significantly,
fueled by new discoveries in both Vietnamese and Chinese etymologies.
These findings reveal shared linguistic traits between the two languages,
providing a robust springboard for advancing academic achievements in this
interdisciplinary field. The author hence finds it apt to title this paper
'What Makes Chinese So Vietnamese?', reflecting the historical reality that
the Yue existed first, and it was only afterward that the Chinese emerged
on what is now the Flowery Land.
The divergence between these linguistic classifications stems largely
from their synchronic mode of analysis. For example, the term 'Sinitic',
though historically tied to the Qin State of the 3rd century B.C., is
retroactively applied to proto-Chinese formations that predate the Qin
Dynasty by millennia, reaching back beyond the Shang and Xia dynasties to
encompass over five thousand years of linguistic development.
Modern Vietnamese began to take shape in the 12th century, and the majority
of its Sinitic-Vietnamese vocabulary can be traced across the past three
millennia through Chinese historical records (Nguyễn Tài Cẩn, 1978; see
Appendix I). For the prehistoric period, however, research on the Yue origin
of Vietnamese
requires engagement with alternative hypotheses, such as those proposed by
De Lacouperie (1887) and even scholars of the Austroasiatic Mon-Khmer
school, which offer provisional frameworks for understanding deeper
linguistic relationships.
In the early 20th century, Vietnamese was classified as a
Sino-Tibetan language. Nevertheless, no notable research was undertaken to
substantiate that supposition.
To fill that gap, this research, drawing on extensive comparative
analysis, isolates newly identified Vietnamese terms attested within
Sino-Tibetan languages. The following cases illustrate how close
their etyma are:
"bồng" ~ "bế" 抱 bào (SV bão): 'carry' [ N. Ass. Midźu ba (N), Taying ba
(N) (p. 186), E. Nyising bü (p. 194) | (Haudricourt) Daic Siamese peek,
Lao ɓɛk, Shan mɛk, Tay Noir, Tay Blanc ɓɛʔ, Tho bɛk || cf. Hainanese
/boŋ2/ ]
"biển" ~ "bể" 海 hǎi (SV hải ~ VS "khơi") [ Sino-Tibetan: M.
Bur. pań-lay, Karenic *pań, Pwo pə9-lai28, Sgaw pä7-lâ7, p@7-lâ7 || cf.
Cantonese /hoi2/ for VS "khơi" as in "rakhơi" @ 出海 chūhǎi (SV xuấthải,
'set sail'), "ngoàikhơi" @ 海外 hǎiwài (SV hảingoại, 'be out at sea')
]: 'the sea',
"bò" 牝 bì (SV bí): 'cow' [ OB ba, OB E. *bik || A W. Bod.
Burig bā (p. 83), Groma, Śarpa bo (calf), Dangdźongskad, Lhoskad ba (p.
93), Central Bodish Lagate pa-, Spiti, Gtsang, Dbus, Ãba bʿa, Mnyamslad,
Dźad pa (p. 98), other Bod. languages Rgyarong (ki)-bri, -bru (p. 120),
modern Bod. dialects New Mantśati (bullock), Tśamba Lahuli (ox) bań,
Rangloi bań-ƫa (bullock) (p. 130) || also Chin. 牝 byi/ (Chin. cow, female
of animal), OB ãbri-mo (tame female yak) (p. 59), Minor group Toţo
pik-(a), Dimal pi-(a) (p. 187), Southern Branch Kukish *b@ń, Luśei b@ń,
Thado boń, Vuite -b@ń- (p. 250), E. Himalayish bʿi, Khambu pi', Lohorong,
Yakhha pik (p. 330) | for 'buffalo': Luśei pă-na, Khami *mă-na, Karenic
*-na-, Karenni pæ2-nä2, Pwo pə1-na6, Sgaw pə2-nə8, Bwe pa-nä2 (p. 414) |
(Haudricourt) Chin. ńǔ- 牛 (M níu), Siamese ŋwă, Lao, Tay Noir ńuo, Shan,
Tay Blanc ńo, Tho, Nung mɔ, Sui mo, Mak pho (p. 501) ]
not to mention other entries that happen to be recorded in the Kangxi Dictionary, such as
"ăn" (唵 ǎn, SV àm): 'eat' [ Also VS "ngậm" (hold in the
mouth) || M àn 唵 ʿām-, Luśei *um, Siamese ʿ@m (p. 71) || Note: 唵 àn is
plausibly cognate to VS 'ăn', 'eat'. As Sino-Tibetan scholars, Shafer or
Haudricourt should have used this word in place of their M hán 含 ɣām-.
The Kangxi Dictionary defines this entry as 'eat with the
hand.' ]
"nước" (淂 dé, SV đắc): 'water' [ In semantic alignment with 'water' as
defined in the Kangxi Dictionary as 'Guangyun - Entering Tone - 德·德':
淂 'appearance of water'. Also read with the fanqie 丁力切. 'Kangxi
Dictionary - Water Section - Eight': 淂 in Guangyun, read 都則切; in
Jiyun, read 的則切. Both pronounced 德. 'Yupian': means “water.” Also
glossed as “appearance of water.” Additionally, Guangyun records 丁力切,
pronounced 滴. The meaning is the same. || cf. Proto-Vietic *ɗaːk,
Cantonese /dak1/ || cf. (Haudricourt) Daic Siamese ʾnām, Shan, Sui, Mak
nam, Lao, Tho, Ahom, Tay Noir, Tay Blanc, Dioi, Mak năm, Nung ram, Bê
nɔm, Li nom, nəm (p. 482) ]
As a result, the scope of inquiry expands beyond Vietnamese–Chinese
(越漢 YuèHàn, or 'Sinitic-Vietnamese') cognates to encompass etymologies
distributed across the broader Yue and Sino-Tibetan spectra. This expanded
scope includes reflexes traceable to Old Chinese (上古漢語 Shànggǔ Hànyǔ)
and pre‑Qin-Han strata, with evidence of bidirectional lexical transfer
between ancestral Yue (越) and Sinitic (漢 Hàn) domains. In doing
so, the analysis directly challenges established Austroasiatic theories
that assert a Mon‑Khmer (MK) origin for Vietnamese. It is backed by
Sino-Tibetan ('Bod', or 蕃) etyma, which
offer substantial support to the Sino-Tibetan hypothesis. This re‑evaluation
is grounded in shared phonological
innovations, semantic correspondences, and structural patterns documented
across the Sino-Tibetan continuum, all framed within the polysyllabicity
principle for rigorous cross‑linguistic comparison.
II) Historical roots and Yue influence
Archaeological findings and early chronicles converge on a shared
narrative: Yue communities inhabited the southern reaches of what is now
China and northern Vietnam for centuries prior to Qin unification. Their
languages contributed essential phonological structures, core lexicon, and
syntactic preferences to the proto‑Vietic substrate.
These Yue—pre‑Han populations of the region—served as linguistic
architects of proto‑Vietic, supplying phonological and semantic building
blocks that later absorbed Han‑ and Tang‑era vocabulary through successive
waves of contact. The name "Việtnam" itself ('Yue people of the South')
encodes this dual inheritance. Cultural and linguistic exchange unfolded
in tandem with political annexation, particularly following the Han
conquest of NamViệt in 111 BCE. Yue‑origin forms persist in modern
Vietnamese, from zodiacal terms such as "mẹo" (卯) to agricultural and
kinship lexicon.
Traditional Austroasiatic classifications place Vietnamese within the
Mon‑Khmer branch. This chapter reconsiders that placement, presenting
phonological and semantic correspondences with Sino‑Tibetan lects that
support an alternative alignment. What Indo‑European scholars have labeled
‘Austro‑Asiatic’ was, in effect, the linguistic domain of Yue communities
inhabiting China South (華南, Hoanam) prior to the arrival of populations
who would later be called 'Chinese'. This is the case often described as
"China before the Chinese", a framing that also resonates with the
qualified question: What makes Chinese so Vietnamese?
For Vietnamese of Yue origin, and for the Vietnamese polity, the enduring
presence of the meme "Việt", that is, "Yue", represents both
survival and sovereignty of identity.
In this study, the former indigenous inhabitants are designated as
'Taic'. From this population emerged the Daic‑Kadai, the Yue, and the
Austroasiatic Mon‑Khmer, the latter incorporating both Taic and Yue
components. Later waves of migration gave rise to Sino‑Tibetan groups with
Taic and proto‑Tibetan elements; to the Han (Chinese), formed through a
fusion of Taic + Yue + Sino‑Tibetan components; and to the Vietnamese,
whose linguistic and cultural profile reflects a synthesis of Yue and Han
elements.
On the premise that prehistoric southern China was originally inhabited
by ancient Yue aborigines, early Chinese populations emerged from the
fusion of these Yue with proto‑Tibetan migrants from the southwestern
plateau, further mixing with Tartar groups from the southern periphery of
Siberia. These elements coalesced into the diverse populations of various
pre‑Chinese polities in the centuries before the Qin conquest (秦國), and
continued through successive historical transformations to shape the
demographic and cultural landscape well into the twentieth century.
Until 111 BCE, the NamViệt Kingdom stood in the south alongside the Han
Empire. In that year, however, the Han dynasty founded by Liu Bang annexed
Triệu's NamViệt, inaugurating a
prolonged era of Chinese rule and intensive Sinicization. Only in 939 CE,
after more than a millennium under Chinese dominion, did the ancient Annam
prefecture, located in what is now northern Vietnam, achieve independence
from the NamHan State (南漢國 NánHàn Guó).
As a foundation for this premise, it is widely acknowledged in academic
discourse that, roughly 5,000 years ago, the Sino‑Tibetan and proto‑Chinese
peoples were absent from the geographic regions they now inhabit.
The term 'Chinese' has never denoted a racial
category, but rather a cultural construct shaped by a historical
experience in which the prevailing mentality was that of emigrants
repeatedly seeking to leave the often repressive yet persistently
compelling polity of mainland China. This trajectory began with the Qin
Dynasty (221 BCE–207 BCE); after its collapse, its authoritarian
legacy was assumed by the Han Empire (漢朝) and perpetuated by successive
Chinese dynasties.
The discussion of Yue entities in ancient Annam gains further depth when
situated within this broader arc of early Chinese history. From this
perspective, the introduction of Sinitic elements was preceded by the
long‑established presence of Yue communities. Evidence for this sequence
is found in both cultural artifacts, such as the twelve‑animal Zodiac
system, and in lexical correspondences — for example, /krong/ 'river',
cognate with 江 (jiāng) as in 'Sông Dươngtử' 揚子江 (Yángzǐjiāng, 'Yangtze
River'), in contrast to 'Hoànghà' 黃河 (Huánghé, 'Yellow River'). Both
names were recorded by early Chinese sources for two great rivers that
have long defined China's geopolitical and cultural identity. Historically
and linguistically, these two river systems marked the boundary between
the Yue and Han spheres.
Yue‑derived forms embedded within the Sinitic branch of the Sino‑Tibetan
language family are preserved in much of Vietnam's foundational lexicon.
Notable examples include 'voi' (elephant) aligned with 為 (wéi), 'chuột'
鼠 (shǔ, 'mouse'), and 'bò' 牝 (bì, 'ox'), among others. (See Chapter 10 - Parallels with the Sino-Tibetan Languages.)
The Sinitic-Vietnamese layer of Vietnamese vocabulary developed primarily
during and after the Han colonial periods. Illustrations include "gà"
('chicken') corresponding to 雞 (jī), "buồng" ('room') 房 (fáng), 羅 (luó)
reflected in "chài" and "lưới" ('net'), and 車 (chē) aligned with "xe"
('carriage') (工). See
Chapter 11 - Vietnamese and Chinese Cognates in Basic Vocabulary
Stratum for further comparative data.
Taken together, these features attest to the deep interweaving of Yue and
Sinitic elements in the linguistic foundation of Vietnamese and support a
reconsideration of its etymological origins. The lines of inquiry outlined
here will be pursued in greater detail in subsequent chapters.
Table 1.1: Proto-Tibetan Migration and Shu Contact
Proto-Tibetan groups are believed to have originated in the highlands
of southwestern China, particularly in regions bordering modern-day
Yunnan and Sichuan.
The Shu polity (蜀國), centered in Sichuan, was known for its
early bronze culture and distinct linguistic profile.
Archaeological findings from sites such as Sanxingdui and Jinsha
reveal material assemblages unrelated to central plains cultures,
suggesting contact with highland populations.
Migration patterns inferred from burial styles and ceramic
typologies indicate northward movement along Yangtze tributaries,
consistent with the migration account given in this section.
Extinct Populations and Material Assemblages
Isolated archaeological sites in Sichuan and adjacent regions show
evidence of cultural discontinuity, abrupt shifts in material culture
that suggest population replacement or extinction.
These assemblages often include non-Han artifacts, such as
stylized masks, ritual bronzes, and unique pottery forms.
Linguistic extinction is inferred from the absence of direct
descendants in modern Sino-Tibetan languages, though substratal
influence may persist in phonology and syntax.
In a more remote epoch, Proto‑Tibetan groups—originating in the
southwestern highlands of ancient China—migrated northward, interacting
with indigenous communities along the periphery of the Shu polity (蜀國)
in present‑day Sichuan. Their migratory paths extended toward the
northeastern tributaries of the Yangtze River. Archaeological evidence
from isolated sites, distinguished by unique material assemblages,
indicates that these populations have since become extinct.
The fusion of Taic‑Yue aboriginals with Proto‑Tibetan nomads migrating
from what is now southwestern China ultimately gave rise to the broader
Sino‑Tibetan ethnolinguistic complex. This included the proto‑Chinese
founders of the Xia Dynasty, dated to nearly 5,000 years ago.
According to both legend and the Chinese historical record, these
populations established the Yin polity (殷朝, "NhàÂn", 1600 B.C.–1046
B.C.), initiating the Yin‑Shang Dynasty. Between approximately 1225 B.C.
and 1220 B.C., the Yin are recorded as having invaded ancient Annam.
Over the subsequent two millennia, pre‑Chinese populations merged with
Taic‑Yue communities, forming the ethnolinguistic matrix later
identified as 'Chinese' well before the pre‑Qin‑Han consolidation. Among
the Yue were lineages diverging from the same Taic substratum as the
founders of the Chu polity, including ancestral Zhuang (百), communities that later
established both the Yue (越國) and Eastern Yue (東粤) states.
As the Yin ("Ân") advanced southward, Yue populations were displaced,
migrating deeper into the southern regions. The Qin‑Yue admixture, shaped
over successive millennia, dispersed along both northern and southern
migratory corridors. These trajectories extended from a pivot in
present‑day Yunnan through Zhejiang and Fujian provinces; turning
southward, they traversed Hubei, Jiangxi, and Jiangsu, ultimately reaching
territories now encompassed within the Austric, Austronesian,
Austroasiatic, and Austro‑Thai hypotheses, both in anthropological and
linguistic classification. Across this expanse, the languages exhibit
demonstrable relatedness; divergences arise primarily from the
multiplicity of nomenclatures under which they have been categorized (cf.
Terrien de Lacouperie 1965 [1887]).
In modern taxonomy, 'Chinese' lects and their dialects and sub-dialects
are classified under Sinitic—not because Sinitic predates Yue, but because
the designation reflects their Sino-Tibetan affiliation. Likewise, as used
here, the term Yue (越) (M), or more
precisely, 'Viet', does not imply that either the ancient Yue aboriginals
or the modern descendants of the "LạcViệt" (雒越, LuoYue) constitute a
homogeneous ethnolinguistic population.
To clarify: the Yue, whether Eastern Yue (東越) in the Zhejiang region or
Southern Yue (南越) in Guangdong—corresponding to the Wu groups and
Cantonese speakers respectively—were not direct ancestors of the modern
Daic peoples. Rather, both descended from a shared ancestral Taic lineage.
This same origin plausibly extends to the Chu State and the NamViệt
Kingdom, by analogy. That explains the mutual unintelligibility of their
lects. Nevertheless, their shared Yue features and indigenous etyma across
the languages of China South and North Vietnam produced numerous cognate
doublets, many of which are preserved in the Chinese classical tradition.
For example, the Kangxi Dictionary (康熙字典) records 淂 dé (SV "đắc"),
linked to Old Viet /dák/ 'water', alongside 水 shuǐ (SV "thuỷ", "đák"; cf.
踏 tà, VS "đạp"), meaning 'water' or 'river', which also appears as 川
chuān (SV "xuyên") and 江 jiāng (SV "giang") for Vietnamese "sông"
('river').
These doublets preserve vestiges of archaic speech from native populations
within the bounds of ancient states later absorbed into the Chinese empire,
including the states of Shu (蜀國 'NướcThục'), Chu (楚國 'NướcSở'), Yue
(越國 'NướcViệt'), and the NamViệt Kingdom (南越王國 'Vươngquốc NamViệt').
Their territories were home to ethnically composite groups such as the Luo
Yue (雒越 'LạcViệt'), Xi'Ou (西甌 'TâyÂu'), Ou Yue (甌越 'ÂuViệt'), Dong'Ou
(東甌 'ĐôngÂu'), and MinYue (閩越 'MânViệt'), tribal confederations of
considerable diversity.
Table 1.2: ÂUVIỆT
The ÂuViệt or OuYue (Chinese: 甌越) was an ancient conglomeration
of Baiyue tribes living in what is today the mountainous regions of
northernmost Vietnam, western Guangdong, and northern Guangxi,
China, since at least the third century BCE. They were believed to
have belonged to the Tai-Kadai language group. In eastern China, the
Ouyue established the Dong'Ou or Eastern Ou kingdom. The Western Ou
(西甌; pinyin: Xī'Ōu; Tây meaning "western") were other Baiyue
tribes, with short hair and tattoos, who blackened their teeth and
are the ancestors of the modern upland Tai-speaking minority groups
in Vietnam such as the Nùng and Tay, as well as the closely related
Zhuang people of Guangxi.
The ÂuViệt traded with the LạcViệt, the inhabitants of the state of Văn
Lang, located in the lowland plains to ÂuViệt's south, in what is
today the Red River Delta of northern Vietnam, until 258 or 257 BCE,
when Thục Phán, the leader of an alliance of ÂuViệt tribes, invaded
Vănlang and defeated the last Hùng king. He named the new nation
"ÂuLạc", proclaiming himself "Andươngvương" (literally "Peaceful
Virile King"). The origins of Thục Phán are uncertain. According to
traditional Vietnamese historiography, he was the prince or king of
the Kingdom of Shu (in modern Sichuan). However the kingdom of Shu
was conquered by the Qin in 316 BCE, making it chronologically
improbable that Thục Phán was Shu royalty a hundred years later.
There may be some merit to the story due to archaeological evidence
of cultural ties between Yunnan and the Proto-Vietnamese, but
possibly as a result of the gap in time between the origin of the
story and when it was recorded, the location could have been changed
to Shu or simply mistaken due to erroneous geographical knowledge.
According to a translated oral account of a Tày legend, the western
part of ÂuViệt's land became the Namcương Kingdom, whose capital was
located in what is today the Caobằng Province of Northeast Vietnam.
It was there that Thục Phán hailed from. The authenticity of this
account is considered suspect by some historians. It was published
in 1963 as a translation while no extant copy of the original Tày
text exists. The title of the story contains many Vietnamese words
with slight tonal and spelling differences rather than Tai words. It
is uncertain what text the translation originated from.
According to Chinese historians:
The Qin Dynasty conquered the State of Chu, unifying China. Qin
abolished the noble status of the royal descendants of the State of
Yue. After some years, Qin Shihuang sent an army of 500,000 to conquer
the West Ou. After three years, Qin forces killed West Ou chief
Yiyusong (譯籲宋). Even so, West Ou waged guerilla warfare against Qin
and slew Qin commander Tu Sui (屠睢) in retaliation.
Before the Han Dynasty, the East and West Ou regained independence.
The Eastern Ou was attacked by the MinYue Kingdom, and Emperor Wu of
Han allowed them to move to the region between the Yangtze and the Huai
rivers. The Western Ou paid tribute to NanYue until it was conquered
by the Han. Descendants of these kings later lost their royal
status. Ou (區), Ou (歐) and Ouyang (歐陽) remain as family
names.
According to Vietnamese historians:
257 BCE, Andươngvương 安陽王 unified the LạcViệt tribe
(Austroasiatic) (chiefdom) of Hung Kings 雄王 (Hùngvương) with his
ÂuViệt tribe (Tai-Kadai) (chiefdom) into a single tribe (The ÂuLạc
chiefdom).
208 BCE, Zhao Tuo captured ÂuLạc and incorporated it into his Han
kingdom of NanYue, which was ruled by the Han Dynasty.
Prior to the first century B.C., the Chinese-Han population had
already emerged as an anthropological fusion of proto-Tibetan groups
and Yue indigenous peoples, forming the core population of the Qin
state. This population was drawn from six other ancient states, with
a notable contribution from Chu subjects, including Daic and Yue
peoples.
Following the annexation of the NamViệt Kingdom into the Han Empire
in 111 B.C., the Yue people became further intermixed with Han
subjects, expanding beyond the Lingnan southern region. This process
of integration repeated itself continuously across both space and
time.
The ethnic composition of the Han populace likely preserved the same
proportionate racial fusion that had characterized the Chu polity by the
time the Han Empire was founded. However, the total population must have
declined due to the preceding wars. Crucially, Yue-Daic elements remained
predominant among Han subjects following the fall of Chu, given that Chu
itself had originated as a Daic polity. This continuity is historically
significant: the founding emperor of Han, Liu Bang (劉邦), along with his
generals, sub-commanders, and much of the infantry, were originally Chu
fighters who had resisted the Qin army prior to the Han's eventual
triumph. This fact merits renewed emphasis for its broader ethnohistorical
implications.
The term Han and Han-related designations originated from Hanzhong (漢中,
SV "Hántrung"), a remote enclave in present-day Shaanxi Province. Liu Bang
(劉邦, SV "Lưu Bang") had been appointed viceroy of Hanzhong by General
Xiang Yu (項羽, SV "Hạng Võ"), the last Duke of Chu, acting on behalf of
the final King of Chu. However, Liu Bang and Xiang Yu later turned against
one another (楚漢戰爭, 206–202 B.C.), and the victorious Han faction
subsequently dissociated from Chu heritage, identifying themselves as the
"Han people"—that is, the followers of the Hanzhong viceroy.
As a result, 'Han' entities emerged alongside terms such as 'Chinese'
(from 'China') and 'Sinitic' or 'Sino' (from 'Qin'). While alternative
names such as 'Cathay', 'Tang', or 'Qing' have also been applied to the
Han entity, their actual racial composition reflects a unification of
states within China, shaped by centuries of admixture and regional
consolidation. (C)
Figure 1.1: Map of territories of dynasties in China Source: https://en.wikipedia.org/wiki/File:Territories_of_Dynasties_in_China.gif
More than 4,000 years later, the subjects of the newly unified Qin State
(秦國, 206 B.C.) included the Taic-Daic peoples of Chu (楚國) and the Yue
descendants who formed the southern Yue State (越國) and the vassal State
of Wu (吳國), as recorded in the Chinese chronologies of the Spring and
Autumn (770-476 B.C.) and Warring States (475-221 B.C.) periods. Prior to
the Western Zhou era, the aforementioned southern Yue polities had already
existed in a tributary relationship.
After the Han faction supplanted the Qin State and consolidated power over
the Middle Kingdom, including its southern territories, the newly
established Han Empire required all subjects to adopt the official court
language (X). That language, reportedly spoken
by Liu Bang, the founding emperor of the Han Dynasty, was likely a Chu
dialect. This was a Taic-Daic language reflecting his origins in the Chu
realm. The Yue peoples, along with the ancestral Zhuang populations in the
far south, adopted this language following the annexation of the Nam Viet
Kingdom in 111 B.C. In this compound, 'Nam' (南) means 'south' and 'Viet'
(越) means 'Yue' (Cantonese /jyut6/). In antiquity, these characters were
pronounced similarly in Vietnamese of the Luo-Yue branch and in Cantonese of
Eastern Yue. That phonological correspondence has been preserved in
historical transcription.
Many of the shared Yue etyma can be traced to remote antiquity, when
proto‑Tibetan and ancestral Yue languages came into contact with the later
Viet‑Muong language, which itself has roots in the Taic linguistic family.
Proto‑Yue languages were once widely spoken by the aboriginal peoples of a
vast region in South China, and their domains extended into parts of
northern China along the banks of the Yangtze River (揚子江, also known as
長江). These areas formed the natural habitats of ancient Taic indigenous
peoples, who in time gave rise to the Yue people, known as "Bách Việt"
(百越) or Bai Yue, a term possibly derived from the Tibetan "Bod". (華)
In the Vietnamese case, successive north-south migrations across both
geographic and historical scales displaced indigenous populations from
their fertile lowland settlements into less arable, mountainous zones.
Following the Qin-Han period, incoming settlers from southern China
introduced their own languages, which gradually blended over the
subsequent millennium with local Tai-Yue speech forms, namely Dai, Thái,
Tày, Nùng, alongside other Viet-Mường dialects.
Prior to 939, when both Annam and Canton remained under the rule of the
NamHan Kingdom (南漢帝國), the speech of their inhabitants appears to have
been mutually intelligible, at least through a vernacular form of regional
Mandarin. During this period, Annamese scholars actively participated in
administrative affairs and literary production with the Tang imperial
court. This is evidenced by the literary record and the development of a
fully articulated Hán-Việt (漢越) or Sino-Vietnamese lexicon, presumably
transmitted directly from Middle Chinese, particularly during the
289 years of Tang rule.
Historical records indicate that large-scale migration from southern
China into "Annam" occurred not only during the millennium of Chinese
colonial administration (111 B.C.-939 A.D.), but also well into the modern
era, continuing beyond 1949 and, notably, spilling over into the 21st
century, when Chinese laborers have been seen establishing Chinatown-style
enclaves across the country.
Such ease of communicative transition is exceptional in this context when
compared to other Mon-Khmer speakers, with the partial exception of later
contact effects on neighboring Mường groups. These communities, having
diverged from earlier Viet-Muong populations that resisted Han
colonization, withdrew into remote highland zones where they coexisted
with Mon-Khmer speakers. This interaction resulted in distancing lowland
Yue-specific commonalities from Mon-Khmer lexicons, whose resemblances
appear to have emerged only through later contact. Archaeological and
historical evidence suggests that Mon-Khmer groups migrated into the Red
River Delta approximately 6,000 years ago (Nguyễn Ngọc San 1993, p.
43).
Over time, the Annamese vernacular retained only a limited proportion of
Yue elements. The long process of Vietnam's national formation began
with the biological composition of its early population, descended from
the racially mixed Yue, LạcViệt, XiLuo, and OuLuo communities of the
Nam Việt Kingdom. Consequently, it is unlikely that the ancestors of the
Vietnamese remained genetically pure descendants of the original Yue
tribes, even before the more than one thousand years of Chinese rule that
ended in 939 A.D.
The linguistic traits introduced by these immigrant populations,
including tonal patterns and phonological features characteristic of
Cantonese, Hainanese, Chaozhou, Amoy, and Hokkien, entered
Annamese as integrated structural components, rather than as merely
external overlays influencing tone or syllabic configuration. A comparable
process of contact-driven change can be observed in the development of
Cantonese.
From a linguistic perspective, the exchange of vocabulary between host
and migrant communities rendered native Yue elements complementary to the
expanding Sinitic domain, rather than replacing its structural
foundations. This process is broadly comparable to the way Chinese lexical
material was assimilated into the formation of new Japanese
concepts.
Following independence, the Vietnamese population, then referred to as
the Annamese, established a sovereign polity corresponding to present-day
Vietnam ("越南" Yuè-Nán), literally 'the Yue of the South'. This
interpretation stands in contrast to the mistaken view that "越" signifies
'advancing to the south', a misconception rooted in its semantic
association with 'advance' or 'surpass'. In fact, ancient Chinese
transcriptions of 'Việt' ("越") include variant graphs such as "戉", "粵",
and "鉞", each denoting implements or weapons resembling an axe that
are likewise associated with "Yue". This distinction is significant, as it
separates the early ethnonymic identity, as originally conceptualized, from
the territorial expansion that occurred long after the 10th century.
Over successive millennia, and through a sustained southward migration
from a polity referred to as "Vănlang"—likely a transcription of the early
sound '賓郎 Bīnláng' [← 'blau' = 'trầu', cf. "檳城 Bīnchéng" ('Bếnthành' ~
'Penang') or 'betel']—located in what is now northern Vietnam, the later
Vietnamese emerged as a composite population, hybrid in origin,
incorporating Chamic and Mon-Khmer elements along the migratory corridor.
Archaeological and anthropological evidence consistently supports this
view, framing modern Vietnamese ethnogenesis as the result of stratified
admixture rather than a linear descent from preexisting ethnic groups in
either the north or the south.
Just as no population can claim to be ‘purely Chinese’, there is
no entirely ‘pure’ Vietnamese lineage. Vietnam's history is marked by
the fusion of Chinese settlers and Southern Yue communities, many
hailing from what is now southern China. The very name ‘Việtnam’,
translating as “Yue people of the South”, embodies this shared legacy.
Unlike ethnic Chinese communities elsewhere in Southeast Asia, those in
Vietnam integrate readily; within two generations at most, descendants
born and raised on Vietnamese soil commonly self-identify as Kinh in
census data. Through successive waves of southward settlement and
integration with indigenous groups, these blended communities coalesced
into the Kinh majority that defines contemporary Vietnam.
As previously noted, the early Annamese population emerged through
centuries of intermixture between indigenous Yue groups and Han colonial
settlers, culminating in the formation of the Kinh (京族, Jīngzú; VS
"tộcKinh"), the descendants of this complex hybridity. The enduring
interaction between these communities became a recurring theme in
nationalist discourse, particularly given the many Han settlers who fled
the upheavals accompanying dynastic transitions in China, from the fall of
the Tang Dynasty in the 10th century to the rise of communist rule after
1949, and who remained permanently in the southern territories.
This demographic pattern has continued into the contemporary period.
Reports indicate that since 1990, over one million mainland Chinese have
established permanent residence in Vietnam, according to figures compiled
from annual Chinese diaspora assemblies held in major Vietnamese cities
(see factsanddetails.com).
This deep historical interconnectedness explains the shared etyma derived
from a common ancient linguistic substrate, close enough that some have
inferred Vietnamese etyma evolved from Cantonese. In fact, both languages
share a substantial Middle Chinese inheritance from the Tang period,
reinforced by large‑scale migrations during the 'An Lushan Rebellion'
(755–763), which devastated the Central Plain and sent many northerners to
the Lingnan region. That is why early communities of present‑day Cantonese
speakers identified themselves as 'Tang people' (唐人, Tong4jan4), whereas
the Vietnamese ethnonym "Việtnam" (越南), again, literally means 'the Yue
people of the South'.
Regarding the proto‑Vietic language, the split within the Viet‑Muong
groups marked a decisive divide between indigenous people who resisted Han
occupation of their ancestral land and those who submitted to and
collaborated with Chinese colonizers. In a manner comparable to the
evolution of Cantonese speech, early Sino‑Vietnamese forms were actively
integrated into the ancient Vietic language, which over time developed
into early Annamese. This process unfolded over centuries and culminated
in the Middle Vietnamese period, particularly through the absorption of
Tang‑era linguistic variants by the emerging "Kinh" elite. It involved the
localization of Middle Chinese vocabulary and expressions, together with
gradual, nuanced changes in phonology, syntax, and semantics.
The entire process likely began before and extended well beyond the fall
of the Tang Dynasty (618–906). It entailed the adaptation and localization
of Middle Chinese lexical stock during periods of colonization, aligning
with the broader evolution of Chinese lexicography, a trajectory shaped by
shifting patterns of phonological and semantic crystallization across the
Han and Tang dynasties (Tang Lan, 1965, p. 110).
Because Sino‑Vietnamese and Cantonese share the same historical background
and both originate from Middle Chinese, their sound‑change patterns appear
to have followed similar phonological paradigms in literary contexts, such
as literature and scholarship, as well as in spoken forms. This parallel
evolution persisted until at least the 10th century, after which the two
languages diverged from their shared path.
During their shared period, both made use of Middle Chinese as the lingua
franca of the NamHan Kingdom. Over time, their respective vocabulary
stocks either disappeared because of lexical redundancy in the form of
doublets or stabilized into distinct forms, as seen in Sino‑Vietnamese on
one hand and the so‑called 'Tang language', now commonly associated with
Cantonese, on the other.
For general readers, Sinitic‑Vietnamese etyma often appear
indistinguishable, not only to novices without formal linguistic training,
but even to language educators, particularly in matters of sound change
and etymological divergence between formal and colloquial registers. This
observation is based on the author's survey of bilingual teachers in
general subject areas, such as language arts and ESL, in U.S. schools.
Many of these teachers candidly acknowledged that they had never noticed
lexical correspondences between the two languages in literary lexicons.
For example, none were aware that the SV "quốcgia" (國家 guójiā, 'nation')
directly matches Cantonese "/gok7ga5/" ('nation'), let alone that its
vernacular Vietnamese synonym "nướcnhà" carries the same meaning.
In the case of the latter, laypersons with some exposure to historical
linguistics may recognize such correspondences when they are explained
through regular patterns of sound change, yet they often resist the idea
that "nướcnhà" shares a common root with "quốcgia". This resistance is
partly rooted in a poetic interpretation of "nướcnhà" as a compound of
"nước" ('water') and "nhà" ('home'), reflecting an idealized vision of
Vietnam as a land of virtuous governance cherished by Confucian scholars
who loved to compose Tang poems. Such a reading, however, implicitly denies
the Chinese etymology of 水 (shuǐ, SV "thuỷ") and
家 (jiā, SV "gia"), as well as the compound 國家 (guójiā), despite the
well‑known early 20th‑century classroom chant in village schools "gia/nhà,
quốc/nước" from the primer Tam Thiên Tự Kinh
('Prime Book of 3000 Characters'), in which the latter pairing conveyed an
abstract sense approximating 'country'.
While the poetic interpretation is semantically plausible, it obscures
the phonological continuity linking "nướcnhà" to "quốcgia" and Cantonese
"/gok7ga5/". Adding further complexity, the more recent form "nhànước",
meaning 'ruling body of government', reverses the original
syllabic‑morphemic order, creating an inversion that introduces yet
another layer of morphological and semantic development.
Long after the NamHan Kingdom ceased to exist in 971, and despite
Annam's separation from its control in 939, Cantonese and Sino-Vietnamese
may still have retained notable phonological similarities inherited from
late Tang-era speech. By that time, however, the two languages were
already distinct, much as their divergence is evident today. A
comparable situation is observed in the localized variant of Cantonese
spoken in the Guangxi Autonomous Region, known as Baihua (白話).
This transformation resulted from layered ethnic blending with migrants
from northern regions of the Tang Empire. Southern China, especially the
Guangzhou prefecture, experienced major influxes of settlers due to
upheavals such as the An Lushan Rebellion during the reign of Emperor Tang
Ming Huang. Widespread famine further altered the region's demographic
balance. The conflict led to mass displacement and mortality, as
documented by Bo Yang (1982–1992, Vol. 49).
Meanwhile, Cantonese speech underwent repeated phases of transformation
shaped by surrounding sociohistorical forces. Until the tenth century, it
is plausible that Cantonese speakers in Guangzhou and Annamese speakers in
Tonkin could still communicate using Sinicized speech forms, such as Yue's
Baihua, as previously noted in accounts of interaction between the peoples
of Guangdong and Guangxi, not to mention their shared stock of basic
vocabulary. For instance, within the aboriginal Yue substratum, several
foundational etyma shared by Cantonese and Vietnamese—such as "lưỡi" 脷
(tongue) [Cant.: /lej6/], "bông" 花 (flower) [Cant.: /fa1/], "biếu" 畀
(give) [Cant.: /pej3/], "khui" 開 (open) [Cant.: /hoj5/], "xơi" 食 (eat)
[Cant.: /sik8/], "uống" 飲 (drink) [Cant.: /jam3/], "thấy" 睇 (see)
[Cant.: /taj3/], "đéo" 屌 (curse) [Cant.: /tjew3/], and "ỉa" 屙 (defecate)
[Cant.: /o5/]—represent only a small portion of the indigenous Yue layer
that persists across both languages.
Similarly, while Cantonese retains the Middle Chinese‑derived
pronunciation of 走 as "zow3" meaning 'go', the Sino‑Vietnamese term "tẩu"
(təw3) has shifted in modern usage to "chạy", denoting 'run'. This latter
sense aligns with Mandarin "qù" and Cantonese "hoeỉ3/hoeỉ2", and is
etymologically linked to 去 qù (SV "khứ"). The connection extends to a set
of Sino‑Vietnamese doublets such as "khu", "khử", and "khứ", along with
variants including "khừ", "khự", "khử", "khứa", and "đi", as well as the
Hanoi sub‑dialect form /xɨ5/. In the Han‑period stage of Ancient
Chinese, these terms shared a unified core meaning. Over time, however,
their semantic range broadened to encompass related notions such as
'eliminate', 'get rid of', and 'cut off'.
Additional etyma likely reflect remnants of the Taic-Yue substratum found
in both Vietnamese and Cantonese. These languages emerged from distinct
Tai-Kadai branches long before their speakers were unified under the
NamViệt Kingdom in 204 BCE. For example, the term 雞公 (jīgōng) for
'rooster' corresponds to Vietnamese "gàtrống" and to archaic Cantonese
/kaj5koŋʷ1/. This rare but compelling correspondence offers evidence of a
shared Yue linguistic affiliation at a substratal level, suggesting that
both forms derived from the same source prior to Sinicization.
The modern grammatical pattern in which an adjective precedes the noun it
modifies, as in Mandarin gōngjī (公雞), syntactically equivalent to the
English 'male bird', reflects Sinitic elements that developed atop an
aboriginal Yue substratum. In Vietnamese, the corresponding term is
"gàcồ", which follows the [noun + modifier] order. It is likely that in
earlier stages of language development — when both systems were still in
the formative phase of polysyllabicity during the late Ancient or Early
Middle Chinese period — the two languages shared a much greater degree of
structural similarity, particularly in the official court languages
beginning in the Han colonial era.
As disyllabic words became more common, along with synonymous
constructions, Sinitic speakers began to differentiate homophones by
placing modifiers before the main morphemic syllable to create new
polysyllabic words. Vietnamese, in contrast, retained a Yue speech habit
that tended to reverse this order, placing the noun before the
modifier.
From a linguistic standpoint, Vietnamese speakers across social
backgrounds can readily acquire the Cantonese dialect or approximate
its pronunciation, reflecting a degree of accessibility between the
two languages. The prominent Sinitic elements in Cantonese are broadly
similar to those found in Sino-Vietnamese lexemes, as both developed
through sustained contact with northern migrants from successive
Chinese dynasties over the past two millennia. Despite this influence,
Cantonese speakers and their language remain distinct—not only from
Vietnamese, but also from other regional varieties within China.
Over time, the ethnic composition of Cantonese communities became
increasingly mixed due to the influx of migrants from northern
regions of the Great Tang Empire. This process continued as mainland
China came under the control of various northern rulers, including
the Khitans of the Liao, the Jurchens of the Jin, the Mongols, and
the Manchus. Notably, the Cantonese population was
primarily composed of descendants who identified as Tang subjects,
particularly from the seventh century onward. These descendants are
recognized as the Hoa ethnic group in Vietnam and other parts of
Southeast Asia.
In contemporary usage, Vietnamese and Cantonese no longer exhibit the
semantic and syntactic parallels they once shared. For example, the
modern Vietnamese term "gàtrống" contrasts with its present-day Cantonese
counterpart "gung1gai1" (公雞), a divergence that reflects historical
shifts in linguistic affinity.
These differences are further shaped by varying degrees of Chinese
influence. The impact of Han Chinese, both prior to 111 BCE and during the
Middle Chinese period beginning in the seventh century, left enduring
phonological and semantic imprints. For example, Vietnamese continues to
use "đôiđũa", a term cognate with Han Chinese zhùzi (箸子) for
"chopsticks". In contrast, Cantonese, like Mandarin, avoids the term
箸 because of its phonetic resemblance to "đổ" 倒 (dǎo, SV "đảo"), meaning
'capsize', which carries negative connotations. Instead, it favors kuàizi
(筷子), or faai3zi2, where 筷 is homophonous with 快 (kuài, VS "mau"),
meaning 'fast', a term associated with auspiciousness in southern Chinese
culture, particularly in regions where boat travel was historically
common.
Although Cantonese, like Vietnamese, preserves ancestral Yue substratal
elements, it is still classified within the Sino‑Tibetan language
family. This classification is grounded primarily in its substantial
Middle Chinese lexical stratum, which outweighs the influence of ancient
Yue etyma. Throughout its history, Cantonese has remained firmly within
the Sinosphere, with a continuous lineage as a living language traceable
at least to its historical presence during the era of Zhao Tuo of NamViệt
Kingdom, later reinforced by waves of immigrants during the flourishing of
the Tang Empire. It is therefore unsurprising that Cantonese has been
informally referred to as "the Tang language" (唐話, tong4waa6‑2).
The placement of Cantonese in the Sino‑Tibetan family is therefore
well‑founded, resting on both quantitative and qualitative considerations.
As noted earlier, apart from the limited number of fundamental lexemes that
Cantonese shares with Sinitic‑Vietnamese, the core vocabulary that both
Cantonese and Sino‑Vietnamese derive from the same Middle Chinese source
is substantial. This common origin reinforces Cantonese's inclusion in the
Sino‑Tibetan framework and, by extension, invites a reassessment of
whether Vietnamese might also be situated within this
classification.
The present task, then, is to advance comparative analyses that assess
the position of Sino‑Vietnamese and Cantonese in the broader context of
Middle Chinese historical linguistics. Anthropologically, in
considering the Yue‑before‑Sinitic substratum, both Zhuang and Vietnamese
traditions suggest that the Vietnamese (越, Việt) and Cantonese (粵, Jyut)
peoples may have descended from distinct branches of the Yue (戉) prior to
the second century BCE (cf. Truyệncổ Dòng BáchViệt
- dchph, on the legend of the magic sword Thần cung Bảo kiếm). The earlier
Jyut-speaking communities, associated with Báihuà (白話), were likely of
Zhuang (壯族) origin, expanding from Guangdong (廣東) into what is now
Guangxi (廣西). The correspondence between these two toponyms reinforces
the linkage between TâyÂu (西甌 Xī’Ōu) and ĐôngÂu (東甌 Dōng'Ōu), wherein
the phonological parallel of 壯 (OC /ʔsraŋs/) and 廣 (OC /kʷaːŋʔ/)
reflects a pattern of regional continuity. The Zhuang self‑designation
/Bố‑/ stands in contrast to the /Bod/ ethnonym discussed earlier.
This distribution of BaiYue tribes encompassed the region historically known
as the Southern Mountainous Range (嶺南道 Lingnan Dao). Notably, a lexical
chain links terms such as Bốchuang, Bốthổ, Bốỷ, Bốbản, and Bốviệt with the
etymon Bod, which is cognate with BaiYue, BáViệt, and
BáchViệt—names once used to designate indigenous populations.
III) Linguistic evolution through dynasties
This section investigates Sinitic-Vietnamese terminology whose etyma are
traceable to Old Chinese, a historical branch of the Sino-Tibetan family. It
also explores foundational Vietnamese cognates attested across Sino-Tibetan
languages that appear to descend from the ancient Taic-Yue linguistic
complex—a substrate that flourished throughout China South long before the
emergence of Chinese civilization.
Sinitic-Vietnamese development proceeded through successive
dynasties:
Han period: Initial Old Chinese loans, particularly in
governance, military, and agriculture.
Tang period: Enrichment from high‑register Middle
Chinese, reinforcing tonal and phonological complexity.
Post‑Tang independence: ChữNho retained as the prestige
written medium; chữNôm created for vernacular literature.
Colonial to modern: Romanized Quốcngữ script
codified all registers, that is, literary Sino-Vietnamese, vernacular
Sinitic Vietnamese, and indigenous vocabulary, into a unified orthography.
To fully appreciate the argument presented above within a historical
timeline, we must examine both the prehistoric and historical periods in
China and Vietnam, a perspective that the Austroasiatic theory overlooks. A
historical review of Yue entities is essential for understanding that modern
Vietnamese emerged as a very late product; moving beyond a strictly
Mon-Khmer framework reveals many fundamental Vietnamese words with
Sino-Tibetan etymologies, thus reviving the former Sino-Tibetan theory that
began to emerge in the late 19th century but has yet to be fully realized in
the 21st century.
First, let us establish a historical picture of the prehistoric era,
approximately 5000 years BP, when the indigenous Yue, that is, the Taic or
proto-Yue, terms used before these groups were later designated as Yue (越,
粵, 戉, 鉞, etc.) in Chinese history, inhabited southern China. This period
predates the arrival of itinerant proto-Tibetan nomads in search of fertile
lands. Later proto-Chinese resettlers, who were formidable warriors
conquering on horseback, colonized and subjugated the indigenous vassal
states across the fertile mainland. Over time, successive dynasties,
including the Xia (夏), Yin (殷, SV "Ân"), Shang (商), and Zhou (周),
brought under their control indigenous states such as Qin (秦 SV "Tần"), Chu
(楚), Yue (越), Wu (吳), Yan (燕), and Qi (齊), with all these entities
eventually subjugated by the Zhou kings (1045 B.C.–256 B.C.). By the end of
the Eastern Zhou period, in 221 B.C., the Qin state had conquered its
remaining opponents, forging the first unified Middle Kingdom, later known
as China. Etymologically, the term 'China' derives from variants such as
'Cin' and 'Chine', which in turn originate from 'Qin' (秦).
The brief Qin Dynasty (秦朝, 221–207 B.C.) was succeeded by the Han Dynasty
(漢朝), founded by Monarch Liu Bang (劉邦), known as Han Gaozu, who emerged
victorious in the final battle against the resurgent Chu State in 206 B.C.
to claim the imperial crown of the nascent Flowery Empire. Meanwhile,
in the war-torn southern region, Triệu Đà (趙佗 Zhào Tuó), formerly a Qin
general and viceroy, gathered breakaway Yue colonies from southern China and
established the NamViet Kingdom (南越 王國, "NamViệt Vươngquốc") in 204 B.C.,
a polity that endured for 93 years (see Keith Weller Taylor, The Birth of Vietnam [1983], as quoted by Bùi Khánh-Thế in APPENDIX I).
The emergence of Vietnamese statehood can be traced to the period following
111 B.C., when the Han Dynasty annexed the NamViet Kingdom. The region
that would later be known as "Annam", a name derived from the Tang-era
administrative unit 'Protectorate of the Pacified South' (安南 都護府, SV
"Annam Đôhộphủ"), was subsequently absorbed into the Chinese empire and
governed by successive dynasties for nearly a millennium. This imperial
control persisted until the early 10th century, when the collapse of the
Tang Dynasty in 906 A.D. fractured the empire into nine independent states.
Amid the ensuing fragmentation, the people of Annam broke free from
the disintegrating NamHán Kingdom (南漢 帝國) and established an independent
polity in 939 A.D. (Bo Yang, Sima Guang Zizhi Tongjian, Vol. 69, p. 209,
1993).
Following independence, the former Annam territory was renamed ĐạiViệt (大越)
in 1054 and later Việtnam (越南) in 1804. Vietnam stands apart as the only
state founded by early descendants of the proto-Yue peoples, ancestral to the
later Sinicized Yue populations across southern China. In contrast, other Yue
groups in the region, now identified as the Cantonese in Guangdong, the Wu in
Jiangsu, the MinNan in Fujian, the Zhuang in Guangxi, the Gan in Jiangxi, and
various ethnic communities throughout Yunnan, Guizhou, and neighboring
provinces, were all gradually absorbed into the Chinese imperial structure over
successive dynastic periods.
The ancestral subjects of the ancient NamViet Kingdom who settled in Annam
actively participated in the struggle for independence and endured over a
millennium of Chinese domination and successive invasions. Despite repeated
subjugation by every Chinese monarch, and later by heads of state from a
rising empire that has long exerted influence over the region, from Mao
Zedong and Deng Xiaoping to Jiang Zemin and the current,
indefinite-term General Secretary-President Xi Jinping, Vietnam retained her
sovereignty. This history underscores the enduring impact of Chinese
political dynamics on territories south of the border.
As a matter of fact, while the Middle Kingdom often succeeded in
suppressing internal uprisings, it developed a recurring tendency to lose
wars to foreign invaders. Among these were the Khitans (契丹), Jurchens (女真),
Mongols, and Manchus, who each went on to establish ruling dynasties in
China: the Liao (遼), Jin (金), Yuan (元), and Qing (清)
dynasties, respectively.
Changes in dynasties within the Middle Kingdom have led the outside world
to recognize the region under one common name, China. In discussions of
"Sinicization," the transformative power of Chinese heritage and culture is
inescapable, as it has long absorbed foreign elements and made them integral
to its identity. For example, the official court language, Mandarin (官話),
was adopted by various regimes of northern origin, including the Liao, Jin,
Yuan, and Qing dynasties, all of which were led by Tartar or Turkish-derived
elites. Linguistically, Mandarin absorbed numerous foreign influences: its
original eight-tone system was reduced to four tones under the impact of
non-tonal Altaic languages, and final consonants such as /-p/, /-t/, and
/-k/ disappeared, changes that departed markedly from its ancestral Middle
Chinese characteristics. Despite these shifts, Mandarin evolved into
Putonghua, today's national language of China, reflecting its adoption and
adaptation by predominantly northern rulers.
After Qin unification, Sinitic elements circulated back into major Yue
lects. Wu, Min (Hokkien or Fukienese), Cantonese, and Vietnamese
progressively absorbed these features, layering them over an older Yue
foundation and producing highly Sinicized Yue speeches (cf. Comparative
Sino-Tibetan Etymologies). This cyclical traffic helps explain Vietnamese cognates aligned with
Sino‑Tibetan fundamental etyma. Like other Yue lects, the ancient Vietic
language participated in shaping the Sinitic subfamily until Annamese
diverged following political independence in the 10th century.
Over subsequent centuries, Yue roots embedded in Old and Ancient Chinese
resurfaced across Sinitic languages in repackaged forms. Alongside broadly
comparable tonal systems (from roughly three to ten tones), many items
vary only subtly in regional articulation. The pattern is especially clear
in lexical doublets, words tracing to the same ancestral root, notably
from proto‑Taic, Taic, and Tai‑Kadai. For example, Vietnamese "gạo" aligns
with Chinese "dào" 稻 ('rice'), and analogous correspondences appear for
animals such as elephant, whale, fox, and rhinoceros (see APPENDIX G: Tsu-lin Mei, The case of "ngà").
A frequently cited illustration is the set of twelve animals in the
well-known Chinese zodiac, many of which were borrowed and repurposed in a
range of southern Chinese minority languages. The sole exception is the
'hare' 兔 (tù), an auspicious creature in both Chinese and Altaic
traditions, rendered in Vietnamese as "thỏ". The other eleven zodiac
animal names in modern Vietnamese trace their origins to shared indigenous
sources, with cognates attested among diverse ethnolinguistic groups of
China South.
Historical sources record that lexical material from both aboriginal Yue
and proto‑Chinese merged into a shared diplomatic koine known as Yǎyǔ
(雅語, 'elegant speech'), employed among pre‑imperial polities, as
noted in early Chinese annals. This lingua franca likely originated in Taic,
the speech of the subjects of the Chu State (楚國) during the Spring and
Autumn Period (春秋時代, 771 B.C.–403 B.C.). From this base, Taic developed
into the modern Daic–Kadai languages spoken today by the Dai, Thai, and
related peoples such as the Lao of Laos and the Tày of Vietnam. Yue, as a
descendant subbranch of Taic, likewise constitutes a primary substrate in
the ancestral Vietnamese lexicon.
An early stage of Vietnamese, historically referred to as 'ancient
Annamese', began to take shape with the introduction of Old Chinese
elements during the Western Han period (206 B.C.–24 A.D.), brought into the
Annamese territories under Han colonial administration. These Ancient
Chinese influences continued to evolve across subsequent dynasties. By the
time Annam achieved sovereignty in 939 A.D., Chinese characters known
locally as chữNho (儒字), or Classical Chinese (文言文 wényánwén),
remained the official medium of administration and scholarship. The
Vietnamese language in the form recognized today, however, did not fully
crystallize until the 12th century (Nguyễn Ngọc San, 1993, p. 5).
From the 15th century onward, vernacular literary works began to appear in
chữNôm (𡨸喃) (字), a modified script derived from Chinese
characters. In the 17th century, confronted with the complexity of these
Vietnamized character systems, Western missionaries devised a Romanized
orthography for Vietnamese. This Latin‑based script gained wide currency in
the early 20th century owing to its relative simplicity, though it was not
officially adopted until 1945. By then, the national script known as Quốcngữ
had already received active promotion by the French colonial government as a
means of reducing Chinese cultural influence in Annam.
In practice, the new Romanized script functioned chiefly as a transcription
system for both Hán (Chinese-derived) and Nôm ('pure Vietnamese')
vocabulary. It encompassed the full range of Sinitic‑Vietnamese and
Sino‑Vietnamese lexicons, integrating them seamlessly into Romanized
spelling. By contrast, French borrowings contributed fewer than one thousand
low‑frequency items to the modern language.
(APPENDIX A-V Polysyllabic Vietnamized English and French words)
In an article published in Tập san Khoa học, Trường Đại học Khoa học
Xã hội & Nhân văn, National University of Hồ Chí Minh City, issue 38
(2007, pp. 3–10), Prof. Bùi Khánh‑Thế examines the interaction and interchange
of Chinese in Vietnam's linguistic history. Citing his own mentors, including
Nguyễn Tài Cẩn (1998), he condenses key points in the summary table reproduced
below.
Table 1.3 Division of Historical Periods in the Development of the
Vietnamese language
A) Proto-Vietnamese (the 8th and 9th centuries): 2 languages in use: Ancient Chinese (a vernacular Mandarin spoken by the ruling class) and Vietnamese; 1 Chinese writing script
B) Archaic Vietnamese (the 10th, 11th, and 12th centuries): 2 languages in use: Ancient Chinese and Archaic Vietnamese (spoken by the ruling class); 1 Chinese writing script
C) Ancient Vietnamese (the 13th, 14th, 15th, and 16th centuries): 2 languages in use: Ancient Vietnamese and Classical Chinese; 2 Chinese and Chinese-based Nôm scripts
D) Middle Vietnamese (the 17th, 18th, and the first 1/2 of the 19th centuries): 2 languages in use: Middle Vietnamese and Classical Written Chinese; 3 Chinese writing scripts: Chinese and Nôm scripts, and National Romanized Quốcngữ writing system
E) Early contemporary Vietnamese (during the rule of the French colonial government): 3 languages in use: French, Vietnamese and Classical Written Chinese; 4 writing scripts: French, Chinese, Nôm, National Romanized Quốcngữ writing systems
F) Modern Vietnamese (from 1945 until present): 1 language in use: Vietnamese; 1 National Romanized Quốcngữ writing system
Based on the formation of the Hán-Việt pronunciation of Middle
Chinese, Annam Dịchngữ (安南譯語 'Translated Annamese
Words'), and the Annamese-Latin-Portuguese Dictionary by
Alexandre de Rhodes (1651), H. Maspero devised a similar division into five
developmental periods:
A) Proto-Việt (prior to the 9th century)
B) Archaic Vietnamese: the 10th century (formation of the Hán-Việt)
C) Ancient Vietnamese: the 15th century (Annam Dịchngữ)
D) Middle Vietnamese: the 17th century (Dictionary by A. de Rhodes, 1651)
E) Contemporary Vietnamese (19th century)
Source: Table 1 by Nguyễn Tài Cẩn (1998, p. 8) quoted by
Bùi Khánh-Thế.
(See Appendix I)
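The periodization in Table 1.3 can also be read as a small data set. The sketch below is only an illustration: it transcribes the language and script counts exactly as the table states them, and the names `Period`, `TABLE_1_3`, `peak_script_period`, and `script_counts` are hypothetical, introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    label: str      # row letter in Table 1.3
    name: str       # period name as given in the table
    languages: int  # number of languages in use, as counted in the table
    scripts: int    # number of writing scripts, as counted in the table


# Counts transcribed from Table 1.3; the structure itself is an assumption
# made for illustration, not part of the cited source.
TABLE_1_3 = [
    Period("A", "Proto-Vietnamese", 2, 1),
    Period("B", "Archaic Vietnamese", 2, 1),
    Period("C", "Ancient Vietnamese", 2, 2),
    Period("D", "Middle Vietnamese", 2, 3),
    Period("E", "Early contemporary Vietnamese", 3, 4),
    Period("F", "Modern Vietnamese", 1, 1),
]


def peak_script_period(periods):
    """Return the period with the largest number of coexisting scripts."""
    return max(periods, key=lambda p: p.scripts)


def script_counts(periods):
    """Return the script counts in chronological (table) order."""
    return [p.scripts for p in periods]


if __name__ == "__main__":
    peak = peak_script_period(TABLE_1_3)
    print(peak.label, peak.name, peak.scripts)
    print(script_counts(TABLE_1_3))
```

Read this way, the table shows the number of coexisting scripts rising steadily until the colonial period and then collapsing to a single Romanized orthography.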
This work advances the thesis that core Chinese and Vietnamese vocabulary
shares Yue etyma, called "Việt" (越, Yuè) in Vietnamese and "Jyut6" (粵,
Yuè) in Cantonese, layered atop a Sino-Tibetan stratum. The classical
literary language of later periods incorporated many native items cataloged
under Yǎyǔ (雅語) (De Lacouperie 1887). That diplomatic koine provided a
matrix from which Old Chinese, Ancient Chinese, and Middle Chinese took
shape.
Table 1.4. HISTORY IN A NUTSHELL
Archaeological evidence and historical records show that the region
of modern southern China, located below the Yangtze River (揚子江), was originally home to the ancient Yue aborigines. During the Zhou
Dynasty (1045 B.C.–256 B.C.), and especially toward the end of the
late Eastern Zhou period (culminating in 221 B.C.), these indigenous
peoples formed the bulk of the population in the seven states that
would later fall to the Qin Dynasty. The Qin, emerging as the
strongest state, unified these territories under the banner of the
Middle Kingdom (中國).
After their conquest, the Taic-Yue natives were incorporated first
into the Qin Empire (秦朝, 221 B.C.–207 B.C.) and subsequently into
the Han dynasties. Over time, many of these peoples came to identify
as "Han" (漢人), a name derived from the Han Dynasty (漢朝) founded by
Liu Bang (劉邦), who himself had once been a subject of Chu (楚國人).
Successive Han rulers continued to displace the independent Yue groups
in southern China, driving them further south.
In the land later known as Annam, ruled for a significant period by
the Han, the distinction between the original Yue and the later Han
immigrants gradually diminished. Waves of Chinese settlers fleeing the
recurring dynastic upheavals in northern China blended with the
indigenous inhabitants, effectively erasing clear-cut ethnic
boundaries.
This historical layering survives today in Vietnam, the sole state
emerging from the ruins of ancient cultures such as
Chu (楚 Sở), Shu (蜀 Thục), Yue (粵 Việt), NanYue (南越 NamViệt), Dali
(大理 Đạilý), and Nanzhao (南詔 Namchiếu).
The Vietnamese (the people of Việtnam) represent the enduring legacy
of the Southern Yue. Ironically, the same expansionist processes that
once characterized Chinese history were mirrored later by the
Vietnamese. After achieving sovereignty, Vietnam expanded its
territory further south, culminating in the downfall of the Kingdom of
Champa and the annexation of parts of the eastern flank of the old
Khmer Empire.
In many respects, the historical trajectories of Vietnam and China were
deeply entwined until Annam secured independence from Chinese rule.
Vietnam's own written historiography did not cohere until well after the
10th century; before then, accounts of its past were drawn chiefly from
Chinese chronicles, often without corroboration from alternative sources.
The same axiom applies in linguistics: any comprehensive treatment of
Vietnamese or Chinese remains incomplete without the other, especially in
discussions of Old Chinese, Sinitic-Vietnamese etyma, and shared structural
peculiarities (see Wang Li, 1957).
For over two centuries prior to 939 A.D., ancient Vietnam functioned as a
Chinese prefecture known as the Annam Protectorate (679–860, 863–906), a
historical condition that accounts for the extensive presence of Middle
Chinese loanwords in Vietnamese. The final phase of influence, following the
collapse of the NamHan State during the post-Tang period, proved especially
consequential: it disseminated elite court vocabulary into broader
usage—much like the incorporation of Latin and Greek terms into English—and
reinforced a Middle Chinese lexical substratum within Vietnamese. This
substratum contributed to Vietnamese's resemblance to Cantonese,
particularly through retention of the full eight-tone system, including the
eighth tone, "thanhnhập" 入聲 ('Rusheng', or 'Entering Tone').
Contrary to common belief, Vietnamese aligns more closely with Mandarin—a
court language—in its colloquial uptake of northern vernacular elements
than with Cantonese, which reflects a Tang-era literary register.
Among Vietnamese's distinguishing phonological traits are finals such as
/‑owŋ/, which contribute to its unique acoustic profile and tonal
architecture. The scope and transmission routes of Mandarin influence will
be addressed in detail in subsequent chapters.
Both literary and colloquial forms derived from Tang-period speech were
thoroughly integrated into Annamese (a term used here to avoid the
retrospective label 'Vietnamese', paralleling the terminological ambiguity
surrounding 'Chinese') (H). These forms
circulated widely across social domains, not only among the literati but
also within the general populace. This widespread adoption explains why
Vietnamese speech often bears Mandarin-like expression and cadence.
This historical reality also accounts for the persistence of systematic
Hán‑Việt (Sino‑Vietnamese) variants and the extensive Middle Chinese lexical
substratum long after Vietnam's political independence in the 10th century.
These elements, phonological, lexical, and syntactic, contributed to the
formation of Ancient Vietnamese, became foundational to Middle Vietnamese,
and remain integral to the modern language as it is known today. Their
presence also explains why Vietnamese, despite its Yue-Taic substrate,
retains structural affinities with Cantonese, particularly in tone contour
and compound formation (差).
Sinitic influence is not the whole story, though. Older Yue elements lie
beneath the heavy Sinitic overlay, and many indigenous Taic-Yue words have
been misidentified as Chinese, a pattern mirrored in Vietnam, where such
items are paradoxically labeled 'thuầnViệt' or 'pure Vietnamese'. Vietnamese
thus preserves Yue-descended survivals whose archaic features are realized
in distinctively Vietnamese ways; Chinese-layered variants can act, in
effect, as tonal modulators for toneless items in several other Sino-Tibetan
languages. While Yue-origin words were often masked as Chinese, Taic-Yue
terms that moved into Sinitic languages were simultaneously preserved within
the ancient proto-Vietic layer. Across these eras, Sinitic-Vietnamese interacted with Yue and Taic speech
habits, producing unique word order patterns (e.g., noun+modifier as in
"gàcồ" vs. Mandarin 公雞 gōngjī).
This distinction underpins the claim that the 'Yue' people predate the
arrival of early Sino-Tibetan speakers, the forebears of the Chinese, in
China South. Fundamental cognates shared among Taic-Yue, Chinese, and
Vietnamese etyma across many Sino-Tibetan etymologies will be treated in
Chapter 10.
From an anthropological perspective, the Taic peoples preceded the Yue,
followed by the Dai, who at one point held dominion over the Chu State.
Within this Chu cultural sphere, Liu Bang rose as a subject of Chu and
ultimately founded the Han Dynasty. His ascent is linked to his appointment
as viceroy of the Hanzhong region, situated in present-day southern Shaanxi,
where Chu forces had earlier triumphed over the Qin.
Technically, Sino-Vietnamese and Sinitic-Vietnamese are distinct lexical
classes; the latter comprises multiple layers of doublets superimposed on
the former, driven by vernacular Mandarin forms that spread from at least
the Han in the 2nd century B.C. through the Ming in the 15th century.
The persistence of fixed expressions in Vietnamese that align with those
found in modern Mandarin suggests that Early Mandarin may have functioned
as a concurrent spoken language among mandarins for official
purposes, as evidenced by Prof. Nguyễn Tài Cẩn's analysis in Table 1.1 above (W). Such uses would have included imperial decrees, legal documentation,
and reports to the Tang imperial court in Chang'an (長安, SV
"Tràngan"), now Xi’an City (西安市) (安). As a protectorate throughout the Tang Dynasty, old Annam contributed
to the imperial court through administrative internship, scholarship, and
artisanship—channels that introduced higher-register Middle Chinese
vocabulary, the same that circulated in Cantonese, into Annamese during the
Tang period (618–906).
From the Tang era until its gradual decline toward the end of the 19th
century, Classical Chinese style (文言文) was extensively employed in Vietnamese
letters. Its dense, allusive register shaped Tang-style verse and Vietnamese
literary prose alike, until Romanized Quốcngữ ushered in a shift toward a
more colloquial written style (see Nguyễn Thị Chân-Quỳnh, 1995).
As ancient Vietnamese transitioned into late Middle Vietnamese, the
emergence of new function words became essential for constructing sentences
that increasingly mirrored French syntactic patterns, particularly by the
early 20th century. Lexically, a stratum of Sino-Vietnamese items, likely
rooted in Tang-era vernacular, was retained in a markedly Sinitic register,
comparable in style to spoken Cantonese. By the 16th century, numerous
Middle Chinese lexemes had evolved into Sinitic-Vietnamese function words
('虛辭'), and these lexemes became indispensable in Vietnamese vocabulary, serving grammatical roles analogous to English particles and prepositions
such as 's', 'of', 'although', 'not', 'in', 'at', 'from', 'hence',
'herewith', 'albeit', and others.
These elements became syntactically necessary for managing
non-inflectional grammar in both Vietnamese and Chinese, facilitating
syntactic cohesion without morphological variation (cf. Nguyễn Ngọc San,
1993, pp. 138–142). More broadly, these items belong to a set of Chinese-origin vocabulary
systematically localized through pronunciation rooted in a variety of Middle
Chinese, plausibly related to an ancient Shaanxi dialect.
Through successive periods of contact, the Han and Tang lexicons introduced
successive waves of new vocabulary into the Sinitic-Vietnamese layer. This
pattern is comparable to the influence of Middle Chinese on Cantonese, much
as, in an earlier era, Qin-Han Old Chinese shaped Southern Min (MinNan)
varieties such as Hokkien, Amoy (Xiamen or 'Hạmôn'), Hainanese, and Chaozhou
(Teochew).
From a typological perspective, when major southern Chinese lects are
strongly marked by Sinitic features within the Sino-Tibetan family,
classification is determined by dominant attributes. The situation parallels
other hybrid outcomes: Latin-influenced French versus Anglo-Saxon-dominant
English; Australian English versus Indian English; Bulgarian and Afrikaans
in relation to Dutch; Latin-French in contrast to Gaulish; or Haitian French
in contrast to Moroccan French.
Table 1.5: The Case Of Afrikaans
Afrikaans, also known as Cape Dutch, is one of the eleven official
languages of South Africa. It originated in the 17th century from the
Zuid-Holland (South Holland) dialect used by Dutch settlers in South
Africa during this period. The language was spoken by Dutch, French,
German settlers, as well as by their enslaved people. From the 18th
century onward, Afrikaans gradually developed distinct linguistic
features.
Afrikaans borrowed vocabulary from English, German, and French, reflecting
the cultural and linguistic backgrounds of European settlers in South
Africa. It also incorporated words from indigenous African languages. Its
grammar underwent simplification, such as the omission of verb endings
that indicate tense. Phonetically, changes included simplifying the Dutch
"sch" sound to "sk" (e.g., the Dutch word "schoen" became "skoen," meaning
"shoe").
Until the mid-19th century, Afrikaans was primarily a spoken language,
with Standard Dutch being used for writing. Later, a movement emerged to
promote Afrikaans as a literary language. The language gradually found its
way into journalism, schools, and churches. In 1925, Afrikaans officially
replaced Standard Dutch.
Today, Afrikaans is predominantly used in South Africa and Namibia, with
lesser usage in Botswana, Zambia, and Zimbabwe. Estimates from 2020
suggest that the number of Afrikaans speakers ranges between 15 and 23
million. Some linguists describe Afrikaans as a partially creolized language.
It is estimated that approximately 90%-95% of Afrikaans vocabulary
originates from Dutch, with additional words borrowed from other
languages, including German and South Africa's Khoisan languages.
Distinctions from Dutch include more analytic morphology and grammar, as
well as certain phonetic differences. The written forms of Afrikaans and
Dutch maintain a high degree of mutual intelligibility. In May 2022,
Afrikaans was officially recognized as an indigenous language of South
Africa.
Comparative Sino‑Tibetan etymologies suggest that the diachronic evolution
of modern Vietnamese mirrors a historical trajectory in which early Southern
Yue populations established autonomous polities throughout China South prior
to the consolidation of Han imperial authority. It is not within the purview
of Sino-Tibetan linguistics to classify Vietnamese as a member of the
Sino-Tibetan family, whether by subsuming it under the Sinitic branch or by
drawing analogies to Cantonese or any other Chinese lect, even though
Vietnamese is, in its characteristics, on a par with them. Such a
classification demands a broader base of etymological evidence and a more
rigorous linguistic framework.
Lexical recycling has persisted into the modern era, evident in the
transregional circulation of terms such as "cộnghoà" 共和 (gònghé,
'republic') and "dânchủ" 民主 (mínzhǔ, 'democratic'). These items originated
as Japanese neologisms constructed from Chinese morphemes, were subsequently
re-borrowed into Chinese, and eventually permeated Vietnamese usage. Their
trajectory exemplifies the ongoing exchange of linguistic material across
Sinitic, Japonic, and Vietic domains.
In parallel, the examples below illustrate the process of localization,
whereby Sino-Vietnamese lexemes undergo phonological and semantic
nativization to become Sinitic-Vietnamese. In such cases, original senses
are not always preserved—a phenomenon more prevalent in Japanese Kanji than
in Sino-Vietnamese. For instance, "lịchsự" (polite) derives from 歴事 lìshì
(originally 'experience'), and "tửtế" (kind) from 仔細 zǐxì
('meticulously').
Interestingly, contrary to modern belief, Vietnamese is best understood as
a Sinitic-dominant language in a way that Japanese or Korean is not. It inherits a rich Middle Chinese lexicon, and many items
classified as Sino-Vietnamese overlap with Sinitic-Vietnamese due to their
integration into everyday speech alongside native vocabulary. For example,
the etymon 順 (shùn, SV "thuận") exhibits context-dependent variation: 順利
(shùnlì, VS "suônsẻ"), 孝順 (xiàoshùn, VS "hiếuthảo"), 順便 (shùnbiàn, VS
"sẵntiện"), 逆順 (nìshùn, VS "ngượcxuôi"), among others.
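The context-dependent variation of 順 can be tabulated with a short script. This is a minimal sketch under two stated assumptions: the Vietnamese forms are segmented into syllables by hand, and each syllable is taken to align with the Chinese character in the same position. The names `EXAMPLES` and `reflexes` are hypothetical.

```python
# Each entry pairs a Chinese compound with its Sinitic-Vietnamese (VS) form,
# pre-segmented into syllables. The segmentation and the position-by-position
# alignment are assumptions made for this illustration.
SHUN = "順"

EXAMPLES = [
    ("順利", ("suôn", "sẻ")),
    ("孝順", ("hiếu", "thảo")),
    ("順便", ("sẵn", "tiện")),
    ("逆順", ("ngược", "xuôi")),
]


def reflexes(char, examples):
    """Map each compound to the VS syllable aligned with `char`."""
    result = {}
    for compound, syllables in examples:
        position = compound.index(char)  # where the character sits
        result[compound] = syllables[position]
    return result


if __name__ == "__main__":
    for compound, syllable in reflexes(SHUN, EXAMPLES).items():
        print(compound, "->", syllable)
```

Under these assumptions the single Sino-Vietnamese reading "thuận" corresponds to four different vernacular syllables, one per compound.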
Table 1.6: A case study of Sinitic-Vietnamese neologism formed with
Chinese lexemes
The Vietnamese term 'côngcuộc'—now familiar in modern discourse as a
formal compound meaning 'cause', 'process', or 'undertaking'—is a persistent
source of lexical confusion and scholarly intrigue. While often misinterpreted
as a Sino-Vietnamese compound mapping straight onto Chinese 公局 or 工局
(Mandarin gōngjú 'public bureau', 'work office'), its correct etymological
genesis instead lies in 工作 (gōngzuò, 'task', 'work'), with the element
'cuộc' emerging not from 局 (jú) but from 作 (zuò). The fact that 'cuộc' in
Vietnamese phonologically and semantically diverges from both its
Sino-Vietnamese dictionary reading (tác) and its expected Mandarin reflex
(zuò) reflects a network of historical sound change, sandhi assimilation, and
semantic-phonetic association—processes that collectively illuminate the
complex history of Chinese lexical influence in Vietnam.
The Vietnamese word 'côngcuộc' functions in modern written and spoken Vietnamese
to denote a significant collective undertaking—'project', 'cause', 'the course
of'—especially in governmental or historical phrasing (e.g., "côngcuộc
khángchiến" 'resistance war', "côngcuộc đổimới" 'the undertaking of
renovation/reform'). It is a compound of 'công' (from 工 'work; labor') and
'cuộc'.
The confusion with 公局 or 工局 is understandable, as both
公 and 工 read 'công' in Sino-Vietnamese, and 局 (SV: cục) is a common bound
morpheme for official entities. However, 'côngcuộc' is a modern compound built
on the model of Chinese 工作 (gōngzuò), but adapted phonetically and
semantically within the Vietnamese system. While 'công tác' is the canonical
Sino-Vietnamese reading for 工作, 'côngcuộc' emerged as a neologism where
'cuộc' operates as a native or nativized reflex of 作, rather than 局.
The emergence of Sino-Vietnamese compounds such as 'côngcuộc' reflects
longstanding processes of borrowing and semantic adaptation widespread across
the Sinosphere, i.e., Japan, Korea, and Vietnam, collectively referred to as
the 'Sino-Xenic' realm. In these contexts, new words for modern concepts were
often coined using Chinese morphemes and then mapped phonologically into the
target language in a regularized, but sometimes innovative, fashion.
Middle Chinese, as represented in rime dictionaries such as Qieyun (7th century), had
a richly articulated syllable template. For the character 作 (Mandarin: zuò),
used in 工作 (gōngzuò), the reconstructed MC pronunciation is commonly given
as */tsak/ or /tsak-s/, with the following features:
Initial: ts- (voiceless alveolar affricate).
Vowel and medial: /a/ as nucleus, sometimes with a palatal medial in some
dialects.
Final: -k (voiceless velar stop), a classic 'entering tone' coda.
Tone: entering (rusheng), which has phonological and tonal correlates in
Sino-Vietnamese.
Sino-Vietnamese readings for 作 are systematically 'tác', following the
regular sound correspondences established for Chinese readings in
Vietnamese. Key observations:
The initial [ts-] to [k-] shift is irregular (i.e., not predicted by the
regular SV correspondence), suggesting non-Sino-Vietnamese, perhaps
colloquial or nativized, development.
Labiovelar final [‑əwkpʔ] is robustly preserved in 'cuộc', with the
final -k and medial -w- (from /ua/ or /uə/) mapping closely to MC -ak,
and aligning phonotactically with native Vietnamese coda structure.
The resultant tone is nặng [˧ˀ˩ʔ], consistent with the entering
(rusheng) tone category linked to -k finals in Han-Viet transmission.
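The tonal observation can be checked mechanically. In modern Vietnamese orthography, syllables closed by a stop (-p, -t, -c, -ch) carry only the sắc or nặng tone, which is the reflex of the entering-tone category. The sketch below tests the forms discussed here against that constraint; the function names are hypothetical, and the tone labels are written in plain ASCII for convenience.

```python
import unicodedata

DOT_BELOW = "\u0323"  # combining dot below, marks the nặng tone
ACUTE = "\u0301"      # combining acute accent, marks the sắc tone
STOP_CODAS = ("ch", "p", "t", "c")  # orthographic stop finals


def has_stop_coda(syllable):
    """True if the written syllable ends in a stop final."""
    return syllable.lower().endswith(STOP_CODAS)


def tone_mark(syllable):
    """Return 'nang', 'sac', or 'other' from the written tone diacritic."""
    decomposed = unicodedata.normalize("NFD", syllable)
    if DOT_BELOW in decomposed:
        return "nang"
    if ACUTE in decomposed:
        return "sac"
    return "other"


def consistent_with_entering_tone(syllable):
    """Stop-final syllable bearing one of the two entering-tone reflexes."""
    return has_stop_coda(syllable) and tone_mark(syllable) in ("nang", "sac")


if __name__ == "__main__":
    for form in ("tác", "cuộc", "cục", "công"):
        print(form, tone_mark(form), consistent_with_entering_tone(form))
```

The check only confirms that 'cuộc', like 'tác' and 'cục', sits in the entering-tone class; it says nothing about the irregular initial, which remains the weak point of the derivation.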
Semantic-phonetic association: 'cuộc' vs. 'cục' and the shadow of 局 (homophony and semantic overlap)
One reason for the widespread misreading of 'côngcuộc' as 公局 or 工局 is
the phonological and structural near-identity between 'cuộc' and 'cục'
(局):
'cục' (局): SV cục, Mandarin jú, MC *kɨwk; used for administrative,
governmental, and physical 'units' or 'offices'
'cuộc', derived via the above pathway from 作, but due to similar form
and function, is often reanalyzed by speakers and writers as rooted in
局, especially in compounds
The confusion is exacerbated by the convergence of rimes and finals, both
'cục' /kʊkpʔ/ and 'cuộc' /kəwkpʔ/ conforming to the [k•w•k•p̚] structure,
with heavy final closure and possible central or back rounded
vowels.
Semantic blending in compound formation: semantic overlap also drives
this folk association
In both Sinitic and Vietnamese, compounds involving 工作 (work), 局
(office), and 作 (to do/make) are semantically related to tasks, operations,
or affairs, domains where 'cuộc' has come to be used.
For example, in classical Chinese, 局 (jú) denoted physical bureaus
('bureaus', 'games') and, by extension, 'affairs' or 'situations', while 作
(zuò) in compounds implied 'doing', 'working', or 'acting upon' something,
matching the function of 'cuộc' in syntagms such as "côngcuộc vậnđộng"
'the campaign task'.
Consequently, the phonetic resemblance
between 'cuộc' and 'cục' enables semantic-phonetic association (lexical
contamination or 'folk etymology'), especially when context or classical
literacy is limited.
This phenomenon is hereby called 'sandhi assimilation' or 'assimilative
association'; it is recurrent in the realm of Sinitic-Vietnamese.
Conclusion
The analysis of 'côngcuộc', especially the sound change underlying 'cuộc',
is a case study in the stratification, innovation, and reanalysis inherent
to Sinitic-Vietnamese contact linguistics. Through the transformation of
Middle Chinese *tsak to Vietnamese 'cuộc', we witness the interplay of
phonological adaptation, semantic reinterpretation, and structural
assimilation:
▪ The initial [ts-] > [k-] shift, though irregular, is emblematic
of colloquial nativization and possibly dialectal borrowing
▪ The preservation of labiovelar coda [-əwkpʔ] aligns with Vietnamese
phonotactics, fostering both the formation of new compound morphemes and
confusion with native terms like 'cục'
▪ The importance of sandhi, compound formation, and semantic blending
means the etymological and structural boundaries between Sinitic and native
vocabulary are porous.
Comparative evidence across Sino-Xenic languages highlights both shared
roots and Vietnamese-specific pathways. While 'côngcuộc' initially traces to
工作, its contemporary form and meaning exemplify Vietnam's creative
synthesis of linguistic inheritance, local adaptation, and ongoing lexical
renewal.
In this paper, when discussing the etymology of Sinitic-Vietnamese words,
the author restricts his analysis to locally influential references within
the Sinitic framework for comparative purposes. In other words, he focuses
only on those etyma that exist concurrently in both Chinese and
Vietnamese, including foreign words that entered Vietnamese through a
Chinese intermediary. For example, the Vietnamese word "mắt" 'eye',
rendered as 目 mù (SV mục) in Chinese, may be related to Malay 'mata';
"gạo" ('rice'), represented by 稻 dào, might compare to Thai /gaw/; and
other foreign-derived words include SV "kỹsư" (技師 jìshī, 'engineer',
borrowed from Japanese "gishi", as opposed to the modern Chinese meaning
'technician'), "bệnhviện" (病院 bìngyuàn, 'hospital'), also from Japanese
usage, "ưumặc" (幽默 yōumò, 'humor'), "câulạcbộ" (俱樂部 jùlèbù, 'club'),
and country names such as "Anh" (英 Yīng, 'England'), "Mỹ" (美 Měi,
'America'), all from English, "Pháp" (法 Fǎ, 'France') from French, and
"Đức" (德 Dé, 'Germany') from German 'Deutsche'.
The sound change patterns observed in core vocabulary across Chinese and
Vietnamese suggest the preservation of substratal residues from an earlier
"Yue" linguistic layer. These exchanges demonstrably predate the Qin–Han
expansion into the southern regions of China (206 B.C.–220 A.D.). Numerous
lexical items from this substratum are securely attested in the Kangxi
Dictionary 康熙字典, the Qing-era compendium commissioned by Emperor Kangxi,
underscoring their deep historical entrenchment. (Y)
From a linguistic standpoint, the predominance of Sinitic features across
Vietnamese etyma—including tonality, morphological structure, phonological
traits, and disyllabicity—has led many scholars to infer a Chinese origin.
However, as phonological and semantic convergence increases, so too does the
likelihood of borrowing. This is especially evident in Tai-Kadai languages,
and most prominently within the Tai-Kam-Sui subgroup, where nearly all lexical
items appear to derive from Chinese sources (cf. Comparative Sino-Tibetan Etymologies).
Consider Vietnamese "gạo" ('hulled rice'), often compared to Thai /kao/, or
"nếp" (‘sticky rice’) to Thai /nɛp/ and Lao /nèep/. These correspondences
align with Chinese 稻 dào (SV đạo), or 糯 nuò (SV nọ), both of which are
themselves loanwords in Chinese. In contrast, Vietnamese "lúa" ('paddy
rice') appears to be a native Yue-Taic term, corresponding to Lao /lua/ and
Zhuang /luə/, with no direct Chinese cognate. This challenges the assertion
by A. Starostin (1953–2005) that "lúa" reflects an archaic Chinese loanword
derived from 稻 dào, reconstructed as [ lhu:ʔ < Protoform ly:wH ],
encompassing meanings such as 'rice', 'grain', and 'paddy'. Starostin's
broader comparative framework includes Burmese /luh/ ('a grain species',
Panicum paspalum), Kachin /c^əkhrau1/ ('paddy ready for husking'), and Kiranti lV ('millet'), which he
interprets as part of a native Chinese semantic field. Nonetheless, this
view necessitates careful differentiation between inherited Yue-Taic forms
(a layer with which Vietnamese also shares syntactic word order) and Sinitic
overlays.
A similar caution applies to items traditionally assigned to the
Austroasiatic Mon-Khmer layer. Where phonological correspondences fail to
conform to established patterns of sound change or semantic alignment, such
items are more plausibly interpreted as intergroup loanwords, facilitated by
geographic proximity and prolonged contact.
Table 1.7. Glyph origins and etymological convergence: 來 and 麥, and the case of
Vietnamese "lúa" and "lại"
The character 來, now widely interpreted as
'to come', originated as a pictogram (象形) depicting wheat. It is the
original form of 麥 (OC *mrɯːɡ, 'wheat') and 麳 (OC *rɯː, 'wheat'). In early
script forms, the central vertical line represented the ear of wheat,
flanked by upward strokes for leaves and downward strokes for stem and
roots. An additional horizontal line was often added at the top, possibly to
emphasize the ear. Compare 禾, which shares structural parallels.
This glyph was borrowed for the meaning 'to
come' as early as the oracle bone script. During the Western Zhou and
Warring States periods, semantic components such as 止 ('foot') and 辵
('walk') were appended to distinguish the original agricultural sense from
the emerging verbal usage. These additions, however, were not retained in
later script traditions. Some scholars interpret the derivative 麥, formed
by adding 夊 ('to walk slowly'), as the original glyph for 'to come'. If so,
the meanings of 來 and 麥 may have interchanged due to the dominant use of
來 in verbal contexts.
Shuowen connects the semantic domains of
'wheat' and 'arrival' mythologically: 天所來也 ('it comes from the
heavens'). This interpretation may be supported by archaeological evidence
suggesting that wheat was not indigenous to China, but introduced from
outside, most likely from the west.
Phonologically, both 來 and 麥 have been
reconstructed with initial *mr- in Old Chinese. In 來, the liquid onset /l/
is retained, while 麥 preserves the nasal /m/. Etymologically, 來 derives
from Proto-Sino-Tibetan *la-j ~ *ra ('to come') (STEDT), and is cognate
with:
迨 (OC *l'ɯːʔ, 'reach; until')
賚 (OC *rɯːs, 'bestow')
蒞 (OC *rɯbs, 'arrive') — Schuessler (2007)
Burmese လာ (la, 'come')
Proto-Vietic *laːjʔ
The Vietnamese reflex "lại" (SV lai) is
possibly related to Chinese 來 (MC lʌi, ləj 'to come; to arrive').
Baxter–Sagart (2014) note that 來 shows
irregular development, possibly due to the loss of final *-k in an
unstressed form that was later restressed:
來 *mə.rˤək > *mə.rˤə > *rˤə > loj > lái 'come'
This trajectory, however, does not fully
explain the irregular presence of final -ʔ (nặng tone) in Vietnamese. If we
posit an intermediate stage where *-k > *-ʔ occurred and was subsequently
lost, allowing for borrowing into Vietic during that window, the tone could
be accounted for. Yet the Vietnamese form lacks expected traces of *-rˤ-
(e.g., ‹r› or ‹s›), suggesting a late loan, after *r(ˤ) > l had already
occurred. This raises further questions about tonal interpretation and
phonological alignment.
For comparative reference, Zhuang (or Nùng)
/lai/ aligns with Proto-Tai *ʰlaːjᴬ ('many; much') [ cf. Vietnamese "lắm" ],
itself derived from Old Chinese 多 (OC *t.lˤaj). Cognates include:
Thai lǎai
Lao lāi
Lü l̇aay
Shan lǎay
Bouyei laail
Saek หล่าย
Jizhao laːi²¹
These forms suggest a broader semantic and
phonological network in which 來 participates, spanning Sino-Tibetan,
Vietic, and Tai-Kadai domains.
This principle is not universal. In some cases, a single-morpheme syllable
categorized as a "word" is primarily governed by phonological alternation
while showing additional features (beyond tonality) that do not neatly fit
established patterns, for instance, "tỏi" 蒜 suàn (SV toán, 'garlic') where /s- ~ t-/ and /-n ~ -i/, whereas
"chua" 酸 suān (SV toan, 'sour') does not follow the same pattern.
Nonetheless, such items are still classified as loanwords based on overall
affinity. Consider 兒 ér, which corresponds to SV "nhi" and yields VS "nhỏ"
('child'), VS "nhí" ('baby'), and "nhínhảnh" (with "nhảnh" as a reduplicative
morphemic syllable conveying 'childish', analogous to English "-ish"), as
opposed to "nhỏ" 孺 rú (SV nhụ, 'young'). We may thus conclude that the
etymon "nhi" entered via Middle Chinese and that its cited derivatives are
all Chinese loanwords.
In comparison with other southern Sinitic dialects, and contrary to common
assumption, Vietnamese, beyond sharing a similarly broad tonal range (up to
nine tones), aligns more closely with Mandarin than with Cantonese, Min Nan,
or Wu varieties, particularly in the lexical domain. Only a small number of
indigenous Cantonese words have cognates in "thuầnViệt" ('basic native
Vietnamese'), such as:
sik6 → ("xơi", 'eat')
jam2 → ("uống", 'drink')
gai1 → ("gà", 'chicken')
By contrast, rarer Cantonese forms lack direct Vietnamese matches, e.g.:
fajng1kao1 ('sleep') ≠ M 卧 wò that corresponds to SV "ngoạ"→VS "ngủ"
pin5tow2 ('where') ≠ M 哪裏 nǎlǐ that corresponds to SV "nalí"→"nơinào"
tzuo3 ('already') ≠ M 了 liǎo that corresponds to SV "liễu"→"rồi"
Who are the Cantonese-speaking population in historical-linguistic context?
This population historically occupied a substantial portion of the ancient
NamViệt Kingdom (204–111 BCE), whose capital, Phiênngung (番禺; present-day
Panyu district, Guangzhou), was governed by its founding monarch Triệu Đà
(趙佗; Zhao Tuo) and his dynastic successors. This region formed the
southern frontier of early Sinitic expansion. (V)
Following the annexation of NamViệt into the Middle Kingdom (中國), the
preexisting process of Sinicization intensified. This catalyzed the
divergence of Cantonese and Vietnamese into two distinct linguistic and
cultural entities. Each followed separate historical trajectories, with only
Annam ultimately achieving independence from Chinese rule in 939 CE.
The genetic and cultural composition of modern Cantonese speakers differs
markedly from that of their pre-Han ancestors and from populations
inhabiting the region up to the tenth century. It is plausible that some kin
groups migrated southward into Annamese territories, a phenomenon repeated
across centuries of intertwined regional histories. In China, such
migrations often occurred in response to famine, repression, or political
upheaval. Similarly, ancient Annamese populations moved further south to
evade imperial reach.
By the time these migrations occurred, settlers in new regions would have
encountered populations not vastly different from themselves, especially
under shared or adjacent statehoods. The border between China and Vietnam
remained relatively permeable throughout history, facilitating such
movements until its closure in 1949 under Maoist rule.
Had Annam remained under Chinese dominion into the present, its national
trajectory might have mirrored that of NamViệt (Cantonese: NamJyut6), now
subsumed within Guangdong Province. Historically, Guangdong produced millions
of emigrants who dispersed globally, including to Annam and other Southeast
Asian polities. Conversely, had the greater Canton region achieved statehood
akin to Annam's, it might have retained linguistic sovereignty. Its language,
like Annamese, could have preserved distinct typological features, prompting
reevaluation of its classification within the Sino‑Tibetan family. Similar
speculation applies to Fukienese (Hokkien) and Hainanese.
Modern Cantonese descendants, now fully Sinicized, can only access their
pre-Han heritage through archaeological vestiges such as the mausoleums of
NamViệt kings in present-day Guangzhou. The orthography of NamViệt may be
rendered phonetically where appropriate to reflect its historical
pronunciation.
The immersive Sinicization of the Canton region profoundly shaped its
linguistic identity. Cantonese, as a Sinicized Yue language, stands in
contrast to Vietnamese—a distinction rooted in their respective historical
paths. Cantonese remained within China from 111 BCE onward, while Vietnam
extricated itself from Chinese rule in 939 CE. This divergence is
foundational to Vietnam's national identity.
During the Ming Dynasty's 25-year occupation of Vietnam in the fifteenth
century, Chinese influence left indelible marks. A particularly devastating
episode occurred when Ming forces destroyed Vietnam's entire written library
(Nguyễn Tài Cẩn, 1998). Over centuries, Vietnam navigated a complex
sovereignty, alternating between vassalage and independence, adapting to the
shifting power dynamics of its northern neighbor. Even after more than a
millennium since the end of China's 1,004-year colonial rule, this balancing
act remains central to Vietnam's historical narrative.
Despite their shared Yue ancestry, Vietnamese speakers often express
nostalgia for their Yue heritage, whereas many Cantonese speakers remain
unaware of or indifferent to their Yue origins. The Cantonese model is
instructive: the Sinicization of Yue subjects in NamViệt deeply influenced
the ethnic and linguistic evolution of the ancient Yue. Records of
Canton's OuYue (甌越) exhibit striking parallels to Annam's LuoYue (雒越).
The Han colonization extended into the Sông Hồng Basin (Red River Delta),
which became part of southwestern NamViệt following the conquest of 111
BCE.
Han imperial policies left enduring Sinitic imprints on the emerging Yue
languages, which over centuries evolved into Cantonese and Vietnamese. While
these languages share notable features, they are not linguistically bonded
as kin. This is evident in the limited number of newly identified
Sinitic‑Vietnamese etyma with shared ancestral roots. For instance, the
legend of the Magic Sword, which recounts the shared ancestry of the Zhuang
and Vietnamese peoples—once self-identified by the same ethnonym—underscores
their connection to ancient Cantonese traditions. (Z)
Conversely, the Chinese affiliation of Sino‑Vietnamese etyma and Sinitic
vocabulary in Cantonese is unequivocal. This is attested by their shared
usage of Middle Chinese variants and phonological commonalities, including
tonality (e.g., 8-toned Vietnamese vs. 9-toned Cantonese) and final
consonants (e.g., ‑m, ‑p, ‑t, ‑k).
Among the Sino-Vietnamese lexemes derived from Middle Chinese etyma, one of
the most controversial cases involves the naming of the 'duodenary zodiac
system'. This system reveals how substratal pathways in Sinitic-Vietnamese
zodiac terminology, originating from ancient Yue and passing through Old
Chinese, trace long and intricate trajectories before entering Vietnamese.
These forms are conspicuously absent in Cantonese, likely due to its deeper
Sinicization. Cultural elements such as the duodenary cycle of twelve zodiac
animals, shared among the Chinese, ethnic minorities in southern China,
Vietnam, and southern Mon-Khmer cultures, exemplify this substratal retention.
For example, the Year of the Horse (馬年) in 2014 was also referred to as
'Jiawu Year' (甲午年, Jiǎwǔ Nián) or "Năm GiápNgọ" in Vietnamese. Here, the
term 'Ngọ' (午), an ancient Yue loanword for 'horse' (contrasting with the
native Vietnamese word 'ngựa'), exemplifies the linguistic imprint of Yue
heritage. Although nomenclature like 'Jiawu Year' may sound foreign to modern
Chinese ears, it remained prevalent until the early 20th century. A notable
instance of this usage is tied to the Xinhai Revolution (辛亥革命) of 1911,
which overthrew the Manchurian Qing Dynasty and established the Republic of
China (中華民國). The year 亥 (hài), signifying "pig," is another ancient Yue
loanword. In Sino-Vietnamese, it appears as 'hợi,' while in
Sinitic-Vietnamese, it is rendered as 'heo.' Thus, 1911 is recognized as the
"Xinhai Year" or "Year of the Boar" in modern usage (Boltz, William G., 1991,
"Old Chinese Terrestrial Names in Saek").
A notable example is "mẹo", an older Sinitic-Vietnamese reflex of 卯 (M
'máo'), later reintroduced as the Sino-Vietnamese "mão". In Vietnamese
tradition, 卯 denotes the fourth position in the zodiac, but unlike Chinese
usage where it corresponds to 兔 ('tù', SV "thố", VS "thỏ", 'hare'),
Vietnamese associates it with "mèo" ('cat'), an animal culturally unwelcome
in Chinese contexts. Thus, while Chinese marks 兔年 ('Tùnián', 'Year of the
Hare'), Vietnamese calls the same year 卯年 (M 'Máonián'), rendered SV
"Mãoniên", VS "nămMão", "nămMẹo", or colloquially "nămMèo". This divergence
demonstrates that the Vietnamese 'Year of the Cat' is not a reinterpretation
of the Chinese 'Year of the Hare' but a retention of an older, likely
Yue-origin association, contradicting the claims of many Chinese
Sinologists, whose interpretations may be based on misreading or deliberate
distortion.
A parallel case involves 未 (M 'wèi') and the Vietnamese "dê" (/ze1/,
'goat'). The original southern concept of 未 as 'goat' was later supplanted
by northern terms for 'ram' or 'sheep' (or VS "cừu" 羯 jié, SV "kiết", or 羭
'yú' or SV "du"), even though 羊 ('yáng') still denotes 'goat' in many
southern lects. This semantic shift reflects northern influence, where 羊
was associated with 'sheep' or 'lamb' (羔 'gāo', VS "cừu"!). Crucially, 未
should be understood as 'goat' in any case, corresponding to SV "dương" (羊
'yáng') and VS "dê" (/ze1/). This pronunciation aligns with southern Sinitic
varieties such as Teochew (/jẽw1/), Amoy (/jũ1/), and Hainanese (/jew1/),
all unequivocally meaning 'goat'; the modern disyllabic compound 山羊
('shānyáng', VS "dênúi", 'mountain goat') reinforces this
interpretation.
It is plausible that 未 ('wèi') descends from an ancient Yue form
approximating /ze1/ or /je1/, entering Chinese through its integration into
the zodiac system. In this context, 未 may have been adapted to transcribe a
foreign term for 'goat', replacing 羊 ('yáng'), which northern nomadic
cultures more commonly associated with 'sheep' (羯), as noted above. The
Sinitic-Vietnamese "dê" (/je1/) thus preserves a substratal pronunciation
that diverges from Mandarin /wèi/.
Middle Chinese pronunciations of 未 varied considerably—/mwe̯i/, /mĭwəi/,
/miuəi/, /mʉi/, /mʷɨi/, /muj/—and eventually bifurcated into SV "vị"
(/vjej6/, VS southern /zjej6/, 'upcoming') and SV "mùi" (/mʷɨi2/, 'goat').
The phonological shift from /v-/ to /j-/ or /z-/ in VS "dê" suggests a
southern borrowing instead, possibly mediated through an intermediate /wj-/
stage. In this scenario, Mandarin 'wèi' may represent a back-loan from Old
Chinese */mɯds/, as noted in 《說文》: 未, 味也!
The character 未 thus bifurcates semantically and phonetically into SV "vị"
(indicating 'not yet', 'future'), as in "vợchưacưới" (未婚妻 'wèihūnqī',
"vịhônthê"), and SV "mùi" ('goat'), as in "NămẤtMùi" (乙未年 'YǐWèiNián',
'Year of the Goat'). It is plausible that 未 was introduced by Yue-speaking
populations of NamViệt or Annam prior to the Old Chinese period. While
neither ancient Chinese nor Vietnamese possessed a native /v-/ onset,
southern dialects likely preserved a form closer to /jej/ or /zjej/.
To further complicate the etymology, the Vietnamese "dê" may also be a
doublet cognate of 羊 ('yáng'), reflected in VS "dê" and SV "dương"
(/jɨəŋ1/), and paralleled in Teochew "yeo" (/jẽw1/), all denoting 'goat'.
These forms reinforce the hypothesis that Vietnamese retains a substratal
lexical layer distinct from northern Sinitic developments.
In zodiac reckoning, years such as 1955, 2015, and 2075—formally designated
in Vietnamese as "NămẤtMùi" (乙未年 'YǐWèinián')—are now more commonly
referred to in mainland Chinese usage as 羊年 ('Yángnián', 'Year of the
Goat', VS "nămDê"). Notably, younger Chinese speakers often do not recognize
the calendrical significance of 乙未年, whereas Vietnamese youth remain
familiar with both "NămẤtMùi" and "nămDê". This is reflected in expressions
such as "我 的 生 於 乙未年" ('Wǒde shēng yú YǐWèinián'; "Tôi sanh
NămẤtMùi") and "我 的 生肖 屬羊" ('Wǒde shēngxiào shǔyáng'; "Tôi cầmtinh
conDê"), or simply "我 屬 羊" ('Wǒ shǔ yáng'; "Tôi tuổi Dê")—all conveying
'I was born in the Year of the Goat'.
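The year designations cited in this section (甲午 for 2014, 辛亥 for 1911, 乙未 for 1955, 2015, and 2075) follow from simple modular arithmetic on the sixty-year cycle. The sketch below is illustrative only: the function name `sexagenary` is hypothetical, and it assumes Gregorian years of the common era while ignoring the fact that the lunar year begins in late January or February, so dates early in a Gregorian year still belong to the preceding cycle year.

```python
# Heavenly Stems and Earthly Branches in their traditional order.
STEMS = "甲乙丙丁戊己庚辛壬癸"
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"

# Sino-Vietnamese readings, in the same order as the characters above.
SV_STEMS = ["Giáp", "Ất", "Bính", "Đinh", "Mậu", "Kỷ", "Canh", "Tân", "Nhâm", "Quý"]
SV_BRANCHES = ["Tý", "Sửu", "Dần", "Mão", "Thìn", "Tỵ",
               "Ngọ", "Mùi", "Thân", "Dậu", "Tuất", "Hợi"]


def sexagenary(year: int) -> tuple:
    """Return (Chinese name, Sino-Vietnamese name) for a CE Gregorian year.

    The year 4 CE was a 甲子 year (as was 1984), so the offset from 4
    gives the stem index modulo 10 and the branch index modulo 12.
    """
    offset = year - 4
    stem = offset % 10
    branch = offset % 12
    return STEMS[stem] + BRANCHES[branch], SV_STEMS[stem] + SV_BRANCHES[branch]


if __name__ == "__main__":
    for y in (1911, 1955, 2014, 2015, 2075):
        print(y, *sexagenary(y))
```

Because the stem cycle has length 10 and the branch cycle length 12, a given stem-branch pair recurs every 60 years, which is why 1955, 2015, and 2075 share the same designation.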
This cultural continuity supports the hypothesis that 未 ('wèi')
originated as a Yue loanword in any case, plausibly reconstructed as /zẽ/
or /jẽ/, and distinct from 羊 ('yáng'), a pictograph depicting the head of a
goat or sheep. The semantic and phonological interplay between 未 and 羊 is
further illustrated in the character 美 ('měi', SV "mỹ" /mej4/,
'beautiful'), where 羊 placed over 火 ('huǒ', 'fire') metaphorically conveys
'beautiful taste'. The etymological links between 美 and 未,
particularly through SV "mùi" (/mʷɨi2/, 'goat'), reinforce their shared
heritage and suggest that Vietnamese preserves substratal lexical and
symbolic associations that diverge from later northern Chinese
reinterpretations. (未)
These two zodiac cases 卯 and 未 have broader implications for Sino-Tibetan
comparative work. Further analysis could examine Vietnamese cognates such as
SV "ngọ" (VS "ngựa", 午 'wǔ', 'horse') and SV "sửu" (VS "trâu", 丑 'chǒu' <
MC ʈʰuw < OC *n̥ʰuʔ, 'buffalo'). Additional parallels include:
Vietnamese   Gloss   Old Tibetan   Note
cẳng         foot    rkań          Phonological alignment
mắt          eye     mig           Semantic stability
sông         river   kluń          Cf. Viet-Muong */krong/
bò           cow     ba            Lexical continuity
Such correspondences suggest that these terms may have existed in
proto-Vietic or evolved independently before later Sinitic influence. They
open new avenues for exploring Vietnamese affiliations within the
Sino-Tibetan family, as will be illustrated in later sections using Shafer's
comparative wordlists (1966–1974). (S)
Table 1.8: The case of "the Year of the Cat"
According to Nguyễn Cung Thông, the connection between Mão, Mẹo, and mèo is
quite straightforward: these sounds all belong to the "low-pitched" tonal
category and share the vowel e (as in Mẹo and mèo), which is an older form
compared to the vowel a (as in Mão). Examples in VS/SV correspondences
include hè/hạ, xe/xa, keo/giao, vẽ/hoạ, mè/ma, chè/trà, beo/báo, etc. The
confusion between cats and rabbits in Chinese culture is evident in the case
of Thốtôn (兔猻), a type of wildcat that is gradually disappearing. This
animal, found in Central Asia, Siberia, Kashmir, Nepal, Qinghai, Inner
Mongolia, Hebei, Sichuan, Tibet, and Xinjiang, is also known as Xálịtôn
(猞猁孫) or Steppe cat in English, and it typically inhabits desert regions.
When the Han people expanded southward and westward, the phenomenon of
"mistaking cats for rabbits" (similar to the Vietnamese idiom "mistaking a
chicken for a quail") became apparent, as seen in the naming of thốtôn. This
confusion partly explains why the fourth Earthly Branch (Mão, Mẹo) is
associated with cats rather than rabbits in its original context. Thốtôn
(兔猻) is also referred to as dươngxálị (洋猞猁), ôluân (烏倫), mãnão (瑪瑙),
or mã nãotặc (瑪瑙勒). The term xálị (猞猁) refers to a type of wildcat
(lynx). The Sino-Vietnamese word miêu (貓) means "cat," but in ancient
Chinese, miêu referred to a type of hairless tiger rather than a domestic
cat. This evidence supports the idea that Mão (卯) was a phonetic
transcription of a foreign word (likely an ancient Vietnamese term) that
entered the Chinese language.
The definition of miêu in the Erya (Nhĩnhã) states: "A tiger with sparse fur
is called 虦貓 (sạnmiêu)." According to the Ngọc Thiên dictionary, sạn/sàn
(虦) also refers to a cat. The character 虦 (a rare variant written as 虥)
denotes a striped wildcat. Meanwhile, Thố/thỏ (鵵) in its ancient sense
referred to a type of bird, and mãn (梚, a rare character) referred to a
type of tree in ancient Chinese texts. In the Hakka dialect, thỏ is
pronounced t'u2 (similar to thổ), which contrasts with the pronunciations of
mãn (cat) and thố/thỏ.
To understand why the Vietnamese associate cats with the Earthly Branch Mão
(卯), one common explanation in Chinese sources is that the sound of Mão,
when adopted into Vietnamese, resembled mèo or miêu (Sino-Vietnamese for
"cat"). Thus, the Vietnamese used cats as the symbol for this branch instead
of rabbits. If mèo sounded similar to Mão and was used as the symbolic
animal for this branch, it is difficult to explain why nga (wild goose or
seabird), which is closely associated with Vietnamese life (fishing, coastal
living), and whose ancient pronunciation ngwa resembles Ngọ (午), was not
chosen as the symbol for the Earthly Branch Ngọ. Similarly, the ancient
pronunciation of Mùi (未) for the eighth branch is closer to muỗi
(mosquito), yet the Vietnamese chose goats instead of mosquitoes. There are
many other such phonetic parallels.
Although the Nôm script is relatively "young" for analyzing the phonetic
connections of the 12 zodiac animals, some notable points include the use of
mèo (and meo) with the Sino-Vietnamese character miêu (貓), as seen in
Nguyễn Bỉnh Khiêm's Bạch Vân Thi tập (1491–1585): "Lẻo lẻo doành xanh con
mắt mèo" ("Bright green eyes of the cat"). Meanwhile, méo in Nôm uses the
character Mão (卯), sometimes with additional diacritical marks, as in Hồng
Đức Quốc Âm Thi Tập (compiled by Lê Thánh Tông, 1442–1497): "Tròn tròn méo
méo in đòi thuở" ("Round and round, distorted through time"). Thus, the
distinction between Mão and mèo has existed since at least the Lê dynasty,
and the likelihood of confusion between Mão (Middle Chinese pronunciation,
reintroduced into Vietnam during the Tang-Song period) and mèo (ancient
Vietnamese pronunciation) is minimal.
The natural tendency of human writing systems is to evolve from the concrete
and simple to the abstract. For example, animal names are often extended to
more abstract meanings, such as "mouse face" (compared to "dragon face"),
"ox-like body," "eating like a cat sniffing," or "snake-like temperament."
Therefore, deriving mèo from Mão does not align with this natural tendency;
rather, it is more logical for the concrete term mèo (animal) to give rise
to the abstract term Mão (timekeeping system, divination). The system of
naming specific animals (simple) familiar to farmers was integrated into
Chinese culture and transformed into a system for recording time and
divination (abstract, complex). This 12-zodiac system flourished as Chinese
culture reached its peak (Qin, Han, Tang, Song dynasties) and influenced
surrounding regions, including Vietnam. This phenomenon of "reverse
borrowing" is often overlooked in Vietnam's case.
In reality, Vietnamese people do not need to overanalyze the natural
connection between Mão, Mẹo, and mèo, just as they do not question the links
between Tý (mouse), Ngọ (horse), Hợi (pig), or Sửu (ox). Unlike Chinese
culture, which uses compound terms like Mão Thố (卯兔, "Rabbit of Mão"), Tý
Thử (子鼠, "Mouse of Tý"), or Sửu Ngưu (丑牛, "Ox of Sửu") to emphasize
these connections, Vietnamese culture inherently recognizes the associations
between Mão and mèo, Tý and chuột, or Sửu and trâu.
Source: Nguyễn Cung Thông:"Nguồn gốc Việt (Nam) của tên 12 con giáp - Mão/Mẹo/mèo"
Our revised hypothesis, as elaborated etymologically above, is
substantiated by Vietnamese etyma that exhibit direct cognacy with
Sino-Tibetan roots. These etyma appear to descend from other Sino-Tibetan
languages rather than through Chinese transmission. The frequency and
consistency of such correspondences are too numerous to dismiss as
coincidental. Consequently, we propose a novel linguistic classification: a
distinct category termed Sinitic-Vietnamese. This classification may warrant
equal footing with the Sinitic branch itself, given the historical
precedence of Yue substrata over proto-Chinese, as previously discussed.
Moreover, the Vietnamese fundamental words cited in Chapter 10 demonstrate
clear cognate relationships with Sino-Tibetan etyma,
lending further credence to this theorization.
Analytically, the new etymological survey presented in this paper
integrates the historical perspective outlined above, examining linguistic
development through both synchrony and diachrony. This methodology resembles
capturing motion-picture frames in a historical reel—allowing for
fast-forwarding, rewinding, zooming in, and zooming out to contextualize
lexical evolution. However, the chronological placement of certain etyma
remains ambiguous.
For example, "béo" ('greasy') aligns with 油 yóu as in 油膩 yóunì (VS
"béongậy"), illustrating the ¶ /y- ~ b-/ pattern in Mandarin–Vietnamese
sound correspondences. Other examples include 郵 (yóu, SV "bưu", 'postal'),
由 (yóu, VS "bởi", 'because'), 柚 (yòu, VS "bưởi", 'pomelo'), and 游 (yóu,
VS "bơi", 'swim')—all of which conform to the Sinitic-Vietnamese
phonological contour. While such interchanges are plausible, identifying the
latest sound splits depends on the comparative methodologies introduced in
later chapters.
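The pairings listed above can be kept in a small table and tallied by onset, which makes it easy to extend the list and see whether further candidates fit the same /y- ~ b-/ pattern. This is only a bookkeeping sketch: the names `PAIRS` and `onset_tally` are hypothetical, pinyin is given without tone marks because only the onset matters here, and the first letter of each form is taken as its onset, which is a simplification. A tally of hand-picked examples does not by itself establish a regular correspondence.

```python
from collections import Counter

# (character, toneless pinyin, Vietnamese form, gloss), as listed in the text.
PAIRS = [
    ("油", "you", "béo", "greasy"),
    ("郵", "you", "bưu", "postal"),
    ("由", "you", "bởi", "because"),
    ("柚", "you", "bưởi", "pomelo"),
    ("游", "you", "bơi", "swim"),
]


def onset_tally(pairs):
    """Count (Mandarin onset, Vietnamese onset) pairs.

    The onset is approximated by the first letter of each form, which is
    adequate for the single-letter onsets y- and b- used here.
    """
    return Counter((pinyin[0], viet[0]) for _, pinyin, viet, _ in pairs)


if __name__ == "__main__":
    for (m_onset, v_onset), count in onset_tally(PAIRS).items():
        print(f"{m_onset}- ~ {v_onset}-: {count} item(s)")
```

Adding a counterexample to `PAIRS` would show up as a separate key in the tally, which is the kind of check the comparative method in later chapters would need to weigh.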
Given that all Vietnamese sister languages in 'China South', including
regional Chinese lects, are classified under the 'Sino-Tibetan' family, how,
then, has Vietnamese come to be categorized as a member of the
'Austroasiatic' family, specifically the 'Mon-Khmer' subbranch? How does
this classification reconcile with the Sino-Tibetan and ancient Yue
etymological evidence presented in this paper?
The challenge lies not in the data, but in the mindset of those committed
to inherited frameworks. Reevaluating Vietnamese classification requires
confronting entrenched assumptions and acknowledging the complexity of its
linguistic ancestry.
V) Cultural integration and beyond
From an ethnic-historical standpoint, theorists within the 'Austroasiatic'
framework have posited that the origins of Vietnamese, both its people and
its language, are primarily traceable to "Mon-Khmer" speakers. This
hypothesis finds support in the composition of Vietnam's ethnic minorities,
whom we classify as later arrivals. Alongside other populations of "Yue"
derivation, both major and minor, these groups now comprise a total of 54
officially recognized ethnicities (as recorded in the 2023 census), with
many communities speaking at least one "Mon-Khmer" language, especially those
inhabiting the western highlands and southernmost provinces of
Vietnam.
With regard to the assertion, within our racial-component perspective, that
the Mon-Khmer elements were merely latecomers, note that Vietnam acquired
its southernmost territory from the ancient Khmer Kingdom only about 325
years ago. In the contemporary era, Vietnam's territory is many times larger
than the ancient Annamese land of two millennia ago (excluding the portion
once part of the NamViệt Kingdom that now lies within Guangdong Province of
China). From an Austroasiatic viewpoint, modern Vietnam encompasses even
more indigenous Mon-Khmer ethnic minorities, who have inhabited their
ancestral lands for over 2,210 years before present, or since prehistoric
times, as some Austroasiatic Mon-Khmer theorists propose.
Ethnically, as of late 2023, Vietnam’s population surpassed 100 million, with over 85.7 percent identified as the "Kinh" majority. The ancestral roots of the Kinh trace largely to Sinicized 'Yue' emigrants who migrated southward from China into the region now known as northern Vietnam. Over centuries, these groups gradually intermingled with indigenous populations, including 'Chamic' and 'Khmer' communities situated south of the 16th parallel, especially following the 12th century.
Linguistically, on the other hand, evidence from the Sino-Tibetan family
indicates that Sinitic-Vietnamese elements constitute more than 95 percent of
the Vietnamese lexical inventory. This includes not only basic and
foundational vocabulary of Tai origin, recurrent across ancient linguistic
strata, but also a rich array of shared features and structural peculiarities
that remain indispensable in modern Vietnamese usage.
Drawing on archaeological excavation, proponents of the 'Austroasiatic'
school have argued that the Indo-Chinese peninsula serves as the cradle of
"Khmer" ethnogenesis. Within this framework, the indigenous
substratum, reflected in the "Mon-Khmer" foundational vocabulary embedded in
Vietnamese, was reinterpreted as a layer of Chinese loanwords, allegedly
introduced by emigrants from 'China South' who settled in Vietnam. This
theorization was designed to reject the notion that ancient "Yue" entities
represented a veiled Austroasiatic presence.
However, two critical oversights emerge from this dismissal. First, the
"Yue" and Austroasiatic populations may share ancestral ties with the native
inhabitants of 'China South', suggesting a deeper ethnohistorical
convergence. Second, the Vietnamese are a racially composite people: they
include descendants of the "Yue" as well as earlier settlers in the Red
River Delta—those whom Austroasiatic theorists identify as ancient
"Mon-Khmer" speakers who later established autonomous polities in the
southern territories.
Over the past millennium, this demographic mosaic expanded to include
successive waves of "Chamic" and "Khmer" populations crisscrossing in the
region, whose ancestors were gradually assimilated into the region's
geopolitical landscape beginning in the 12th century.
The author's position on this issue is that, although the Austroasiatic
language family may have given rise to the Mon-Khmer languages, it is not
directly ancestral to modern Vietnamese (see Table 1 above).
Anthropologically, prior to the arrival of Mon-Khmer groups from the
southwest, local aboriginal populations and early settlers likely
intermingled in the same region. These groups are believed to have descended
from shared Taic ancestors in the Red River Delta of northern Vietnam, a
region that extended into what is now southern China, as discussed earlier.
This scenario is proposed solely to account for the commonality of several
shared fundamental words between Vietnamese and Mon-Khmer languages.
Chinese sources and anthropological evidence suggest that early immigrants
from both the southwestern and northern neighbors of ancient "Annam", in what
is now northern Vietnam, were present in the northwest long before these
regions were incorporated into Annam's geopolitical domain. These
populations were of mixed "Taic" descent, presumed to be descendants of
ancient "Daic" peoples. A similar pattern recurred with later "Mon-Khmer"
migrants. The integration of these groups into the existing ethnically
diverse population did not significantly alter Vietnam's overall ethnic
composition as the Annam polity expanded westward and southward.
It was not until the late 17th century that the western territories of the
old "Khmer" Kingdom were annexed into Vietnam, with their inhabitants now
classified as ethnic minorities. A comparable process had already occurred
along the central coast, south of the 16th parallel, where Chamic natives
were gradually incorporated between the 12th and 18th centuries.
Archaeologically, this southward expansion contributed to the formation of
contemporary Vietnamese communities stretching from the central coastline to
the tip of Camau Cape. To date, cultural artifacts excavated from these
ancestral lands were neither created by nor exclusively associated with the
forebears who founded Annam nor with modern Vietnamese, and their linguistic
items must be evaluated accordingly. Under linguistic scrutiny, the early
Annamese language appears to have undergone only limited transformation
after prolonged exposure to local speech, presumably of Austronesian Chamic
or Austroasiatic Mon-Khmer language family. In fact, aside from the adoption
of a few local elements, such as placenames and foundational lexicons
encountered along the southward migratory routes, the developments south of
the 16th parallel during that period bear minimal anthropological or
linguistic connection to ancient Vietnamese identity, despite assertions
made by Austroasiatic theorists.
Meanwhile, new Vietnamese nationals, such as late Ming refugees from
Chaozhou (the Teochew people, possibly the group underlying the modern
derogatory term "Tàu"), fled southward by boat in large numbers during the
17th century, as Qing Manchurian forces advanced to occupy mainland China.
These refugees eventually resettled in what is now the southwestern region
of Vietnam. As a result, their presence, along with the Teochew language,
has continuously infused the Vietnamese lexicon with new phonological layers
atop the older lexical substrate, while Khmer has contributed only a little.
Culturally, Chinese society has long absorbed traditions from northern
peoples, including the Moon Festival, which some attribute to Altaic or
Korean influences from the northeastern frontier, while also retaining
deeply rooted ancestral Yue elements. As noted earlier, one such Yue
contribution is the duodenary cycle of twelve animals, a system that has
served as a chronological marker for years across centuries.
Chinese identity, it is clear, is fundamentally cultural rather than
racial. There is no distinct Han race; instead, we speak of the Han people,
much as older Pekingese once referred to themselves as Qírén (旗人),
reflecting Manchurian or Jurchen ancestry, or as veteran Cantonese still
identify as Tang subjects (唐人), denoting their heritage as citizens of the
Great Tang Empire (大唐帝國), even as they have migrated across the globe.
In practice, the Chinese are an ethnically mixed population, unified by a
shared national identity. This is evident in how overseas Chinese continue
to regard themselves as Chinese, regardless of whether they hold citizenship
in Taiwan, Malaysia, Singapore, Canada, or the United States.
In contrast, Vietnamese national identity encompasses not only the tangible
legacy inherited from the extinct Champa and Khmer kingdoms—their lands,
peoples, languages, and cultural artifacts—but also the intangible spirit of
nationalism and valor passed down from generations who resisted repeated
Chinese aggressions. Anyone who has read all 72 volumes of Bo Yang's edition
of Sima Guang's Zizhi Tongjian (資治通鑑, 1983–93), which chronicles Chinese
governance from antiquity through the Song Dynasty (宋朝), will have
encountered the harsh realities imposed by successive Middle Kingdom regimes
upon their own subjects. These narratives reveal the suffering of commoners,
including those in the colonized vassal state of ancient Annam, and help
explain why modern Vietnam endures as a nation sustained by a resilient
national spirit—a collective will to resist foreign domination and preserve
cultural integrity.
Vietnam, uniquely, is remembered for having repelled three Mongol invasions
launched by the heirs of Genghis Khan, whose forces shattered the Song
Dynasty and established the Yuan Dynasty (元朝) on Chinese soil, a regime
that endured for nearly a century.
Here, nationalism refers to the indomitable spirit of the Vietnamese people
and their hard-won independence, a spirit they have consistently defended.
This fervent nationalism has shaped their anthropological identity,
especially their national language. It helps explain why many Vietnamese
reject genetic affiliation with the Chinese and question aspects of the
Austroasiatic theory, instead affirming an ancestral connection to the Yue,
a non-Chinese lineage, an interpretation steadfastly upheld by patriotic
Vietnamese scholars.
In an ethnically diverse society, elements assimilated into the
Vietnamese melting pot emerge distinctly as Vietnamese, regardless of
whether a person is of Chinese, Chamic, or Khmer descent. The history of
the nation known as Vietnam is a chronicle of descendants from those who
arrived either as conquerors or as refugees fleeing hunger and
oppression from the north. Their long southern journey, culminating at
the tip of the Indo-Chinese peninsula, spanned nearly ten centuries
during which they waged continuous wars against northern and southern
external enemies, beginning as early as 939 A.D., in the relentless
pursuit of national sovereignty.
Vietnamese history is shaped not only by resistance wars but also by
ongoing patterns of immigration and emigration, much like China's.
Consider Taiwan, where modern migration trends mirror those familiar in
Vietnam: successive waves of Chinese migrants from the mainland settled
over generations, while hundreds of thousands of Vietnamese women
married into Taiwanese families. This long-standing exchange continues
today.
In other words, the history of the Vietnamese people is also the story
of descendants of racially mixed immigrants from southern China. These
groups included refugees fleeing war-ravaged regions, as well as outcast
proletarians from newly affluent provinces. Notably, fugitive Ming
loyalists, escaping execution after the Manchurian conquest and the
founding of the Qing Dynasty (1644–1912), contributed to Vietnam's
migratory mosaic. This is reflected in the prevalence of Chinese surnames among the Kinh
majority.
In the 21st century, Vietnam continues to receive immigrants from its
northern border with China, including economically disadvantaged
laborers and so-called 'technical' workers, many of whom, critics argue,
form a 'Chinese fifth column' after overstaying their visas. Regardless
of origin, many Chinese emigrants from inland provinces along the
northern frontier have, over time, come to identify as Vietnamese. Since
the 1990s, over one million new migrants from mainland China have
settled permanently in Vietnam, often through marriage into Vietnamese
families, a trend well documented at annual gatherings of Chinese
expatriates.
The formation of the Kinh majority was shaped not only by immigration
but also by domestic emigration. Hanoi, much like Shanghai, underwent
significant demographic shifts as its original residents relocated—some
moving south during the great migration of 1954–55, others departing
overseas after the Vietnam War ended on April 30, 1975. As middle-class
urban dwellers left in search of opportunities abroad, their absence was
gradually filled by incoming villagers, who arrived as new migrant
laborers to occupy the growing vacancies in the city.
Taken together, these demographic shifts reveal that modern Vietnamese
identity, and the Vietnamese language, cannot be traced solely to
Mon-Khmer origins. Instead, contemporary Vietnam reflects a complex
mosaic of ancestry. Its citizens are primarily of mixed Chinese descent,
tracing back to the ancestral Yue of Zhou Dynasty vassal states and the
Yue-influenced Han of the Chu region more than 2,100 years ago. They
also carry genetic contributions from native Mon, Chamic, and Khmer populations from the 12th century onward, along with
more recent admixtures, such as Amerasian children born to American
servicemen during the Vietnam War (1965–1975), which added over 50,000
individuals to South Vietnam's population of 20 million by 1975. This
extensive intermingling underscores the profound racial mixing that
defines Vietnam.
Linguistically, Austroasiatic theorists have pointed to Mon-Khmer basic
words in Vietnamese as evidence for their theory. Their presence among
the numerals, for example, is limited to the range from one to five;
these items do not align with Vietnamese counting from six to ten at all,
and they bear no genetic relationship to the core vocabulary. Like any living language,
Vietnamese has absorbed a wide range of loanwords over time, including
those from Daic, Thai, and Malay, as well as English and French,
alongside contributions from the Austroasiatic family.
Statistically, the rate of foreign lexical infiltration in Vietnamese
remains modest. Even the decade of active American presence during the
Vietnam War failed to significantly reshape the language, leaving only a
small set of persistent English terms, such as 'hello', 'okay', 'bye-bye',
'number-one', 'one-two-three', 'snack-bar', 'cowboy', '(bus)boy', 'hippy',
and 'jeep', in stark contrast to the enduring Sinitic influence.
In fact, the situation became somewhat farcical when certain French
institutions sponsored Vietnamese scholars to publish works on French
influence in Vietnamese, including one that argued for a French origin
of select Vietnamese words (see Cao Xuân-Hạo, 2001). Had the French
colonial presence in Annam lasted longer, it is conceivable that roughly
400 French loanwords might have entered mainstream usage. By proportion,
French loanwords, remnants of the 96-year colonial legacy ending on July
20, 1954, number several hundred in Vietnamese (see APPENDIX A (5)). Common terms in some Vietnamese circles, such as 'moi' 'I', 'toi'
'you', 'monsieur' 'mister', 'madame' 'madam', and various modern
grammatical constructions, of course, do not reflect a deep-rooted
etymological bond.
This stands in contrast to entrenched Chinese pronunciations in
Vietnamese, such as "anh" (兄 xiōng, SV "huynh", 'brother'), "em" (俺 ǎn, SV "am", 'younger sibling'), "chị" (姊 zǐ, SV "tỷ", 'sister'), "cô" (姑 gū, SV "cô", 'miss'), and "mẹ" (母 mǔ, SV "mẫu", 'mother'), as well as the many
modern Chinese loans that remain popular today, including "bảotrọng"
(保重 bǎozhòng, 'take care'), "đảmbảo" (擔保 dānbǎo, 'guarantee'),
"thịphạm" (示范 shìfàn, 'demonstrate'), "đạocụ" (道具 dàojù, 'prop
set'), and "giaođãi" (交待 jiāodài, 'to brief').
VI) Key contributions to linguistics
Anthropologically, in addressing the origin of Vietnamese etymology,
the author advances an independent argument grounded in data analysis to
counter the claims put forth by the Austroasiatic linguistic camp, which
he regards as having introduced a distracting agenda into the debate.
Advocates of this camp approach the issue from a southern geospheric
perspective, focusing on regions where the Austroasiatic boundary
intersects with the Austronesian racial substratum—particularly among
Chamic populations in the Indo-Chinese peninsula—and extending across
the archipelagos of Malaysia and Indonesia, the western islands of the
Philippines, and Taiwan, formerly known as Formosa.
Why did Austroasiatic theorists group the Vietnamese language into the
Mon-Khmer branch in the first place? The Austroasiatic hypothesis took
root largely because the Mon-Khmer populations dominated the Indo-Chinese
peninsula and permeated deeply into the local demographics. Additionally,
this hypothesis emerged during the 'gold rush' era of historical
linguistics in the late 19th century, when Western linguists had yet to
hear of the Yue people and their linguistic legacy. By contrast, Mon-Khmer
speakers in Southeast Asia resonated with the grandeur of the ancient
Khmer Empire, a past that captured admiration and envy. This led to the
creation of the Viet-Muong subdivision within the Austroasiatic Mon-Khmer
linguistic subfamily as scholars sought connections among these groups.
In response, the author firmly establishes the theory that the
Vietnamese people descend primarily from ancient Yue ancestry in
southern China, having intermixed with Han settlers during the
millennium of Chinese domination following 111 B.C. As the Annamese
polity expanded southward into what is now central Vietnam, further
admixture occurred with Austronesian Chamic and Austroasiatic Mon-Khmer
populations. Consequently, the modern Vietnamese population reflects a
racially composite lineage shaped by centuries of migration,
integration, and cultural synthesis.
That is the author's antithesis to what the Austroasiatic Mon-Khmer
theorists have long argued, namely that the Sinicization of indigenous
Mon-Khmer people in ancient Annam was the real process that produced
the Vietnamese identity. The Austroasiatic viewpoint largely ignored the recorded
history of Yue people, considered ancestors of early Annamese
populations, who had advanced further south and bridged the
anthropological gap leading to modern Vietnamese fusion. According to
the Austroasiatic camp, the intermingling of Mon-Khmer people with
Chinese resettlers during the colonial period was the origin of the
Vietnamese. They claimed that Mon-Khmer peoples from the Indo-Chinese
peninsula were the direct ancestors of modern Vietnamese. Crucially, the
'Vietnamization of the Mon-Khmer' factors seemed overlooked, possibly
because the timeframe of when Mon-Khmer groups purportedly arrived in
the Red River Basin, already inhabited by Daic populations, remains
vague.
Archaeological findings in Central Vietnam further affirm that the
inhabitants prior to these migrations bore no ancestral connection to the
Vietnamese. Historically speaking, early Vietnamese emigrants ventured into the
southern Indo-Chinese peninsula only after the 12th century, where they
first mixed with the Chamic people. This mixing was facilitated by the
concession of two Chamic prefectures as a gift to the Tran Dynasty
through royal interracial marriage between the King of Champa and a
Vietnamese princess, Huyềntrân Côngchúa. That is how the later Vietnamese
came to appear along Vietnam's central coastline and its
southwestern part.
Intriguingly, the Austroasiatic hypothesis aligned neatly with domains
historically attributed to the Yue as recorded in ancient Chinese
annals, a coincidence that blurred distinctions between Yue and
Austroasiatic entities. The Austroasiatic Mon-Khmer theorists discreetly
adopted this notion while sidestepping the complexities of
Sinitic-Vietnamese linguistics. It was certainly simpler to identify a
set of basic words shared by Mon-Khmer and Vietnamese and then draw
conclusions about their shared roots, rather than confronting more
intricate etymological challenges.
In effect, it has often proved formidable and challenging for many
Western-educated scholars to delve deeply into ancient Chinese classics
to uncover the intricate etymological roots of Vietnamese. While their
linguistic expertise often excelled in the realms of proto-Chinese, Old
Chinese, and Middle Chinese, utilizing phonetic sound rules and
methodologies, this approach fell short in the case of Vietnamese, both
historically and in contemporary studies.
Unsurprisingly, it was not until the early 20th century that Sinology
became an established discipline, and even then, very few scholars could
confidently substantiate the connection between Sinology and the
exploration of Vietic roots. Renowned linguists such as De Lacouperie,
Maspero, Haudricourt, Shafer, Forrest, and Karlgren were among the
select few whose work pointed to Sinology as a vital key for
understanding Vietnamese etymology. Without a deep knowledge of Chinese
language and history, no one could reliably offer a comprehensive view
of Vietnamese linguistic origin.
Despite these competing frameworks, the broader picture can be
synthesized by integrating the perspectives of Yue and Austroasiatic
Mon-Khmer into one concept, the "Bod" (Terrien De Lacouperie, 1887). It is conceivable that Indo-European theorists may have deliberately
substituted the term Yue with Austroasiatic in order to reframe
aboriginal Yue entities along a continuum that aligns with established
historical linguistic models. This interpretive shift, whether
intentional or methodological, echoes earlier typological depictions
found in the works of T. D. Lacouperie (1887) and R. A. D. Forrest
(1948).
Geographically, in fact, by replacing the term 'Austroasiatic' with
Yue ("Bod" or "BáchViệt"), the author
traces the movements of early indigenous Yue emigrants—LuoYue (雒越), OuYue (甌越) or Xi'Ou (西甌), and MinYue (閩越) or
Dong'Ou (東甌), as well as racially mixed groups like the Qin-modified
Shu (巴蜀 BaShǔ, "BaThục"), Yue-modified Chu (楚, Chǔ, "Sở"), Yue-modified Han (漢), Hakka (客家,
Kèjiā or "Cácchú"), Hokkien, Hainanese, Cantonese, etc.—from China South to northern Vietnam and across vast areas of Southeast Asia and beyond. These groups advanced southward, resettled, and
intermingled with native inhabitants along their journey, in the
case of Vietnam fusing with the Chamic and Mon‑Khmer peoples. In a sense, this process is encapsulated in the official name
"Việtnam", which first appeared in 1802. This designation can also be
read as a reverse form of "NamViệt", meaning 'the Việt of the South',
which is usually misinterpreted as 'to surpass in the south' or 'advance southward'.
Such connotations highlight the migratory pattern of the ancestral Yue,
whose emigration from China South became more pronounced around 300 B.C.
in response to Qin expansion (Lu Shih Peng, 1964).
Figure 1.5: Map of the historical ancient proto-Chinese migratory routes
Source: Multiple sources on the internet
The author's perspective on the southward geo-spherical migration of the Yue
originating from a northern axis and radiating toward the southern
hemisphere can be expanded without invoking competing theories regarding the
origins of Austronesian populations, whose dispersal spans the eastern
hemisphere over a timeline of 3,000 to 4,000 years, as supported by
available historical records. (A) This framework
aligns with archaeological evidence indicating that the Yue were not the
exclusive creators of bronze drums, artifacts that have also been unearthed
in the Shu State (蜀國) of Sichuan in southern China and across parts of
Indonesia. In these regions, Austronesian interpretations have informed
alternative hypotheses, including the Austro-Thai theory. Fundamentally, all
southern migratory trajectories appear to originate from northern sources.
Practically speaking, the Austroasiatic hypothesis overlooks alternative
perspectives on the proto-Yue presence, which extended as far northeast as
the Yangtze River and up to the Yellow River basin. For example, the
proto-Yue were present in the ancient Lu State (魯國) within Shandong
Province (山東), as suggested by the broader ethnological framework of the
Taic-Yue stock originating from the Chu State (楚國) near present-day
Hubei (湖北) and Anhui (安徽) provinces. Vietnamese legends, too, recount
that their earliest ancestors emerged from the Dongtinghu Lake area
(洞庭湖) in Hunan Province (湖南), south of Hubei. Together, these regions
form a contiguous zone representing the racial principality of the Taic
stock.
The author's postulated frameworks for both "Yue" and "Austroasiatic"
theories are further synchronized with available long-standing ancient
Chinese legends and history. Different tribes of the ancient Taic-Yue people spread both eastward
and westward, contributing to the racial composition of the pre-Qin (先秦)
era, a view backed by evidence that includes early human fossils discovered
in ancient Sichuan Province, where the Bashu State (巴蜀) was once located. These tribes collectively introduced new
cultural elements to the pre-Han (前漢) populace, with
the key difference lying in name changes over time. Notably, the
first monarch of the Han Dynasty, Liu Bang, and his generals and
followers were originally subjects of Chu (楚), as repeatedly emphasized.
Had the last Duke of Chu, Xiang Yu (項羽), defeated Liu Bang in the
decisive battle, the dynasty might well have been named 'Chu' rather than
'Han'.
As previously mentioned, after the Han forces defeated Chu, the subjects
within the Han Empire's periphery gradually came to identify as Han people
(漢人 Hànrén), a process that took considerable time. This marked the
emergence of the Chinese Han from a racially mixed population composed of
pre-Han peoples and Taic-Yue descendants. These included groups from six
ancient states conquered and unified under Qin rule in 221 B.C. The racial
composition of the Chu subjects primarily consisted of Taic-Yue
descendants, who in turn gave rise to the Southern Yue tribes (百越
BǎiYuè, SV "BáchViệt", 'Bod') through various historical stages spanning the Zhou, Qin, and Han
periods.
In essence, the Vietnamese ethnogenesis reflects a layered process:
rooted in ancient Yue ancestry from southern China, subsequently
intermixed with Han settlers during a millennium of Chinese rule beginning
in 111 B.C. As the Annamese advanced into central and southern Vietnam,
further admixture occurred with Austronesian Chamic and Austroasiatic
Mon-Khmer populations. The result is a modern Vietnamese demographic
profile shaped by centuries of migration, integration, and cultural
synthesis.
The demographic evolution of ancient Annamese populations initially
paralleled that of other Southern Yue-descended groups, including the
Cantonese (粵), Fukienese (閩越, 'Hokkien'), and WuYue (吳越). Yet this
resemblance proved short-lived. The Vietnamese historical trajectory
diverged markedly under prolonged Chinese domination, spanning from
111 B.C. to 939 A.D., punctuated only by brief episodes of autonomy.
Following the 12th century, the emergent Annamese polity began a sustained
southward expansion beyond the 16th parallel, gradually consolidating its
territorial reach over the following centuries. This arc culminated in 1989,
when Vietnam withdrew from Cambodia (formerly Kampuchea) and restored its
pre-1979 borders.
Figure 1.6. The distribution of indigenous languages before the
Vietnamese
Map of the Austroasiatic languages per the Austroasiatic view
Source: Multiple sources on the internet
x X x
The nature of a people's mother tongue, as commonly perceived, often
reflects their racial composition, and vice versa. The Austroasiatic
Mon-Khmer hypothesis for Vietnamese appears to align with this notion. A
playful way to frame this theory is to liken the Vietnamese language to
the product of a "forced marriage" between Mon-Khmer and Chinese
influences. From an anthropological standpoint, the prolonged colonization
of early Annamese populations might reflect a dynamic of role reversals:
the "guests" (early Kinh settlers) ultimately became the new sovereign
majority, while the indigenous natives assumed subordinate roles in their
own land, newly annexed into a foreign state.
As life progressed in the resettlement, separate from mainland China, let
us envision a "what-if" scenario. Imagine a family of new homeowners
moving into a residence previously inhabited by others. While settling in,
the new occupants discover cultural artifacts buried on the property. The
head of the household could easily claim ownership of the artifacts, but
it would be dishonest to present them as ancestral heirlooms, treasures
passed down by their forebears. Meanwhile, their descendants adopt new
surnames, such as Phạm or Trần, except for cases of Chamic or Khmer
heritage, marked by surnames like Chế or Thạch. This
illustrates how the Vietnamese identity absorbed not only Chinese surnames
from a broader set of Chinese-origin names but also names rooted in Chamic
or Khmer lineage.
Linguistically, a nation's language does not always reflect the tongue
spoken by its ancestors. Analogous phenomena exist worldwide: for
instance, modern French is distinct from the Gaulish language of ancient
France, and people in former French colonies like Morocco or Haiti
continue to speak French, albeit with distinctive local accents. For the
Austroasiatic view, rooted in the heritability of language based on racial
identity, to hold water, Vietnamese speakers would need to be "racially
pure" Mon-Khmer, or at least comparable to the Muong linguistic stock.
However, this does not seem to align with the evidence, just as Cantonese
and Fukienese remain grouped within Chinese dialectology despite their
divergence. To enforce such a standard would risk undermining broader
notions of national identity, particularly for larger nations such as
China.
It is also worth recalling that modern Vietnamese as a fully formed
language did not emerge until after Vietnam gained independence from China
in the 10th century.
Etymologically, the commonalities in certain basic words can be explained
as the result of linguistic contact. Words from one Mon-Khmer language
spilled over into the Muong subdialects, which in turn influenced the
Vietnamese language. This was made possible by their geographical
proximity, particularly in mountainous regions further south, where
aboriginal populations retreated in the face of Chinese occupation and
Sinicization. Even though mutual intelligibility between Vietnamese and
Muong languages waned long after their split, Muong speakers have remained
anthropologically and culturally connected to the Vietnamese Kinh as
neighboring kin in many ways.
Additionally, these shared basic words spread between Vietnamese and
Mon-Khmer languages through everyday activities such as trade, bartering,
agricultural exchanges, handicraft production, and shared farming
practices. In other words, while the Kinh collaborated with Chinese
occupiers, they also maintained ties with other diasporas within their
territory. This interaction bridged linguistic gaps between Vietnamese and
Mon-Khmer languages. These encounters trace back to prehistoric times,
starting with the first wave of Mon-Khmer speakers moving into the Red
River Delta from southwestern Lower Laos (Nguyễn Ngọc-San, 1993, p.
43).
Methodologically, Austroasiatic linguists grouped related basic etyma
spanning many Mon-Khmer languages into a broad linguistic spectrum of
mixed elements. However, some of the Mon-Khmer basic words found in
Vietnamese also have cognates in Chinese and Sino-Tibetan languages,
referred to in this paper as Sinitic-Vietnamese words. Many fundamental
etyma in Vietnamese reveal roots in Yue-related languages like Cantonese,
Teochew, Hainanese, and Fukienese, as well as Sino-Tibetan etymologies,
further complicating the Austroasiatic hypothesis. The Austroasiatic
theorists appear to have grouped these elements under the Mon-Khmer
umbrella without addressing their potential origin elsewhere, while the
pervasive influence of Khmer served as an overwhelming and, at times,
unchallenged foundation for their claims.
The Austroasiatic theory emerged with its Mon-Khmer linguistic subfamily
as a focal point but has also engaged in dismissing the Sino-Tibetan
theory, which predates it and posits an alternative root for the
Vietnamese language. The issue of linguistic affiliation thus involves not
only Austroasiatic Mon-Khmer versus Yue but also Sinitic-Vietnamese versus
Sino-Tibetan frameworks. This dynamic is further complicated by the vast
number of Sino-Tibetan cognates in Sinitic-Vietnamese and the unique
linguistic features shared between Vietnamese and Chinese. While it may be
simpler to accept that ancient Annamese developed from a Yue linguistic
foundation layered upon a Taic base, the claim that certain basic lexicons
in the Viet-Muong subdialects could be loanwords from neighboring
Mon-Khmer languages aligns with the understanding that these languages
were part of a broader family spanning southern China hundreds of years
ago.
In any case, whether or not the Vietnamese language belongs to the
Sino-Tibetan linguistic family, Austroasiatic Mon-Khmer theorists remain
focused on genetic classification, proposing that Austroasiatic Mon-Khmer
is the mother language that gave rise to Vietnamese, as their theory
asserts. Meanwhile, the Sino-Tibetan camp highlights the Sinitic affinity
of Vietnamese, as explored in this paper, tracing its historical
foundations back approximately 3,000 years, a timeline notably absent in
the prehistoric Austroasiatic Mon-Khmer framework.
Regarding the timeframe in historical linguistics and their
affiliations, Merritt Ruhlen, in The Origin of Language (1994), quotes Hans Henrich Hock:
"We can never prove that two given languages are not related. It is always
conceivable that they are in fact related, but that the relationship is of
such an ancient date that millennia of divergent linguistic changes have
completely obscured the original relationship.
Ultimately, this issue is tied up with the question of whether there was a
single or a multiple origin of Language (writ large). And this question can be
answered only in terms of unverifiable speculations, given the fact that even
the added time depth provided by reconstruction, our knowledge of the history
of human languages does not extend beyond ca. 5,000 B.C., a small 'slice'
indeed out of the long prehistory of language." (Hock 1986:566).
In this work, the author explores both perspectives, i.e., genetic
affinity and historical setting. The research aligns many basic words with
Chinese-Vietnamese interchanges, drawing on more than 400 fundamental
lexical items from a wide range of Sino-Tibetan etymologies. To make their
cognacy more plausible, these items are supported by discussion of the
Chinese linguistic peculiarities that Vietnamese shares, which
substantiates the core of the Sino-Tibetan theory as presented in this
paper.
The author has also set up new methodological foundations for his
Sino-Tibetan Sinitic-Vietnamese theory. In contrast to the accelerated
information-gathering capabilities of the Artificial Intelligence (AI)
era, the author's research methodology originates in the pre-internet
age, grounded in traditional scholarship. His findings were documented the
old-fashioned way, that is, through direct engagement with printed books,
hundreds of them, examined one at a time, page by page and line by line.
Each insight was manually recorded on index cards, extracted from a vast
corpus of publications. As of 2025, only about one third of these titles
have been entered into the Bibliography and References, with compilation
still ongoing—a time-consuming but meticulous task.
Over the span of more than 20 years, the author accumulated over 20,000
research notes by the year 2000, just as the world was entering the full
momentum of the internet-information era. These notes were not simply
archived but deeply internalized, forming a durable cognitive framework
that continues to shape his analytical process. Rarely needing to revisit
the original index cards, he has mentally constructed from this foundation
a comprehensive perspective on the structural and semantic essence of
Vietnamese etymology, recovering a linguistic heritage that had long
receded from scholarly view.
With this substantial body of evidence, he is now systematically
assembling corroborative data to support the argument that the majority of
cited ST Sinitic-Vietnamese lexical items can be traced to at least one
cognate in Chinese, thereby reinforcing the historical and linguistic
continuity between the two traditions.
In the digital era, the author continues to advance the project through
incremental releases on his website,
prioritizing mobile accessibility and modular presentation. This format
ensures broad public access while preserving editorial clarity and
semantic precision, allowing the research to scale without compromising
its methodological rigor.
The overarching goal remains eventual publication in mainstream print,
aimed at advancing a thesis that invites renewed linguistic inquiry into
the Sino‑Tibetan continuum. Central to this thesis is the proposition that
modern Sinitic lects emerged as branches shaped primarily by the fusion of
Taic‑Yue substrata with proto‑Tibetan, culminating in proto‑Chinese and
the layered pre‑Qin-Han linguistic strata across ancient China. By
analogy, the integration of Yue elements into the Annamese sphere during
the Qin and Han periods likely contributed to the genesis of early
Vietnamese, crystallizing around the 10th century.
Perceptually, regardless of the final contours of this narrative, the
author asserts that languages must be approached as holistic, living
systems, to be understood in their full contemporary complexity rather
than reduced to their earliest reconstructible stages, whether 3,000 or
5,000 years prior. This view parallels our treatment of English: not
merely as an Indo‑European relic, but as a dynamic amalgam of Anglo‑Saxon,
Germanic, Norman, Romance, Latin, Greek, and other influences that
together have shaped its present form.
Methodically, this paper's technical arrangement is designed to engage
both novices in Vietnamese historical linguistics and specialists in
Chinese and Vietnamese philology. Vietnamese learners with a solid
understanding of Mandarin (M), known in Vietnamese as "Quanthoại" (QT,
sometimes referred to as "tiếng Quanhoả" or 官話 Guanhua) and now
officially named 'Putonghua' (普通話), together with a foundational grasp of
historical linguistics, will likely find this study particularly enriching.
Certain explanations may seem overly detailed or repetitive to experts
already familiar with widely recognized points, while occasional gaps
might challenge general readers (普); these choices aim to strike a balance
that caters to diverse audiences.
Introductory resources on historical linguistics can be found in the
bibliography at the end of this paper.
This introductory section aims to establish rapport with readers and
clarify the author's perspective ahead of the detailed scope of this
study, forging an academic connection with those seeking insight into the
linguistic origins of Vietnamese. It does not present itself as a formal
scientific paper replete with data tables and statistical modeling;
rather, it offers a narrative exploration of the Vietnamese language,
framed under the Vietnamese subtitle Ýthức Mới Về Nguồngốc TiếngViệt—literally, "New recognition on the origin of Vietnamese."
For readers who remain skeptical yet intrigued by the etymological ties
between Vietnamese and Chinese, it may be worthwhile to await the full
book publication of this research. Printed works often invite deeper
engagement and a more sustained openness to complex arguments. It is
unlikely that the full contours of this inquiry will be patiently absorbed
in an online format. In print, however, the material may be approached
with greater impartiality, supported by quotable evidence—contingent, of
course, on the author's success in securing a reputable publisher.
Human nature tends to favor ideas that resonate with instinctive beliefs.
To fully appreciate the insights offered here, readers ideally should
possess a foundational understanding of the historical interplay between
Vietnam and China. That said, newcomers are welcome, provided they
bring sincere curiosity to the subject. This research, at its core, is a
retelling of history, a compelling linguistic and cultural narrative that
may captivate both seasoned scholars and engaged lay readers alike.
It would be ideal if we shared mutual interests, as this commonality
builds trust and belief, akin to the solidarity of believers in the same
religion. A new theory typically begins with foundational premises,
facts, quotations, supportive evidence, rules, paradigms, analogies,
logic, etc., from which we adopt a shared perspective, accepting them as
a basis for further discussion as we progress. For instance, if we
propose that 雞 (jī, SV kê) = VS "gà" (chicken) and 蛋 (dàn, SV đản) =
VS "trứng" (egg) are cognates of indisputable origin, both likely
stemming from Yue roots simply because the words existed long before
the Chinese, then there is no need to belabor proof.
Once accepted as premises, our focus shifts to examining whether the
bird originated from the south or the north, or even delving into the
age-old question of whether the egg predates the chicken.
From a historical linguistics standpoint, it is apparent that, in cases
of language contact between two groups, the dominant language tends to
assimilate the less dominant one over time. The details of this process
hinge upon factors such as the prowess, population size, and cultural
sophistication of the groups involved. For example, the language of a
conquering population, following an extended period of bilingualism,
ultimately becomes adopted by the subjugated group (Robert J. Jeffers
et al. 1979, p. 142).
While novices in historical linguistics may initially struggle to
embrace such reasoning, this reflects the realities of history.
Specifically, in the Sinitic-Vietnamese context, the process of
Sinicization in Annam spanned hundreds of years. The author has no
intention of debating with detractors who have contested his arguments
in the past or will do so in the future. Likewise, the author does not
aim to recruit followers among those resistant to the Sinitic-Vietnamese
theory, as there is no absolute truth in historical linguistics; the field
is shaped by interpretation and perspective. People's beliefs are often
deeply entrenched, guided by instinct or predisposition. Comments such
as "Chinazi propagandist!", "Wikipedia sources are unreliable and
unquotable!", or "Bogus views!" are predictable reactions from those
opposing this perspective, especially in the AI era.
The author encourages such readers to leave this forum if they do not
agree with what has been discussed thus far. This work seeks
engagement, not discord.
How did the author arrive at this juncture in his etymological
exploration? Admittedly, he is not formally trained as a historical
linguist specializing in Vietnamese. Yet his journey began with fortunate
exposure to foundational linguistics courses taught by three towering
figures in the field: Professors Nguyễn Tài Cẩn, Hoàng Tuệ, and Bùi Khánh
Thế, renowned scholars at the former Saigon University during the late
1970s. What began as academic curiosity soon evolved into a lifelong
devotion to the study of Vietnamese etymology and its Sinitic
underpinnings.
The author vividly recalls his first assigned project under Professor
Hoàng Tuệ: an inquiry into the term "tiếng" ('sound') in Vietnamese. This
deceptively simple word encapsulates a constellation of meanings: sound,
morpheme, syllable, word, and language. His comparative research into its
Chinese counterpart 聲 (shēng, SV "thanh") proved transformative. The
semantic breadth of 'shēng', especially its appearance in expressions like
蠻聲 (Mánshēng), referring to "tiếngMôn" in the Shaozhou Tuhua (韶州土話)
dialects of Guangdong, Hunan, and Guangxi, revealed a profound linguistic
resonance across cultural boundaries.
For the author, "tiếng" and 聲 have become the "Đạo" (道 Dào, 'the Way')
through which the vaults of Sinitic-Vietnamese
etymology are unlocked. This guiding principle has propelled him to
examine other Chinese characters whose meanings stretch far beyond their
conventional semantic domains. Such is the enchantment of language: a
system at once rigid and fluid, historical and living.
Why does the author advocate so confidently for a Sinitic hypothesis
while remaining skeptical of Austroasiatic models? The interplay between
Chinese and Vietnamese is intricate, and the author approaches it with a
blend of scholarly rigor and tongue-in-cheek candor, for his is a
hypothesis rooted in experience and observation. Few are willing to wade
deeply into these debates, which often resist definitive
resolution. Linguistic affiliation theories, especially those modeled on
Indo-European paradigms, tend to falter when applied to Austroasiatic
contexts. Dissenters from orthodoxy are sometimes dismissed as
"uninformed", yet linguistics is a field where science, history, and human
insight converge. Progress often comes from those bold enough to challenge
prevailing narratives. Fueled by enduring fascination, the author
has spent decades immersed in self-directed study of Vietnamese and
Chinese historical linguistics. His efforts have culminated in the
painstaking construction of an online dictionary of Nôm words of Chinese
origin, an annotated repository of Sinitic-Vietnamese etyma built one
entry at a time.
Over the past thirty years, the author's exposure to Chinese has deepened
through scholarly engagement and personal life. His mastery of modern
Mandarin ('Putonghua') has been shaped by daily conversations with his
Chinese-native wife, extensive reading of Chinese literature, and regular
consumption of Chinese media, from satellite broadcasts to contemporary
dramas on YouTube. This sustained immersion has sharpened his insights
into the etymological ties between Chinese and Vietnamese. What
captivates him most is the striking proximity, beyond mere lexical
overlap, between modern Mandarin expressions and their Vietnamese
counterparts in everyday usage. These parallels, observed in sitcoms and
colloquial speech, reinforce his conviction that the linguistic bond
between the two languages runs deeper than traditional Chinese linguistics
often acknowledges. This affinity also surfaces in classical Chinese
novels dating back to the 12th century, suggesting a long-standing
intertextual and intercultural dialogue.
The author believes that any Vietnamese scholar fluent in modern Mandarin
and equipped with an etymological lens would likely recognize the validity
of this perspective. Yet he cautions against reducing Vietnamese to a mere
Yue-descendant variant, akin to Cantonese, Fukienese, Zhuang, or Daic.
These languages, shaped by centuries of Chinese rule, have undergone
extensive Sinicization, often to the point of being subsumed within the
Sino-Tibetan classification, especially the modern Kadai-Daic languages.
Vietnamese, however, resists such simplification. Its historical
trajectory and linguistic architecture demand a more nuanced and
independent recognition.
Admittedly, this endeavor is not without its moments of monotony. The
author often finds himself wondering why he has committed so deeply to
this pursuit; what is he doing? There is no material reward awaiting
him. Who, after all, truly cares whether a Chinese etymon is of Yue
origin, or vice versa? Regardless of the outcome, Vietnamese will likely
continue to be classified under either the Austroasiatic or Sino-Tibetan
linguistic family. Yet, as long as the author retains the energy and
passion to press forward, he says: let us continue this journey together,
until the day he can no longer do so.
Like a pilgrim in search of sacred revelations, the author approaches
this etymological journey with wonder and resolve. Each discovery, whether
a breakthrough or a setback, deepens his understanding of the Vietnamese
linguistic landscape. Years of exploring Chinese historical linguistics
have fueled his curiosity about China's linguistic past. This experience,
akin to the fascination English learners feel when delving into Greek,
Latin, and Romance languages, has broadened his grasp of Vietnamese
etymology while enriching his knowledge of Chinese itself.
In earlier stages of his research, the author accepted the prevailing
view among Vietnamese specialists that emphasized the Mon-Khmer
connection. That perspective, however, belongs to the past. With time,
experience, and sustained inquiry, he has cultivated a more nuanced
understanding. As he delves deeper, he observes that Vietnamese shares
more linguistic commonalities with Sinitic languages than many
Sino-Tibetan languages share with one another. These parallels extend
beyond basic
vocabulary and into the realm of expressions and structural features that
historical linguists use to establish genetic affiliations.
In fact, the Austroasiatic school focuses primarily on shared elements
between Vietnamese and Mon-Khmer languages. Yet it overlooks key findings
in Sino-Tibetan etymology, particularly the basic words that appear
consistently in both Chinese and Vietnamese over time, which Austroasiatic
theorists often attribute to Mon-Khmer origins. In light of the many
Vietnamese words that align etymologically with Sino-Tibetan languages,
the proposed Mon-Khmer connections lack essential linguistic features,
including dissyllabicity and tonality, hallmarks shared between Vietnamese
and Chinese. These traits overwhelmingly outweigh the evidence presented
by Mon-Khmer proponents within the Austroasiatic camp.
If Vietnamese linguistic characteristics were systematically tabulated
and compared in detail alongside those of Chinese historical linguistics,
it would become evident that Vietnamese is, in many respects, a modified
form of Chinese. This conviction has driven the author to sort through
these complexities and compile this work over more than 25 years. The
Sinitic-Vietnamese theory he proposes is not based solely on a comparative
list of over 400 fundamental cognates with Sino-Tibetan etymologies, which
will be elaborated in the chapter addressing Sino-Tibetan. It is also
supported by extensive evidence from anthropology, archaeology, and
historical records.
Due to the scarcity of historical documentation, Austroasiatic
specialists are often left to speculate strictly based on linguistic sound
change rules. Consequently, their focus has shifted toward comparative
analyses of Mon-Khmer basic words, many of which have gradually been
reclassified as belonging to other linguistic families. In the absence of
conclusive linguistic proof, they have sometimes redirected their
attention to neighboring languages, a tendency that has undermined the
validity of their arguments concerning Vietnamese-Chinese etymological
connections.
Consider, for example, the Vietnamese word "vịt" ('duck'). It lacks
cognates in Mon-Khmer languages. To address this, Austroasiatic scholars
have proposed a connection to the Thai word เป็ด /pĕd/, despite knowing
that Thai descends from the Daic languages, which in turn originate from
the Taic family, that is, the same lineage that gave rise to the Yue
languages, ancestors of the ancient Viet-Muong language!
If, however, we hypothesize an etymological link between "vịt" and
Chinese 鴨 yā (SV "áp"), historical records may offer supporting evidence.
Dong Zuobin (董作賓, 1933), in Discussing Tan (《譠》), p. 162, references a
location in the Tan State of the Shang
Dynasty (present-day Shandong Province) called 武原城 Wǔyuánchéng
(Vietnamese: Thành Vũnguyên). Locals referred to it as 鵝鴨城 Éyāchéng
(Vietnamese: Thành Nganvịt, literally 'Citadel of Ducks and Geese'),
likely due to phonetic resemblance in their dialect at the time.
This historical dimension of Sinitic etymology, exemplified by cases like
"vịt" and "ngan" (also "ngỗng", both of which can be used to reconstruct
words such as 源 yuán for "nguồn" and "dòng"), underscores the depth of
the Chinese-Vietnamese linguistic connection, an area where the
Austroasiatic Mon-Khmer hypothesis falls short.
Built upon the historical framework outlined above, and expanded in
subsequent chapters, this research offers a comprehensive account of how
the modern Vietnamese language evolved—both diachronically and
synchronically—and how it relates to other Sinitic segments within the
Sinitic sub-branch of the Sino-Tibetan linguistic family.
Another key contribution from the author is a clarification regarding the
use of the term 'Sinitic', used here as a practical convention to denote
elements associated with Chinese linguistic and cultural domains. As
previously noted, 'Chinese' is not an ethno-religious designation like
'Jewish', but rather a cultural construct, akin to 'American'. From a
linguistic standpoint, the concept of 'Chinese', as derived from 'China'
as a unified polity, only emerged following the establishment of the Qin
Dynasty, when a variant of proto-Tibetan was layered atop preexisting
Taic-Yue substrates. The language we now call Chinese was named after this
political consolidation, and thus carries a distinct historical trajectory
intertwined with the evolution of the Middle Kingdom.
The term "Sinitic", or 'Chinese' in broader usage, derives from the
unification of ancient states into the Qin Empire, an event comparable in
scope to the formation of the European Union in modern times. During this
period, vestiges of indigenous Taic-Yue linguistic elements permeated the
emerging imperial lexicon, whether acknowledged by Chinese linguists and
Sinologists or not. Southern Yue languages, including Cantonese,
Fukienese, and Wu, have since been institutionally classified within the
Sino-Tibetan linguistic family, often through official imperial decree.
Lexicographically, 'Chinese' has come to encompass all these dialects as
part of a unified linguistic identity (see Tang Lan, 1965, p. 184).
Had history unfolded differently—say, if the Chu state had triumphed over
both Qin and Han in their decisive campaigns—China might today be known as
"Chu". Historical records suggest that Chu was a Daic-Yue polity, likely
of Taic origin, ancestral to the Daic and Zhuang peoples. Its population
may have spoken a variant of an ancient 'Chu-nese' language, rather than
the form now conventionally labeled 'Chinese'. Likewise, had the NamViệt
Kingdom succeeded in overtaking the Han Empire, the dominant term might
have become 'Việt', or /Jyut6/, or 'Yue' as pronounced in Mandarin (see Lu
Shih-Peng, 1964; Bo Yang, 1983–93).
In essence, 'Chinese' is not a fixed ethno-racial identity, but a
cultural construct confined within the evolving boundaries of the Chinese
polity. Its name may shift with regimes, but its linguistic continuity
transcends nomenclature. These hypotheticals underscore the contingency of naming: terminological
conventions are shaped by historical victors and political consolidation,
while the deeper linguistic substrate often endures across dynastic
transitions. For instance, during the Manchu Qing dynasty (1644–1911), the
polity was officially designated as "Qing", yet its linguistic and cultural
core remained recognizably Chinese because the Manchurians were a part of
it.
Consider further the hypothetical in which Imperial Japan had won World
War II and rebranded the Middle Kingdom as "Đại Đông Á" (Great East Asia).
In such a scenario, the term 'Sinitic' might have been supplanted by an
entirely different designation, perhaps 'not-X'. This thought experiment
illustrates how linguistic nomenclature is often a product of convenience,
necessity, and power, rather than intrinsic linguistic reality. When
juxtaposed with Taic-Yue or Austroasiatic Mon-Khmer frameworks, such
naming conventions can obscure deeper continuities. Ultimately, the
linguistic essence transcends the labels imposed upon it.
Regarding the integrity of this survey, the author affirms that it is an
original and human-authored work, especially in light of its digital
format as an ongoing research project. AI serves only as a tool for final
proofreading and surface-level editing. Without the author's creative
intellect and scholarly commitment, this work would not, and could not,
exist.
It is worth acknowledging the skepticism some readers express toward
academic studies published exclusively online, often dismissing them as
“bogus” or likely AI-generated. While digital formats offer undeniable
advantages in accessibility and scalability, concerns about reliability
and longevity remain valid. Online works are subject to constant revision,
and their long-term availability is far from guaranteed. Over time,
websites may vanish from search indexes due to inactivity, or disappear
altogether when hosting services lapse or accounts go unpaid.
In this context, the author's decision to publish incrementally online
reflects both necessity and intent: to share findings in real time while
preserving the human voice behind the research. The enduring value of this
work lies not in its format, but in the originality of its insights and
the rigor of its methodology.
This document should be regarded as a prelude to the forthcoming printed
edition. In the realm of linguistic inquiry, no conclusion is ever truly
final, and this research is no exception, regardless of whether it
ultimately appears in bound form. The author maintains that readers are
generally less inclined to engage with an online publication in its
entirety, as they might with a physical volume acquired at considerable
expense.
In practice, the author approaches the reference works cited in the
bibliography with similar reverence, though, as previously noted, the
bibliography remains incomplete. Hundreds of titles, meticulously arranged
across his personal bookshelves, are consulted with care and deliberation,
forming the intellectual scaffolding upon which this study rests.
This paper adopts a nontraditional approach by not devoting an entire
section to exhaustively listing all sound change rules, natural or
conditioned, between Sinitic-Vietnamese and Chinese loanwords. Such
comprehensive treatments, as exemplified in Nguyễn Tài Cẩn's studies of
the Sino-Vietnamese sound system (1979, 2000, 2001), are often expected in
research of this scope. Instead, readers will encounter a synopsis of
phonological patterns illustrated through examples and concise commentary.
The emphasis is placed on irregular or distinctive sound correspondences,
such as the ¶ /y- ~ b-/ pattern: 由 "bởi" (because), 油 "béo" (greasy), 邮
"bưu" (post), 柚 "bưởi" (pomelo), 游 "bơi" (swim), all pronounced yóu in
Mandarin except 柚, which is yòu. Another example is 公母 (gōngmǔ), which
corresponds to
Vietnamese expressions such as "trốngmái" (male and female), "sốngmái"
(life-or-death struggle), or "vợchồng" (husband and wife).
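The /y- ~ b-/ correspondence cited above can be tabulated mechanically. The following Python sketch is illustrative only and is not the author's own procedure: the helper names (initial, tally_initials) are hypothetical, tone marks are omitted from the pinyin because only the initials matter here, and taking the first letter of a syllable is a crude stand-in for proper onset segmentation.

```python
from collections import Counter

# Examples copied from the text: (character, toneless pinyin, Vietnamese form, gloss).
EXAMPLES = [
    ("由", "you", "bởi", "because"),
    ("油", "you", "béo", "greasy"),
    ("邮", "you", "bưu", "post"),
    ("柚", "you", "bưởi", "pomelo"),
    ("游", "you", "bơi", "swim"),
]


def initial(syllable: str) -> str:
    """Return the first letter of a romanized syllable (hypothetical helper).

    This ignores digraph onsets such as "tr" or "zh"; none of the
    examples above uses one, so the simplification is harmless here.
    """
    return syllable[0].lower()


def tally_initials(examples):
    """Count (Mandarin initial, Vietnamese initial) pairs across the examples."""
    return Counter((initial(pinyin), initial(viet)) for _, pinyin, viet, _ in examples)


counts = tally_initials(EXAMPLES)
for (mandarin, vietnamese), n in counts.items():
    print(f"/{mandarin}- ~ {vietnamese}-/ attested in {n} pair(s)")
```

A tally of this kind only records how often a pairing recurs in the chosen examples; it says nothing about whether the pairing reflects inheritance, borrowing, or chance.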
Should this work later prove to have academic value, specialists in
specific fields, such as lexical data tabulation and categorization, can
undertake the task of establishing possible sound change patterns and
formulating their corresponding rules. This type of endeavor is
extraordinarily detailed, if not inherently complex, given that
frequency-dependent sound changes tend to occur in synchrony and are
often irregular, rather than uniformly systematic as observed in, for
example, Germanic languages. While such phenomena are not uncommon,
these irregularities are particularly pronounced in the
Sinitic-Vietnamese context.
Readers inclined to skim for illustrative examples may freely navigate
between sections or pursue areas of personal interest. In doing so, they
will encounter scattered yet thematically linked instances throughout the
text. However, to fellow scholars, the author offers a word of caution:
please avoid quoting passages out of context or drawing conclusions from
isolated errors or incomplete datasets. Such imperfections are inevitable
in a work still undergoing revision, and occasional typographic lapses may
persist. Premature judgments, such as those the author has previously
endured, often result in unwarranted criticism. One notable example
involved an exploratory link between 將 (jiāng) and Vietnamese "sẽ"
('will'), which was dismissed as "unreliable" and "bogus" by a linguistic
forum due to a misalignment between "nướctương" 醬油 (jiàngyóu, 'soy
sauce') and "xìdầu" 豉油 (chǐyóu, 'bean sauce'), an error stemming from
careless data handling. In such cases, readers may be tempted to infer
exceptional sound change rules, such as "jiāng" ~ "sẽ" via the
speculative pattern ¶ /j- ~ s-/, /-iang ~ -Ø/. Yet a single misstep does
not invalidate the broader inquiry into phonological
correspondences.
High-profile etyma requiring detailed treatment may unavoidably occupy
substantial space. Exceptional or anomalous cases often resist neat
categorization and highlight why enumerating sound change rules can become
unwieldy, sometimes warranting independent study. These irregularities,
which do not generalize across similar phonological environments, demand
careful deliberation. The goal is to equip readers to either interpret
such subtleties through conventional linguistic frameworks or explore
emergent patterns via unconventional heuristics. Ultimately, this endeavor
underscores the speculative nature of historical phonology and the
interpretive latitude inherent in linguistic reconstruction.
Rather than presenting exhaustive lists of mechanical sound change rules,
often overlooked or unread, we will prioritize engaging case studies and
targeted examples. These will illuminate the specific processes by which
conclusions regarding Sinitic-Vietnamese etyma have been reached. By
venturing beyond the well-trodden paths of frequently cited
correspondences, readers are invited to navigate the complexities of sound
change and cultivate the analytical tools necessary to extract and apply
linguistic rules independently.
While regularity governs most phonological transformations, this
research foregrounds examples involving Chinese lexemes in their
diverse forms and phonetic variants, many of which have permeated the
Vietnamese lexicon since antiquity. This linguistic infiltration spans
multiple historical phases, notably the millennium following 111 B.C.,
when the Annamese region was under Chinese rule until its liberation
in 939. During the Ming Dynasty's incursion in 1410, Mandarin briefly
reemerged as the official language, playing a prominent role in
diplomatic and administrative exchanges with the Chinese imperial
court.(囯)
Phonetically, there are instances where sound changes have given rise to
multiple Vietnamese variants of a single etymon. Comparatively, similar
cases can be observed in the Japanese Kan-on and Go-on readings of
individual Chinese words. Take 道 dào ('way') as an example. In Vietnamese, we
can identify several distinct "readings" that convey different concepts,
interestingly, most of which correspond to the range of meanings found in
the Chinese equivalents. For instance:
'đạo' (way, religion, sect, morals, skill, line),
'dạo' (time),
'đường' (road, line),
'nẻo' (path),
'nói' (speak),
'bảo' (tell),
'tưởng' (suppose), etc.
Each of these Vietnamese words may seem like a translated version of the
Chinese word, but this is not necessarily the case. Rather, each derived
Sinitic-Vietnamese form is a variant that is cognate with the same Chinese
etymon 道. This phenomenon would be easier to understand if the old
Chinese-based Nôm characters were still widely used in Vietnamese writing.
Unfortunately, that is no longer the case, and the connection is further
obscured because modern Putonghua syllables are shorter than their Middle
Chinese counterparts.
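The one-to-many relation between an etymon and its Vietnamese readings can be pictured as a simple lookup structure. The Python sketch below is a hypothetical illustration, not the author's dictionary format: the Etymon class and the forms_for_gloss helper are invented names, and the entries are a subset of the readings of 道 listed above, with glosses copied from that list.

```python
from dataclasses import dataclass, field


@dataclass
class Etymon:
    """One Chinese etymon and its proposed Vietnamese reflexes (illustrative)."""
    character: str
    pinyin: str
    # Maps each Vietnamese form to the glosses given for it in the text.
    reflexes: dict[str, list[str]] = field(default_factory=dict)


DAO = Etymon(
    character="道",
    pinyin="dào",
    reflexes={
        "đạo": ["way", "religion", "sect", "morals", "skill", "line"],
        "dạo": ["time"],
        "đường": ["road", "line"],
        "nói": ["speak"],
        "bảo": ["tell"],
        "tưởng": ["suppose"],
    },
)


def forms_for_gloss(etymon: Etymon, gloss: str) -> list[str]:
    """Return every Vietnamese form of the etymon that carries the given gloss."""
    return [form for form, glosses in etymon.reflexes.items() if gloss in glosses]


# Two of the listed readings share the gloss 'line'.
print(forms_for_gloss(DAO, "line"))
```

Keying the structure on the etymon rather than on the Vietnamese word mirrors the claim in the text that the forms are variants of one source rather than independent translations; whether each proposed reflex is in fact cognate remains a matter for the later chapters.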
The phonological change rules illustrated in this paper are neither
exhaustive nor intended as definitive references. As this research remains
a work in progress, it continues to undergo revision and refinement, with
plans for a first print edition to reach select university
campuses—ideally those with active communities of historical linguists.
The methodologies presented here are exploratory and suggestive rather
than conclusive, though their foundational principles remain consistent
unless explicitly revised.
Given the evolving nature of this study, the demonstrated approaches
should be understood as practical models—examples of how the author has
applied two innovative etymological frameworks to generate preliminary
results. Readers will observe the investigative process used to identify
Vietnamese words of Chinese origin (Sinitic-Vietnamese [VS]) and, in turn,
gain the tools to replicate this process with confidence and
clarity.
These newly developed methodologies have proven effective in
uncovering the etymology of Sinitic-Vietnamese words and in
formulating tentative sound change rules, tracking transformations
between forms, or identifying 'what changes into what.' For instance,
this approach underpins the analysis of 道 dào, as previously
discussed. Readers will have the opportunity to apply these techniques
in later chapters, particularly through the worksheets provided in
Chapter 13. They will also encounter a curated selection of
Sinitic-Vietnamese
etyma—a small but meaningful subset of the broader findings presented in
this research.
Caution is warranted when interpreting loanwords among the examples
presented. As a general rule, if a Vietnamese word closely resembles its
Chinese counterpart in both phonological form and semantic meaning, it is
likely a direct loan. Recognizing such cases is essential for
distinguishing inherited etyma from later borrowings and for maintaining
analytical precision throughout this study.
While the linguistic resemblance between Vietnamese and Chinese will be
addressed in greater detail in later chapters, it is worth noting here
that their structural and lexical affinities are significantly closer than
those observed between Chinese and many other Sino-Tibetan languages. The
term Sinitic-Vietnamese (VS), also referred to as HánNôm (漢喃),
encompassing both Hán and Nôm strata, is used to denote either Vietnamese
words of Chinese origin or cognates shared by both languages that descend
from common ancestral roots. Examples include "sông" 江 (jiāng, 'river'),
"ngà" 牙 (yá, 'tusk'), and "dừa" 椰 (yē, 'coconut').
Among their shared linguistic features, beyond morphonological and
semantic parallels, nearly every linguistic trait present in Chinese finds
an equivalent in Sinitic-Vietnamese. These features are so deeply embedded
in Vietnamese usage that they are often mistaken for indigenous Vietic
words or regarded as 'pure' Vietnamese. Some are considered
quasi-Sino-Vietnamese variants, especially those represented by Nôm
characters incorporating Chinese components.
For the Sinitic-Vietnamese etyma investigated here and identified as
having Chinese roots, such conclusions are based on holistic alignment
with Chinese linguistic attributes. These include phonetic and morphemic
structure, phonological and semantic traits, syntactic and lexical
parallels, tonal systems, CVC syllabic architecture, and grammatical
arrangements in sentence construction.
The closer a Vietnamese word resembles its Chinese counterpart, the more
likely it is to be a loanword. However, this research also examines
whether resemblance necessarily implies borrowing. For example, the
Vietnamese "tếu" 'funny' may be hypothesized as a loan from 笑 xiào (SV
"tiếu" 'laugh'), which is cognate with VS "cười". Alternatively, "tếu" may
be cognate with 逗 dòu /tow⁴/ 'tease', SV "đậu" /ɗɐw⁶/, where the voiced
/ɗ-/ reflects an older development and the unvoiced /t-/ a more recent
one. This word may have been reintroduced into Middle Vietnamese via
spoken Mandarin, likely during the Ming Dynasty. Readers may compare the
contemporary usage of 逗 in Chinese with its appearance in classical
literature such as Dream of the Red Chamber (紅樓夢 Hónglóumèng).
With the findings presented in the Sino-Tibetan chapter, including the
genetic affinity demonstrated through shared linguistic peculiarities and
cognates, it becomes increasingly plausible to reconsider Vietnamese as
part of the Sino-Tibetan linguistic family. Such a reclassification could
be achieved through the methodologies outlined in this research, which
adopt broader and innovative approaches. These can be applied alongside
existing tools from Chinese historical linguistics, offering insights into
Vietnamese etymology across disciplines such as anthropology, archaeology,
and history, particularly regarding the origins and biological composition
of the Vietnamese people and their state. The underlying premise is that
populations of shared racial ancestry tend to speak variant languages of
common origin.
Throughout this paper, each etymon is accompanied by its corresponding
Chinese character and pinyin (拼音) transcription to facilitate sound
identification. In many cases, the pinyin alone suffices and may be less
visually distracting than the character itself, especially when the
character is constructed with "giảtá" (假借) or 'loangraph', which
requires readers to decipher embedded phonetic codes. A loangraph
refers to a Chinese character borrowed solely for its phonetic value and
repurposed for a different concept. For example, the Vietnamese "lại" 來
(lái, 'come') may have originally been associated with "lúa" ('paddy,
millet, grain'). If loangraphs were transcribed only in pinyin, they might
resemble English homophones with divergent meanings such as 'yard',
'glass', 'page', and 'lie'.
Pinyin, the official romanization system of the People's Republic of
China for transcribing Mandarin (普通話 pǔtōnghuà, 'common speech'), has
gained widespread global adoption, including in Taiwan, which began
integrating it nearly three decades ago.
For accurate sound transcription, this study primarily employs the
International Phonetic Alphabet (IPA). IPA symbols are used to represent
dialectal and archaic pronunciations, as well as precise phonetic values,
enclosed in square brackets ["xxx"], in contrast to approximate phonemic
values indicated by slashes "/xxx/". This distinction helps clarify subtle
phonetic nuances in the cited lexicons.
These distinctions are especially relevant in cases involving diphthongs,
where comparative analysis depends on capturing fine phonemic variation.
To streamline typographic presentation, phonetic symbols may be rendered
in simplified forms such as [-ow-] and [-ejn], or alternatively as /-ou-/
and /-ein/, when the intended sound values are contextually clear and
unambiguous. This convention will be applied consistently across other
phonetic environments, with supplementary notes and examples provided
throughout the text to ensure clarity and continuity.
In many instances, IPA transcriptions offer a more precise reflection of
Vietnamese phonetic values than conventional pinyin, especially in
relation to Chinese character correspondences. For example:
Pinyin d aligns with [t]
Pinyin t corresponds to [tʰ] or /th/
Pinyin r maps to /j/
Pinyin gu and ku are phonetically realized as [ku] and [kʰu], not [gu] and [ku], respectively
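The pinyin-to-IPA correspondences listed above can be sketched as a small lookup table. This is only an illustration under stated assumptions: the table holds just the initials named in the list, the entries for g and k generalize the gu/ku example, and the function name initial_to_ipa is hypothetical. It rewrites the initial consonant only and leaves the rest of the syllable in pinyin spelling.

```python
# Illustrative table built only from the correspondences listed in the text.
# It is not a complete pinyin-to-IPA converter.
PINYIN_INITIAL_TO_IPA = {
    "d": "t",    # pinyin d aligns with [t]
    "t": "tʰ",   # pinyin t corresponds to [tʰ]
    "r": "j",    # the text's own convention for pinyin r
    "g": "k",    # assumption: generalized from gu -> [ku]
    "k": "kʰ",   # assumption: generalized from ku -> [kʰu]
}


def initial_to_ipa(syllable):
    """Rewrite only the initial consonant; the rime stays in pinyin spelling."""
    initial = syllable[:1]
    if initial in PINYIN_INITIAL_TO_IPA:
        return PINYIN_INITIAL_TO_IPA[initial] + syllable[1:]
    return syllable


if __name__ == "__main__":
    for s in ("gu", "ku", "dao", "ma"):
        print(s, "->", initial_to_ipa(s))
```

Initials that are not in the table, such as m- in "ma", pass through unchanged, which keeps the sketch from guessing at values the text does not state.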
This transcriptional approach parallels the methodology employed by
Pulleyblank (1984) in his reconstruction of Old Chinese (OC), where he
explored phonetic values ambiguously recorded in classical annals and
inscriptions.
To avoid typographic clutter and potential confusion with IPA diacritics,
tonal numerals (ranging from 1 to 9) will be appended to each phonetic
form. These numerals indicate tonal categories across various Chinese
dialects—such as Cantonese (Guangzhou), Fukienese (Hokkien, Fuzhou, Amoy),
Teochew (Chaozhou), and Hainanese—as well as other regional languages
including Daic, Thai, and Vietnamese. This system ensures both phonetic
precision and cross-linguistic comparability.
Tonal numeral symbols are conventionally used in the transcription of
Cantonese, Fukienese, and other Chinese dialects to indicate pitch
contours and tonal categories. In the case of Vietnamese, tones are
annotated following the traditional eight-tone framework—more precisely, a
system of four tonal categories bifurcated into upper and lower registers.
This structure is rooted in classical sources such as the Guǎngyùn 廣韻 and
is described in Jerry Norman’s Chinese (1988, p. 55) and in foundational
Vietnamese linguistic studies, notably Nguồn gốc và Quá trình Hình thành
Cách đọc Âm Hán-Việt (“The Origin and Transformational Process of the
Sino-Vietnamese Pronunciation”) by Nguyễn Tài Cẩn (1979, 2001).
The tonal categories are as follows:
Upper register:  1. ,   3. ʔ   5. ´   7. ´ (with final -p, -t, -c, -ch)
Lower register:  2. `   4. ~   6. .   8. . (with final -p, -t, -c, -ch)
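As a rough illustration of the eight-tone numbering, the sketch below derives a tone numeral from the diacritic of a Quốcngữ syllable. It rests on several assumptions that are not stated in the text: that category 1 is the unmarked tone, that the marks for categories 2 to 6 correspond to the grave, hook above, tilde, acute, and dot below, and that categories 7 and 8 are the acute and dot-below tones on syllables ending in -p, -t, -c, or -ch. The function name tone_number is hypothetical.

```python
import unicodedata

# Unicode combining marks that carry Vietnamese tone (not vowel quality).
GRAVE = "\u0300"
HOOK_ABOVE = "\u0309"
TILDE = "\u0303"
ACUTE = "\u0301"
DOT_BELOW = "\u0323"

STOP_FINALS = ("p", "t", "c", "ch")


def tone_number(syllable):
    """Return the tone numeral (1-8) for a single Quốcngữ syllable."""
    # NFD splits each letter into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    checked = decomposed.endswith(STOP_FINALS)
    if GRAVE in decomposed:
        return 2
    if HOOK_ABOVE in decomposed:
        return 3
    if TILDE in decomposed:
        return 4
    if ACUTE in decomposed:
        return 7 if checked else 5
    if DOT_BELOW in decomposed:
        return 8 if checked else 6
    return 1  # assumption: no tone mark means category 1


if __name__ == "__main__":
    for word in ("sông", "tàu", "phải", "ngã", "tiếu", "lợn", "bách", "việt"):
        print(word, tone_number(word))
```

Vowel-quality marks such as the circumflex, breve, and horn are deliberately ignored, since they do not encode tone.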
The use of tonal numerals will be limited and reserved for cases where
clarification is essential, particularly to prevent misinterpretation
across Chinese dialects. Tonal values assigned to the same numerical
markers often vary significantly between dialects. For example, Mandarin
tones (1, 2, 3, 4) differ markedly from those in Cantonese (1, 2, 3, 4),
as documented by Wang Li et al. (1953), and diverge further from
Vietnamese tonal conventions.
To maintain clarity in Vietnamese phonetic transcription, modern
diacritics will be the primary notation system, used alongside IPA
symbols, for instance, [à], [ả], [ã], etc., except where such usage risks
confusion with IPA phonetic values (e.g., nasalized /ã/). For precise
tonal interpretation, readers may consult Quốcngữ diacritics for
Vietnamese or Pinyin tone marks for Mandarin (e.g., ā, á, ǎ, à, a), both
of which offer distinct tonal representations despite superficial visual
overlap.
In select cases, tonal markings will be deliberately omitted. This
reflects the author's view that tonal values in many Sino-Vietnamese and
Sinitic-Vietnamese forms, like their Chinese dialectal counterparts, have
undergone extensive historical shifts. These tonal evolutions, often
cyclical and unpredictable, lack a universally reliable rule for
reconstruction. In some instances, tones may even revert to their original
contours as they existed at the time of lexical absorption into
Vietnamese. Such tonal fluidity is well attested in Chinese historical
phonology, alongside other systemic changes such as shifts in initial
consonants and syllabic finals (see Chao Yuen-Ren, Tone and Intonation in Chinese, 1933, pp. 119–134).
Phonemically, Vietnamese initial and medial consonants exhibit a range of
articulatory values that are not always transparently reflected in the
orthography. For instance, the following correspondences are commonly
observed:
b- → [ɓ]
đ- → [ɗ]
ch- → [ʨ]
kh- → [kʰ]
ph- → [pf]
r- → [ʐ]
th- → [tʰ]
tr- → [ʈ]
nh- → [ɲ], occasionally rendered as ɲ-, jn-, or nh- depending on
typographic or contextual constraints
Similarly, vowel clusters such as -uy and -iê are more accurately
transcribed in IPA as [wej] and [iə], rather than [wi] and [ie],
reflecting the true phonetic realization rather than the orthographic
approximation. Vietnamese spelling conventions often obscure these
distinctions, particularly in final consonant environments.
To ensure clarity and consistency, final consonants will be transcribed
using the following IPA representations:
-p → [p]
-t → [t]
-ch → [jt]
-c → [k]
-nh → [jŋ]
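The final-consonant conventions listed above can be applied mechanically. The following is a minimal sketch: the table contains only the five finals named in the list, and the helper name split_final is hypothetical. It separates an orthographic final from the rest of the syllable and returns the transcription this paper assigns to it.

```python
# Orthographic final -> transcription used in this paper (from the list above).
FINAL_TO_IPA = {
    "ch": "jt",
    "nh": "jŋ",
    "p": "p",
    "t": "t",
    "c": "k",
}


def split_final(syllable):
    """Return (remainder, transcribed final); the final is '' if none is listed."""
    # Try two-letter finals first so that "ch" is not mistaken for a bare "h".
    for final in sorted(FINAL_TO_IPA, key=len, reverse=True):
        if syllable.endswith(final):
            return syllable[: -len(final)], FINAL_TO_IPA[final]
    return syllable, ""


if __name__ == "__main__":
    for word in ("bách", "việt", "cục", "nhanh", "lúa"):
        print(word, split_final(word))
```

Finals outside the list, such as -ng or an open syllable, are returned untouched rather than guessed at.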
In cases involving labiovelar articulation, especially when preceded by a
rounded vowel (e.g., o-, ɔ-) or a glide medial (-w-), the following
transcriptions will be used:
[-kʷ] → -kw, -wk, or -kʷ
[-ŋʷ] → -wŋ, -ŋw, or -ŋʷ
The velar nasal /ng/ will be rendered as either [ŋ] or [ng], contingent
on its phonetic environment and the need for typographic clarity. These
conventions will be applied systematically throughout the text to maintain
phonological precision and editorial coherence.
Subsequent chapters elaborate on all preceding elements and extend each
example through polysyllabic grouping across Chinese, pinyin, and
Vietnamese. This includes:
Detailed correspondences with Middle Chinese finals and tonal categories
Chronologically layered borrowing trajectories
Diagnostic markers of Yue substratal influence
A comprehensive polysyllabic lexicon, indexed by Chinese characters, pinyin
forms, and Vietnamese equivalents
The overarching objective is to produce a synthesis that is ready for
publication—methodologically rigorous, typographically exact, and fully
transparent in its analytical claims.
CONCLUSION
The title What Makes Chinese So Vietnamese? distills the chapter’s central thesis: long before the emergence of a
unified Chinese identity, there was Yue, and its imprint remains indelibly
woven into the phonological and structural fabric of modern Vietnamese.
Sinitic-Vietnamese is not merely a collection of borrowed forms; it
constitutes a naturalized linguistic heritage, accessible through the lens
of polysyllabicity and comparative historical methodology.
Drawing from Sino-Tibetan research that reveals patterned cognates and
structural parallels between Vietnamese and Chinese, this work invites a
reconsideration of Vietnamese as a potential member of the Sino-Tibetan
family. Such a reclassification would not rely solely on traditional
comparative methods, but rather on the expanded, polysyllabic framework
developed herein, one that complements and extends the tools of Chinese
historical linguistics. The implications reach beyond linguistics,
offering new perspectives for anthropology, archaeology, and population
history, especially regarding the deep ancestral ties and sustained
cultural contact that often underlie linguistic convergence.
The practical value of Chinese linguistic studies is well established,
with global institutions dedicating significant resources to its
advancement. This momentum can, and should, be leveraged to benefit
Vietnam in three key areas.
First, Vietnamese etymological research stands to gain by integrating
Sinitic-Vietnamese studies into the broader cognitive and methodological
domain outlined in this work.
Second, the structural parallels between Chinese and Vietnamese,
particularly the shared polysyllabic tendencies, suggest a compelling
model for orthographic reform. Chinese, now formally recognized as
polysyllabic and reflected in its Pinyin system, offers a precedent for
modernizing Vietnam’s outdated monosyllabic script. Originally devised in
the 17th century by European missionaries for religious dissemination, the
current Vietnamese writing system remains largely unchanged, functioning
like a colonial-era locomotive dragging its digital-age passengers across
deteriorating tracks. A polysyllabic reform would not only enhance
cognitive accessibility but also align Vietnamese orthography with its
linguistic reality.
Finally, the findings presented here lay the groundwork for a new
generation of Vietnamese lexicography—one that includes etymological
explanations for each entry, a feature conspicuously absent from existing
dictionaries. Such a development would represent a transformative leap
forward in Vietnamese linguistic scholarship, anchoring future research in
a framework that is historically grounded, methodologically sound, and
intellectually expansive.
SYMBOLS AND CONVENTIONS
Finally, a few housekeeping details regarding terminologies, conventions,
and classifications will be addressed to ensure a consistent framework for
discussing Sinitic-Vietnamese subjects throughout this work.
In this paper, the author will employ conventions commonly used in the
field of historical linguistics, alongside their alternate usages and
some custom-made symbols of his own design. Readers are expected to be
familiar with standard linguistic symbols, the International Phonetic
Alphabet (IPA), and Vietnamese orthography (Quốcngữ) (Q).
At the conclusion of this research, you will find an extensive
bibliography listing references. For books in print, many of these may
still be accessible in the libraries of academic institutions across the
United States. While numerous related linguistic websites are valuable
resources, some may eventually become unavailable over time. Readers
seeking outdated URLs cited in this research may refer to the Internet
Archive (https://archive.org/), using the syntax
https://web.archive.org/web/http:..../ to display the full capture history
of a page.
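For readers who wish to look up many archived links at once, the Internet Archive lookup can be sketched in a few lines. This is a hedged illustration only: the function names are hypothetical, the sample address is a placeholder, and the "*/" form is the Wayback Machine's capture-calendar view rather than anything specified in this paper.

```python
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_latest(url):
    """Address that resolves to the most recent archived capture of url."""
    return WAYBACK_PREFIX + url


def wayback_history(url):
    """Address of the calendar view that lists every capture of url."""
    return WAYBACK_PREFIX + "*/" + url


if __name__ == "__main__":
    dead_link = "http://example.com/page"  # placeholder, not a cited source
    print(wayback_latest(dead_link))
    print(wayback_history(dead_link))
```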
Abbreviations and acronyms will be defined upon their first introduction.
To improve clarity, examples cited within paragraphs will be wrapped onto
separate lines and numbered or bulleted (•). Lengthier comments on
patterns of sound change and the evolution of Vietnamese words under
examination will be enclosed in square brackets, such as [xxx yyy zzz], to
provide detailed explanations supporting arguments related to the listed
etymologies. This is, after all, the central purpose of this
research. English translations of cited vocabulary will be included as needed
following each term, though these may not be exhaustive. In cases where
translation is irrelevant, it may be omitted.
The commonly used symbols include
">" indicating "evolves into" (diachronically)
"<" signifying "derived from" (diachronically)
"=>" meaning "giving rise to" (by a phonetic rule)
"~>" representing "giving rise to" (by analogy)
"<=" indicating "built with"
"~" denoting "alternating with," "correspondent to," or "cognate to"
(synchronically)
"$" used for literary forms, as opposed to vernacular or colloquial
usage
"#" signifying metathesis or reverse order ("iro")
"®" signifying contraction, clipping, sound dropping, or deletion
("rụng")
"§" used for comparison ("so sánh," cf., confer)
"¶" or |P signifying patterns of sound change
"%" representing possible alternatives
"&" indicating combination with
"/" indicating conditional change (e.g., x > y /_V#) or
alternation (x/y)
"\" representing alternative influence or equivalency with "/"
"[xxxx]" denoting exact phonetic value or providing explanatory
notes
"/xxxx/" marking approximate phonetic value
Nasalized vowels: /ã/, /ẽ/, /õ/, etc.
Vh indicating "Vietnamized," i.e., localized
Capital letters like "X-, -Y-, Z-" symbolizing consonant articulation
classes in phonetic transcriptions, e.g., /P-/ for labial sounds
"=" indicating equivalency or equivalence
"(?)" denoting unidentified or unknown sources
/-ʔ/ marking guttural endings or syllabic breaks in diphthongs or
triphthongs
/Ø-/ marking guttural initials such as ŋ- or ʔ-
"|" signifying syllabic division, homonymous elements, opposition, or
parallel forms (also used as || for separation)
"{x- ~ y-}" marking conditioned sound changes or interchanges
"*" and "**" indicating hypothetically reconstructed forms: * for
ancient (AD to Middle Age) sounds and ** for archaic Proto-forms
(B.C., potentially pre-historic).
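To show how the diachronic symbols read in practice, the sketch below takes a derivation written with "<" ("derived from") and restates it with ">" ("evolves into"), oldest stage first. It is a minimal illustration under stated assumptions: the function names are hypothetical, only the "<" symbol is handled, and the sample string is the 秦 derivation that appears in the endnotes.

```python
import re


def parse_chain(notation):
    """Split 'A < B < C' into its stages, ordered from oldest to newest."""
    # Require whitespace on both sides of "<" so that "<=" is left alone.
    stages = [part.strip() for part in re.split(r"\s+<\s+", notation)]
    return list(reversed(stages))


def to_forward(notation):
    """Rewrite a '<' chain as the equivalent '>' chain."""
    return " > ".join(parse_chain(notation))


if __name__ == "__main__":
    entry = "M 秦 Qín < MC tʂjin < OC *tʂin"
    print(parse_chain(entry))
    print(to_forward(entry))
```

Reading the chain in both directions makes the mirror relationship between "<" and ">" explicit without adding any claim about the forms themselves.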
Most images, maps, and illustrations are either original creations by the
author or sourced from publicly available resources, including Wikipedia.
These are licensed under the Creative Commons Attribution-ShareAlike 3.0
Unported License (CC-BY-SA) and the GNU Free Documentation License (GFDL).
x X x
ENDNOTES
(車)^ According to Starostin, in Middle Chinese 車 also reads /tʂa/, FQ 尺遮
(whence Mand. chē, Viet. xa), but this reading is rather recent (judging
from rhymes in Guangyun 廣韻, not earlier than Eastern Han) and must
have stemmed from some Old Chinese (OC) dialect. Vietnamese also has a
colloquial loan from the same source, "xe" /sɛ/. If the OC reconstruction
is indeed *kla, one could posit an early borrowing from OC, which would
account for Vietnamese "cộ". Interestingly, "cộ" also finds a cognate in
檋 jù (SV cục), along with its variants 輂, 輁, and 梮 jù, where the
former characters share the phonetic 車 chē, jū, jù [ MC cʰia, kɨə̆ < OC
*kʰlja, *kla ]; the latter forms are, as usual, late developments of the
word-formation module {ideographic radical + signific stem}.
(蕃)^ "An exonym for Tibet that appeared in Tang dynasty. Some scholars
argue the
second syllable, 蕃, was originally read with the -n coda in Middle
Chinese (i.e. pʉɐn or bʉɐn, the former of which regularly gives rise to
modern Mandarin fān). They argue that the modern Tǔbō reading is recent,
possibly originating from French sinologist Jean-Pierre Abel-Rémusat's
(1788-1832) argument that the second syllable should be pronounced this
way to match Old Tibetan བོད་ (bod, "Tibet") (Pelliot, 1915). Rhymes in
poetry from Tang and Yuan dynasties also suggest that the second syllable
蕃 was read with the -n coda during those times (Yao, 2014). "
(See 吐蕃 - Wiktionary)
(工)^ For example, the aboriginal form /krong/ is cognate with both VS sông
(river) and Chinese 江 jiāng (Cantonese: /kong1/). While the former is
accepted as presented, the latter can be substantiated through its
phonetic stem 工 gōng (SV công). This phonetic correlation requires no
further proof, as the variant pronunciation derived from 工 /kong/, which
contributes to the phonetic structure of 江 jiāng, reinforces the
legitimacy of the Yue pre-existing etymological root.
(百)^ "Bod" is simply another form of the name “Bak,” as in 百姓 (Baixing), 百越 (Bách Việt or Bai Yue), discussed by Lacouperie (ibid., see Chapter 9):
"Bak was an ethnic and nothing else. We may refer, as proof, to the similar name—rendered, however, by different symbols—which they gave to several of their early capitals: PUK, POK, PAK, all names known to us after ages, and whose similarity to Pak and Bak cannot be denied. In the region from which they had come, Bak was a well‑known ethnic name; for instance, Bakh in Bakhdhi (Bactra), Bagistan, Bagdada, etc., and it is explained as meaning ‘fortunate, flourishing."
This interpretation aligns with what the same author discusses in Chapter Six (Lacouperie, ibid., pp. 116‑119) concerning the ancestral Bak of the early Chinese, in contrast to the pre‑Chinese populations.
(M)^ Linguistic Considerations in Transliteration: In this paper, all
transliterations of historical names follow Mandarin pronunciations for
ease of reference, though their modern phonetic forms may not accurately
reflect how they were originally spoken. For instance, the contemporary
Mandarin forms below are followed by their Vietnamese readings:
Yue (越, 粵, 戉, 鉞) → Viet
NanYue (南越) → NamViệt
OuYue (歐越) → ÂuViệt
Annan (安南) → Annam
LuoYue (雒越) → LạcViệt
MinYue (閩越) → MânViệt
DongYue (東越) → ĐôngViệt
WuYue (吳越) → NgôViệt
Additionally, phonetic reconstructions vary, and not everyone agrees on
the ancient pronunciation. Some scholars propose /Viet8/, while others
favor /Jyet8/ or /Jyut6/. This uncertainty is reflected in modern
Vietnamese dialectal pronunciation, where Việt is articulated differently
in the southern sub-dialect, alternating between /v-/, /j-/, and /z-/.
(C)^ Hanzhong: In the Qin dynasty the area was governed as the
Hanzhong Commandery, whose seat was in current day Nanzheng County,
south of the Hanzhong urban area. In 207 BC, the Qin dynasty
collapsed. Liu Bang, who would later become the founding emperor of the
Han dynasty, was made lord of Hanzhong. He spent several years there
before raising an army to challenge his arch-rival, Xiang Yu, during the
Chu–Han Contention. In 202 BC, after the victory at Gaixia, Liu Bang
named his imperial dynasty after his native district, as was customary.
However, he chose Hanzhong rather than his birthplace Pei County
(present-day Xuzhou, Jiangsu Province). Thus, Hanzhong gave its name to
the Han dynasty. (Source: Wikipedia)
(X)^ Political Influence on Linguistic Policy: The People's Republic of China's language policies under Xi
Jinping's administration (beginning in 2017) explicitly restricted local
TV programs from broadcasting in regional dialects, mandating exclusive
use of Northern Putonghua. This exemplifies political
intervention in linguistic development, a subject explored in greater
depth in forthcoming chapters.
(華)^ Yue Loanwords in Chinese: Examples of Yue-derived loanwords in Chinese include:
đường: 糖 táng (sugar)
dừa: 椰 yē (coconut)
trầu: 檳榔 bīngláng (betel nut, cf. Muong blau)
sông: 江 jiāng (river, cf. Muong krong)
chó: 狗 gǒu (dog, cf. Proto-Vietic klo).
(H)^ Persistence of 'Annamese' in Hainanese Speech: The term 'Annamese' (安南話) remains in use within
Hainanese speech, pronounced as /A1nam2we1/. Hainanese is a MinNan
sub-dialect, part of the Fukienese (Hokkien, Amoy) linguistic group,
spoken by inhabitants of Hainan Province, China.
(W)^ As an aside, since the place name Tràngan appears multiple times in
linguistic discussions throughout this paper, it is worth noting its
historical and cultural significance. Located in present-day Ninhbình
Province in northern Vietnam, where the first king of the Lê Dynasty
established his initial capital, Tràngan is renowned for its breathtaking
waterways, framed by towering limestone karsts.
(秦)^ (1) Tần, (2) Chệt, (3) Tầu, (4) Tàu 秦 Qín (Tần) [ M 秦 Qín < Middle Chinese tʂjin < OC *tʂin | Chinese dialects: Cant. ceon4, Hẹ cin2, Tn ćhiẽ 12, Ta ćiẽ 12, Dc ćhĩ 12, Nx chin12 ]
Kangxi Dictionary: Entry for Qin (秦):
Ancient Forms and Pronunciation: In Tang Yun, Guang Yun, Ji Yun, Lei Pian, Yun Hui, and Zheng Yun: Pronounced qín, with fanqie reading 匠隣切 (jiang lin qie) or 慈隣切 (ci lin qie). Sound: qín (螓).
Definition: A country name. According to Shuowen Jiezi, Qin was the territory bestowed upon the descendants of Bo Yi. It is fertile land suitable for grain cultivation. In Book of Songs·Qin Feng·Che Lin Commentary, Qin refers to a valley name in Longxi, located northeast of Bird Rat Mountain in Yongzhou.
Annotations: Today, it is Qin Pavilion and Qin Valley.
Historical Context: During the Spring and Autumn Period, the State of Qin existed. The Han Dynasty established Tianshui Commandery there, which was later renamed Qinzhou during the Northern Wei Dynasty. In Shiming, Qin means "crossing" (津), as its terrain is fertile and enriched with moisture.
Three Qin: In Records of the Grand Historian·Xiang Yu: Xiang Yu divided Guanzhong into three regions, granting the surrendered generals titles: Zhang Han as King of Yong (雍王), Sima Xin as King of Sai (塞王), Dong Yi as King of Zhai (翟王). Together, they were referred to as Three Qin.
DaQin (大秦): In Later Han·Records of the Western Regions: Da Qin refers to the region west of the sea (also called Sea West Country). Its inhabitants were tall and upright, resembling the people of China, hence the name Da Qin.
Notes: In phonology, the character 秦 Qín ("Tần") ends in an open nasal -n, making it difficult to transform into -w, a rounded, closed-lip sound. According to Shuowen Jiezi, the pronunciation of 秦 Qín (originally referring to a type of grain) was akin to 舂 chōng (SV "thông", corresponding to VS "tàu"). Comparing phonological transformation patterns, this resembles the shift seen in 痛 tòng → "đau." Additionally, it was borrowed for the pronunciation of 牆 qiáng ("wall"), which corresponds to SV "thương" ~ VS "đau."
Before and after the Warring States period (Eastern Zhou), the term 秦 Qín was used across various regions in what is now China to refer to the State of Qin, whose ruler, the future Emperor Qin Shi Huang, took the throne in 246 B.C. and completed the conquest of the six other states in 221 B.C. In Vietnamese culture, the Double Fifth Festival (Tết Đoanngọ), celebrated on the 5th day of the 5th lunar month, was once a major folk tradition. One custom involved wrapping and throwing rice cakes into the river to prevent fish from consuming the remains of Qu Yuan (Khuất Nguyên), a loyal scholar of the State of Chu, who drowned himself rather than be captured by Qin forces.
Based on this historical context, there is an immediate association with resistance, and even contempt, when referring to the Qin State (Tần). Today, the region that was once the State of Chu is located in Hubei Province, which may have once been a part of or closely linked to the southern BáchViệt (Hundred Yue) territories. These included provinces such as Yunnan, Guangxi, Hunan, Guangdong, Fujian, Zhejiang, Jiangsu, and others over 2,000 years ago. This phonetic connection further supports the plausible link between Tần and Tàu, as in the Vietnamese word Tàuô, which aligns with the black-colored uniforms worn by Qin officials.
Some argue that "Tàu" derives from "tàughe" ('boats'), and that ngườiTàu ('Chinese people') refers to those arriving in Vietnam by boat or living aboard ships. However, this interpretation is merely speculative. The most reasonable linguistic link remains Tần = "Tàu". During that era, the people of the former Warring States, which were conquered by Qin, deeply resented Tần ("Tàu").
Another relevant linguistic observation involves Cantonese speakers in Vietnam, who often refer to themselves as Thòngdành (唐人 Tángrén, "Tang people") or người Đường ("Tang people"). In phonological transformation, thòng in 唐人 Tángrén or Thòngdành could have evolved into tàu. However, it is worth noting that 唐 Táng = SV đàng, đường ends in an open final, yet follows a phonological pattern wherein /-ương/ shifts to /-au/. In ancient usage, 唐 Táng carried meanings such as "great road" or "main path" (đường cái, đàng cái).
Despite this possibility, the explanation that Tần = Tàu is stronger. Unlike their animosity toward Qin, the Vietnamese did not harbor the same resentment toward Cantonese speakers. While Cantonese people are commonly referred to as ngườiTàu in Vietnam, collective consciousness suggests that Vietnamese speakers may have internally recognized that Cantonese belonged to a different branch of the BáchViệt people, one that had been completely Sinicized (Hánhoá). This is reflected in historical figures such as Triệu Đà, who declared himself King of NamViệt (NanYue), with his capital at Phiênngung, now modern-day Guangzhou.
Additionally, in phonological analysis, 中 Zhōng ("Trung") could have also evolved into "Tàu" due to phonetic shifts: /ʈ-/ → /t-/, and /-ŋʷ/ → -w/. This follows similar phonetic transformations observed in 痛 tòng (SV thống) → "đau". Hence, "Trung" could plausibly have shifted into "Tàu".
Examples:
秦晋之緣Qín-Jìnzhīyuán ("KếtduyênTần-Tấn", 'Alliance between Qin and Jin')
秦人 Qínrén ("Người Tàu", 'Chinese people')
三秦 Sān Qín ("Ba Tàu", 'Three Qin regions')
秦越 Qín-Yuè ("Tàu-Việt", 'Qin-Yue').
The terms China and Chinese trace their origins to the Qin Dynasty (221–206 B.C.; Qin Shi Huang himself reigned 246–210 B.C.). Qin also appears as a family surname, a tribal name, and a designation for regions in ancient China (including Shaanxi Province). In Vietnamese, the term Chệt or Chệc carries a derogatory tone, though it is believed to derive from 潮 cháo (Teochew 潮州 Cháozhōu). The phonetic progression 潮 cháo → Triều → Tiều could have eventually resulted in Tàu.
三秦 Sānqín (1) TamTần, (2) BaTàu [ @ M 三秦 Sānqín \ Vh @ 三 sān ~ ba (cf. 仨 sa), @ 秦 Qín ~ 'Tàu' | M 三 sān, sàn, sā, sēn < Middle Chinese sɑm, sʌm < OC *sjə:m, *sjə:ms | FQ 蘇甘, 蘇暫 || M 秦 Qín < MC tʂjin < OC *tʂin (See 'Tàu') || Handian: ◎ Three Qin (三秦 Sānqín) refers to the Guanzhong region. After Xiang Yu defeated Qin and entered Guanzhong, he divided the territory among the surrendered Qin generals Zhang Han, Sima Xin, and Dong Yi, thus calling the Guanzhong area Three Qin. ◎ "The city towers support Three Qin, smoke watches over Five Crossings" , Tang Dynasty, Wang Bo's "To Du Shaofu upon His Appointment to Shu Prefecture."
(1) After the fall of Qin, Xiang Yu divided Guanzhong into three regions, appointing the surrendered Qin generals:
Zhang Han as King of Yong (雍王)
Sima Xin as King of Sai (塞王)
Dong Yi as King of Zhai (翟王).
Together, they were known as Three Qin. See Records of the Grand Historian: Qin Shi Huang Chronicle (《史記·秦始皇本紀》). Later, Three Qin came to refer to the region now known as Shaanxi Province. Wang Bo's poem "To Du Shaofu upon His Appointment to Shu Prefecture" describes it: "The city towers support Three Qin, wind and smoke overlook Five Crossings."
Feng Bi's poem "Map of Rivers and Mountains" further mentions: "The terrain extends west to control the distant Three Qin, the river flows south to encompass Two Hua."
(2)Three Qin also refers collectively to: Qinzhou (秦州), Eastern Qinzhou (東秦州), Southern Qinzhou (南秦州). In The Book of Wei (《魏書·尒朱天光傳》): "From Three Qin, the He River, Wei River, Gua Prefecture, Liang Prefecture, and Shanshan, all came to submit." This text is also cited in Comprehensive Mirror in Aid of Governance (《資治通鑑·梁武帝中大通二年》), with historian Hu Sanxing annotating: "Three Qin refers to Qinzhou, Eastern Qinzhou, and Southern Qinzhou."
Note: San Qin refers to the central Shaanxi plain (Guanzhong); the Vietnamese "BaTàu" is a derogatory term for the Chinese.
(字)^ The Chữ Nôm script renders 𡨸喃, which is also written as 字喃.
(差)^ Let's examine another case, one that is arguably more "Vietnamese" than "Chinese," though still constructed using Chinese linguistic material. Consider phải and trái, which are distinct from their Chinese equivalents but conceptually align with the notions of "right and wrong" versus "left and right."
In Vietnamese, trái denotes both "wrong" and "left." The former meaning may be linked to sai trái 差錯 chācuò (SV saitô, "wrong"), where 錯 cuò is associated with 差 chā ("sai" in Vietnamese). Phonologically, VS trái appears connected to 左 zuǒ (SV tả, "left"). Meanwhile, the concept of phải functions similarly to the English word "right" in both the directional and moral senses, as seen in phảichăng 平等 píngděng (SV bìnhđẳng, "equal, righteous"). This association extends to the phrase phảitrái 是非 shìfēi (SV thịphi), meaning "right and wrong."
Notably, Vietnamese phải ('right' in the sense of correctness) does not derive from the Chinese word 右 yòu (SV hữu, 'right side'), as seen in "tảhữu" (左右 zuǒyòu, 'left and right'). However, phonological parallels suggest an underlying relationship between 右 yòu and phải within the {¶ /y- ~ B-/} transformation pattern. This pattern appears in pairs such as:
郵 yóu (SV bưu, "post")
由 yóu (VS bởi, "because")
柚 yóu (VS bưởi, "grapefruit")
游 yóu (VS bơi, "swim").
Such correspondences imply that 右 yòu may have historically shared phonetic characteristics with VS phải. It is plausible that phải once sounded closer to /bɨw/ in prehistoric times.
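The correspondence listed above can be tallied mechanically. The following is only a minimal sketch, assuming that the first romanized letter of each form is an acceptable stand-in for its initial consonant; the names PAIRS, first_letter, and tally_initials are hypothetical and introduced here purely for illustration, and the data are the four pairs quoted above.

```python
from collections import Counter

# Pairs copied from the list above: (character, Mandarin pinyin, Vietnamese form).
PAIRS = [
    ("郵", "yóu", "bưu"),
    ("由", "yóu", "bởi"),
    ("柚", "yóu", "bưởi"),
    ("游", "yóu", "bơi"),
]


def first_letter(syllable: str) -> str:
    """Return the first romanized letter, a crude stand-in for the initial (assumption)."""
    return syllable[0].lower()


def tally_initials(pairs):
    """Count how often each (Mandarin initial, Vietnamese initial) pairing occurs."""
    return Counter(
        (first_letter(pinyin), first_letter(viet)) for _char, pinyin, viet in pairs
    )


counts = tally_initials(PAIRS)
for (mandarin, vietnamese), n in counts.items():
    print(f"/{mandarin}-/ ~ /{vietnamese}-/: {n} pair(s)")
```

A tally of this kind only summarizes the spelling of the listed forms. It does not test the proposal itself: phải begins with ph-, so it would fall outside a letter-based count unless an earlier b-like initial is assumed, as the note suggests.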
The broader takeaway here is that many modern Vietnamese words have been coined using Chinese linguistic material. The pair phải and trái (是非 shìfēi) reflects a pattern of antonymous disyllabic word formation in Vietnamese, paralleling structures found in:
cao thấp 高低 gāodī ("height"),
to nhỏ 大小 dàxiǎo ("size"),
nặng nhẹ 輕重 qīngzhòng ("weight").
(安)^ Linguistic Parallels in Former Colonies: Similar to the role of English as a global lingua franca in former British colonies, early Mandarin may have functioned in a comparable capacity in Annam prior to 939 AD. Even today, Hanoi residents continue to associate refinement and elegance with Tràngan people, referring to themselves with a sense of cultural prestige. This metaphor mirrors the early 20th-century sentiment of "Saïgon est Paris de l’Orient," despite the fact that the French only arrived in Saigon in 1868 and their colonial presence in Vietnam lasted until 1954.
(Y)^ Pig Terminology in Vietnamese and Its Yue Origins: For "pig," northern Vietnamese speakers use lợn (豚 tún, SV độn), whereas in the south it is called heo (亥 hài, SV hợi). The latter is an archaic, authentic Yue term found in both Vietnamese and Chinese zodiac systems, where 亥年 Hàinián (VS NămHợi or NămHeo) corresponds to the "Year of the Boar." Meanwhile, lợn 豚 tún (SV độn), appearing in the Kangxi Dictionary, is more accurately a doublet of 豘 tún, which carries the same meaning.
The key point to emphasize is that Yue linguistic elements predate Chinese ones, as 亥 hài was likely transcribed from an ancient Yue term for heo, both etymologically and culturally (see APPENDIX D, E, F, G).
(V)^ NanYue (Chinese: 南越; pinyin: NánYuè; Cantonese Yale: Nàahm-yuht; Vietnamese: NamViệt) was an ancient kingdom encompassing parts of present-day Guangdong, Guangxi, and Yunnan in China, as well as northern Vietnam. Today, visitors can explore the magnificent ruins of mausoleums once built by the kings of NanYue, located in Guangzhou City, Guangdong Province, China.
(Z)^ Shared Folktales Between Zhuang and Vietnamese Cultures: The Zhuang folktale of the Magic Sword and the Vietnamese legend of Trọng Thuỷ and Mỵ Châu narrate strikingly similar stories, both detailing the historical transition of Âu Lạc (歐雒) into the Nam Việt Kingdom. (cf. Truyệncổ Dòng BáchViệt and https://vi.wikipedia.org/wiki/Mỵ_Châu.)
(未)^ Goat and Its Linguistic Associations: The Chinese character 未 wèi can be transliterated as both Sino-Vietnamese vị ("upcoming") and SV mùi, as seen in Năm ẤtMùi 乙未年 Yǐwèinián ("Year of the Goat"). In Sinitic-Vietnamese, dê (goat) is cognate with 羊 yáng (SV dương, VS dê), which aligns with Teochew /jẽ/, all denoting "goat." The zodiac name 羊年 Yángnián ("Year of the Goat") corresponds with Sinitic-Vietnamese NămDê.
An important elaboration here is that 未 wèi originated as a loanword from the ancient Yue linguistic family, whereas 羊 yáng is a pictograph depicting the head of a goat or sheep. Linguistically, 未 wèi and 羊 yáng may be considered doublets, connected both semantically and phonetically. This relationship is exemplified in 美 měi (SV mỹ, "beautiful"), where 羊 yáng above 火 huǒ ("fire") metaphorically conveys "beautiful taste" or "deliciousness." Furthermore, 美 měi and 未 wèi (cf. mùi) exhibit phonetic and semantic connections.
It is plausible that an early form of "dê" entered the Chinese language in dual forms for zodiac classification, possibly sounding similar to 未 (wèi) centuries before being reintroduced to the Yue populace of the NamViệt Kingdom or Annam.
(S)^ A classic example of a Sinitic-Vietnamese word is 江 jiāng (VS sông, ‘river’), which was an ancient loan from the Yue form /krong/. Similarly, 目 mù and VS mắt ('eye') may have originated from a shared ancestral root, likely tracing back to a pre-Taic linguistic stratum in the distant prehistoric past. Other notable examples include the following: 子鼠 Zǐshǔ ("Týchuột", 'Tý rat'), 丑牛 Chǒuniú (SửuTrâu 'Sửu buffalo'), 寅虎 Yínhǔ (Dầncọp 'Dần tiger'), 卯貓 Mǎomāo (Mãomẹo 'Mão cat') [NOT => 卯兔 Mǎotù? ("Mão thỏ"? 'Mão rabbit')], 辰龍 Chénlóng (Thìnrồng 'Thìn dragon'), 巳蛇 Sìshé (Tỵrắn 'Tỵ snake'), 午馬 Wǔmǎ ("nămNgọ", 'Ngọ horse'), 未羊 Wèiyáng (Mùidê 'Mùi goat'), 申猴 Shēnhóu (Thânkhỉ 'Thân monkey'), 酉雞 Yǒujī ("Dậugà", 'Dậu chicken'), 戌狗 Xūgǒu ("Tuấtchó", 'Tuất dog'), 亥猪 Hàizhū ("Hợitrư", 'Hợi pig').
(A)^ Western theories often overlook historical Yue linguistic and cultural facts, favoring new constructs over existing knowledge. Many Western scholars have hesitated to engage deeply with older historical sources, particularly those requiring proficiency in Chinese, leading them to invent frameworks from scratch rather than building on established research.
(T)^ See APPENDIX L: Bùi Khánh-Thế, Ứng xử Ngôn ngữ của Người Việt đối với các Yếu tố gốc Hán.
(文)^ Historical Linguistic Transformation: The evolution of Han-Viet lexicons, as reflected in both daily speech and literature, is further corroborated by historical events that unfolded in ancient Annam following the collapse of the Tang Empire (906–939 AD). For an in-depth analysis of the transformation from Middle Chinese to Sino-Vietnamese, see Nguyễn Tài Cẩn's Nguồn gốc và Quá trình Hình thành Cách đọc Âm Hán-Việt (1979).
(普)^ A few key points before proceeding: For general readers, here are some introductory guidelines before delving further into this work.
Time commitment: This research is intended for publication in print format and is not suited for cursory browsing on the internet. Be prepared to invest ample time in engaging with its content.
Conceptual framework: If the introductory chapter feels dense or difficult to grasp, do not be discouraged. If you are eager to learn, consider a simplified perspective: treat Austroasiatic as a linguistic branch stemming from pre-Yue Taic languages, and build your understanding from that premise. Alternatively, you may begin with the assumption that Yue, distinct from both Sinitic and Austroasiatic, serves as the foundation for this discussion. This approach clarifies why Austroasiatic classifications tend to be retroactive, tracing a circuitous route from south to north.
Navigating Austroasiatic research: Do not let the overwhelming amount of Austroasiatic information online intimidate you. Much of it reiterates the same interpretations drawn from similar sources. Scholars in the Sino-Tibetan linguistic circle (focused on Yue studies) understand the limitations of such analyses. The author assumes that if you have read this far, you align with the Sino-Tibetan perspective; otherwise, you likely would not have had the patience to engage with these discussions, let alone with the equivalent of hundreds of printed pages ahead. To maintain clarity, avoid reactive engagement with Austroasiatic arguments, as they often lead to distractions rather than progress.
Linguistic insights for different audiences:
For language learners: Much like the thrill of tasting "phở" for the first time, learners may be intrigued to learn that "phở" is etymologically cognate with 粉 fěn (SV "phấn", meaning 'noodle'). This root has branched into several Vietnamese words, including "phấn" 'chalk', "bún" 'noodle', "bột" 'flour', and "bụi" 'dust', all tracing back to the same semantic origin. (See Han-Viet.com)
For linguists: Experts may immediately recognize the plausibility of cognates such as:
However, for general readers, digesting these etymological connections requires time and effort. Explanatory elaborations may help, but some assumptions should be accepted as foundational premises without excessive scrutiny, such as the correspondence between 打 dǎ and "đánh". Further phonetic details, like its association with 丁 dīng (SV "đinh", 'young man'), would only add complexity. 丁 dīng also gave rise to words like 釘 dīng (SV "đinh", 'nail') and 打包 dǎbāo, which corresponds to Vietnamese "đóngbao" 'to package'. Readers may, of course, question whether "trai" 'young man' originated from 丁 dīng, but such inquiries extend beyond the immediate scope of this work.
(囯)^ Austroasiatic Interpretations of Sino-Vietnamese Usage: Although unproven, this perspective is noteworthy as it provides Austroasiatic scholars with a rationale for the widespread use of Sino-Vietnamese words in daily Vietnamese speech. Their argument suggests that these words were adopted into common usage through linguistic evolution rather than being inherently native expressions belonging to speakers of the same language.
(音)^ Phonological Insights for Chinese Philologists: Chinese philologists may find value in examining subtle articulation discrepancies in Vietnamese, which could offer solutions to complexities such as chongniu (重紐, rime doublets) and phonemic division patterns (I, II 等, first and second class distinctions) in Middle Chinese historical phonology.
(P)^ The Singular 'They': Regarding pronoun usage, the author acknowledges that the singular they is increasingly recognized as a practical alternative to "she," "he," or "s/he" in various contexts. The Washington Post formally adopted this usage in its stylebook in December 2015, and the U.S. Examiner followed suit on September 22, 2016. Furthermore, they was named Word of the Year by the American Dialect Society in 2015.
(Q)^ For guidance on approximate pronunciation in modern Vietnamese, refer to the Vietnamese-English Dictionary by Nguyễn Đình-Hoà (1966) or Nguyễn Văn Khôn (1967).