Executive Summary
This chapter challenges the prevailing Austroasiatic Mon-Khmer classification of Vietnamese, arguing that linguistic bias, often driven by political and nationalist motivations, has influenced academic discourse to favor Austroasiatic theories.
I) The "Rainwash" Effect
- Serves as a metaphor for the dominance of Austroasiatic narratives in Vietnamese linguistic classification.
- Digital knowledge dissemination reinforces Mon-Khmer theories through search engine algorithms and governmental regulations favoring Austroasiatic classifications while suppressing alternative perspectives.
- Highlights the need to rectify linguistic misclassifications in Vietnamese etymology.
II) Dispelling Austroasiatic Mon-Khmer Misconceptions
- Critique of Austroasiatic Mon-Khmer assumptions that disregard Sinitic-Vietnamese etymologies.
- Evidence supporting Vietnamese as part of the Sino-Tibetan family through cognates of basic words.
- Examination of political biases influencing linguistic classification and limiting scholarly objectivity.
- Examination of flawed methodologies used in Austroasiatic Mon-Khmer classifications.
- Influence of Western biases and limited access to Chinese linguistic resources.
- Critique of modern online discourse, highlighting how unverified articles and social media engagement contribute to linguistic misconceptions.
- Growing preference for brief online summaries over rigorous academic research.
- The digital dominance of Austroasiatic advocates, making their theories more accessible to general audiences, particularly students and newcomers.
IV) Archaeological Evidence and Yue Origins
- Dongson-style bronze drums as material evidence linking Yue migrations to Southeast Asia.
- Artifacts confirming shared heritage between Vietnam and China South.
- Migrations from Dongtinghu Lake, China South, to Vietnam, shaping early linguistic formations.
- Influence of Han colonization on Annamese language evolution, integrating Sinitic structures.
- Yue cultural and linguistic traits embedded within Vietnamese, demonstrating historical continuity.
- Introduction of the Sinitic-Yue hypothesis, leading to two innovative methods for identifying hidden Sinitic-Vietnamese cognates.
- Findings revealing over 400 fundamental Vietnamese words aligning with Sino-Tibetan linguistic structures, supporting the reclassification of Vietnamese within the Sino-Tibetan framework.
- Challenges in Sino-Tibetan research, which remains largely dependent on print materials that are increasingly difficult to access.
- The author’s intellectual journey, initially accepting Austroasiatic Mon-Khmer classifications before shifting toward Sino-Tibetan perspectives and independently researching Chinese historical linguistics since the 1980s.
- Academic accessibility challenges, emphasizing the need for evidence-based linguistic inquiry rather than ideological or politically motivated classifications.
- Methods used to uncover hidden Sinitic-Vietnamese cognates.
- Comparative techniques linking Chinese etymologies and Vietnamese vocabularies.
Conclusion
Vietnamese should be reconsidered within the Sino-Tibetan linguistic family, acknowledging historical migrations from China that influenced its linguistic evolution. These findings challenge mainstream Austroasiatic classification and establish a foundation for Sinitic-Vietnamese etymology, setting the stage for further analyses in subsequent chapters.
x X x
This chapter explores several key issues concerning the linguistic classification of Vietnamese. It first challenges the misconception that Vietnamese descends from common Austroasiatic Mon-Khmer languages, arguing that historical and linguistic evidence points to an alternative lineage. The discussion also examines the influence of political motivations, revealing how nationalism has shaped linguistic classification to favor Austroasiatic Mon-Khmer frameworks. These biases underscore the need for a reevaluation grounded in historical and linguistic evidence.
Building upon previous concepts, the chapter introduces novel approaches to uncover hidden Sinitic-Vietnamese cognates, cognates that may have been overlooked even by specialists in Austroasiatic Mon-Khmer studies. Two innovative methods are presented, specifically designed to identify linguistic traits that Vietnamese shares exclusively with Chinese.
These methods have led to groundbreaking discoveries in Sino-Tibetan and Sinitic etymology, identifying over 400 foundational words within the Sinitic-Vietnamese lexicon. These findings provide compelling evidence to support a reclassification of Vietnamese within the Sino-Tibetan framework, revitalizing discussions on its linguistic origins.
I) The "Rainwash" Effect
The metaphor "rainwash", echoing the psychological resonance of "brainwash," evokes the cleansing force of torrential rainfall, washing away entrenched misconceptions much like it purges environmental pollutants. This image aptly captures the historical trajectory of the Sino-Tibetan classification of Vietnamese, which, despite facing staunch resistance in the early twentieth century, has undergone continuous refinement and reevaluation.
Meanwhile, Austroasiatic Mon-Khmer theorists have long sought to fill perceived gaps in the Sino-Tibetan framework, particularly the absence of cognates for foundational Vietnamese lexemes, a deficiency that has, in many cases, been addressed through deeper etymological mapping.
In the digital age, Sino-Tibetan linguists face mounting challenges in academic accessibility, while Mon-Khmer advocates increasingly leverage online platforms to promote counterarguments. Web-based learners of Vietnamese historical linguistics often encounter a flood of contradictory data, ranging from rigorous scholarship to speculative claims, allowing misinformation to circulate unchecked.
Faced with competing narratives, many readers disengage entirely, concluding that the classification of Vietnamese, whether Sinitic or Austroasiatic, is inconsequential. Reports suggest that both Chinese and Vietnamese governments actively regulate Wikipedia entries, enforcing politically correct boundaries even in linguistic domains.
Despite digital proliferation, much of the Sino-Tibetan etymological corpus remains inaccessible online. Scholars continue to rely on printed materials, yet these texts increasingly languish in obscurity, gathering dust on library shelves that are out of reach for most readers. This decline raises concerns about the long-term viability of print-based research, which, despite its depth, risks being eclipsed by algorithm-driven content.
Search engines further reinforce the Mon-Khmer narrative, shaping public perception through keyword bias. Queries involving "Austroasiatic", "Viet", or "Khmer" often yield results dominated by Austroasiatic perspectives, conditioning interpretations among digital audiences unfamiliar with broader linguistic frameworks. Even a handful of prominent search results, accurate or not, can disproportionately influence novice understanding.
Alternative perspectives increasingly face resistance in this algorithmic landscape. While information circulates through both print and digital channels, mainstream education has yet to incorporate the full scope of linguistic debates into formal curricula. The internet, while democratizing access, also accelerates the global spread of unverified claims. Over seventy percent of users reportedly trust online sources, many of which originate from anonymous or unverifiable outlets, further complicating the epistemological terrain. (Nielsen: Consumer Trust in Online, Social and Mobile Advertising Grows, 2025)
Moreover, fewer readers today dedicate time to consuming long dissertations online. Many web users skim materials for quick facts rather than reading thoroughly. This trend exacerbates the negative effects of brief online engagement, with newcomers’ cognitive development easily influenced by predetermined suppositions packaged in concise, affirmative, but often misleading, formats.
Enthusiastic debaters frequently engage with such socially engineered, abridged, and unverifiable articles, which circulate widely on the internet. Unfortunately, some even resort to personal attacks, further muddying the discourse.
Selective hearing is a fundamental aspect of human nature. People instinctively gravitate toward information that aligns with their preexisting beliefs, often shaped during early cognitive development when they first encounter a new domain of knowledge. Such initial impressions tend to take root deeply, influencing perception for years to come. As the saying goes, love strikes at first sight, metaphorically speaking.
Moreover, the author has observed recurring patterns of slander and misconduct in online debates, particularly those concerning Vietnamese origins. Over the past decade, responses to this preliminary research, shared across various internet forums, have consistently highlighted these trends. Notably, a significant portion of Vietnamese commenters appears hesitant to engage constructively in meaningful discussions.
Consequently, the author often finds himself addressing only a few readers, or, metaphorically, his own shadow. This dynamic presents a clear obstacle to the broader recognition and reception of his theory.
As noted in the introductory chapter, this research is best suited for print publication rather than online dissemination. Printed books engage readers more deeply, allowing them to immerse themselves fully in the material, whereas online posts often fail to sustain prolonged attention. Should future specialists encounter the newly uncovered Sinitic evidence presented here, they may build upon this foundation, moving beyond misleading Austroasiatic narratives. With this groundwork in place, their focus can remain anchored in reliable discoveries, protecting them from the Mon-Khmer fallacies prevalent in cyberspace.
The author is less concerned with persuading veteran specialists already entrenched in their positions. However, given the challenges posed by deeply held convictions, it remains essential to disseminate these findings widely. The goal is to inform and guide newcomers in the field of Sinitic-Vietnamese etymology, ensuring that groundbreaking research does not remain obscured but instead inspires new avenues of inquiry before reaching its final form.
Like many young scholars today, the author initially accepted the Austroasiatic Mon-Khmer framework during his college years in Vietnam in the late 1970s, fully embracing theories advanced by leading historical linguistics experts. Over time, however, he began to distance himself from the Austroasiatic hypothesis, despite its endorsement by respected mentors. Among them was Professor Nguyễn Tài Cẩn (1926–2011), one of Vietnam's most distinguished linguists internationally. The authoritative grasp these scholars held over the subject often overshadowed students' ability to challenge prevailing views, even when Sino-Tibetan perspectives were privately acknowledged. This dynamic left a lasting impression on the author's intellectual journey, ultimately prompting him to break away from the confines of campus orthodoxy.
Overcoming the conceptual divide between Austroasiatic Mon-Khmer and Sino-Tibetan classifications required considerable time. The author initially approached the latter with hesitation, only committing fully after further investigation. His early exposure to Chinese, acquired as a schoolboy, provided insight into linguistic patterns that had long puzzled him. His independent exploration of the Sino-Tibetan framework began in the early 1980s, a long, arduous, yet rewarding endeavor. By the 1990s, as he developed the Sinitic-Yue hypothesis, moments of unexpected breakthroughs brought exhilaration, each discovery enriching his understanding of Vietnamese etymology. For example, how many Vietnamese Sinologists could instantly recognize the Sinitic-Vietnamese etyma in words such as:
- 飯 (fàn) ~ bữa (buổi, "time of the day") {cf. Hainanese /bui2/}
- 重 (zhòng) ~ nặng ("heavy") {cf. Hainanese /dang2/}
- 寒 (hán) ~ cóng ("chilly") {cf. Hainanese /kua2/}
- 檨 (shé) ~ soài ("mango") {cf. Fukienese /soã/}
etc.
These linguistic correlations underscore the significance of reconsidering Vietnamese classification within the Sino-Tibetan framework, encouraging a departure from outdated Austroasiatic assumptions.
In the past, linguistic knowledge was acquired gradually, one book at a time. In contrast, today's students confront an overwhelming flood of online information, and misinformation, that complicates their ability to discern reliable content for specialized studies. As a result, many newcomers are "rainwashed" into accepting the Austroasiatic Mon-Khmer hypothesis, as search queries on Vietnamese historical linguistics consistently return results saturated with its imposed perspective on the language's origins.
Figuratively speaking, Austroasiatic Mon-Khmer narratives have eclipsed the Sino-Tibetan theory, much like neglected trees overshadowed by overgrown shrubs. When was the last time fresh growth emerged in Sino-Tibetan perspectives on Sinitic-Vietnamese etymology? While Austroasiatic research has continued to advance, progress within the Sino-Tibetan framework regarding Sinitic-Vietnamese etymology has remained limited. Since the late twentieth century, few new Sino-Vietnamese elements have surfaced, and longstanding linguistic gaps remain either overlooked or unresolved. As a result, the potential for further exploration appears increasingly distant.
Through the reconstructive mechanics of traditional historical phonology of ancient Chinese, conducted by renowned linguists, the author has assembled substantial evidence of Sino-Tibetan and Sinitic-Vietnamese cognates, which he is eager to share. He firmly believes that analyzing these discoveries in Sino-Tibetan etymology, unveiling a vast array of basic words cognate with Sinitic-Vietnamese etyma, can radically shift readers' perspectives. Interestingly, many of these etyma overlap with foundational Austroasiatic Mon-Khmer lexicons, accounting for approximately fifty percent and forming the theoretical basis of the Mon-Khmer sub-family. Contrary to Austroasiatic Mon-Khmer assertions, these findings strongly reinforce the Sino-Tibetan perspective.
The so-called "Austroasiatic Mon-Khmer rainwash", as discussed earlier, helps explain why contemporary Vietnamese scholars, unlike their South Vietnamese counterparts prior to 1975, such as Lê Ngọc Trụ, Nguyễn Đình Hoà, Nguyễn Hiến Lê, and Hồ Hữu Tường, have aligned themselves with proponents of the Austroasiatic Mon-Khmer hypothesis. In doing so, they have frequently downplayed or disregarded Chinese linguistic influences. Given the historical backdrop, including the limited Western understanding of Chinese prior to the nineteenth century (see Lundbæk, 1986), it is plausible that Austroasiatic pioneers of the previous century formulated their hypothesis through an Austroasiatic-centric lens. Their simplified approach required minimal engagement with the linguistic complexities of Chinese dialects and subdialects, further reinforcing their premise.
Over time, fallout from the "rainwash" under the Austroasiatic paradigm has solidified Western views on Vietnamese linguistic classification. The author of this research seeks to challenge the entrenched "business as usual" approach by reviving long-overlooked Sino-Tibetan perspectives. Specifically, Chapter 10 - Parallels with the Sino-Tibetan languages, which elaborates on Sino-Tibetan-Vietnamese cognates, draws upon Shafer’s long-recognized work on Sino-Tibetan etymology (1972) to bring the century-old theory back onto the Sino-Tibetan-Vietnamese linguistic radar.
By the early twenty-first century, historical linguistics had firmly placed Vietnamese within the Austroasiatic Mon-Khmer family. This classification was based on comparative analyses of basic vocabulary across Mon-Khmer languages, which revealed sporadic cognates with Vietnamese. Austroasiatic scholars routinely designated foundational Sinitic-Vietnamese elements as Sino-Tibetan lexicons or Chinese loanwords. In doing so, they avoided confronting the deeper linguistic peculiarities shared exclusively between Vietnamese and Chinese, features conspicuously absent in Mon-Khmer languages. Nevertheless, these theorists elevated the perceived importance of such foundational words, reinforcing their framework within linguistic analyses.
II) Dispelling Austroasiatic Mon-Khmer Misconceptions
By the time readers examine the Chinese etymologies of Sinitic-Vietnamese words presented in this research, they may have already formed opinions about whether Vietnamese belongs to the Sino-Tibetan or Austroasiatic Mon-Khmer linguistic family. Like all scholarly debates, this discussion presents two primary frameworks that intersect in ways still underexplored. When evaluating both classifications, theoretical gaps exist on each side, yet the absence of definitive answers does not invalidate either perspective.
Linguistically, the Austroasiatic Mon-Khmer hypothesis has introduced a modern narrative that contrasts sharply with traditional Vietnamese viewpoints, ones shaped by centuries of legends, folklore, and cultural continuity. Empirically driven Indo-European linguists, who prioritize scientific analysis, have often shown little regard for historical memory, let alone the spiritual values deeply embedded in Vietnamese cultural convictions.
For instance, Austroasiatic proponents frequently dismiss the interpretive chronology of the eighteen reigns attributed to Vietnam’s ancestral King Hùng I, II, III, and so forth. They view such accounts as implausible, particularly due to the significant chronological gaps spanning hundreds of years within the speculative timeline of over 4,896 years since 2879 B.C., the date traditionally believed by the Vietnamese to mark the birth of their nation. This timeline remains difficult to substantiate, as numerical inconsistencies challenge the credibility of these historical narratives. (H)
Over the years, a small but discerning group of Austroasiatic specialists has acknowledged the importance of Chinese-Vietnamese linguistic research carried out by independent scholars. These foundational studies have played a role in refining the Austroasiatic hypothesis, with notable contributions from Tsu-lin Mei (1976), Jerry Norman (1988), and Mark J. Alves (2001, 2007, 2009), among others. Yet it remains noteworthy that some Austroasiatic scholars still struggle to clearly distinguish between Sino-Vietnamese terms and deeper Sinitic-Vietnamese etyma when analyzing Vietnamese vocabulary. This is especially evident in the confusion between the Hán-Việt and Hán-Nôm layers, despite the fact that the Vietnamese lexicon is structured across three principal strata: Nôm (V), Hán-Nôm (VS), and Hán-Việt (SV). (A)
Sino-Vietnamese words, much like Latin words in English, are relatively easy to identify, yet this distinction remains muddled in scholarly citations. As a result, evidence demonstrating Sino-Tibetan origins for numerous cited Vietnamese vocabulary items has not been properly addressed when Austroasiatic researchers present their work on Mon-Khmer basic words. Instead, they continue to recycle citations of Mon-Khmer cognates dating back to David D. Thomas (1966) without introducing meaningful breakthroughs.
When the Austroasiatic hypothesis was first proposed, many of the Indo-European theorists behind it may not have been aware of the Yue people documented in ancient Chinese annals, or they simply did not care at all. Instead, they arbitrarily assigned linguistic classifications based on their own assumptions, even erroneously attributing these populations, metaphorically speaking, to the Southern Hemisphere, since the prefix 'Austro-' means 'southern'.
Until relatively recently, scholars continued to struggle with the etymological classification of certain lexical items, such as 戌 (xū), 狗 (gǒu), and 犬 (quǎn) for the Vietnamese word chó ("dog"), or 死 (sǐ), 折 (zhé), 逝 (shì), 陟 (zhì), 卆 (zú), 卒 (zú), 殂 (cú) for chết ("die"), uncertain whether these originated from Yue substrata or Chinese sources. Rather than being subjected to a comprehensive historical etymological analysis, many of these terms were indiscriminately assigned to the Austroasiatic Mon-Khmer subfamily and reconstructed as /kro/, bypassing their complex linguistic trajectories.
The Mon-Khmer theory, built upon Austroasiatic foundations, has proliferated across internet-based research platforms, particularly over the past three decades. This favorable digital environment has granted Austroasiatic Mon-Khmer theorists a distinct advantage, leading to widespread acceptance of their classification model while benefiting from methodological conveniences in the field of historical linguistics.
The prominence of Austroasiatic Mon-Khmer scholarship is partly sustained by the enduring perception that Western linguistic methodologies are inherently scientific and superior. This belief has reinforced the credibility of Austroasiatic classification models, especially through their association with prestigious Western academic institutions. As a result, these frameworks have gained considerable influence, attracting a broad intellectual following in Vietnam.
Many Vietnamese scholars, eager to engage with Western-trained linguists, have aligned themselves with these perspectives, further legitimizing Austroasiatic theories. However, this tendency has often inflated the perceived scholarly value of certain localized contributions, sometimes at the expense of methodological rigor.
Such imbalances have hindered efforts to reevaluate Vietnamese linguistic classification within the Sino-Tibetan framework. Over time, the dominance of Austroasiatic Mon-Khmer theory has become entrenched, solidifying its status as the prevailing model for Vietnamese origins with minimal critical scrutiny. This has placed Sino-Tibetan theorists at a disadvantage, limiting their ability to introduce new insights into Vietnamese etymology.
The discourse surrounding Vietnamese linguistic classification has nevertheless grown increasingly politicized, shaped by nationalist debates that amplify its cultural significance. As a result, discussions about the origins of Vietnamese are now deeply entangled with broader questions of national identity, a reality often obscured by nationalist or Sinocentric historiographies. That said, although the Austroasiatic Mon-Khmer hypothesis emerged relatively late, it has been warmly embraced and has gradually gained local support.
Vietnamese academia may eventually revisit its linguistic heritage, recognizing that its true origins could diverge from long-standing narratives. While the adoption of the Austroasiatic Mon-Khmer framework has provided a convenient interpretive lens, scholars have repeatedly sought ways to sidestep the Sino-Tibetan classification. Yet the issue now extends beyond the reach of any single researcher. Prevailing anti-Chinese sentiment has obstructed scholarly progress, fostering an anti-academic climate that continues to stall meaningful advancement for native Vietnamese linguists.
The Vietnamese are well aware that their country's history has been repeatedly rewritten by its rulers to align with shifting perspectives on Sino-Vietnamese relations. As a result, theories regarding the origin of the Vietnamese people have been reshaped to serve these changing narratives, often disregarding historical accuracy. In essence, history is written by the victors.
In contrast to the pursuit of academic truth, Vietnamese historical linguistics remains burdened by political considerations, much like the broader historical narrative of the country, which has often been shaped by the victors. This dynamic echoes earlier patterns, including resistance movements against imperialist China. Yet, paradoxically, the core of Vietnamese culture remains deeply Sinicized in so many respects.
From a philosophical standpoint, heightened national identity awareness among educated Vietnamese has complicated the reception of the Sino-Tibetan, Sinocentric narrative, even as it continues to trail behind China’s historical and academic dominance.
Strong nationalism in Vietnam cannot be resolved simply by rejecting historical Chinese influence and replacing it with alternative narratives. Truth, as a persistent force beneath ideological constructs, inevitably resurfaces. In time, the revival of historical convictions will compel nationalist scholars to reassess their positions, separating academic inquiry from political bias and restoring the impartiality that once defined the discipline.
Disentangling politics from national identity remains a formidable challenge. The reassessment of historical narratives and their broader implications continues to resonate, even in recent years in the United States. Prior to 2016, before the presidency of Donald Trump, the proliferation of Confucius Institutes, Chinese cultural centers funded by the Chinese government and widely viewed as instruments of geopolitical influence, was evident. These institutes expanded rapidly, shaping U.S. academic institutions through donations and extending China’s global reach. As a result, growing interest in Chinese language studies among students outside China has gradually shifted the balance toward Chinese proficiency, eclipsing the linguistic relevance of Austroasiatic specialists.
In the first two decades of the new millennium, influenced by Western academic trends, Vietnamese theorists began to show increasing openness to the Sino-Tibetan framework (see Bùi Khánh Thế, Appendix L). However, the axiom that individuals tend to reinforce their preexisting beliefs remains steadfast. Eventually, the prevailing linguistic perspective will shift, regardless of any resistance. This anticipated shift will inevitably reshape the Sino-Tibetan framework explored in this study. As with other humanities, historical linguistics evolves in tandem with broader socio-political transformations.
Readers may wonder how academic inquiry becomes entangled with nationalism. The answer lies in the sublimation of national identity into ideological frameworks that shape scholarly interpretation. In historical linguistics, the Yue foundation of Vietnamese core vocabulary has often been deliberately reinterpreted through Austroasiatic Mon-Khmer terminology, likely as a means of distancing Vietnamese origins from Sino-Tibetan or Sinocentric associations traditionally linked to Chinese influence. Caution is warranted when consulting academic publications from government-affiliated institutions in Vietnam, as such works frequently function as state-sanctioned instruments rather than independent scholarly research (Knud Lundbæk, 1986, p. 45).
Due to the political complexities surrounding Vietnamese linguistic classification, the author has devoted Chapter 5 - The politics of Chinese-Vietnamese Linguistics to examining the influence of Sino-Vietnamese relations on academic discourse. This topic significantly affects the ability to render impartial judgments on whether the Sino-Tibetan framework can gain renewed traction. Those familiar with Vietnam’s history, even as presented by its ruling government, may recognize the paradox inherent in this issue. Applying Western value systems to fully grasp these intricacies proves virtually impossible.
For Western-educated readers unfamiliar with Vietnam’s political landscape, the entanglement of scholarly discourse with China-related sensitivities may appear abstract. Yet these dynamics are well understood by locally trained scholars and members of the diaspora. The historically fraught Sino-Vietnamese relationship, addressed in later sections, illustrates how political motivations can distort academic narratives, concealing nationalist interests beneath ostensibly neutral frameworks.
Among educated Vietnamese youth, the Austroasiatic Mon-Khmer classification may seem straightforward. However, when confronted with questions about ancestral origins, many exhibit a form of cultural self-denial. Much like American youth who gradually lose connection with ancestral heritage, Vietnamese youth often downplay their predominantly Chinese descent, traced through paternal lineage across generations. Instead, they identify with the Kinh majority, participating in an assimilation process that omits traces of Chinese heritage, even when census records and Vietnamized surnames of Chinese origin suggest otherwise. This distinction reflects enduring Vietnamese values rooted in the Yue genealogical legacy.
As each generation inherits societal roles and cultural narratives from the last, perspectives and identities continue to evolve, perpetuating a cycle of historical interpretation shaped by complex socio-political influences.
In earlier sections and in a separate article-length paper, the author argued that the Vietnamese people emerged from a complex intermingling of Chinese immigrants from the north and local women. This assertion understandably provoked strong reactions from intellectuals and patriotic netizens (see Some Thoughts on the Origin of the Vietnamese People, written in Vietnamese, in Appendix K of this survey).
This paper draws on a detailed profile of racial composition to support the argument that, from 111 B.C. to 939 A.D., ancient Annam functioned as a prefecture of imperial China. Like any province within the Middle Kingdom, it received continuous waves of Chinese immigrants from diverse multilectal backgrounds. Over time, the convergence of these lects with indigenous languages obscured the Yue substratum, embedding it beneath successive layers of linguistic evolution.
This long continuum of historical developments ultimately shaped modern Vietnamese, a language in which over ninety percent of the lexical components derive from Chinese.
To contextualize the issue, Vietnam’s historical trajectory may be compared with three other sovereign entities in the region that, like Vietnam, absorbed large numbers of Chinese immigrants:
- Hong Kong underwent political unrest in 2019 and was reintegrated into Greater China during the COVID-19 pandemic (2019–2024). Its predominantly Cantonese-speaking population traces its lineage to mainland China. The region’s transition from British colonial rule to Chinese sovereignty parallels Vietnam’s ancient history, which began under Chinese domination in 111 B.C. This mirrors the transformation of the former NanYue state into the fully Sinicized Guangdong province.
- Singapore, once a Chinatown within Malaysia, gained sovereignty in 1965 and evolved into a multi-ethnic nation, with ethnic Chinese forming the majority.
- Taiwan maintains a precarious relationship with mainland China. Its contested sovereignty echoes Vietnam’s experience of intermittent tensions, both historically and in modern times.
Together, Singapore and Taiwan serve as contemporary reflections of the Annamese sovereign state, which eventually evolved into modern Vietnam. By contrast, Hong Kong simply resembles a younger extension of Shenzhen.
Among these regions, Taiwan faces a distinct identity crisis. It stands at the intersection of competing cultural trajectories while navigating political survival under increasing pressure from mainland China. On one side, it may embrace its Sinicized MinNan heritage, brought to Formosa by forebears from the MinYue Kingdom, now part of Fujian Province. On the other, there is a growing effort to align with Austronesian natives, those who identify with traditional customs such as betel-nut chewing and claim status as the island’s true indigenous inhabitants. These groups were initially marginalized following the influx of mainland Chinese migrants after the Kuomintang’s retreat in 1949.
Vietnam has similarly wrestled with a fundamental question of identity: whether its ancestry stems primarily from South Chinese migrants or from indigenous minority populations. In Vietnam’s case, this identity crisis has evolved beyond cultural heritage into a broader issue of nationalism, deeply entangled with political dynamics.
After the fall of the Ming Dynasty to the Manchus in the seventeenth century, Ming expatriates (明鄉人, VS NgườiMinhhương) emigrated to Vietnam, contributing a distinct component to the contemporary Hoa ethnicity (華僑). Many of these refugees hailed from Teochew origins in Guangdong’s Chaozhou region, including Chaozhou Prefecture and Foshan City. Today, their descendants form a significant portion of the population in Vietnam’s southern provinces, as reflected in the local saying: Dướisông cáchốt, trênbờ Tiềuchâu, "In the river, there are catfish; on land, there are Teochew people."
Etymologically, it is clear that Tiều is the origin of Tàu (Chinese), a clarification that dispels longstanding misconceptions. Contrary to popular but inaccurate interpretations, Tàu does not derive from Tần (Qin), Đường (Tang), or Ba Tàu (“three ships”). This distinction settles the etymon conclusively.
Adding further dimension to this narrative, subjects of the "Southern State", informally referred to as Namquốc (南國), take pride in their Yue ancestral heritage, which they continue to honor through annual sacrificial ceremonies centered around bronze drums. These rituals serve as cultural affirmations of lineage and identity. Ironically, the Vietnamese Kinh majority does not incorporate bronze drums into formal ritual practice. Instead, they often exhibit a sense of cultural superiority over minority groups who do use them, groups that, as indigenous peoples, may credibly share the same ancestral lineage and may even have been the original creators of the bronze drums themselves.
Ethnologically, Vietnam remains the only existing sovereignty that represents the living descendants of the Yue, including related Daic and Zhuang minorities in southern China. These communities are culturally distinct and historically significant, recognized by Beijing through the granting of autonomous status in acknowledgment of their heritage. Many of their ancestors sought refuge in remote mountainous regions following the Qin invasions.
China’s ethnic composition continues to reflect its Yue lineage, comprising Chu and Han subjects both before and after 111 B.C., until the arrival of Tartar peoples from the north who established successive dynasties in the Central Plains, including the Liao, Jin, Yuan, and Qing. During the Han era, Sinicized Yue, descendants of annexed states, were incorporated into imperial forces that expanded southward to conquer Giaochỉ (交趾, Jiāozhǐ), the territory known historically as ancient Annam. (C)
Prior to Vietnam’s independence in 939 A.D., and excluding Han immigration during its roughly 1,050-year colonial period (111 B.C. to 939 A.D.), Annam’s racial composition closely mirrored that of the Lingnan Range region, historically encompassing today’s Guangxi, Hunan, Guangdong, and Fujian provinces. (Q)
Following Vietnam’s separation from China, nationalism intensified through successive wars against Chinese dynasties seeking to reclaim the territory, including the Song, Yuan, Ming, and Qing, as well as ongoing conflicts with the People’s Republic of China. Historically, each time a Chinese monarchy reached the height of its power, it sought to reassert Vietnam’s status as a vassal state. The Annamese defied expectations by defeating the Mongols not once but three times in the thirteenth century, an achievement that astonished contemporary European observers. Yet no Chinese emperor, past or present, seems to have learned from Vietnam’s historical resilience. Regardless of dynastic strength, every conquest attempt ultimately ended in defeat, including the border war of 1979.
III) Reassessing Austroasiatic Theories
This section reexamines the Austroasiatic Mon-Khmer hypothesis through an anthropological lens, with particular emphasis on the Yue origins of Sinitic-Vietnamese (VS). The approach integrates newly identified basic cognates within Sino-Tibetan (ST) etymologies, offering a revised understanding of Vietnamese linguistic development. The Austroasiatic family can only be meaningfully connected to the Yue if it includes populations originating from the Yangtze River basin (揚子江). Reframing the discussion through Sino-Tibetan insights allows these etymological findings to clarify the presence of basic Vietnamese words that have long been misclassified as Mon-Khmer. Readers will encounter the significance of this argument as they progress through the text.
Historical grounding is essential to any rigorous linguistic theory. In contrast, the widely circulated Austroasiatic Mon-Khmer framework lacks such grounding. It has not been supported by a comprehensive historical perspective or substantiated by concrete records. Nevertheless, Austroasiatic advocates have long enjoyed academic prominence. Meanwhile, Sino-Tibetan proponents, despite relying on written documentation, continue to face institutional resistance.
For Mon-Khmer theorists, the framework rests primarily on a limited set of basic lexical items. Some are transcribed from oral speech, others reconstructed from early substratal forms. Yet the number of selectively identified loanwords accounts for less than 0.5 percent of the 80,000 Sinitic-Vietnamese entries potentially present in the Vietnamese lexicon. This is a negligible fraction of the language’s overall vocabulary. While Vietnamese lexicons do exhibit some cognates with Austroasiatic Mon-Khmer, they also reveal substantial overlap with Sino-Tibetan etyma. Given the limited lexical support for the Austroasiatic narrative, its dominance may prove unsustainable.
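The proportion cited above can be checked with simple arithmetic. The following is only a minimal sketch: the function name and constants are hypothetical labels introduced for illustration, and the two figures (80,000 entries, 0.5 percent) are taken directly from the paragraph above rather than from any independent count.

```python
def entries_for_share(total_entries: int, percent: float) -> float:
    """Return how many entries correspond to `percent` of `total_entries`."""
    return total_entries * percent / 100


# Figures quoted in the text (not independently verified here).
TOTAL_SV_ENTRIES = 80_000      # Sinitic-Vietnamese entries potentially in the lexicon
LOANWORD_SHARE_PERCENT = 0.5   # stated upper bound for the Mon-Khmer items

upper_bound = entries_for_share(TOTAL_SV_ENTRIES, LOANWORD_SHARE_PERCENT)
print(f"Upper bound on selectively identified items: {upper_bound:.0f}")
```

Under these figures, "less than 0.5 percent" corresponds to fewer than 400 lexical items.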
In contemporary linguistic discourse, the term Austroasiatic is often used interchangeably with Mon-Khmer, though Austroasiatic encompasses a broader geographic scope. The hypothesis, which implies a southern origin, proposes that ancestral Austroasiatic populations migrated both northward and southward from a homeland in Southeast Asia. These migrations are believed to have crossed now-submerged land bridges into South Asia, India, and southern China, while also extending toward present-day island regions.
However, the theory remains speculative. Archaeological evidence suggests that populations reaching Oceania likely traveled by sea rather than land. Whether future research will expand the hypothesis to include Austronesian and Austric classifications, potentially linking Polynesian and Malaysian populations across the South Pacific, remains an open question. (M)
As hypothesized above, the Austroasiatic homeland has been reevaluated, centering around the Mekong River region (Paul Sidwell, "The Austroasiatic Central Riverine Hypothesis," Journal of Language Relationship 4, 2010, pp. 117–134). Sidwell postulates that the Austroasiatic peoples could not have originated from the same racial stock as the Yue, whose habitat was historically situated in South China below the Yangtze River. Prior to Sidwell’s revision, earlier theories had proposed Yunnan Province as the Austroasiatic homeland, an area now recognized as the primary territory of the ancient Yue, who expanded southward.
For the indigenous populations of the southern regions, Paul Benedict (1975) proposed a distinct linguistic grouping under the designation Austro-Thai. His framework is an alternative to traditional Austroasiatic classifications, linking the Tai-Kadai languages with Austronesian. (泰)
Regardless of whether Vietnamese origins are framed through a Yue or Austroasiatic lens, it is important to recognize that the cultural artifacts unearthed in southern Vietnam predate the arrival of late Vietic-speaking populations. These relics cannot be considered part of the ancestral heritage of the later inhabitants who came to dominate the region. As some readers may have already inferred, the Kinh majority emerged from racially mixed origins, primarily descending from Sinitic-Vietnamese speakers shaped during the early phases of Han colonization. For this reason, the ancient Austroasiatic Mon-Khmer populations of Indochina cannot be directly linked to the prehistoric Yue ancestry of the Vietnamese further north.
It was not until the sixteenth century that Annamese Kinh groups began resettling the Mekong River Delta, migrating from the north. Long before this movement, ancestral Vietic speakers had already undergone significant linguistic and ethnic transformation through sustained contact with Yue and Han populations. These changes were later compounded by fusion with Mon-Khmer and Chamic communities during westward and southward migrations. Upon reaching the southern tip of Càmau Cape, they encountered the southern Khmer people. The Mon-Khmer influences found in Vietnamese culture today are therefore relatively recent, layered onto an existing cultural framework over the past ten centuries.
This historical trajectory precludes the possibility of the Vietnamese being direct descendants of prehistoric Austroasiatic Mon-Khmer populations, particularly in linguistic terms. While genetic analysis may offer further insight into these ancestral relationships, such biological inquiries fall outside the scope of this linguistic investigation.
Etymologically, countering the claims made by Austroasiatic proponents, recent findings indicate that the Vietnamese lexical base shares linguistic traits with over 400 fundamental words in the Sino-Tibetan linguistic family (See Chapter 10 - Parallels with the Sino-Tibetan Languages). These roots span a vast geographical terrain across southern Asia, forming a solid foundation to support the reclassification of the Vietnamese language.
The Austroasiatic Mon-Khmer hypothesis, which attempts to reconstruct prehistoric Yue, and hence Austroasiatic, lexicons, lacks historical substantiation and has led to notable linguistic inconsistencies. This framework has often been used as a foundation for subsequent theories, requiring counterarguments to challenge its claims. In contrast, the Sino-Tibetan hypothesis is supported by ancient Chinese records. Historical annals acknowledge the existence of the Bai Yue tribes, among whom certain groups are identified as ancestral pre-Viet-Muong peoples, including those from ÂuLạc (甌雒, OuLuo) and LạcViệt (雒越, LuoYue).
One of the most significant shortcomings of the Austroasiatic hypothesis is its inability to demonstrate a historical relationship, through ancient Khmer scripts, between basic Vietnamese words and Austroasiatic Mon-Khmer cognates. This stands in contrast to the careful reconstruction of Sinitic roots in Sinitic-Vietnamese etyma, supported by extensive documentation in Chinese characters across multiple linguistic periods. For example, the four basic human needs, ăn, ngủ, đụ, ỉa, align well with Chinese characters of equivalent meaning: 唵 ǎn (SV àm, "eat"), 卧 wò (SV ngoạ, "sleep"), 屌 diǎo (SV điệu, "copulate"), 屙 ē (SV a, "defecate").
Unlike the Austroasiatic framework, Chinese characters, each corresponding to a Sinitic-Vietnamese etymon and its alternants across various Chinese dialects, offer a rich body of linguistic evidence. From their earliest forms, archaic Chinese writing systems were constructed with phonetic-ideographic principles. Over centuries, Sinologists have deciphered their evolving sounds and meanings. Many original forms are self-evident, ranging from basic ideographs like 日 (rì, "sun" and "day") to more complex ideo-phonetic structures such as 麥 (mài, mạch, "wheat"), which originally replaced 來 (lái) in the sense of "millet." Over time, 來 (lái) was repurposed phonetically to mean "to come", as reflected in Vietnamese usage such as lúa ("rice paddy") and lại ("come"), among other derived forms discussed in later chapters.
Understanding linguistic history requires tracing the development of Sinitic-Vietnamese and Sino-Tibetan words through historical events and cultural shifts. The presence of Sinitic-Vietnamese words with cognates in Sino-Tibetan languages points to a shared linguistic ancestry. The etymology of a given Sinitic term often reveals its connection to a Sino-Tibetan counterpart, both contributing to derived Vietnamese forms that are distinct from Mon-Khmer elements. Within the Sino-Tibetan family, including Chinese dialects, historical linguistic patterns intersect with the reconstruction of Old Chinese lexicons. These lexicons, tentatively shown to share roots with Bodic (ancient Tibetan) languages, are supported by structured phonological analysis and preserved in Bodic scripts composed of syllabic alphabets.
In this context, etymology refers to the study of word origins, including their morphological components and evolution into modern forms. Related words across languages are termed etymons or etyma. This study focuses on Vietnamese and Chinese linguistic counterparts, with particular emphasis on Sinitic-Vietnamese etyma. Many basic Vietnamese words trace their roots to Sino-Tibetan sources, representing findings from the second stage of linguistic analysis as outlined by Ruhlen.
From the outset, the classification of Austroasiatic was a misnomer. Early Mon-Khmer theorists had limited knowledge of ancient Vietnamese and Chinese history, and Sinology was still an emerging discipline in the seventeenth century (see Knud Lundbæk, 1986). These scholars disregarded the existence of the Yue and their entity known as LuòYuè (雒越, SV LạcViệt), dismissing them as folklore despite extensive documentation in Chinese annals and classical texts. Whether due to difficulties in studying the Yue or an inability to link ancient Vietnamese history with Mon-Khmer linguistic ancestry, Austroasiatic pioneers struggled to correlate the LạcViệt with other Yue groups, including ŌuYuè (甌越, SV ÂuViệt), XīYuè (西越, SV TâyViệt), MǐnYuè (閩越, SV MânViệt), and WúYuè (吳越, SV NgôViệt). Collectively known in Chinese records as the Hundred Yues (百越, BáchViệt), these groups encompassed the Chu State, the Kingdom of NánYuè (南越, SV NamViệt), and others. Within this broader historical framework, the placement of Austroasiatic and Mon-Khmer peoples remains unclear.
The proto-Viet-Muong speakers, equated with the Lạc Việt, undeniably existed. This requires theorizing dialectal forms of the ancestral Yue language, which laid the foundation for the Vietic branch. To address this complexity, Austroasiatic specialists adopted a simplified approach, equating proto-Vietic forms with Austroasiatic ones. They relied on cognate etyma found in the Viet-Muong subbranch, which aligns with modern Mon-Khmer languages (see Robert Parkin, A Guide to Austroasiatic Speakers and Their Languages, 1991).
From a historical perspective, Austroasiatic Mon-Khmer theorists face significant challenges in reconstructing ancient linguistic forms that plausibly align with connections between the Lạc Việt, the Bai Yue, and Old Chinese. These interactions, dating back more than 3,000 years before present, left enduring linguistic imprints on the Vietnamese lexicon.
For example, Vietnamese words such as cầy and chó are historically linked to 狗 (gǒu, SV cẩu) and 犬 (quǎn, Western Sichuan Mandarin /co1/), both meaning "dog." Their derived disyllabic forms further reinforce cognacy with Sino-Tibetan roots. Notable examples include:
- 犬坐 (quǎnzuò) → chồmhỗm ("to squat")
- 犬牙 (quǎnyá) → răngkhểnh ("canine")
- 小狗 (xiǎogǒu, "puppy") → cầytơ
- 犬子 (quǎnzǐ, "pup") → concún ("puppy-dog")
The word chồmhỗm, which has been dubiously attributed to Khmer chorohom (see Nguyễn Ngọc San, 1993), is merely coincidental. As later chapters will demonstrate, such etyma, formerly grouped emphatically within the Mon-Khmer classification by Austroasiatic specialists (See Mei Tsu-lin, Appendix D-G), are now validated as belonging to the Sino-Tibetan linguistic family.
From a linguistic perspective, recorded history supports the Yue theory of an ancestral Yue language and its descendant speakers. Jerry Norman (1979) referred to it as a foreign extinct language. Ancient Chinese classics recount the Yue people speaking archaic Yue forms, coexisting alongside what were assumed to be early Taic languages spoken by subjects of the Chu State (楚國 Chuguo).
For example, The Yuèrén Song (越人歌) was recorded in the Yue language during the sixth century B.C. by Ejun Zizhe (鄂君子皙, Ngạc Quân Tử Tích) (see Appendix J). Chinese linguists have studied the song’s lyrics to analyze a few surviving words of the Yue language (see Appendix K). Additionally, King Liu Bang (劉邦), as previously discussed, likely spoke a subdialect of the Chu language, as he and his followers were former subjects of the Chu State before their triumph in establishing the Han Dynasty (漢朝).
Austroasiatic theorists lack historical records to substantiate their claims and cannot demonstrate how the Mon-Khmer framework fits into the prehistory of the Viet state, i.e., LạcViệt, especially when viewed within the broader Yue historical context of the same timeframe.
At the same time, we must acknowledge the limitations inherent in postulating the Taic-Yue language, a proto-Yue speech, particularly when juxtaposed against the Austroasiatic hypothesis, which lacks historical evidence to support its claims. By the time the Mon peoples migrated from South China into the Indo-Chinese peninsula, they had established a southern Mon-Khmer homeland, which later became a geographic pivot extending northward toward the Viet-Muong group. This movement preceded subsequent waves of Yue-mixed Han infantry that followed Han colonialists into ancient Annam.
It is suggested that the Lạc Việt ancestors of the ancient Vietnamese may have spoken an archaic form of Yue, possibly a proto-Viet-Muong speech derived from a Taic ancestral language. Interestingly, Austroasiatic theorists inadvertently included such linguistic ancestry in their reconstructions of Vietnamese origins, even though it does not align with modern Mon-Khmer languages.
In other words, a prehistoric Taic language likely served as the ancestral root of proto-Mon-Khmer, but as linguistic evolution progressed, the Mon-Khmer and Viet-Muong branches diverged. The Mon-Khmer lineage gradually shifted away from archaic Taic forms, while the Viet-Muong branch integrated elements of Sinitic linguistic fusion through interactions with ancient emigrants from South China. Initially regarded as guest settlers regardless of social status, these migrants intermarried with the local populations, ultimately forming the majority ethnic group now known as the Kinh, distinct from other minority groups such as Mon-Khmer speakers.
As a result, the Austroasiatic Mon-Khmer theoretical narrative bears a striking resemblance to material claims made by certain Vietnamese scholars who assert national ownership over excavated artifacts found in annexed territories. Such claims often frame indigenous relics as ancestral heritage despite the absence of direct lineage. In this regard, the Austroasiatic theory relies heavily on linguistic and anthropological assumptions grounded in minority populations historically exiled to mountainous regions.
Analogously, this approach is similar to branding modern American citizens as direct descendants of indigenous American Indian ancestry or equating Taiwanese identity exclusively with Austronesian or Daic Han roots, both nations having less than three centuries of recognized history (U.S.A. and the Republic of China, respectively).
Any credible theory regarding the origins of languages, whether Indo-European or Sino-Tibetan, requires historical validation, often reinforced by written records, as seen with Latin, Greek, Pali, or Sanskrit within their respective linguistic families. Without historical substantiation, such theories remain purely hypothetical. Both prehistoric and documented historical periods play a critical role in shaping linguistic evolution, influencing whether languages endure or become extinct. In essence, history is the foundation of both a nation and its language.
By contrast, the Austroasiatic Mon-Khmer hypothesis lacks direct historical evidence and remains largely speculative, relying primarily on reconstructed basic lexicons. While its proponents have devised plausible classifications and methodologies, the theory notably omits any substantive connection to Chinese linguistic development. This study adopts linguistic principles from the Austroasiatic hypothesis to examine the structural framework that shaped its theoretical evolution. The strength of this hypothesis lies in its methodological ingenuity, including the use of limited sets of basic words and substituting historical records with archaeological evidence and preliminary DNA analysis of Vietnamese Kinh populations where applicable. These approaches have systematically defined the Austroasiatic Mon-Khmer hypothesis based on raw linguistic data from languages such as Mon, Khmer, Katuic, Bahnaric, Nicobarese, Viet-Muong, Vietic, and Muong.
Similar to structuralist theories of sound change, which are impersonal, mechanical, non-intuitive, and strictly formal, the methodological framework applied in Austroasiatic studies could theoretically be expanded to construct hypotheses for other languages (Robert J. Jeffers et al., 1979, p. 91). Using the same tools and methodologies introduced by Austroasiatic Mon-Khmer theorists, one could potentially formulate a linguistic model for any language, even one without historical reference, and present it as a legitimate framework. For instance, an unfamiliar African tribal language could be theorized using existing linguistic data.
It is undeniable that Western methodologies have significantly advanced linguistic research, yielding breakthroughs across multiple linguistic families. Beginning with Indo-European languages, these methods later extended into Sino-Tibetan studies, contributing innovations such as the reconstruction of Old Chinese phonology from the early twentieth century onward. The Austroasiatic Mon-Khmer hypothesis was similarly shaped by this wave of Western linguistic inquiry, as early scholars, including Maspero in the 1940s and Thomas in the 1960s, introduced Vietnamese lexical roots based on comparative Mon-Khmer word lists. Their work gained prominence by identifying Mon-Khmer–Vietnamese cognates that conveniently aligned with structuralist frameworks addressing sound change patterns and tonal genesis, reinforcing dominant Western linguistic paradigms at the time.
Under such influential factors, the Austroasiatic Mon-Khmer perspective on the origin of Vietnamese has been widely accepted, largely due to its institutional backing. Going with the crowd often appears to be the safest path in academia. Much like their Sinologist predecessors, Western-trained Vietnamese scholars, emerging from nearly a century of French colonial rule, have generally conformed to new rationalizations, either under external pressure or voluntarily. Consequently, local specialists frequently align themselves with the Austroasiatic camp to ensure their research receives recognition, circumventing the obscurity that befell many overlooked studies in earlier decades.
For many, this phenomenon is less a matter of intellectual preference and more about academic survival, ensuring inclusion within mainstream scholarly circles. Unfortunately, most scholars entrenched in the Austroasiatic framework fail to produce groundbreaking insights, ultimately remaining caught in a scholarly merry-go-round. Breaking free from this paradigm is a challenge only they can overcome.
If readers revisit the geographical divisions underpinning Austroasiatic theories, originating in the southeastern portion of the Southeast Asian peninsula, where the Mekong Basin meets the sea, they may notice stark contrasts with northern Vietnam’s historical ties to South China. These northern regions were once home to ancient Yue speakers, LuoYue, OuYue, and other groups documented in Chinese historical texts, who later intermingled with early Han resettlers following the annexation of the Nam Viet Kingdom into the Han Empire in 111 B.C. Linguistically, the ancient Vietnamese language and certain Chinese dialects developed in parallel through a comparable process of racial and cultural blending.
By 939 A.D., it is highly plausible that the ancient Annamese population possessed bilingual proficiency, enabling them to conduct official affairs in Middle Chinese (MC) while maintaining colloquial speech in a Sinitic-Yue mixed language. This hybrid tongue, referred to here as Ancient Annamese, would have been intelligible to metropolitan subjects within the Nam Hán State, encompassing Guangdong and Guangxi.
Austroasiatic theorists, despite their methodological shortcomings, have introduced linguistic instrumentalism into their studies. Their work has produced a catalog of over 100 basic Vietnamese words that plausibly share cognates with Mon-Khmer languages. However, based on recent findings of Sino-Tibetan roots, presented later in this research, many of these words likely emerged from linguistic contact with Mon-Khmer speakers residing in remote mountainous regions. This contact likely dates back to a distant past when Mon-Khmer and Viet-Muong speakers, displaced by Han expansion, remained in their homeland.
Eventually, both local inhabitants and northern settlers were absorbed into a colonial society, forming the emergent majority now recognized as the Kinh people. This linguistic convergence likely occurred during interactions between Annamese and Mon-Khmer speakers as Vietnam's territory expanded southward beyond the 16th parallel after the twelfth century, an expansion that culminated in the late eighteenth century when it reached Càmau on the Gulf of Thailand. Consequently, linguistic exchanges and borrowings became inevitable, as evidenced by Chamic lexical elements in the Central Hue subdialect.
Through ongoing territorial expansion and intermarriage between settlers and indigenous populations, Mon-Khmer elements gradually integrated into Vietnamese vocabularies over time.
Southeast Asian languages are thought to share a homeland in the same geographical area as Vietnamese. Merritt Ruhlen, in The Origin of Language (1994, p. 143), outlines the Austric linguistic family and its classification:
"The Austric family of Southeast Asia consists of four subfamilies: Austroasiatic, Miao-Yao, Daic, and Austronesian, the last two of which appear to be the closest to each other. The Austroasiatic sub-family consists of two branches, Munda and Mon-Khmer. The small Munda branch is restricted to northern India, while the Mon-Khmer branch, more numerous in both languages and speakers, is spread across much of Southeast Asia, often interspersed with languages of other families. Vietnamese and Khmer (or Cambodian) are the two best-known Mon-Khmer languages."
Further contextualizing Southeast Asian linguistic distribution, Ruhlen elaborates:
"The Daic languages, of which Thai and Laotian are the two best known and the only ones to achieve the status of national languages, are found in Southern China, northern Vietnam, Laos, and Thailand. Austronesian languages are found on Taiwan, which is probably the original homeland of the family, but also on islands throughout the Pacific Ocean, and even on Madagascar in the Indian Ocean close to Africa. The presence of Chinese domination of Taiwan is a consequence of a recent migration from the mainland that began in 1626. Over six millennia earlier, a previous migration from the mainland, of people closely related to the Daic family, led to the original Austronesian occupation of the island, which turned out to be a small step in one of the longest and most hazardous migrations in human history."
Regarding Austronesian expansion, Ruhlen (1994, p. 178) notes:
"The archaeological record of Southeast Asia indicates that the Neolithic revolution in this part of the world began in China around 8,000 years ago. Evidence of millet cultivation in the Yellow River basin and rice cultivation in the Yangzi basin dates to approximately that time. By 5,000 B.P. [before present], farming had spread southward to Vietnam and Thailand and eastward to coastal China. Archaeological findings from this period include villages, pottery, stone and bone tools, boats and paddles, rice, and the bones of domesticated animals such as dogs, pigs, chickens, and cattle.
"About 6,000 years ago, one or more of these agricultural groups crossed the Strait of Formosa (now the Taiwan Strait) and became the first inhabitants of Taiwan. From Taiwan, these ship-building agriculturalists spread first southward to the Philippines and then eastward and westward throughout most of Oceania. The archaeological record indicates that the northern Philippines were reached by 5,000 B.P., and 500 years later, these migrants expanded southward to Java and Timor, westward to Malaysia, and eastward to the southern coast of New Guinea. By around 3,200 B.P., their expansion had reached Madagascar to the west and had extended as far east as Samoa, the central Pacific, the Mariana Islands, and Guam in Micronesia."
In the Epilogue of The Origin of Language, Ruhlen (1994, pp. 195–196) emphasizes the two essential stages in historical linguistics:
1. Classification (Taxonomy): This stage defines all language families at every level and is crucial in establishing the linguistic lineage of languages before reconstruction efforts begin.
2. Comparative Method (Reconstruction): Once a language family is identified, scholars address specific historical linguistic questions, including sound correspondences, homelands, and the transformation of proto-language words into their modern derivatives.
Ruhlen critiques twentieth-century Indo-Europeanists for attempting to reverse these two stages by insisting that family-specific reconstruction and sound correspondences must be used to determine linguistic classifications. This inversion has led to theoretical stagnation, where anything beyond the obvious is considered outside the comparative method’s scope.
This methodological flaw is precisely what Austroasiatic theorists have perpetuated with the Mon-Khmer hypothesis concerning the Vietnamese language.
The hypothesis of Vietnamese linguistic origins was initially proposed by Indo-Europeanists who relied predominantly on the comparative method. Their approach involved identifying basic words with similar meanings and sound-change patterns within topological isoglosses, ultimately postulating a descent from common proto-languages. However, this classification effort lacked engagement with historical documentation, failing to incorporate Sino-Tibetan etymological continuity.
As linguistic research advances, reassessing the Austroasiatic Mon-Khmer hypothesis through comparative analysis, paired with historical records, is crucial for a comprehensive understanding of Vietnamese linguistic ancestry. (I)
The methodology applied by Austroasiatic theorists was rooted in rigid mechanical paradigms, often modeled after mathematical formulas derived from Indo-European linguistic schools, historically insufficient for Southeast Asian linguistic reconstruction. Specifically, these early approaches lacked substantial evidence regarding the people, their language, and their homeland. Consequently, they failed to establish the language family prior to engaging in comparative analysis, an inversion of the proper historical linguistic methodology, as Merritt Ruhlen previously emphasized.
Historical names play a crucial role in distinguishing origins, timelines, and affiliations. In academic discourse, naming conventions shape linguistic classification frameworks. For instance, the term Sinitic designates an ancient entity that had yet to emerge, whereas Yue refers to an earlier, distinct ethnolinguistic group that existed long before northwestern and northeastern resettlers moved south, intermixing with the native Yue population to form the entity later recognized as Sinitic. A similar transformation occurred with the aboriginal inhabitants further south, in what is now northern Vietnam, ultimately giving rise to those later known as Vietnamese.
Geopolitically, the historical name Vietnam eventually gave rise to the term Vietnamese, a designation that emerged long after Annamese. Similarly, Austroasiatic theorists constructed a linguistic narrative in which Austroasiatic predates the Mon-Khmer epoch, followed by Viet-Muong, which later evolved into modern Vietnamese, an assertion lacking concrete historical substantiation.
As Confucius aptly stated, "名正言順," loosely translated as "Everything is justified in the name." The Austroasiatic camp may not have considered that the earlier ancient state names associated with "Vietnam", its people, and their language only materialized after 939 A.D. During this period, Vietnam was referred to as Nhà Ngô (吳朝 Wúcháo, the Ngô Dynasty), which bore no direct relation to the nominal state of Annam. The nation gradually evolved into an independent polity with successive name changes over the following centuries.
Notably, during this era, the historical NamHán Kingdom (南漢 NánHàn, "Southern State of Han"), encompassing coastal stretches of present-day Guangdong Province and the northwestern segment of northern Vietnam, featured an intriguing historical designation. King Liu Yan (劉嚴) of NamHán initially named his newly founded state ĐạiViệt (大越 DàYuè, "The Great Viet") before adopting the lasting historical identity NamHán (南漢), as documented in Chinese historical records (Lü, Shih-P'eng, 1964, p. 147). This nomenclatural shift reflects the demographic composition of the population itself, where Việt and Hán symbolized the integration of these identities.
The name ĐạiViệt would later become synonymous with ancient Vietnam, beginning with the Lý Dynasty (1009–1225). Interestingly, 大越 DàYuè surfaced more than once in Chinese history. One notable instance occurred in 895, during the turbulent decline of the Tang Dynasty, when Dong Chang (董昌) declared himself king and established 大越羅平國, later known as 越州 Yuezhou, in what is now 紹興 (Shaoxing City), Zhejiang Province (Bo Yang, Vol. 63, p. 155).
Further reinforcing this historical continuity, today’s Guangdong Province retains its ancestral state designation, NamJyut Kwok (南越國, SV NamViệtquốc), a reminder of the Yue origins of the region.
In Annam, the succeeding reigns of different dynasties adopted varying state names (quốchiệu 國號) throughout ancient times. Linguistically, the Austroasiatic Mon-Khmer, Viet-Muong, and Vietic classifications, alongside the concept of Vietnamese, are modern scholarly constructs employed to describe the independent Annam of the tenth century. At that time, its territorial boundaries stopped short at what is now Hàtĩnh Province, without extending into the southern-central region.
Under the Austroasiatic Mon-Khmer theoretical framework, Vietnamese conveniently aligns with each state name assigned in subsequent historical periods, extrapolating its origins back to prehistoric times. However, the intrinsic linguistic nature and characteristics of pre-Vietnamese, before evolving into modern Vietnamese, were not identical to what Austroasiatic Mon-Khmer theorists often classify as proto-Vietnamese.
Today, discourse surrounding Vietnamese often shifts toward analyzing distinct linguistic influences, such as Chamic elements embedded in regional subdialects like Huế. Indicative pronouns such as ni, nớ, mô, tê, ri, rứ, chừ, etc., have been theorized to originate from Chinese influence, among other sources. This underscores the reality that discussions on Vietnamese encompass disparate linguistic elements. The Vietnamese people and language of the tenth century likely had little connection to the Austroasiatic Mon-Khmer linguistic enclaves referenced in the early twentieth century.
Such assumptions are reinforced by the earliest forms of Nôm vocabulary, found in fifteenth-century texts such as Phậtthuyết Đại Báo Phụmẫu Ântrọng Kinh (Buddhist Canon on Returning Favors to One’s Parents).
The Austroasiatic Mon-Khmer hypothesis relies heavily on a curated list of basic words purportedly sufficient to determine Vietnamese linguistic origins. However, such an approach may be limited in accurately reflecting historical reality, especially given that the name Vietnam was first officially adopted as a state designation in 1804 under King Gia Long of the Nguyễn Dynasty. (V)
By this period, Vietnamese resettlers had barely interacted with Khmer communities inhabiting newly annexed territories acquired from Cambodia as recently as the sixteenth century. Notably, the ethnic composition of these territories now consists of roughly 30% each of the three major groups: Vietnamese Kinh, Chinese Teochew, and Khmer.
For instance, if the Vietnamese word chồmhỗm ("squat") is cognate with the Khmer chrohom, such an association seems natural given linguistic contact in shared geographic regions. However, chồmhỗm can also be traced to 犬坐 (quǎnzuò), a word dating back 2,000 years and recorded in Zuozhuan (左傳), demonstrating its broader linguistic depth beyond Mon-Khmer influences.
Terminologies such as Austroasiatic, Mon-Khmer, Viet-Muong, and Vietic are modern conceptual constructs, applied retrospectively to historical linguistic entities that remain largely theoretical. Analogously, consider the 2020 official census figures, which list Khmer and Chinese minorities as 1.37% and 0.78% of Vietnam’s population, respectively.
However, such statistics overlook the broader historical reality: a significant number of early Chinese immigrants gradually assimilated into the mainstream Kinh population, just as Khmer communities have been officially recognized as Vietnamese nationals since the early 1960s, both in legal classifications and sociopolitical integration.
From a historical perspective, Austroasiatic specialists frequently disregard Vietnamese history, likely due to a lack of historical evidence supporting their linguistic hypothesis. The southern territories of modern Vietnam once belonged to the Khmer Kingdom, and events in Khmer history bear no direct connection to ancient Annam.
Conversely, Vietnam’s historical trajectory is more congruent with China’s broader historical context, both linguistically and politically. The development of the Vietnamese language can be traced back over 2,500 years to Old Chinese (OC), underscoring its deep historical roots and linguistic evolution. This affirms that a language should not be analyzed merely through linguistic classification or laboratory-based phonological models; it must be contextualized within its historical lineage.
To situate these linguistic inquiries within a broader historical framework, we turn to the Yue groups who once inhabited the territorial domain of the NamViệt Kingdom (南越王國). These include:
-
LạcViệt (雒越, LuóYuè) and ÂuLạc (歐雒, OuLuo): widely regarded as the ancestral populations of early Vietnamese.
-
TâyViệt (西越, XiYue): considered precursors to Cantonese and Teochew-speaking communities.
-
ĐôngViệt (東越, DongYue): proto-Fukienese groups associated with the region now known as Fujian Province.
This progression of Yue linguistic strata illustrates how Annam remained deeply interwoven with China’s historical and linguistic legacy, even as it gradually charted an independent course.
Historical records suggest that both before and after 111 B.C., the Yue tribes likely spoke mutually intelligible variants of a common ancestral Yue language. This included the speech of neighboring communities in the Chu State (楚國, ca. 1030–223 B.C.), whose linguistic patterns might reflect early Taic affiliations.
King Liu Bang (劉邦), founder of the Western Han Dynasty (西漢王朝), and his followers were originally subjects of Chu. Their ancestors likely spoke an archaic Daic language belonging to the broader Taic linguistic family, which gradually diverged into distinct forms during the Warring States Period (475–221 B.C.), culminating in the Qin unification under Qin Shi Huang (秦始皇).
Following the Han conquest and annexation of NamViệt (南越, NanYue) in 111 B.C., its inhabitants likely retained mutual intelligibility with neighboring Yue communities. However, linguistic divergence increased as geographic distances widened. The territorial expanse of ancient NamViệt included parts of northeastern Vietnam, notably the Han prefecture of Giaochâu (交州), which later transitioned into the protectorate of Annam, known as the Pacified South.
From this multilingual environment, early forms of Vietnamese and Cantonese emerged from Taic-Yue foundations, gradually incorporating Sinitic elements through successive periods of Han rule.
These linguistic trajectories show minimal overlap with Austroasiatic Mon-Khmer classifications. Even the Daic ancestry of early Cantonese populations diverges from Austroasiatic models. While the Mon-Khmer framework has contributed methodologically, it fails to align with the documented historical processes that shaped the evolution of the Annamese language.
In contrast, the Sino-Tibetan classification offers stronger alignment with historical narratives. It traces linguistic convergence between ancient Vietnamese and Middle Chinese (MC), suggesting bilingual proficiency among Annamese populations by the tenth century. Language contact between Annamese and Cantonese likely persisted within the NamHán State, which encompassed today's Guangdong and Guangxi provinces and a portion of northern Vietnam.
Ultimately, the author finds it difficult to reconcile Austroasiatic Mon-Khmer elements with the historical development of Annamese, a language whose formative evolution reflects closer affinities with Sinitic-Yue transitions than with Mon-Khmer derivations.
Historically, China, referred to as the Middle Kingdom (中國, Zhōngguó), functioned as a central state among smaller vassal entities. Today, it operates as a union of multinational regions under centralized governance. Within this framework, regions such as Tibet, Inner Mongolia, Xinjiang (Uighur), and the Daic-Kadai areas of Guangxi, along with Hong Kong and Taiwan (Formosa), retain distinct historical identities regardless of their linguistic or ethnic composition.
Within China’s borders, most Sinitic languages are classified as dialects of the broader Sinitic family. Consider Annam, which once served as a Chinese prefecture. Hypothetically, if Canton were to separate from China and evolve into an independent state, it could eventually resemble Vietnam or Taiwan, an outcome consistent with historical patterns. For example, Hainanese, spoken on Hainan Island, is linguistically related to MinNan, introduced by Fujianese settlers during the Han Dynasty. Yet, as with Teochew, speakers of these linguistic cousins often struggle to understand one another despite shared historical roots.
Understanding Vietnam’s development requires recognizing its emergence from a breakaway prefecture of Greater China. Had Vietnam remained part of China, there would be no debate over whether its population spoke a Sinitic language, similar to Cantonese or Fukienese, both of which fall under the Sino-Tibetan classification.
By contrast, the Austroasiatic Mon-Khmer perspective lacks historical grounding, not only in linguistic terms but also in relation to the former Khmer Kingdom, which developed independently of Vietnam’s trajectory. Politically, no aspect of Khmer history aligns with the narrative of ancient Vietnam under imperial Chinese influence.
After a millennium of Chinese colonial rule, it is notable that the primary language spoken in Vietnam did not evolve into a Sinitic language thanks to its separation from mainland China. Instead, it developed into full-fledged Vietnamese, with Middle Vietnamese emerging as an independent linguistic entity around 939 A.D.
To understand Vietnamese linguistic evolution over the centuries, its development may be likened to that of English. Just as Greek and Latin lexical components enriched the Anglo-Saxon foundation of English, integrating within the Indo-European family, so too did Sino-Tibetan and Sinitic elements merge with the Yue substratum, shaping Vietnamese into its distinct form.
IV) Archaeological Evidence and Yue Origins
Decades of archaeological excavations have uncovered material evidence supporting historical accounts of the ancient Yue peoples who once inhabited southern China. These groups, collectively referred to as 百越 (BáchViệt), are documented throughout China’s five-thousand-year historical record. The findings confirm that the South China region served as the native homeland of the ancient Taic-speaking populations, from whom the Southern Yue (南越族, NamViệttộc) and other tribal branches emerged (see Zhang Zengqi, 1990, Archaeology of Ethnic Minorities in China's Southwestern Regions / 中國 西南 民族 考古).
Geographically, China South (華南, Hoanam) stands in contrast to China North (華北, Hoabắc), which encompasses the Middle Plain (中原), the historical heartland of imperial China. This northern region extends beyond the Yellow River's northern bank, from Shaanxi in the west to the Shandong Peninsula (山東省) in the northeast, reaching Bohai Bay on the Yellow Sea. Many Tartaric dynasties arose in this zone, ruled by Altaic-origin powers such as the Khitan Empire (契丹) and the Liao State (遼國, 916–1125). Early Mandarin, as a vernacular standard, was shaped in this region during the Yuan Dynasty (元朝).
From the twelfth century onward, the Mongolian Rhyme Book (蒙古字韻) documented northern Mandarin pronunciations. In parallel, the Annamese Translated Wordbook (安南譯語) preserved Vietnamese vocabulary from the same period. Notably, idiomatic expressions such as Sưtử Hàđông (河東獅子, "Tiger wife") entered Vietnamese usage, likely through the vernacular court language popularized during the Han colonial period in ancient Annam.
This linguistic convergence helps explain phonological and lexical parallels between Vietnamese and Mandarin, challenging claims that Mandarin exerted little influence on Vietnamese. The twenty-five years of Ming Dynasty rule in the fifteenth century, during which Vietnamese literary works were destroyed and Chinese was imposed nationwide, further underscore the depth of linguistic impact.
The modern term Việtnam (越南) originally denoted "the Yue of the South", implicitly suggesting a counterpart: Việtbắc, or "the Yue of the North" (越北). Today, Sinicized Yue populations such as Cantonese speakers (漢化粵族) remain within China’s borders. Yet their ancestors may not have fully recognized the location of their original homeland in South China. Over time, Yue tribes dispersed widely, with some evolving into what later became known as the "Yue of the North." In a narrower sense, "Yue of the North" (粵北) refers to speakers of Mansheng (蠻聲, tiếngMôn = 聲蠻) in Shaozhou Tuhua (韶州 土話), a subdialect spoken in the border regions of northern Guangdong (廣東), Hunan (湖南), and Guangxi (廣西). These dialects are mutually unintelligible with Hunanese, Cantonese, and Mandarin, reflecting deep linguistic divergence despite shared Yue ancestry (see https://en.wikipedia.org/wiki/Yuebei_Tuhua).
Ethnologically, the forefathers of these speakers were descendants of earlier Taic aboriginals who formed the population of the Chu State (楚國). Similarly, their Yue descendants likely contributed to the Han Chinese demographic in later historical periods (see Bình Nguyên Lộc, 1972, Nguồn gốc Mãlai của Dân tộc Việt Nam, The Malay Origin of the Vietnamese). These ancient Northern Yue populations were once concentrated in regions that now encompass Hebei (河北), Anhui (安徽), Hubei (湖北), and Jiangsu (江蘇) provinces in present-day China.
Following the trails of artifacts left by their Yue descendants along migratory routes from the Yangtze Basin, we find significant evidence of a southward migration. This migration was driven by the encroachment of early Tibetan nomadic groups beginning around the Xia Dynasty (夏朝, c. 21st–17th century B.C.) or the Yin Dynasty (殷朝, c. 16th–11th century B.C.). The proto-Yue tribes were pushed southward, passing through the Indo-Chinese Peninsula, postulated as the Austroasiatic (AA) homeland, and even reaching distant islands such as Indonesia. The discovery of Đôngsơn-style bronze drums strongly corroborates these migratory movements. Such artifacts, found as far as Java and New Guinea, closely resemble relics excavated from the Đôngsơn Culture (700 B.C.–100 A.D.) in North Vietnam’s Red River Delta.
The Yue people began crafting bronze drums as early as 600 B.C. or earlier in South China and ancient Annam, known during the Han period as Giaochâu Prefecture (交州). According to the Annals of the Later Han (後漢書), Han General Ma Yuan (馬援 Mã Viện, 14 B.C.–49 A.D.) melted down bronze drums seized from the local LuóYuè rebels (雒越 LạcViệt) and repurposed the metal. Surviving examples of these drums remain some of the finest representations of indigenous Yue craftsmanship.
Precise dating of these artifacts provides compelling evidence supporting historical accounts of ancient Yue migrations into various regions. For instance, large Yue bronze drums similar to those from Đôngsơn were discovered in Wangjiaba, located in Yunnan’s Chuxiong Yi Autonomous Prefecture (萬家埧 楚雄 彝族 自治州), China, in 1976. These artifacts date back over 2,700 years. (See https://en.wikipedia.org/wiki/Dong_Son_drums.)
Further research remains necessary to strengthen the archaeological foundation of the Yue-based theory for prehistoric South China. This framework stands in direct contrast to the Austroasiatic hypothesis, particularly its Mon-Khmer linguistic subfamily. The overlapping characteristics between these two groups suggest that they may have shared a common ancestral heritage, differentiated primarily by temporal and geographic factors.
However, this analysis does not aim to examine, from an ethnological or archaeological perspective, what happened 10,000 years ago in South China and Southeast Asia, or who might then have spoken any ancestral languages. Such an extended timeframe surpasses the scope of even glottochronology, which, at best, estimates linguistic affiliations of daughter languages based on 100 to 200 correctly identified core words in active use over approximately 5,000 years (Robert J. Jeffers et al., Ibid., p. 133). Instead, this paper focuses on much later historical periods, specifically within a timeframe of 2,000 to 3,000 years B.P. These periods center around the usage of fundamental vocabulary in certain ancient aboriginal languages that coexisted with the spoken language of the Chu (楚, SV Sở) population. This was a time when the Yue people were indigenous inhabitants of South China, as documented in classical Chinese literature, preceding the establishment of the Middle Kingdom (中國 Zhongguo). (See Appendix J: Yueren Ge (越人歌).)
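The time-depth limit mentioned above can be illustrated with a minimal sketch of the classical glottochronological formula, t = ln(c) / (2 ln(r)), where c is the fraction of core words two sister languages still share and r is the retention rate per millennium. The retention values used here (0.86 for a 100-word list, 0.805 for a 200-word list) are conventional figures from the general glottochronology literature, not values given in this chapter, and the function names are hypothetical. The sketch is meant only to show why the method loses resolution at great time depths; it is not a dating of any language discussed here.

```python
import math

# Assumed retention rates per millennium (conventional figures from the
# glottochronology literature, not taken from this chapter).
RETENTION_100_WORD_LIST = 0.86
RETENTION_200_WORD_LIST = 0.805


def expected_shared_fraction(millennia, retention=RETENTION_100_WORD_LIST):
    """Fraction of core words two sister languages are expected to still
    share after diverging for the given number of millennia."""
    if millennia < 0:
        raise ValueError("millennia must be non-negative")
    # Each language retains words independently, hence the factor of 2.
    return retention ** (2 * millennia)


def divergence_time_millennia(shared_fraction, retention=RETENTION_100_WORD_LIST):
    """Estimated separation time in millennia from the shared fraction,
    using t = ln(c) / (2 * ln(r))."""
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0 < retention < 1:
        raise ValueError("retention must be in (0, 1)")
    return math.log(shared_fraction) / (2 * math.log(retention))


if __name__ == "__main__":
    # Compare the 2,000-3,000 year window with the 5,000 and 10,000 year marks.
    for millennia in (2, 3, 5, 10):
        share = expected_shared_fraction(millennia)
        print(f"{millennia:>2} millennia: expected shared fraction {share:.3f}")
```

Under these assumed rates, the expected overlap after ten millennia shrinks to only a few words per hundred, too few to separate inheritance from chance resemblance or borrowing, whereas the 2,000 to 3,000 year window still leaves a sizable shared core to compare.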
Linguistically, the aboriginal languages spoken by the descendant-Taic populace of the Chu State (楚國) gradually merged with other Yue languages due to the state's vast territorial expanse, which housed a diverse population. This process occurred alongside the development of early Sino-Tibetan speech, which solidified Proto-Chinese and Archaic Chinese (ArC, 上古漢語) into standardized forms. The languages spoken by the subjects of ancient states collectively contributed to the formation of Old Chinese (OC, 先秦雅音). Within this framework, the diplomatic Yayu (雅語) was adopted by various states, including Wu (吳), Yue (越), Yan (燕), Han (韓), Zhao (趙), Qi (齊), and Qin (秦). This linguistic evolution ultimately led to Ancient Chinese (西漢古漢語, AC), the court language of the unified Han Chinese in the Middle Kingdom, later recognized globally as China, enduring through all dynastic transitions.
Today, nearly all Sinicized Yue languages spoken within China's borders have been classified as Chinese dialects, such as Cantonese, Fukienese, and Shanghainese. Meanwhile, alongside their development, the ancient Vietic language, ancient Vietnamese, continued evolving, intrinsically tied to China’s southward territorial expansion. Successive waves of Han settlers migrated into Giaochỉ, another designation for ancient Annam in Chinese annals. These settlers consolidated colonial rule and integrated indigenous Yue tribal customs within the Han ethnic sphere, including monarchal governance and Confucian education. This process mirrored the administrative structures already established in the NamViệt Kingdom (南越國, NanYueGuo, SV NamViệtquốc), ruled by the Triệu Dynasty, further accelerating the Sinicization of the native Annamese population in the southwestern region. (See Bo Yang, Sima Guang, Zizhi Tongjian 資治通鑑, Vol. 2, 1983.)
Anthropologically, the population within present-day South China, comprised of a racial admixture of Han settlers and the native Yue people, was frequently conscripted into the Han army, a practice dating back to the Qin Dynasty. These mixed-population soldiers participated in conquest campaigns that led to the invasion and occupation of ancient Annam. Han infantrymen conquered Annamese lands and established garrisons, followed by civilian resettlement. Later, during peacetime, civil officials and their families migrated to the region under the administration of Viceroy Sĩ Nhiếp (士攝 Shì Shè). Revered by the emerging Annamese aristocracy and later by the so-called Kinh plebeians, Sĩ Nhiếp disseminated Chinese language and culture across Annam.
Over the next 1,009 years of Chinese rule, the influx of soldiers, officials, refugees, and Han immigrants continued unabated. They confiscated land, resettled, and imposed substantial social changes, a dynamic that persists in the modern era. This transformative period saw the fusion of Ancient Chinese with the languages spoken by indigenous minority groups, including the Yue, Daic, and Mon-Khmer peoples. Successive layers of Sinitic linguistic influence evolved atop these substrata, shaping the early forms of the ancient Annamese language.
As with earlier settlers, Han newcomers intermarried with the local population, gradually forming a new dominant class, the ancestors of the Kinh people, who inhabited the northern region of ancient Vietnam. This development came at the expense of native groups such as the Muong and Mon-Khmer peoples, many of whom were displaced into remote mountainous areas and gradually became minorities in their own homeland.
The historical factors outlined above directly shaped modern Vietnamese identity and its language, resulting in a process of extensive Sinicization. To fully grasp the impact, one might consider what would have happened to a small vassal state like Vietnam, comparable to a modest Chinese province, after enduring over 2,200 years of colonial influence. A parallel hypothetical case could be drawn for Taiwan projected 700 years into the future. With over 300 years of colonization by Chinese rulers, descendants of Qing viceroys stationed on the island and the defeated Kuomintang armies that retreated there, Taiwan could potentially undergo a process similar to Vietnam’s millennia-long Sinicization. However, given advancements in modern communication technologies, Taiwan’s official Mandarin language would likely undergo fewer transformations compared to ancient Annamese.
The historical trajectory of the Vietnamese language initially seemed straightforward. Early Vietnamese scholars proposed that it evolved from the Sinitic branch of the Sino-Tibetan family. However, as linguistic research expanded, competing theories emerged in rapid succession. Following the initial Sino-Tibetan hypothesis in the late nineteenth century, the Austroasiatic Mon-Khmer hypothesis gained traction. Whether institutionally driven or not, every specialist in the field appeared eager to propose new frameworks. A notable example is the work of Paul Benedict (1975), who introduced the Tai-Kadai linguistic branch and formulated the Austro-Thai family, a reinterpretation of the earlier Austric hypothesis. This phenomenon resembled an academic Gold Rush, with scholars racing to formulate the next groundbreaking linguistic theory.
However, the process was far from straightforward. Unlike the prehistoric approach employed by the Austroasiatic Mon-Khmer hypothesis, Vietnamese historical linguistics requires both fluency in Vietnamese and expertise in ancient Chinese philology. It is not as rudimentary as the pioneering work of eighteenth-century Sinologist T.S. Bayer (see Knud Lundbæk, T.S. Bayer 1694-1738, Pioneer Sinologist, 1986). This complexity arises because few Vietnamese specialists, particularly Western linguists, can consistently distinguish Sinitic-Vietnamese words from Sino-Vietnamese terms within the Vietnamese lexicon, let alone identify remnants of Old Chinese in modern Vietnamese. While this distinction may seem straightforward, it remains highly challenging, a reality acknowledged by many Western learners of Vietnamese. (See An Chi. Vols. 1-5. Rong chơi Miền Chữ nghĩa. 2016-24 "A Journey in the Field of Vietnamese Etymology".)
By the early nineteenth century, Chinese historical linguistics was still an emerging field for Western scholars. Its learning curve was steep, but Vietnamese presented even greater difficulties due to its eight-tone system, compared to Mandarin’s four tones. Ancient Chinese rhyme books such as Guangyun (廣韻) and Huiyun (會韻) demanded immense intellectual effort to interpret. Scholars needed to analyze syllabic extractions, radicals, and phonological values within ancient linguistic systems, deciphering connotations and phonetic nuances embedded in classical texts. Complexities such as distinguishing intrinsic radicals from phonetics and decoding nuanced classifications like chongniu (重紐) phonological divisions (III, IV, etc.) (音) further complicated Chinese historical phonology.
When Western linguists venture into Sinitic-Vietnamese studies, they encounter varying philological standards employed by ancient Chinese scholars, standards that have already led to confusion among seasoned Sinologists. The methodologies used to analyze classical Chinese morphology were often dismissed as "unscientific" by modern linguistic approaches, leading to misconceptions that have disregarded classical perspectives crucial for linguistic accuracy. Indo-European specialists advocating for the Austroasiatic hypothesis have overlooked key phonological elements buried within classical Chinese vaults, further demonstrating the methodological shortcomings in modern Vietnamese classification.
For the most part, ancient Chinese rhyme books and classical texts have been underutilized, failing to gain recognition as indispensable tools for linguistic inquiry. Well into the early twentieth century, only a select few Western Sinologists, such as Bernhard Karlgren, successfully employed these sources in reconstructing historical phonology. Karlgren’s works, including Étude Sur la Phonologie Chinoise (1915) and Grammata Serica Recensa (1957) from Sweden’s Stockholm Oriental Institute, exhibited exceptional academic mastery of these materials. His methodologies significantly advanced Chinese historical phonology, inspiring further research into Sinitic-Vietnamese linguistics. Karlgren’s pioneering techniques, especially in reconstructing ancient Chinese sound values, continue to contribute to the reclassification of Vietnamese within the Sino-Tibetan family. (See Chapter 10 - Parallels with the Sino-Tibetan Languages.)
It is essential to assess cautiously the work of native Vietnamese scholars researching Vietnamese etymologies of Chinese origin, even when that work demonstrates scholarly aptitude. Political motivations, often shaped by anti-China sentiment, have repeatedly influenced their research, leading to deviations from academic impartiality and alignment with partisan agendas.
To avoid direct engagement with Sinitic linguistic affiliations in Vietnamese etymology, many scholars have redirected their research toward nationalist discourse. In doing so, they find refuge within the Austroasiatic Mon-Khmer framework, where political narratives frequently overshadow objective linguistic inquiry. As the saying goes, 'The end justifies the means.' Consequently, some scholars have deliberately distanced themselves from Chinese linguistic associations to sidestep political controversies entirely. Yet such evasive strategies ultimately hinder the advancement of Vietnamese etymological studies.
As this research will illustrate, Chinese and Sino-Tibetan etymologies remain deeply interwoven. This complex political landscape will be examined further in a separate chapter, where the pervasive biases obstructing objectivity within Vietnamese linguistic scholarship will be critically assessed. (See Chapter 5.)
What if scholars had embraced a more rigorous analytical approach rather than circumventing difficult discussions? This study, archaeologically aligned with the Austroasiatic hypothesis yet adopting a middle-ground perspective, presumes that early linguistic and cultural exchanges in ancient times align with evidence of Yue metallurgical expertise, exemplified by Đôngsơn-style bronze drums. These artifacts share connections with objects unearthed in Indonesia, which are dated to much later periods than older specimens such as the Ngọclữ bronze drums with distinct engraved motifs, objects not yet found in Indonesian excavations. All bronze drums discovered across regions of North Vietnam and South China trace back to the agricultural practices of ancient Taic-Yue tribes, whose material legacy reflects their engagement with water-paddy cultivation.
Readers can directly observe excavated Yue bronze drums at several exhibit locations, including those displayed in a Zhuang Cultural Village near Liuzhou City (柳州市) in the Guangxi Autonomous Region, in Daic regions like Xishuangbanna in Yunnan Province, and in China’s national museums in cities such as Nanjing, Yangzhou, Chongqing, Kunming, and Nanning. Major museums across Vietnam also house important collections of Yue bronze artifacts. The author has personally visited all these sites since the late 1990s.
As more archaeological evidence suggests that the Yue cultural sphere extended further north, scholars may increasingly recognize that the languages spoken by these aboriginal tribes in South China evolved differently into the Mon-Khmer linguistic branch, now predominantly located in the Indo-Chinese peninsula. Etymologically, the fundamental words shared with Austroasiatic Mon-Khmer roots form only a subset of vocabulary derived from Sino-Tibetan and Chinese linguistic traditions. With recent discoveries revealing Vietnamese etyma closely linked to Sino-Tibetan etymologies, many of which remain unfamiliar to Vietnamese historical linguists, new researchers are encouraged to examine hundreds of Sinitic-Vietnamese etyma within the Sino-Tibetan framework before committing to any particular classification. They must proceed carefully, as either stance could lead to theoretical pitfalls.
Ultimately, the discussion above serves as a prologue to the new etymological approaches presented in subsequent chapters. Readers should remain vigilant and critically evaluate the appeal of associating Vietnamese linguistic history with prestigious ancient cultures from neighboring regions, such as the monumental ruins of temples and walled palaces belonging to the Khmer, Champa, and Chinese civilizations, each of which was far more technologically advanced in antiquity. This caution is warranted, given past speculative narratives in which Vietnamese scholars have sought historical links to builders of some of the world’s most iconic monuments, including the Great Wall of China and the ancient Angkor Thom and Angkor Wat palaces.
On multiple occasions, Vietnamese archaeologists have also claimed artifacts from the Sahuỳnh and Óc-Eo cultures, discovered in territories annexed into Vietnam in more recent centuries, as relics belonging to their "ancestors." However, the craftsmanship of these objects suggests that such assertions overstate national ownership of ancient relics found in regions acquired only after the fifteenth century. These claims falsely imply that such artifacts were created by Vietnamese artisans. No impartial anthropological evidence has validated direct ancestral lineage from pre-Chamic cultures to modern Vietnamese populations. Applying similar scrutiny, the association of Vietnamese linguistic heritage with the Austroasiatic Mon-Khmer hypothesis warrants a similar reevaluation in principle.
This pattern of intellectual divergence has persisted. Many Vietnamese scholars, eager to dissociate from the so-called China camp, have devoted disproportionate attention to reinforcing the Austroasiatic narrative, sometimes beyond what would ordinarily be expected in linguistic research. In doing so, they risk inadvertently dismissing Sinitic and Sino-Tibetan perspectives on Vietnamese linguistic heritage.
From another standpoint, Austroasiatic theorists themselves might not place significant emphasis on such nationalist validation. However, regarding the case of Đôngsơn bronze drums, unearthed not only in their namesake locality but also on select Indonesian islands, both Austroasiatic and Austronesian scholars have integrated these artifacts into arguments supporting the presence of indigenous populations central to their respective hypotheses. If such claims hold, the Austroasiatic and Yue linguistic classifications would likely become mutually inclusive rather than separate.
For practical purposes and to maintain impartiality, the author refrains from extending this argumentation further into emotionally charged territory. Readers from diverse backgrounds can engage with these concepts when prepared to evaluate linguistic classifications critically, perhaps beginning with accessible resources such as illustrated maps depicting the migratory routes and material culture of ancient populations.
As scholars mature, many find themselves shifting away from the Austroasiatic Mon-Khmer hypothesis, a framework they once embraced with conviction, upon encountering the Sino-Tibetan theory. If you count yourself among them, be prepared to engage in debates surrounding this contentious issue, beginning with the foundational premise of Austroasiatic Mon-Khmer influences on the Vietnamese language.
This question can be approached through straightforward logic. Suppose Vietnam were to return the territories originally annexed from the Chamic and Khmer peoples, much like how China was compelled to cede Annam. Would the land comprising Vietnam’s central and southern regions, stretching from Hue to the southern tip of Càmau near the Gulf of Thailand, revert to its original spoken languages? More specifically, after more than 700 years under Vietnamese rule, what linguistic blend would emerge among the populations of a revived Champa and Khmer state? They all speak Vietnamese!
Analytically, the Vietnamization of Chamic and Khmer peoples, beginning nearly a millennium ago, parallels the Sinicization of Annam during its 1,009-year colonial rule under imperial China. In short, the Annamese would have long ceased speaking their indigenous languages after 111 B.C., as centuries of linguistic transformation unfolded alongside the continuous influx of racially mixed Han troops on military campaigns from the north. These migrations were further compounded by waves of Chinese refugees fleeing war and famine, permanently settling in southern regions. Over time, newcomers outnumbered both the native populations and earlier settlers, a phenomenon comparable to the demographic shifts that transformed the Americas over the past 400 years.
Ultimately, whether native Vietnamese once spoke a Mon-Khmer language before encountering Han "conquistadors" is of limited significance. The hypothesis that such a language persisted and evolved into modern Vietnamese remains implausible, akin to suggesting that the English, Spanish, or Portuguese spoken in the Americas directly reflects pre-colonial indigenous tongues. Simply put, the present-day Vietnamese language bears little resemblance to existing Mon-Khmer languages, despite Austroasiatic theorists contending otherwise.
Patterns of such transformations recur throughout Vietnamese history. Today, these trends are evident in the growing presence of mainland Chinese migrant laborers establishing Chinatown districts in provinces like Hàtĩnh, Phúyên, Đắklắk, Bìnhdương, etc. Similar developments are visible in tourist hubs such as Nhatrang and Đànẵng, where prominently displayed Chinese signage reflects the expanding influence of new arrivals. Readers are encouraged to consider these patterns as a lens through which to understand the intersection of politics and linguistics, a subject explored further in the subsequent chapter.
Once readers grasp the rationale behind the formation of the
Vietnamese language, they can shift their focus toward substantive
linguistic evidence, particularly the Sino-Tibetan connections
outlined in this research. Meanwhile, deeper reflection on this
topic, comparable to a meditative practice rooted in Vietic
spirituality, may lead to new insights and intellectual
breakthroughs. This process fosters a fuller appreciation of
Vietic, or Yue, core linguistic matters, supported by reinstated
ethnological and geographical contexts backed by historical
record.
V) Historical and Cultural Context
This continuum of historical developments shaped modern Vietnamese, a language in which over 90 percent of its linguistic components derive from Chinese.
Originally, Vietnamese scholarship reflected an intellectual tradition that acknowledged ancestral migrations from regions north of Vietnam’s borders, particularly China’s southern provinces. Historical records indicate that Vietnam’s later-acquired southern territories, despite their Austric associations per Austroasiatic and Austronesian classifications, were originally inhabited by distinct populations, notably the predecessors of the ancient Champa and Khmer civilizations, whose linguistic heritage bore no connection to the Annamese.
The notion that Vietnamese was historically aligned with Chinese dialects, even potentially classified within the Sino-Tibetan language family, likely remained unchallenged until the twentieth century. During this period, the Austroasiatic hypothesis gained prominence, categorizing Southeast Asian languages, including Vietnamese, within the Austroasiatic Mon-Khmer classification. However, proponents of this hypothesis often overlooked critical historical context, particularly the massive influx of northern immigrants who settled in ancient Annamese lands over the course of at least two millennia, shaping the linguistic landscape.
Regarding Sino-Tibetan etymology, this study presents compelling evidence that modern Vietnamese traces its roots to southern China, the same region where the Chinese language reached its fullest form.
Before analyzing Sinitic-Vietnamese etymology, it is essential to establish a geographical and historical context far north of present-day Vietnam. Archaeologically, the ancestors of modern Vietnamese, originally inhabiting the Phùngnguyên Culture region in Hoàbình Province, migrated southward from Dongtinghu Lake in today’s Hunan Province, traditionally regarded as the ancestral homeland of the Yue people.
Scholars familiar with Vietnamese and Chinese history recognize that the racial composition of ancient Vietnamese populations during these pivotal migrations, particularly following the loss of the last NamViệt Kingdom to the Han Empire, offers valuable insights into Sinitic-Vietnamese linguistic classification. The languages spoken by southern Chinese immigrants during this period formed the linguistic core of early Vietnamese, or more precisely, ancient Annamese.
A major wave of migration likely occurred when forebears fled the invasion of 500,000 troops led by Qin Shihuang (259–210 B.C.). Later, during the An Lushan Rebellion (755–763 A.D.) against Emperor Tang Minghuang, the empire’s population declined dramatically, from 52,919,309 to 16,900,000 (Bo Yang, 1983–93, Zizhi Tongjian 資治通鑑, Vol. 49). The question remains: where did two-thirds of the Tang population go? Historical records suggest that large numbers of displaced peoples fled southward into Annam, resettling in the Red River Delta and intermixing with local populations descended from earlier Daic migrations from southwest China. The early Yue natives of the NamViệt Kingdom, subjected to racial discrimination under successive Chinese dynasties beginning in 111 B.C., were forced into the southern Yue territory of Giaochỉ. Over time, these displaced Yue migrants from southern China became the dominant population of northern Vietnam (Bo Yang, 1992, Vol. 69, p. 172).
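The "two-thirds" figure above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the two population totals exactly as quoted in the text, and the names POPULATION_BEFORE, POPULATION_AFTER, and recorded_decline are hypothetical labels introduced here. It confirms the size of the recorded drop and says nothing about its causes or about where the missing population went.

```python
# Population totals as quoted in the text (cited there from Bo Yang's
# edition of the Zizhi Tongjian); the variable names are illustrative only.
POPULATION_BEFORE = 52_919_309
POPULATION_AFTER = 16_900_000


def recorded_decline(before, after):
    """Return the absolute drop and the fraction of the earlier total it represents."""
    drop = before - after
    return drop, drop / before


drop, fraction = recorded_decline(POPULATION_BEFORE, POPULATION_AFTER)
print(f"Recorded drop: {drop:,}")                   # 36,019,309
print(f"Share of the earlier total: {fraction:.1%}")  # about 68%, i.e. roughly two-thirds
```

On these figures the recorded loss is a little over 36 million, or about 68 percent of the earlier total, which is consistent with the "two-thirds" described in the text.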
Linguistically, the north-to-south migratory movements left tonal imprints along a continuum, reflecting gradual evolution. Northern subdialects, such as that of Hanoi, retained a full set of four two-register tones, making Vietnamese an eight-toned language, similar to Cantonese. As migration continued through regions like Nghean, Hatinh, Quangbinh, Quangtri, and Hue, these tones reduced to five or seven, condensing into the heavily accented subdialects spoken in Danang, Quangngai, Binhdinh, Tuyhoa, Ninhhoa, and Phanthiet. Eventually, this tonal progression culminated in the lighter, more relaxed seven-toned system of Saigon and the southernmost provinces (Lụctỉnh), characterized by the free-flowing southwestern accent of the Mekong Delta.
Despite these subdialectal differences, mutual intelligibility remains high among Vietnamese speakers. When comprehension issues arise, they are typically due to differences in regional vocabulary. Northern subdialects often incorporate refined Sino-Vietnamese terminology, while southern subdialects favor a relaxed, colloquial style flavored with regional jargon. Some retain original semantic features and lexical syntax, for example, mắtkiếng, mắtkính, and kiếngmắt ("eyeglasses") (cf. Hainanese /mat4keng4/). Notably, the southern Vietnamese dialect, the youngest among these subdialects, began to develop only around 370 years ago.
The evolution of sound changes in the Vietnamese language progressed and accelerated, encompassing both morphological and lexical transformations across various subdialects. Modern Vietnamese subdialects illustrate how these phonetic shifts gradually expanded from north to south, likely beginning when Annam achieved sovereignty. As Vietnamese speakers migrated southward, they carried their language to new settlements. This linguistic transformation gained momentum after severing ties with Tang-era colloquial variants, distinct and distanced from Cantonese, following the pivotal year of 938, when General Ngô Quyền (吳權, Wu Quan in Chinese records) defeated the NamHan Empire (南漢帝國). This victory paved the way for his appointment as the first head of state of independent Annam the following year (Bo Yang, Zizhi Tongjian, Vol. 69, 1992, pp. 209–210).
Vietnamese history is defined by continuous resistance wars, with conflict against China looming persistently over its northern border. Across 2,282 years of documented history, beginning with the Thục Dynasty (257 B.C.-179 B.C.), Vietnam has spent a cumulative 1,474 years engaged in warfare, most often defending against Chinese expansionism. Although the last formal border war concluded in 1979, intermittent Chinese incursions, both territorial and maritime, have continued, sparking confrontations in 1984, 2013, 2015, and other instances. Additionally, 262 years saw internal factional conflicts and battles against various adversaries, including the Chams, Khmer, Siamese, French, Japanese, Americans, and opposing internal Vietnamese factions with international backing. In total, Vietnam has experienced only 898 years of relative peace, often fragmented.
Throughout Vietnam’s history, its people, particularly men, have lived under the perpetual shadow of war. The most recent conflict, a decade-long war against the China-backed Khmer Rouge in Kampuchea, ended in 1989. The ever-present threat from its northern adversary has kept Vietnam in a near-constant state of military preparedness. Few nations have demonstrated such enduring nationalism, with every segment of Vietnamese society, from ruling elites to scholars to ordinary citizens, grappling with national identity in their own ways.
On the national stage, in our time, Vietnam’s Politburo imposed war debts upon its people, debts owed to communist China for supporting their rise to power through participation in the Vietnam War (1954-1975). This arrangement was part of a broader effort to bolster Maoist expansionism against the U.S.-backed South Vietnamese government. The emergence of a socialist state denied the country the opportunity for peaceful restoration of sovereignty following the end of French colonial rule in August 1945. Unlike regional counterparts such as India, Malaysia, and Singapore, each of which achieved political stabilization after Western colonial collapse, Vietnam remained under authoritarian governance, plagued by regression and underdevelopment. Decades of continuous warfare left its economy devastated and its intellectual landscape constrained, hindering scholarly progress.
This relentless historical backdrop has forged a war-hardened resilience among the Vietnamese, shaping their unwavering will to survive and fostering an intense nationalism. This sentiment manifested visibly in the early 2010s, when young patriots organized protests against Chinese aggression. Ironically, many were imprisoned by their own government for voicing opposition, with some even forced into exile, yet their nationalist fervor remained undeterred. Resentment toward China persists across generations, deeply embedded in historical consciousness. Unsurprisingly, this entrenched nationalist sentiment has influenced academic objectivity, particularly in theorizing Vietnamese linguistic affiliations, whether Austroasiatic Mon-Khmer or Sino-Tibetan.
Despite Vietnam’s deep historical ties with China, local scholars of later generations frequently minimize references to this connection in Vietnamese linguistics, particularly the shared 1,060-year legacy preceding 939 A.D. Instead, they academically align with Western trends, often engaging in small yet telling gestures, such as praising Western researchers simply for their ability to articulate a few Vietnamese words, an amusing display, to say the least.
This wholesale embrace of the Austroasiatic Mon-Khmer hypothesis reflects a broader intellectual shift. It has occurred despite a viable alternative: recognizing the likely foundations of Vietnamese in Yue ancestry and forging a distinct scholarly path rather than merely following existing frameworks. Ultimately, this selective approach, shaped by nationalism ingrained by Vietnam’s founding figures, continues to resonate deeply across generations.
Hypothetically, had the French, rather than the Chinese or, for that matter, the Mon-Khmer peoples, colonized Annam for the same period of more than 1,000 years, it would not be surprising if certain indigenous basic words had been absorbed into a hybrid langue Française-Annamite (cf. the varieties of French spoken in Haiti, African countries, etc.). This suggests the channel through which the ancient Annamese likely integrated Mon-Khmer linguistic substrata into their evolving speech, ultimately leading to its classification within the Austroasiatic framework.
By the mid-nineteenth century, European missionaries had already arrived in Vietnam, having begun their efforts as Catholic emissaries in the seventeenth century. With backing from colonial authorities, these missionaries aggressively disseminated Western ideologies, spreading gospels of the Roman Church among both the literati and the illiterate populace. After Annam fell under French colonial rule in 1862, a rule that persisted until 1954, it entered a transformative phase, severing cultural and linguistic ties with its historical association with China. This transition included the adoption of Romanized orthography for the Vietnamese writing system and the complete abandonment of the millennia-old Chinese-character script.
A series of historical events followed, including the French colonial government's divide-and-rule policy, which segmented Annam into three distinct administrative regions: Tonkin, Annam, and Cochin-China. This restructuring ultimately led to the dissolution of the Nguyễn Dynasty's monarchy in Huế, relegating it to a nominal puppet government until 1954. This transition exemplified Vietnam’s perceived backwardness in adhering to a feudal system inherited from China. During this period, the West launched a fierce assault on two deeply entrenched yet deteriorating citadels of Confucianism: France’s attack on Annam's Huế Imperial Palace in 1883 and the Eight-Nation Alliance’s invasion of Qing’s Forbidden City in Beijing in 1900. Both imperial palaces were ruthlessly plundered, their treasures now prominently displayed in major Western museums.
The spread of Western civilization, despite the oppressive colonial rule and the cultural challenges posed by modern Western ideas, enabled the Annamese to rise, gaining the perspective to look beyond China’s historical dominance. In southern Vietnam, the innovative minds of French reformers introduced Western concepts, challenging traditional values and paving the way for cultural transformation.
These events further solidified France’s role in steering Annam away from China’s influence, signaling the eventual collapse of old monarchies in both China (1911) and Annam (1954). In the linguistic domain, Western Austroasiatic theorists stepped in to fill the void left by the incomplete Sino-Tibetan hypothesis, which by then lacked substantial evidence to elevate it to a credible theory. Austroasiatic proponents captured the interest of Western-educated Vietnamese scholars in the latter half of the twentieth century, reshaping their perception of the Vietnamese national language through Western frameworks and intellectual paradigms. This shift marked a decisive departure from the older Chinese scholarly traditions once cherished by previous generations but rendered obsolete by the evolving academic landscape.
During the colonization of the highly Sinicized Annamese society, colonial administrators swiftly implemented cultural initiatives aimed at introducing Western values. These efforts included imposing supposedly superior Occidental ideals over entrenched Chinese traditions, securing ideological dominance with the backing of the French government. Among their strategies, the adoption of Western academic methodologies proved remarkably effective. However, throughout the colonial period, French intellectuals, holding positions of authority, pursued a scholarly agenda that sought to supplant Confucian values, often taking their efforts to extremes.
Not long after Vietnam’s newly emergent literati engaged with French academics, perspectives shifted decisively in favor of Western linguistic and academic frameworks. This marked a significant ideological departure from traditional affiliations with the Chinese. A French-educated Vietnamese of the mid-twentieth century might recall how colonialists audaciously taught school-aged Annamese children, entirely in French, that their ancestors were of Gallic descent. Ironically, many French colonialists may never have realized that their own people no longer spoke their ancestral Gallic language at all, which had long since been replaced by Latin-based French.
Since the turn of the twenty-first century, many Vietnamese scholars have distanced themselves from the beliefs held by their nineteenth-century predecessors, exploring alternative approaches to their linguistic and cultural heritage.
The twentieth century demonstrated the efficacy of Western mechanisms through a series of geopolitical confrontations. These conflicts pitted Western democratic values, represented by the United States, against the enduring neo-feudal system of China, rebranded as a communist monarchy, governed by the Politburo and led by a general secretary who also served as head of state, or the modern ling. This pattern ran from leaders such as President Hồ of Vietnam in 1945 and Chairman Mao of China in 1949 through successors such as Trọng (purged in 2024) and Lâm (2024) in Vietnam and Xi (2012-202?) in China.
In the contemporary era, by 1964, China had begun supplying arms to its Vietnamese communist allies, advancing its expansionist ambitions during the violent confrontations between Chinese communism and Western democracy in South Vietnam amid the Cold War (1947-1991). After China-backed North Vietnam emerged victorious in April 1975, the resulting Viet-communist regime established a totalitarian system that suppressed free speech and criticism. This censorship further distorted academic discourse, reinforcing ideological biases within scholarship.
Western intellectual traditions have undeniably contributed to transformative advancements in civil society. Wherever Westerners venture, they introduce technological progress infused with Occidental values. Scientific methodologies have consistently demonstrated their superiority in effectiveness and innovation. However, global discourse has yet to fully grasp the extent of high-tech civil surveillance, an area in which China’s rapid economic expansion has excelled. From monitoring smartphones and installing video surveillance in all public spaces to restricting access to transportation or car ownership through a credit-score grading system, China’s sophisticated technological apparatus operates with remarkable efficiency. This system provides its citizens with extensive security measures aimed at preventing criminal activity, second only to the nation’s economic prowess, which has surged over four decades following Deng Xiaoping’s economic reforms in 1984. Notably, China’s 'socialist-capitalist' economic model draws heavily from Western systems.
Had China evolved into a free, decentralized state, unrestricted by communist control, its developmental trajectory might have progressed even more rapidly. Yet, the nation continues to enforce stringent censorship, blocking access to external platforms such as Yahoo, Google, YouTube, X (formerly Twitter), BBC, and VOA, all safeguarded by the newly fortified Great Firewall. Meanwhile, an extensive network of digital operatives works tirelessly to monitor discourse and embed users within layers of state-sanctioned narratives.
Conversely, Vietnamese intellectuals closely observe the incentives tied to recognition within global academia, particularly through the adoption of proven Western methodologies. The Western-initiated Austroasiatic theory has garnered significant attention despite ongoing debates regarding its association with the Austroasiatic Mon-Khmer group and the Vietnamese language. Many local academics eagerly align with the prestigious Western stance favoring the Austroasiatic Mon-Khmer hypothesis, which provides a tangible framework for reconciling linguistic theory with archaeological findings.
For example, studies focusing on the southeastern region of the Indochinese peninsula as the Austroasiatic homeland offer insights into the discovery of highly advanced Đôngsơn-style bronze drums found in South China and Indonesia (see Paul Sidwell's The Austroasiatic Central Riverine). This approach is perceived as more credible than relying on traditional Vietnamese legends, folktales, and folklore to depict national prehistory, narratives that often resemble mythical storytelling. Oral traditions, while culturally significant, often face scrutiny regarding their historical accuracy and credibility, complicating their role in reconstructing prehistoric events. Nonetheless, these tales provided a means for passing down narratives about the nation's founders long before their documentation in Chinese history, following Vietnam’s early encounters with the Tần (秦 Qin) people prior to 204 B.C.
It is intriguing to consider that the ancient state name of Vietnam, "Vănlang", first recorded in Chinese annals as 文郎 Wénláng, might share a linguistic connection with "Penang", the name of the island state of Malaysia, officially 'Pulau Pinang' (Vietnamese: 'Cùlao Cau'). Pulau Pinang translates as 'The Island of the Areca Nut Palm (Areca catechu)' or 檳榔嶼 Bīnláng Yù in Chinese. This linguistic relationship may stem from the Chinese transliteration 檳榔 Bīnláng, rendered in Sino-Vietnamese as 'Tânlang' <~ */blau/, or "trầu" (betel) in modern Vietnamese.
The Austroasiatic Mon-Khmer proponents have struggled to compile basic lexical items across Mon-Khmer languages to identify consistent cognates that align with existing Vietnamese words, let alone connect them to Vietnamese legends to trace reliable sound change patterns. In contrast, Sinitic-Vietnamese etyma documented in Chinese historical records provide far more robust evidence for analyzing phonological evolution.
For instance, the term 董 (dǒng) appears in the legend of "Phùđổng Thiênvương" (扶董天王 Fúdǒng Tiānwáng), a mythical folk hero in Vietnamese history who repelled the Ân (殷 Yīn or 殷商 Yīnshāng) invaders from ancient China's Yin Dynasty and who is also referred to as 董聖 (Dǒng Shèng; SV: Đổng Thánh). In Vietnamese folklore, this hero has been normalized as Thánh Gióng or Dóng (/Jong5/), meaning Saint Dóng or Gióng. Intriguingly, the phonological evolution evident in both pronunciations conforms to recognizable sound change patterns, including the correspondences /t- ~ j-/ and /d- ~ z-/.
As in the case of "Vănlang", an irony of Vietnam's history lies in the uncertainty surrounding the names of its revered legendary ancestral kings. Two significant yet enigmatic names are King "Hùng" 雄 (Mandarin: Xióng) and King "Lạc" 雒 (Mandarin: Luò). The name "Hùng" represents a Sino-Vietnamese pronunciation primarily based on ĐạiViệt Sửký Toànthư (大越史記全書, Complete History of ĐạiViệt) by Ngô Sĩ Liên, which relied on records from Chinese annals.
However, recent research suggests that "Hùng" 雄 may have been mistaken for "Lạc" 雒. Notably, the term "Vua Hùng" might have originated from the Daic language term "pòkhun", where "pò-" translates to "bố" (father) and was extended to mean "vua" (king) in Vietnamese (Nguyễn Ngọc San, 1993, p. 93).
The Austroasiatic Mon-Khmer proponents have largely adhered to the well-established paths laid by their predecessors, whose theories were initially reinforced by French academics. Following the physical withdrawal of French colonists from Indochina in 1954, one of the most consequential colonial legacies was Vietnam’s adoption of its national Romanized orthography, Quốc ngữ (國語). This writing system introduced grammatical structures influenced by French, solidifying logical {Subject + Verb + Object} models for modern Vietnamese. Moreover, it provided advanced intellectual tools and methodologies that further distanced contemporary Vietnamese from their ancestral linguistic roots. This phenomenon notably parallels developments in the fifteenth century following the Ming Dynasty’s 25-year rule over Vietnam, an era layered atop centuries of Chinese colonization from 111 B.C. to 939 A.D.
As outlined in this survey, the theorization of Sino-Tibetan-classified Vietnamese is rooted in linguistic specificity, historical phonologies, etymologies, and the prehistoric context of a hypothetical indigenous homeland. This framework aligns with the analytical methodologies employed by the Austroasiatic Mon-Khmer hypothesis, which has maintained steady academic publication over the past six decades. However, the Sino-Tibetan hypothesis has experienced fluctuating relevance in historical linguistic circles, largely due to the difficulty in identifying clear Sino-Tibetan-Vietnamese cognates. While the original hypothesis relied heavily on premises involving Sinitic-based vocabulary, its foundational etymological breakthroughs lost momentum and novelty decades after their inception.
Everything comes with a price, including the profound spiritual toll endured by the Vietnamese as they undergo Austroasiatic mental colonization. This process resembles coerced ideological transformation, requiring individuals to relinquish deeply held beliefs and traditions, exchanging the Oriental philosophy of "the Way of Life" (Nhânsinhquan 人生觀, or Đạo 道) for Western values. Simultaneously, the collective subconsciousness, shaped by competing ideologies, fosters a persistent skepticism toward foreign works that may conceal underlying ideological agendas. These concerns, rooted in Vietnam’s historical encounters with Chinese and French colonial rule, extend to contemporary influences from other nations, including Russia and the United States. Needless to say, such perspectives carry significant implications for the social sciences and humanities, particularly archaeology and historical linguistics.
Among Vietnam’s native ethnic minorities, the Kinh people have emerged as the dominant majority. This group represents a racially mixed fusion of Sinicized populations. Dating back to the period when Annam was a Chinese prefecture in the Tang Dynasty (618 to 907 A.D.), intermarriages among migrants of diverse racial backgrounds from both northern and southern China contributed to the formation of the Kinh majority. Settling primarily in the fertile Red River Delta, this population resided alongside the ruling class, particularly in Vietnam’s northeastern coastal regions, known for their abundant rice fields and fishing villages.
The Han settlers, effectively "conquistadors", were descendants of the aboriginal Yue people, who historically cultivated rice in delta regions south of the Yangtze River, including Jiangxi, Hubei, and Hunan provinces, and engaged in fishing along China’s southeastern coast in states such as WuYue, MinYue, and NanYue. Despite adopting new identities over time, these populations retained an awareness of their ancestral roots in the Yue genealogical line.
These connections link them to the largest indigenous communities still existing today, such as the Zhuang (Nùng) and Daic (Tày) minority groups concentrated in both mountainous and urbanized areas of China’s Yunnan, Guizhou, and Guangxi provinces. Their presence also extends into Vietnam’s northwestern provinces, including Laichâu, Hàgiang, and Tuyênquang. By contrast, Vietnam's Muong ethnic groups primarily inhabit remote mountainous regions such as Hoàbình and Ninhbình, far removed from the urban centers, metropolises, and coastal paddy fields where Vietnam's Sinicized Kinh people have lived since the Chinese colonial period.
On a broader politico-geographical scale, these associations extend to other Sinicized Yue groups that merged into the Han Chinese majority centuries ago. These include the Cantonese, Fukienese, and Wu-speaking populations of Guangdong, Fujian, and Zhejiang provinces. Linguistically, their Yue languages, including Yueyu, Min-Yueyu, and Wu-Yueyu, have long been recognized as fully Sinicized within the Sinosphere, a process dating back at least 2,250 years. Consequently, these languages are currently classified within the Sino-Tibetan linguistic family.
This rationale is rooted in Vietnam’s extended period under Chinese rule, spanning 1,010 years before its separation from the NamHán State in 939 A.D. Notably, core Vietnamese lexical items were systematically recorded in Chinese script in ancient times, dating back to their earliest documented usage. Examples include ngày (日 rì, "day"), suối (川 chuān, "creek"), rựa (戉 yuè, "axe"), gạo (稻 dào, "rice"), and dê (羊 yáng, "goat"). Conversely, these lexical items exhibit distinct divergence from comparative analyses based on the Austroasiatic Mon-Khmer linguistic framework (see Chapter 8 - The Mon-Khmer Association). Austroasiatic Mon-Khmer theorists apply a distinct approach when analyzing contemporary Vietnamese, attributing its non-Austroasiatic vocabulary to Chinese loanwords. By comparison, the Mon-Khmer contribution to the Vietnamese linguistic landscape remains relatively minimal, primarily appearing as substratum influences.
To a lesser extent, the intellectual shift toward the Austroasiatic Mon-Khmer framework represents a material loss in its own right. In Vietnam today, while Mandarin (Putonghua) remains widely studied, scholars specializing in Chinese historical linguistics have become increasingly rare, relics of a bygone era. Sinitic-Vietnamese historical linguistics requires a distinct set of advanced analytical skills, yet the field has struggled to maintain relevance in contemporary academic discourse.
Scholars belonging to the second and succeeding fourth generations after 1975, educated within Vietnam’s socialist academic system, essentially a framework of politically constrained scholarship, ironically tend to exhibit equally strong admiration for Western academic traditions. This has given rise to a new class of Vietnamese scholarship that outwardly embraces Western intellectual influences while demonstrating hypersensitivity to criticism. This sensitivity likely stems from an underlying inferiority complex, reinforced by the consistently low academic standards applied to many graduate theses. Despite these scholarly shortcomings, political motivation remains a dominant force within academic circles, often driving researchers to align their work with the socialist system for professional gain.
The role of politics in shaping Vietnamese academic disciplines, particularly in objective fields, warrants closer examination. This issue will be explored further in the next chapter, where its significant negative impact on intellectual integrity and scholarly progress will be analyzed in greater detail.
VI) Exploring Two New Approaches
The excitement surrounding this research stems from two new theoretical approaches that depart significantly from traditional methodologies and form its core. This linguistic progression exemplifies how Vietnam's history, despite forging its own independent path, was deeply entwined with China’s, leaving a unique linguistic legacy that has raised the stakes of the debate.
New approaches have emerged, enabling the identification of Vietnamese core words and their classification within the Sinitic-Vietnamese domain while linking them to Sino-Tibetan etymologies. After more than three decades of independent exploration, the author considers himself an innovator in Sinitic-Vietnamese etymology. His self-directed journey has freed him from reliance on established Austroasiatic Mon-Khmer narratives. As Professor Nguyễn Đình-Hoà remarked in 1997, dismantling the Austroasiatic framework, starting with its foundational premises, would be a Herculean task. According to Professor Nguyễn, the essential issue lies in the evolution of Vietnamese itself, regardless of how Western linguists classify it. Similarly, Lê Đình Diệm, the author’s linguistic mentor at Saigon University, echoed this sentiment. This paper serves as a wake-up call to seasoned scholars in Vietnamese historical linguistics, urging them to reevaluate their reliance on outdated tools and the Austroasiatic Mon-Khmer hypothesis.
The intellectual enlightenment experienced through these discoveries will likely resonate with readers, fostering excitement as new horizons unfold. Positioned at the forefront of this discussion, the author has learned to remain composed in response to antagonistic or contentious remarks regarding the Sinitic-Vietnamese subject, regardless of their origin. Over time, he has stood firm against the "Austroasiatic Mon-Khmerists", who now freely disseminate their narratives online. As an active researcher, he counters by publishing his Sinitic-Vietnamese etymological findings rooted in Sino-Tibetan origins.
To conclude this defense of the Sinitic position, the author emphasizes again that the Austroasiatic Mon-Khmer hypothesis was constructed from lexical data drawn from scattered Mon-Khmer languages, on the assumption that isolated languages might preserve original forms despite diachronic sound changes. Throughout their theorization, Austroasiatic Mon-Khmer theorists consistently overlooked Yue linguistic elements extensively documented in Chinese historical records, especially in the Kangxi Dictionary, where the author found basic words such as 'eat', 'sleep', 'poop', and 'fuck', as cited above. This is where the Yue and Sinitic elements meet to give birth to the Chinese languages and the Vietnamese language as we see them today.
These Yue entities, however, were disregarded, both historically and linguistically, in favor of constructing a new framework termed Austroasiatic.
By establishing an entirely new linguistic family on their own terms, Indo-European theorists circumvented the need to engage with the extensive information about the Yue available in ancient Chinese classics. They developed their theory by selectively manipulating data from living languages while largely ignoring significant historical contexts.
In contrast, Yue linguistic elements encompass not only analytical methodologies but also integrate history, archaeology, anthropology, and, where applicable, a spiritual dimension tied to national ideology.
Firstly, this comprehensive approach, grounded in Chinese historical records, ancient rhyme books, and classical texts, is foundational, as many Sinitic-Vietnamese etyma naturally intersect with anthropological categories in various ways. For instance, the Vietnamese concept of thờ (worship) aligns with several Sinitic correspondences, such as 侍 (shì, SV thị), 祠 (cí, SV từ), 祀奉 (sìfèng, VS thờphượng), or even 奉事 (fèngshì, SV phụngsự) in expressions like 忠臣 不 事 二 君 (Zhōngchén bù shì èr jūn), "Tôitrung không thờ hai chúa", translated as "Loyal subordinates will not serve two kings." Meanwhile, the closer Sino-Vietnamese version, Trung thần bất sự nhị quân, remains intelligible to most educated Vietnamese speakers due to its deep cultural resonance. Notably, the concept of thờ spans two distinct domains: spiritual worship and ideological loyalty.
The following enumeration explores the Yue linguistic and cultural sphere while intentionally excluding Austroasiatic Mon-Khmer values, which diverge from ancestral belief systems rooted in Vietnamese spiritual ancestral worship. This spiritual heritage constitutes the soul of the national language, complementing its historical evolution. Austroasiatic linguistic structures lack these two essential semantic dimensions.
Consider the developmental trajectory of 飯 (fàn, "meal"), which evolved into ban, bữa, buổi ('period of the day'; cf. Hainanese /buj²/, Fukienese /bəng²/). Similarly, one might examine the morph ban- in 白日 (báirì), as in ban ngày ('daytime'), as an independent linguistic element coexisting with other etyma, where both sound and concept have transferred to signify 'daytime.' By analyzing these transformations without being confined by the original representation of Chinese characters, one gains fresh insight into the Sinitic theory presented here, opening avenues for new interpretations.
Among these linguistic developments, words such as 'eat' 食 shí (SV thực, VS xơi) and 'rice' 稻 dào (VS gạo, SV đạo) reflect fundamental aspects of Vietnamese cultural identity. This is exemplified in the proverb "Có thực mới vực được Đạo" ('One must first have sustenance before upholding principles'), which underscores the deep integration of sustenance and philosophy within the Vietnamese language.
In the early 2000s, when the author first shared these preliminary discoveries online, he faced initial indifference and resistance, including dismissive responses from a professor at a prominent U.S. institution. However, he remains confident that newcomers to the field who approach his findings with an open mind will recognize their novelty and significance, as they represent a fresh and groundbreaking perspective.
In pursuing this new approach over the years, the author has consistently advocated for the Sinitic-Vietnamese perspective whenever the topic arose. While some may have found his repeated references to classic examples excessive, he firmly believes that his theorization offers something unique, building upon existing concepts while refining them into a clearer and more comprehensive framework.
To elaborate further on this linguistic perspective, let us examine more closely the spiritual dimension that underpins the widespread belief in the ancestral Yue aborigines of South China, irrespective of their inclusion in Sinitic classifications. The Vietnamese practice of ancestral worship, termed tínngưỡng thờcúng tổtiên or tục thờcúng ôngbà (祖先崇拜 or 祖先教), has persisted for over two millennia. This tradition is akin to Buddhist conceptions of the afterlife, influencing societal conduct in earthly life regardless of simultaneous adherence to other religions. Far from being dismissed as a superstitious folk cult, ancestral worship constitutes a legitimate spiritual tradition interwoven into Vietnamese identity.
Vietnamese integrate elements of various religions into tangible representations of belief, such as placing photographic images of deceased ancestors alongside sacred figurines of Buddha or Jesus, complemented by incense-burning rituals. Regardless of whether Buddhism, Daoism, Catholicism, Christianity, Islam, or indigenous movements like Caodaism and Hoahaoism were introduced to Vietnam, all converge in the spiritual offerings dedicated to honoring ancestors. This fusion of Buddhism and Daoism underscores the enduring reverence for the ancient Yue, recognized as the forebears of the Vietnamese.
These ancestral traditions persist among Yue-descended communities across South China, including the Zhuang nationality, Fukienese, Hainanese speakers, and groups such as the Nùng and Tày, whose cultural practices remain evident in shrines and temples throughout Vietnam.
The Austroasiatic theory notably fails to acknowledge the role of spiritual values in the early stages of collective identity formation. This omission extends to historical contexts essential for understanding later developmental phases following tribal divisions. When juxtaposing linguistic theories, prehistoric social structures must be considered, as they played a crucial role in shaping shared languages within communities during documented historical periods. Linguistic evolution must also reflect geopolitical factors, including economic systems and state governance structures.
Such considerations are vital in classifying related Sinitic languages, including Southern Wu, Cantonese, and Min Nan dialects, within the Sino-Tibetan linguistic family. Conversely, efforts to deny the Yue origins alongside Han admixture, particularly its roots in the Chu State and its Yue subjects, illustrate how political influences distort linguistic research, diverting academic inquiry from its natural trajectory. A similar phenomenon occurs in Vietnam, where anti-Sinitic sentiment shapes scholarly classifications, underscoring the need for an analytical approach to politics' impact on linguistic studies.
Individuals positioned at the lower end of the academic spectrum, often selected to propagate state-sanctioned narratives, represent another form of intellectual opposition. These figures frequently lack the capacity for fact-based argumentation, failing to construct foundational premises necessary for rigorous linguistic inquiry. There is little expectation that they will embrace or acknowledge the validity of this theory, which has long challenged dominant linguistic frameworks.
The author personally considers engagement with persistently counterproductive opinions an unproductive exercise. This is one of the primary reasons why this paper was initially written in English, to mitigate exposure to disruptive discourse and distance itself from a demographic unlikely to engage meaningfully with its arguments.
A detailed examination of the political issues shaping Vietnamese linguistics will be provided in a separate chapter, particularly addressing their broader impact on the humanities, which has strongly influenced the reclassification of Sinitic-Vietnamese linguistics, as emphasized repeatedly in this study.
Our second approach remains fundamentally Sinitic in essence and is likewise well documented, with Chinese serving as its core foundation, primarily because "Sinitic" is the established linguistic term. Chinese dialects fall within the Sino-Tibetan family, and the classification proposed here rests not on historical political affiliations but on typological cognateness between Vietnamese words and their counterparts in multiple Chinese dialects and in extant Tibetan languages.
For the latter, let's examine the example of the Vietnamese word "bò" (cow) that demonstrates strong cognateness with Old Tibetan forms. According to Shafer (1966-1974), "cow" in Old Tibetan appears as ba, with variations across Bodic languages such as Western Bodish Burig bā, Groma and Śarpa bo (calf), Dangdźongskad and Lhoskad ba, and Central Bodish Lagate pa-, Spiti, Gtsang, and Dbus Ãba bʿa. Additional cognates include Mnyamslad and Dźad pa, Rgyarong (ki)-bri, -bru, and modern Bodic dialects such as New Mantśati (bullock), Tśamba Lahuli (ox bań), or Rangloi bań-ƫa (bullock).
Moreover, in Chinese, the character 牝 (byi/), denoting female animals, aligns with Old Tibetan ãbri-mo ("tame female yak"). A plausible etymological connection can be drawn between Old Tibetan ãbri-mo and the Vietnamese bê (calf). Given the cultural and agricultural significance of "cow", or more precisely, "water buffalo", to Vietnamese water-paddy agriculture, it is implausible to classify this term as a loanword, particularly within the Austroasiatic Mon-Khmer hypothesis.
Previous attempts to postulate Vietnamese etyma have relied on juxtaposition-based brainstorming, an intuitive method that predated the emergence of Austroasiatic Mon-Khmer theorization in Vietnamese linguistics. These early efforts lacked methodological refinement, as evidenced in misattributed examples. Many researchers failed to differentiate Sinitic-Vietnamese lexicons from Sino-Vietnamese categories, focusing exclusively on the superstrata of Sinitic-Vietnamese layers. This superficial resemblance between Chinese and Vietnamese etyma led to misclassification, with many lists erroneously presenting Vietnamese words as Chinese loanwords.
For instance, while the Sino-Vietnamese term sư for 師 (shī, "teacher") is widely recognized, some scholars may already know thầy as an additional cognate. Additionally, thầymô reflects a normalized variant of 巫師 (wūshī, "shaman") in reverse syllabic order. Similarly, Sinitic-Vietnamese etyma such as sải 師 (shī, "monk") and phùthủy 巫師 (wūshī, "shaman") (see Tsu-lin Mei's APPENDICE G-8) share linguistic root ancestry with thầycô 老師 (lǎoshī, "teachers") (SH). Further cognates include:
- 婿 (xù, rể, "son-in-law")
- 姑爺 (gūyé, conrể, "son-in-law")
- 生 (shēng, "live") vs. đẻ ("give birth to", cf. Hainanese /te1/)
These lexical relationships illustrate how Vietnamese evolved through interactions across multiple Chinese dialects over different historical periods, both diachronically and synchronically.
Advanced proficiency in Han-Nom or Sinitic-Vietnamese studies requires a measurable mastery of language analysis. Highly qualified scholars specializing in historical linguistics have become increasingly rare, contributing to the diminishing rigor in contemporary linguistic research. Many academic papers written in English, as mentioned earlier, fail to distinguish between Sino-Vietnamese and Sinitic-Vietnamese terms, a fundamental oversight that undermines their credibility.
While no direct critique is intended, such misclassifications remain widespread. Meanwhile, general Chinese literature readers often lack the expertise of prominent scholars such as Karlgren or Maspero.
Aspiring students exploring Vietnamese historical linguistics must make critical methodological decisions early on. While adopting Western methodologies and emphasizing objectivity free from state interference may seem appealing, this shift does not inherently equip researchers to address core linguistic issues embedded in centuries-old subjects tied to agriculturally driven economies.
For example, the Sino-Tibetan classification of "bò" and related etyma serves as a case study highlighting foundational elements in Vietnamese linguistic evolution. Despite this, newcomers often gravitate toward the Austroasiatic Mon-Khmer framework due to its structured data collection and systematic tabulation of Mon-Khmer lexical forms. However, this approach frequently fails to account for the phonetic shifts and fluidity inherent in Vietnamese and Chinese cognates.
This study seeks to refine these areas by reevaluating Vietnamese through a Sinitic-Vietnamese lens, emphasizing historically supported etymologies rather than speculative Austroasiatic Mon-Khmer classifications. (水)
Incorporating multiple perspectives, ranging from religion to politics, alongside archaeology, anthropology, history, and linguistic proficiency, is essential to understanding Vietnam’s linguistic evolution, from which the two new approaches are drawn. Western-educated individuals with pragmatic mindsets often advocate industrialization as a guiding principle for progress. By the turn of the twentieth century, Western industrialism had already demonstrated efficiency across various institutions, leading to the belief that scientific precision and methodological rigor could be applied to the field of historical linguistics.
Determined to modernize linguistic studies, they spearheaded the adoption of the Romanized Quốcngữ writing system, a decisive break from Chinese influences that dismantled a thousand-year-old tradition. Vietnam’s adoption of Western ideas gained momentum with the first generation of French-educated scholars following the country’s independence from French colonial rule on July 20, 1954. In this transitional phase, scholars acted swiftly to eliminate the Chinese-based Nôm script while simultaneously phasing out French as the primary academic language within colonial educational structures.
It is no surprise that proponents of Western methodologies favor measurable tools over traditional interpretative approaches, prioritizing precision over approximation in the study of Vietnamese etymology. However, some theorists took this ideology to extremes. From the late nineteenth century well into the new millennium, locally trained French-educated scholars perpetuated the dismissive notion, reportedly introduced by French grammarians, that the "Annamite" language lacked its own grammatical framework and required French grammar for proper writing. This misconception reinforced a lingering classification of Vietnamese as an isolating language, technically defined by rigid syntactic order rather than morphological inflection, a label framed in contrast to inflected languages such as German, Russian, and Latin. Western linguists similarly classified Chinese under this framework.
As a matter of fact, Vietnamese and Chinese do not conform entirely to these classifications. The label "isolating language" carries significant interpretative weight and is often associated with contentious debates regarding linguistic categorization. This issue will be explored later in the chapter on disyllabicity, where the structural aspects of Vietnamese and Chinese will be examined in greater detail.
In contemporary times, Western influences extend beyond linguistic classification into various aspects of daily life, shaping cultural preferences in ways that cannot always be judged as entirely right or wrong. Traditional Vietnamese practices frequently undergo reinterpretation through Western frameworks, reflecting a globalized society where traditions merge and adapt.
Examples range from Western pharmaceutical diagnoses replacing traditional remedies, now often regarded as a last resort, to lifestyle preferences involving commercialized holidays such as Christmas, Western New Year, Valentine’s Day, Mother's Day, and Halloween. Wedding customs have similarly evolved, with increasing adoption of Western-style celebrations, including white wedding gowns, diamond engagement rings, and black attire for funerals, contrasting sharply with traditional Vietnamese mourning customs that emphasize coarse white fabrics.
Despite these cultural shifts, Chinese characters remain widely used in ceremonial inscriptions and religious practices. Whether in Spring Festival couplets, written prayers, or ancestral name plaques displayed in temples and altars, their presence underscores an enduring reverence for tradition.
Western Austroasiatic Mon-Khmer theorists have applied a similar lens to Vietnamese linguistic studies, aligning linguistic classification with dominant trends. However, a balanced approach is necessary, one that reconciles opposing perspectives to develop an etymological framework accommodating diverse viewpoints.
Discrepancies in newly identified Sinitic-Vietnamese etyma, Nôm words of Chinese origin or Hán-Nôm, can complement Austroasiatic Mon-Khmer findings rather than contradict them. Ideally, etymological discoveries from both perspectives should coexist rather than be dismissed outright. Core Vietnamese cognates identified in Mon-Khmer languages do not necessarily negate their origins in Chinese or Sino-Tibetan linguistic stocks.
For instance, examining the Khmer counting system, structured around a base-five system, can provide insight into numerical etymology. By breaking it down further into binary and then back into the decimal system, one may uncover explanations for foundational numerical terms that have long dominated Austroasiatic Mon-Khmer arguments since the inception of its theorization. (See Chapter 8 - The Mon-Khmer Association)
One might compare the Austroasiatic Mon-Khmer approach metaphorically to an arranged marriage within traditional Chinese customs, wrapped in a Western-style wedding gown, distinct from previously discussed traditional attire but still incorporating customary rituals imbued with Vietic elements. In other words, Austroasiatic Mon-Khmer theory on Vietnamese historical linguistics remains incomplete without recognizing Chinese influences, just as Chinese influences are inextricably linked with Vietnamese linguistic evolution.
Austroasiatic Mon-Khmer linguists may have a valid case, but historical support is essential to substantiate their arguments. This reality stems directly from Vietnam’s 1,009 years under Chinese imperial rule, as well as earlier prehistoric interactions. While the finer details may appear intricate, they remain inadequately addressed in academic circles.
Western methodologies offer a valuable framework for addressing long-standing linguistic classifications. However, one must acknowledge that Vietnamese linguistic foundations existed long before Western scholars initiated systematic analysis in the twentieth century. Essential linguistic resources, including dictionaries and rhyme books such as Éryá (爾雅), Shuōwén (說文), Tángyùn (唐韻), Guǎngyùn (廣韻), and dialectal annotations within the eighteenth-century Kangxi Dictionary, were already in circulation. Western scholars must engage deeply with these sources rather than attempt to develop entirely new theories, such as the Austroasiatic hypothesis, primarily for convenience in addressing a millennium-old field of study.
Consider, for instance, the misguided assertion that "Vietnamese has no grammar; take the French one and use it instead." This approach merely offered a shortcut to circumvent the complexities of learning Vietnamese grammar, just as similar tactics were employed to bypass the intricate study of ancient and modern Chinese. Assertions driven by convenience rather than academic integrity ultimately prove counterproductive.
Up until the late eighteenth century, Western academics possessed only minimal knowledge of the Chinese language (See Knud Lundbæk, 1986). The steep learning curve inherent to Chinese philology and historical linguistics necessitated alternative approaches. While certain methodologies proved useful in advancing linguistic classification, oversimplifications designed to accommodate Western audiences often failed to reflect the historical and cultural intricacies of Vietnamese linguistic development.
Reconciling linguistic methodologies with historical realities demands rigorous engagement with primary sources. Whether analyzing Vietnamese within the Sinitic-Vietnamese framework or assessing its Austroasiatic Mon-Khmer affiliations, scholars must recognize the impact of Vietnam’s geopolitical history on its linguistic trajectory.
Rather than impose theoretical constructs that neglect historical context, linguistic inquiry must strive for a balanced synthesis that respects documented historical interactions, cultural transformations, and evolving linguistic structures.
Figure 2.3: Han's Giaochau Prefecture in 111 B.C.
Source: http://chinese-dialects.blogspot.com/2010/08/blog-post_22.html
Readers may wonder how Vietnamese integrated words with Mon-Khmer elements into their language. This process involved the distinct method of accentualizing borrowed words, assigning tonal distinctions to each syllable, much like how Vietnamese adapted French loanwords into its phonological system.
Regarding the inquiry into the presence of Austroasiatic Mon-Khmer cognates within Vietnamese basic vocabulary, my findings reveal that many Sinitic-Vietnamese etyma simultaneously appear within Sino-Tibetan etymologies, aligning seamlessly with corresponding Chinese forms. For example, ngà (牙 yá, "tusk") and máu (衁 huāng, "blood") exhibit such correspondence (衁). These etyma quantitatively extend beyond the limited Mon-Khmer lexical items frequently cited and recycled in Austroasiatic research. Qualitatively, they display subtle 'genetic' linguistic traits absent from Vietnamese words that bear resemblance to the Mon-Khmer lexicon (T).
The broader discussion regarding commonalities between Chinese and Vietnamese, encompassing word clusters, fixed expressions, idioms, and structural parallels, remains an ongoing debate. These similarities inevitably reflect the extensive influence of Chinese culture on Vietnamese language development, shaping its trajectory across centuries. The process of linguistic absorption can be divided into three distinct phases:
- The period preceding 111 B.C., when Yue linguistic elements were deeply embedded in the languages spoken across southern China.
- The millennium of Chinese colonial rule, which intensified Sinicization and reinforced administrative linguistic practices.
- The post-10th-century era, during which the independent state of Annam selectively absorbed additional Sinitic elements to support governance and scholarship, albeit at a slower pace than in earlier periods.
Vietnam continued to use the Chinese official court language for government records and literary works long after separating from China, a practice that lasted into the late 19th century. This parallels the approach adopted by Japan and Korea, which actively incorporated Chinese-character-based vocabulary during the Tang Dynasty.
Over time, this linguistic evolution shaped Vietnamese into a distinct entity, yet one retaining deep historical ties to its Sinitic-Yue origins. (日)
Language evolves organically over time, shaped by continuous generational transmission. The development of Vietnamese from ancient periods to the present has followed a path of natural continuity, ensuring its growth remains unforced. Colloquially, Vietnamese speakers across diverse backgrounds, including scholars and merchants, continue to use a language heavily enriched with Sinitic-Vietnamese and Sino-Vietnamese words, which remain essential components of daily speech.
Lexically, the Vietnamese vocabulary includes a significant number of words with direct correspondence to Chinese etyma, such as:
- ăn ("eat") ~ 唵 ǎn (SV àm)
- ngủ ("sleep") ~ 臥 wò (SV ngoạ)
- đụ ("fuck") ~ 屌 diǎo (SV điệu)
- ỉa ("poop") ~ 屙 ē (SV a)
- uống ("drink") ~ 飲 yǐn (SV ẩm)
- trừng ("stare") ~ 瞪 dèng (SV trừng)
- nói ("chat") ~ 聊 liáo (SV liễu)
- nấu ("cook") ~ 熬 áo (SV ngao)
- gạo ("rice") ~ 稻 dào (SV đạo)
- gà ("chicken") ~ 雞 jī (SV kê)
The presence of these words highlights the deep linguistic integration between Vietnamese and Chinese, making them indistinguishable from native vocabulary through the natural process of phonological adaptation. Whether these words originated directly from Chinese remains an area of continued exploration, but their assimilation into Vietnamese speech patterns suggests organic linguistic transmission rather than deliberate imposition.
Beyond these fundamental terms, Vietnamese also retains phonological similarities with certain southern Chinese dialects, such as Hainanese. Examples include:
- xơi ("eat") ~ 食 shí (Hai. /zha2/)
- bể ("broken") ~ 破 pò (Hai. /be6/)
- bồng ("carry a baby") ~ 抱 bào (Hai. /bong2/)
Hainanese, as a subdialect of the MinNan linguistic group, descends from Yue languages, reinforcing the Yue-Sinitic theorization linking Vietnamese to a shared linguistic heritage.
In contemporary Vietnamese usage, approximately 90% of words in an average sentence derive from Sinitic-Vietnamese stock, while only 10% constitute pure Vietnamese or Nôm vocabulary. Even among Nôm words, many share undeniable cognates with Chinese etyma, verifying them as indigenous expressions. Consider the following examples:
- dừa ("coconut") ~ 椰 yē (SV gia)
- chuối ("banana") ~ 蕉 jiāo (SV chiêu)
- đường ("sugar") ~ 糖 táng (SV đường)
- sông ("river") ~ 江 jiāng (SV giang)
- gạo ("rice") ~ 稻 dào (SV đạo)
Even so-called pure Vietnamese words may trace their origins to Chinese or common Yue etyma, words also attested in Cantonese or Fukienese dialects. Examples include 睇 /t'ej3/ ("see") and 檨 /soã/ ("mango"), which have evolved into the modern Vietnamese forms thấy and soài in Quốc ngữ, preserving close phonological resemblance to earlier pronunciations despite slight deviations over time. (See Chapter 10 - Parallels with the Sino-Tibetan languages)
Modern Vietnamese sentence structure shares similarities with the classical literary Chinese style seen in major works from the 12th century onward, allowing for near word-for-word translation. However, literary works from 16th-century Vietnam, including Buddhist scriptures, often sound "Shakespearean" to contemporary Vietnamese speakers, rendering them challenging to comprehend.
Two key reasons explain this linguistic disparity:
1. The target audience: sixteenth-century texts were written by and for Vietnamese scholars and the intelligentsia, whereas classical Chinese novels were composed in a vernacular Mandarin style that reflected everyday spoken language.
2. Grammatical evolution: ancient Annamese compositions differed significantly from modern Vietnamese grammar, which was later influenced by French syntactic structures due to Romanized Vietnamese orthography introduced by Western-educated pioneers such as Petrus Trương Vĩnh Ký and Phạm Quỳnh in the early 20th century. (X)
This evolution significantly refined Vietnamese composition, introducing structured sentence formations, thesis-driven arguments, and a systematic punctuation mechanism. Consequently, modern Vietnamese sentences integrate Sinitic-Vietnamese vocabulary within a grammatical framework influenced by French linguistic structures, developing independently from Chinese while reflecting broader cultural transformations. This linguistic progression illustrates how language and cultural development advanced concurrently, reinforcing each other over time.
Languages naturally evolve. Subdialectal deviation is a universal phenomenon that causes languages to diverge from their ancestral forms. Within the vast territory of the Middle Kingdom, Ancient Chinese fragmented into seven major dialects, eventually becoming mutually unintelligible. Each of these dialects continued branching into numerous subdialects. Similarly, Middle Chinese loanwords in Japanese, as reconstructed by Bernhard Karlgren, exhibit notable variations compared to those found in the Sino-Vietnamese vocabulary, both phonologically and semantically.
Vietnamese has followed its own distinct path. Even after only two decades of division between northern and southern Vietnam (1954–1975), linguistic variations emerged, with regional speakers sometimes struggling with word choices used by those from the opposing side. This underscores the profound influence of geography and historical shifts on linguistic differentiation, demonstrating how even relatively brief separations can leave lasting linguistic imprints. (See Appendix O.)
Historically, following Annam’s separation from China in 939, its language evolved independently, minimizing the extent of Sinicization and allowing it to develop into the early form of ancient Vietnamese. This linguistic trajectory contrasts sharply with the evolution of southern Chinese subdialects, such as Cantonese, Fukienese, Hainanese, and Taiwanese. Although originating from Yue-speaking regions, these dialects have undergone significant Sinitic influence, to the extent that they are now classified as Chinese dialects rather than Yue-origin languages.
Ironically, the Austroasiatic camp applies a similar argument, attributing Vietnamese’s heavy Sinitic influences to Sinicization rather than shared hereditary origins.
Despite centuries of geographical expansion, the core of Vietnamese has remained intact as a unified language. While northern Vietnamese is distinguished by its greater usage of Sino-Vietnamese vocabulary, likely due to its proximity to China, southern Vietnamese has incorporated Chamic and Khmer elements over the past 1,100 years. Nevertheless, Vietnamese speakers across regions still understand subdialectal differences.
Languages naturally evolve through recurring generational transmission. However, history provides many examples of language extinction, including the displacement of Manchurian by Mandarin in China, a process in which Mandarin itself absorbed elements from Manchurian. Similar cases are evident worldwide, where indigenous languages in North and Latin America continue to decline.
Government intervention frequently poses threats to linguistic minorities. The Sinicization of Cantonese serves as a stark example, beginning in 1911, following the fall of the Qing Empire. When China became a republic under Sun Yat-sen, northern officials pressured Sun to cede the presidency to Yuan Shikai, whose administration favored Mandarin as China’s official language, declaring it the national standard.
In modern times, such policies persist, as seen in the ban on Cantonese TV broadcasts in its native Guangdong Province, an ongoing attempt at linguistic homogenization.
Expanding this discussion, similar complexities arise in the classification of minority languages in China. For instance, Zhuang was originally classified under the Sino-Tibetan linguistic family (Shafer, 1941) but was later reclassified as part of Tai-Kadai (Fang-Kuei Li, 1966). This highlights the evolving nature of linguistic classifications and the influence of political and sociolinguistic factors on language recognition.
The Zhuang people constitute the largest minority group in China, with a population exceeding 17 million. Despite significant Chinese influence, primarily through cultural adaptation, the Zhuang language has preserved distinctive characteristics. This resilience is likely due to the historical settlement patterns of Zhuang communities, many of whom have lived in remote regions since ancient times. Unlike Cantonese, which has undergone extensive Sinicization, Zhuang remains uniquely distinct even though it is classified under the Tai-Kadai linguistic family. However, the Zhuang are an ethnic group rather than strictly a linguistic one. Given variations among their subdialects, many Zhuang speakers struggle to communicate across dialectal boundaries (Z). Overall, Zhuang variant speeches exhibit diverse non-hereditary influences, including linguistic features from Zhuang, Daic, and Chinese groups (Lan Hongyin, 1984, pp. 131–138).
Historically, both the Viets and the Zhuang were recorded as descendants of the Yue people, potentially referenced as "Bjet" or a term resembling "Bod" (cf. 百越 Baiyue, 百姓 Baixing; see Terrien Lacouperie, 1887). These groups are among the most evident representatives of the ancestral Yue aborigines, as documented in historical accounts. Vietnamese identity, more than ethnicity, is primarily defined through language, with strong Sinitic elements unifying the group. Chinese linguistic attributes, including syllabic structure, tonal variation, and semantic development, are apparent across Nôm, Sinitic Vietnamese, and Sino-Vietnamese lexical sets. Vietnamese melodic intonation, as seen in transliterations of foreign place names, further supports these connections. For instance, ancient Chamic names such as Vijaya and Kauthara were softened in Vietnamese through Sino-Vietnamese adaptations into Quinhơn (歸仁 Guiren) and Nhatrang (牙莊 Yazhuang).
Conversely, the Mon-Khmer groups represent ethnic identities rather than linguistic classifications. People of Mon-Khmer ancestry tend to define themselves collectively through ethnicity, distinguishing themselves from the Vietnamese Kinh majority and neighboring Muong minorities. For example, an individual of Muong ethnicity typically has no difficulty identifying with Vietnamese nationality, whereas a Vietnamese citizen of Khmer origin, despite being born and raised in Vietnam and fluent in the language, may or may not consider themselves ethnically Vietnamese. If they identify as Khmer rather than Vietnamese, their distinction is typically rooted in language rather than racial affiliation, as seen in similar cases with Vietnamese individuals of Chinese descent.
Bilingual speakers of Khmer origin often view their racial identity as aligning with Khmer communities in Cambodia, while simultaneously identifying as Vietnamese nationals based on citizenship. This dual perspective underscores the intricate relationship between ethnicity, language, and national identity among minority groups in Vietnam.
The formation of Vietnamese place names further illustrates linguistic divergence. If Vietnamese and Khmer languages had shared a genetic affiliation, there would have been little need for Vietnamese speakers to create entirely new place names such as Sóctrăng in place of Khleang, Càmau for Khmaw (in Khmer), Namvang for Phnom Penh, or Caomiên for modern Cambodian Khmer. Similarly, they developed transcriptions such as Sàigòn (西岸 Xī'àn, Cant. /Sajngon/ for "Westbank"). In contrast, Sino-Vietnamese place names like Tâyninh (西寧 Xining, "The Pacified West") or Bắcninh (北寧 Beining, "The Pacified North") align directly with corresponding Chinese place names, reflecting a strong linguistic pairing.
Anthropologically, Chinese identity is defined more as a cultural construct than a racial classification. The Chinese script is historically credited with unifying diverse ethnic groups within China, facilitating communication across speakers of different dialects regardless of their native languages. For example, in southern China, people from Jiangxi, Hunan, Guangxi, Sichuan, and Yunnan speak various forms of Southwestern Mandarin, distinct from southeastern dialects spoken in Jiangsu or MinNan provinces. Despite these dialectal differences, all groups can read and understand written Chinese, including Cantonese speakers and, interestingly, even ancient Annamese.
Linguistically, however, Sinicization has had minimal impact on groups such as the Uyghurs, Inner Mongolians, or Tibetans, even though these formerly independent regions were annexed into the Middle Kingdom centuries ago, a historical process comparable to Annam’s colonization under Chinese imperial rule.
The author has often considered Vietnamese a linguistic embodiment of ancient Yue speech due to its early divergence from the Chinese mainstream. However, this perspective does not fully hold. Unlike Cantonese or Fukienese, spoken by communities that have remained in their ancestral homelands in southern China for millennia, Vietnamese has followed a distinct evolutionary trajectory. Historically, while ancient Annam was a part of China, its linguistic development reflected the racial makeup of its population.
Following its separation from imperial China and establishment as an independent state, Annam expanded southward, leading to the inevitable integration of Chamic and Mon-Khmer elements both racially and linguistically. Nonetheless, the nation’s core population consisted of descendants of early settlers whose demographic composition had solidified during Annam’s colonial period under China. Over time, racial admixture with indigenous groups accompanied territorial expansion, paralleling the assimilation carried out by the Qin-Han dynasties with native populations in ancient northern Vietnam.
Notwithstanding these racial dynamics, the issue seems to have largely escaped the attention of anthropologists and linguists specializing in Vietnamese studies. The Austroasiatic Mon-Khmer hypothesis has positioned the narrative to suggest that local Mon-Khmer natives were "Vietnamized" rather than the reverse, a perspective likely shaped by the historical prestige of the Khmer Kingdom. However, proponents of this hypothesis appear to have overlooked key historical events, particularly the arrival of settlers from southern China who colonized and later Sinicized the local populations. This process of racial admixture began after the Han annexation of 111 B.C., around the start of the first millennium A.D., and continued for roughly a thousand years. These settlers later formed what became Vietnam's Kinh people, who subsequently "Vietnamized" later waves of Chinese immigrants, including the Minhhương people (明鄉人), descendants of loyalists of the fallen Ming Dynasty. Fleeing Manchurian rule in mainland China, hundreds of Ming refugee fleets sailed southward to seek asylum in Vietnam from the late 17th century onward. Some of these groups initially settled in Cambodia, living among Khmer communities before relocating to Vietnam.
Despite the historical significance of these migratory waves, little discourse has emerged on this subject within linguistic scholarship. Instead, academic focus has been largely placed on classifying the Vietnamese language within the Austroasiatic Mon-Khmer linguistic sub-family, a classification shaped more by prevailing academic trends than definitive historical analysis. This framework primarily relies on identifying basic word cognates between Mon-Khmer and Vietnamese languages, emphasizing shared lexical features.
To further understand the presence of Mon-Khmer lexicons in Vietnamese, it is critical to note that Austroasiatic Mon-Khmer vocabulary comprises far fewer core words compared to Sinitic-Vietnamese lexicons, which closely align with both Sino-Vietnamese and Sino-Tibetan etymologies. Ethnically, this mirrors the historical precedent wherein Chinese immigrants from southern China colonized and shaped ancient Annam. Following the 12th century, the early Annamese migrated southward, resettling across newly acquired territories in the now-extinct Champa and Khmer kingdoms. These interactions facilitated close ethnolinguistic exchanges, with groups adopting vocabulary from one another. Fundamental Mon-Khmer terms entered Vietnamese through direct linguistic contact, much like how Yue words were assimilated into Old Chinese. Evidence of these exchanges appears in the uneven distribution of basic Mon-Khmer words, which are present in some branches of the linguistic family but absent in others. Such words have been spoken across the high western mountain ranges of Vietnam since the early stages of territorial expansion.
For comparative reconstruction within the Sinitic linguistic framework, anthropological factors, such as history, culture, and speech patterns, must be carefully considered, as they shape Vietnamese national identity. Today, Vietnam comprises 54 distinct minority groups, each maintaining unique native dialects (e.g., Hmong, Daic, Nung, Chamic, and Mon-Khmer subdialects). These linguistic divisions persist independently of whether their ancestors were subjects of the ancient Nam Việt Kingdom. For example, the Li minority (黎族) of Hainan Island shares genetic affiliations with the Chamic people of Central Vietnam, reflecting their Austronesian linguistic roots. However, they were not directly connected to the ancient Annamese population until after the 12th century, when Chamic groups began intermingling with Annamese resettlers. This raises questions about the validity of linking Chamic lexical items such as ni and nớ ("that, there") to Chinese 那 (nà).
The comparative significance of Mon-Khmer basic words in Vietnamese must be weighed against the extensive Sino-Tibetan etymological parallels present in the language. This comparison is analogous to how Japanese and Korean selectively adopted Chinese loanwords, integrating them into daily use. Similarly, fundamental Vietnamese words with clear cognates in various Sino-Tibetan languages may have existed from ancient times, potentially dating as far back as the legendary tale of Phù Đổng Thiênvương, Thánh Dóng (聖董), which describes ancestral resistance against invaders from China’s Yin Dynasty (殷朝) (Y). If this legend contains historical truths, then centuries earlier, the Xia kings may have descended directly from nomadic horseback warriors, likely of proto-Tibetan origin. These groups may have migrated south of the Yangtze River, establishing contact with Taic-speaking natives, the ancestors of the later Chu subjects and the Yue people, who made up most ethnic groups of southern China in later historical periods.
The complete formation of Vietnamese as it exists today likely took approximately 1,050 years, beginning in 111 B.C. and culminating in 939 A.D., when Middle Vietnamese emerged as a distinct entity. This marked its departure from the Sinicization that shaped Cantonese and Fukienese, which remained within the Sino-sphere. The Austroasiatic Mon-Khmer components can be conveniently factored into two separate periods, either remote antiquity or the 12th century, when Annam expanded south of the 16th parallel. Just as Chamic elements can be set aside in analyzing Vietnamese linguistic history, Mon-Khmer components had relatively little influence on the language’s earlier evolutionary stages.
Historical evidence strongly supports this view. Over centuries, waves of Han immigrants, including foot soldiers, newly appointed or exiled officials, and displaced refugees, emigrated from southern China, permanently settling in Annam. These new settlers eventually integrated into local communities long after Annam’s formal separation from Chinese rule. Remarkably, this process persists even today, as Chinese migrant laborers continue to resettle permanently in Vietnam.
During the French colonial era, spanning nearly 100 years, linguistic reforms had a lasting impact on Vietnam’s writing system. The modern Romanized Vietnamese script gained widespread adoption as intellectual circles spearheaded efforts to replace the traditional Vietnamese writing system in the early 20th century. This radical transformation marked a definitive break from the Sinitic cycle, altering semantic and syntactic structures derived from classical Chinese used in 17th-century and earlier texts. Today, spoken and written Vietnamese have been modernized to the point of increased precision and logic, incorporating Western linguistic mechanisms, including structured topics, complete sentences, and punctuation, while retaining a vast vocabulary stock of Chinese origin.
When Annam gained independence in 939 A.D., its territory was limited to the rice-growing regions surrounding the Red River Basin, located in present-day northern Vietnam. Historically, this region had been part of the Nam Việt Kingdom in southern China, approximately 300 years before 111 B.C. The Austroasiatic theory of a Mon-Khmer genetic affiliation with Vietnamese is challenged by the fact that Vietnam’s central and southern territories, south of the 16th parallel, were only incorporated into Annam after the 12th century. This territorial expansion resulted from warfare and political concessions from the Champa Kingdom (192–1832 A.D.). The Chams, of Austronesian origin, had established a long-lasting and powerful state, effectively serving as a geographical buffer between ancient Vietnamese and Mon-Khmer populations. (林) The hypothesis that Vietnamese shares an ancestral connection with Mon-Khmer is problematic, as contact between these groups occurred far later than Austroasiatic theorists propose.
By the time Annamese resettlers expanded southward, they were already of mixed racial heritage, descended from early northeastern Vietnamese aborigines and Han immigrants from southern China. These settlers included people from Chu (楚), Wu (吳), Yue (越), Min (閩), and other Yue-related states recorded during the Western Zhou period. After the Han Empire annexed Nam Việt in 111 B.C., Han settlers migrated en masse into Annam, intermarrying with local populations. Their descendants continued migrating into what would later become central and southern Vietnam, beginning in the 12th century and lasting until the early 16th century.
Linguistically, interactions between southern Annamese settlers and Mon-Khmer speakers in newly acquired mountainous and delta regions likely contributed to Vietnamese absorption of Mon-Khmer vocabulary. These spatial contacts occurred primarily in the Central Highlands along the Trườngsơn Range, as well as the fertile Mekong Delta, where Khmer populations were concentrated. The presence of Mon-Khmer vocabulary in Vietnamese is largely the result of linguistic contact, rather than an intrinsic genetic relationship.
Austroasiatic theorists have repeatedly emphasized Mon-Khmer influences in Vietnamese linguistic classification. However, whether Vietnamese truly belongs within the Austroasiatic family depends largely on whether the analysis is approached historically or geographically. From a historical standpoint, Austroasiatic claims regarding genetic affiliations between Vietnamese and Mon-Khmer languages remain speculative, particularly given the prolonged migrations into Vietnam from the north. From a geographic perspective, Austroasiatic Mon-Khmer speakers inhabited the Mekong River Basin long before Annamese settlers arrived, and proponents of the hypothesis often assume that the earliest Vietnamese were originally Mon-Khmer, disregarding evidence of massive Han migration.
A parallel can be drawn between these Austroasiatic claims and Vietnamese nationalist narratives, which assert cultural ownership over excavated relics from the Sahuỳnh and Óc-Eo civilizations. These regions once thrived under Chamic monarchs long before Annamese expansion. Nationalist scholars, eager to establish uninterrupted Vietnamese lineage, boldly claim these artifacts as the creations of their own ancestors, ignoring the fact that indigenous artisans had long since disappeared.
Similarly, Austroasiatic linguists attempt to trace Vietnamese linguistic ancestry to prehistoric Mon-Khmer origins, seeking academic support for Mon-Khmer cognates. Since the 20th century, Vietnamese scholars have asserted ancestral heritage over Dongsonian bronze drums, discovered across vast regions of Southeast Asia. These artifacts were widely credited as belonging to the forefathers of modern Vietnamese, despite limited evidence regarding their manufacturing techniques. Surprisingly, few Vietnamese scholars have linked these drums to Zhuang communities, who continue using similar instruments in northwestern Vietnam and southern China.
The key question remains: who were the actual creators of these advanced bronze drums found across Southeast Asia? Did the ancient Yue, who migrated south from China, introduce them? Or were they produced by Austroasiatic peoples who spread across Southeast Asia thousands of years ago? Nationalist scholars often cite accounts from The Book of the Later Han (後漢書 Hòu Hànshū) to argue that Han imperial forces annihilated Vietnam’s indigenous cultural heritage, as recorded in General Ma Yuan’s campaign, in which captured Lạc Việt bronze drums were melted down to create bronze horses. (See Wikipedia: Ma Yuan's bronze horses).
Although the geographical distribution of bronze drums places both the Vietnamese and the Zhuang within the same cultural sphere, only the latter group continues to use such drums in sacrificial ceremonies. Moreover, Zhuang folklore explicitly details the origins of their bronze drum tradition, whereas the Viet-Muong exhibit no equivalent connection. If the nationalist claims regarding Vietnamese heritage were taken at face value, then the Vietnamese would indeed be heirs to bronze drums, but this reasoning becomes inconsistent if they simultaneously assert ties to Khmer heritage, which holds an undeniably vast cultural footprint. The question then arises: who should be considered legitimate descendants of the Yue ancestry? Both spatially and temporally, this contradiction must be clarified.
The Austroasiatic Mon-Khmer hypothesis, meanwhile, disregards historical chronology and implicitly asserts that the Vietnamese descend from aboriginal forefathers who inhabited vast southern territories around 6300 B.C., long before Annam emerged as an independent state. These Austroasiatic aborigines are postulated to have spoken archaic forms of Mon-Khmer languages, with the Vietnamese model tailored to fit within an Austroasiatic framework. However, this classification fails to account for the significant presence of Chinese lexical influences in Vietnamese, which have shaped its linguistic structure in a manner distinct from traditional Mon-Khmer languages. Comparatively, the Vietnamese language is not wholly composed of borrowed Chinese vocabulary, as seen in the Bulgarian language’s absorption of Slavic elements. Instead, it shares some traits with the Haitian French Creole model, making it a false Chinese dialect akin to Cantonese. Other national languages around the world, including Spanish across Latin America, do not stem from indigenous languages but from colonial influences. Similarly, Mandarin is spoken by indigenous Taiwanese and native Singaporeans despite its foreign origins. Crucially, the ancestral Yue elements in Vietnamese existed prior to the development of Sinitic linguistic entities, as evidenced by the common Yayu (雅語) diplomatic language used for inter-state communication during the Eastern Zhou era.
Although the Austroasiatic hypothesis has been widely accepted and has classified Vietnamese within the Mon-Khmer sub-family, its foundational word list has yet to be systematically reviewed alongside Sino-Tibetan etymologies. Specialists in Austroasiatic and Sino-Tibetan studies have remained unaware of the degree to which over 400 Sinitic-Vietnamese etyma align with Sino-Tibetan linguistic structures. This study aims to examine Sino-Tibetan etymologies beyond those traditionally classified within the Austroasiatic Mon-Khmer framework.
The Austroasiatic theorists will likely react with astonishment upon reviewing the Sino-Tibetan basic word lists presented here. Furthermore, this repudiation of the Austroasiatic Mon-Khmer hypothesis is grounded not only in linguistic evidence but also in archaeology, anthropology, history, and philology. Recorded history indicates that Vietnamese forebears originated from the north, far removed from the Indo-Chinese peninsula, dating back at least 3,000 years. Anthropologically, Vietnamese mythology asserts that the Vietnamese are "offspring of dragons and deities" (Con Rồng Cháu Tiên) and were once considered "descendants of the Yellow Emperor" (黃帝, or 炎帝, SV Viêm Đế), a legend also embraced by the Chinese. Both traditions appear to reinforce a shared ancestral Yue heritage, suggesting that early Yue peoples may have worshipped alligators, a practice absent among Mon-Khmer cultures (see Terrien Lacouperie, 1887). Historically and culturally, descendants of pre-Qin states, comprising subjects from Chu (楚), Wu (吳), Yue (越), and other southern polities, continue to commemorate the poet Khuất Nguyên (屈原, Qu Yuan) on the Fifth Day of the Fifth Month in the Lunar calendar (端午節, Duanwujie, or the Dragon Boat Festival), honoring his martyrdom in resistance to Qin domination (see Trần Trọng Kim’s Việt-Nam Sử-Lược; Ngô Sỹ Liên’s Đại-Việt Sử-Ký; Bo Yang’s edition of Sima Guang’s 資治通鑒 (Zizhi Tongjian), 1983, Vol. 1).
Meanwhile, whether or not modern Vietnamese are true Yue descendants, they frequently identify themselves with the prestigious metallurgical tradition of bronze casting, a cultural legacy extending throughout Southeast Asia, including the Indonesian archipelago. However, they have cautiously refrained from claiming ownership of Chamic Hindu temple ruins scattered along Vietnam’s central coast, recognizing their distinct historical origins.
Vietnamese nationalist enthusiasm aligns with academic narratives crafted by Austroasiatic Mon-Khmer theorists, who framed Vietnam’s linguistic and cultural history within a broader Southeast Asian context. This perspective positioned Vietnam within grand civilizational narratives, including the Khmer Empire, which was once the most dominant regional force prior to the 11th century. Austroasiatic followers have been led to believe that Mon-Khmer speakers, having left cultural remnants across Southeast Asia, were ancestral Vietnamese. From a historical perspective, however, none of the Khmer ruins or thousands of years’ worth of artifacts discovered in Vietnam’s central region are connected to early Annamese populations.
Over a span of three millennia, successive waves of settlers encountering Khmer groups on-site contributed to linguistic development, resulting in Vietnamese absorption of new Mon-Khmer elements. On one hand, indigenous Mon-Khmer speakers in Vietnam logically retain their locally evolved languages, persisting among Mon-Khmer minority communities in remote mountainous regions. On the other hand, the Vietnamese Kinh majority remains concentrated in arable lowland areas along the central coast, the Red River Delta, and the Mekong River Basin, with the most recent major waves of settlement occurring around 310 years ago. The formation of Vietnamese identity was driven by intermarriage between indigenous foremothers and immigrant men, fostering family structures rooted in Confucian values. Generational cycles resulted in a racially mixed populace, expanding Vietnam southward through both demographic growth and territorial consolidation.
As we continue uncovering inconsistencies within the Austroasiatic Mon-Khmer hypothesis, historical analysis must take precedence over speculative prehistoric timelines. It is important for Austroasiatic scholars to recognize that indigenous Mon-Khmer speakers, having retreated into remote mountain regions, never played a governing role in Annam’s statehood. The historical Annamese, inhabitants of the Chinese-administered Annam Prefecture for approximately 1,060 years prior to independence in 939 A.D., formed the actual ancestral lineage of today’s Vietnamese Kinh majority.
The Austroasiatic camp has long maintained that Mon-Khmer linguistic elements coexisted alongside Vietic ones, regardless of whether the former group originated from the same Yue lineage. It is further assumed that both groups ultimately derived from Taic origins dating back to prehistoric times. Contemporary Mon-Khmer minority communities are likely descendants of aboriginal settlers from neighboring Indo-Chinese territories, who may once have been dominant in their native regions (Nguyễn Ngọc Sơn, 1993).
In any case, the method of wet rice farming with paddy fields, which Mon-Khmer groups continue to practice, must have spread southward from the north, originating just below Dongting Lake (Độngđìnhhồ) in Hunan Province. This region has traditionally been regarded as the ancestral homeland of the Yue people, dating back approximately 3,000 years. Over time, wet rice agriculture extended further into mountainous areas where Daic and Zhuang ethnic groups remain concentrated today. This suggests that water paddy cultivation had already existed in the region long before its widespread adoption by Mon-Khmer communities.
The southern regions of China, home to the descendants of the ancient Yue, who had Taic roots and established the Chu State, later contributed to the diverse population of the Nam Việt Kingdom. These regions are likely where the ancestral Vietic people originated before migrating to what is now northern Vietnam for various reasons. Immigrants from southern China were not limited to refugees and exiles; they also included officials, foot soldiers, servants, and others who followed Han colonial expansion. As previously noted, these groups integrated with earlier resettlers, creating a racially mixed demographic composition, a classic example of anthropological assimilation.
Over generations, the descendants of these Yue emigrants completed their southward migration and permanently settled in Annam, laying the foundation for its emerging sovereignty. Initially, earlier generations communicated using either their mother’s or father’s language or a hybrid of both, eventually developing a distinct local speech. As successive generations moved further south, their descendants continued to identify as "people of the southern Yue" (Việt Nam), whether through a process of cultural assimilation or as a declaration of anti-China sentiment. This population evolved into the Kinh majority, who today speak the Vietnamese national language.
Where, then, do Austroasiatic factors fit into the broader framework of the ancient Yue theory, which is substantiated by historical evidence? Although the Austroasiatic Mon-Khmer framework offers largely speculative prehistoric connections, its speakers may also trace their lineage to the same Taic roots associated with the historical Annamese, likely originating from a southern Yue branch. They may even share ethnic ties with Zhuang or Daic groups, who are historically credited with creating bronze drums. More specifically, they might be linked to the Maonan ethnicity (毛南族) of southern China, potentially related to ancestral Mon peoples. This hypothesis aligns with the cultural significance of bronze drums but excludes artifacts from the Óc-Eo and Sahuỳnh civilizations found in modern Vietnam. These artifacts were created by indigenous populations distinct from the early Vietnamese settlers and predate the emergence of Chamic peoples, who were likely related to the Li minorities residing on China’s Hainan Island.
The earliest Annamese resettled into the central coastal corridor regions relatively late, around the 12th century. In contrast, dominant Mon-Khmer speakers had inhabited the Indochina region for over 6,000 years before present, distinguishing them from later Vietnamese settlers. The Vietnamese remain a distinct group, separate from the 53 indigenous minority groups in Vietnam today. Among these minorities are southern Mon-Khmer speakers, referred to by the French as 'Montagnards,' who still live on their ancestral lands under Vietnam’s governance. These minority groups only came into contact with the late-arriving Vietnamese resettlers within the last few centuries. They inhabit areas along Vietnam’s border with Cambodia, spanning the western mountainous ranges and high plateaus, and extending into the southern Mekong Basin, territory annexed by Vietnam from Cambodia during the 16th century.
This raises two key questions regarding the origins of the Vietnamese. First, do they descend from a branch of the Yue, or are they Austroasiatic? A reality the Vietnamese must confront is that, despite nationalist claims, the Vietnamese today, including their Muong cousins, may not be direct descendants of the Yue. Unlike the Zhuang people, who continue to use bronze drums in tribal sacrificial ceremonies, Vietnamese nationalist narratives linking their origins to the Yue often reflect wishful thinking reinforced by collective belief. Second, what is frequently overlooked is the possibility that ancestral religious practices, the belief in ever-present spirits of one’s forebears offering protection and blessings, may have originated abroad as far back as 5,000 years ago (see Dong Zuo-Bin, 1933; Wu Qi-Chang, 1934; Fu Si-Nian, 1934).
This hypothesis is particularly relevant when examining the southern populations living in Vietnam’s recently annexed territories. These groups had no direct connection to the later immigrants from the north, who resettled and intermingled with earlier inhabitants, contributing to the gradual genetic transformation of the Annamese population. This regional transmutation laid the foundation for what ultimately became the Vietnamese nation, a process that aligns with the formula {4Y6Z8H+CMK}.
Ethnically, their descendants, the modern Vietnamese, now live atop archaeological sites where cultural artifacts, including bronze drums, have been unearthed. Interestingly, these relics have been found not only in southern China, the ancestral homeland of the Yue, but also in regions as distant as Indonesia’s southernmost islands.
The discovery of Đông Sơn drums in New Guinea further supports evidence of ancient trade connections. These findings allow Austroasiatic scholars to align their narrative with Yue theorization, as both frameworks demonstrate inclusivity despite originating in distinct historical periods. This convergence is particularly significant for Vietic entities, both racially and linguistically, as their history spans more than 3,000 years, rooted in references to "Southern barbarians" in early Chinese records.
Figure 2.5, Dongson Bronze Drums found in Indonesia
(Source: http://en.wikipedia.org/wiki/Dong_Son_drum)
x X x
Table 2.1, Dongson Bronze Drums
Đôngsơn drums (also called Heger Type I drums) are bronze drums fabricated by the Đôngsơn culture in the Red River Delta of northern Vietnam. The drums were produced from about 600 BCE or earlier until the third century CE and are one of the culture's finest examples of metalworking. The drums, cast in bronze using the lost-wax casting method, are up to a meter in height and weigh up to 100 kilograms (220 lbs.). Đôngsơn drums were apparently both musical instruments and cult objects. They are decorated with geometric patterns, scenes of daily life and war, animals and birds, and boats. The latter alludes to the importance of trade to the culture in which they were made, and the drums themselves became objects of trade and heirlooms. More than 200 have been found, across an area from eastern Indonesia to Vietnam and parts of Southern China.
The earliest drum, found in 1976 at Wangjiaba (万家坝) in Chuxiong Yi Autonomous Prefecture, Yunnan, China, dates back some 2,700 years. Such drums are classified into the bigger and heavier Yue (粤系) drums, which include the Dong Son drums, and the Dian (滇系) drums, with 8 subtypes in all, and their invention has been attributed to Ma Yuan and Zhuge Liang. The Book of the Later Han, however, records that Ma melted down the bronze drums seized from the rebel Lạc Việt in Jiaozhi and cast them into a horse.
The discovery of Đôngsơn drums in New Guinea is seen as proof of trade connections, spanning at least the past thousand years, between this region and the technologically advanced societies of Java and China [South].
In 1902, a collection of 165 large bronze drums was published by F. Heger, who subdivided them into a classification of four types.
(Source: https://en.wikipedia.org/wiki/Dong_Son_drum)
Terminology such as Taic, Yue, Daic, Vietic, Muong, Annamese, Kinh, or Vietnamese corresponds to distinct historical periods. If the Austroasiatic term is included among them, it would likely fit between the Taic and Yue timeframes. In this sense, each term reflects a specific historical implication rather than retroactively attaching "good things", such as cultural developments and material artifacts, to forebears from later periods. National pride in inherited traditions often leads to subjective interpretations, enticing people to embrace all perceived "good things," including fine clay utensils or advanced bronze drums, under the assumption that they were exclusively passed down by their ancestors. Such historical misattribution is not unique to Vietnam; modern Chinese scholars have similarly claimed cultural curios found in southern China as their own. Examples include claims that bronze drums were invented by Ma Yuan (馬援) of the Eastern Han and Zhuge Liang (諸葛亮) of the Shu Han, or assertions that copperware existed in earlier epochs despite the absence of known bronze mines in the northeast, where the Shang Dynasty originated (Nguyễn Ngọc San, 1993). These false claims obscure objective analysis of prehistoric anthropological matters.
This dynamic raises an essential question: who, then, are the Vietnamese? The answer lies in the historical record showing that Han Chinese society was the result of a fusion with Yue peoples, represented by both Nam Việt (南越) and the Chu subjects before their kingdoms were incorporated into the Han Empire. Long before Annam gained sovereignty, earlier Muong groups, descendants of the Yue entity following the Viet-Muong split after the Qin-Han period, chose to flee into the mountains rather than assimilate under Han rule. As a result, they retained a relatively pure aboriginal lineage compared to those who remained and intermarried with migrants from southern China. Over time, the racially mixed descendants of these resettlers became known as the Kinh, forming the Vietnamese population.
Similar to the linguistic structure of the Vietnamese language, which features a combination of Yue and Sinitic elements but lacks direct ties to the prehistoric Austroasiatic framework, the racial makeup of contemporary Vietnamese likely emerged during the later colonial period in Annam. If one adheres to the timeline suggesting that the proto-Taic people gave rise to the Yue and Vietic populations, then the Austroasiatic peoples must have already migrated far beyond their Indo-Chinese homeland, reaching the southern hemisphere at least 6,000 to 4,000 years before present. This period lies beyond the scope of the present discussion on the historical development of Vietnamese and its speakers, which traces back to 111 B.C. Further exploration of this topic will be addressed in subsequent chapters.
Regarding linguistic affiliations with Sinitic languages, which closely parallel the racial composition of today’s Vietnamese populace, one could argue that had Vietnam remained a dependent prefecture of China’s successive dynasties beyond 939 A.D., rather than securing independence from the Nam Han (南漢) State, its language, even in its present 21st-century form, would have been classified as another Chinese dialect. The same linguistic framework that categorizes Fukienese (Amoy 廈門 Xiamen) and Cantonese as Sinitic languages would have applied to Vietnamese.
The linguistic divergence of Vietnamese, Fukienese, and Cantonese, alongside subdialects such as Amoy, Hainanese, Chaozhou (Teochow), and Toishanese, suggests they all originally evolved from a proto-Yue language. Their paths diverged significantly after 111 B.C., when the Han Empire annexed the vast territory of Nam Việt in southern China. Regardless of which dynasty governed, the land remained known as "China." Annam, or ancient Vietnam, remained part of China for 1,000 years until its mid-10th-century independence. To contextualize Vietnamese within the Sino-sphere, one could imagine an alternate historical scenario in which Fujian and Guangdong provinces had also seceded from China around the same period. Such speculation underscores the enduring Sinicization of Vietnam, even after separation, mirroring processes observed among its Yue cousin states to the north.
Anthropological evidence affirms a shared ancestral lineage between Vietnamese and southern Chinese populations. Lexically, Vietnamese basic etyma retain strikingly similar remnants of the common Yue linguistic substratum. Examples include "con" (child) 子 (仔) Amoy /kẽ/, "mợ" (mother) 母 mǔ Hainanese /maj2/, "biết" (know) 明白 míngbai Hainanese, Amoy /mɓat7/, "soài" (mango) 檨 Amoy /swãj4/, "dê" (goat) 羊 Chaozhou /jẽw1/, "gàcồ" (rooster) 雞公 jīgōng Hai. /kōj1koŋ1/, and "gàmái" (hen) 雞母 jīmǔ Hai. /kōj1maj2/.
While Fukienese and Cantonese were fully Sinicized and classified among major southern Chinese dialects, Vietnamese followed a separate linguistic trajectory after Giaochi (交趾 Jiaozhi) became one of nine prefectures under the Western Han. Unlike its northern linguistic counterparts, the development of the Vietnamese language was shaped by a racially mixed populace of aboriginal Yue and Han officials, as well as their foot soldiers migrating from the north. Following independence, the resettlement of Annamese further south introduced additional foreign linguistic influences along their expansion route.
Although archaeological findings challenge nationalist claims regarding ownership of southern indigenous artifacts, linguistic affiliations are better determined by their consistent evolutionary patterns. As Sinitic languages expanded southward around 100 B.C., the Indian-influenced Champa Kingdom, located south of ancient Annam, failed to establish meaningful contact with its northern neighbors. It also lost connections with its racial relatives, now known as the Li minority of Hainan Island. Further south, Chamic groups were frequently at odds with Khmer populations. Eventually, the Champa Kingdom was annexed into Annam, a process completed in the 18th century. Despite these interactions, besides placenames, Vietnamese linguists attribute only a handful of Chamic loanwords, such as U (mother), ni (this), and nớ (that), to the Hue dialect, though this postulation remains disputed.
Beyond geographical and anthropological factors dating 3,000 to 5,000 years before present, linguistic peculiarities found exclusively in Chinese and Vietnamese, such as tonality and disyllabicity, support a Sinitic-Yue affiliation. This contrasts sharply with the assimilation of Chinese loanwords into toneless Altaic languages such as Korean and Japanese, where borrowed Chinese lexemes underwent structural modifications in Hanja and Kanji to match native speech patterns.
The comparative analysis presented here underscores Vietnam’s linguistic and historical position within the Sino-Tibetan framework rather than the Austroasiatic Mon-Khmer model. The continuity of shared attributes between Vietnamese and Chinese, juxtaposed against the linguistic separation between Chinese and Korean or Japanese, further substantiates the argument for a Sino-Tibetan classification of Vietnamese.
The linguistic proximity between Vietnamese and Chinese is evident in their shared semantics, tonal registers, lexical classifiers, grammatical prepositions, conjunctions, and syntactic structures, reinforcing their common linguistic heritage. Before the adoption of Romanized Vietnamese, Chinese script was used in official documents for over 2,200 years and was later adapted to transcribe Nôm, the native Vietnamese language, as well as indigenous dialectal names for local products and places. This script, known as chữNôm, coexisted alongside standard Chinese writing, with one serving an official function and the other reserved for vernacular usage. For instance, "Nôm" (喃) and "Nam" (南) were used for "Nồm" while "tử" (子) and "tý" were utilized for con ("child") and chuột ("rat"). Similarly, "xú" (丑) and "sửu" were written for xấu ("ugly") and trâu ("buffalo"), while "tơ" and "ty" (絲) corresponded to silk-related terms.
Beyond these spatial and temporal factors, Chinese cultural influences, particularly Confucianism, directly impacted phonological changes in Vietnamese, including linguistic taboos and euphemisms. Words deemed homonymous with royal names or venerable elders were often avoided, as seen in substitutions like lời or lãi in place of Lợi (利) from King Lê Lợi’s name. Sound shifts must therefore be central to examining Yue roots in Vietnamese, as variations in pronunciation over time illustrate deeper linguistic patterns. Consequently, this study refrains from unearthing substrata for fossilized etyma, which may represent local remnants from the Austroasiatic Mon-Khmer stock, a domain long defended by Austroasiatic theorists.
Linguistic truth belongs to those who recognize what others overlook and continue advocating their views, even when they diverge from mainstream Sino-Tibetan classifications. Grammar also warrants exploration, even though it is among the fastest-changing elements of language. A 2017 report published on Phys.org, "The 'myth' of language history: Languages do not share a single history," highlights this variability. To illustrate, Vietnamese word formation often follows a syntactically reversed order of {stem + modifier}, which differs from that of other Chinese dialects, yet retains identical syllabic components. Some ancient terms nonetheless display the same order in both languages, such as Hoanam (華南, "China South") or Thầnnông (神農, "God of Agriculture") instead of the reversed Nánhuá (南華) or Nóngshén (農神). Despite differences in syllabic order, particularly in phonological and syntactic structure, Vietnamese and Chinese remain closely knit in semantics, as shown by several Middle Chinese words for which Vietnamese still retains both orders, for example, bảođảm vs. đảmbảo (擔保 dānbǎo, "guarantee"), áiân vs. ânái (愛恩 ài'ēn vs. 恩愛 ēn'ài, "conjugal love"), and hoen-ố (染污 rǎnwū, "tainted") vs. ônhiễm (污染 wūrǎn, "polluted").
The primary strength of the Austroasiatic Mon-Khmer hypothesis lies in its claim that Vietnamese shares foundational vocabulary with Mon-Khmer languages. Yet, as revealed in this study, the same core words also appear in Sino-Tibetan languages. These fundamental lexicons extend beyond the scope of basic Austroasiatic word lists, encompassing additional native and indigenous words within Vietnamese linguistic stock. Western-trained specialists in Vietnamese linguistics within the Austroasiatic Mon-Khmer camp have taken notice of this pattern and continue efforts to recruit institutional graduates into their school of thought. Expanding scholarly engagement in this field further reinforces Austroasiatic Mon-Khmer classifications. Novices in Vietnamese historical linguistics often reiterate previously established narratives taught in academic settings, effectively turning linguistic classification into a repetitive cycle.
If increased interest in this field fosters broader discussions, scholars may reconsider Vietnamese linguistic classification by examining Sino-Tibetan etymologies alongside Austroasiatic Mon-Khmer word lists. This study introduces new evidence revealing cognacy among Sino-Tibetan and Vietnamese basic etyma, demonstrating their linguistic affiliation beyond Austroasiatic narratives. (See Chapter 10 - Parallels with the Sino-Tibetan languages)
At the outset, it is necessary to examine the basic word lists that Austroasiatic specialists have relied on as the foundation of their hypothesis for over a century. At the turn of the 20th century, in an effort to solidify their theory, Austroasiatic pioneers launched counterarguments against the widely accepted Sino-Tibetan classification of Vietnamese (Meillet, A., 1952, pp. 526–27). Their approach has primarily involved extensive lexical tabulation and categorization of Khmer etyma, referred to here as etymology harvesting, within various Mon-Khmer linguistic subfamilies, such as Bahnaric and Katuic, while equating them with sibling Viet-Muong languages, including Muong, Ruc, and Thavung. Austroasiatic theorists have remained confident in their hypothesis, arguing that Vietnamese basic words align closely with Austroasiatic etymologies found across various Mon-Khmer dialects.
However, once the initial excitement surrounding the Austroasiatic classification subsided, it became clear that these basic etyma were distributed unevenly across multiple Mon-Khmer languages. In other words, some dialects retained similar forms, while others did not, suggesting that linguistic diffusion, rather than a shared genetic origin, may explain the similarities. In certain instances, this phenomenon may be attributed to regional linguistic contacts, particularly among Muong subdialects.
If there is a legitimate field of Sinitic-Vietnamese etymological linguistics, it must be distinguished from natural sciences, where standardized measurement tools are used universally. Linguistic methodologies rooted in Indo-European analysis may not be adequate for examining tonal languages. As a result, cognate etyma in different linguistic families should not be expected to share identical phonological forms. For example, words would generally be classified as loanwords if their phonology closely resembles that of another language, as seen in Sino-Vietnamese lexicons, which largely mirror Middle Chinese pronunciations. In contrast, Mon-Khmer cognates display phonological similarities across different languages, raising the possibility of coincidence. This contradicts the linguistic axiom that states: the closer two words are in phonetic appearance, the more distant their genetic affiliation is presumed to be. This pattern is particularly evident when comparing tonal and non-tonal languages, for instance, Vietnamese chồmhỗm ("squat") and Khmer /chorahom/ versus Mandarin 犬坐 (quǎnzuò).
Anthropologically, the racial admixture of Vietnamese bears striking similarities to the evolutionary processes shaping the Han Chinese. Expressed as a formula, the process began when proto-Chinese {X}, originating from Tibetan regions of southwestern China, intermingled with proto-Yue aboriginals {YY}, presumably the Taic people, who comprised the majority of the Chu State’s population and spoke an ancient Daic language. This interaction occurred at a proportional ratio of 1 to 2, symbolically expressed as X/2Y. Over time, these groups formed the indigenous Yue populace {ZZZ}, inhabiting states such as Shu, Wu, and Yue. Their mixed descendants were later classified as Han {HHHH}, represented as 3Z4H (3 x Z and 4 x H). Under the Han Dynasty, these groups unified within the Middle Kingdom, effectively a "united states of Qin", marking the transition from Qin subjects to Han Chinese.
The racial composition of the Han Chinese, represented accordingly as {X2Y3Z4H}, emerged through the fusion of proto-Chinese (X), proto-Yue (YY), indigenous Yue (ZZZ), and Han (HHHH). Similarly, the racial makeup of Vietnamese nationals evolved from proto-Yue {YY} and later Yue {ZZZ} to proto-Vietic {YYZZZ}; the ancestors of the Vietic or early Annamese can therefore presumably be represented as {2Y3Z+4H}. These groups gradually transformed into modern Vietnamese, represented as {4Y6Z8H+CMK}, where {C} symbolizes the Cham component and {MK} denotes Mon-Khmer influences. This racial admixture closely mirrors the composition of Fukienese and Cantonese populations, shaped by similar fusion processes during the Han Dynasty before and after 111 B.C.
Consequently, the Austroasiatic formula can be tentatively expressed as {6YCMK}, contrasting with the modern Vietnamese formula {4Y6Z8H+CMK}. These formulations encapsulate the historical processes that forged distinct yet interconnected racial and cultural identities.
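For readers who find the bracket notation easier to follow when unpacked, the sketch below reads a formula such as {X2Y3Z4H} into symbol-and-weight pairs. It is only an illustration of the notation, not part of the original argument: the function name parse_formula is hypothetical, and the assumption that a symbol written without a number (X, C, MK) carries a weight of 1 is ours, suggested only by the "1 to 2" reading of X/2Y above.

```python
import re

# Symbols as used in this chapter: X (proto-Chinese), Y (proto-Yue),
# Z (indigenous Yue), H (Han), C (Cham), MK (Mon-Khmer).
TOKEN = re.compile(r"(\d*)(MK|[XYZHC])")

def parse_formula(formula: str) -> dict[str, int]:
    """Read a formula such as '{4Y6Z8H+CMK}' into {symbol: weight}."""
    body = formula.strip("{}").replace("+", "")
    tokens = TOKEN.findall(body)
    # Reject anything the token pattern did not fully account for.
    if "".join(count + symbol for count, symbol in tokens) != body:
        raise ValueError(f"unrecognized formula: {formula!r}")
    # Assumption: a symbol written without a number carries weight 1.
    return {symbol: int(count) if count else 1 for count, symbol in tokens}

han = parse_formula("{X2Y3Z4H}")
viet = parse_formula("{4Y6Z8H+CMK}")
print(han)
print(viet)
# Components present in the Vietnamese formula but not in the Han one.
print(sorted(set(viet) - set(han)))
```

Read this way, the two formulas share the Y, Z, and H components and differ in X on the Han side and in C and MK on the Vietnamese side. The weights remain symbolic and should not be read as measured proportions.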
As later chapters will elaborate in terms of historical factors, the development of Vietnamese has progressed in parallel with the racial composition of its speakers ({4Y6Z8H+CMK}). Historically, when Qin armies advanced southward, native Yue inhabitants ({2Y3Z2H}) from the Độngđìnhhồ region in present-day Hunan Province migrated en masse to the Red River Delta in northern Vietnam. This migration led to racial intermixing with indigenous groups, including the native Muong and the peoples associated with the Phùngnguyên Culture (c. 2000–1500 B.C.). (W) In subsequent periods, resettlers ({2Y3Z2H}) who had previously occupied the region intermingled with the newly arrived Yue groups ({4Y3Z2H}). Later, the ancestors of the Viet-Muong ({4Y3Z2H}) fled to the southwestern mountainous regions in response to Han invasions from 208 B.C., placing their linguistic heritage in direct contact with local Mon-Khmer speakers ({4Y+MK}). This historical interplay helps explain why certain Viet-Muong dialects exhibit phonological proximity to Mon-Khmer languages.
In short, symbolically, if Yue entities were expressed numerically to represent the proportions of racial blending shaping the genetic composition of the ancient Annamese, a plausible model might assign weighted values as {2Y3Z4H}. This theoretical construct draws upon historical records, including census data documenting population growth from 400,000 to 980,000 across the three Han prefectures of Giaochỉ, Cửuchân, and Nhậtnam within a century (111 B.C.–11 B.C.). Additionally, accounts indicate that between 15,000 and 30,000 unmarried women from the NamViệt State were forcibly married to Qin soldiers during the brief Qin Dynasty (Lu Shih-Peng, 1964, Eng. p. 11, Chin. p. 47).
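To put the census figures above in perspective, the following sketch computes the average yearly growth they would imply. It is a rough illustration only: it assumes a constant compound rate over exactly 100 years, and the function name annual_growth_rate is hypothetical. It cannot by itself separate natural increase from immigration or from changes in how the census was taken.

```python
def annual_growth_rate(start: float, end: float, years: float) -> float:
    """Constant yearly rate r such that start * (1 + r) ** years == end."""
    return (end / start) ** (1 / years) - 1

# Figures cited above for the three Han prefectures (111 B.C. to 11 B.C.).
rate = annual_growth_rate(400_000, 980_000, 100)
print(f"{rate:.2%} per year")  # roughly 0.90% per year
```

A sustained rate of roughly 0.9 percent per year compounds to a 2.45-fold increase over a century.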
Over the course of more than 2,000 years, post-Qin and Han Chinese populations gradually merged with the Yue, shaping what would become the modern Vietnamese identity. This transformation was defined by successive integrations of pre-existing native communities. While the racial enumeration presented here does not claim absolute scientific precision, it serves as a conceptual framework to inspire further inquiry into the evolution of the ancient Vietnamese people.
Efforts to reestablish scholarly objectivity extend beyond Vietnamese historical linguistics. Similar inquiries occur in fields such as biogenetics, which trace racial origins by mapping the genomes of targeted populations. These studies, in turn, will contribute to advancements in linguistic research. (S)
Since antiquity, Muong-speaking communities in mountainous regions have borrowed loanwords from Khmer or Kinh speakers when engaging in trade or striving for prestige, thereby integrating Mon-Khmer terms into the broader Vietnamese linguistic mainstream. This mutual exchange also facilitated the transmission of essential vocabulary among various languages. Even today, these linguistic interactions persist, as waves of northern migrants resettle in Vietnam’s western Central Highlands. Observing speech patterns among Muong villages in Hoàbình Province or Mon-Khmer communities in Gialai and Kontum provinces provides direct insight into this linguistic integration.
In practice, Vietnamese Kinh speakers in lowland areas rarely need to borrow lexicon from Montagnard groups for words they already possess in their language. Even among their close Muong relatives, lexical redundancy often negates the necessity of linguistic borrowing. Instead, the reverse scenario occurs more frequently. Additionally, shared words may be the result of linguistic coincidence rather than direct borrowing. Examples include:
- chồmhỗm ("squat") = Khmer /chorahom/
- chòhõ ("stand") = Khmer /ch ho/
- tầmvong ("stick") = Khmer /dm boong/
- rùmbeng ("fuss") = Khmer /rm poong/
- hầmbàlàng ("mix") = Khmer /ʔhm blang/ (Nguyễn Ngọc San, 1993, p. 45)
Austroasiatic theorists focus primarily on proving shared linguistic roots between Mon-Khmer and Vietnamese rather than scrutinizing sociolinguistic interactions that may explain these similarities. This approach underpins their classification of Vietnamese within the Austroasiatic family. At the same time, they have largely disregarded comparisons between Vietnamese and Sino-Tibetan etymologies, likely due to an insufficient awareness of potential linguistic affiliations. Their analytical framework also neglects other linguistic factors, including structural similarities between Vietnamese and Chinese. Again, that is where our second approach, etymological analysis, comes in.
The Austroasiatic classification compensates for discrepancies in Vietnamese by linking its linguistic development with that of other Viet-Muong languages. This assumption suggests a common ancestral root within the broader Yue linguistic family of southern China. However, given the extensive migration and historical shifts that shaped Vietnamese linguistic evolution, this classification remains incomplete without considering its extensive Sinitic connections.
Popular Vietnamese wisdom wonders: "Is this a case of putting the plow in front of the buffalo in preparation for the paddy field?" (Cáicày đặt trước contrâu?) In other words, is a theory, i.e., the Austroasiatic Mon-Khmer paradigm, being constructed before the supporting data is even plugged in? This evokes the classic analogy of the chicken and the egg, illustrating how logic can be bent to fit a narrative. The folk axiom humorously reminds us of a similarly questionable claim made by some Western grammarians in the early 20th century, that the Vietnamese language lacked grammatical rules until French structures were adopted and adapted, effectively "bringing it into existence."
This perspective misses the point entirely. Grammar does not define a language, just as words alone do not constitute it.
Basic word cognacy in many languages can often be attributed to linguistic contact rather than direct genetic affiliation. For instance, Indo-European numeral systems provide a clear example of semantic contamination, as seen in words such as September and October, which originally denoted the seventh and eighth months, respectively, but now correspond to the ninth and tenth months due to calendar modifications.
Terminologically, the Austroasiatic Mon-Khmer concept was strategically devised to encompass remnants of Indo-Chinese languages found in isolated communities across Vietnam’s western mountainous regions south of the 16th parallel. Additionally, it incorporates dialectal enclaves further north in southern China, spanning areas below the Yangtze River Basin, dating back to prehistoric periods. This classification remains flexible, adjusting to accommodate linguistic elements that might not fit elsewhere, such as Daic or Zhuang linguistic features.
Methodologically, Austroasiatic specialists have adapted Indo-European linguistic frameworks, which may appear scientifically rigorous to novice researchers, to advance Mon-Khmer etymological studies. Generally, Mon-Khmer basic words form the primary Austroasiatic lineage that entered the Vietnamese lexicon much later, particularly following Vietnam’s independence and subsequent territorial expansion. Consequently, nearly all Viet-Muong dialects originating from northern Vietnam’s Red River Delta have been mapped onto southwestern Mon-Khmer languages spoken in regions that did not historically belong to Vietnam before the 12th century. The assertion of Austroasiatic Mon-Khmer roots in Vietnamese thus traces back to a linguistic heritage belonging to a population that had not yet emerged, namely, the later Kinh people. In both historical and linguistic contexts, ancient Viet-Muong resettlers had no direct affiliation with Mon-Khmer speakers before the 2nd century B.C., nor with the Khmer Kingdom, which developed much later around the 10th century.
Conclusion
In the absence of fresh and compelling research, scholars working within the Sino-Tibetan framework have encountered increasing difficulty in gaining academic traction. This chapter establishes a foundation for renewed inquiry into Sino-Tibetan and Vietnamese linguistic connections, which have long remained eclipsed by the prevailing Austroasiatic Mon-Khmer paradigm. That framework continues to be challenged with growing intensity.
Through historical and linguistic analysis, this study emphasizes the need to refine the classification of Vietnamese. By investigating the external forces that shape linguistic narratives, the discussion opens the way for a broader reevaluation of Sinitic and Vietnamese linguistics. This includes the study of phonological development and the identification of core cognates embedded within Sino-Tibetan etymological strata.
ENDNOTES
(音)^ For example,
| Division | Character | Beijing | Cantonese | Sino-Vietnamese | Sino-Korean |
|---|---|---|---|---|---|
| 3 | 珉 | mín | man4 | mân | min |
| 4 | 民 | mín | man4 | dân | min |
(H)^ Figure 2.6
The oft-repeated claim that "a Hùng king named Chiêuvương lived for hundreds of years and had sixty wives" circulates across the internet. Such assertions reflect the extent to which official narratives, often shaped by ruling elites, have imposed mythologized versions of history upon the general public. This phenomenon underscores how state-sponsored historiography, particularly in countries like Vietnam and China, functions as a tool of ideological control.
When fabricated legends are elevated to historical fact, it raises a deeper question: who holds the authority to determine the truthfulness of other foundational matters, such as the origin of the Vietnamese language? In contexts where history is written by those in power, scholarly inquiry must remain vigilant against politicized distortions masquerading as cultural heritage.
(A)^ As previously mentioned, it is just another Western theorization. Our Western scholars keep inventing but they have ignored the historical Yue artifacts because, up until the 18th century, they were reluctant to restore old things, such as historical Chinese linguistics. So they love to create new things, building them from the start.
With exactly the same approach, the author could take similar shortcuts to establish a theory on the origin of today's Europeans, for instance, based entirely on hypothesis. Say, he would solemnly state that their ancestors had come from the Middle Eastern region now called Iraq, where the cradle of the world's oldest civilization once existed. Having said so, he would use a theory initiated by another author as the premise for the 'new theory'. For example, according to Bo Yang (1983-93), the ancestors of the people of Europe were descendants of those who had previously lived there, that is, the creators of the 6,000-year-old civilization in today's Iraq, and they had been forced to flee from attacks waged by Tartars on horseback, who had rapidly advanced from regions of southwestern Siberia and might have permanently settled there. That is what had happened in the ancient mainland of China. That historical detail also explains why the ancestral language of Turkey is shared by ancient northeastern Chinese and both Japanese and Korean, namely, they all originated from the same root of the Altaic linguistic family. As a matter of fact, Chinese history records that the Han armies were frequently defeated by those Tartaric warriors.
Analogously, that is how the Austroasiatic theory has been built, methodologically. In any case, let us not go astray with details of how such a hypothesis could be theorized. Rome was not built in a day, after all.
(C)^ The name "Han" derives from the compound 'Hanzhong' (漢中), where the first Han emperor, Han Gaozu (漢高祖), once held the post of viceroy. He had himself been a subject of the Chu State (楚國), whose populace was made up of the pre-Yue people called "Taic", hence the "original Yue-Chu-Han" people. Readers will see more discussion of, and emphasis on, the Han matter in the succeeding chapters.
(Q)^ In 2025, the government forcefully merged two provinces under an ahistorical designation. Gialai Province, named after the Jrai ethnic group of its highland region, was expanded through the annexation of the historically distinct Bìnhđịnh Province. Additionally, all references to administrative units at the 'huyện' (county) level were systematically eliminated, laying the groundwork for a broader territorial restructuring.
This reconfiguration aligns Vietnam’s administrative divisions with China’s geopolitical framework, foreshadowing a future in which Vietnam would be conveniently absorbed into 'Quảngnam Province' (Greater South), situated alongside China’s existing 'Quảngtây' (Greater West) and 'Quảngđông' (Greater East) provinces. With only the terminological shift from 'tỉnh' (province) to 'huyện' (county) remaining, this transformation would seamlessly integrate Vietnam into China’s administrative map!
These developments have sparked widespread discourse among the Vietnamese populace in 2025, highlighting how politics and nationalism are deeply intertwined with historical linguistics. The trend, wherein political imperatives shape the academic path, has become increasingly pronounced, particularly in the wake of Trump-era rhetoric surrounding nationalism and the 'Make America Great Again' (MAGA) movement. Some Vietnamese who support Trump do so in the belief that he represents a counterbalance to China’s growing influence, an intersection that underscores the broader entanglement between politics and scholarship.
(S)^ In fact, on the genetic (DNA) side, new scientific studies are now readily available on the internet; see, for example, the abstract quoted from http://www.taiwandna.com/VietnamesePage.htm in the textbox below.
HLA-DR and -DQB1 DNA polymorphisms in a Vietnamese Kinh population from Hanoi.
Vu-Trieu A, Djoulah S, Tran-Thi C, Ngyuyen-Thanh T[sic], Le Monnier De Gouville I, Hors J, Sanchez-Mazas A.
Source: Department of Immunology and Physiopathology, Medical College of Hanoi, Vietnam.
Abstract
We report here the DNA polymerase chain reaction sequence-specific oligonucleotide (PCR-SSO) typing of the HLA-DR B1, B3, B4, B5 and DQB1 loci for a sample of 103 Vietnamese Kinh from Hanoi, and compare their allele and haplotype frequencies to other East Asiatic and Oceanian populations studied during the 11th and 12th International HLA Workshops. The Kinh exhibit some very high-frequency alleles both at DRB1 (1202, which has been confirmed by DNA sequencing, and 0901) and DQB1 (0301, 03032, 0501) loci, which make them one of the most homogeneous population tested so far for HLA class II in East Asia. Three haplotypes account for almost 50% of the total haplotype frequencies in the Vietnamese. The most frequent haplotype is HLA-DRB1*1202-DRB3*0301-DQB1*0301 (28%), which is also predominant in Southern Chinese, Micronesians and Javanese. On the other hand, DRB1*1201 (frequent in the Pacific) is virtually absent in the Vietnamese. The second most frequent haplotype is DRB1*0901-DRB4*01011-DQB1*03032 (14%), which is also commonly observed in Chinese populations from different origins, but with a different accessory chain (DRB4*0301) in most ethnic groups. Genetic distances computed for a set of Asiatic and Oceanian populations tested for DRB1 and DQB1 and their significance indicate that the Vietnamese are close to the Thai, and to the Chinese from different locations. These results, which are in agreement with archaeological and linguistic evidence, contribute to a better understanding of the origin of the Vietnamese population, which has until now not been clear.
PMID: 9442802 [PubMed - indexed for MEDLINE]
(I)^ See Ilia Peiros's Some thoughts on the problem of the Austro-Asiatic homeland
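(Y)^ The Shang Dynasty (商朝), also called Yin (殷) or Yin-Shang (殷商) (c. 17th century to c. 11th century B.C.), was the first Chinese dynasty with direct, contemporaneous written records. In its early period the Shang repeatedly moved its capital; for its final 273 years, Pan Geng (盤庚) having fixed the capital at Yin (present-day Anyang, China), it remained there, and for that reason the Shang Dynasty is also called the Yin Dynasty. It is sometimes also referred to as Yin-Shang or Yin.
In the late Shang, Chinese history passed from a half-believed, half-doubted age into the age of reliable recorded history. The Shang was the dynasty that followed the Xia in Chinese history and, compared with the Xia, is supported by richer archaeological finds.
It was founded after Tang of Shang (商湯), chief of the Shang tribe, originally a vassal state of the Xia, led the vassal states to destroy the Xia empire at the Battle of Mingtiao (鳴條). It lasted through seventeen generations and thirty-one kings; its last ruler, King Zhou of Shang (商紂王), was defeated by King Wu of Zhou (周武王) at the Battle of Muye (牧野), and the dynasty fell. (Source: https://zh.wikipedia.org/wiki/商朝)
According to the Vietnamese legend in the Lĩnh Nam chích quái (《嶺南摭怪》), during China's Yin period the Hùng King, for 'failing in the rites of court audience', drew an attack led by the Yin king (the invaders are also called 'Yin bandits', 殷寇; the Đại Việt sử ký toàn thư, Outer Annals, Hồng Bàng Annals (《大越史記全書·外紀·鴻厖紀》), by contrast, records that 'there was an alarm within the realm' in the time of the 'sixth Hùng King'). Just as the great army pressed on the border, a three-year-old child from Phù Đổng village (扶董鄉) in Tiên Du county (仙游縣; also given as Vũ Ninh county, 武寧縣) volunteered and led the Hùng King's army to the front of the Yin lines, 'brandishing his sword and advancing, with the royal troops (the Hùng King's army) following behind'. The Yin king died in battle at the front, and the child thereupon 'took off his clothes and rode his horse up into the sky'. Afterwards, the Hùng King honored the child as 'Phù Đổng Thiên Vương' (扶董天王) and set up a shrine for his worship.
However, the modern Vietnamese scholar Trần Trọng-Kim (陳仲金), taking a matter-of-fact approach, pointed out that the legend of an invasion by China's Yin Dynasty 'is truly erroneous', for the following reasons: 'China's Yin Dynasty was located in the Yellow River basin, that is, the present-day regions of Henan, Zhili, Shanxi and Shaanxi, while the whole Yangtze region was the land of barbarians. From the Yangtze to our northern Vietnam the road is very long. Even if our country at that time had the Hồng Bàng clan as kings, there would undoubtedly have been no order or institutions to speak of; the king would have been no more than something like a lang lord (郎官) of the Mường people. Hence he had no dealings with the Yin Dynasty, so how could a war between them have arisen? Moreover, the Chinese histories record nothing of this matter. What reason is there, then, to say that the Yin bandits were people of China's Yin Dynasty?' Trần Trọng-Kim therefore regarded them as merely 'a band of marauders called the Yin bandits'.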
(Source: https://web.archive.org/web/http://baike.baidu.com/view/1854748.htm)
[UNLESS LACVIET HAD BEEN PART OF THE ANCIENT CHU STATE(?) While these passages concern some legends of Thanh Giong, we focus only on the linguistic aspect of the matter here. However, there exists evidence that the ancient Vănlang state had already been in contact with the Shang Dynasty, in the form of the Shang's 10th century B.C. bronze artifacts found in Hunan Province.] In "Chinese group to bring relic back to Hunan", Lin Qi writes: "A 3,000-year-old Chinese bronze, called min fanglei, will soon return to its birthplace to be reunited with the lid from which it was separated nearly a century ago. The reunion was made possible by a private purchase by Chinese collectors on April 19 in New York. Acclaimed as the "king of all fanglei", the square bronze, which dates to the Shang Dynasty (c.16th century-11th century B.C), served as a ritual wine vessel. It was excavated in Taoyuan, Hunan province, in 1922." (Source: https://web.archive.org/web/http://www.chinadaily.com.cn/cndy/2014-03/21/content_17366159.htm)
(Remarks in between [ ] and the string 'https://web.archive.org/web/' are made and added by dchph.)
(SH)^ Starostin derives this word from Proto-Sino-Tibetan *rij (“many”), cognate with 皆 (OC *kriːj, “all”), 偕 (OC *kriːj, “together with”), as well as Tibetan ཁྲི (khri, “ten thousand”) and Burmese ရဲ (rai:, “police”).
As for the pronoun "they" in place of "she", "he" or "s/he", the author finds the current usage of the singular "they" suitable in many circumstances; it was adopted by the Washington Post in its stylebook in December 2015 and by the US local Examiner newspapers on September 22, 2016. It was also the American Dialect Society's word of the year in 2015.
(水)^ For example, '果 guǒ' is fluid: it appears in VS 'tráicây' 水果 shuíguǒ (fruits), and it could become VS 'kẹo' as a contraction of the normalized 'kẹođường' 糖果 tángguǒ (candies); in each case, though, the syllable derived from '果 guǒ' carries a different meaning. The sound pattern mechanism, then, may not work rigidly in a uniform manner in this case.
(泰)^ The first proposal of a genealogical relationship was that of Paul Benedict in 1942, which he expanded upon through 1990. This took the form of an expansion of Wilhelm Schmidt's Austric phylum, and posited that Tai-Kadai and Austronesian had a sister relationship within Austric, which Benedict then accepted. Benedict later abandoned Austric but maintained his Austro-Tai proposal. This remained controversial among linguists, especially after the publication of Benedict (1975) whose methods of reconstruction were idiosyncratic and considered unreliable. For example, Thurgood (1994) examined Benedict's claims and concluded that since the sound correspondences and tonal developments were irregular, there was no evidence of a genealogical relationship, and the numerous cognates must be chalked up to early language contact.
However, the fact that many of the Austro-Tai cognates are found in core vocabulary, which is generally resistant to borrowing, continued to intrigue scholars. There were later several advances over Benedict's approach: Abandoning the larger Austric proposal; focusing on lexical reconstruction and regular sound correspondences; including data from additional branches of Tai-Kadai, Hlai and Kra; using better reconstructions of Tai-Kadai; and reconsidering the nature of the relationship, with Tai-Kadai possibly being a branch (daughter) of Austronesian. (Source: https://en.wikipedia.org/wiki/Austroasiatic_languages)
(M)^ The Austroasiatic (Austro-Asiatic) languages, in recent classifications synonymous with Mon-Khmer, are a large language family of continental Southeast Asia, also scattered throughout India, Bangladesh, and the southern border of China. The name Austroasiatic comes from the Latin words for "south" and "Asia", hence "South Asia". Among these languages, only Khmer, Vietnamese, and Mon have a long-established recorded history, and only Vietnamese and Khmer have official status (in Vietnam and Cambodia, respectively). The rest of the languages are spoken by minority groups. Ethnologue identifies 168 Austroasiatic languages. These form thirteen established families (plus perhaps Shompen, which is poorly attested, as a fourteenth), which have traditionally been grouped into two, as Mon-Khmer and Munda. However, recent classifications have abandoned Mon–Khmer as a taxon, either reducing it in scope or making it synonymous with the larger family.
Austroasiatic languages have a disjunct distribution across India, Bangladesh and Southeast Asia, separated by regions where other languages are spoken. They appear to be the autochthonous languages of Southeast Asia, with the neighboring Indic, Tai, Dravidian, Austronesian, and Tibeto-Burman languages being the result of later migrations (Sidwell & Blench, 2011). Source: https://en.wikipedia.org/wiki/Austroasiatic_languages
(衁)^ "máu" :(1) hoang, (2) máu 衁 huāng (SV hoang) [ Vh @ M 衁 huāng, nǜ < MC hwaŋ < OC *hmaːŋ | *OC 衁 亡 陽 荒 hmaːŋ | Dialect: Cant. /fong1/ | MC 宕合三平陽微 | FQ 武方 | Shuowen: 血也。从血亡聲。《春秋傳》曰:“士刲羊,亦無衁也。” 呼光切〖注〗《字彙》作𥁃。又 𧖬、𧖭,同。 | Kangxi: 《康熙字典·血部·三》衁:《唐韻》《集韻》《正韻》𠀤呼光切,音荒。《說文》血也。《左傳·僖十五年》士刲羊,亦無衁也。《韓愈詩》衁池波風肉陵屯。《字彙》又入皿部,書作𥁃,非 | Guangyun: 衁 荒 hu光 曉 唐合 唐 平聲 一等 合口 唐 宕 下平十一 唐 xwɑŋ xuɑŋ xuɑŋ xuɑŋ hwɑŋ hʷɑŋ hwaŋ huang1 huang xuang 血也 || Wiktionary: Phono-semantic compound (形聲, OC *hmaːŋ): phonetic 亡 (OC *maŋ) + semantic 血 (“blood”). Etymology: Borrowed from Austroasiatic. Compare Proto-Mon-Khmer *ɟhaam ~ *ɟhiim (“blood”), whence Khmer ឈាម (chiəm, “blood”), Mon ဆီ (chim, “blood”), Proto-Bahnaric *bhaːm (“blood”), Proto-Katuic *ʔahaam (“blood”), Proto-Khmuic *maː₁m (“blood”). Chinese has final -ŋ because initial and final m are mutually exclusive (Schuessler, 2007). This word's rare occurrence in a traditional saying indicates that it is not part of the active vocabulary of OC, but a survival from a substrate language.|| Note: Bodman, Nicholas C. 1980. 'Proto-Chinese and Sino-Tibetan,' (in Frans Van Coetsem et al. (eds.) Contributions to Historical Linguistics) (p.120) : 'An interesting hapax legomenon for 'blood' appears in the Dzo Zhuan which has an obvious Austroasiatic origin: Proto-Mnong *mham, Proto-North Bahnaric *maham, 衁 hmam > hmang > ɣuáng.' || chardb.iis.sinica.edu.tw/char/21663: (1.) 血液。 , (2) 蟹黃。|| Guoyu Cidian: 血液。《說文解字.血部》:「衁,血也。」《左傳.僖公十五年》:「士刲羊,亦無衁也。」 ]
(T)^ 'Genetic' here can be applied to, but is not limited to, roots and linguistic attributes; for example, 疼 téng in "đớnđau" ~ 疼痛 téngtòng, SV đôngthống (painful), 痛 tòng, SV thống (pain) \ OC *doŋw /*ŋw ~ -w ~> "đau" /daw1/ (pain), while 疼 téng in 疼愛 téng'ài, SV đôngái (love) ~> "thươngyêu", or "chân" 腳 jiăo (foot) and "bànchân" ~ 腳板 jiăobăn (in reverse order, "foot; sole of the foot"), etc. Words with the same linguistic roots and peculiarities are absent from the Chinese loanwords in Japanese or Korean.
(日)^ The cases of Japan and Korea, which borrowed Chinese-based vocabulary in the Middle Ages, can be likened to the technical English used in computing today: programming languages, say, have been adopted by most countries in the world, including China, and their terms will become inseparable parts of those countries' languages.
(X)^ On print-media activities, including authors and their writing styles, Nôm scripts and heavy classical Chinese usage, Sino-Vietnamese etyma, etc., and the publication of works in both French and Quốcngữ in the mid-20th century, see Tô Kiều Ngân's Mặc khách Sàigòn (Literati of Saigon). 2013. p. 16.
(Z)^ The Zhuang languages (autonym: Vahcuengh (pre-1982: Vaƅcueŋƅ, Sawndip: 话壮), from vah 'language' and Cuengh 'Zhuang'; simplified Chinese: 壮语; traditional Chinese: 壯語; pinyin: Zhuàngyǔ) are any of various Tai languages natively spoken by the Zhuang people. They are an ethnic rather than linguistic group. Most speakers live in the Guangxi Zhuang Autonomous Region within the People's Republic of China, where Standard Zhuang is an official language. Across the provincial border in Guizhou, Bouyei has also been standardized. Over one million speakers also live in China's Yunnan province.
The sixteen ISO 639-3 registered Zhuang languages are not mutually intelligible without previous exposure on the part of speakers, and some of them are themselves multiple languages. There is a dialect continuum between Wuming and Bouyei, as well as between Zhuang and various (other) Nung languages such as Tày, Nùng, and San Chay of northern Vietnam. However, the Zhuang languages do not form a linguistic unit; any cladistic unit that includes the various varieties of Zhuang would include all the Tai languages.
Citing the fact that both the Zhuang and Thai peoples have the same exonym for the Vietnamese, kɛɛuA1, Jerold A. Edmondson of the University of Texas, Arlington posited that the split between Zhuang and the Southwest Tai languages happened no earlier than the founding of Jiaozhi (交址) in Vietnam in 112 B.C, but no later than the 5th–6th century A.D. (Source: https://en.wikipedia.org/wiki/Zhuang_languages )
(V)^ The name Việtnam (Vietnamese pronunciation: [viə̀tnaːm]) is a variation of "NamViệt" (Chinese: 南越; pinyin: Nányuè; literally "Southern Việt"), a name that can be traced back to the Triệu Dynasty of the 2nd century B.C. The word Việt originated as a shortened form of BáchViệt (Chinese: 百越; pinyin: BǎiYuè), a word applied to a group of peoples then living in southern China and Vietnam. The form "Vietnam" (越南) is first recorded in the 16th-century oracular poem Sấm Trạng Trình. The name has also been found on 12 steles carved in the 16th and 17th centuries, including one at Bao Lam Pagoda in Haiphong that dates to 1558.
Between 1804 and 1813, the name was used officially by Emperor Gia Long. It was revived in the early 20th century by Phan Bội Châu's History of the Loss of Vietnam, and later by the Vietnamese Nationalist Party. The country was usually called Annam until 1945, when both the imperial government in Huế and the Viet Minh government in Hanoi adopted Việtnam. Since the use of Chinese characters was discontinued in 1918, the alphabetic spelling of Vietnam is official. (Sources: https://en.wikipedia.org/wiki/Vietnam)
(林)^"The kingdom of Champa (Campadesa or nagara Campa in Cham and Cambodian inscriptions, written in Devanagari as चंपा; Chăm Pa in Vietnamese, 占城 Chiêm Thành in Hán Việt and Zhàn chéng in Chinese records) was a Hindu and Buddhist kingdom that controlled what is now Vietnam from approximately the 7th century through to 1832.
The Cham people are the successor of this kingdom. They speak Cham, a Malayo-Polynesian language.
Champa was preceded in the region by a kingdom called Lin-yi (林邑, Middle Chinese *Lim Ip) or Lâm Ấp (Vietnamese) that was in existence from 192 A.D.; the historical relationship between Lin-yi and Champa is not clear. Champa reached its apogee in the 9th and 10th centuries. Thereafter, it began a gradual decline under pressure from Đại Việt, the Vietnamese polity centered in the region of modern Hanoi. In 1832, the Vietnamese emperor Minh Mạng annexed the remaining Cham territories. Mỹ Sơn, a former religious center, and Hội An, one of Champa's main port cities, are now heritage listed."
[...]In the Cham–Vietnamese War (1471), Champa suffered serious defeats at the hands of the Vietnamese, in which 120,000 people were either captured or killed, and the kingdom was reduced to a small enclave near Nha Trang with many Chams fleeing to Cambodia. (Source: https://en.wikipedia.org/wiki/Champa)
(W)^ Phùng Nguyên culture (2,000–1,500 B.C.). Đồng Đậu culture (1,500–1,000 B.C.). Gò Mun culture (1,000–800 B.C.). Đông Sơn culture (1,000 B.C.–100 A.D.). Iron Age: Sa Huỳnh culture (1,000 B.C.–200 A.D.). Óc Eo culture (1–630 A.D.). The Gò Mun culture (c. 1,100–800 B.C.) was a culture of Bronze Age Vietnam during the Hong Bang reigns. (Source: https://en.wikipedia.org/wiki/Gò_Mun_culture)





