Framing Vietnamese within Yue‑Taic strata

by dchph in collaboration with Copilot

Comparative wordlists are indispensable tools of historical linguistics. They allow us to align forms across languages, test hypotheses of cognacy, and reconstruct proto‑forms. Yet they are also treacherous: superficial resemblance can lead to false conclusions unless rigorous method is applied. In the case of Vietnamese, where Sinitic, Austroasiatic, and Yue‑Taic strata overlap, the danger of misclassification is especially acute.

1. The promise of wordlists

From the 19th century onward, scholars compiled parallel lists of Vietnamese, Chinese, and Mon‑Khmer words. These lists revealed striking correspondences: Vietnamese mẹ “mother” with Khmer mday; Vietnamese đầu “head” with Chinese 頭 tóu; Vietnamese nước “water” with Proto‑Tai *nam. Such comparisons suggested that Vietnamese was not a simple isolate but a convergence zone. Wordlists thus provided the first evidence for the layered nature of Vietnamese.

2. Methodological principles

Sound correspondences: True cognates show regular phonological patterns, not random similarity.
Semantic stability: Core vocabulary (body parts, kinship, natural elements) is more reliable than cultural terms.
Register awareness: Vietnamese often preserves both a vernacular form and a Sino‑Vietnamese doublet; both must be tracked.
Areal diffusion: Some similarities reflect borrowing across neighbors, not shared ancestry.
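
The first principle can be made concrete with a small counting exercise. The sketch below is only an illustration under stated assumptions: the word pairs are invented toy forms for two hypothetical languages A and B (not Vietnamese, Chinese, or any real data), only the initial segment is compared, and the threshold of two attestations is an arbitrary choice. It shows the basic idea that a pairing of sounds counts as a correspondence only when it recurs across the list.

```python
from collections import Counter

# Invented toy forms for two hypothetical languages A and B.
# These are NOT real language data; they only illustrate the counting step.
TOY_PAIRS = [
    ("pa", "ba"),
    ("pi", "bi"),
    ("pu", "bu"),
    ("ta", "da"),
    ("ti", "di"),
    ("ka", "sa"),  # an isolated look-alike with no recurring pattern
]


def initial_correspondences(pairs):
    """Count how often each (initial in A, initial in B) pairing occurs."""
    return Counter((a[0], b[0]) for a, b in pairs)


def recurrent(pairs, min_count=2):
    """Keep only pairings attested at least min_count times (assumed threshold)."""
    counts = initial_correspondences(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    for (a, b), n in sorted(recurrent(TOY_PAIRS).items()):
        print(f"{a} : {b}  attested {n} times")
```

In this toy list the pairings p : b and t : d recur, while k : s occurs once and is filtered out. A real study would compare every segment position, work from phonemic transcriptions, and still have to separate inherited correspondences from regular patterns created by borrowing.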

3. Comparative tables

The following table illustrates how wordlists must be read critically:

Gloss	Vietnamese	Sino‑Vietnamese	Chinese (OC/MC)	Mon‑Khmer	Proto‑Tai	Notes
head	đầu / tróc	thủ	頭 tóu < OC *duʔ	Khmer tpoal	*thaw	đầu resembles 頭, but phonology suggests borrowing; tróc may preserve an older layer.
tooth	răng	linh	齡 líng < MC lɛjŋ < OC *reːŋ	Khmer t’mieng, Mon rang	*hnɯŋ	Austroasiatic alignment is stronger; the literary Sino‑Vietnamese word for “tooth” is nha (牙), while linh is the reading of 齡.
sky	trời	thiên	天 tiān < OC *l̥ˤin	—	*hlɯi	trời may reflect Yue‑Taic mediation; thiên is a learned borrowing.

4. False cognates

Wordlists can mislead when superficial similarity masks different origins. For example, Vietnamese sóc “squirrel” resembles Chinese 松鼠 sōngshǔ, but the resemblance is coincidental: sóc is native, while 松鼠 is a descriptive compound (“pine‑rat”).¹
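
The point can be illustrated with a surface‑similarity score. The following is a minimal sketch, assuming that diacritics are stripped by Unicode decomposition and that similarity is measured with the generic string matcher from Python's standard library; neither choice is a linguistic method, and the function names are hypothetical. It shows that a score of this kind measures resemblance only and carries no information about origin.

```python
import difflib
import unicodedata


def strip_diacritics(text):
    """Remove combining marks (tone and vowel diacritics) after NFD decomposition.

    Letters without a canonical decomposition, such as Vietnamese đ,
    are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def surface_similarity(a, b):
    """Return a 0..1 ratio of shared characters between two romanized forms."""
    return difflib.SequenceMatcher(
        None, strip_diacritics(a), strip_diacritics(b)
    ).ratio()


if __name__ == "__main__":
    # The pair discussed in the text: Vietnamese sóc and Mandarin sōngshǔ.
    score = surface_similarity("sóc", "sōngshǔ")
    print(f"surface similarity: {score:.2f}")
```

Whatever value the score takes, it cannot distinguish a cognate from a chance look‑alike: a nonzero score for sóc and sōngshǔ reflects a shared initial sequence, not a shared etymon. That decision still rests on recurrent correspondences and on the internal structure of each word.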

5. Semantic grids

To avoid misclassification, we must map semantic domains systematically. The following grid shows how kinship terms stratify:

Gloss	Sinitic-Vietnamese	Sino‑Vietnamese	Notes
mother	mẹ < mợ < vú < u	mẫu, mô	母 mǔ, mú, wǔ, wú < MC məw < OC *mɯʔ
father	bố	phụ, phủ	父 fù, fǔ < MC pio < OC *paʔ, *baʔ
wife	vợ < bụa	phụ	婦 fù < MC buw < OC *bɯʔ
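
A grid of this kind can also be kept as structured data so that each register is tracked separately. The sketch below is one possible encoding, not the authors' own tooling: the class name, field names, and the helper function are hypothetical, and only the first (modern) vernacular form of each row is carried over from the table. It shows one check the grid makes easy, namely finding Sino‑Vietnamese readings shared by more than one gloss, which therefore need the Chinese character to disambiguate them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GridEntry:
    domain: str             # semantic domain, e.g. "kinship"
    gloss: str              # English gloss
    vernacular: str         # first vernacular form listed in the table
    sino_vietnamese: tuple  # literary reading(s) from the table
    character: str          # Chinese character given in the Notes column


# Rows taken from the kinship table above.
GRID = [
    GridEntry("kinship", "mother", "mẹ", ("mẫu", "mô"), "母"),
    GridEntry("kinship", "father", "bố", ("phụ", "phủ"), "父"),
    GridEntry("kinship", "wife", "vợ", ("phụ",), "婦"),
]


def shared_readings(grid):
    """Map each Sino-Vietnamese reading used by several glosses to those glosses."""
    by_reading = defaultdict(list)
    for entry in grid:
        for reading in entry.sino_vietnamese:
            by_reading[reading].append(entry.gloss)
    return {r: glosses for r, glosses in by_reading.items() if len(glosses) > 1}


if __name__ == "__main__":
    for reading, glosses in shared_readings(GRID).items():
        print(reading, "is shared by:", ", ".join(glosses))
```

For the three rows above, the only shared reading is phụ, which the table gives for both 父 “father” and 婦 “wife”. Keeping the character alongside each reading prevents the two entries from being conflated when the list grows.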

6. Conclusion

Comparative wordlists are indispensable, but they must be read with caution. Vietnamese demonstrates how easily false cognates can mislead, and how register stratification complicates classification. Only by combining phonological correspondences, semantic stability, and historical context can we use wordlists responsibly.

Key takeaways:

Wordlists are powerful but dangerous if read superficially.
Sound correspondences and semantic stability are the gold standard for identifying cognates.
Vietnamese often preserves both vernacular and Sino‑Vietnamese forms, which must be tracked separately.
False cognates are common; caution and rigor are essential.

Footnotes

¹ Handel, Zev (1998). On false cognates in Sino‑Vietnamese comparison. Example of sóc vs. 松鼠.

References

Baxter, William H.; Sagart, Laurent (2014). Old Chinese: A New Reconstruction. Oxford University Press.

Handel, Zev (1998). “False Cognates in Sino‑Vietnamese Studies.” Journal of Chinese Linguistics.

Luce, Gordon H. (1959). Comparative wordlists of Vietnamese, Mon, and Chinese. Rangoon University Press.

Sunday, October 12, 2025

The Comparative Wordlists — Method and Cautions