Sinitic-Vietnamese : APPENDICES

APPENDIX A

Examples of some polysyllabic and dissyllabic vocabularies

The following tabulation of randomly selected wordlists is meant to help readers (1) judge whether Vietnamese is a dissyllabic language; (2) judge whether it should be written in the natural way, by joining associated syllables into a word, which in turn helps explain why a Chinese dissyllabic word, on passing into its Vietnamese equivalent, may not follow the same old pattern as monosyllabic words do; (3) make sense of elaborations on credible findings by other authors that support the postulated genetic affiliation of those basic Vietnamese words that are cognate to both Sino-Tibetan and Chinese etymologies; and (4) use the supplementary materials as tools for approaching Vietnamese and Chinese historical phonology.

I) Composite words:

Ngáoộp, yếuxìu, ủdột, giómáy, lộnxàngầu, liềntùtì, bủnxỉn, rửngmỡ, lậtđật, bệurệu, mốckhính, thúiình, bệrạc, bêtha, chìnhình, đẩyđà, thắcmắc, trịchthượng, trịchbồlương, ởtruồngnhồngnhộng, trầntruồng, tòmò, tấtbật, bứcxúc, bứcrức, nóngtánh, nóngmáy, nónglòng, mongngóng, táymáy, tấtbật, bângkhuâng, bộpchộp, bồihồi, hữnghờ, phảngphất, mơhồ, chạngvạng, chậtvật, khúcmắc, ngờvực, bạttai, tuyệtcúmèo, háchxìxằng, hộtxíngầu, tứđỗtường, sạchbách, yêuđương, thươnghại, ấmcúng, làmbiếng, tộinghiệp, mồcôi, goábụa, híhửng, thấpthỏm... càphê, càrem, càpháo, càlăm, càkêdêngỗng, lacà, càgiựt, càgật, càrá, càrà, càrỡn, càrờ, Càná, càna, càtàng, càchớncàcháo, càtrậtcàduột, càrăng, càdựt, càràng, càlắc, càrịchcàtang, càtàng, càtửng... cùlần, cùlao, cùlét, cầncù, lùcù, cùrũ... hoasoan, hoavôngvang, hoacứtlợn, hoamắt, tàihoa, hoatay, hoaliễu, đàohoa, hoahoèhoasói, bahoa, bahoachíchchoè ... bagai, batrợn, tàiba, ađồngbảyđổi, chúangôiba, hộtxíngầu, caochótvót, baphải, bahồi, bồhòn, bồcâu, baquân... táhoảtamtinh, cứuhoả, hoảlò, hoảdiệmsơn, nhảydù, bếpmúc, baola, thừamứa, đằmthắm, nhạtthếch, chánphèo, ếẩm... châuchấu, bươmbướm, đuđủ, chuồnchuồn, lạcđà, sưtử... dưahấu, dưagan, bíđao, khổqua... 
trảđủa, chénđũa, bùlubùloa, sàbát, viếtlách, xấcbấcxangbang, tầmbậytầmbạ, tầmphào, bảvơbảláp, trớtquớt, tầmgửi, contầm, bánhtầm, bánhít, bánhdây, bánhdày, bánhxe, coicọp, bắtcóc, bắtnạt, bắtđền, đánhcá, cábóng, cháphi, cátô, cáhồng, cáthu, cáẻm, cáchẻm, cáchép, cángừ, cáđộ, đánhđáo, độcđáo, laỏmtỏi, chầndần, càmràm, cằnnhằn, phànnàn, nhõngnhẽo, tiềnnong, ruồngrẫy, obế, tângbốc, bặmtrợn, tréocẳngngỗng, baquexỏlá, thảgiàn, diệuvợi, xaxăm, xaxôi, xalắcxalơ, sạchbách, bângkhuâng, mônglung, ngỡngàng, ngơngác, tọcmạch, heomay, cùichỏ, chânmày, bảvai, chómực, chómá, chóđẻ, nhàquê, nhàvăn, nhàngủ, nhàmát, nhàtu, nhàlao, laocông, laophổi, mộttay, taychơi, tayvợt, tàytrời, chẫmrãi, gấprút, lẹlàng, tệlậu, cửasổ, maymắn, hấphối, dốtnát, thơngây, ủmtỏi, đắngngắt, giàusụ, nghèonàn, chậmrì, lềmề, nhẹhẫng, bãithama, gạocội, ngáoộp, biểnlận, sinhnhai, etc.

NOTE: "Composite" is used here for a word in which one element is closely affixed to a radical and which cannot be broken into separate syllables that are used independently, because one or both of them are bound morphemes, even though in the original Chinese form each character can stand alone as a word with a meaning of its own. In this category the same etymon appears in Vietnamese as a single syllable that cannot function by itself: it is merely a morpheme, may carry no meaning of its own, and must combine with another syllable to make a complete word. Composite words of this kind are numerous in Vietnamese and are commonly used in daily life.

To get a clearer picture of what this means, compare English words of the same nature: windy, curious, vague, pitiful, lovely, creamy, marvelous, tomato, salmon, unique, butterfly, kitchen, handy, camel, melon, excited, handsome, etc. Can you break each of these words into separate syllables and still use each syllable independently with its original meaning? Of course you cannot.

II) Dissyllabic compound words:

Mồhôi, nướcmắt, nhanhchóng, nóngnực, nổigoá, nhàthờ, trườnghọc, giấybút, sinhđẻ, vợchồng, chamẹ, anhem, nhàcửa, trờiđất, đồngruộng, tiềnbạc, bànghế, chuacay, maquỷ, thầnthánh, trờiphật, bảngđen, sôngnúi, nhànước, máybay, sânbay, nhàmáy, ghếngồi, bànviết, giườngngủ, phòngăn, quẹtlửa, máylạnh, tủlạnh, máyhát, lýlẽ, chờđợi, ănuống, rượuchè, cờbạc, cấmkỵ, cẩuthả, etc.

NOTE: Just like compound words in English, e.g., blackboard, therefore, airplane, moreover, billboard, airport, bookworm, football, baseball, notebook, software, hardware, honeymoon, plywood, handicraft, aircraft, shipyard, graveyard, grapefruit, jackfruit, pineapple, etc., Vietnamese compound words exist in great numbers. Each syllable in such a word can also be used independently as a word.

III) Reduplicative polysyllabic and dissyllabic compound words or binoms:

Lễlạc, tếtnhất, chắcchắn, lạnhlẽo, mấpmé, nướcnôi, nóngnẩy, nựcnội, caucó, cầukỳ, buồnbã, laođao, lậnđận, lếtthết, lúclắc, làulàu, vấtvả, tấttả, vậtvã, văngvẳng, lacà, dỡẹt, dỡình, ỡmờ, vờvĩnh, hữnghờ, chắcchiu, chătchíu, mằnmặn, ngọtngào, ngánngẫm, khờkhạo, giàgiặn, xaxôi, nặngnề, nhẹnhàng, tươmtất, rấmrớ, rầmrộ, rưngrưng, rộnràng, rùrì, rúrí, rờrẫm, rậmrực, tùtúng, phâyphây, phephẩy, phăngphăng, mêmẫn, chămchỉ, lolắng, mắcmỏ, rẻrúng, ấmức, viễnvông, mơmàng, sâusắc, đenđuá, hồnghào, hoahoè, dạidột, sờsoạn, mòmẫm, hẹphòi, rộngrãi, ấmức, thẳngthừng, quạuquọ, chắcchắn, vắngvẻ, côicút, lỗlã, dưdã, đauđớn, luônluôn, mêmãi, nhanhnhẩu, runrẩy, lắclư, lườilĩnh, liềnliền, nhạtnhẽo, nhạtnhoà, nhútnhát, dạndĩ, mạnhmẽ, nhẹnhàng, nặngnề, thấplètè, sạchsànhsanh, đồngxuteng, liềntùtì, lấplalấplững, bùlubùloa, híhahíhửng, xíxaxíxọn, lúngtalúngtúng, càrịchcàtang, lấplalấplững, bùlubùloa, híhahíhửng, tấtatấtửng, xấcbấcxangbang, lấplalấplửng, etc.

NOTE: Reduplicative compound words are made of a one-syllable word plus a variation of it with a slight change in sound. Words of this type render a subtle change in the meaning of the radical. The affix is usually a reduplicative element that differs from the original word in tone, initial, or ending and comes before or after the radical. Comparable structures in English are "childish", "slowly", "talkative", "handy", "continuous", "fashionable", "horrendous", "fabulous", "excited", "exciting", "initial", "vital", "likewise", "shaking", "shaky", "lonesome", "troublesome", "mimicry", etc. As in those similarly structured English words, the affixed syllable or add-on component cannot be used independently.

IV) Polysyllabic "Vietnamized" neighboring Mon-Khmer and Daic words:

These are words made up by indiscriminately combining elements of Sinitic-Vietnamese, Sino-Vietnamese, and other indigenous words.

Bơsữa, sữatuơi, sữachua, dưachua, tráibơ, tráisu, súbắp, diễnsô, bầusô, bánhbìtquy, bánhít, bánhchưng, bánhxèo, nămhợi, nămgà, tuổidậu, tuổihợi, làmthịt, làmcao, làmtàng, làmlẻ, làmăn, mầnăn, nângcao, lênmặt, lêngiọng, xuốnggiọng, xuốngnước, câucú, thathiết, thêthảm, tủithân, mồcôi, đầunậu, băngđảng, xốngáo, súngống, daobúa, hồhỡi, tụctằn, cánúc, hầubao, đầuđuôi, đóikhát, sấmsét, tàylay, tèmlem, xegắnmáy, câulạcbộ, hầmbàlần, tạppílù, nồiniêusoongchả, đaotobúlớn, nởmàynởmặt, bàconchòmxóm, đếnhẹnlạolên, bèogiạtmâytrôi, anhemcộtchèo, anhemcôcậu, anhemthúcba, mẹchồngcondâu, đèocaogióhút, tiềnrừngbạcbể, trànggiangđạihải, vòngvotamquốc, hốilộđútlót, etc.

V) Polysyllabic Vietnamized English and French words:

There is no doubt that spelling foreign names syllable by syllable, such as 'Xan Phờ-ran-xít-cô' for 'San Francisco', is the most ill-advised way to write them; readers will therefore not find such odd spellings in this paper, only commonly accepted forms written as naturally as possible.

Càphê, càrem, xecamnhông, côngtennơ, đầukéocôngtennơ, phíchnước, sônước, canô, thùngphuy, lôcanh, origin, gin, gen, building, oánhtùtì, bíttết, lagu, sàlách, nướcsốt, xàbông, sôcôla, suwinggum, sabôchê, dămbông, phôma, yaua, vôlăng, mêgabai, internet, website, software, mashup, interview, rôbô, radiô, lade, photocópy, cọppi, ốcxygen, cạtbônát, đềphô, dốpdiếc, vốtka, virút, cờlê, mỏlết, tivi, video, dĩacompact, galăng, đôla, vila, tắcxi, xebuýt, phẹcmatuya, gạcmănggiê, cômpa, tráibôm, bômhơi, dăngxê, câulạcbộ, vacăng, ôtô, nhàga, ôten, dầuxăng, bùlon, cáisoong, chơigem, mànhìnhled, trượtpaten, chạymaratông, menbo, hợpgu, hợprơ, hămbơgơ, mesừ, mađam, xinêma, tuydô, kílômét, centimét, milimét, xebuýt, xemôtô, môtơ, đènmăngxông, xyláp, phạcmaxi, đốctờ, đìaréctơ, áoghilê, bộcomplê, ôpạclơ, micờrô, phắctuya, trảbiu, ốcxíthoá, sida, aid, căngxe, buyarô, rờmọt, móocchê, súngcanhnông, tủbuýpphê, chạyápphe, nhàbăng, trảcheck, sờnáchba, mìncơlaymo, bốtdờsô, aláchsô, ạctisô, căngtin, míttinh, Ácănđình, Hoathịnhđốn, Balê, Ănglê, Vaticăn, sôviết, bônxêvích, gạcđờco, gácgian, trứngốplết, hộtgàốpla, áobànhtô, áomăngtô, bugi, épphê, ácxít, átpirin, kýninh, đờmi, đờmigạcxông, đíplôm, đíplôma, găngtơ, ápphích, táplô, bancông, salông, khănmùxoa, lêmônát, rượurum, rượuvan, đườngrầy, xetăng, tănglều, miniduýp, carô, súngrulô, xerulô, mọtphin, xìphé, pháctuya, côngtắc, côngtơ, rôbinê, marisến, phôngten, bôlêrô, tănggo, rumba, phăngtadi, phuộcxét, xìcăngđan, sanđan, bigiăngtin, phúlít, batong, măngsông, đènpin, rờmọt, rờmoọc, boongtàu, tíchkê, bánsôn, đitua, vãira, đítcô, đăngxe, lăngxê, pianô, viôlông, honđa, trumpét, càtômát, xúchxích, patê, tráibơ, đắcco, xêrum, xiarô, xêry, băngrôn, băngnhạc, đồlen, rumba, bếpga, môđen, môđẹc, xilô, nồixúpde, pađờxuy, sơmi, balô, búpbê, tắcxi, buộcboa, côngtra, dềpô, áopull, quầngin, jắtkết, zêrô, sốpphơ, xếplớn, pátpo, vida, bida, côcacôla, pépsi, vôlăng, ămpiya, ampe, kílôoát, tăngdơ, xuỵtvôntơ, cátsét, ghisê, nhàbăng, tivi, gàrôti, chơisộp, 
kháchsộp, compiutơ, còmmăng, tíchkê, díppô, san-phơ-ran-xit-cô, etc.

NOTE: These variants of words of French and English origin are spelled in Vietnamese orthography. Even though the words in this class are limited in number, they best represent the polysyllabic combining formation. They are loanwords of "foreign" origin. Their syllables are integral parts attached to one another and certainly cannot be used as independent words, even though a given Vietnamese syllable may by itself mean something unrelated. How many can you recognize besides the clumsy 'san-phơ-ran-xít-cô'?

The implication of these examples is that if dissyllabic Sino-Vietnamese words are seen as "foreign" loanwords in the Vietnamese language, then their nature and characteristics are virtually the same, not to be separated.

VI) Culturally-accented Vietnamese words of Chinese origin:

ănđòn (deserved punishment) 挨打: áidǎ

ăntiền (win a bet) 贏錢: yíngqián

ănnhậu (have a drink) 應酬: yìngchóu

ănmày (beggar) 要飯: yàofàn

dêxồm (lecherous) 婬蟲: yínchóng

hẹnhò (dating) 約會: yuēhuì

đánhcướp (rob) 打劫: dǎjié

đánhbài (play cards) 打牌: dǎpái

tầmbậy (tầmbạ, sàbát) 三八: sānbā

chánngán (sick of) 厭倦: yànjuàn

bậtcười (laugh) 發笑: fāxiào

bậtkhóc (cry) 發哭: fākū

banngày (daytime) 白日: báirì

bồcâu (pigeon) 白鴿: báigē

chạngvạng (at dusk) 旁晚: bàngwǎn

cảgan (daring) 大膽: dàdǎn

khờkhạo (foolish) 傻瓜: shǎguā

ấmcúng (cozy) 溫馨: wēnxīn

muárối (puppetry) 木偶戲: mù'ǒuxì

xinlỗi (apologize) 請罪: qǐngzuì

xinchào (hello) 見過: jiànguò

chắcchắn (certainly) 確定: quèdìng

đưađón (see off and pick up) 接送: jiēsòng

chờđợi (expect) 期待: qídài

yêuđương (love) 愛戴: àidài

thươngyêu (affection) 疼愛: téng'ài

khôngdámđâu (it is not so) 不敢當: bùgǎndāng

banngàybanmặt (in broad daylight) 青天白日: qīngtiānbáirì

đấttrờichứnggiám (Heaven and the Earth be the witnesses) 天地作證: tiāndìzuòzhèng

trờibấtdunggian (God punish bad people) 天不容姦: tiānbùróngjiān

langbạtkỳhồ (take on an adventure) 狼跋其胡: lángbáqíhú

nhưcágặpnước (like a fish back in water) 如魚得水: rúyúdéshuǐ, etc.

NOTE: The official Pinyin spellings of the Chinese words above are always written in combined form because the words are polysyllabic in nature; the only flaw is that the diacritic marks fall on the wrong vowel, e.g., 醉酒 zuìjǐu vs. 'sayrượu' (drunk), 真牛 zhēnníu vs. 'chơingầu', 垂柳 chuílǐu vs. 'liễurũ' (willow), 拜求 bàiqíu vs. 'váicầu' (prayer), etc. These basic and not-so-basic words share the same roots in Chinese and Vietnamese; together with the Sino- and Sinitic-Vietnamese vocabularies that are indispensable in the Vietnamese language, they imply that just as Chinese is classified as a polysyllabic language, so should Vietnamese be.
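
As a point of reference for the remark about diacritics: standard Hanyu Pinyin places the tone mark by a fixed rule, namely on a or e if either is present, on the o of ou, and otherwise on the last vowel of the syllable. Under that rule 酒 is written jiǔ, whereas the spellings quoted in the note (jǐu, níu, lǐu, qíu) place the mark on the i. The Python sketch below illustrates the standard rule only; the function names `mark_tone` and `mark_word` and the numbered-tone input format are illustrative assumptions, not something taken from the paper.

```python
# Tone-marked forms of each Pinyin vowel, indexed by tone 1-4.
TONED = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}


def mark_tone(syllable: str, tone: int) -> str:
    """Place the mark for tone 1-4 on a plain Pinyin syllable.

    Any other tone value (e.g. 0 for the neutral tone) leaves it unmarked.
    """
    if tone not in (1, 2, 3, 4):
        return syllable
    if "a" in syllable:            # a always takes the mark
        pos = syllable.index("a")
    elif "e" in syllable:          # then e
        pos = syllable.index("e")
    elif "ou" in syllable:         # in ou the mark goes on o
        pos = syllable.index("o")
    else:                          # otherwise the last vowel (iu -> u, ui -> i)
        vowels = [i for i, ch in enumerate(syllable) if ch in TONED]
        if not vowels:
            return syllable
        pos = vowels[-1]
    marked = TONED[syllable[pos]][tone - 1]
    return syllable[:pos] + marked + syllable[pos + 1:]


def mark_word(syllables):
    """Join (syllable, tone) pairs into one word written in combined form."""
    return "".join(mark_tone(s, t) for s, t in syllables)


print(mark_word([("zui", 4), ("jiu", 3)]))   # zuìjiǔ
print(mark_word([("chui", 2), ("liu", 3)]))  # chuíliǔ
```

Comparing this output with the note's zuìjǐu and chuílǐu shows exactly which vowel is at issue: the two conventions differ only in syllables ending in -iu.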

x X x

APPENDIX B

The International Phonetic Alphabet in Unicode

APPENDIX C

Examples of some variable sound changes:

Thuận Nghịch Độc by Duc Tran

The author, a commentator and translator for Radio Free Asia (RFA) as of 2019, constructs an etymological analogy based on a poem by Phạm Thái (1777-1813) written in the "Thuận Nghịch Độc" form, that is, a poem that can be read both forward and backward. The standard forward reading gives the Sino-Vietnamese sounds:

青春鎖柳冷蕭房 Thanh xuân khóa liễu lãnh tiêu phòng
錦軸停針礙點妝 Cẩm trục đình châm ngại điểm trang
清亮度蘚浮沸綠 Thanh lượng độ tiên phù phất lục

淡曦散菊彩疏黃 Đạm hy tán cúc thái sơ hoàng

情痴易訴簾邊月 Tình si dị tố liêm biên nguyệt

夢觸曾撩帳頂霜 Mộng xúc tằng liêu trướng đỉnh sương

箏曲強挑愁緒絆 Tranh khúc cưỡng khiêu sầu tự bạn

鶯歌雅詠閣蕭香 Oanh ca nhã vịnh các tiêu hương

As in old Chinese-based Nôm writings, the same lines can also be read in reverse, this time with Sinitic-Vietnamese sounds (naturally some Sino-Vietnamese sounds, an indispensable part of the Vietnamese vocabulary, are included as well):

香蕭閣詠雅歌鶯 Hương tiêu gác vắng nhặt ca oanh
絆緒愁挑強曲箏 Bận mối sầu khêu gượng khúc tranh
霜頂帳撩曾觸夢 Sương đỉnh trướng gieo từng giục mộng

月邊簾訴易痴情 Nguyệt bên rèm, tỏ dễ si tình

黃疏彩菊散曦淡 Vàng tha thướt, cúc tan hơi đạm

綠沸浮蘚度亮清 Lục phất phơ, rêu đọ rạng thanh

妝點礙針停軸錦 Trang điểm ngại chăm, dừng trục gấm

房蕭冷柳鎖春青 Phòng tiêu lạnh lẽo khóa xuân xanh.
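
At the level of the characters, the relation between the two readings is purely mechanical: taking the eight lines from last to first, and each line from its last character to its first, turns the forward text into the reverse text. The short Python sketch below checks this for the characters quoted above; the names `FORWARD`, `REVERSE`, and `read_backwards` are illustrative assumptions.

```python
# The eight lines in the forward (Sino-Vietnamese) reading order.
FORWARD = [
    "青春鎖柳冷蕭房",
    "錦軸停針礙點妝",
    "清亮度蘚浮沸綠",
    "淡曦散菊彩疏黃",
    "情痴易訴簾邊月",
    "夢觸曾撩帳頂霜",
    "箏曲強挑愁緒絆",
    "鶯歌雅詠閣蕭香",
]

# The eight lines as quoted for the reverse (Sinitic-Vietnamese) reading.
REVERSE = [
    "香蕭閣詠雅歌鶯",
    "絆緒愁挑強曲箏",
    "霜頂帳撩曾觸夢",
    "月邊簾訴易痴情",
    "黃疏彩菊散曦淡",
    "綠沸浮蘚度亮清",
    "妝點礙針停軸錦",
    "房蕭冷柳鎖春青",
]


def read_backwards(lines):
    """Reverse the order of the lines and the characters within each line."""
    return [line[::-1] for line in reversed(lines)]


print(read_backwards(FORWARD) == REVERSE)  # True
```

Since the characters are identical in both directions, every difference between the two transcriptions comes from the choice of reading for each character, which is what the comparison that follows exploits.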

From these readings we can clearly see the relations between the Sino-Vietnamese and Sinitic-Vietnamese words:

Các = Gác
Cẩm = Gấm
Cưỡng = Gượng
Liêm = Rèm, etc.

Taking these as a starting point, we can apply the same patterns to other words:

Cận = Gần
Can = Gan
Cân = Gân
Cấp = Gấp
Cổn = Gợn
Các = Gác
Kê = Gà
Ký = Gửi
Kỵ = Ghét
Ký = Ghi
Tử = Chết
Tự = Chùa
Tự = Chữ (cái)
Thanh = Xanh
Vũ = Múa
Vũ = Mưa
Vân = Mây
Vạn = Muôn
Vọng = Mong
Võng = Mạng

and so on.
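
The pairs listed above can be tallied by initial consonant to make the recurring correspondences explicit. The Python sketch below is only a rough illustration: the onset list is a simplified assumption about Vietnamese spelling (it works on letters, so it keeps c and k apart even though they write the same sound), only a subset of the pairs is included, and the names `onset` and `tally` are hypothetical.

```python
from collections import Counter

# (Sino-Vietnamese, Sinitic-Vietnamese) pairs taken from the lists above.
PAIRS = [
    ("Các", "Gác"), ("Cẩm", "Gấm"), ("Cưỡng", "Gượng"), ("Cận", "Gần"),
    ("Can", "Gan"), ("Cân", "Gân"), ("Cấp", "Gấp"), ("Kê", "Gà"),
    ("Tử", "Chết"), ("Tự", "Chùa"), ("Thanh", "Xanh"),
    ("Vũ", "Múa"), ("Vũ", "Mưa"), ("Vân", "Mây"), ("Vạn", "Muôn"),
    ("Võng", "Mạng"),
]

# Multi-letter onsets, longest first so that "ngh" is tried before "ng".
CLUSTERS = ("ngh", "ng", "gh", "gi", "kh", "nh", "ph", "th", "tr", "ch", "qu")
SINGLE = "bcdđghklmnpqrstvx"


def onset(word: str) -> str:
    """Return the spelled initial consonant of a syllable ('' if vowel-initial)."""
    w = word.lower()
    for cluster in CLUSTERS:
        if w.startswith(cluster):
            return cluster
    return w[0] if w[0] in SINGLE else ""


def tally(pairs):
    """Count how often each (Sino-Vietnamese, Sinitic-Vietnamese) onset pair occurs."""
    return Counter((onset(sv), onset(vs)) for sv, vs in pairs)


for (sv, vs), n in tally(PAIRS).most_common():
    print(f"{sv}- ~ {vs}-: {n}")
```

With this subset the tally gives c- ~ g- seven times, v- ~ m- five times, t- ~ ch- twice, and k- ~ g- and th- ~ x- once each. A count like this shows how regular a proposed correspondence is within the sample, but says nothing about monosyllabic versus dissyllabic forms, which is the limitation the NOTE below raises.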

NOTE: In the examples above, and in his comments on Chinese ~ Vietnamese cognates, Duc Tran seems without exception to consider only the Sinitic-Vietnamese sound changes in comparison with Sino-Vietnamese, on a one-to-one correspondence between monosyllabic words, even though he does mention the correlation of those Vietnamese sounds with Mandarin sounds: "Cái lạ ở chỗ các ví dụ trên phần theo chiếu theo tiếng Bắc Kinh hay Pinyin đều theo một luồng phụ âm đầu nhất định." (That is, "the interesting thing about the examples above is that, as pronounced in the Beijing dialect or written in Pinyin, all the initial consonants follow a certain pattern of corresponding initials.") This is how most specialists in the field of Chinese-Vietnamese etymology have proceeded.

x X x

APPENDICES D to G

APPENDICES D to G below are excerpts from Prof. Tsu-lin Mei's The Austroasiatics in Ancient South China: Some Lexical Evidence, and all credit of course belongs to the author alone. In his research, Professor Mei proved that the cited etyma under his examination are of Austroasiatic origin. His findings help reinforce the hypothesis that those Chinese and Austroasiatic cognates are derived from the same Yue substratum. The cited forms analyzed by the author show that certain Chinese lexical items in the substratum of the Minnan dialects are Austroasiatic. The author stated that "the modern Vietnamese are the descendants of the ancient Yüeh and their present territory represents the AA-speaking region closest to Fukien." [...] "It is noteworthy that the forms we discuss are best represented in Vietnamese." In the Minnan words cited below, the author pointed out the closeness of the Yue languages (AA) to other dialects where relevant, i.e., the forms in Foochow (FC) and Amoy (AM), and in Vietnamese (VN); the latter is variably referred to in this paper as Vietic, Viet., Annamese, Ancient Vietnamese, Sinitic-Vietnamese, etc. Methodologically, on the one hand, Prof. Tsu-lin Mei's survey demonstrates how he used analytic tools in his etymological work, which can serve as a good example for us in exploring the Sinitic-Vietnamese etymological field. On the other hand, while examining the appendices below, readers may want to keep in mind that proto-Chinese and Archaic Chinese had already existed at a certain point in ancient times before the early Chinese crossed the Yangtze River and mixed with the Yue (AA) people in their habitats; hence, linguistically, there emerged 河 */ˠɑ/ as opposed to 江 */krong/ for "river", 齒 chǐ vs. 牙 yá for "tooth", etc. That is, observe how the Yue words became part of the Chinese lexical stock. In short, the point to make here is that they have been derived from the same source regardless of the direction of the loans.

Incidentally, the excerpts as originally quoted used Simplified Chinese script, aimed at an audience in mainland China; we have converted it back into Traditional Chinese characters to stay closer to the original archaic forms, since we are dealing with historical linguistics. Readers can refer to his complete work on the Austroasiatic origin of the words cited below via the archived link that follows. Comments inside [ square brackets ] have been inserted by dchph.

x X x

APPENDIX D

The case of "sông"
by Tsu-lin Mei

(1) 江 **krong/kang/jiāng ‘Yangtze River’, ‘river’.

Vietnamese "sông" /səwŋ/ (river) is cognate to Chinese 江 jiāng and to other forms in the Mon-Khmer languages, e.g., Bahnar, Sedang krong; Katu karung; Bru klong; Gar, Koho rong; Laʔven dakhom; Biat n’hong; Hre khroang; Old Mon krung. Cf. Tibetan klu ‘river’; Thai khlɔ:ŋ ‘canal’. [ 江 jiāng has long been recognized as a Yue or Austroasiatic word that later became the general term for rivers, in line with the naming convention in southern Chinese and Vietnamese alike. ]

It is immediately clear that the Chinese borrowed this word from Austroasiatic /krong/, as in 江 jiāng for 長江 Chángjiāng. The word 江 jiāng as a proper name for the ‘Yangtze River’ occurs in Ode 22, which belongs to the section Zhaonan 召南, the term the Zhou people also used for the region that formerly belonged to Chu [ 楚國 ]. The form /krong/ is a general word for ‘river’ in Austroasiatic (AA). 江 jiāng is of relatively late origin. It did not occur in the oracle bones. The bronze inscriptions contain one occurrence of this word, and the Book of Odes nine occurrences in five poems; its use in names of rivers was limited to the area south of the Yangtze. The term 江南 (literally ‘south of the River’) as used during the Han dynasty refers to Changsha 長沙 and Yuzhang 豫章, in present Hunan [ 湖南 ] and Jiangxi [ 江西 ], situated along the middle section of the Yangtze, and not to the entire river. The notion that the Chinese met the AA’s in the Middle Yangtze region of course does not exclude their presence elsewhere; it just gives a precise indication of one of their habitats. It is perhaps pertinent to mention that the Vietnamese believed that their homeland once included the region around the Dongting Lake 洞庭湖, which is in that general area. Another Vietnamese legend states that their forefather married the daughter of the dragon king of Dongting Lake.

Textual and epigraphic evidence indicates that the word 江 jiāng came into the Chinese language between 500 and 1000 B.C., but archaeologists are increasingly inclined to the view that contact between North China and South China occurred as early as the Shang dynasty: artifacts showing strong Shang and early Chou influence have been discovered in the lower Yangtze region and, according to some scholars, also in the Han River (漢江) region. Early contact with the Shang or Yin state is further suggested by a mythical folk hero of Vietnamese history known as 扶董天王 Fúdǒng Tiānwáng (Phùđổng Thiênvương), or 董聖 Dǒng Shèng (Thánh Dóng), who defeated the Ân (殷 Yin) invaders from ancient China, which matches the aforesaid archaeological finds. [ According to the Vietnamese legend in the 《嶺南摭怪》 (Lĩnh Nam chích quái). ]

Phonetically, 江 jiāng has a Second Division final in Middle Chinese (MC), and according to the Yakhontov-Pulleyblank theory, this implies a medial –r- or –l- in OC. The Old Chinese (OC) reading for this word in Li Fang-kuei’s system is krung. The final has been reconstructed as /-awng/ by Pulleyblank and as /–ong/ by Yakhontov; both finals had a rounded back vowel in OC. [ Note that V /-əwŋ/ fits well into /-awng/ as analyzed by Pulleyblank, a bilabially closed /-oŋʷ/ final. That is to say, V /səwŋ/ or /səwŋʷ/ is pronounced with the lips completely closed, cutting off the airstream, at the end of the final /-ŋʷ/. ]

Both these facts again suggest that 江 jiāng was a borrowed word. Tibetan had klu ‘river' but a Sino-Tibetan origin of klu/krong while Thai also borrowed klu and khlɔ:ŋ from AA.

Other Chinese words for "river" include OC 河 ˠɑ/g’ɑ, with an earlier g’al or g’ɑr from Altaic, as in 'Huanghe' 黄河 (Yellow River); 水 shuǐ [ as 'river' in 渭水 Weishui (Wei River in Shaanxi) ]; and 川 chuān [ interestingly, both 水 shuǐ and 川 chuān are plausibly cognate to V "sông" as well ]. Chinese 水 siwər/świ/shuǐ occurs in the oracle bones and can be traced to Sino-Tibetan: ‘water’ Tibetan ch’u; Bara, Nago dui; Kuki-chin tui. As for 川 t’iwen/tś’iwän/chuān, the nasal final in chuān probably has a phonological parallel in the sound gloss in the Shuowen, 水，準也 [ 準 turʔ ]. [ As for V 'nước' < nak (water), the word also corresponds to Old Vietic /dák/, which is cognate to 淂 dé (MC dak), a form included in the Kangxi Dictionary. ]

[ For our purposes, the bottom line is that the cited cognates of M 江 /jiang/ were derived from the same root: SV giang = VS sông = Vietic krông = AA krong = Thai khlɔ:ŋ = OC krong = Tibetan klu < ch'u, given that 水 shuǐ and 川 chuān are cognates of V "sông", all without regard to the direction of borrowing; the case is similar to that of V 'chết' (to die), quoted in Appendix E next. ]

Source: The Austroasiatics in Ancient South China: Some Lexical Evidence


APPENDIX E

The case of "chết"
by Tsu-lin Mei

(2) 札 *tsɛt 'to die’

In the Zhouli [ 周禮 ], the gloss '札 as *tsɛt' occurs in 越人謂死為札 “The Yüe people call ‘to die’ 札”. Tuan Yü-ts’ai assigns this character to his group 12, which corresponds most nearly to Karlgren’s group V and to Chiang Yu-kao's 脂 group, while Karlgren’s Grammata Serica Recensa (GSR) reconstructed it as /tsăt/ in group II. Frequently, a given MC rhyme has more than one OC origin. 札 belongs to the MC 黠 rhyme; this rhyme derives from three different OC rhyme categories, 祭, 微, and 脂, corresponding roughly to Karlgren’s II, V, and X. The only way to determine which OC rhyme category a word such as this belongs to is to examine its xiesheng [ 諧聲 ] connections. In the Shuowen, 札 is defined as follows: 札牒也，从木乙聲. In GSR 505 a reading tʂiɛt is given for Karlgren’s group V. And in the Shih-ming, written by Liu Hsi, the sound gloss is 札，木節也 (木節 ts-，OC 脂 group). Clearly 札 should belong to the same group as 乙; the proper reconstruction is /tsɛt/ and not /tsăt/ as given in GSR 280b. Dong Tonghe does not give this character in his Shanggu Yinyün Piaokao, but it is simple enough to place it where it belongs, viz. on page 215 in Dong’s 微 group; the proper form in Dong’s system is tsət. This represents the AA word for ‘to die’: VN chết; Muong chít, chét; Chrau chu’t; Bahnar kˠcit; Katu chet; Gua test; Hre ko’chit; Bonam kachet; Brou kuchêit; Mon chət. More cognate forms can be found in Pinnow, p. 259, item K324f. The Proto-Mon-Khmer form has been reconstructed by Shorto as kcət, which is extremely close to our OC form. There is even the possibility that Proto-MK k- is reflected in the glottal initial of the phonetic 乙.

The words for 'to die' in other East and Southeast Asian languages are: Chinese 死 siər; Tib. ‘chi-ba, šhi; Lolo-Burm šei; Proto-Tai tai; Proto-Miao daih. Here Chinese goes together with Tibeto-Burman, and Proto-Tai goes together with Proto-Miao. None of these forms bears any resemblance to tsɛt.

[ For our purposes, however, let us postulate the final as /-jət/, because in Chinese there are many words, some of which may be doublets, that denote the concept of 'death' or 'die' and sound similar to V 'chết' besides 札 **tsɛt, e.g., 死 sǐ < MC sji < OC sijʔ, 逝 shì, zhì, shé < MC tsjai < OC djats, 折 zhé, shé, tí, zhē (chiết, đề) < MC tsjet < OC tat, 卒 zú, cù (tốt, tuất, thốt) < MC tsyt < OC tswit, 陟 zhì, dé, shăn (trắc, triết, đắc) < MC ʈik < OC trǝk, etc. Meanwhile, in modern Khmer, as represented by Cambodian, the word is pronounced /slab/ and has nothing to do with V 'chết'. ]

Source: The Austroasiatics in Ancient South China: Some Lexical Evidence


APPENDIX F

The case of "ruồi"
by Tsu-lin Mei

(3) 維虫 *rwəi ‘fly’

Vietnamese ruồi, rui ‘fly’ reflects a very old word, Proto-AA ruwaj. Mon-Khmer forms have a wide distribution: Cambodian rouy; Lawa rue; Mon rùy; Chaobon rùuy; Kuy ʔaruəy; Souei ʔɑrɔɔy; Bru rùay; Ngeʔ, Alak, Tampuon rɔɔy; Loven, Brao, Stieng ruay; Chong rɔʔy; Pear roy. Further cognates, including some in the Munda branch, can be found in Pinnow, p. 268, item 356.

Like 江 jiāng ‘Yangtze River’, the loanword 維(虫) wéi(chóng) ‘fly’ suggests that the word was borrowed by the early Chinese, when they came to the middle Yangtze between 1000 and 500 B.C., from the Chu (楚) people, who were of Austroasiatic origin. King Wu of Chu saw himself as a 'southern barbarian', just as described in the Lüshi Chunqiu (吕氏春秋): “Chu was derived from the barbarians.” Thus 維 wei ‘fly’ was a term in the ancient Chu dialect; it shows up in the Chuyu 楚語 (Sởngữ) section of the Guoyu 國語 (Quốcngữ) in the sentence “亡虫維虫之既多” (many gnats and flies). The OC value of 維虫 can be ascertained via its phonetic: it is the name of an insect pronounced like 維. The initial of wei in MC is 喻四, the yü IV initial. Li Fang-kuei has argued convincingly that the OC value of yü IV is a flapped r- or l-; he writes it as r-. For instance, the word 酉, one of the twelve earthly branches, has initial r- in Proto-Tai, still attested in several modern dialects. The final of wei has been reconstructed as –d by Dong Tung-he and Li Fang-kuei, and as –r by Karlgren. These are values for the earlier stage of OC. By the time of the Guoyu, which is relatively late, the ending –d or –r had probably already become –i. In Li’s system, the distinction between hekou and kaikou (with or without –u-/-w-) is non-phonemic in OC, and the OC value of 維 in his system is rəd. In terms of our problem, there are two possibilities: either OC had no –w- at all, phonemic or non-phonemic, in which case the best the Chinese could do to approximate the AA form (which has a rounded back vowel) was rəi < rəd; or OC had a non-phonemic –w-, giving the form rwəi.

Meanwhile, the standard word for ‘fly’ in OC is 蠅 riəng, which was already attested in the Odes. It is substantiated by old dictionaries; the Guangya 廣雅 defines 虫 as 虫羊, and the Fangyan 方言 states that 羊 (虫羊) is a dialect form of 蠅 ‘fly’ [ cognate to VS 'nhặng, dặng, lằng' (bluebottle), which is in turn associated with riəng as a cognate of VN 'ruồi' (fly, gnat). There also exists the compound form V 'ruồinhặng', plausibly cognate to the aforementioned 虫羊, which carries the concept of 'flies and gnats'; structurally, in Vietnamese the second syllable-word '-nhặng' usually modifies the first one, 'ruồi-', i.e., the second makes the naming semantically clearer and easier to understand. Over the years the dissyllabic word has become the general binome that reinforces the extended meaning of 'flies' as a whole, including 'gnats', 'bluebottles', etc. All said, 維虫 weichong was a Chu word and a cognate of VN rui and 'ruồi', a hapax legomenon whose locus classicus is the Chuyu 楚語 section of the Guoyu 國語. ]

Source: The Austroasiatics in Ancient South China: Some Lexical Evidence

APPENDIX G

The case of "ngà"
by Tsu-lin Mei

(4) 牙 *ngra/nga/ya ‘tooth, tusk, ivory’

VN ngà ‘ivory’; Proto-Mnong (Bahnar) *ngo’la ‘tusk’; Proto-Tai nga.

Chinese ya has a Division II final in MC, which, according to the Yakhontov-Pulleyblank theory, calls for a medial –r- in OC: ngra, which was derived from an AA form similar to Proto-Mnong ngo’la.

The claim that Chinese 牙 yá is an AA loan, cognate to Vietnamese 'ngà' (ivory), is based upon the fact that 牙 yá is of relatively late origin. When it first appeared, it was used only for ‘animal tooth’ and ‘tusk’, which is still its meaning in AA. While North China once had elephants, they became quite rare during the Shang and Zhou dynasties, and ivory had to be imported from the middle and lower Yangtze region. Imported items not infrequently bear their original names, and by our previous argument, the Yangtze valley was inhabited by the AA’s during the first millennium B.C.

In the meanwhile, the oldest Chinese word for ‘tooth’ is 齒 chǐ which once had an unrestricted range of application, including ‘molar,’ ‘tusk', and ‘ivory'. 齒 chǐ consists of a phonetic 止 and the remaining part as a signific pictograph showing the teeth in an open mouth. Ancestral forms of the pictograph occurred frequently in the oracle bones to represent 齒 chǐ. The graph of 牙 yá, however, has no identifiable occurrence in the oracle bones and only one probable occurrence in the bronze inscriptions. This statement is based upon the fact that 牙 yá is listed neither in Li Hsiao-ting’s compendium of oracle bone graphs nor in Yung Keng’s dictionary of bronze graphs. Karlgren cited a bronze form for 牙 yá in GSR (37b) but this occurrence of 牙 yá was a proper name. There are reasons to believe that the absence of 牙 yá from early epigraphic records was not merely accidental. The oracle bones contained many records of prognosis concerning illness, and among them tooth ache. The graphs used were always ancestral forms of 齒 chǐ. The oracle bones also contained a representative list of terms for parts of the body, including head, ear, eye, mouth, tongue, foot, and probably also elbow, heel, buttock, shank, but not 牙 yá as 'tooth'.

Beginning with the Book of Odes we have unambiguous evidence for the use of 牙 yá. But in the pre-Han texts 牙 yá still did not occur frequently, and an analysis of this small corpus reveals that 牙 yá was never used for human tooth. Hence the Shuowen’s definition of 牙 yá as 牡齒, usually interpreted as ‘molar’，seems to reflect a later, probably post-Qin, development. The most frequent occurrence of 牙 yá in the sense of ‘tooth’ is in the compound 爪牙 ‘claw and tooth' [ cf. Vietnamese 'nanhvuốt' but in the roundabout loan to become SV 'nha' /ɲja/ and probably 'răng' [ via the tentative medial –r-in OC ngra; cf. the southwestern Vietnamese subdialect Rạchgiá "găng"/ɣæŋ/ ] to be associated with an earlier form derived from Chinese 齡 líng (SV linh) ] and there the reference to animal tooth is quite clear. The Yijing contains a line in which the meaning of ya was ‘tusk’: 豶豕之牙，吉。 ‘the tusk of a castrated hog: propitious. ’ The line in Ode 17 '誰謂鼠無牙' probably means ‘Who says the rat has no tusks?’ but some scholars prefer to interpret 牙 yá simply as ‘teeth (incisors).’

A graph must first exist before it can become a part of another graph, and the older a graph, the more chances it has to serve as part of other graphs. By this criterion, 齒 chǐ is much older than 牙 yá. The meaning of 齒 chǐ in the oracle bones is primarily ‘human tooth’, including ‘molar.’ The use of 齒 chǐ as ‘tusk, ivory’ is most clearly illustrated in Ode 299 憬彼淮夷，來獻其琛，元龜象齒 “Far away are those Huai tribes, but they come to present their treasures, big tortoise, elephant’s tusks”; and not quite so clearly in two passages in the 禹貢 “Yukung,” both of which listed 齒，革，羽，毛 as items of tribute. Here 齒 chǐ can mean either ‘ivory’ or ‘bones and tusks of animals,’ all used for carving. Lastly, 齒 chǐ also applies to the teeth of other animals: 相鼠有齒 “Look at the rat with its teeth” (Ode 52). In the oracle bones, 齒 chǐ occurs as the signific of three graphs. In the Shuowen, 齒 chǐ occurs as the signific of forty-one graphs, all having something to do with teeth; 牙 yá, of only two graphs, one of which has a variant form with 齒 chǐ as the signific. The Shuowen also tells us that ya has a kuwen form in which the graph for 齒 chǐ appeared under the graph for 牙 yá. What this seems to indicate is that when 牙 first appeared, it was so unfamiliar that some scribes found it necessary to add the graph for 齒 chǐ in order to remind themselves what ya was supposed to mean. 牙 also occurs as the phonetic of eight graphs (six according to Karlgren) but none of these graphs is older than 齒 chǐ.

Elephants once existed in North China; remains of elephants have been unearthed in Neolithic sites as well as in An-yang. Ivory carving was also a highly developed craft during the Shang dynasty. These facts, however, should not mislead us into thinking that elephants had always been common in ancient North China. Yang Chung-chien and Liu Tung-sheng made an analysis of over six thousand mammalian remains from the An-yang site and reported the following finding: over 100 individuals, dog, pig, deer, lamb, cow, etc.; between 10 and 100 individuals, tiger, rabbit, horse, bear, badger (獾) etc.; under 10 individuals, elephant, monkey, whale, fox, rhinoceros, etc. The authors went on to say that rare species such as the whale, the rhinoceros, and the elephant were obviously imported from outside, and their uses were limited to that of display as items of curiosity. This view is also confirmed by literary sources. In the Han Feizi, it is said that when King Chou of the Shang dynasty made ivory chopsticks, Chi Zi, a loyal minister, became apprehensive – implying that when as rare an item as ivory was used for chopsticks, the king’s other extravagances could be easily imagined. Importation of ivory in the form of tribute was also reported in Ode 299 and in the “Yü-kung,” both of which were cited above.

The history of 牙 yá and 齒 chǐ can now be reconstructed as follows: The people of the Shang and Zhou dynasties had always depended upon import for their supply of ivory. But during the early stage, ivory and other animal tusks and bones were designated by 齒 chǐ, which was also the general word for ‘tooth’. Items made of ivory were also indicated by adding a modifier 象 hsiang ‘elephant’ before the noun, for example 象弭 ‘ivory bow tip’ or 象箸 ‘ivory chopsticks’. Then 牙 yá came into the Chinese language in the sense of ‘tusk’ and because a tusk is larger than other types of teeth, ya gradually acquired the meaning of ‘big tooth, molar’ by extension, thus encroaching upon the former domain of 齒 chǐ. When later lexicographers defined 牙 yá as ‘molar’ and 齒 chǐ as ‘front tooth’ [ cf. VS 'răngsỉ' 牙齒 yáchǐ, hence 'răngcỏ', a general term for 'teeth' ], they were describing, though without clear awareness, the usage of the Han dynasty and thereafter. By further extension, 牙 yá also became the general word for ‘tooth’ while retaining this special meaning of ‘ivory’ [ The foregoing argument has brought up an interesting point about lexical development in human languages, to say the least. ].

Some Min dialects still employ 齒 chǐ in the sense of tooth. The common word for tooth in Amoy is simply k’i. Fuzhou has nai3 which is a fusion of ŋɑ plus k’i, i.e. 牙齒 yáchǐ [ and probably Vietnamese 'răngcỏ' (teeth) ]. This strongly suggests that in Min the real old word for ‘tooth’ is 齒 chǐ as in Amoy, the implication being that this was still the colloquial word for ‘tooth’ well into Han when the Chinese first settled in Fujian Province. The Japanese use 齒 chǐ as kanji to write ha ‘tooth’ in their language; 牙 yá rarely occurs. Both these facts provide supplementary evidence for the thesis that the use of ya as the general word for ‘tooth’ was a relatively late development. [ As for V "răng", could it be a doublet that later originated from Chinese 齡 líng as well? ]

In a note published in BSOAS, vol. 18, Walter Simon proposed that Tibetan /so/ ‘tooth’ and Chinese /ya/ 牙 (OC nga) are cognates, thus reviving a view once expressed by Sten Konow. Simon’s entire argument was based upon historical phonology; he tried to show (a) OC had consonant clusters of the type sng- and C-, (b) by reconstructing 牙 yá as sngɔ > zngɔ > nga and 邪 xié as zˠa > za, one can affirm Xu Shen’s view that 邪 xié [ SV tà ] has 牙 yá [ SV nha ] as its phonetic, and (c) Chinese snga can then be related to a Proto-Tibetan *sngwa and Burmese swa: > θwa:. Our etymology for ya ‘tooth’ implies a rejection of Simon’s view; if ya is borrowed from AA, then the question of Sino-Tibetan comparison simply does not arise. And even if our theory is not accepted, there is no reason to adopt Simon’s analysis; ya is clearly a word of relatively late origin, and the fact that 邪 xié has 牙 yá as its phonetic can be explained by assuming that the z- of 邪 xie resulted from the palatalization of an earlier g-.

SOME OTHER SURVIVAL OF AUSTROASIATIC ETYMA

[ Quoted: "The Yiddish linguist, Max Weinreich, states, “A language is a dialect with an army and navy”. According to this definition, Mandarin has armies while Hokkien and Cantonese do not. However, Hokkien and Cantonese are linguistically different from Mandarin. They use different words and have different grammars." For example, Fukienese (Hokkien) was officially classified by China's institutions into the Sino-Tibetan linguistic family > Chinese > Min > Coastal Min > Southern Min > Hokkien (as quoted by Wikipedia.org). ] There exists a strong possibility of the survival of AA forms in some modern Chinese dialects spoken in areas occupied by the ancient Yue peoples.

The Min dialects spoken in Fukien and northeastern Kwangtung [ and Hainanese in Hainan Island Province, where the Li minorities have been considered descendants of the ancestors of the Chamic people who built the ancient Champa Kingdom located in today's Central Vietnam ] represent the most aberrant group of dialects in China. While most of the vocabulary found in these dialects can be traced back to early Chinese sources, there remains a residue of forms which cannot be explained in this way. A possible explanation of such words would be to consider them relic forms from the non-Chinese language spoken in this region before the Chinese began to settle there in the Han dynasty. [ Note that modern Chinese Sinologists classify the Guangdong and Minnan dialects as members of the Sino-Tibetan linguistic family based on their dominant Sinitic vocabularies. If Vietnam were not an independent country, Vietnamese would apparently have been treated the same way. Conversely, if Fukien or Canton had survived the Han dominion, Fujian and Guangdong would have been independent states parallel to Vietnam. As a result, their languages would have been classified in a different direction, not as Sino-Tibetan but as Yue or AA languages, similar to what has happened to the classification of Vietnamese. Written Chinese records suggest that this is the case, to say the least. In fact, according to Haeree Park in Recent Advances in Old Chinese Historical Phonology (School of Oriental and African Studies, University of London), "The Chu language is an attested language spoken during the Chu state of China, and is believed to be the base precursor to Xiang Chinese, a dialect spoken in Hunan. The Chu language is classified as a Sino-Tibetan language and sometimes referred to as "Para-Sinitic" meaning that it is related to the Chinese language having shared a common Proto-Sinitic ancestor. It had its own characters modeled on Zhou dynasty Chinese characters." ] The pre-Han inhabitants of Fukien were the MinYue; they appear to have been a semi-civilized state which was finally destroyed by Han Wu in 110 B.C. [ and ancient Vietnam as part of the NanYue Kingdom in 111 B.C. ]

(5) 虎 ‘tiger’ ** k’la(g)/χuo/hu

‘tiger’ in AA: kalaʔ; Munda ki’rɔ, kul, kula, kilo, etc.; Old Mon kla; Mon kla; Bahnar, Sedang kla; Sue kala; Brou klo; Old Khmer klɔ; Khmer khla ‘felines’; Khasi khla; VN khɔi; Muong k’al, k’lal, kanh, etc. Pinnow reconstructs the Proto-Munda form as kala (Pinnow, p. 142, item 281), and we propose an alternate Proto-AA form, kalaʔ.

The Chinese 虎 hu belongs to the OC 魚 yü group. According to Yakhontov, Pulleyblank, and Li, this group had /–a/ as its main vowel [ cf. Viet. 'cá' (fish) ]. It may or may not have had a final voiced consonant of some sort in OC; Yakhontov has none, and Li would have –g. In Li’s system 虎 MC /χuo/ would derive from an OC /χag/. Now, 虎 serves as the phonetic in some words with MC l-initial: 盧 MC luo, 慮 MC liwo, etc. Therefore, in Li’s system, 虎 hu ‘tiger’ could be reconstructed as χlag, since his OC medial –l- simply drops in MC; -r- on the other hand yields the Division II vowels. Further, certain Western Min dialects have an initial aspirated k’- in the word for tiger: Kienyang /k’o/, Shaowu /k’u/ [ cf. Viet. 'cọp' /kɔp8/ and 'hùm' 甝 hán (SV hàm) all for 'tiger' while 虎 hu is 'hổ' in Sino-Vietnamese. ]. This is not an isolated phenomenon in Min; for example, 許 Amoy k’ɔ, but MC xwo; 火 Kienyang k’ui, MC χu; Foochow k’auʔ, MC χut. From this we can see that MC χ- (in some cases) may go back to a stop k’-. Since 虎 is one of the words involved in this change, we are justified in reconstructing it as **k’la(g). This form is very close to Pinnow’s Proto-Munda reconstruction kala.

Our reconstruction of the Proto-AA form as kalaʔ is motivated by the fact that –ʔ is present in the word for tiger in several Munda languages. The Chinese word 虎 hu ‘tiger’ is in the rising tone, and one of the present authors has argued elsewhere that the MC rising tone derives from a final glottal stop ʔ. If so, the correspondence between Proto-AA kala and OC k’la is even closer.

There is no plausible Tibeto-Burman etymology for 虎 hu ‘tiger’; Tibetan has stag ‘tiger’, a totally unrelated word; Old Burmese has kla, a loan from Mon. The present habitats of the tiger (Panthera tigris) in China are the Southwest, the Southeastern coastal area, the Yangtze valleys, and Manchuria, with South China as the area of highest concentration. Appearances of the tiger in historical records coincide with the above, but also include northern Hebei and Shansi. Skeletal remains of the tiger were also found at the site of Anyang, in Henan. The distribution of the tiger is noteworthy in two respects: the heaviest concentration is in South China, presumably the habitat of the AA speakers, and the area of total absence includes the steppes and loessland of northwest China, the probable homeland of the Sino-Tibetan ancestors of the Chinese. From this perspective, it is easy to see why there is no word for tiger in Sino-Tibetan, or in the oldest stage of Chinese. The word attested as a pictograph in the oracle bones was derived from the AA word for tiger, which was loaned to the Chinese.

It is possible that 虎 had a disyllabic doublet, derived from the same AA source. The Zuozhuan says “楚人謂乳穀，謂虎於菟。” “The Chu people call ‘to nurse’ 穀, and ‘tiger’ 於菟”. The initial of 於 has the value y– in MC, but there is some reason to believe that its OC value is k- or k’-. 於 is a variant of 于, and the latter was used to transcribe “Khotan” in the Shiji: 于闐. 于闐 also has a variant 狗竇; Kuo P’u’s (郭璞) commentary to the Fangyan states under 於虎菟 ‘tiger’: 今江南山夷呼虎為虎兔，音狗竇, “Nowadays the hill tribes in the south of the Yangtze call ‘tiger’ 虎兔, pronounced as 狗竇 (MC kəutəu).” The OC form of the Chu word for ‘tiger’ was therefore something like kat’a.

The only difference between AA kalaʔ and the Chinese form is –l- versus –t’- or –t-, and it may conceivably be postulated that some AA forms have a dental stop: Pinnow regards Khmer khla ‘felines’ as a cognate of dho (thom) ‘tiger royal,’ and according to Kuhn, Karia kiɔʔ < kil-dɔʔ (Pinnow, p. 142). Kuiper has noted that there is a variation among Munda d, t, and l in initial position. It may be that AA kala had a dialect form kata, and the latter was represented by the Chu word for ‘tiger’.

(6) 囝 FC kiaŋ/AM kiă ‘son, child’

This word is attested for all Min dialects. Judging from the conservative dialects of northeastern Fukien, the word originally ended in –n: Fuan kiɛŋ, Ningteh kian. The Proto-Min form was probably something like *kian with the tone which corresponds to the classical shang or rising tone. This word is attested textually quite early. The Tang poet Gu Kuang 顧况 (c. 725-816) composed a poem using the word 囝: “it is pronounced like the word 蹇 (MC kɒn, shang tone); in Fukien ‘son’ is called 囝 in the popular language.” This word is clearly the same as the modern Min words for ‘son, child’, related to the AA etymon represented by VN con ‘child’. This etymon is very widely distributed throughout Austroasiatic: Khmer koun, Spoken Mon kon, written Mon kon, kwen, Bru kɔɔn, Chong kheen, Wa kɔn, Khasi khu:n; it is also well represented in Munda: Kharia kɔnn ‘small,’ Santali ‘son, child,’ Ho hon ‘child’. The Min forms agree with the AA forms, although they predominantly show low to mid unrounded vowels. The Min form of Kienyang kyeŋ, however, has a rounded medial which may indicate that the Min forms derive from some type of earlier rounded vowel.

(7) 弩 *na/nuo/nu ‘crossbow’

‘crossbow’ in AA: VN nõ, ná; Proto-Mnong so’na; Proto-Tai hnaa.

Cf. Mon, Old Mon tŋa; Palaung kaŋ, kaŋaʔ; Tibeto-Burman: Nung the-na; Moso ta-na.

The crossbow is at present widely used by the tribes in southwest China and Indo-China. Early references to the crossbow in Chinese texts point to that general region as the place of origin. The Hanshu [ 漢書 ] explicitly mentioned the crossbow as one of the weapons used by the tribes inhabiting Hainan Island, and implied that it was also used by other tribes farther south. Sichuan was famous for its crossbow. Both the Hua Yang Guo Zhi [ 華陽國志 ] and the Hou Hanshu [ 後漢書 ] reported that when a white tiger roamed the area of Qin and neighboring states, a man from Pa [ 巴郡閬中人 ] had to be called in, and he killed the tiger with a crossbow made of white bamboo. King Anyang [ 安陽王 Andươngvương ], a prince from Sichuan, is said to have brought along the crossbow as he entered Vietnam. When Zhao Tuo [ 趙佗 Triệu Đà ] tried to conquer Vietnam at the end of the Qin Dynasty, he was for a time stymied by King Anyang’s archers, who used crossbows.

The fact that the crossbow has a southern distribution, past and present, suggests that it was acquired by the Chinese. Phonology provides another reason. The Tai and Vietnamese, because of their proximity to Chinese speaking people, were the most likely points of contact. The Tai form implies voiceless initial sn-. VN ná is in the sắc tone, which comes from a voiceless initial. Proto-Mnong so’na indicates that perhaps the Proto-AA form should be s-na. Now, under the hypothesis that Chinese borrowed this word from AA, we only need to assume that s- (or the voicelessness of the initial n-) was lost in the process of borrowing. Under the contrary hypothesis that the loan was in the opposite direction, none of the AA or Thai forms can be easily explained [ cf. 弩 nú (SV 'nỗ') = VS 'nỏ' vs. Viet. 'ná', i.e., /sna/ > OC */na/ > 弩 nú > VS 'nỏ' ].

The Japanese scholar Fujita Toyohachi believes that the Chinese crossbow came from India, on the strength of the Sanskrit word dhanu ‘bow’ and the fact that India already used the crossbow in warfare during the fourth century B.C. The Sanskrit word may have something to do with Mon and Old Mon tŋa, Nung thə-na, Moso ta-na, but is unlikely to be the direct source of 弩; 弩 belongs to the MC 魚 rhyme, and as Chou Fa-kao has shown, Sanskrit –o and –u were regularly transcribed before the Tang dynasty by words belonging to the 尤，侯，虞，模 rhymes but seldom by words belonging to the 魚 rhyme [ i.e. /-a/ 魚 = VS 'cá' vs. 弩 = VS 'ná' vs. 'nỏ' ].

(8) FC tyɔŋ/AM tɔŋ ‘shaman, spirit healer, medium’

It is not entirely clear whether the word in question is basically a nominal or verbal root since it occurs in constructions of both types. Thus in FC dialect we have tyɔŋ-tsi ‘shaman’, tyɔŋ-tsi ‘to shamanize,’ tyɔŋ-siŋ id., phaʔ-tyɔŋ ‘shaman’s assistant’ [ cf. VS 'đồngbóng' ( <~ 'thầybóng') ]; in Amoy we have id. (to dance under the influence of spirits), (note: both and mean ‘to leap, to dance’), ‘the spirit leaves the shaman’, ‘to become possessed.’ In the Kienyang dialect (northwest Fukien) we have ‘shaman’ and ‘to become possessed’. Yungan (Central Fukien) has ‘to shamanize’ (‘to jump, to dance’), ‘shaman’. The common element in all these expressions is Foochow, Amoy, Kienyang, and Yungan; these forms point back to a Proto-Min 'id.' in the tonal category corresponding to the classical p’ing tone. All of the dialects show lower register (yang) tones indicating that the protoform had a voiced initial [ d- ]. The word in question is sometimes written with the character 童 tóng [ cf. SV 'đồng' < MC duŋ < OC dhoŋ, as there exists the Chinese word 乩童 jītóng (SV kêđồng) 'kẻbóng' (?) for 'child shaman' and V 'lênđồng' meaning 'to dance under the influence of spirits.' In Sinitic-Vietnamese, there are other words for 'shaman' that caught our attention, e.g., 'thầymo', 'thầymô', 'sưmô', and, especially, 'phùthuỷ' as nouns; the last is absolutely cognate to the Chinese 巫師 wúshī (VS 'phùthuỷ') where 'phù' = 'mô' = 巫 wū, wú < MC mʊ < OC mha and 'thuỷ' = 'thầy' = 師 shī < MC ʂɨ < OC *srij. In any case we are just attempting to mix and match the AA form with both Chinese and Vietnamese cognates. ]

In Vietnamese we find a word which both semantically and phonologically corresponds to the unexplained Min etymon perfectly: ‘to shamanize, to communicate with spirits’, ‘male, shamanistic spirit’, ‘to shamanize, to communicate with spirits’, ‘shamaness’, ‘female shamanistic spirit’, ‘shaman, sorcerer’ [ Does the author mean SV 'kêđồng' (乩童 jītóng) or 'kẻbóng' as noted above? ]. This word is not confined to Vietnamese within Austroasiatic. In Written Mon the cognate is ‘to dance (as if) under daemonic possession’, ‘dance of shaman’. In Modern Mon the corresponding form is ‘to leap with the feet together, to proceed by leaps, to dance while under daemonic possession, to climb’; Shorto also lists a derived meaning ‘shaman(?)’. Further AA connections can be adduced: Shorto links the written Mon form with Khasi lyngdoh ‘priest’; to support this equation, one can cite similar examples of Mon final –ʔ corresponding to Khasi –h: spoken Mon, written Mon ‘belly,’ Khasi ‘id.’ On the Munda side, there are at least three good cognates: Santali ‘a kind of dance, drumming and singing connected with marriage’; Ho dong ‘a wedding song’; Sora toŋ ‘to dance’.

(9) AM/Fu‘an tam ‘damp, wet, moist’

These forms which are attested in most eastern Min dialects except Foochow can be related to VN [ 'tẩm' = 浸 jìn, jīn < MC cjɨm < OC cim, cims (?) ] (wet, moist).

(10) FC siŋ/AM tsim ‘a type of crab’

These forms may bear some relationship to VN 'sam' [ M 蟳 xún, SV 'tầm' (hairy sea crab) ]. The VN form is probably further related to Mon-Khmer forms such as Bahnar, Written Mon 'khatham' [ Whether or not the postulation of this Mon etymon is sound, it is interesting to note that in Sinitic-Vietnamese, besides SV 'tầm', there exist other words for different types of crabs, such as 'bakhía' (a kind of small crab that lives in paddies), which happens to be cognate to either 螃蟹 pángxiè or 蟛蜞 péngqí (amphibious crab). Meanwhile, 蟹 (SV giải: xiè, xiě, xié < MC ɠa < OC ghre:ʔ, kre:ʔ < PC: **qre:jh) alone is certainly cognate to VS 'ghẹ' (a type of sea crab with thin long legs and a skinny pair of claws); compare also 蜞 qí > 'cáy' (small crab) and 蟹 xiè > 'cua' (crab). ]

(11) FC paiʔ/AM bat ‘to know, to recognize’

AM b- generally corresponds to FC m-; the upper register tone with a voiced initial is also incongruous. Douglas gives a Tung-an form pat for Southern Min, so we regard the AM form as irregular. We can compare all these forms with VN 'biết' (to know, to recognize) [ cf. Hainanese /bat7/ vs. Hainanese-Chinese form /taj1/ ( 知 zhì: SV 'trí' /ʈej7/) that evolved into VS 'hay' /χɑj1/. Could the word be an entirely Chinese contracted form, transcribed as 捌 /bat/, of the disyllabic form 明白 /mɓat7/ (understand, know), given that 明 is pronounced variably in spoken Min dialects as bêng, bîn, miâ, mê, mêe, mî, môa, mâ...? E.g., 伊毋捌字 i m̄ bat jī "He cannot read." ]

(12) FC p’uoʔ/AM p’eʔ . cf. Fu'an p’ut ‘scum, froth’

Compare VN 'bọt' (scum, bubbles, froth) [ Chin. M 泡 pào, pāo < MC phaw < OC phra:ws, phru:s. Besides the derivatives such as VS 'bọtnước' (水泡 shuǐpào), in its own development, the Vietnamese word formed with 'bèo' (see below) to make the compound 'bèobọt' (‘worthless, scum, froth’) to associate it with the concepts of 'drifting about' (萍浮 píngfú or 萍泊 píngbó), hence, the 'unworthy'. ]

(13) 萍 FC p’iu /AM p’io ‘duckweed’

This word is recorded in Kuo P’u’s (AD 276-324) commentary to the Erya 爾雅 where he states that p’iao was the Jiangdong 江東 (East of the Yangtze) word for ‘duckweed'. VN 'bèo' (duckweed) is obviously related to all these forms. The VN form is probably further related to spoken Mon pè, Written Mon bew ‘to ride low in the water.’ [ And of course, there is certainly no doubt that the Chinese cognate is píng 萍 < MC bhēŋ, pjəjŋ < OC bieŋ, bjəjɲ; cf. 'lụcbình' 綠萍 lǜpíng (floating duckweed), 'bèogiạtmâytrôi' 萍梗浮雲 pínggěngfúyún (idiomatic, literally, floating duckweed and drifting cloud; figuratively, 'of uncertain whereabouts, destined to be gone with the wind, wandering fate'.) ]

(14) FC kie/AM kue, cf. Kienyang ai ‘(small) salted fish’

Baldwin and Maclay define the Foochow word as follows: “a kind of salted seafish; it is small varying from one to four or five inches in length.” There is a VN word kè which is defined as a ‘type of fish’; ‘it is small and resembles the gecko.’ [ Even though many modern Vietnamese live along the coastline, where net fishing has long been their major livelihood, they do not name a fish by a bare specialized term; the word must be compounded with the classifier-prefix "cá-" /ká/ (fish) plus a modifier, lexically similar to English 'catfish', 'snakefish', etc. In this case the word becomes the binome /cákè/, probably the gecko. ] The primitive Yue etymon probably meant a small fish of some sort, and the specialization of meaning took place in the various languages later. [ The aforesaid phenomenon did not happen in Vietnamese, as previously mentioned. Meanwhile V 'cá' /ká/ is plausibly cognate to 魚 yú (fish) < MC ŋʊ < OC *ŋa < PC **ŋja. See also APPENDIX N (a few lines in the whole article serve as riders for the case of Vietnamese "cá", or "fish", in Minnan dialects. Specifically, words for fish of different kinds come with the prefix "cá-" + modifier, e.g., ]

[...]Above we have demonstrated that the language of the NanYue was most likely Austroasiatic. Might we not go one step further and suppose that all the various Yüeh peoples of ancient southeastern China were AA speaking? In other words, we would propose that the term Yüeh was essentially linguistic. If this supposition is correct, then the present day Min dialects have an AA substratum, and we should expect to find a certain number of relic words of AA origin in these dialects. We believe that this is indeed the case. [...] It is noteworthy that the forms we discuss are best represented in Vietnamese. This is not surprising since the modern Vietnamese are the descendants of the ancient Yüeh and their present territory represents the AA-speaking region closest to Fukien and northeastern Kwangtung.

Source: The Austroasiatics in Ancient South China: Some Lexical Evidence

APPENDIX H

THE NEW SINO-VIETNAMESE WAR

1991: Sino-Vietnamese Detente period in Vietnam's East Sea (S. China Sea) (1991-present)
1979: Sino-Vietnamese conflicts (1979-90)
1979: Sino-Vietnamese Border War
1988: Johnson South Reef Skirmish in Vietnam's East Sea (So. China Sea)
1974: Battle of the Paracel Islands in Vietnam's East Sea (So. China Sea)

THE HISTORY OF OTHER PAST SINO-VIETNAMESE WARS
1789: Tâysơn Dynasty -- Defeat of the Qing in Ngọchồi, Hànội
1427: Battle of Chilăng for independence: Lê Dynasty
1407: 4th Chinese domination of Vietnam (1407-27)
1287: Battle of Bạchđằng River (against the Third Mongol Invasion)
1284: General Trần Hưng Đạo (against the Second Mongol Invasion)
1257: Trần Thái Tông (Trần Dynasty -- against the First Mongol Invasion)
1075: Lý Dynasty -- the War against the Song
981: Battle of Bạchđằng River (2)
938: Battle of Bạchđằng River (1)
602: 3rd Chinese domination of Vietnam (602–938)
544: Lý Nam Đế (Early Lý Dynasty 544-602)
43: 2nd Chinese domination of Vietnam (43–544)
40–43 A.D.: Trưng Sisters (against the Han's domination)
111 B.C.: 1st Chinese domination of Vietnam (111 B.C.–40 A.D.)
208 B.C.: Triệu Dynasty
218 B.C.: Triệu Đà
221 B.C.: Andương Vương
258 B.C.: Vănlang

Approximately between 22nd–21st century B.C.: The Legendary Hồngbàng Dynasty

x X x

APPENDIX I

Sino-Vietnamese vocabulary explained

Sino-Vietnamese vocabulary (từ Hán Việt, Chữ Hán: 詞漢越, literally 'Chinese-Vietnamese words') is a layer of about 3,000 monosyllabic morphemes of the Vietnamese language borrowed from Literary Chinese with consistent pronunciations based on Middle Chinese. Compounds using these morphemes are used extensively in cultural and technical vocabulary. Together with Sino-Korean and Sino-Japanese vocabularies, Sino-Vietnamese has been used in the reconstruction of the sound categories of Middle Chinese. Samuel Martin grouped the three together as "Sino-xenic". There is also an Old Sino-Vietnamese layer consisting of a few hundred words borrowed individually from Chinese in earlier periods, which are treated by speakers as native words. More recent loans from southern Chinese languages, usually names of foodstuffs such as lạp xưởng 'Chinese sausage' (from Cantonese), are not treated as Sino-Vietnamese but as more direct borrowings.

Estimates of the proportion of words of Sinitic origin in the Vietnamese lexicon vary from one third to half and even to 70%. The proportion tends towards the lower end in speech and towards the higher end in technical writing. In one well-known dictionary compiled by a Vietnamese linguist, about 40% of the vocabulary is of Sinitic origin.[1]

Monosyllabic loanwords

As a result of a thousand years of Chinese control, a small number of Sinitic words were borrowed into Vietnamese; these form the Old Sino-Vietnamese layer. Furthermore, through a thousand years of use of Literary Chinese after independence, a considerable number of Sinitic words were borrowed; these form the Sino-Vietnamese layer. These layers were first systematically studied by linguist Wang Li.

The ancestor of the Vietic languages was atonal and sesquisyllabic, featured many consonant clusters, and made use of affixes. The northern Vietic varieties ancestral to Vietnamese and Muong have long been in contact with Tai languages and Chinese as part of a zone of convergence known as the Mainland Southeast Asia linguistic area. As a result, most languages of this area, including Middle Chinese and Vietnamese, are analytic, with almost all morphemes monosyllabic and lacking inflection. The phonological structure of their syllables is also similar. Traces of the original consonant clusters can be found in materials from the 17th century, but have disappeared from modern Vietnamese.

The Old Sino-Vietnamese layer was introduced after the Chinese conquest of the kingdom of Nanyue, including the northern part of Vietnam, in 111 BC. The influence of the Chinese language was particularly felt during the Eastern Han period (25–190 AD), due to increased Chinese immigration and official efforts to sinicize the territory. This layer consists of roughly 400 words, which have been fully assimilated and are treated by Vietnamese speakers as native words. It has also been theorised that some Old Sino-Vietnamese words came from a language shift from a population of Annamese Middle Chinese speakers that lived in the Red River Delta, in northern Vietnam, to proto-Viet-Muong.[2] The much more extensive Sino-Vietnamese proper was introduced with Chinese rhyme dictionaries such as the Qieyun in the late Tang dynasty (618–907). Vietnamese scholars used a systematic rendering of Middle Chinese within the phonology of Vietnamese to derive consistent pronunciations for the entire Chinese lexicon. After driving out the Chinese in 880, the Vietnamese sought to build a state on the Chinese model, using Literary Chinese for all formal writing, including administration and scholarship, until the early 20th century. Around 3,000 words entered Vietnamese over this period. Some of these were re-introductions of words borrowed at the Old Sino-Vietnamese stage, with different pronunciations due to intervening sound changes in Vietnamese and Chinese, and often with a shift in meaning.

**Examples of multiple-borrowed Sinitic words**
Chinese (Old > Middle)	Old Sino-Vietnamese	Sino-Vietnamese
味 *mjəts > mjɨjH	mùi 'smell, odor'	vị 'flavor, taste'
本 *pənʔ > pwonX	vốn 'capital, funds'	bản 'root, foundation'
役 *wjek > ywek	việc 'work, event'	dịch 'service, corvee'
帽 *muks > mawH	mũ 'hat'	mạo 'hat'
鞋 *gre > hɛ	giày 'shoe'	hài 'shoe'
嫁 *kras > kæH	gả 'marry'	giá 'marry'
婦 *bjəʔ > bjuwX	vợ 'wife'	phụ 'woman'
跪 *gjojʔ > gjweX	cúi 'bow, prostrate oneself'	quỳ 'kneel'
禮 *rijʔ > lejX	lạy 'kowtow'	lễ 'ceremony'
法 *pjap > pjop	phép 'rule, law'	pháp 'rule, law'
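
For readers who want to query the table of multiply borrowed words above, the rows can be kept in a small lookup structure. The following is only an illustrative sketch: the names `Doublet`, `DOUBLETS`, `find` and `with_shifted_meaning` are hypothetical and not part of any source, and the glosses are copied from the table as given, with the reconstructions left out.

```python
from typing import List, NamedTuple, Optional


class Doublet(NamedTuple):
    """One character with its two Vietnamese reflexes, as listed in the table."""
    character: str
    old_sv: str        # Old Sino-Vietnamese form (earlier loan)
    old_sv_gloss: str
    sv: str            # Sino-Vietnamese reading (later literary loan)
    sv_gloss: str


# Rows copied from the table above; nothing here is new data.
DOUBLETS: List[Doublet] = [
    Doublet("味", "mùi", "smell, odor", "vị", "flavor, taste"),
    Doublet("本", "vốn", "capital, funds", "bản", "root, foundation"),
    Doublet("役", "việc", "work, event", "dịch", "service, corvee"),
    Doublet("帽", "mũ", "hat", "mạo", "hat"),
    Doublet("鞋", "giày", "shoe", "hài", "shoe"),
    Doublet("嫁", "gả", "marry", "giá", "marry"),
    Doublet("婦", "vợ", "wife", "phụ", "woman"),
    Doublet("跪", "cúi", "bow, prostrate oneself", "quỳ", "kneel"),
    Doublet("禮", "lạy", "kowtow", "lễ", "ceremony"),
    Doublet("法", "phép", "rule, law", "pháp", "rule, law"),
]


def find(character: str) -> Optional[Doublet]:
    """Return the row for a character, or None if it is not in the table."""
    for row in DOUBLETS:
        if row.character == character:
            return row
    return None


def with_shifted_meaning() -> List[Doublet]:
    """Rows whose two glosses differ, i.e. the later loan shifted in meaning."""
    return [row for row in DOUBLETS if row.old_sv_gloss != row.sv_gloss]


if __name__ == "__main__":
    row = find("味")
    if row is not None:
        print(row.character, row.old_sv, row.sv)
    print([r.character for r in with_shifted_meaning()])
```

Comparing glosses as plain strings is a deliberately crude test of "shift in meaning"; it simply separates rows such as 婦 (vợ 'wife' vs. phụ 'woman') from rows such as 帽, where both layers carry the same gloss.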

Wang Li followed Henri Maspero in identifying a problematic group of forms with "softened" initials g-, gi-, d- and v- as Sino-Vietnamese loans that had been affected by changes in colloquial Vietnamese. Most scholars now follow André-Georges Haudricourt in assigning these words to the Old Sino-Vietnamese layer.

Sino-Vietnamese shows a number of distinctive developments from Middle Chinese:

Sino-Vietnamese distinguishes Early Middle Chinese palatal and retroflex sibilants, which are identified in all modern Chinese languages, and had already merged by the Late Middle Chinese period.

Sino-Vietnamese reflects Late Middle Chinese labiodental initials, which were not distinguished from labial stops at the Early Middle Chinese phase.

Middle Chinese grade II finals yield a palatal medial -y- like northern Chinese languages but unlike southern ones. For example, Middle Chinese kæw yields SV giao, Cantonese gaau and Beijing jiāo.

Modern compounds

Up until the early 20th century, Literary Chinese was the vehicle of administration and scholarship, not only in China, but also in Vietnam, Korea and Japan, similar to Latin in medieval Europe. Though not a spoken language, this shared written language was read aloud in different places according to local traditions derived from Middle Chinese pronunciation: the literary readings in various parts of China and Sino-Xenic pronunciations in the other countries.

As contact with the West grew, Western works were translated into Literary Chinese and read by the literate. In order to translate words for new concepts (political, religious, scientific, medical and technical terminology) scholars in these countries coined new compounds formed from Chinese morphemes and written with Chinese characters. The local readings of these compounds were readily adopted into the respective local vernaculars of Japan, Korea and Vietnam. For example, the Chinese mathematician Li Shanlan created hundreds of translations of mathematical terms, including 代數學 ('replace-number-study') for 'algebra', yielding modern Mandarin dàishùxué, Vietnamese đại số học, Japanese daisūgaku and Korean daesuhak. Often, multiple compounds for the same concept were in circulation for some time before a winner emerged, with the final choice sometimes differing between countries.

A fairly large number of Sino-Vietnamese compounds have meanings that differ significantly from their usage in other Sinitic vocabularies. For example:

bác sĩ (博士) is widely used with the meaning 'physician' or 'medical doctor', while in Mandarin it refers to a doctoral degree;

tiến sĩ (進士) is used to refer to 'doctoral degree', whilst in Mandarin it is used to refer to 'successful candidate in the highest imperial civil service examination'.

bạc 'silver' is the Old Sino-Vietnamese reflex of Old Chinese *bra:g 'white', cognate with later Sino-Vietnamese bạch 'white' and Non-Sino-Vietnamese bệch '(of complexion) chalky'; in Mandarin, however, the corresponding word means 'thin sheet of metal', and 鉑 (pinyin: bó) has also acquired the meaning 'platinum', whose Sino-Vietnamese name is bạch kim, literally 'white gold';

luyện kim means 'metallurgy' instead of its original meaning, 'alchemy';

giáo sư means 'teacher' in Mandarin, but is now associated with 'professor' in Vietnamese.

English "club" became kurabu in Japanese, was borrowed into Chinese and then into Vietnamese, where it is read as câu lạc bộ and abbreviated CLB, which can also be read as an abbreviation of "club".

linh miêu means 'civet' in Mandarin but means 'lynx' in Vietnamese.

ân nghĩa ~ ơn nghĩa not only retains its original Sinitic meaning "feeling of gratitude"^[3] ^[4] ^[5] but also acquires the extended meaning "favor, kindness".^[6]

thời tiết (時節) is used with the meaning 'weather', while in Mandarin it means 'season' (it mainly refers to a specific period of time, often within the context of a particular season).

thư viện (書院) means 'library' in Vietnamese, but in Mandarin, it refers to a 'study room' or an 'academy'.

phương phi (芳菲) is an adjective meaning 'fat' or 'corpulent', but in Mandarin, it means 'fragrant' or 'fresh-smelling'.

ung thư (癰疽) means 'cancer' in Vietnamese, but in Mandarin, it is a term used in traditional Chinese medicine meaning a 'skin abscess'.

thập phân (十分) means 'decimal' in Vietnamese, but in Mandarin, it means 'very'; 'extremely'.

thương (傷) has the meaning 'to like, to love', while also sharing the common meaning of 'to (be) injured, wounded' with Mandarin.

thư (書) refers to a 'letter', while in Mandarin it means 'book' (Vietnamese uses sách (冊) instead).

There are also a significant number of concepts for which Vietnamese uses a Sino-Vietnamese compound that differs from the term used in other Sinosphere languages. For example:


English	Vietnamese	Mandarin	Cantonese	Japanese	Korean
university student	sinh viên 生員	大學生 /大学生 dàxuéshēng	大學生 /大学生 daaihhohksāang	大学生 daigakusei	대학생 (大學生) daehaksaeng
professor	giáo sư 教師	教授 jiàoshòu	教授 gaausauh	教授 kyōju	교수 (敎授) gyosu
bachelor (academic degree)	cử nhân 舉人	學士/学士 xuéshì	學士/学士 hohksih	学士 gakushi	학사 (學士) haksa
doctorate (academic degree)	tiến sĩ 進士	博士 bóshì	博士 boksih	博士 hakushi	박사 (博士) baksa
library	thư viện 書院	圖書館 /图书馆 túshūguǎn	圖書館 /图书馆 tòuhsyūgún	図書館 toshokan	도서관 (圖書館) doseogwan
office	văn phòng 文房	事務所 /事务所 shìwùsuǒ	事務所 /事务所 sihmouhsó	事務所 jimusho	사무소 (事務所) samuso
map	bản đồ 版圖	地圖/地图 dìtú	地圖/地图 deihtòuh	地図 chizu	지도 (地圖) jido
clock	đồng hồ 銅壺	鐘/钟 zhōng, 時計/时计 (literary) shíjì	鍾/钟 jūng	時計 tokei	시계 (時計) sigye
hotel; inn	khách sạn 客棧	酒店 jiǔdiàn, 旅館/旅馆 lǚguǎn	酒店 jáudim, 旅館/旅馆 léuihgún	ホテル hoteru, 旅館 (traditional inn) ryokan	여관 (旅館) yeogwan
demonstration	biểu tình 表情	示威 shìwēi	示威 sihwāi	示威 shii	시위 (示威) siwi
autism	tự kỷ 自己	自閉症 /自闭症 zìbìzhèng	自閉症 /自闭症 jihbaijing	自閉症 jiheishō	자폐증 (自閉症) japyejeung

Self-coined Sino-Vietnamese compounds

Some Sino-Vietnamese compounds are entirely invented by the Vietnamese and are not used in any Chinese languages, such as linh mục 'priest' from 'soul' and 'shepherd', or giả kim thuật ('art of artificial metal'), which has been applied popularly to refer to 'alchemy'. Another example is linh cẩu ('alert dog') meaning 'hyena'. Others are no longer used in modern Chinese languages or have other meanings.


Definition	Chinese characters	Vietnamese alphabet
farm	莊寨	trang trại
city	城庯	thành phố
week	旬禮	tuần lễ
to be present at	現面	hiện diện
to entertain	解智	giải trí
to lack	少寸	thiếu thốn
to be proud	倖面	hãnh diện
pleasant to the eyes	玩目	ngoạn mục
orderly; proper	眞方	chân phương
(polite, respectful) you	貴位	quý vị
traditional	古傳	cổ truyền
festival	禮會	lễ hội
legend	玄話	huyền thoại
to satisfy	妥滿	thoả mãn
polite	歷事	lịch sự
important; significant	關重	quan trọng
millionaire	兆富	triệu phú
billionaire	秭富	tỷ phú
thermometer	熱計	nhiệt kế
(mathematics) matrix	魔陣	ma trận
biology	生學	sinh học
subject	門學	môn học
average	中平	trung bình
cosmetics	美品	mỹ phẩm
surgery	剖術	phẫu thuật
allergy	異應	dị ứng
hearing-impaired	欠聽	khiếm thính
bacteria; microbe; germ	微蟲	vi trùng
to update	及日	cập nhật
data; information	與料	dữ liệu
forum	演壇	diễn đàn
a smoothie (drink)	生素	sinh tố
dojo; martial art school	武堂	võ đường
cemetery	義地	nghĩa địa
a surgical mask	口裝	khẩu trang
television (medium)	傳形	truyền hình
broadcast	發聲	phát thanh
animation	活形	hoạt hình
subtitles	附題	phụ đề
to transcribe	翻音	phiên âm
to transliterate	轉字	chuyển tự
visa	視實	thị thực
(informal) nurse; a medical assistant	醫佐	y tá
a specialist in humanities; an artist, painter, musician, actor, comic, etc.	藝士	nghệ sĩ
a singer	歌士	ca sĩ
a musician, especially a songwriter or a composer	樂士	nhạc sĩ
a poet	詩士	thi sĩ
a dentist	牙士	nha sĩ
an artist (painter)	畫士	hoạ sĩ
a member of any legislative body.	議士	nghị sĩ
prison	寨監	trại giam
victim	難人	nạn nhân
special forces	特攻	đặc công

Proper names

Since Sino-Vietnamese provides a Vietnamese form for almost all Chinese characters, it can be used to derive a Vietnamese form for any Chinese word or name. For example, the name of Chinese leader Xi Jinping consists of the characters 習近平. Applying the Sino-Vietnamese reading to each character yields the Vietnamese form of his name, Tập Cận Bình.

Some Western names and words, approximated in Chinese languages (often through Mandarin), or in some cases approximated in Japanese and then borrowed into Chinese languages, were further approximated in Vietnamese. For example, Portugal is transliterated as 葡萄牙 and becomes Bồ Đào Nha in Vietnamese. England became Anh Cát Lợi, shortened to Anh, while the United States became Mỹ Lợi Gia, shortened to Mỹ. The formal name for the United States in Vietnamese is Hoa Kỳ; this is a former Sinitic name of the United States and translates literally as "flower flag".

Country	Sinitic name	Mandarin Pinyin	Cantonese Yale	Vietnamese name
Australia	澳大利亞	Àodàlìyǎ	Oudaaihleih'a	Úc
Austria	奧地利	Àodìlì	Oudeihleih	Áo
Belgium	比利時	Bǐlìshí	Béileihsìh	Bỉ
Czechia	捷克	Jiékè	Jithāak	Tiệp Khắc
France	法蘭西	Fǎlánxī (China), Fàlánxī (Taiwan)	Faatlàahnsāi	Pháp
Germany	德意志	Déyìzhì	Dākyiji	Đức
Italy	意大利	Yìdàlì	Yidaaihleih	Ý
Netherlands	荷蘭 (from 'Holland', a misnomer)	Hélán	Hòhlāan	Hà Lan
Prussia	普魯士	Púlǔshì	Póulóuhsih	Phổ
Russia	俄羅斯	Éluósī	Ngòhlòhsī	Nga
Spain	西班牙	Xībānyá	Sāibāanngàh	Tây Ban Nha
Yugoslavia	南斯拉夫	Nán Sīlāfū	Nàahm Sīlāaifū	Nam Tư
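
The character-by-character derivation described in this section can be sketched as a simple table lookup. The snippet below is only an illustration: the function name and the tiny reading table are assumptions made for the example, and the table holds only readings that appear in this section (習近平, 葡萄牙, 西班牙). A real table would need thousands of characters, and many characters have more than one reading.

```python
# Hypothetical mini-table of Sino-Vietnamese readings, limited to the
# characters used as examples in this section.
SINO_VIETNAMESE_READINGS = {
    "習": "tập", "近": "cận", "平": "bình",
    "葡": "bồ", "萄": "đào", "牙": "nha",
    "西": "tây", "班": "ban",
}


def sino_vietnamese_name(hanzi: str) -> str:
    """Read a Chinese name one character at a time and capitalise each syllable."""
    syllables = []
    for char in hanzi:
        reading = SINO_VIETNAMESE_READINGS.get(char)
        if reading is None:
            # The sketch makes no attempt to guess an unknown reading.
            raise KeyError(f"no Sino-Vietnamese reading recorded for {char!r}")
        syllables.append(reading.capitalize())
    return " ".join(syllables)


print(sino_vietnamese_name("習近平"))
print(sino_vietnamese_name("葡萄牙"))
```

The lookup deliberately fails on characters missing from the table rather than inventing a reading, since the correct reading cannot be inferred from the character alone.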

Usage

Sino-Vietnamese words have a status similar to that of Latin-based words in English: they are used more in formal contexts than in everyday life. Because Chinese languages and Vietnamese use different orders for subject and modifier, compound Sino-Vietnamese words or phrases might appear ungrammatical in Vietnamese sentences. For example, the Sino-Vietnamese phrase bạch mã ("white horse") can be expressed in Vietnamese as ngựa trắng ("horse white"). For this reason, compound words containing both native Vietnamese and Sino-Vietnamese words are very rare and are considered improper by some. For example, chung cư ("apartment building") was originally derived from chúng cư ("multiple dwelling"), but with the syllable chúng ("multiple") replaced by chung, a "pure" Vietnamese word meaning "shared" or "together". Similarly, the literal translation of "United States", Hợp chúng quốc, is commonly mistakenly rendered as Hợp chủng quốc, with chúng ("many") replaced by chủng ("ethnicity, race"). Another example is tiệt diện ("cross-section") being replaced by tiết diện.

One interesting example is the current official name and motto of Vietnam: "Cộng hòa Xã hội chủ nghĩa Việt Nam / Độc lập – Tự do – Hạnh phúc", in which all the words are Sino-Vietnamese.

Writing Sino-Vietnamese words with the Vietnamese alphabet causes some confusion about the origins of some terms, due to the large number of homophones in Sino-Vietnamese. For example, both 明 (bright) and 冥 (dark) are read as minh, thus the word "minh" has two contradictory meanings: bright and dark (although the "dark" meaning is now esoteric and is used in only a few compound words). Perhaps for this reason, the Vietnamese name for Pluto is not Minh Vương Tinh (lit. "underworld king star") as in other East Asian languages, but Diêm Vương Tinh or sao Diêm Vương, named after the Hindu and Buddhist deity Yama. During the Hồ dynasty, Vietnam was officially known as Đại Ngu ("Great Peace"). However, most modern Vietnamese know ngu as "stupid"; consequently, some misinterpret it as "Big Idiot". Conversely, because of the name's similarity to the country name, the Han River in South Korea is often erroneously translated as sông Hàn when it should be sông Hán. However, the homograph/homophone problem is not as serious as it appears, because although many Sino-Vietnamese words have multiple meanings when written with the Vietnamese alphabet, usually only one has widespread usage, while the others are relegated to obscurity. Furthermore, Sino-Vietnamese words are usually not used alone but in compound words, so the meaning of the compound word is preserved even if each component individually has multiple meanings.

Today Sino-Vietnamese texts are learnt and used mostly by Buddhist monks, since important texts such as the scriptures to pacify spirits (recited during the Trai đàn Chẩn tế ritual of the seventh lunar month) are still recited in Sino-Vietnamese pronunciation. One example is the chant Nam mô A Di Đà Phật, which comes from 南無阿彌陀佛.

Notes and References

Ky. Quang Muu. 2007. Doctoral thesis. Faculty of Linguistics, University of Social Sciences and Humanities, Vietnam National University, Hanoi.
Phan, John (January 2013). Lacquered Words: the Evolution of Vietnamese under Sinitic Influences from the 1st Century BCE to the 17th Century CE. Cornell. pp. 298–301.
[Ban Gu]
Bai Hu Tong : 卷四 : 誅伐. Chinese Text Project, ctext.org (in Chinese). 2024-05-18.
https://www.informatik.uni-leipzig.de/~duc/TD/td/index.php?word=%C3%A2n+ngh%C4%A9a&db=ve "ân nghĩa"
https://en.bab.la/dictionary/vietnamese-english/%C3%A2n-ngh%C4%A9a "ân nghĩa"

------------------

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Sino-Vietnamese vocabulary".

Source: Sino-Vietnamese vocabulary explained

x X x

APPENDIX J

Yueren Ge (越人歌) and the Vietnamese language
(Tiếng Việt trong bài Việt Nhân Ca)
Author: liketolearn

The Yueren song was a song in the Yue language recorded in the 6th century BC by 鄂君子皙 (Ngạc quân tử tích). This song has been extensively studied by Chinese historians and linguists for many decades because it's one of the few pieces of the Yue language left.

This afternoon, I spent about 2 hours examining the language in this song and I found that it fits modern Vietnamese quite well.

Before going into details, I want you to keep in mind the following:

1) This song was recorded approximately 2500 years ago, therefore you can't expect any modern language to fit perfectly into the song because all languages have changed a lot after 2500 years;

2) The Vietnamese language has come under heavy influence from Chinese, and therefore many ancient Viet words were replaced by Chinese words. As a result, sometimes you can't find a modern Viet word that matches the word in the Yueren song because that word was replaced by a Sino-Viet word;

3) The recording of the sound of the words in the Yueren song couldn't be 100% exact. Think about it, if you're a Chinese who doesn't know any English and you are to record the sound of an English song using Chinese characters, can you be 100% accurate in the recording of the sound?

4) The song couldn't be translated literally or word-for-word into Chinese. Think about translating a Chinese song to English and vice versa, you have to modify the song a lot in order for it to make sense in the language that it's translated into;

5) The song was recorded while it was being sung by a Yue girl; therefore, the tones of the words could be distorted or misheard because of the melody of the song.

6) The reconstruction of the old pronunciation of Chinese characters by linguists can never be exact but only close to the original sound.

Now let's take a look at the Yueren song.

滥兮抃草滥予？
昌枑泽予？
昌州州湛[饣甚]
州焉乎秦胥胥
缦予乎昭澶秦逾渗
惿随河湖

Here are two Chinese translations I found:

1) 晚今是晚哪？
正中船位哪？
正中王府王子到达。
王子会见赏识我小人感激感
天哪知王子与我小人游玩。
小人喉中感受

2) 今夕何夕兮？搴洲中流。今日何日兮？得与王子同舟。
蒙羞被好兮，不訾诟耻。心几顽而不绝兮，得知王子。
山有木兮木有枝，心悦君兮君不知。

Here, I'll stick to the first translation because it's simpler. If you have any other translated version of this song, please share it with me.

Now let's see how the words in this song match Vietnamese. For each sentence I will post the Chinese characters first, then the Mandarin pinyin, the Sino-Vietnamese reading, and the Old Chinese reconstructions of Karlgren and Starostin (characters that have no reconstruction are put in brackets [ ] with a question mark next to them). Then I'll list the possible translations of the sentence and match them with Vietnamese words.

I use the Old Chinese reconstruction of Karlgren and Starostin because these are the only two that I have access to.

The first sentence is 滥兮抃草滥予？

Mandarin pinyin: làn xī biàn cǎo làn yú?
Sino-Vietnamese: Lạm hề biện thảo lạm dư?
Old Chinese Reconstruction of Karlgren: glâm jiei b'ian ts'ôg glâm dio
Old Chinese Reconstruction of Starostin: [滥?] g(h)ēj b(h)rens shūʔ [滥?] Ła
This sentence was translated as "Which day is it today?" or "Which night is it tonight?"

The word 滥 reads Lạm in Sino-Vietnamese. Karlgren reconstructed this word as glâm and it was matched with the word haemx in the Zhuang language, which also means "night". I don't know what reason Karlgren gave for the reconstruction of a [g] before [lam], but with this reconstruction, it could go glam --> gam --> gham --> haem in the Zhuang language.

Well, in Vietnamese language, we also have the word hôm, which could mean "night" or "day", depending on which word it goes with.

In the phrase "đêm hôm khuya khoắt", hôm there means "night".

In the phrase "sao hôm", hôm there also means "night" (sao hôm = evening star).

In the phrase "hôm nay", however, hôm means "day". Hôm nay = today; hôm qua = yesterday.

In the phrase "sớm hôm", hôm there means "night".
(sớm = early, morning; hôm = night; sớm hôm = from morning to night)

Vietnamese "hôm" is undoubtedly related to Zhuang "haem" (night), but Vietnamese "hôm" can mean both "day" and "night". Perhaps for this reason, the Chinese translations have both "which day is it today?" and "which night is it tonight?"

But besides "hôm", Vietnamese also has the word "đêm", which means "night" as well. However, unlike "hôm", which can mean both "day" and "night", "đêm" only means "night".

The initial of the word "đêm" is very close to the initial of 滥 in Chinese and Sino-Vietnamese (Lạm). We know that Vietnamese đ has a "flapping" sound that makes it sound similar to [l], and it is usually mispronounced as [l] by ethnic Chinese speakers of Vietnamese. Any Vietnamese who lives near a Chinese community in Vietnam will know that first-generation Chinese in Vietnam usually say "li lâu" instead of "đi đâu" and "ở lây" instead of "ở đây". It seems that Chinese speakers can't pronounce the Vietnamese đ and change it to l. Besides, interchange of [t], [d] and [l] is common. Therefore, we can make a connection between Vietnamese "đêm" (night) and 滥 lam (supposed to mean "day" or "night").

Furthermore, Vietnamese "hôm" and "đêm" sound quite similar and could both have split off from glam (as reconstructed by Karlgren):

glam ~ gam ~ gham ~ ham ~ hôm
glam ~ lam ~ đam ~ đêm
Hôm nay = today ; hôm qua = yesterday
Đêm nay = tonight; đêm qua = yesterday night

So now we see the connection between 滥 (glam) and Vietnamese hôm and đêm.

Next, let's look at the word 兮. This word was reconstructed as jiei by Karlgren and as g(h)ēj by Starostin.

Chinese linguists matched this word with the word "neix" in Zhuang. I don't know the Zhuang language, but I suppose that neix together with haemx must mean "tonight" or "today" in Zhuang.

To be honest, I don't know how they could match jiei with nei. I suppose they followed a pattern that is similar to that of 人 [jan] in Cantonese and [nan] in Teochew? Anyway, for your information, Vietnamese language does also have the word "này" which means "this". Hôm nay = today ; đêm nay = tonight. (Nay is just a variant of the word "này").

Still, I want to talk about the reconstruction of g(h)ēj by Starostin.

g(h)ēj sounds like the word "kia" in Vietnamese, which means "that". Putting the word "kia" after the words "đêm" and "hôm", we have "đêm kia" (that night) and "hôm kia" (that day, the day before yesterday). So the meaning of "kia" in modern Vietnamese is a little bit different from the meaning of 兮 g(h)ēj ("this").

However, as I said, languages have changed a lot over 2500 years and what meant "this" 2500 years ago could mean "that" today. Besides, for your information, Vietnamese people do have a tendency of switching the meaning "this" to the meaning "that" just by changing the tone of the words.

Examples:

này = this ; nấy = that
cái này = this one; cái nấy = that one
đây = here ; đấy = there
ở đây = at here; ở đấy = at there

With the above examples, I want to show you that the meaning of "this" and "that" can change slightly over time, and it is possible that the word "kia" used to mean "this" in ancient Viet language, but in modern Viet language it means "that".

So to sum it up, the word 兮 can be either "này" or "kia" in Vietnamese language.

If it's này, then you have "hôm/đêm nay", which means "today" (or tonight). If it's kia, then you have "hôm/đêm kia", which means "that day" (or that night); but languages evolve over time so you can assume that it used to mean "this day" (or this night) in ancient Viet language.

Now let's skip to the end of the sentence. You have the phrase 滥予 at the end of the sentence.

If the phrase 滥兮 means "this night", then the phrase 滥予 should mean "which/what night" (So you can have the translation "which night is tonight?")

The word 予 is reconstructed as dio by Karlgren and Ła by Starostin. This word 予 is supposed to mean "which" or "what".

In modern Vietnamese we have the word "nào" for "which", and "gì, chi, sao" for "what".
The word "nào" sounds close to dio and Ła. It is known among linguists that many modern Vietnamese n came from [d].

For example:

nắng (sunny) <-- đắng (In Mường language, it's still đắng)
nước (water) <-- đác
náng (palm, sole) <-- đáng

Therefore, it is possible that modern Viet nào came from đào, and is therefore very close to dio or Ła.

Also, Vietnamese language has the word "đâu" which means "where". Both nào and đâu sound similar and both are interrogative words, so perhaps they were once used interchangeably, or perhaps they both stemmed from the same interrogative word in the ancient time used for all "what, where, which, who".

In summary, 滥予 could be "đêm/hôm nào" or "đêm/hôm đâu" in Vietnamese (assuming that "đâu" used to be an interrogative word with the same function as "nào").

Now back to the middle of the sentence with the phrase 抃草. The old pronunciation of these words was reconstructed as b'ian ts'ôg by Karlgren and b(h)rens shūʔ by Starostin. The phrase 抃草 (b'ian ts'ôg or b(h)rens shūʔ) sounds like "biết chắc" or "biết chăng" in Vietnamese.

biết sounds like b'ian and b(h)rens except for the ending, but remember that the song was recorded while it was being sung by a Yue girl, so it is possible that the -t ending was changed to an -n ending because the musical note was long. Even today, if you listen to Vietnamese songs, you'll notice that the -p, -t, -c, -ch endings are changed into -m, -n, -ng, -nh endings respectively when the notes are long or when the pitch of the notes is slightly off from the pitch of the tones. To the ears of Vietnamese, they are still -p, -t, -c, -ch (because we are used to the words in our language), but to the ears of foreigners they'll sound more like -m, -n, -ng, -nh.
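
The stop-to-nasal pairing just described (-p to -m, -t to -n, -c to -ng, -ch to -nh) can be written out as a small spelling-level sketch. This is an illustration only: the function name is made up for the example, it works on the written syllable rather than on real acoustics, and it leaves tone marks untouched, so the output is a rough picture of what a foreign listener might hear, not a valid Vietnamese spelling.

```python
# Stop -> nasal pairs named in the text. "ch" is listed before "c" so that
# the two-letter final is checked first.
STOP_TO_NASAL = [("ch", "nh"), ("p", "m"), ("t", "n"), ("c", "ng")]


def sung_final(syllable: str) -> str:
    """Replace a final stop with its matching nasal, as heard on a long note."""
    for stop, nasal in STOP_TO_NASAL:
        if syllable.endswith(stop):
            return syllable[: -len(stop)] + nasal
    # Open syllables and syllables that already end in a nasal are unchanged.
    return syllable


for word in ("biết", "chắc", "sách", "hôm"):
    print(word, "->", sung_final(word))
```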

chắc sounds like ts'ôg and shūʔ. The ʔ in shūʔ could be equivalent to -k ending.

Vietnamese chăng probably originates from the word chắc. Today, "chăng" in Vietnamese is a question word. "Chắc" literally means "sure" but it can be used as a question word like "chăng" too. For example:

- Cậu biết cô ấy là ai chắc? (You know who she is, sure?)
- Cậu biết cô ấy là ai chăng? (You know who she is, not?)
- Biết chắc anh ấy đang làm gì? (know what he is doing? (questioning one's self))
- Biết chăng anh ấy đang làm gì? (know what he's doing? (questioning one's self))
- Chắc cậu bé đã trưởng thành (perhaps, the little boy has grown up)
- (Phải) chăng cậu bé đã trưởng thành (perhaps, the little boy has grown up)

So you see that at first, "chắc" (which literally means "sure") was used as a question word. Later, this question word developed into "chăng". So now, we have the word "chăng", which has a similar function to the word "chắc" in terms of questioning, but doesn't carry the meaning "sure" of the word "chắc".

Anyway, when you place the phrase "biết chắc" or "biết chăng" into the sentence, it makes perfect sense even in modern Vietnamese.

滥兮抃草滥予 - glâm jiei b'ian ts'ôg glâm dio

Đêm nay biết chắc đêm nào? or Đêm nay biết chăng đêm nào?

The main idea of the sentence is "tonight is what night?" but it's not simply that. It could be translated as "Does anyone know which night is tonight?" But she's not asking anyone. She's asking herself. It's hard to translate it perfectly into another language because of that ambiguous questioning created by the phrase "biết chắc" or "biết chăng", yet this kind of questioning is what Vietnamese use in their folk songs and poetry all the time.

Now, let's look at the next sentence: 昌枑泽予？

Mandarin pinyin: chāng hù zé yú?
Sino-Vietnamese: Xương hộ trạch dư?
Old Chinese Reconstruction of Karlgren: t'iang g'o d'àk dio
Old Chinese Reconstruction of Starostin: Thaŋ g(h)ā(ʔ)s [泽?] Ła?
This sentence was translated as "正中船位哪？" (which honorable person is inside the boat?)

The character 昌 was reconstructed as t'iang or thang and I matched it with the word "trong" in modern Vietnamese, which means "inside".
The word trong came from tlong, and tlong sounds quite similar to thang.

The word 枑 was reconstructed as g'o or g(h)ā(ʔ)s and I matched it with the word "ghe" in modern Vietnamese, which means "a small boat".

So 昌枑 (t'iang g'o) would be "tlong ghe" (inside the boat) in Vietnamese.

泽 (d'ak or lak) could be matched with the word "là" in Vietnamese (though the endings are off), which means "to be".

If so, then 予 (dio) would take the meaning of "who". (昌枑泽予？= Inside the boat is whom?)

It's possible that in the ancient time there was one common interrogative word that was used for all "which, what, who, where". That's why in the above sentence 予 means "which/what" but in this sentence it means "who".

I said above that 予 could be matched with the word "đâu" or "nào" in modern Vietnamese.

So 昌枑泽予 when matched with Vietnamese would be "trong ghe là nào (ai)" .

There's another possibility, however: 泽 (d'ak or lak) may have been a Yue word used to indicate an honorable person. In modern Vietnamese, this word has been lost and replaced by the Sino-Viet word "vị" (位).

If so, then 予 would take the meaning of "which/what" as in the above sentence.

It's also possible that 泽 (d'ak or lak) was a contraction of the phrase "là ngài". Ngài could sound like "gài" (note that ngài was an older form of "người").

"Là gài nào?" could have been misheard in the song and became "lag nao", especially when the note on "là" was long and the note on "gài" was quick.

If so then 昌枑泽予 (t'iang g'o làk dio) would be "tlong ghe là-gài nào?" (Which person is inside the boat?)

Next: 昌州州湛[饣甚]

Mandarin pinyin: chāng zhōu zhōu shèn?
Sino-Vietnamese: xương châu châu thậm?
Old Chinese Reconstruction of Karlgren: t'iang tiôg tiôg kâm
Old Chinese Reconstruction of Starostin: Thaŋ [州州] d(h)ǝmʔ
This sentence was translated as "正中王府王子到达" (Inside the boat, the prince comes)

The word 昌 has been matched with the "trong" (inside) in modern Vietnamese.

州州 was reconstructed as "tiôg tiôg" by Karlgren. This word probably meant "the prince" in the ancient Yue language, but we know that words for princes or royal people in modern Vietnamese are all Sino-Vietnamese: hoàng tử, vương tử, vương tôn etc. So it's understandable that we can't find a Viet word to match this. The closest word I can think of is "chúa", but it's controversial whether chúa has a Viet origin or a Han origin (though chúa is a Sino-Viet word, many Southeast Asian languages have words that are similar to chúa, so perhaps this was originally a native Viet word?).

湛 was reconstructed as d(h)ǝmʔ by Starostin and I matched it with the word "đến" in modern Vietnamese, which means "to come".
(Sidenote: Karlgren reconstructed 湛 as kam but I don't know how he could fit the k- in there and I can't find a Viet word to match it if it was kam)

Chinese linguists matched this word with the Zhuang word daengz (which I suppose means "to come" too).

Viet đến and Zhuang daengz are similar anyway as the -n ending in Vietnamese is linked to the -ng ending in many other languages.

For example:

Viet chân (leg) --- Yao ching --- Thai ʒǝ:ŋ.A --- Khmer ʒaǝŋ
Viet bùn (mud) --- Thai pung (mud)
Viet đèn (light) --- Chinese 燈 đăng
Viet đền (to compensate) --- Chinese 償 thường
Viet tên (name) --- Chinese 姓 tính (surname)
Viet chôn (bury) --- Chinese 喪 tang

So 昌州州湛 (t'iang tiôg tiôg d(h)ǝmʔ) when matched with Vietnamese would be "[bên] trong, chúa chúa đến" (here I just use the word "chúa", which means "lord" in Vietnamese, for 州州, though I know it's controversial).

Next sentence: 州焉乎秦胥胥

Mandarin pinyin: zhōu yān hū qín xū xū
Sino-Vietnamese: châu yên hồ tần tư tư
Old Chinese Reconstruction of Karlgren: tiôg gian g'o dz'ien siwo siwo
Old Chinese Reconstruction of Starostin:[州] ʔan wā ʒ́in sa sa
This sentence was translated as 王子会见赏识我小人感激感激

The word 州, as said above, means the "prince" or some term for a royal person. The closest Viet word to match this is chúa but it's doubtful.

I matched the word 焉 (gian or ʔan) with "nghiền" in modern Vietnamese.

Nowadays, the word "nghiền" in Vietnamese means "to be addicted". But the meanings "to love" and "to be addicted" are not very different. In the ancient Yue language it probably meant "to love something a lot" (as I said, the meaning of words can change a lot over more than 2500 years).

The word 乎 was reconstructed as g'o by Karlgren and wā by Starostin.
I matched 乎 with the word "qua" in Viet, which means "I, me". Though today, this word is rarely used in Vietnamese, the Muong people still have the word "qua" for "I, me".

秦 and 胥胥 were probably some Yue words to describe emotions, and I admit that many words for emotion in Vietnamese today are Sino-Vietnamese, so it was hard for me to find pure Viet words to match 秦 and 胥胥. Nevertheless, I tried, and here is what I came up with.

I matched 秦 (dz'ien or z'in) with "thẹn" in Vietnamese (also a variant "tẽn"). Remember that Viet t- and th- came from [s] or initials that are similar to [s]. So thẹn and tẽn were something like sẹn and xẽn.

Modern Vietnamese "thẹn" or "tẽn" means "feeling shy" or "ashamed", as when a man asks a young woman to go on a date with him and she shyly says "yes". "Thẹn thùng" is a word usually used to describe girls or young women when they feel shy and blush.

I matched 胥 (sa or siwo) with the word "xao" in Vietnamese.

In modern Vietnamese, we have the word "xao xuyến" (or "xuyến xao") to describe an emotion (usually for some kind of love, some kind of longing), which the Vietnamese-English dictionary translates as "stirred, excited".

"Xao động" in Vietnamese is "agitated"

胥胥 would be "xao xao" in Vietnamese

Perhaps 秦 was a duplicate of 胥 "xao" (as in "xuyến xao"). Duplicating a word is a common phenomenon in Vietnamese (xao xuyến, bồi hồi, băn khoăn, vu vơ, lộng lẫy, xót xa…)

So 州焉乎秦胥胥 (tiôg gian g'o dz'ien siwo siwo) when matched with Vietnamese would be "chúa nghiền, qua thẹn [và] xao xao" (vương tử yêu, thiếp thẹn và thấy xao xuyến) ~ the prince loves, I feel shy but happy

Next sentence: 缦予乎昭澶秦逾渗

Mandarin pinyin: màn yú hū zhāo chán qín yú shèn
Sino-Vietnamese: Man dư hồ chiêu thiền tần du sấm
Old Chinese Reconstruction of Karlgren: mwân dio g'o t'jog d'ân dz'ien diu ts'âm
Old Chinese Reconstruction of Starostin: [缦] Ła wa taw dan ʒ́in lo [渗]
This sentence was translated as: 天哪知王子与我小人游玩。

The word 缦 was reconstructed as mwân by Karlgren. Chinese linguists matched this word with the word ngoenz in the Zhuang language, meaning "day" (ngoenz sounds like ngày in Vietnamese).

I matched this word with "buổi" in Vietnamese. The meaning of "buổi" is similar to "hôm" and "ngày" but it refers to only half of a day. We have phrases like "buổi trước", "buổi ấy" (mean "the day before", "that day"). Perhaps a variant of "buổi" is "bữa". Bữa means "day", and it also means "meal, repast". We also have phrases like "bữa trước", "bữa ấy" (equivalent to "buổi trước" and "buổi ấy").

b in buổi matches with m in mwan.
u in buổi matches with w in mwan.
ô in buổi matches with a in mwan.

The only thing that seems strange is the -i ending against the -n ending.

Now I want to point out that there is a pattern of interchange between the [n] ending and the [i] ending from Chinese to Vietnamese, as well as from other languages to Vietnamese.

Examine the following cases:

Chinese 鮮 (Sino-Viet: tiên) ~ Vietnamese tươi "fresh"
Chinese 線 (Sino-Viet: tuyến) ~ Vietnamese sợi "string, filament"
Chinese 奔 (Sino-Viet: bôn) ~ Vietnamese vội "to be in hurry, haste"

You see the [n] ending in Chinese and Sino-Vietnamese turn out to be [i] in native Vietnamese.

Why does that happen?

Because many [n] endings in Chinese were [r] in Old Chinese. Examples:

鮮 shar --> shan --> sjen --> tiên in Sino-Viet
線 sor --> son --> sjwèn --> tuyến in Sino-Viet
奔 poǝ̄rs --> poǝ̄n --> bôn in Sino-Viet

Where the [r] ending became [n] in Chinese, it became [i] in Vietnamese:

鮮 shar --> shai --> suoi --> tươi in Vietnamese
線 sors --> sợi in Vietnamese
奔 poǝ̄rs --> pội --> vội in Vietnamese

The pattern of [r] becoming [i] in Vietnamese can also be seen in the following cases:

Chinese 唆 sōr ---> Vietnamese xui, xúi "to urge, to incite, to induce"
Chinese 熯 sŋārʔ ---> Vietnamese sấy "to dry or burn over fire"

There was a time when Chinese confused the [r] ending with the [n] ending; later both [r] ending and [n] ending were merged to become [n] ending.
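
The split proposed above, where an old [r] ending surfaces as [n] on the Chinese side and as [i] on the Vietnamese side, can be made concrete with a short sketch. The function name is invented for this illustration, and the code only produces the first intermediate step of each chain (for example shar to shan and shai); the later vowel and tone changes in the examples are not modelled.

```python
from typing import Tuple


def reflexes_of_final_r(old_form: str) -> Tuple[str, str]:
    """Return (Chinese-side form, Vietnamese-side form) for a form in -r or -rs."""
    # Some reconstructions write an extra -s after the -r; drop it first.
    stem = old_form[:-1] if old_form.endswith("s") else old_form
    if not stem.endswith("r"):
        raise ValueError("expected a reconstructed form ending in -r or -rs")
    base = stem[:-1]
    # -r merges with -n on the Chinese side and becomes -i on the Vietnamese side.
    return base + "n", base + "i"


for form in ("shar", "sors"):
    chinese_side, vietnamese_side = reflexes_of_final_r(form)
    print(form, "->", chinese_side, "/", vietnamese_side)
```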

So buổi could have been something like bwor and was transcribed as 缦 mwân into Chinese.

The variant "bữa" (meaning "day, meal") further supports an ancient form similar to bwor.

Furthermore, Viet "buổi" and "bữa" are connected to "ngày" through "mwân" and the Zhuang word "ngoenz".

Chinese linguists link mwan to ngoenz in Zhuang.

Yet mwan sounds similar to buổi and bữa in Vietnamese.

While Zhuang ngoenz sounds similar to Viet ngày.

Tai has the words *ŋa:i.A for "morning meal", *ŋwa.A for "yesterday" and *ŋwan.A for "day".

Note that the meanings "morning meal" and "day" are similar to the meanings of "bữa" (meal, day) in Vietnamese.

So in the old days, the word could have been something like ŋbwar. Then this word was split into several different words.

In Vietnamese, it's ŋbwar --> ŋar --> ngày (day) and ŋbwar --> bwar ---> bữa and buổi (meal, day, half a day)

In Zhuang, it's ŋbwar ---> ŋwar ---> ngoan (day)

In Thai, it's ŋbwar ---> ŋwar --> ŋwaj (morning meal) and ŋbwar --> ŋwar --> ŋwan (day)

When the Chinese recorded this word in the Yueren song, they heard ŋbwar and recorded it as mwan.

予 was already matched with đâu or nào in Vietnamese. 缦予 (mwân dio) would be "buổi nào" or "bữa nào" or "ngày nào"

乎 was already matched with "qua" in Vietnamese. But in this case, I suggest that it means "we" instead of "I, me" (Like "ta" could mean both "I" and "we" in Vietnamese).

昭澶 was reconstructed as taw dan in Starostin and t'jog d'ân by Karlgren.

I matched this word with "đu đưa" in Vietnamese.

In modern Vietnamese, the phrase "đu đưa" is used to describe the motion of "swaying back and forth". You can "đu đưa" on a swing, a cradle, and of course on a boat in the middle of a river too.

đu sounds similar to taw and t'jog as reconstructed by Starostin and Karlgren.

đưa was probably something like dur or dor and was transcribed as d'ân by Chinese (as said above, there was a time when Chinese confused -r ending with -n ending).

This is similar to the case of "chua" (sour). Modern Viet chua was probably from something like "chur", similar to 酸 śūr in Old Chinese. But later when -r ending was confused with -n ending, Chinese changed it to son then swan (toan in Sino-Vietnamese).

Similarly you can argue that "mưa" (rain) was something like mur or bur (~ mul, mol, mun, mon, bul, bol, bun, bon in other languages).

mây (cloud) was from something like mar or mor (~ mal, mol, man, mon in other languages).

But let's set this aside and go back to the Yueren song.

In the phrase 逾渗, 逾 was reconstructed as diu by Karlgren and lo by Starostin.

I matched 逾 with the word "trôi" in Vietnamese which means "flowing, drifting"

We know many tr- in modern Vietnamese came from bl-

Examples: trăng <-- blăng; trời <-- blời; trái <-- blái; tro <-- blo

Therefore, it is very possible that Vietnamese trôi came from blôi.
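The bl- to tr- correspondence can be written down as a one-line rewrite rule. The sketch below is just that rule applied to the examples given here; the function name is invented for illustration, and the step from the attested pairs to blôi --> trôi remains the post's conjecture.

```python
def bl_to_tr(older_form: str) -> str:
    """Rewrite an initial bl- cluster as tr-; other words pass through unchanged."""
    if older_form.startswith("bl"):
        return "tr" + older_form[2:]
    return older_form


# The four pairs listed above, plus the conjectured blôi.
for older in ("blăng", "blời", "blái", "blo", "blôi"):
    print(older, "->", bl_to_tr(older))
```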

Traces of blôi can be seen in the word bơi lội (swimming). Today bơi and lội are two separate words and can go alone, but they usually go together and were probably a split from the word blôi (meaning "to drift, to flow")

Another Viet word that is related to blôi and lội is nổi ("to float").

So to sum it up.

blôi --> bơi and lội (both mean "to swim" but the meanings are slightly different)
blôi --> trôi (means "drifting, flowing")
blôi --> nổi (nổi "floating")

blôi in ancient Vietnamese could have all of the above meanings (drifting, flowing, floating, swimming) and it also sounds close to 逾 "lo"

渗 was reconstructed as ts'âm

I matched this word with the word "suối" in Vietnamese, which means "stream", though the endings are different. The -m ending in ts'âm was probably some kind of suffix. In Vietnamese today, we have the phrase "suối mơ", which means "dreamy stream" and is used to describe beautiful scenery. The Yue girl probably sang "súi mơ" (assuming that the Viet language at that time didn't have complicated vowels like uối), but the note on súi was long and the note on mơ was quick, so it got recorded as "súm".

So 逾渗 (lo ts'âm) when matched with Vietnamese would be "blôi súi-mơ" (meaning: drifting along a dreamy stream or floating on a dreamy stream).

So to sum up, the sentence 缦予乎昭澶秦逾渗 (mwân dio g'o t'jog d'ân dz'ien lo ts'âm) when matched with Vietnamese would be "buổi nào qua [lại] đu đưa thẹn trôi suối-mơ" (which could be roughly translated to English as "will there be another day when we can be together on the swaying boat, drifting along the beautiful stream").

If some of you wonder why I didn't incorporate the word "thẹn" (shy, shyly) into the English translation, it's because it would sound awkward in English if I did. In Vietnamese, however, it sounds perfectly fine, as in "thẹn bước trên đường" (shyly walk on the road) or "thẹn trôi trên sông" (shyly flow along the river).

The last sentence: 惿随河湖 Old Chinese Reconstruction of Karlgren: ziek zwie g'â g'o
Old Chinese Reconstruction of Starostin: [惿随] ghāj ghā
This sentence was translated as: 小人喉中感受。

When I look at the phrases "ziek zwie" and "g'â g'o", I think of "tức thở" and "nghẹn ngào" in Vietnamese.

Since many Vietnamese t came from s, tức thở would be xức sở in Old Vietnamese. Though "xức sở" is not exactly like "ziek zwie", it is quite close.

"tức thở" in modern Vietnamese means to feel suffocated, unable to breathe.

"nghẹn ngào" in modern Vietnamese means to feel choked by tears, choked with emotion.
(nghẹn means "choked", ngào is a duplicate of nghẹn; but "nghẹn ngào" means to be choked with emotion).

"nghẹn" has an -n ending so it doesn't match much with 河 g'â or ghāj.
河 was probably "ngạt", which means the same thing as "nghẹn"
"ngạt ngào" matches with 河湖 "g'â g'o" more than "nghẹn ngào"

or perhaps 河湖 g'â g'o was "ngạt cổ"? (cổ means "throat")

The phrase "nghẹn ngào" in modern Vietnamese (which means to be choked with emotion, to be overwhelmed with emotion, so much that makes a person unable to talk) probably stemmed from some old Viet phrases like ngạt cổ or nghẹn cổ (ngạt/nghẹn = choked; cổ = throat).

So 惿随河湖 when matched with Vietnamese would be "tức thở nghẹn ngào" (to be so overwhelmed with emotion that a person feels choked and unable to talk, speechless).

Now let's look at the whole song:

滥兮抃草滥予？glâm jiei b'ian ts'ôg glâm dio?
昌枑泽予? t'iang g'o d'àk dio?
昌州州湛 t'iang tiôg tiôg dâm
州焉乎秦胥胥 tiôg gian g'o dz'ien siwo siwo
缦予乎昭澶秦逾渗 mwân dio g'o t'jog d'ân dz'ien lo ts'âm
惿随河湖 ziek zwie g'â g'o

Vietnamese:

Hôm nay biết chắc hôm nào?
Trong ghe là-ngài nào?
Trong ghe, chúa chúa đến
Chúa nghiền, qua thẹn xao xao
Buổi nào [chúng] qua [lại] đu đưa thẹn trôi suối-mơ
[Ôi thiếp thấy] tức thở nghẹn ngào

English translation of Yueren song based on Vietnamese

Does anyone know which night it is tonight? (or: Does anyone know which day it is today?)
Who is that person inside the boat?
Oh, inside the boat is the prince
Being cherished by the prince, I feel shy but also stirred and happy.
Will there be another day when we can be together on the swaying boat, shyly drifting along the beautiful stream?
Oh I'm so overwhelmed with emotion.

Ok, that's it.

I know my "decoding" of the Yueren song based on modern Vietnamese language is not 100% solid and there may be some flaw or mistake in it, but please give me a break (as in don't be too harsh when you criticize my mistakes) as I am comparing a song written in a language that existed 2500 years ago transcribed into another language by non-native speakers of the original language with a modern language that has undergone much influence from Chinese. I don't think any expert linguist can match that language in Yueren song perfectly with any modern language, let alone an 18-year-old Vietnamese girl like me. But I just want to share what I've found…So any comment or correction on this?

liketolearn

Source: http://www.chinahistoryforum.com/index.php?/topic/30242-yueren-ge-%26-36234%3B%26-20154%3B%26-27468%3B-and-vietnamese-language/

VIỆTNHÂN CA

Author: Đỗ N. Thành

Nhạn Nam Phi
(translated by Đỗ N. Thành)

Năm nầy bảo với năm xưa
Thương chàng hoàng tử thương chiều chiều xưa
Sớm chiều em hận tương tư
Mà ai hiểu đặng tình yêu sâu đầy.

REDISCOVERING THE VIỆT NHÂN CA (越人歌)

by Thanh Đo

The Việt nhân ca is extremely famous. After it was featured and sung in a film, a wave of popular interest in the song arose, and it was no longer a matter for cultural researchers alone. It is famous because it can be called the first love poem, the earliest folk song to be recorded in full, about 2,800 years ago…

A summary of the background of the Việt nhân ca: Lưu Hướng (刘向) was a fourth-generation descendant of Lưu Giao (刘交). Lưu Giao was the younger brother of Lưu Bang (刘邦), the founding emperor of the Han dynasty. Lưu Hướng was the author of the Thuyết uyển (说苑). The book has a chapter telling the story "Tương Thành Quân thủy phong chi nhật" (襄成君始封之日). Tương Thành Quân is King Tương of Sở (楚襄王), whose name was Hùng Hoành (熊橫). The story mentions Ngạc Quân Tử Tích (鄂君子皙), that is, the Sở king Hùng Ngạc (楚熊咢), who was out on a pleasure boat enjoying the scenery when a boatman sang a Việt folk song. Ngạc Quân Tử Tích had someone write it down and translate it into the "Sở" (Chu) language; this is the "Việt nhân ca".

The original text of that passage is as follows:

"襄成君始封之日，衣翠衣，带玉剑，履缟舄（舄：xi4，古代一种双层底加有木垫的鞋；缟舄：白色细生绢做的鞋），立于游水之上，大夫拥钟锤（钟锤：敲击乐鼓的锤子），县令执桴（桴：鼓槌）号令，呼：“谁能渡王者于是也？”楚大夫庄辛，过而说之，遂造托（造托：上前求见）而拜谒，起立曰： “ 臣愿把君之手，其可乎？”襄成君忿然作色而不言。庄辛迁延（迁延：退却貌）沓手（沓：盥之误字，盥手即洗手）而称曰：“君独不闻夫鄂君子皙之泛舟于新波之中也？乘青翰之舟（青翰：舟名，刻成鸟形的黑色的船），极（：man2，上艹下两；芘：bi4。芘：不详为何物，疑为船上帐幔之类），张翠盖而检 (检：插上) 犀尾，班（班，同“斑”）丽袿 (袿：gui1，衣服后襟，指上衣) 衽 (衽：ren4,下裳)，会钟鼓之音，毕榜枻（榜：船；枻，yi4,桨。榜枻：这里指代船工）越人拥楫而歌，歌辞曰：‘滥兮抃草滥予，昌枑泽予昌州州，饣甚州州焉乎秦胥胥，缦予乎昭，澶秦踰渗，惿随河湖。’鄂君子皙曰：‘吾不知越歌，子试为我楚说之。’于是乃召越译，乃楚说之曰：‘今夕何夕兮，搴中洲流。今日何日兮，得与王子同舟。蒙羞被好兮，不訾诟耻。心几顽而不绝兮，知得王子。山有木兮木有枝，心说君兮君不知。’ 于是鄂君子皙乃揄修袂，行而拥之，举绣被而覆之。鄂君子皙，亲楚王母弟也。官为令尹，爵为执圭，一榜枻越人犹得交欢尽意焉。今君何以踰于鄂君子皙，臣何以独不若榜枻之人，愿把君之手，其不可何也？”襄成君乃奉手而进之，曰：“吾少之时，亦尝以色称于长者矣。未尝过僇（僇：lu4,羞辱）如此之卒也。自今以后，愿以壮少之礼谨命。”

Translation:

"On the first day that Tương Thành Quân was enfeoffed, he wore fine clothes, carried a jade sword, wore raised clogs, and stood above the stream, while the grandees struck the instruments and beat the drums. He gave the order: 'Who can ferry this king across?' The Sở grandee Trang Tân stepped forward to pay his respects, stood up straight and said: 'Your servant wishes to hold my lord's hand; may I?' Tương Thành Quân was furious; his face changed colour and he said nothing. Trang Tân, having lost face, brushed off his hands and said: 'Has my lord not heard how Ngạc Quân Tử Tích went boating on the new waves? He was on the boat Thanh Hàn, with banners raised and a fine cloak on his shoulders. Amid the sound of bells and drums, the boatman, who was a Việt, sang. The words of the song were: "Lạm hề biện thảo lạm dư, xương hoàn trạch dư xương châu châu, thực thẩm châu châu yên hô tần tư tư, mạn dư hô chiêu, thẳn tần du sâm, đề tùy hà hồ." Ngạc Quân Tử Tích said: "I do not understand Việt songs; try to put it into the Sở language for me." So an interpreter was called, and in the Sở language it meant: "Kim tịch hà tịch hề, khiên trung châu lưu, kim nhật hà nhật hề, đắc dĩ vương tử đồng chu. Mông tu bị hảo hề, bất hiềm cấu sĩ. Tâm kỷ phiền nhi bất tuyệt hề, tri đắc vương tử. Sơn hữu mộc hề mộc hữu chi, tâm thuyết quân hề quân bất tri." Having heard it, Ngạc Quân Tử Tích rolled up his sleeves, went over to embrace the boatman, and covered him with an embroidered quilt. Ngạc Quân Tử Tích was the Sở king's younger brother by the same mother, held the office of lịnh doãn and a high noble rank, and yet he could share his joy to the full with a Việt boatman. Why now does my lord hesitate more than Ngạc Quân Tử Tích, and why am I less than a boatman? I wish to hold my lord's hand; why may I not?' Tương Thành Quân held out his hand and stepped forward, saying: 'Since I was young I have been praised by my elders for my dignity, and I have never been caught unawares like this. From now on I will heed your instruction, sir.'"

It is thanks to this passage that the Song of the Việt has survived to this day. Many people have translated it into Vietnamese from the Chinese text, and the following translation can be regarded as the standard one:

Việt nhân ca

(Vietnamese translation from the forum of the Viện Việt Học)

Đêm nay đêm nào chừ, chèo thuyền giữa sông
Ngày này ngày nào chừ, cùng vương tử xuôi dòng.
Thẹn được chàng mến yêu chừ, nào chê phận thiếp long đong
Lòng rối ren mà chẳng dứt chừ, được gặp chàng vương tông
Non có cây chừ, cây có cành chừ; lòng yêu chàng chừ, chàng biết không?

For more than two thousand years the anecdote has remained in the books. Generations have read and praised it, all content with the translation, yet no one has studied the original text of the song, that is, the Việt version! Could it be that the language faded away, and that, unable to understand it over all those years, generation after generation of literati settled for its shadow and outline?

Countless linguists over the recent centuries have laboured over the question of what language the phonetic transcription of the Việt nhân ca records. The researchers who have studied it include people versed in most of the relevant languages, and they argue that the transcription can be explained through the speech of the following peoples: Tráng tộc 壮族, Đồng tộc 侗族, Bố Y tộc 布依族, Thái tộc 傣族, Thủy tộc 水族, Mao Nam tộc 毛南族, Hạ Lào tộc 仫佬族, Lê tộc 黎族... because all of these peoples descend from the ancient Việt (Cổ Việt tộc 古越族). In the end, the transcription theory concluded that the song belongs to the Choang, that is, the Tráng (Zhuang) people...

...At present the Việt nhân ca is known as a folk song of the "Choang" people, recorded in phonetic transcription by Sở (Chu) people of the Spring and Autumn period.* Some hold that the Sở lịnh doãn, Ngạc Quân Tử Tích, heard the Việt song and then had it translated into the Sở language. Sở was so vast that Northern Sở usually called itself Kinh Sở, and Southern Sở called itself Tương Sở or Tượng Sở. At times in ancient history Southern Sở broke away as the independent state of Dương Việt. Going back even further than the Spring and Autumn period, in remote antiquity a Sở "lịnh doãn" named Tử Văn went to the Chu (Zhou) court and spoke in the Sở language, and at that court, which called itself Hoa, no one understood him... This is recorded in the Sử ký. Please weigh this story carefully and do not mistake the Sở language for Chinese. The Chinese did not even know what the Sở title "lịnh-doãn" meant, so they merely noted that the "lịnh-doãn" was an office equivalent to "tể tướng", also called "thừa tướng". In fact lịnh-doãn (令尹) is an ancient polysyllabic word: quan lịnh-doãn, or quan loãn, is quan loan, that is, quan lang, which exists only in Vietnamese and which only the Việt understand. Việt officials in the time of the Hùng kings were called quan lang, or loan; when transcribed in square characters this became lịnh-doãn (令尹). In the Spring and Autumn period the Việt language was still used as the common tongue among the small states of the Central Plain, where it was called Nhã ngữ. Nhã ngữ is the Việt language, which today is also called Chinese; it has become monosyllabic, so many people wrongly believe that "Việt" and "Hoa" are two different languages. For example, "Trữ-la" village is really "Tử la" village, meaning "Tả village". "Trả", "tả" or "trái" (left) is the same as "Tó" (Teochew), "Chỏ" (Cantonese) and "Chò" (Beijing); although they share one root, once the sounds had diverged the people of one region no longer understood the speech of another.

*An important point to note: Mr. Phùng Minh Tường, a researcher of the ancient music of the peoples of China, has asserted that treating the "Việt nhân ca" as Choang really does not hold up, because after going through every Choang folk-song form he found none to which the "Việt nhân ca" can be sung, even though it is a folk song. He could not, however, determine which people the "Việt nhân ca" belongs to.

* The phonetic transcription of the song's Việt text was recorded as:

滥兮抃草滥予 Lạm hề biện thảo lạm dư
昌枑泽予昌州州 Xương hằng trạch dư xương châu châu
饣甚州焉乎秦胥胥 Thực thầm châu yên hồ tần tư tư
缦予乎昭 Mạn dư hồ chiêu
澶秦逾渗惿随河湖 thìn tần du sâm, đề tuỳ hà hồ

(When this transcription is rendered in Sino-Vietnamese, one character is usually missing from line 3: "饣", which is the character 飠, Thực.)

Rendering into Sino-Vietnamese a text that uses ancient characters to "transcribe" Việt speech is very difficult, because some of the characters are no longer in use and so are not in the dictionaries. Even when a character can be looked up, the reading is not necessarily right, because pronunciation differs from place to place. Moreover, across a gap of up to a thousand years the pronunciation and written form of some characters may have changed, and the sounds shifted differently in each language area, and so on. To this day the transcription is still regarded as recording "Choang", that is, the "Thái" language of the Tráng people.

I now present my "reconstruction" of the transcription characters of the Việt nhân ca, as follows:

The Việt nhân ca is recorded in 33 characters. I set them out again here, arranged as I see fit:

滥兮抃草滥予昌枑泽予昌州州飠甚州焉乎秦胥胥缦予乎昭澶秦踰渗惿随 ...河湖。

- This is Vietnamese. Because it matters a great deal, I rearrange it to fit lục bát (6-8) verse (note that two characters joined by a hyphen form one "polysyllabic" word):

滥兮抃-草滥予 Lạm hề biện-thảo lạm dư
昌枑泽予昌州州飠Xương hoàng trạch-dư xương châu thực
甚州焉乎-秦胥胥 Thẩm châu yên hô-tần tư tư
缦予乎-昭澶秦踰渗惿-随 Mạn dư hô-chiêu thìn tần du sâm đề-tùy.
...河湖。 Hà Hồ.
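As a quick check on the figure of 33 characters, the rearranged lines can be counted mechanically. This is a small sketch with illustrative names; it counts written characters only, ignoring the hyphens that mark the author's "polysyllabic" pairs, and does not test the metrical claim itself.

```python
# The five rearranged lines, copied from the arrangement above.
ARRANGED_LINES = [
    "滥兮抃-草滥予",
    "昌枑泽予昌州州飠",
    "甚州焉乎-秦胥胥",
    "缦予乎-昭澶秦踰渗惿-随",
    "河湖",
]


def count_characters(line: str) -> int:
    """Count the characters in a line, skipping the hyphens used as joiners."""
    return sum(1 for ch in line if ch != "-")


per_line = [count_characters(line) for line in ARRANGED_LINES]
total = sum(per_line)
print(per_line, total)
```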

* To translate this song from ancient Vietnamese into present-day Vietnamese, let me explain the transcription characters of the Việt nhân ca:

滥 : "Lạm" is "Lam" or "nam", that is, "Năm" (year). "L" and "N" commonly alternate; today the colour "Lam" is "Nam" in Teochew. In many places in the Cantonese, Teochew and Việt areas "L" and "N" are often confused.
兮: Hề... hầy, nầy, nè, đây (this, here)... with many sound variants.
抃草: Biện-thảo is the polysyllabic form of "bảo" (to tell).
予: "Dư" also has the reading "ia" (Teochew, Beijing): "năm dư" may correspond to today's "năm kia" or "năm xưa" (a year gone by);
昌: the transcription "xương" stands for "thương" (to love). Today the native Việt word in Cantonese is "Sẹc", and the native Mân Việt word in Teochew is "Siaiê".
枑: "Hằng" or "Hoàng".
泽予: "Trạch-Dư" or "Trạch-Dử" is "Trử" or "Tử",
飠: Thực; in Cantonese it is sực, in Beijing Sữa, pronounced like "Xưa" (of old).
甚 : Thẩm or Thậm is Sẩm, sửm, sơm in Cantonese, and "Sum" in Beijing, pronounced like "Sớm" (early).
州: Châu; in the Mân Việt (Teochew) pronunciation it is read "Chiêu" or "Chiệu", like "Chiều" (evening).
焉: (zen) Now transcribed "Yan"; in the Beijing pronunciation it sounds like "em".
乎秦: polysyllabic "Hô-tần" is monosyllabic "Hận" (to grieve, resent).
乎昭: polysyllabic "Hô-chiêu" is monosyllabic "Hiểu" (to understand).
澶: "Thẳn" or "Đặng" or "được" (to be able to). Looking it up in a dictionary and rendering it as "Thìn" or "chiền" is wrong! On the left is the "Thủy" (water) radical and on the right the character "Đàn"; it is read "Thẳn" or "đặng" and means "water pouring... straight, through, able". Cantonese: "Thànn"; Teochew: "thànn" or "thạnn".
胥胥: "tư tư" is Tương Tư (lovesickness).
秦踰: Tần Du transcribes "tình duyên" or "tình yêu" (love); 秦 is Tsình in today's Teochew; 踰 is du, Duyè (Cantonese), Dua (Teochew).
渗: "Sâm" is Sâu (deep); in Cantonese today "sâu" is still "Sâm".
惿随: polysyllabic "Đề-Tuỳ" is monosyllabic "đùy", that is, "đầy" (full).
河- Hà: Hò 湖- Hồ: Hớ
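Several of the glosses above (Biện-thảo for "bảo", Hô-chiêu for "Hiểu", Đề-Tuỳ for "đùy") appear to follow one recipe: keep the initial consonant of the first syllable and the rime of the second. The sketch below is one reading of that recipe, not something the author states; the function names and the onset list are assumptions made for illustration. The output only approximates the glossed word, since the author then adjusts vowel and tone by ear (hiêu for "Hiểu", hần for "Hận").

```python
import unicodedata

# Vietnamese spelling onsets, multi-letter ones first so they match before
# their single-letter prefixes (assumed list, sufficient for this example).
ONSETS = ("ngh", "ng", "nh", "ch", "tr", "th", "ph", "kh", "gh", "gi", "qu",
          "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "r", "s",
          "t", "v", "x")


def split_syllable(syllable: str) -> tuple[str, str]:
    """Split a written syllable into (onset, rime)."""
    s = unicodedata.normalize("NFC", syllable.lower())
    for onset in ONSETS:
        if s.startswith(onset):
            return onset, s[len(onset):]
    return "", s


def fuse(first: str, second: str) -> str:
    """Onset of the first syllable plus rime of the second."""
    onset, _ = split_syllable(first)
    _, rime = split_syllable(second)
    return onset + rime


for pair in (("biện", "thảo"), ("hô", "tần"), ("hô", "chiêu"), ("đề", "tuỳ")):
    print("-".join(pair), "->", fuse(*pair))
```

Read this way, the procedure resembles the traditional Chinese practice of spelling one syllable with two characters, though the article itself does not draw that comparison.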

Thus the Vietnamese meaning of the song is as follows:

Năm nầy bảo năm xưa
Thương Hoàng tử thương chiều chiều xưa
Sớm chiều em hận tương tư
Mà ai hiểu đặng tình yêu sâu đầy.
....Hò Hớ.

According to my research, the Việt nhân ca is Vietnamese lục bát verse, consistent with the hò chants of Vietnamese folk song. Rendered in present-day lục bát form, the song would be:

Hò... ... hớ...
Năm nầy bảo với năm xưa
Thương chàng hoàng tử thương chiều chiều xưa
Sớm chiều em hận tương tư
Mà ai hiểu đặng tình yêu sâu đầy.

Researching and decoding the secret of the Việt nhân ca was very easy for me, because I know that the characters the Chinese use today were originally Việt writing. When studying ancient history I usually read texts in several different varieties: Beijing, Cantonese, Teochew and Sino-Vietnamese. So one might say that I only had to look at the Việt nhân ca to see the Vietnamese poem at once! I am delighted by the detail that 2,800 years ago Vietnamese already used "biện-thảo" for "bảo", "nầy" ... kia, "nầy" ... xưa, "thương chiều chiều xưa", "em hận tương tư", and so on. One thing I did not know, however, was what "Hò...hớ" means, and I had never even thought of looking into it! Yet the original Việt nhân ca astonished me and made me realise that "Hò...hớ" is the folk singing of the Việt people in their attachment to rivers and lakes, to boats and sampans: Hò...Hớ means "Hà 河" ... "Hồ 湖".

Source: http://diendan.lyhocdongphuong.org.vn

x X x

APPENDIX L

Only a few lines of the whole article bear on the case of Vietnamese "cá", or "fish", in Min dialects.
Look for it yourself -- it's fun to learn something!

Ketchup's Chinese origins a sticky subject for US foodies

Updated: 2013-03-22 11:24
By Michael Barris in New York (China Daily)
Source: http://usa.chinadaily.com.cn/epaper/2013-03/22/content_16335504.htm

As a language expert, Alan Yu is used to all kinds of influences showing up in English words.

But even the University of Chicago linguistics professor is surprised at the Chinese origins of the word "ketchup".

"This is what academics, having dinner together, talk about as one of the more interesting bits of the English language," Yu says of the far-flung roots of many English words.

While German, French and Latin generally are said to have made the biggest impact on the English that Westerners speak, read and mangle, Chinese also appears as an influence in words such as kumquat, gung ho, and kow tow. But for millions of Americans used to dumping the beloved condiment on their French fries, scrambled eggs and hamburgers, none of those connections may be as startling as the Chinese link to ketchup.

In fact, HJ Heinz Co, the Pittsburgh-based maker of one of the world's best selling ketchup brands, confirmed in a statement to China Daily that ketchup "originated from a Chinese sauce pronounced catsup".

In a nutshell, here's the deal on ketchup, at least according to Dan Jurafsky, a Stanford University professor who has written a blog called "The Language of Food". Jurafsky's blog cites evidence that ketchup has roots in eastern China's Fujian province as a fish sauce. "This fish sauce in the Southern Min dialect in the 18th century was called something like 'ke-tchup', 'ge-tchup', or 'kue-chiap', depending on the dialect," Jurafsky writes.

"Those of you who speak Southern Min or Cantonese dialects will recognize the last syllable of the [American pronunciation of the word], chiap or tchup, as the word for 'sauce', - pronounced zhi in Mandarin," Jurafsky writes.

A 1982 Mandarin-to-Southern-Min dictionary, he says, confirms that the first syllable of the written Chinese name for ketchup is an archaic word pronounced gu in spoken Southern Min, and meaning a preserved fish. So ketchup is an "archaic word for fish sauce" in the Hokkien dialect of Southern Min Chinese, Jurafsky concludes.

Furthermore, Jurafsky says, early English recipes show that the original ketchup was indeed fish sauce, the stinky cooking sauce called nuoc mam in Vietnam, nam pla in Thailand, patis in the Philippines, all made from salting and fermenting anchovies.

Jurafsky's blog also sheds light on a long-time mystery - why ketchup sometimes is spelled "catsup". Since Hokkien isn't written with the Roman alphabet, the same archaic, Western process that transcribed fish sauce as ke-tchup, also delivered the world catsup, and even katchup.

In the statement provided to China Daily by Heinz, the company says its founder, Henry Heinz, "chose ketchup as the spelling" for his product to "differentiate" it from rivals' "catsup".

Linguist Yu agrees that the theories surrounding ketchup's origin are noteworthy. "It was a surprise to me," he says. "First, the tomato is not originally from Asia, so that is strange. What is even more strange is that it is somehow related to fish sauce."

In its original incarnation, ketchup was made with something other than tomatoes, Jurafsky notes. Tomatoes - a food not traditionally associated with Asian cuisine - were added to the recipe around 1800. From 1750 to 1850 its chief ingredient was fermented walnuts or sometimes fermented mushrooms, he writes. But as the blog points out, Samuel Johnson's seminal 1755 dictionary stated that English mushroom ketchups were "just an attempt to imitate the taste of an earlier original sauce that came from Asia."

Despite their abiding love for ketchup, Americans don't have a monopoly on how the popular sauce is used. Jurafsky's blog notes that people in China like to use it on fried chicken, and in Sweden it's a frequent pasta garnish. In Thailand, teens dip potato chips in ketchup, while in Eastern Europe it is a favorite pizza topping.

Taking her cue from Jurafsky's blog, Anzia Mayer, a writer for the China-focused blog Tea Leaf Nation, has dipped into scholarly literature to highlight other common English words with Chinese links. "I was shocked to find some common words that sounded really English to me," recalls Mayer, who is a senior at Amherst College in Massachusetts. Besides ketchup, her list includes kumquat, typhoon and gung ho.

At 22, Mayer has already made four trips to China and is "more or less" fluent in Mandarin. She hopes to become a Chinese teacher or journalist after graduation. Of her writing about the surprising influence of Chinese on common English words and expressions, she says: "I'm excited by the feeling of being able to make something foreign familiar."

michaelbarris@chinadailyusa.com

(China Daily 03/22/2013 page11)

APPENDIX O

Vietnamese Polysyllabism

October 2, 2012 @ 2:48 pm · Filed by Victor Mair under Announcements, Writing systems

There is a movement called Vietnamese2020 that aims to substantially reform the writing system by the year 2020. The main change would be to group syllables into words. As the advocates of this change point out, most words in Vietnamese are disyllabic (the same is true of Mandarin). The proponents of the reform believe that, among others, it would reap the following benefits:

1. achieve greater compatibility with the needs of information processing systems

2. comport better with the findings of cognitive science

3. put the kibosh on the false notion of monosyllabism, which they say is unnatural and does not exist in real languages
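To make the proposed change concrete, here is a minimal sketch of how syllables could be grouped into words by greedy longest match against a word list. The tiny lexicon and the function name are hypothetical, chosen only for illustration; the Vietnamese2020 proposal does not specify an algorithm, and a real system would need a full dictionary and a way to resolve ambiguous segmentations.

```python
# Hypothetical mini-lexicon of multi-syllable words, stored as syllable tuples.
LEXICON = {
    ("phương", "pháp"),
    ("cà", "phê"),
    ("bâng", "khuâng"),
    ("hoả", "diệm", "sơn"),
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def join_words(text: str) -> str:
    """Join space-separated syllables into words, longest match first."""
    syllables = text.split()
    out = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first, down to two syllables.
        for n in range(min(MAX_WORD_LEN, len(syllables) - i), 1, -1):
            candidate = tuple(s.lower() for s in syllables[i:i + n])
            if candidate in LEXICON:
                out.append("".join(syllables[i:i + n]))
                i += n
                break
        else:
            # No multi-syllable match: keep the syllable as a word of its own.
            out.append(syllables[i])
            i += 1
    return " ".join(out)


print(join_words("phương pháp uống cà phê"))
```

The disagreement over which sequences count as words, raised later in this post, shows up here as the question of what belongs in the lexicon.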

I myself had these additional thoughts:

1. Would the adoption of polysyllabism (i.e., linking of syllables into words) in Vietnamese obviate the need for so many diacritics (i.e., reduce homonymy)? Without knowing the precise details of Vietnamese romanization, the plethora of diacritical marks has always led me to suspect that the script may be fraught with redundancy and overspecification, especially if the basic unit of grammar were taken to be the word rather than the syllable. The fact that many Vietnamese in their casual writing omit the diacriticals and are still able to make themselves understood (see below) underscores this possibility.

2. Would the adoption of polysyllabism make indexing, dictionary compilation, etc. easier and more user-friendly? This has certainly been the case with Romanized Chinese and Japanese (e.g., in dictionaries and encyclopedias arranged according to alphabetical order by words), and I suspect that the same would be true of Korean as well.

I ran these proposals and ideas by a number of Western specialists in Vietnamese language and culture. Their reactions were, to put it mildly, unenthusiastic.

Bill Hannas notes that this sort of proposal has been around for a few decades at least, and that the following line in the proposal does not offer much hope for adoption: "In practice, while awaiting official orthography guidelines, hopefully, from a governmental body such as a national language academy, …"

Eric Henry states:

This is the first time I ever encountered this proposal. The article doesn't make it clear whether this idea has any government backing or not. To me the idea of pretending that Vietnamese compound expressions are unitary words in the same sense that "asparagus" or "daffodil" are words seems silly and artificial. The Vietnamese used to use hyphens to accomplish the same purpose; thus fangfa 方法 ("method") was "phương-pháp," and so on. Then people discovered that they could get along fine without hyphens, and that the absence of hyphens gave the page a pleasantly uncluttered look. Conjoining syllables in the manner proposed seems to me a way of reverting to hyphens [VHM: without the hyphens]. But then it's natural to be attached to whatever one is habituated to—and I happen to be habituated to un-conjoined syllables.

To which I replied, "ex cept in Eng lish".

Eric continued:

I don't see how polysyllabism could reduce the need for diacritics. Vietnamese people of course write to each other all the time with no diacritics and can still figure out 98% of the text, but everyone knows and feels that this is just a makeshift. It would perhaps be nice to eliminate the need for the circumflex and the half moon by inventing a few special vowel signs—but I don't see how the tone marks themselves could be represented in spelling (cf., for comparison, luomazi [National Romanization for Mandarin]: han, harn, haan, hann)—that would just be a nuisance, especially since Vietnamese has, not four, but six tones. Vietnamese orthography has already (i.e., centuries ago) made a move in the direction of new vowel symbols with the letters "ư" and "ơ."

Maybe a Vietnamese equivalent of DeFrancis's ABC Chinese dictionary could be created. It might be wonderfully useful for some purposes, as the ABC dictionary is wonderfully useful for some purposes. But I haven't really thought this through.

Another correspondent replied:

This has nothing to do with the government. It looks to me like it's the work of some overseas Vietnamese linguistics grad student or (former grad student) who has now gone slightly crazy because of the "East Sea/South China Sea/Really Far South Mongolian Sea. . ." issue.

The author has several pages. Another one (hocthuat.org) has a long study that argues for the linguistic connections between Vietnamese and Chinese, but it now has the following disclaimer:

STATEMENT OF RENUNCIATION OF THE SINITIC CAMP

Here comes a painful decision. I would like to renounce my long standing belief in what I have elaborated in this electronic publication about Sinitic Vietnamese. That is to say, I no longer believe in what I used to see as vestiges of sinitic linguistic elements in Vietnamese vocabulary stock that are postulated in my research paper. The reason for my taking this course of action is, admittedly, politically motivated because I do not want my work later to serve for unforeseen evil purposes, especially in the face of Chinazi's overt actions trying to impose its hegemonism onto today's Vietnam. My blood is boiling with revulsion and hatred after seeing a series of unrolling events currently taking place in the East Vietnam Sea. Civilized people mostly see that those behaviors could only be committed by warmongers, descendants of those same savages as vividly and accurately described in "The Ugly Chinaman" 醜陋的中國人 by Bo Yang 柏楊. Don't take me wrong, though both matters not related, given the fact that my blood is genetically embedded with Chinese DNA.

For Heaven's sake, please forgive me for all what I have been laboring on hitherto. I would appreciate your understanding and ask that you take this unstate [sic] moment of truthfulness as a statement of my renunciation of the sinitic camp and I shall accept all consequences thereof. My apology to my fellow scholars, too, and yet, if you still need to read my writings for some reason, focus instead on the antithesis of what is discussed herein, that is, "de-sinitize" them by taking the opposite view. You may still quote any material in this paper but remember to annotate your citation with this statement accordingly. You could post your comments and questions on Ziendan TiengViet.

It so happens that another language movement in Vietnam going on right now is called English2020; it aims to make all school leavers proficient in English by that year.

Steve O'Harrow comments:

There is an "English 2020" project being spearheaded by Professor Nguyen Ngoc Nhung on behalf of the SRVN Ministry of Education & Training that aims to make English language instruction available in a broad range of fields at the secondary and tertiary levels [by 2020]. It is the only domestic national-level language-related initiative I know of at this time in Viet Nam. One might be forgiven for suspecting that the proposers of the Vietnamese2020 movement stole the name "2020" from the Ministry of Education & Training English initiative.

The article you link here looks rather "iffy," to say the least. In reality, it is probably a scheme put on line by some Viet Kieu ["overseas Vietnamese"] someplace outside of the country itself. In my opinion, after my 50 years of Vietnamese language teaching and research in Viet Nam, Europe and America, there is a zero chance of this spelling movement taking hold. Why? Because the current system works well. It is known and used by nearly 90 million people.

The Vietnamese populace is already one of the most literate in Southeast Asia and it has been literate for a very long time. They are not likely to change what works well.

"If it ain't broke, don't fix it." And believe me, they won't.

What is endlessly interesting to this observer over the years is that for a long time now, the handful of folks who identify themselves as Vietnamese but who live overseas, are of the impression that what they cook up in the cafés of Paris or the campuses of the USA is going to have some magic impact on the millions and millions of Vietnamese who are actually living their day-to-day lives in Viet Nam itself. There are all kinds of looney ex-pats out there and each one has a fantastic plot to do something, reform the language, overthrow the government, invent a perpetual motion machine that serves pho on the side. They're constantly going around appointing each other prime minister of governments in exile or re-claiming the Nguyen Dynasty throne. Mind you, founding a new goofy religion actually works sometimes – as long as you are really in Viet Nam, that is.

But if you are abroad, "fuhged-daboudit," [especially if you live in Brooklyn].

Responding to my technical questions about the possible value of a polysyllabic approach to Vietnamese writing, Steve remarked:

Short answer: NO. Longer answer: I really do not know enough about the technology of information processing, etc. to be 100% sure and I do know that many Vietnamese disagree on which words are polysyllabic & which are not [Chinese loans are easier to judge, but Mon-Khmer vocabulary is another question and mixed lexemes are even fuzzier]. The main obstacle to information processing at this point in time seems to be the fact that we do not have decent optical character recognition programs, due to a lack of typographic consistency and the fact that Vietnamese printing in the past has been all over the map. However, none of the "fixes" will eliminate the need for the diacritics and there is a lot of misunderstanding among those folks who do not actually read/speak Vietnamese which marks are diacritical [only the five tone marks] and which are integral parts of letters [hooks, bars, and circumflexes]. A Vietnamese native speaker does not see, say, the letters "o" and "ô" or "e" and "ê" as being "o with / without a circumflex" or "e with / without a circumflex" – rather s/he conceives of them simply as completely distinct letters, as different as we would think of "e" and "o" in English. The folks whom this system confuses are mainly foreigners, so who gives a damn?

A 2nd point would be that there is a lot of disagreement on what constitutes a "word" in Vietnamese. Is "Không quân" [Airforce] one or two words? I really don't think we are going to come to any substantial agreement in the foreseeable future and I really don't think it matters a whole helluva lot, at least not to the Vietnamese reading public.

Again, the main point is that the current Vietnamese writing system works well for Vietnamese people in Viet Nam itself, so any substantial changes would likely be counter-productive. Just remember the old US saying: IF IT AIN'T BROKE, DON'T FIX IT! – it is just as true in VN as it is in the US. Tinkers be damned.
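
Steve's distinction between the five tone marks and the marks that are integral parts of letters can be made concrete with Unicode decomposition. The following is only a minimal sketch: the function names `strip_tones` and `strip_all_marks` are hypothetical, and reading his "hooks" as the horn on ơ/ư is an assumption (Unicode's own "hook above" is the hỏi tone mark).

```python
import unicodedata

# The five tone marks as Unicode combining characters:
# grave (huyền), acute (sắc), tilde (ngã), hook above (hỏi), dot below (nặng).
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def strip_tones(text: str) -> str:
    """Drop tone marks only; circumflex, breve, horn and đ are left alone."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def strip_all_marks(text: str) -> str:
    """Drop every combining mark and map đ/Đ to d/D (plain ASCII letters)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return kept.replace("đ", "d").replace("Đ", "D")


print(strip_tones("Tiếng Việt"))      # Tiêng Viêt
print(strip_all_marks("Tiếng Việt"))  # Tieng Viet
```

The first function leaves ê, ô, ơ, ư, ă, â and đ intact, which matches the view that these are distinct letters rather than decorated ones; the second shows what is lost when everything above and below the base letter is thrown away.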

Finally, just before I was about to make this post, I received these brilliant remarks from a Vietnamese specialist who wishes to remain anonymous:

If Vietnamese were written as words, and not as syllables, there would be less need for diacritics (tones and "special"–in the sense that they lack Western alphabet equivalents–letters) because an equivalent amount of information (cues) is provided by the word division.

By adding information up front of one sort, you get by with less information of another sort. Word division in orthography means that society and its individuals have invested resources in an upgraded system that rewards users with greater clarity for less effort. You put the effort in at the beginning–deciding the rules and learning them.

We don't specify every phonological detail in English writing because we don't need them to get to meaning. The reader, if s/he cares about it, can supply those details later, after accessing the word-meaning. Often an unambiguous pronunciation is possible only after the word has been retrieved from one's mental lexicon. It surely does not derive from the successive letter-sounds. By the same logic, written Vietnamese words would be overspecified if they included all the diacritics in use at present.

Because indicating tone in computerized writing is such a bother, Vietnamese usually just leave them out of their informal correspondence, such as emails. The messages can still be understood, albeit with some difficulty. Word division would restore the missing redundancy.

Information technology, and indexing in particular, depend on having "tokenized" units, usually at the word level. Most of the tokenizing work is done already in languages with word division. For CJV (not K), however, a tokenizing function is needed.

It all comes down to the same rule: you can pay the cost once up front (create and learn rules for word division) or in perpetual installments.
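
The trade-off described above, where information supplied by word division stands in for information supplied by diacritics, can be illustrated with a toy count of collisions. This is only a sketch under loud assumptions: the two tiny wordlists are hand-picked for illustration, the helper names `toneless` and `collisions` are hypothetical, all marks are stripped (as informal email often does), and the numbers say nothing about the real rate of ambiguity in Vietnamese.

```python
import unicodedata
from collections import defaultdict


def toneless(text: str) -> str:
    """Lower-case the text and strip every diacritic, as informal email often does."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.replace("đ", "d")


def collisions(entries):
    """Group entries by their stripped form; large groups mean more ambiguity."""
    groups = defaultdict(list)
    for entry in entries:
        groups[toneless(entry)].append(entry)
    return dict(groups)


# Toy data (an assumption, not a real lexicon): six syllables that differ
# only in tone, and three two-syllable words written solid.
SYLLABLES = ["ma", "má", "mà", "mả", "mã", "mạ"]
WORDS = ["maquỷ", "mãsố", "mồmả"]

syllable_groups = collisions(SYLLABLES)
word_groups = collisions(WORDS)

print(len(syllable_groups["ma"]))                 # 6
print(max(len(v) for v in word_groups.values()))  # 1
```

In this toy setting a stripped syllable is six ways ambiguous, while each stripped word still points to a single entry. A real lexicon would certainly contain word-level collisions too; the sketch only shows the mechanism by which a larger unit can restore redundancy.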

It is remarkable that, although Chinese, Japanese, Korean, and Vietnamese have four different writing systems, they all are vexed with the problem of whether or not to join syllables into words. That, I believe, is the result of the latter three still retaining vestigial traces or influences of the Chinese characters. But even character writing could adopt word spacing if enough of its users would agree to follow such a norm.

[A tip of the hat to Jonathan Smith and thanks to Liam Kelley and Michele Thompson]

October 2, 2012 @ 2:48 pm · Filed by Victor Mair under Announcements, Writing systems

45 Comments »

J.W. Brewer said,

October 2, 2012 @ 3:20 pm

For Chinese and Japanese, you may be characterizing the issue backwards – what is going on is not so much breaks between syllables rather than between words as no information-conveying breaks at all except at the end of sentences, and thus no visible distinction between single-character (or single-syllable) words and multiple-character (or multiple-syllable) words, although in Japanese many individual kanji of course have polysyllabic readings. That lack of information-conveying breaks was once common practice for texts written in our alphabet, but was abandoned in favor of inserting blank spaces at word-breaks in the latter part of the first millennium A.D. (http://en.wikipedia.org/wiki/Scriptio_continua). The Vietnamese situation may be different altogether.
Sili said,

October 2, 2012 @ 4:21 pm

Really Far South Mongolian Sea

This should probably not amuse me as much as it does.

I award the author a swimming holiday to Austria.
JS said,

October 2, 2012 @ 4:31 pm

^
Chinese writing certainly provides "breaks between syllables" in the sense that the salient written units, characters, map (almost) without exception to single syllables of speech; the addition of physical "blank space" as that called upon to separate English words would, of course, be redundant.

However, Korean orthographical standards do call for word separation, meaning that in the case of (standard) Korean writing, both the syllable and the word are strongly marked in written text — though as one might expect, there are in the case of the word many cases in which decisions regarding division are variable and arbitrary.
Peter said,

October 2, 2012 @ 5:13 pm

^ I agree that characters neatly (and nearly without exception) subdivide an expression into syllables. Having blank spaces between the _words_ though–that would not be redundant. It would be kind of helpful (but no one will ever do it).
Victor Mair said,

October 2, 2012 @ 5:28 pm

@Peter

"that would not be redundant" — clear thinking on your part

"but no one will ever do it" — actually, a lot of people have done it (e.g., Chow Tse-tsung and Apollo Wu). Who knows? Someday it might just catch on. That would be a boon for IT specialists, dictionary makers, indexers, grammarians, and sundry others.
Peter said,

October 2, 2012 @ 5:52 pm

@Victor

That would be convenient. Considering that most Chinese (or, I suppose, Americans) can't tell the difference between a morpheme and a word, I'm not holding out a great deal of hope.
Ellen K. said,

October 2, 2012 @ 5:57 pm

In English we have cases where whether something is a word or two is somewhat arbitrary, and even cases where we don't agree on if it's one word or two. This doesn't seem to get in the way of our use of the written language. Curious that none of the writers, all writing in English, mention that we have this in English. My question for them would be, is this any different from English, other than that in English we've had time to standardize many of the cases that can go either way?
Victor Mair said,

October 2, 2012 @ 6:14 pm

@Peter

Most Americans (and other speakers of English) know what a word is (i.e., know where to put spaces between words) — in 99+% of the cases. Otherwise we wouldn't be able to hold these conversations on Language Log. And you can be sure that commenters would jump down the throats of us bloggersifweforgottoputinthosespaces.

As for what a morpheme is, that's specialized knowledge that can be left to linguists and others who delight in the study of languages.
tram said,

October 2, 2012 @ 7:39 pm

Funny example. Is "Airforce" one or two words?
Ruben Polo-Sherk said,

October 2, 2012 @ 7:56 pm

I think in understanding this issue it's important to realize that, just like with Chinese (when it is divided), the compounds aren't really being divided into syllables–they're being divided into morphemes, and that they simultaneously get divided into syllables is just a coincidence.
JS said,

October 2, 2012 @ 8:47 pm

^ Hmmm… from a synchronic point of view it might be possible to claim that in Chinese and Vietnamese writing, compounds are being divided into syllables and that it is the correspondence of those syllables to morphemes which is only a coincidence… after all, in these two cases, the salient written unit's relationship to the syllable is (all but) invariant, while its relationship to the morpheme is much confounded by the significant and increasing number of morphemes that are longer than one syllable.

However, historically speaking, your view is reasonable as the preference for disyllabic "compound" words in both languages (which seems to have followed on processes of reduction of longer and otherwise more phonologically complex words to CV[C]?) means that the relationship between originally logographic Chinese characters and modern-day morphemes is indeed in some sense original and essential…
Brad said,

October 2, 2012 @ 9:15 pm

I think one of the non-English rebuttals should be:
So everyone needs to deal with the made up hassles of distinguishing between compound words, hyphenated compounds, and multi-word compounds?

It's a distinction that the writing system makes, yet the organization system for the dictionaries resolutely ignores it. Does the meaning of 'air' change dramatically when followed by 'man'? If it does, you put 'airman' in the dictionary whether it's 'airman', 'air-man', or 'air man'.
:-/

Every Japanese book that I have that has spaces between the Japanese words is either a kids' book or a Japanese as a foreign language text. The kids' books have spaces between the words because uninterrupted strings of hiragana or katakana can be difficult to parse quickly.

And the native Japanese dictionaries intended for children that I have get along just fine using Japanese kana ordering for the dictionary, so "alphabetizing" only benefits the people that have memorized the arbitrary order of 26 letters instead of memorizing the arbitrary order of 52-some kana.

In the written form used by adults, spaces would be redundant because the information is conveyed through other indications:
- grammatical particles indicating the end of the word
- kanji interrupting the hiragana streams
- punctuation
and once someone gets into things like verb conjugation and so on, distinguishing between the various components really becomes quite arbitrary.

All of the electronic dictionary work that I've done has involved looking up words using longest substring style lookup. So if X and Y are words, but someone also decided that XY is a word, you don't have to care. So if the electronic translation people need to build better word tables, that's not a very compelling argument to change tradition.
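
A minimal sketch of the longest-substring style of lookup mentioned above, applied to syllable-spaced Vietnamese. The function name `segment`, the `max_len` cap and the three-entry lexicon are all hypothetical, and greedy forward matching is only one simple way to do this; it can mis-segment real text.

```python
def segment(syllables, lexicon, max_len=4):
    """Greedy forward longest match over a list of syllables."""
    words = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first and shrink until the lexicon has it;
        # a single syllable is always accepted as a fallback.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


# Toy lexicon (an assumption): multi-syllable entries keyed by spaced form.
LEXICON = {"không quân", "việt nam", "tiếng việt"}

print(segment("không quân việt nam".split(), LEXICON))
# ['không quân', 'việt nam']
```

Because the longest entry wins, it does not matter to the lookup whether "không" and "quân" are also listed on their own, which is the point being made: the word table, not the spacing in the text, carries the decision.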

In other words, God save us from yet another spelling reform, especially if it's for someone else's language.
Ran Ari-Gur said,

October 2, 2012 @ 9:29 pm

@Ruben Polo-Sherk: I don't know Vietnamese, so please correct me if I'm being clueless, but — I don't think that's completely true. For example, the Vietnamese Wikipedia gives "London" as "Luân Đôn" — not, I submit, because it's composed of the morphemes "Luân" and "Đôn". (However, it also gives "Paris" as "Paris", and "Wikipedia" as "Wikipedia"; so there's definitely a tendency to write borrowed morphemes solid even when they're polysyllabic, but it competes with a tendency to write spaces between syllables even within polysyllabic morphemes.)
michael farris said,

October 3, 2012 @ 1:54 am

Some initial random musings.

There's a fair amount of variation in how borrowed morphemes (which have undergone Vietnamization) are written. If you take 'salad' I've seen all three:

xa lát

xa-lát

xalát

with the first being the most common.

Words that don't undergo Vietnamization (like Paris) remain written as one word.

Word division seems a thornier issue in Vietnamese than any other language I've examined. When I was actively learning Vietnamese there were times I could understand a sentence just fine but couldn't have hoped to divide it into words (or could think of a number of ways of doing so). Learners of Thai and Khmer I've talked to report very similar experiences while learners of Mandarin mostly don't. It might be a SEAsia thing…

Yes, Vietnamese speakers can get by in some contexts without diacritics (I used to receive emails from one which I could understand pretty well) but this is partly due to diacritics being used most of the time – you can sort of 'see' the diacritics when they're not there. I'm also assuming there are some deliberate vocabulary and syntactic choices being made to facilitate understanding. But diacritic-free Vietnamese (minus other massive changes) seems like a non-starter.

IME unlike most writers of languages with diacritics, when a diacritic appears over a lower case i in Vietnamese speakers tend to write the dot and diacritic both (when writing by hand, in print the diacritic replaces the dot). I'm not sure what, if anything, this means, but it's sort of distinctive.

You really should do a post on those Viet Kieu who want a return of Chu Nom (character based script). They make the word division (or other orthographic reform) plans seem completely feasible (nb I'm not talking about scholars who are interested in Chu Nom from an academic point of view who do very valuable work but those with half-baked plans for compulsory education and the like)
Ruben Polo-Sherk said,

October 3, 2012 @ 3:14 am

JS, Ran Ari-Gur: Good point bringing up the polysyllabic morphemes.

First of all, the polysyllabic morphemes in Chinese or Vietnamese are basically anomalies in one relevant sense: they cannot combine with other morphemes to form compounds in the way monosyllabic morphemes generally can. There are also very few of them.

So it is not unreasonable to ignore them when figuring out how to transcribe Vietnamese, which has a large substructure of monosyllabic morphemes, and, because of the importance of these monosyllabic morphemes, to decide to simplify and standardize by writing each syllable separately, which is what they did with quoc ngu. And so that is how you get Lon Don. With foreign words, though, as michael farris said, it's not entirely standardized. The other exception is in cases like the current featured article on Vietnamese Wikipedia, which has "dreadnought" in it, which is clearly not written in quoc ngu–it's written in English–and so isn't subject to the syllable-dividing rule.

With pinyin, it's essentially the same. The substructure of Chinese consists almost entirely of monosyllabic morphemes and so, if someone decides to write with spaces to separate those morphemes, they may, for the sake of consistency, separate syllables of polysyllabic morphemes as well. But the motivation cannot be to distinguish syllables–that doesn't really make any sense, I think. If you argue that this is done to mimic the boundaries between Chinese characters, you get back to the point of morphemic structure, since a major function of Chinese characters is to support this kind of structure. It is possible, of course, to write a language like English, with no such structure, in Chinese characters, but the system of two-character compounds would not fit in general (and therefore there would really be no reason to not write each character separately if you transition from that into an alphabetic script). This is essentially an innate feature of the language, and not the writing system.

So, to put it simply, when disyllabic morphemes are split, this is done basically to be consistent in a system that, in order to accommodate a substructure of monosyllabic morphemes, has been standardized (by convention or personal choice) to have spaces between syllables. The chief concern is the division between morphemes.

(In case anyone doesn't understand what I mean by "substructure of monosyllabic morphemes", I'll explain it this way: Vietnamese and Chinese have it, and English doesn't. It's the thing that makes the issue of word division a real pain in the ass in Vietnamese and Chinese, and not a problem at all in English. With Vietnamese and Chinese, because of the importance of the organization at the morpheme level, the concept of "word" doesn't fit well.)

(Not really part of my argument, but maybe something to think about: We write "New York" with a space, but, though originally it was two morphemes, it is now really just one. So we do sort of have this in English, too.)

(Ran Ari-Gur: I am not the all-knowing god of Vietnamese.)
richard howland-bolton said,

October 3, 2012 @ 6:02 am

"ex cept in Eng lish"?
"ex cept in Engl ish" surely :-)
Victor Mair said,

October 3, 2012 @ 6:31 am

@richard howland-bolton

surely not
Ruben Polo-Sherk said,

October 3, 2012 @ 6:44 am

Shouldn't it be Eng glish?
Gene Buckley said,

October 3, 2012 @ 7:24 am

Linguistically, compounds like air force are single words composed of other words: this is the beauty of hierarchical structure. Orthographies make different choices about how to handle that layered structure in writing. English is inconsistent, sometimes using a space, hyphen, or no division at all, often related to how familiar or "lexicalized" the compound is: water tower vs. waterfall.

Spelling practice varies over time and space; hyphens used to be more common, and still are relatively more common in British than in American orthography. German, where these compounds have the same linguistic structure as in English, has a more consistent orthography, regularly writing compounds as one word (Wasserturm, Wasserfall) regardless of length; see this dramatic Afrikaans example, since it (like Dutch) follows the same practice.

In Chinese, and therefore in Sino-Vietnamese, the compounds mainly at issue are closer to English per-mit, con-fer, and tele-phone. Because the meaning of the whole is often not very predictable from the meaning of the components, speakers shouldn't have much trouble learning to treat most such items as single written words, although there would no doubt be a role for (somewhat arbitrary) standardization. I think Victor's point is that to make no reference at all to word structure (whether by using spaces nowhere or everywhere) is to leave the reader completely on his or her own, when an orthography could give some significant information through the judicious use of spaces.

It's another question whether further compounding should be written as a single word. Victor, as I take it, is mainly talking about the equivalent of per mit, although there will also be words like build ing that are semantically more transparent. Today Vietnamese writes the equivalent of build ing per mit. A writing reform that ended with building permit might be superior to buildingpermit, since the spaces show the relative grouping of (pairs of) morphemes where they do the most good, while still identifying the internal constituency of larger compounds. If Vietnamese and German represent the extremes, English orthography might for once actually be rather sensible, if only it were more consistent.
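
The three orthographic options just described can be sketched as three renderings of one nested structure. This is only an illustration under stated assumptions: the tuple layout and the function name `render` are hypothetical, and the English stand-in "building permit" is taken from the comment itself.

```python
# A compound of two words, each made of two morphemes, following the
# "build ing per mit" example above.
compound = (("build", "ing"), ("per", "mit"))


def render(compound, word_sep, morpheme_sep):
    """Join morphemes into words, then words into the full compound."""
    return word_sep.join(morpheme_sep.join(word) for word in compound)


print(render(compound, " ", " "))  # build ing per mit  (space at every morpheme)
print(render(compound, " ", ""))   # building permit    (space between words only)
print(render(compound, "", ""))    # buildingpermit     (no internal spaces)
```

Only the middle rendering lets a reader recover the grouping of morphemes into pairs from the spacing alone; the first and last collapse both levels of structure into one.
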
Victor Mair said,

October 3, 2012 @ 7:28 am

That's why it's "English".
Matt Anderson said,

October 3, 2012 @ 7:55 am

Ruben Polo-Sherk,

Maybe I don't understand your point exactly, but, in Mandarin, polysyllabic words can certainly combine with other morphemes to form longer words. For example, húdié 蝴蝶 'butterfly' can combine with gǔ 骨 'bone' to form the word húdiégǔ 蝴蝶骨 'sphenoid'; xìbāo 細胞 'cell' can combine with zhì 質 'substance' to form xìbāozhì 細胞質 'cytoplasm'; and lǚyóu 旅遊 'tourism' can combine with qū 區 'district' to form lǚyóuqū 旅遊區 'tourist area'. &, while the individual syllables of xìbāo and lǚyóu can themselves be said to be morphemes, húdié is itself a single morpheme.
Ruben Polo-Sherk said,

October 3, 2012 @ 8:30 am

Certainly polysyllabic words can combine with other morphemes. My point about the restriction on polysyllabic *morphemes* doing so was with regard to *how* they do it. The only way they can is basically through the same mechanism that we use to get "tennis racket" and "toaster oven". 蝴蝶骨 is basically "butterfly bone" in this same sense. There's an important difference between that sort of union and the one in, for example 理解, or 看见.
M (was L) said,

October 3, 2012 @ 9:37 am

Does it make a lot of sense to bust a gut over foreign names and words? Every written language is challenged by this. Every language has to deal with it, and often by special localization rules that differ for each commonly-encountered foreign language. Often, it's a matter of "drop back ten and punt."

It seems to me that whatever Vietnamese decides to do with Vietnamese vocabulary, and with loan-words that have become sufficiently adopted that they are now de facto Vietnamese vocabulary, is one question, but not a decision that ought to be driven by foreign words. Tail wagging the dog, no?
Steve said,

October 3, 2012 @ 11:57 am

POINT ONE: The folks who worry about joining Vietnamese syllables or not joining Vietnamese syllables are in the same league with theologians worrying about how many angels can dance on the head of a pin. 90 million Vietnamese use an orthographic system that works well for them. In the early post-WW2 period, they undertook a massive literacy campaign that worked very well because, for a native speaker of Vietnamese, the writing system is not nearly as difficult to learn as say, the English system is for native speakers of English.
POINT TWO: If one makes the axiomatic statement that "As the advocates of this change point out, most words in Vietnamese are disyllabic (the same is true of Mandarin)," one begs the question of what constitutes a "word." Many commentators appear to be judging whether an utterance in Vietnamese is a word based on whether what is expressed can be called "a (i.e., one) word" in, say, English or French. This is, in my opinion, a highly subjective stance.

In any event, judging the matter as a non-native speaking student of both Vietnamese and Mandarin Chinese for the last 50+ years, it strikes me that the rate of apparent monosyllabicity in Vietnamese is much greater than in Mandarin Chinese – indeed, Vietnamese appears to have the highest rate of monosyllabicity and the lowest rate of phonemic redundancy of any language I have taken a scholarly interest in. For what it's worth…
Steve said,

October 3, 2012 @ 12:17 pm

While this discussion is very interesting for us [and to me especially, since this is basic to what I have been doing every day for the past half century], it is rather meaningless from the point of view of the users of the Vietnamese writing system. It is very unlikely that any writing reforms will be instituted in the foreseeable future. They would cause more chaos than benefit. For example, if you look at Ho Chi Minh's manuscripts and other handwritten materials, you will see that he often liked to write "z" for "d" and "r" and "gi" – these are reflexions of the similar Northern pronunciation of the graphs in question [odd, since he spoke with a Central accent in day-to-day conversation]. Because of Ho's iconic status in much of Viet Nam [but clearly not all of Viet Nam], some true-believers have pushed the idea that the writing system should make the same substitution. However, there are other regions in Viet Nam where there is no "z" sound whatsoever and where "d" and "r" and "gi" do not represent the same sounds anyway. And there is even a very small part of the country where "d" and "r" and "gi" are pronounced as separate contrasting sounds.
What this means is that one immediately begs political questions of national unity when one advocates writing reform of a system that is both universally employed [except in a few private spheres] and widely accepted from the Ca Mau peninsula to the Chinese border.
So I come back to my sainted mother's old Indiana wisdom: "if it ain't broke, don't fix it!"
Ran Ari-Gur said,

October 3, 2012 @ 1:11 pm

@M (was L): I don't think anyone is suggesting otherwise. I fear you might be refuting a straw man . . .
michael farris said,

October 3, 2012 @ 1:40 pm

Apropos of what Steve has written it's important to note that Quoc Ngu is not a transcription of a particular dialect or language variety (which is still arguably the case for Pinyin) but an orthography that has slowly evolved to work for speakers of dialects with rather different phonemic inventories.

Each distinction made in the script reflects a difference made somewhere (except for i and y as full syllables and for all I know somewhere does make that distinction) but nowhere makes all the distinctions (though a few dialects might come pretty close) and which differences are levelled varies from region to region (or village to village).

It is not calculated to look appealing to westerners but it does a remarkably good job of providing a working unified orthography for the language.
M (was L) said,

October 3, 2012 @ 2:59 pm

@Ran Ari-Gur – I was responding to the handwringing about Lon Don. What matters is how you write Hanoi in Vietnamese. How you write London or Paris or East Lansing doesn't really come into it except as a footnote.
JS said,

October 3, 2012 @ 3:28 pm

Ruben Polo-Sherk:
It would indeed be interesting if this were a principled distinction… but is noun compounding really "importantly different" from the sort of example you mention (li3jie3 理解, from two verbs in "parallel," or kan4jian4 看见, from two verbs in "series")? It seems possible that, historically, there simply haven't been enough disyllabic+monomorphemic verbs around to feed such processes… and such as have appeared more recently do get up to a certain amount of funny stuff, esp. of a "reduplicative" nature (e.g., lao1laodaodao 唠唠叨叨, shu3shuluoluo 数数落落, etc.)
Jongseong said,

October 3, 2012 @ 4:13 pm

Korean has been written with spaces between words since at least the 1930s; before that, spacing depended largely on the author, and before that, spaces were not used.

Spacing continues to vex Koreans, but this is largely due to the agglutinative morphology. For example, suffixes are supposed to be written without spaces and dependent nouns are supposed to be spaced, but Korean is full of cases where the same form can behave as a suffix or a dependent noun, as in daero 대로. As a suffix meaning "based on" or "following", you have beop-daero 법대로 ("following the law") with no space; as a dependent noun meaning "as", you have mal-han daero 말한 대로 ("as spoken") with space (I'm using the hyphen to separate morphemes in the romanization). Think of the confusion in English between "a while" and "awhile" or "maybe" and "may be", but much more frequent in the language.

Compound nouns are another source of ambiguity, much as in English (which has the additional option of hyphenation to confuse matters further—"crybaby", "cry-baby", or "cry baby"?). Korean rules allow for optional spacing in many cases, which I guess is pragmatic.

I'm less familiar with North Korean rules, but in general they use spaces quite a bit less than in South Korea. Compound nouns are generally written without spaces, and I think even dependent nouns may be written without spaces, so that the example above would be mal-han-daero 말한대로 in North Korean spelling.

I don't think you could come up with a spacing rule for Korean that is at once simple and can satisfy everyone. However, for all the confusion about correct spacing, you wouldn't find anyone arguing for going back to no spaces between words. Korean is so much more readable with spaces. For what it's worth, Koreans don't have the confusion between syllables and words regarding their own language, though they have the advantage that polysyllabic morphemes are so common in Korean.

Knowing next to nothing about Vietnamese and based on the simple fact that it is an isolating language with limited affixation, I would think spacing rules for Vietnamese would be simpler than for Korean.
Ruben Polo-Sherk said,

October 3, 2012 @ 4:39 pm

The issue is semantic: Polysyllabic morphemes are independent in a way that the monosyllabic morphemes, when functioning as part of a compound, are not. They contain the entirety of the meaning. Now, even if it can be used independently, the monosyllabic morphemes, when they are serving to construct a compound, do not–the meaning of each is part of a large set of fundamental "nuts and bolts" that are put together to have meaning that can stand by itself. This fundamentalness is what I was talking about, and there are no (or at least trivially few) disyllabic morphemes in this group of fundamental ones.
Matt said,

October 3, 2012 @ 8:09 pm

One interesting thing about spaces in Japanese kids' books is that they don't come between words and particles. So in kana it's "いぬがはなを" (dog-NOM flower-ACC) but in Romaji it's (usually) "inu ga hana o". (Although the Portuguese missionaries used the same separation as modern kana: "inuga fanauo".) One useful effect of adding spaces to Japanese orthography would be the provision of a final, by-fiat answer to what exactly constitutes a word in Japanese. (Tongue only partly in cheek.)
wren ng thornton said,

October 3, 2012 @ 8:27 pm

@Ellen K:

There are certainly ambiguous cases in English, but I think the issue is one of severity. Most of the English examples I can think of are ones where the compositional structure has been lost to us (e.g., "a lot", "after all") and we treat the set phrase as a single word. (The other examples are compound nouns, but German seems to do fine with eliminating the spaces there.) However, to pick Japanese as an example, because of its agglutinative nature the issue of distinguishing words is problematic even for productive structures.

For example, Japanese uses a lot of verb compounding. This is vaguely similar to English's system of modal verbs, except that it's extremely productive instead of involving a closed set of forms. Depending on the verbs involved, these compounds could be (a) entirely compositional, (b) syntactically compositional but with non-compositional semantics, (c) semantically non-compositional to the point of being aspectual/affective markers, often with phonetic non-compositionality, or (d) non-compositional to the point that they are considered to be simple inflections rather than compounds. In the conventional romanization we treat most of (d) as single words; treat (a), (b), and the remainder of (d) as separate words; and waffle back and forth over (c). But because there's a continuum here —from clearly compositional processes through to tense/aspect/mood/polarity inflections— wherever you draw the line is going to be problematic.

To pick another issue, in the traditional romanization we separate off case morphemes from their noun (etc). This is strange, but then there's a continuum between case morphemes and postpositions, so again there's this issue of where to draw the boundary (if indeed any boundary should be drawn). And this gets confounded into other issues too. For true adjectives, the morpheme converting them into adverbs is traditionally romanized as part of the same word. Whereas for adjectival nouns, the morpheme converting them into adverbs is traditionally written as a separate word (since it's related to the dative). And that morpheme coincides with one for converting verbal stems into adverbs, but for verbal stems people waffle back and forth about whether it should be separated or not. That morpheme is also a form of the copula, so surely you'd want to be consistent about how you treat the copula elsewhere right? Etc. Etc.

If Vietnamese is at all similar, it's no wonder they settled on spaces between each morpheme/syllable. It's a bit extreme, but at least it's consistent, eh?
Ran Ari-Gur said,

October 3, 2012 @ 11:04 pm

@M (was L): Re: "I was responding to the handwringing about Lon Don": I don't see how you can have been, seeing as there wasn't any . . .
JS said,

October 4, 2012 @ 12:01 am

Ah… I am not clear on all points, but sense in your last comment a view of Chinese and Vietnamese word formation rather different from that which I have in mind: where I tend to think mostly about larger words formed from smaller words proper by a variety of processes (some of which might be properly called "compounding" and some not), it seems you view these languages as engaging in word-building from stores of (often bound) morphemes (the "nuts and bolts") in a more self-conscious manner — a la "classical" compounding in English, or novel unions of Sino-Japanese elements in Japanese?

These two possibilities are not mutually exclusive, of course… but my tendency to see the latter sort of "compounding" as more exceptional and less interesting might be the reason I have been slow to appreciate your suggestion regarding the relative productivity of monosyllabic vs. disyllabic morphemes in compounds (a difference I suppose I might see as merely a reflection of the sorts of words available in the language at a given time.)
Matt said,

October 4, 2012 @ 12:31 am

Also, part of the role that Chinese characters play in Japanese orthography is indicating word division. The basic principle is that "A change from kana to kanji usually indicates that a new word has begun."

人類社会のすべての構成員の固有の尊厳と平等で譲ることのできない権利とを承認することは、世界における自由、正義及び平和の基礎であるので、

jinruishakainosubeteno koseiinno koyuno songento byodode yuzurukotonodekinai kenritoo shoninsurukotowa, sekainiokeru jiyu, seigioyobi heiwano kisodearunode…

That simple rule above gets us about halfway to a working tokenizer — of the 12 "words" above, at least 6 or 7 are arguably "really words" if you accept the particles-are-part-of-the-word-they-follow argument. The lexicon needed to mop up the edge cases isn't unworkably enormous.
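
Editorial sketch (not part of the original comment): the kana-to-kanji rule quoted above can be tried out in a few lines of Python. The function name `split_on_kana_to_kanji`, the helper predicates, and the choice of Unicode ranges are assumptions made for illustration only; the sketch treats anything that is neither kana nor kanji (such as the comma 、) as a hard break, and it uses no lexicon, so it is only the "halfway" tokenizer the comment describes.

```python
def is_kanji(ch: str) -> bool:
    # CJK Unified Ideographs block, plus the iteration mark 々 (assumed sufficient here)
    return "\u4e00" <= ch <= "\u9fff" or ch == "々"


def is_kana(ch: str) -> bool:
    # Hiragana (U+3040..U+309F) and katakana (U+30A0..U+30FF) are adjacent blocks
    return "\u3040" <= ch <= "\u30ff"


def split_on_kana_to_kanji(text: str) -> list[str]:
    """Start a new chunk wherever the script changes from kana to kanji."""
    chunks: list[str] = []
    current = ""
    prev = ""
    for ch in text:
        if not (is_kanji(ch) or is_kana(ch)):
            # Punctuation and other symbols end the current chunk and are dropped
            if current:
                chunks.append(current)
            current, prev = "", ""
            continue
        if current and is_kana(prev) and is_kanji(ch):
            # The rule itself: kana followed by kanji marks a word boundary
            chunks.append(current)
            current = ""
        current += ch
        prev = ch
    if current:
        chunks.append(current)
    return chunks


# The sentence quoted in the comment
SAMPLE = (
    "人類社会のすべての構成員の固有の尊厳と平等で譲ることのできない"
    "権利とを承認することは、世界における自由、正義及び平和の基礎であるので、"
)

print(" | ".join(split_on_kana_to_kanji(SAMPLE)))
```

On the quoted sentence this reproduces the grouping of the spaced romanization: 正義及び stays together as "seigioyobi", for instance, because the only kana-to-kanji change in that stretch comes at び followed by 平.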

Of course, this doesn't mean that kanji are necessary for Japanese writing to make sense (as harped on endlessly in other threads), as any shift to a kanji-free writing system would surely see the introduction of spacing as well. But in my opinion this is part of the reason why there is such resistance to ideas like "only write Sino-Japanese words with kanji; write the native vocabulary (like 譲る in the example above) in kana" — the arrangement of different types of characters conveys the same sort of information as whitespace, albeit less efficiently and unambiguously.
Ruben Polo-Sherk said,

October 4, 2012 @ 8:31 am

JS: I'm sorry, but I'm not entirely sure what you're saying, so forgive me if I'm talking about something entirely different.

Aren't these two types of compounding entirely different phenomena? The first one isn't really particular to Chinese, and doesn't have anything to do with the morphological substructure, so I left it out of my original post. In fact, my point was that these two-morpheme compounds are *different* from compounds like "tennis racket" (if that's what you mean by "'classical' compounding"?).

Do you mean that you see the mechanism for establishing the meanings of two-(bound) morpheme compounds from their constituent parts as irregular to the point that you consider these compounds to be mostly "set" combinations, and therefore unitary?

If so, I'll try to explain why I see it the way I described it.

From my own experience learning them, I find that many (a majority?) of compounds are understandable entirely from their constituent morphemes. More specifically, in the past, when I came across an unfamiliar compound, but knew each morpheme, I would be able to understand what that compound meant from my knowledge of those morphemes. In fact, there have been times when I wanted a particular word, but hadn't learned it yet, and was able to successfully "derive" it from morphemes that I already knew (If you want examples of the kind of compounds I derived, some I remember now are 区分、両日、根源、外面的、変容).
JS said,

October 4, 2012 @ 9:59 am

^ Thanks for your remarks. Basically I feel that compounding from bound morphemes in Chinese at least, while it certainly exists, is not terrifically productive — such words (dian4shi4 电视 and the like) smell more like our coinages from Greek/Latin roots (what I imprecisely called "classical" compounding) or the Sino-Japanese contribution to CJK (ke1xue2 科学). The examples you raised earlier (li3jie3 理解, kan4jian4 看见, myriad others) are instead in origin free-free syntactic adjacencies (the latter arguably still phrasal), and I see no reason in principle why polysyllabic morphemes couldn't wind up involved in such lexicalization processes. So this second is indeed the "tennis racket" category, though much richer in practice than such a designation might suggest.

Incidentally, in neither case would I see the meanings of these Chinese "compounds" as generally transparent given their individual components, though the latter sort were at some point freely composed and thus are arguably so from time to time…
Jason said,

October 4, 2012 @ 11:26 am

@ JS

I think you are confusing compounds, which are similar to Germanic words in English (e.g. airport, kitchen table), and agglutination, which accounts for Greek and Latin words in English (e.g. deconstructionism). Mandarin, like English, employs both; however, compounding is by far the more productive form.
Ran Ari-Gur said,

October 4, 2012 @ 3:29 pm

@Jason: By "coinages from Greek/Latin roots" or "'classical' compounds", I assume that JS is referring to words like "biology", "telescope", "interject", etc., where a single word is formed by compounding (?) two bound morphemes ("bio-" and "-ology", "tele-" and "-scope", "inter-" and "-ject", etc.). Lexically and semantically, they're very similar to compounds of free morphemes like "life science" and "distance viewer", and to verb-particle idioms like "throw in".
JS said,

October 4, 2012 @ 8:53 pm

^ So… yeah, dian4shi4 etc. strike me as "biology"-type words, built self-consciously from the nuts-and-bolts Ruben Polo-Sherk has referred to, while the core of the Mandarin lexicon consists more of "life science"-type words (though of course of very diverse phrasal origins, found across word classes, and often with constituents no longer free.)

@Jason, not sure what you would want to call "agglutination" in Mandarin as distinct from "compounding"… perhaps -de suffixation to create "one who does X" meanings, -hua suffixation to create "ish"-ish meanings, and the like? In which case you would have processes limited in number but very productive indeed…

Apologies if I've derailed discussion… to return to the point, I might say I've found it interesting that those with knowledge of Vietnamese language and writing seem to find the suggestion of word division so asinine. The situation surely can't be so different from that of Mandarin, where IN THEORY (this naturally being as far as the present discussion means to extend), word division would be a workable and an at least marginally useful orthographical device.
Ruben Polo-Sherk said,

October 4, 2012 @ 11:16 pm

Ok, now I see what you're saying. I think that our disagreement comes from how we are viewing the processes involved in compounding for the "core" of the lexicon.

We agree on the fact that polysyllabic morphemes can form part of "life science" compounds, but I am claiming that there is an important distinction between "life science"/"tennis racket"/蝴蝶骨 compounds and ones like 空間/変化. In the former, both parts are stand-alone, independent words, and you are using the life/tennis/butterfly to specify the kind of science/racket/bone. This is not the same construction involved in 空間 or 理解. (If anything, the former is a lot closer to the "biology" type in construction). Whether or not polysyllabic morphemes can be involved in a particular process has, obviously, nothing to do with how many syllables they have; it has to do with the fact that, whatever the reason, all polysyllabic morphemes in Chinese are stand-alone, independent words, and not building blocks*.

For the purposes of this discussion, I'm splitting compounds into three types (some of my examples are Chinese; others are Sino-Japanese, but the mechanism is the same):

1) 电视, 化学, etc. These are essentially the same as "classical" compounds in English.

2) tennis racket, 蝴蝶骨, etc. This exists in lots of languages and is unremarkable.

3) 空間, 想要, 見解, 区分, 理解, 変化. This is what I mean by the core of the lexicon.

*If you know of any exceptions, please let me know, but I maintain that they'd still be statistically rare enough to be irrelevant to my larger point.
JS said,

October 5, 2012 @ 11:06 pm

^ To be "compounds" at all, all items under your (2) as well as (3) must be lexemes in their own right, with transparency or lack thereof merely a function of time, among other factors, correct? Surely da4ren(2) 大人 ("descriptive" compound; currently 'adult' and formerly 'your honor', etc.) is no different from hu2die2gu3, with li3jie3 and others (though very often of entirely different first syntactic structure) distinct from these only due to gradual loss of transparency? So, my claim was only that polysyllabic morphemes, though relatively few in number, may also engage in such processes.

I don't think we should speak of privileged "building blocks" in Mandarin aside from "suffixes" like -de, -jia, -men, etc., and arguably the bound forms on occasion exploited in your (1).
Ruben Polo-Sherk said,

October 7, 2012 @ 6:21 am

It seems that we've been using arguments that assume one interpretation or the other on whether these morphemes are lexemes or not, basically arguing from inconsistent paradigms. It seems to me that you see every morpheme, with the exception of things like -学, 电视, and -的, as always functioning as a lexeme. In Sino-Japanese, that interpretation is absolutely untenable–there's no question that the compounds themselves are the lexemes, but in Chinese, it's not so clear. There's only a valid distinction between polysyllabic morphemes and monosyllabic ones (or, more precisely, between bound and free ones) if you *don't* see every morpheme as a lexeme (excepting the agglutinative ones). If things like 理解 are taken to be clearly two words instead of one, then there is, of course, no utility to having the concept of a core process for forming the vocabulary at all (but my earlier point would nevertheless be correct–then every element of the lexicon, with a few very rare exceptions, is still monosyllabic). I'm not going to try to convince you or anyone else that things like 理解 are actually unitary in Chinese, since I don't believe that myself: many tools for analyzing other languages (for example, the concepts of parts of speech and word boundaries) are not suitable for Chinese, and everything looks fuzzy when you look at it from those perspectives. I'll only conclude with an argument for transparency and compositionality of these compounds: suppose you know what 理論、理解、解説、説明、and 回答 mean; you can infer what the "meanings" (or parts of meanings) are represented by 理 and 解. And then if you see 解答 for the first time, you can understand it compositionally. (I'm not claiming that *every* compound works like this–there are many, of course, that are rather opaque–but I do think that the majority remain compositional.)
Gpa said,

October 14, 2012 @ 4:29 pm

Vietnamese borrows mainly from Cantonese, which is a remnant from Middle Chinese, not Mandarin, which is a bunch of reduced sounds from Middle Chinese, so using Mandarin seems irrelevant. And using Japanese is more irrelevant. Most of the words in Japanese use ancient Chinese monosyllabic combinations with other monosyllabic words to form a disyllabic or polysyllabic word. Koreans due to their borrowing from Chinese, just like Japanese which borrows across the many varieties of Chinese dialects, so any Chinese dialect's original word is now not their own anymore. Basically, Japanese, Korean, and Vietnamese use the same method to convey Chinese disyllabism: Using approximate sounds via their devised writing systems, all via Chinese, to form the Chinese words, which might or might not sound like the original Chinese word anymore, due to Japanization, Koreanization and Vietnamization of these original Chinese words. 蝴蝶: 蝴 & 蝶 both mean "butterfly/butterflies", which are rarely separated to form other disyllabic / polysyllabic words in Chinese.

Source: http://languagelog.ldc.upenn.edu/nll/?p=4233

APPENDIX P

Strong Sino-Vietnamese Word Choice
in the Northern Vietnamese Sub-dialect vs. Southern Sub-dialect

LIST I

Ấntượng = Đángghinhớ, đángnhớ

Bang = Tiểubang (State)

Bắcbộ/Trungbộ/Nambộ = Bắcphần/Trungphần/Namphần

Báocáo = Thưatrình, nói, kể

Bảoquản = Chechở, giữgìn, bảovệ

Bàinói = Diễnvăn

Bảohiểm (mũ) = Antoàn (mũ)

Bèo = Rẻ (tiền)

Bồidưỡng = Nghỉngơi, tẩmbổ, sănsóc, chămnom, ănuốngđầyđủ, hốilộ

Bứcxúc = Dồnnén, bựctức

Bấtngờ = Ngạcnhiên

Bổsung = Thêm, bổtúc

Cáchly = Côlập

Cảnhbáo = Báođộng, phảichúý

CáiAlô = Cáiđiệnthoại

Cáiđài = Radio, máyphátthanh

Cănhộ = Cănnhà

Căng (lắm) = Căngthẳng

Cầulông = Vũcầu

Chảnh = Kiêungạo, làmtàng

Chấtlượng = Phẩmchất

Chấtxám = Trítuệ, sựthôngminh

Chếđộ = Quychế

Chỉđạo = Chỉthị, ralệnh

Chỉtiêu = Địnhsuất

Chủnhiệm = Trưởngban, Khoatrưởng

Chủtrì = Chủtọa

Chữacháy = Cứuhỏa

Chiêuđãi = Thếtđãi

Chui = Lénlút

Chuyênchở = Nóilên, nêura

Chuyểnngữ = Dịch

Chứngminhnhândân = ThẻCăncước

Chủđạo = Chính

Cocụm = Thuhẹp

Côngđoàn = Nghiệpđoàn

Côngnghiệp = Kỹnghệ

Côngtrình = Côngtác

Cơbản = Cănbản

Cơkhí (tĩnhtừ!) = Cầukỳ, phứctạp, máymóc

Cơsở = Cănbản, nguồngốc

Cửakhẩu = Phicảng, Hảicảng

Cụmtừ = Nhómchữ

Cứuhộ = Cứucấp

Diện = Thànhphần

Dựkiến = Phỏngđịnh

Đàotị = Tịnạn

Đầura/Đầuvào = Xuấtlượng/Nhậplượng

Đạitáo/Tiểutáo = Nấuănchung, ăntậpthể/Nấuănriêng, ăngiađình

Đạitrà = Quymô, cỡlớn

Đảmbảo = Bảođảm

Đăngký = Ghidanh, ghitên

Đápán = Kếtquả, trảlời

Đềxuất = Đềnghị

Độingũ = Hàngngũ

Độngnão = Vậndụngtríóc, suyluận

Đồngbàodântộc = Đồngbàosắctộc

Độngthái = Độngtĩnh

Độngviên = Khuyếnkhích

Độtxuất = Bấtngờ

Đườngbăng = Phiđạo

Đườngcaotốc = Xalộ

Giacông = Làmăncông

Giảiphóng = Lấylại, đemđi

Giảiphóngmặtbằng = Ủichođấtbằng

Giảnđơn = Đơngiản

Giaolưu = Giaothiệp, traođổi

Hạchtoán = Kếtoán

Hảiquan = QuanThuế

Hàngkhôngdândụng = Hàngkhôngdânsự

Hátđôi = Songca

Tốpca = Hợpca

Hạtnhân (vũkhí) = Nguyêntử

Hậucần = Tiếpliệu

Họcvị = Bằngcấp

Hệquả = Hậuquả

Hiệnđại = Tốitân

Hộnhà = Giađình

Hộchiếu = Chứngnhận

Hồhởi = Phấnkhởi

Hộkhẩu = Tờkhaigiađình

Hội Chữthậpđỏ = Hội Hồngthậptự

Hoànhtráng = Nguynga, tránglệ, đồsộ

Hưngphấn = Kíchđộng, vuisướng

Hữuhảo = Tốtđẹp

Hữunghị = Thânhữu

Huyện = Quận

Kênh = Băngtần

Khảnăng = Cóthểxẩyra

Khẩntrương = Nhanhlên

Khâu = Bộphận, nhóm, ngành, ban, khoa

Kiềuhối = Ngoạitệ

Kiệtsuất = Giỏi, xuấtsắc

Kinhqua = Trảiqua

Làmgái = Làmđiếm

Làmviệc = Thẩmvấn, điềutra

Liênhoan = Đạihội, ănmừng

Liênhệ = Liênlạc

Linhtinh = Vớvẩn

Lợinhuận = Lợitức

Lượctóm = Tómlược

Lýgiải = Giảithích

Nắmbắt = Nắmvững

Nângcấp = Nâng, hoặcđưagiátrịlên

Năngnổ = Siêngnăng, tháovát

Nghệnhân = Thợ, nghệsĩ

Nghệdanh = Tên (nghệsĩ)

Nghĩavụquânsự = Điquândịch

Nghiêmtúc = Nghiêmchỉnh

Nghiệpdư = Tàitử, đilàmthêm, nghềphụ, nghềtaytrái

Nhàkhách = Kháchsạn

Nhấttrí = Đồngý

Đồngthuận = Đồnglòng

Nhấtquán = Luônluôn, trướcsaunhưmột

Ngườinướcngoài = Ngoạikiều

Nỗiniềm = Suytư

Phầncứng = Cươngliệu

Phầnmềm = Nhuliệu

Phảnánh = Phảnảnh

Phảnhồi = Trảlời, hồiâm

Phátsóng = Phátthanh

PhóTiếnsĩ = Caohọc

Ga = Phitrường, phicảng, (gaxelửa)

Phivụ = Mộtvụtraođổithươngmại

Phụchồinhânphẩm = Hoànlương

Phươngán = Kếhoạch

Quátải = Quásức, quámức

Quántriệt = Hiểurõ

Quảnlý = Quảntrị

Quảngtrường = Côngtrường

Quânhàm = Cấpbậc

Quyhoạch = Kếhoạch

Quytrình = Quátrình

Sốc (“shocked”) = Kinhhoàng, kinhngạc, ngạcnhiên

Sơtán = Tảncư

Sư = Sưđoàn

Sứckhoẻcôngdân = Ytếcôngcộng

Sựcố = Trởngại

Tậpđoàn/Doanhnghiệp = Côngty

Tênlửa = Hỏatiễn

Thamgialưuthông = Láixe

Thamquan = Thămviếng

Thanhlý = Thanhtoán, chứngminh

Thânthương = Thânmến

Thicông = Làm

Thịphần = Thịtrường

Thunhập = Lợitức

Thưgiãn = Tỉnhtáo, giảitrí

Thuyếtphục (tính) = Cólý, hợplý, tinđược

Tiêntiến = Xuấtsắc

Tiếncông = Tấncông

Tiếpthu = Tiếpnhận, thâunhận, lãnhhội

Tiêudùng = Tiêuthụ

Tổlái = Phihànhđoàn

Tờrơi = Truyềnđơn

Tranhthủ = Cốgắng

Trítuệ = Kiếnthức

Triểnkhai = Khaitriển

Tưduy = Suynghĩ

Tưliệu = Tàiliệu

Từ = Tiếng, chữ

Ùntắc = Tắtnghẽn

Vấnnạn = Vấnđề

Vậnđộngviên = Lựcsĩ

ViệnUngbướu = ViệnUngthư

Vôtư = Tựnhiên

Xáctín = Chínhxác

Xecon = Xedulịch

Xekhách = Xeđò

Xửlý = Giảiquyết, thihành

LIST II

Bắtmắt = Đẹpmắt, Ưanhìn, Hấpdẫn

Bìnhổn = Quânbình, ổnđịnh

Cănhộ = Cănnhà

Cảitạo = Tùkhổsai

Chuicửahậu = Côngdu

CụcĐườngbiển = Hànghải

CụcĐườngsắt = Hỏaxa

Cókhảnăng = Cóthể

Dũngcảm = Mạnhmẽ

Đạitrà = Quymô

Đápán = Câutrảlời, Đápsố

Đẳngcấp = Giaicấp

Đilàmsuốt = Đilàmsuốtngày, suốtbuổi

Độngthái = Độngtĩnh

Độngviên = Khuyếnkhích

Giámềm = Giárẻ

Giáhữunghị = Giátượngtrưng

Giảmtốc = Giảmtốcđộ

Giaodịch = Thươngthảo

Hâm, Tửng = Khùng, mátgiây

Hộlý = Dâmnô

Hiểnthị = Xem, Thấy

Khẩutrang = Băngvệsinh

Khẩntrương = Gấprút, Khẩncấp

Làmchủ = Nôlệ

Lênlớp = Dạyđời, Sửalưng

Mặtbằng = Diệntíchđất

Nhânthân = Thânnhân

Phảnbiện = Phảnđối

Quantâm = Lolắng

Quảngbá = Quảngcáo, Truyềnbá

Quảnlý = Sởhữu

Sânbay = Phitrường

Tàuchủnướclạ = Tàucộngxâmlăng

Tàuvũtrụ = Phithuyền

Tiếnsĩhữunghị = Tiếnsĩgiấy, tiếnsĩdỏm

Tiếnđộ = Tiếntrình

Tiếpcận = Gầngũi, Giaotiếp

Tưvấn = Cốvấn

Tốchất = Tưchất

Tuesday, April 1, 2025

APPENDICES

Examples of some polysyllabic and dissyllabic vocabularies

I) Composite words:

II) Dissyllabic compound words:

III) Reduplicative polysyllabic and dissyllabic compound words or binoms:

IV) Polysyllabic "Vietnamized" neighboring Mon-Khmer and Daic words:

V) Polysyllabic Vietnamized English and French words:

VI) Culturally-accented Vietnamese words of Chinese origin:

APPENDIX C

Examples of some variable sound changes:

Thuận Nghịch Độc by Duc Tran

The author, a commentator and translator for Radio Free Asia (RFA) as of 2019, constructs an etymological analogy based on a poem by Phạm Thái (1777-1813) that is written in the "Thuận Nghịch Độc" form; that is, the standard reading follows the Sino-Vietnamese sounds:

From these readings we can clearly see the relations between those Sino- and Sinitic-Vietnamese words:

Các = Gác
Cẩm = Gấm
Cưỡng = Gượng
Liêm = Rèm, etc.

With this onset, we can apply the same patterns to other words:

and so on.

APPENDICES D to G

The case of "sông" by Tsu-lin Mei

(1) 江 **krong/kang/jiāng ‘Yangtze River’, ‘river’.

The case of "chết" by Tsu-lin Mei

(2) 札 *tsɛt ‘to die’

The case of "ruồi" by Tsu-lin Mei

(3) 維虫 *rwəi ‘fly’

APPENDIX G

The case of "ngà" by Tsu-lin Mei

(4) 牙 *ngra/nga/ya ‘tooth, tusk, ivory’

SOME OTHER SURVIVAL OF AUSTROASIATIC ETYMA

(5) 虎 ‘tiger’ ** k’la(g)/χuo/hu

(6) 囝 FC kiaŋ/AM kiă ‘son, child’

(7) 弩 *na/nuo/nu ‘crossbow’

(8) FC tyɔŋ/AM tɔŋ ‘shaman, spirit healer, medium’

(9) AM/Fu‘an tam ‘damp, wet, moist’

These forms, which are attested in most eastern Min dialects except Foochow, can be related to VN [ 'tẩm' = 浸 jìn, jīn < MC cjɨm < OC *cim, *cims (?) ] (wet, moist).

(10) FC siŋ/AM tsim ‘a type of crab’

(11) FC paiʔ/AM bat ‘to know, to recognize’

(12) FC p’uoʔ/AM p’eʔ . cf. Fu'an p’ut ‘scum, froth’

(13) 萍 FC p’iu /AM p’io ‘duckweed’

(14) FC kie/AM kue, cf. Kienyang ai ‘(small) salted fish’

Monosyllabic loanwords

Modern compounds

English | Vietnamese | Mandarin | Cantonese | Japanese | Korean

Self-coined Sino-Vietnamese compounds

Proper names

Yueren Ge (越人歌) and the Vietnamese language
(Tiếng Việt trong bài Việt Nhân Ca)
Author: liketolearn

VIỆTNHÂN CA

PHÁT HIỆN LẠI VỀ VIỆT NHÂN CA (越人歌) by Thanh Đo

Việt nhân ca

APPENDIX L

Ketchup's Chinese origins a sticky subject for US foodies

Vietnamese Polysyllabism
