First, a flow chart:
For those of you who know Chinese, the characters on the far left are Chinese for "garbage". (The little Japanese kana-like things to the left of the characters are the Taiwanese BoPoMoFo pronunciation guides for the characters; I included them because the proper way to pronounce the characters, especially the ones on the right, is unknown to many.) Now, for those of you who have had exposure to both the Taiwanese flavour of Mandarin and the Mainland Chinese flavour of Mandarin (a.k.a. Pu3 Tong1 Hua4), the two characters on the left are one of the more extreme examples of the two flavours disagreeing on pronunciation.
PuTongHua was standardized after the 1949 split between the Communist and Nationalist parties, and it abolished some of the pronunciation recommendations of the Nationalist Government (circa 1923), substituting pronunciations closer to the Beijing dialect. For the most part the differences are tonal, the majority being between the 2nd and 3rd tones, or between the 3rd and the 5th (light) tones. But for this phrase, the mainland Chinese read
la1 ji1
and the Taiwanese read
le4 se4
(which I noted in BoPoMoFo next to the characters in the above image).
And I think I have a plausible reason to explain the difference.
I started this line of research when I wondered, "Why use a two-word phrase to describe an everyday object when neither of the characters, as far as I know, is used in any other context?" So I looked up the etymology of the phrase in Ci2 Hai3 (the "Sea of Words" dictionary).
What I found is summarized in the flow chart above.
- Originally there were two different phrases for refuse, listed on the far right above. The first (top) one is pronounced la4 sa4, the second (bottom) ke4 sa4. The extant documents that use the characters of the second form only use them together to mean refuse, and the second character in the phrase is not even found in many modern computer character sets for Chinese. (I could not input it with any of the input methods I tried, and the character does not exist according to any of the computers I have at home, which is why I present the above as a picture rather than text: I had to build the character from scratch in image-editing software.)
- The first of the two original phrases is used more often and cited in more texts. (Notice that the second character is the same in both phrases, suggesting that its sole meaning is refuse.) The first character in that form, besides being read as la4 in this phrase, can also be read lie4, and with that pronunciation it is used as a generic verb meaning "to stir, to mix, to part", etc.
- But the characters for garbage, in either form, are rather complicated-looking. And since garbage is such an "everyday" occurrence, the common people soon took over the language and demanded simpler ways of writing the two characters. The first attempts at simplification just replaced the original characters directly with the ones in the middle column, pronouncing them the same way as before.
- However, the replacement characters are too common: each has multiple meanings by itself and appears in other phrases with other pronunciations! The first character is most commonly pronounced la1 and means "to pull". The second character has the pronunciations xi1 or ji2 (among others), with different meanings that are obscure by modern standards. This is probably the origin of the Mainland Chinese pronunciation la1 ji1.
- Since the replacement characters were too common, the radicals of the characters were then changed (garbage has more to do with dirt than with hands anyway). The second character in its final form has an earlier meaning, "danger", when pronounced ji2. The first character, however, is a completely new invention, and is now used only in the phrase "garbage".
- As to where the pronunciation le4 se4 comes from: most likely it arose from a corruption or vowel shift (probably someone on the Nationalist Government's standardization committee came from a locale whose dialect reads it le4 se4). Evidence that the "proper/ancient" pronunciation should be la4 sa4 can be found in Cantonese, where a rough approximation (since I don't know any standard transliteration method for Cantonese) is lab sab (I won't attempt to figure out which of the 9 tones it should be). (Cantonese is well known as one of the Chinese dialects that preserve the most of the classical Chinese pronunciations.)
So that's the story of garbage.