Mojibake (文字化け, from moji "character" + bake "change", literally "changed characters" or "ghost characters") is a Japanese term for garbled text: the result of displaying text in a character encoding that the software is not configured to handle. This often happens because the script is "foreign" to the makers of the software, but the problem can also arise between different encodings of the same language, such as EUC-JP and Shift-JIS, both of which encode Japanese characters.
In the mid-1990s, as the problem became common, several websites featured mojibake not as a problem to be tackled but as a computer joke. Words and even whole sentences were "deciphered", with invented meanings that delivered funny messages. It was even joked that mojibake must be the work of extraterrestrials or ghosts trying to send secret messages.
In Chinese it is called luan ma (亂碼 or 乱码, luan4 ma3), or "chaotic code(s)".
Example: "文字化け" might be displayed as "•¶Žš‰»‚¯" (of course, depending on the software you use to view this article, that example may not show up correctly).
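The garbled string in this example is consistent with Shift-JIS bytes being read as Windows-1252 (CP1252). The short Python sketch below reproduces it under that assumption; the function names garble and repair are made up for illustration, and the choice of codecs is a guess about what the writing and reading software each used.

```python
# Sketch only: assumes the text was written as Shift-JIS and read as CP1252.
ORIGINAL = "文字化け"


def garble(text, written_as="shift_jis", read_as="cp1252"):
    """Encode text with one codec, then decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


def repair(garbled, written_as="shift_jis", read_as="cp1252"):
    """Undo garble(); works only if every byte survived the wrong decoding."""
    return garbled.encode(read_as).decode(written_as)


garbled = garble(ORIGINAL)   # "•¶Žš‰»‚¯"
restored = repair(garbled)   # "文字化け"
```

The repair step succeeds here because each of the eight bytes happens to map to a defined CP1252 character. When the wrong decoder drops or substitutes bytes, the original text cannot be recovered this way.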
Problems in other languages
The problem is not unique to Asian users; computer users in Central and Eastern Europe fared no better. Because computers were generally not networked even in the mid-to-late 1980s, a different character encoding arose for every language with diacritical characters.
During the 1990s, Russian computer users had to endure several competing encodings of the Cyrillic alphabet (KOI8-R on Unix, CP-1251 on Windows, code page 866 on DOS, the ISO 8859-5 standard, and several others). Badly configured servers and a lack of compatibility made garbled text a common and frustrating experience. Russian users, alarmed by the strange and unusual characters appearing in place of familiar Cyrillic letters, called them krokozyabry. Many e-mail servers stripped the 8th bit from each character, as permitted by earlier standards, which mangles UTF-8 as well as all of the encodings above. For this reason many Cyrillic users resorted to Roman transliteration. An even more frustrating problem emerged in the early 2000s, when the popular e-mail client Outlook started replacing all entered Cyrillic characters with question marks when replying to or forwarding a message created in another code page.
In Poland, every entity selling early DOS computers "invented" its own encoding and reprogrammed the EPROMs of CGA/EGA/Hercules cards with character shapes for that encoding. In addition, users of then-popular home computers (such as the Amiga and Atari ST) "invented" their own encodings, incompatible with international standards (ISO 8859-2), vendor standards (IBM CP852, Windows CP1250) and locally agreed PC/MS-DOS standards (Mazovia). The situation began to improve when, thanks to pressure from academic and user groups, ISO 8859-2 succeeded as the "internet standard", with limited support from the dominant vendor's software. Because of the numerous problems with all those encodings, even today some users refer to Polish diacritical characters as krzaki (bushes). See also Polish codepages.
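Both failure modes described for Cyrillic text can be imitated in a few lines of Python. This is a minimal sketch, not a reconstruction of any particular server or mail client: the sample word and the helper names misread and strip_eighth_bit are hypothetical, and the codec pairing is just one of the many possible mismatches.

```python
# Sketch only: the sample text and the helper names are illustrative assumptions.
SAMPLE = "Привет"  # "hello"


def misread(text, written_as, read_as):
    """Decode bytes with a different Cyrillic codec than the one that wrote them."""
    return text.encode(written_as).decode(read_as)


def strip_eighth_bit(data):
    """Clear the high bit of every byte, as a 7-bit mail relay might have done."""
    return bytes(b & 0x7F for b in data)


# KOI8-R bytes shown by software that assumes CP1251: still Cyrillic letters,
# but the wrong ones.
koi8_as_cp1251 = misread(SAMPLE, "koi8_r", "cp1251")

# CP1251 bytes after 8th-bit stripping: plain ASCII letters, here "Ophber".
stripped_cp1251 = strip_eighth_bit(SAMPLE.encode("cp1251")).decode("ascii")

# UTF-8 bytes after 8th-bit stripping: every byte now falls below 0x80,
# so the two-byte sequences that encoded each letter are destroyed.
stripped_utf8 = strip_eighth_bit(SAMPLE.encode("utf-8"))
```

A codec mismatch of the first kind is reversible when both encodings define every byte involved, whereas 8th-bit stripping throws information away and cannot be undone in general.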
External links