13 lines
784 B
Text
13 lines
784 B
Text
|
It often happens that you have non-Roman text data in Unicode, but you can't
|
||
|
display it -- usually because you're trying to show it to a user via an
|
||
|
application that doesn't support Unicode, or because the fonts you need aren't
|
||
|
accessible. You could represent the Unicode characters as "???????" or
|
||
|
"\15BA\15A0\1610...", but that's nearly useless to the user who actually wants
|
||
|
to read what the text says.
|
||
|
|
||
|
What Text::Unidecode provides is a function, unidecode(...) that takes Unicode
|
||
|
data and tries to represent it in US-ASCII characters (i.e., the universally
|
||
|
displayable characters between 0x00 and 0x7F). The representation is almost
|
||
|
always an attempt at transliteration -- i.e., conveying, in Roman letters, the
|
||
|
pronunciation expressed by the text in some other writing system.
|