It often happens that you have text data in Unicode, but you need
to represent it in ASCII. This comes up, for example, when integrating
with legacy code that doesn't support Unicode, for ease of entry of
non-Roman names on a US keyboard, or when constructing ASCII machine
identifiers from human-readable Unicode strings that should still be
somewhat intelligible (a popular example of this is making a URL slug
from an article title).

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that, for example, also contain ASCII approximations for symbols and
non-Latin alphabets.

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.
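
For comparison, the plain accent-stripping approach mentioned above can
be sketched with the standard library alone. This is only an
illustration of that baseline, not of this module's mappings, and
strip_accents is a hypothetical helper name chosen for the example.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters, then discard everything that is not ASCII.

    Hypothetical helper for illustration; not part of this module.
    """
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding with errors="ignore" drops the marks and any other non-ASCII character.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("café"))  # prints "cafe"
print(strip_accents("北京"))   # prints an empty line: nothing decomposes to ASCII
```

Characters with no decomposition into ASCII, such as those of non-Latin
alphabets, are simply lost with this approach, which is the gap that
hand-tuned character mappings are meant to fill.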