It often happens that you have text data in Unicode, but you need to
represent it in ASCII. This is the case, for example, when integrating
with legacy code that doesn't support Unicode, when entering non-Roman
names on a US keyboard, or when constructing ASCII machine identifiers
from human-readable Unicode strings that should still be somewhat
intelligible (a popular example of this is making a URL slug from an
article title).

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that, for example, also contain ASCII approximations for symbols and
non-Latin alphabets.

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.
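
For comparison, here is a minimal sketch of the accent-stripping
approach mentioned above, using only the standard library. The helper
names strip_accents and ascii_only are hypothetical and are not part
of this module. The sketch only illustrates where plain accent
stripping falls short.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_only(text):
    """Strip accents, then discard whatever is still not ASCII."""
    return strip_accents(text).encode("ascii", "ignore").decode("ascii")


# Accented Latin letters are reduced to their base letters.
print(ascii_only("Škoda café"))

# Characters without a decomposition (here, Chinese) are simply lost.
print(repr(ascii_only("北京")))
```

The first call keeps a readable result, while the second returns an
empty string. Hand-tuned mappings for symbols and non-Latin alphabets
are meant to cover inputs like the second one.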