PR: ports/108574 Repocopied by: marcus
Uniutils consists of five programs for finding out what is in a Unicode file.
They are useful when working with Unicode files if one doesn't know the
writing system, doesn't have the necessary font, needs to inspect invisible
characters, needs to find out whether characters have been combined or in what
order they occur, or needs statistics on which characters occur.

uniname defaults to printing the character offset of each character, its byte
offset, its hex code value, its encoding, the glyph itself, and its name.

unidesc reports the character ranges to which different portions of the text
belong. It can also be used to identify Unicode encodings (e.g. UTF-16BE)
flagged by magic numbers.

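As an illustration of what identifying an encoding by its magic number can
involve, the following is a minimal Python sketch. It is not unidesc's own
code; the function name guess_encoding_from_bom is hypothetical, and the
sketch assumes that "magic numbers" refers to the Unicode byte order mark at
the start of the data.

```python
import codecs

# Hypothetical helper, not part of uniutils.
# Longer signatures come first: the UTF-32LE mark begins with the same
# two bytes as the UTF-16LE mark, so order matters.
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]


def guess_encoding_from_bom(data: bytes):
    """Return an encoding name if data starts with a known byte order mark."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None  # no signature found; the encoding cannot be inferred this way
```

Data without a byte order mark yields None here, so a check of this kind can
only confirm an encoding, not rule one out.
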
unihist generates a histogram of the characters in its input, which must be
encoded in UTF-8.

ExplicateUTF8 is intended for debugging or for learning about Unicode. It
determines and explains the validity of a sequence of bytes as a UTF-8 encoding.

unirev is a filter that reverses UTF-8 strings character-by-character (as
opposed to byte-by-byte).

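The difference between the two kinds of reversal can be seen in a short Python
sketch. This is not unirev's implementation; the function names are
hypothetical, and "character" is assumed here to mean a single Unicode code
point.

```python
def reverse_chars(data: bytes) -> bytes:
    """Reverse UTF-8 data one code point at a time (hypothetical helper)."""
    return data.decode("utf-8")[::-1].encode("utf-8")


def reverse_bytes(data: bytes) -> bytes:
    """Reverse the raw bytes, ignoring character boundaries."""
    return data[::-1]


sample = "a\u00f1b".encode("utf-8")   # 'a', n with tilde (2 bytes), 'b'
by_char = reverse_chars(sample)       # still valid UTF-8
by_byte = reverse_bytes(sample)       # the 2-byte sequence is now backwards
```

Reversing by bytes puts the continuation byte of a multi-byte character ahead
of its lead byte, so the result is no longer valid UTF-8, whereas reversing by
character keeps each encoded character intact.
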
WWW: http://billposer.org/Software/unidesc.html