New in 0.9.4:
* The data tables and line breaking algorithm have been updated to Unicode
version 6.0.0.
* A new include file unigbrk.h is provided. It declares functions for
grapheme cluster breaking, that is, determining the boundaries between
graphemes. See the documentation chapter "Grapheme cluster breaks in strings"
for details.
* In the include file unictype.h, constants are defined for the group of
general categories LC ("Cased Letter").
* In the include file unictype.h, functions for associating canonical
combining classes with names have been added:
uc_combining_class_name
uc_combining_class_long_name
uc_combining_class_byname
* In the include file unictype.h, functions for the Arabic joining type and
the Arabic joining group have been added:
uc_joining_type_name
uc_joining_type_long_name
uc_joining_type_byname
uc_joining_type
uc_joining_group_name
uc_joining_group_byname
uc_joining_group
* In the include file unictype.h, functions for new predefined properties
have been added:
uc_is_property_cased
uc_is_property_case_ignorable
uc_is_property_changes_when_lowercased
uc_is_property_changes_when_uppercased
uc_is_property_changes_when_titlecased
uc_is_property_changes_when_casefolded
uc_is_property_changes_when_casemapped
However, the case mapping functions from unicase.h are recommended
instead.
* In the include file unictype.h, the functions for bidi class, formerly known
as bidirectional category, have been renamed:
uc_bidi_category_name -> uc_bidi_class_name
uc_bidi_category_byname -> uc_bidi_class_byname
uc_bidi_category -> uc_bidi_class
uc_is_bidi_category -> uc_is_bidi_class
The old function names still exist, but are obsolete.
* In the include file unictype.h, functions for returning long names of
property values have been added:
uc_general_category_long_name
uc_bidi_class_long_name
The functions
uc_general_category_byname
uc_bidi_class_byname
have been extended to recognize long names as well as short names.
* It is now easier to detect the subminor version: The values of the variable
_libunistring_version and of the macro _LIBUNISTRING_VERSION now also include
the subminor version.
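
A sketch of how such a version number might be split into its components.
It assumes the encoding (major<<16) + (minor<<8) + subminor, which is not
stated in this file; struct version_parts and split_version are
hypothetical names and not part of the library.

```c
/* Hypothetical type and helper, not part of libunistring.  */
struct version_parts
{
  int major;
  int minor;
  int subminor;
};

/* Splits a version number, assuming it is encoded as
   (major<<16) + (minor<<8) + subminor.  */
static struct version_parts
split_version (int v)
{
  struct version_parts p;

  p.major = v >> 16;
  p.minor = (v >> 8) & 0xff;
  p.subminor = v & 0xff;
  return p;
}
```

In a real program the argument would be _libunistring_version or
_LIBUNISTRING_VERSION; under the assumed encoding, the value 0x000904 splits
into 0, 9 and 4.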
* The functions u8_mbtouc and u8_mbtouc_unsafe now handle ill-formed UTF-8
input in a way that is more compliant with W3C recommendations.
* The functions u8_strcoll, u16_strcoll, u32_strcoll now produce results that
are less dependent on the iconv implementation in use.
* The functions u8_strstr, u16_strstr, u32_strstr now run in O(n) worst-case
time, where n is the sum of the lengths of the argument strings.
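
To illustrate what a linear worst-case bound means for substring search,
here is a sketch of the classic Knuth-Morris-Pratt algorithm over byte
strings. It is not the library's implementation, which this file does not
describe; kmp_find is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libunistring: returns a pointer to the
   first occurrence of needle (m bytes) in hay (n bytes), or NULL if there
   is none.  Also returns NULL if memory allocation fails.  */
static const uint8_t *
kmp_find (const uint8_t *hay, size_t n, const uint8_t *needle, size_t m)
{
  const uint8_t *result = NULL;
  size_t *fail;
  size_t i, k;

  if (m == 0)
    return hay;
  if (m > n)
    return NULL;

  /* fail[i] is the length of the longest proper prefix of
     needle[0..i] that is also a suffix of it.  */
  fail = malloc (m * sizeof *fail);
  if (fail == NULL)
    return NULL;
  fail[0] = 0;
  for (i = 1, k = 0; i < m; i++)
    {
      while (k > 0 && needle[i] != needle[k])
        k = fail[k - 1];
      if (needle[i] == needle[k])
        k++;
      fail[i] = k;
    }

  /* Scan the haystack once; k is the number of needle bytes matched.  */
  for (i = 0, k = 0; i < n; i++)
    {
      while (k > 0 && hay[i] != needle[k])
        k = fail[k - 1];
      if (hay[i] == needle[k])
        k++;
      if (k == m)
        {
          result = hay + i + 1 - m;
          break;
        }
    }

  free (fail);
  return result;
}
```

The haystack index never moves backwards, so the total work is proportional
to n + m even for inputs such as "aaaaaaab" and "aab" that make a naive
search repeat comparisons.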

libunistring provides a library that implements Unicode strings (in
three flavours: UTF-8 strings, UTF-16 strings, UTF-32 strings),
together with functions for Unicode characters (character names,
classifications, properties) and functions for string processing
(formatted output, width, word breaks, line breaks, normalization,
case folding, regular expressions).