2010-12-07 18:58:13 +01:00
|
|
|
@comment $NetBSD: PLIST,v 1.19 2010/12/07 17:58:13 adam Exp $
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060, section 5.1.3 ("Mailbox International
Naming Convention"), and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions for efficiently checking whether a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor does not preserve readonly aliasing any more
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines the UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
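The IMAP mailbox-name encoding mentioned in the list above (RFC 2060, section 5.1.3) can be illustrated with a short sketch. This is not ICU's converter: it is a minimal Python illustration of the modified UTF-7 rules, and the function name encode_imap_mailbox is hypothetical.

```python
import base64


def encode_imap_mailbox(name: str) -> str:
    """Encode a mailbox name with the modified UTF-7 of RFC 2060, 5.1.3.

    Illustrative sketch only; the function name is hypothetical and this
    is not the ICU "IMAP-mailbox-name" converter.
    """
    out = []
    pending = []  # run of characters outside printable ASCII

    def flush():
        # Emit the pending run as "&" + modified base64 of UTF-16BE + "-".
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no "=" padding, "," instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # "&" is the shift character, so a literal "&" becomes "&-".
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


print(encode_imap_mailbox("~peter/mail/台北/日本語"))
```

The input in the last line is the example mailbox name given in RFC 2060 itself.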
2003-03-22 00:44:05 +01:00
|
|
|
bin/derb
|
|
|
|
bin/genbrk
|
update to 4.2.1
major changes:
Locale Data: ICU uses and supports data from Common Locale Data Repository
(CLDR) 1.7, which includes data for 146 languages, 159 territories, and
468 locales, 21% more locale data than the previous release.
Number system support and the number keyword.
Number system override in DateFormat
Numerics used by Hebrew Calendar date in Hebrew locale
BCP47 (language tag) / Locale transformation
BCP47 mapping of LDML keywords
Encoding selector: Return a list of charsets that can handle the input text
Simple duration: Implementation of CLDR duration format
Available/Preferred keywords for a locale (Calendar, Collation, and Currency)
StringPrep standard profiles: RFC3491 NAMEPREP, RFC3530 NFS4, RFC3722 iSCSI,
RFC3920 NodePrep/ResourcePrep, RFC4011 MIB, RFC4013 SASLprep, RFC4505 trace
and RFC4518 LDAPprep
Miscellaneous Arabic shaping enhancements
UTF-8 friendly internal data structure for Unicode data lookup
API to get CLDR version used by ICU
ISCII charset converter updates (added Gurumukhi, other updates)
Performance improvements in Time Zone Name format/parse, and in
DateIntervalFormat construction
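The "encoding selector" item above (return the charsets that can handle a given text) can be pictured with a small sketch. It uses Python's built-in codecs rather than ICU converters, and the function name usable_encodings is hypothetical; ICU's own selector API is not shown here.

```python
def usable_encodings(text, candidates):
    """Return the candidate encodings that can represent every character
    of `text`. Illustrative sketch using Python codecs, not ICU; the
    function name is hypothetical.
    """
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no mapping in this charset.
            continue
        usable.append(name)
    return usable


print(usable_encodings("naïve", ["ascii", "latin-1", "utf-8"]))
```

An unknown codec name raises LookupError, which this sketch deliberately does not catch.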
2009-08-05 19:01:17 +02:00
|
|
|
bin/gencfu
|
2003-03-22 00:44:05 +01:00
|
|
|
bin/gencnval
|
2007-03-23 13:51:13 +01:00
|
|
|
bin/genctd
|
2003-03-22 00:44:05 +01:00
|
|
|
bin/genrb
|
|
|
|
bin/icu-config
|
2010-12-07 18:58:13 +01:00
|
|
|
bin/icuinfo
|
2003-03-22 00:44:05 +01:00
|
|
|
bin/makeconv
|
|
|
|
bin/pkgdata
|
|
|
|
bin/uconv
|
|
|
|
include/layout/LEFontInstance.h
|
|
|
|
include/layout/LEGlyphFilter.h
|
update to icu-3.0
major changes:
ICU 3.0 includes the latest bug fixes, locale/charset updates, and
performance/build/porting enhancements.
- Collation
Collation data is in a separate data tree, allowing for easier
modularization and maintenance.
getFunctionalEquivalent API allows for better caching and UI support.
- Unicode 4.0.1
ICU is updated to the latest version of Unicode standard, which had
significant property changes.
- CLDR 1.1
Updates to CLDR 1.1, with many updates to locale data, and special
emphasis on collation data.
- Formatting
As an aid to migration of traditional C (stdio) and C++ (iostream)
formatting, the POSIX-like input/output library, icuio, is officially
supported.
Significant digits now supported in DecimalFormat, for general use and
%g support.
- RFC822 time zone format support in DateFormat for compatibility.
- Currency formatting/parsing improvements
Allows parsing multiple currencies with one formatter, without knowing the
currency in advance. Much cleaner design allowing extensibility to other
measurement units in the future.
- Regular expressions (C)
The regular expressions framework now features a C API, instead of just C++.
- Locales
Locale canonicalization spec defined and implemented. Provides
interoperability with POSIX and .NET locale IDs, more RFC 3066 support.
- Layout engine
Layout engine now supports using different canonically-equivalent Unicode
forms of the same text: e.g. a + ´ or á. This is especially important for
non-Latin scripts.
- Build Environment
ICU can now build its data library much faster on most platforms.
For a complete list see:
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?tag=release-3-0
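The layout engine item above refers to canonically equivalent forms of the same text. A minimal sketch of what that equivalence means, using Python's unicodedata module rather than ICU, and assuming the accent in "a + ´" is the combining acute accent U+0301 (the helper name canonically_equivalent is hypothetical):

```python
import unicodedata

decomposed = "a\u0301"   # "a" followed by COMBINING ACUTE ACCENT (assumed)
precomposed = "\u00e1"   # LATIN SMALL LETTER A WITH ACUTE


def canonically_equivalent(s1, s2):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", s1) == unicodedata.normalize("NFD", s2)


# The raw code point sequences differ, but the text is the same.
print(decomposed == precomposed)
print(canonically_equivalent(decomposed, precomposed))
```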
2004-06-26 22:18:50 +02:00
|
|
|
include/layout/LEGlyphStorage.h
|
|
|
|
include/layout/LEInsertionList.h
|
2003-06-23 09:49:39 +02:00
|
|
|
include/layout/LELanguages.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/layout/LEScripts.h
|
|
|
|
include/layout/LESwaps.h
|
|
|
|
include/layout/LETypes.h
|
|
|
|
include/layout/LayoutEngine.h
|
2003-06-23 09:49:39 +02:00
|
|
|
include/layout/ParagraphLayout.h
|
|
|
|
include/layout/RunArrays.h
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of string search implementation using
Boyer-Moore algorithm (#6286). For detailed information, see
the upstream tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157) Note: Calendar type
supported by this feature is Gregorian only in this
release.
o Improved Plural support
* ICU4C Specific Changes
o Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771, which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected;
however, the updated versions of the previous tests may be
run by applying the following patch to ICU 4.0: r24324. In
addition, ICU4C and ICU4J 4.0 resolve the issue underlying
CVE-2008-1036.
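The 4.0.1 notes above mention a technical preview of string search based on the Boyer-Moore algorithm. As a rough illustration of the idea, here is a sketch of the simpler Horspool variant (bad-character shifts only) over plain Python strings. It is not ICU's implementation, it ignores collation entirely, and the function name horspool_find is hypothetical.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of `pattern` in `text`,
    or -1 if there is none.

    Boyer-Moore-Horspool sketch (a simplification of Boyer-Moore);
    illustrative only, not the ICU string search code.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # For each character of the pattern except the last, record how far
    # the window may move when that character is aligned with the
    # window's final position. Later occurrences overwrite earlier ones,
    # which keeps the smallest (safe) shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters absent from the table allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_find("here is a simple example", "example"))
```

The benefit over a naive scan is that a mismatch on a character absent from the pattern lets the window skip ahead by the whole pattern length.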
2009-03-25 23:30:19 +01:00
|
|
|
include/layout/loengine.h
|
|
|
|
include/layout/playout.h
|
|
|
|
include/layout/plruns.h
|
|
|
|
include/unicode/basictz.h
|
|
|
|
include/unicode/bms.h
|
|
|
|
include/unicode/bmsearch.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/brkiter.h
|
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/bytestream.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/calendar.h
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060/5.1.3. Mailbox International Naming
Convention and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions efficiently check whether a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor does not preserve readonly aliasing any more
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines the UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
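The code point check mentioned under "Unicode string functions" can be illustrated with a minimal sketch. It is written in Python rather than against ICU itself, and the names has_more_code_points_than() and utf16_units() are hypothetical helpers, not ICU functions. The assumption is that the check can often be decided from the length alone, because every code point occupies one or two UTF-16 code units; counting is only needed in the remaining cases.

```python
def has_more_code_points_than(units, number):
    """Return True if the UTF-16 code unit sequence holds more than `number` code points."""
    length = len(units)
    # Every code point uses at least one code unit, so too few units means "no".
    if length <= number:
        return False
    # Every code point uses at most two code units, so enough units means "yes".
    if (length + 1) // 2 > number:
        return True
    # Otherwise count code points, stopping as soon as the answer is known.
    count = 0
    i = 0
    while i < length:
        unit = units[i]
        i += 1
        # A lead surrogate followed by a trail surrogate forms one code point.
        if 0xD800 <= unit <= 0xDBFF and i < length and 0xDC00 <= units[i] <= 0xDFFF:
            i += 1
        count += 1
        if count > number:
            return True
    return False


def utf16_units(text):
    """Hypothetical helper for the example: split a string into UTF-16 code units."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[k:k + 2], "big") for k in range(0, len(data), 2)]
```

For a string of four code units that contains one surrogate pair, the length shortcuts do not settle a threshold of 2 or 3, so the loop runs and counts three code points.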
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/caniter.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/chariter.h
|
|
|
|
include/unicode/choicfmt.h
|
|
|
|
include/unicode/coleitr.h
|
|
|
|
include/unicode/coll.h
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of a string search implementation using the
Boyer-Moore algorithm (#6286). For detailed information, see
the upstream tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157) Note: this feature
supports only the Gregorian calendar type in this
release.
o Improved Plural support
* ICU4C Specific Changes
Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771, which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected;
however, the updated versions of the previous tests may be
run by applying the following patch to ICU 4.0: r24324.
ICU4C and ICU4J 4.0 also resolve the issue underlying
CVE-2008-1036.
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/colldata.h
|
update to icu-3.0
major changes:
ICU 3.0 includes the latest bug fixes, locale/charset updates, and
performance/build/porting enhancements.
- Collation
Collation data is in a separate data tree, allowing for easier
modularization and maintenance.
getFunctionalEquivalent API allows for better caching and UI support.
- Unicode 4.0.1
ICU is updated to the latest version of Unicode standard, which had
significant property changes.
- CLDR 1.1
Updates to CLDR 1.1, with many updates to locale data, and special
emphasis on collation data.
- Formatting
As an aid to migration of traditional C (stdio) and C++ (iostream)
formatting, the POSIX-like input/output library, icuio, is officially
supported.
Significant digits now supported in DecimalFormat, for general use and
%g support.
- RFC822 time zone format support in DateFormat for compatibility.
- Currency formatting/parsing improvements
Allows parsing multiple currencies with one formatter, without knowing the
currency in advance. Much cleaner design allowing extensibility to other
measurement units in the future.
- Regular expressions (C)
The regular expressions framework now features a C API, instead of just C++.
- Locales
Locale canonicalization spec defined and implemented. Provides
interoperability with POSIX and .NET locale IDs, more RFC 3066 support.
- Layout engine
Layout engine now supports using different canonically-equivalent Unicode
forms of the same text: e.g. a + ´ or á. This is especially important for
non-Latin scripts.
- Build Environment
ICU can now build its data library much faster on most platforms.
For a complete list see:
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?tag=release-3-0
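The "canonically-equivalent Unicode forms" mentioned under "Layout engine" can be illustrated with a short sketch. It uses Python's standard unicodedata module, not ICU, and assumes that "a + ´" in the note stands for the letter a followed by U+0301 COMBINING ACUTE ACCENT. The function name canonically_equivalent() is hypothetical.

```python
import unicodedata

# Two spellings of the same text (assumed reading of the release note).
decomposed = "a\u0301"   # 'a' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE


def canonically_equivalent(left, right):
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", left) == unicodedata.normalize("NFD", right)
```

The two strings differ code point by code point, yet they compare equal once both are normalized, which is why a layout engine is expected to render them the same way.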
2004-06-26 22:18:50 +02:00
|
|
|
include/unicode/curramt.h
|
update to 4.2.1
major changes:
Locale Data: ICU uses and supports data from Common Locale Data Repository
(CLDR) 1.7, which includes data for 146 languages, 159 territories, and
468 locales, 21% more locale data than the previous release.
Number system support and the number keyword.
Number system override in DateFormat
Numerics used by Hebrew Calendar date in Hebrew locale
BCP47 (language tag) / Locale transformation
BCP47 mapping of LDML keywords
Encoding selector: Return a list of charsets that can handle the input text
Simple duration: Implementation of CLDR duration format
Available/Preferred keywords for a locale (Calendar, Collation, and Currency)
StringPrep standard profiles: RFC3491 NAMEPREP, RFC3530 NFS4, RFC3722 iSCSI,
RFC3920 NodePrep/ResourcePrep, RFC4011 MIB, RFC4013 SASLprep, RFC4505 trace
and RFC4518 LDAPprep
Miscellaneous Arabic shaping enhancements
UTF-8 friendly internal data structure for Unicode data lookup
API to get CLDR version used by ICU
ISCII charset converter updates (added Gurumukhi, other updates)
Performance improvements in Time Zone Name format/parse, and in
DateIntervalFormat construction
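The idea behind the "Encoding selector" item can be sketched as follows. This is not ICU's selector API, which is assumed to work differently internally. It is a plain Python illustration using the standard codecs, and charsets_that_can_encode() is a hypothetical name.

```python
def charsets_that_can_encode(text, candidates):
    """Return the candidate charset names that can represent every character of `text`."""
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            # At least one character has no mapping in this charset.
            continue
        usable.append(name)
    return usable
```

For example, a string containing "é" would be rejected for "ascii" but accepted for "latin-1" and "utf-8". The result keeps the order of the candidate list.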
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/currpinf.h
|
2004-06-26 22:18:50 +02:00
|
|
|
include/unicode/currunit.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/datefmt.h
|
|
|
|
include/unicode/dbbi.h
|
|
|
|
include/unicode/dcfmtsym.h
|
|
|
|
include/unicode/decimfmt.h
|
2000-12-21 19:14:18 +01:00
|
|
|
include/unicode/docmain.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/dtfmtsym.h
|
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/dtintrv.h
|
|
|
|
include/unicode/dtitvfmt.h
|
|
|
|
include/unicode/dtitvinf.h
|
|
|
|
include/unicode/dtptngen.h
|
|
|
|
include/unicode/dtrule.h
|
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/errorcode.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/fieldpos.h
|
|
|
|
include/unicode/fmtable.h
|
|
|
|
include/unicode/format.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/fpositer.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/gregocal.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/icudataver.h
|
|
|
|
include/unicode/icuplug.h
|
|
|
|
include/unicode/idna.h
|
|
|
|
include/unicode/localpointer.h
|
|
|
|
include/unicode/locdspnm.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/locid.h
|
2004-06-26 22:18:50 +02:00
|
|
|
include/unicode/measfmt.h
|
|
|
|
include/unicode/measunit.h
|
|
|
|
include/unicode/measure.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/msgfmt.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/normalizer2.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/normlzr.h
|
|
|
|
include/unicode/numfmt.h
|
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/numsys.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/parseerr.h
|
|
|
|
include/unicode/parsepos.h
|
|
|
|
include/unicode/platform.h
|
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/plurfmt.h
|
|
|
|
include/unicode/plurrule.h
|
2006-01-03 01:04:42 +01:00
|
|
|
include/unicode/ppalmos.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/ptypes.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/putil.h
|
|
|
|
include/unicode/pwin32.h
|
|
|
|
include/unicode/rbbi.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/rbnf.h
|
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/rbtz.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/regex.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/rep.h
|
|
|
|
include/unicode/resbund.h
|
|
|
|
include/unicode/schriter.h
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060, section 5.1.3 (Mailbox International
Naming Convention), and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions for efficient checking if a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor does not preserve readonly aliasing any more
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines the UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/search.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/selfmt.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/simpletz.h
|
|
|
|
include/unicode/smpdtfmt.h
|
|
|
|
include/unicode/sortkey.h
|
update to 4.2.1
major changes:
Locale Data: ICU uses and supports data from Common Locale Data Repository
(CLDR) 1.7, which includes data for 146 languages, 159 territories, and
468 locales, 21% more locale data than the previous release.
Number system support and the number keyword.
Number system override in DateFormat
Numerics used by Hebrew Calendar date in Hebrew locale
BCP47 (language tag) / Locale transformation
BCP47 mapping of LDML keywords
Encoding selector: Return a list of charsets that can handle the input text
Simple duration: Implementation of CLDR duration format
Available/Preferred keywords for a locale (Calendar, Collation, and Currency)
StringPrep standard profiles: RFC3491 NAMEPREP, RFC3530 NFS4, RFC3722 iSCSI,
RFC3920 NodePrep/ResourcePrep, RFC4011 MIB, RFC4013 SASLprep, RFC4505 trace
and RFC4518 LDAPprep
Miscellaneous Arabic shaping enhancements
UTF-8 friendly internal data structure for Unicode data lookup
API to get CLDR version used by ICU
ISCII charset converter updates (added Gurumukhi, other updates)
Performance improvements in Time Zone Name format/parse, and in
DateIntervalFormat construction
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/std_string.h
|
Update to version 2.4.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/strenum.h
|
update to 4.2.1
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/stringpiece.h
|
Update to version 2.4.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/stsearch.h
|
2004-04-04 18:58:16 +02:00
|
|
|
include/unicode/symtable.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/tblcoll.h
|
|
|
|
include/unicode/timezone.h
|
update to 4.2.1
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/tmunit.h
|
|
|
|
include/unicode/tmutamt.h
|
|
|
|
include/unicode/tmutfmt.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/translit.h
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of a string search implementation using the
Boyer-Moore algorithm (#6286). For details, see the accompanying
tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157) Note: this feature supports
only the Gregorian calendar in this release.
o Improved Plural support
* ICU4C Specific Changes
Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771, which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected.
The updated versions of the previous tests may also be run by
applying the following patch to ICU 4.0: r24324. In addition,
ICU4C and ICU4J 4.0 resolve the issue underlying
CVE-2008-1036.
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/tzrule.h
|
|
|
|
include/unicode/tztrans.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ubidi.h
|
|
|
|
include/unicode/ubrk.h
|
|
|
|
include/unicode/ucal.h
|
2006-01-03 01:04:42 +01:00
|
|
|
include/unicode/ucasemap.h
|
2003-06-23 09:49:39 +02:00
|
|
|
include/unicode/ucat.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/uchar.h
|
|
|
|
include/unicode/uchriter.h
|
Update to version 2.4.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uclean.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ucnv.h
|
|
|
|
include/unicode/ucnv_cb.h
|
|
|
|
include/unicode/ucnv_err.h
|
update to 4.2.1
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/ucnvsel.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ucol.h
|
Update to version 2.4.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/ucoleitr.h
|
|
|
|
include/unicode/uconfig.h
|
2007-03-23 13:51:13 +01:00
|
|
|
include/unicode/ucsdet.h
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060, section 5.1.3 (Mailbox International
Naming Convention), and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions for efficiently checking whether a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor does not preserve readonly aliasing any more
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines the UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
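The IMAP mailbox-name encoding mentioned in the "Additional converter" item is the modified UTF-7 of RFC 2060, section 5.1.3. The following is a minimal Python sketch of that encoding, written from the RFC rules and not taken from ICU's converter. The function name encode_mailbox_name is hypothetical and is not an ICU API.

```python
import base64


def encode_mailbox_name(name: str) -> str:
    """Encode a mailbox name in modified UTF-7 (RFC 2060, section 5.1.3).

    Illustrative sketch only; this is not ICU's "IMAP-mailbox-name" converter.
    """
    out = []
    pending = []  # characters outside printable ASCII, waiting to be shifted

    def flush():
        # Emit the pending run as "&" + modified base64 of UTF-16BE + "-".
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no "=" padding, and "," is used instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is written as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)  # printable ASCII represents itself
        else:
            pending.append(ch)
    flush()
    return "".join(out)


# Example mailbox name taken from RFC 2060, section 5.1.3.
print(encode_mailbox_name("~peter/mail/日本語/台北"))
```

Printable ASCII passes through unchanged, so only names with non-ASCII characters or "&" differ from their input.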
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/ucurr.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/udat.h
|
|
|
|
include/unicode/udata.h
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of a string search implementation using the
Boyer-Moore algorithm (#6286). For detailed information, see
the upstream tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from the Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157) Note: Calendar type
supported by this feature is Gregorian only in this
release.
o Improved Plural support
* ICU4C Specific Changes
o Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771, which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected;
in addition, the updated versions of the previous tests may be
run by applying the following patch to ICU 4.0: r24324.
ICU4C and ICU4J 4.0 also resolve the issue underlying
CVE-2008-1036.
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/udatpg.h
|
update to icu-3.0
major changes:
ICU 3.0 includes the latest bug fixes, locale/charset updates, and
performance/build/porting enhancements.
- Collation
Collation data is in a separate data tree, allowing for easier
modularization and maintenance.
getFunctionalEquivalent API allows for better caching and UI support.
- Unicode 4.0.1
ICU is updated to the latest version of Unicode standard, which had
significant property changes.
- CLDR 1.1
Updates to CLDR 1.1, with many updates to locale data, and special
emphasis on collation data.
- Formatting
As an aid to migration of traditional C (stdio) and C++ (iostream)
formatting, the POSIX-like input/output library, icuio, is officially
supported.
Significant digits are now supported in DecimalFormat, for general use and
%g support.
- RFC822 time zone format support in DateFormat for compatibility.
- Currency formatting/parsing improvements
Allows parsing multiple currencies with one formatter, without knowing the
currency in advance. Much cleaner design allowing extensibility to other
measurement units in the future.
- Regular expressions (C)
The regular expressions framework now features a C API in addition to the C++ API.
- Locales
Locale canonicalization spec defined and implemented. Provides
interoperability with POSIX and .NET locale IDs, more RFC 3066 support.
- Layout engine
Layout engine now supports using different canonically-equivalent Unicode
forms of the same text: e.g. a + ´ or á. This is especially important for
non-Latin scripts.
- Build Environment
ICU can now build its data library much faster on most platforms.
For a complete list see:
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?tag=release-3-0
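The "canonically-equivalent Unicode forms" in the Layout engine item can be illustrated with Python's standard unicodedata module. This is only a sketch of the concept, not of the ICU layout engine. It assumes the combining mark is U+0301 COMBINING ACUTE ACCENT, and canonically_equivalent is a hypothetical helper name.

```python
import unicodedata

# Two spellings of the same text: base letter plus combining mark,
# and the single precomposed character.
decomposed = "a\u0301"   # 'a' + U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE


def canonically_equivalent(s: str, t: str) -> bool:
    """Return True if s and t have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


print(decomposed == precomposed)                        # code point sequences differ
print(canonically_equivalent(decomposed, precomposed))  # canonical forms agree
```

Comparing after normalization is what lets the two spellings be treated as the same text.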
2004-06-26 22:18:50 +02:00
|
|
|
include/unicode/udeprctd.h
|
|
|
|
include/unicode/udraft.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uenum.h
|
2003-06-23 09:49:39 +02:00
|
|
|
include/unicode/uidna.h
|
2007-03-23 13:51:13 +01:00
|
|
|
include/unicode/uintrnal.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uiter.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/uldnames.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/uloc.h
|
2004-04-04 18:58:16 +02:00
|
|
|
include/unicode/ulocdata.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/umachine.h
|
|
|
|
include/unicode/umisc.h
|
|
|
|
include/unicode/umsg.h
|
|
|
|
include/unicode/unifilt.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/unifunct.h
|
|
|
|
include/unicode/unimatch.h
|
|
|
|
include/unicode/unirepl.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/uniset.h
|
|
|
|
include/unicode/unistr.h
|
2000-12-21 19:14:18 +01:00
|
|
|
include/unicode/unorm.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/unorm2.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/unum.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uobject.h
|
2004-06-26 22:18:50 +02:00
|
|
|
include/unicode/uobslete.h
|
|
|
|
include/unicode/uregex.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/urename.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/urep.h
|
|
|
|
include/unicode/ures.h
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060, section 5.1.3 (Mailbox International Naming
Convention), and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions for efficient checking if a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor does not preserve readonly aliasing any more
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines the UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uscript.h
|
|
|
|
include/unicode/usearch.h
|
|
|
|
include/unicode/uset.h
|
|
|
|
include/unicode/usetiter.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ushape.h
|
update to 4.2.1
major changes:
Locale Data: ICU uses and supports data from Common Locale Data Repository
(CLDR) 1.7, which includes data for 146 languages, 159 territories, and
468 locales, 21% more locale data than the previous release.
Number system support and the number keyword.
Number system override in DateFormat
Numerics used by Hebrew Calendar date in Hebrew locale
BCP47 (language tag) / Locale transformation
BCP47 mapping of LDML keywords
Encoding selector: Return a list of charsets that can handle the input text
Simple duration: Implementation of CLDR duration format
Available/Preferred keywords for a locale (Calendar, Collation, and Currency)
StringPrep standard profiles: RFC3491 NAMEPREP, RFC3530 NFS4, RFC3722 iSCSI,
RFC3920 NodePrep/ResourcePrep, RFC4011 MIB, RFC4013 SASLprep, RFC4505 trace
and RFC4518 LDAPprep
Miscellaneous Arabic shaping enhancements
UTF-8 friendly internal data structure for Unicode data lookup
API to get CLDR version used by ICU
ISCII charset converter updates (added Gurumukhi, other updates)
Performance improvements in Time Zone Name format/parse, and in
DateIntervalFormat construction
2009-08-05 19:01:17 +02:00
|
|
|
include/unicode/uspoof.h
|
2004-04-04 18:58:16 +02:00
|
|
|
include/unicode/usprep.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ustdio.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/ustream.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/ustring.h
|
2007-03-23 13:51:13 +01:00
|
|
|
include/unicode/usystem.h
|
2006-01-03 01:04:42 +01:00
|
|
|
include/unicode/utext.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/utf.h
|
|
|
|
include/unicode/utf16.h
|
|
|
|
include/unicode/utf32.h
|
|
|
|
include/unicode/utf8.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/utf_old.h
|
ICU 3.2 includes the latest bug fixes, locale/charset updates, and
performance/build/porting enhancements. The following list summarizes
the main new features in this release.
CLDR 1.2.
This is the main new feature in the release. ICU locale data is now completely
built from the CLDR 1.2 data, which contains data for 232 locales, covering 72
languages and 108 territories. Many translated names for languages,
territories, and scripts have been added, as well as for time zones,
calendars, and other named items such as collation. For more information,
see http://www.unicode.org/press/pr-cldr1.2.html.
Miscellaneous
Universal Timescale conversions. ICU now provides mechanisms for quickly and
reliably converting between the different binary representations of date/time
used on different platforms.
Accept-Language. ICU provides a mechanism for matching Accept-Language against
a list of locales.
DateFormat and Calendar Performance. Object construction performance has been
significantly improved.
Footprint. The size of executables that statically link to ICU has been
reduced.
Stdin. The icuio library can now read from stdin.
UnicodeSet C API. More uset_* C APIs were added.
i5/OS (os/400). Building ICU has been simplified to allow more configure
options to work.
POSIX. Default codepage determination has been fixed.
2005-03-27 12:27:20 +02:00
|
|
|
include/unicode/utmscale.h
|
2004-04-04 18:58:16 +02:00
|
|
|
include/unicode/utrace.h
|
2000-12-20 19:27:59 +01:00
|
|
|
include/unicode/utrans.h
|
|
|
|
include/unicode/utypes.h
|
2010-12-07 18:58:13 +01:00
|
|
|
include/unicode/uvernum.h
|
2003-03-22 00:44:05 +01:00
|
|
|
include/unicode/uversion.h
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of string search implementation using
Boyer-Moore algorithm (#6286). For detailed information, please
see the upstream tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157) Note: Calendar type
supported by this feature is Gregorian only in this
release.
o Improved Plural support
* ICU4C Specific Changes
Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771 which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected;
however, the updated versions of the previous tests may be
run by applying the following patch to ICU 4.0: r24324. ICU4C
and ICU4J 4.0 also resolve the issue underlying
CVE-2008-1036.
2009-03-25 23:30:19 +01:00
|
|
|
include/unicode/vtzone.h
|
2003-03-22 00:44:05 +01:00
|
|
|
lib/icu/${PKGVERSION}/Makefile.inc
|
2009-08-05 19:01:17 +02:00
|
|
|
lib/icu/${PKGVERSION}/pkgdata.inc
|
2000-12-21 19:14:18 +01:00
|
|
|
lib/icu/Makefile.inc
|
|
|
|
lib/icu/current
|
2009-08-05 19:01:17 +02:00
|
|
|
lib/icu/pkgdata.inc
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicudata${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicudata${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicudata${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicudata.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicui18n${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicui18n${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicui18n${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicui18n.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicuio${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicuio${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicuio${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicuio.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicule${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicule${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicule${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicule.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libiculx${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libiculx${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libiculx${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libiculx.a
|
|
|
|
lib/libicutest${SO_EXT}${SO_SUFFIX}
|
|
|
|
lib/libicutest${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicutest${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicutest.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicutu${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicutu${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicutu${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicutu.a
|
2009-07-25 15:02:05 +02:00
|
|
|
lib/libicuuc${SO_EXT}${SO_SUFFIX}
|
2010-12-07 18:58:13 +01:00
|
|
|
lib/libicuuc${SO_EXT}.46${SO_SUFFIX}
|
|
|
|
lib/libicuuc${SO_EXT}.46.0${SO_SUFFIX}
|
|
|
|
lib/libicuuc.a
|
|
|
|
lib/pkgconfig/icu-i18n.pc
|
|
|
|
lib/pkgconfig/icu-io.pc
|
|
|
|
lib/pkgconfig/icu-le.pc
|
|
|
|
lib/pkgconfig/icu-lx.pc
|
|
|
|
lib/pkgconfig/icu-uc.pc
|
update to icu-3.0
major changes:
ICU 3.0 includes the latest bug fixes, locale/charset updates, and
performance/build/porting enhancements.
- Collation
Collation data is in a separate data tree, allowing for easier
modularization and maintenance.
getFunctionalEquivalent API allows for better caching and UI support.
- Unicode 4.0.1
ICU is updated to the latest version of the Unicode Standard, which had
significant property changes.
- CLDR 1.1
Updates to CLDR 1.1, with many updates to locale data, and special
emphasis on collation data.
- Formatting
As an aid to migration of traditional C (stdio) and C++ (iostream)
formatting, the POSIX-like input/output library, icuio, is officially
supported.
Significant digits are now supported in DecimalFormat, for general use and
%g support.
- RFC822 time zone format support in DateFormat for compatibility.
- Currency formatting/parsing improvements
Allows parsing multiple currencies with one formatter, without knowing the
currency in advance. Much cleaner design allowing extensibility to other
measurement units in the future.
- Regular expressions (C)
The regular expressions framework now features a C API, instead of just C++.
- Locales
Locale canonicalization spec defined and implemented. Provides
interoperability with POSIX and .NET locale IDs, more RFC 3066 support.
- Layout engine
Layout engine now supports using different canonically-equivalent Unicode
forms of the same text: e.g. a + ´ or á. This is especially important for
non-Latin scripts.
- Build Environment
ICU can now build its data library much faster on most platforms.
For a complete list see:
http://oss.software.ibm.com/cvs/icu/~checkout~/icu/readme.html?tag=release-3-0
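The layout engine note above concerns canonically equivalent forms of the same text. As a rough illustration only (this uses Python's standard unicodedata module, not the ICU layout engine, and it assumes the accent meant is the combining form U+0301), the two spellings of "á" can be compared after normalization. The helper name canonically_equivalent is hypothetical.

```python
import unicodedata

# Two spellings of the same user-perceived character.
decomposed = "a\u0301"   # 'a' followed by U+0301 COMBINING ACUTE ACCENT (assumed form)
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE


def canonically_equivalent(left, right):
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", left) == unicodedata.normalize("NFD", right)


if __name__ == "__main__":
    # The raw strings differ in length and content ...
    print(len(decomposed), len(precomposed), decomposed == precomposed)
    # ... but they compare equal once both are normalized.
    print(canonically_equivalent(decomposed, precomposed))
```

Software that handles only one of the two forms would treat these strings differently, which is the situation the release note describes.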
2004-06-26 22:18:50 +02:00
|
|
|
man/man1/derb.1
|
2007-03-23 13:51:13 +01:00
|
|
|
man/man1/genbrk.1
|
Update to version 2.4.
Based on a PR pkg/20825 by Hiramatsu Yoshifumi, modified by me.
- follow PKG_SYSCONFDIR
List of major changes for this release:
* Regular Expressions Phase 1
ICU 2.4 introduces a Regular Expression C++ API that is modeled after
the JDK 1.4 API. ICU 2.4's Regular Expression API supports Unicode
level 1 regular expressions (see Unicode Regular Expression
Guidelines) but not all pattern metacharacters and features are
supported yet. Regular expressions leverage all of the UnicodeSet
support, including all Unicode 3.2 property names and property value
names. Future ICU releases will complete the pattern support, add
support for higher Unicode regex levels, and improve performance. For
more details see the API References and the User Guide.
* Modularized ICU library building
ICU 2.4 provides build-time switches to prune parts of the library
code, for smaller custom distributions. For details see the readme
file.
* Character set alias management support
Additional APIs map alias+standard to a unique charset name (e.g.,
"Shift-JIS"+"IANA"->"ibm-943_P14A-2000") and enumerate all charset
names in the alias table, not just the installed ones. See
convrtrs.txt and ucnv.h.
These APIs allow programmers to avoid data corruption problems when
different platforms use the same names for different character
conversion mappings.
* EBCDIC-z/OS converter option
The EBCDIC converter now handles swapped LF/NL mappings
algorithmically instead of with modified .ucm/.cnv conversion table
files. This makes this behavior available for all supported EBCDIC
conversions without adding to the data package size. See "swaplfnl" in
convrtrs.txt.
* Additional converter
A new converter implementation has been added for the encoding of IMAP
mailbox names. See RFC 2060, section 5.1.3 (Mailbox International
Naming Convention), and "IMAP-mailbox-name" in convrtrs.txt.
* Customizable break iteration
ICU 2.4 allows registration of a BreakIterator with a locale ID. This
allows applications to provide more sophisticated word/sentence break
engines and use them seamlessly with the ICU APIs. In future releases,
this registration mechanism will be extended to all relevant ICU
services. If you are interested in ICU customization, please try out
this feature.
* Collation performance
ICU 2.4 collation was improved in several areas, with an emphasis on
performance:
* Latin-1: Improved performance of u_strcoll().
* Russian/Cyrillic: Improved performance by tailoring collation for
Cyrillic-script languages, removing UCA contractions that are not
used for modern Russian (this uses the [suppressContractions]
tailoring option).
* Korean: Improved performance by resolving collation elements for
modern Hangul syllables at build time (this uses the [optimize]
tailoring option).
* Japanese: The default strength for Japanese was reduced from
quaternary to tertiary as in all other locales.
* UnicodeSet performance
UnicodeSet performance is significantly improved, especially for
add(codePoint) and contains(codePoint).
* Unicode property aliases
ICU 2.4 introduces APIs for mapping between all appropriate Unicode
property aliases and property value aliases and ICU property
enumeration constants. See u_getPropertyName() etc. in uchar.h.
* Unicode string functions
* There are new C functions for searching for last occurrences of
characters and partial strings. See u_strrstr(), u_strrchr32()
etc.
* New C/C++/Java functions for efficiently checking whether a string
contains more than a certain number of code points. See
hasMoreChar32Than().
* Copying UnicodeStrings via the standard assignment operator and
copy constructor no longer preserves read-only aliasing,
because this can sometimes have unexpected and dangerous effects.
A new fastCopyFrom() member function provides the old copy
semantics. See Jitterbug 1794 for more details.
* UTF macros simplified
The low-level C macros for handling code points in 8-bit and 16-bit
Unicode strings have been replaced by a simpler, more consistent set
with more concise names. For details see utf_old.h and utf.h.
Similarly, ICU 2.4 defines UChar32 consistently (now always as
int32_t) and adds a U_SENTINEL non-code point value for new APIs.
* Performance tests
ICU 2.4 has a new performance test framework and additional
performance tests using this framework. This is not currently
documented, but it is available as part of the source distribution at
source/test/perf/.
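The hasMoreChar32Than() entry above describes checking whether a string holds more than a given number of code points without counting all of them. A minimal sketch of that idea over UTF-16 code units follows; it is written in Python for illustration and is not ICU's implementation. The names has_more_char32_than and utf16_units are hypothetical.

```python
def utf16_units(text):
    """Return the UTF-16 code units of a Python string as a list of ints."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def has_more_char32_than(units, number):
    """Return True if the UTF-16 sequence holds more than `number` code points."""
    if number < 0:
        return True
    n = len(units)
    # A code point uses at most two units, so a long enough sequence
    # passes without any scanning.
    if (n + 1) // 2 > number:
        return True
    # A code point uses at least one unit, so a short sequence fails early.
    if n <= number:
        return False
    count = 0
    i = 0
    while i < n:
        unit = units[i]
        i += 1
        # A lead surrogate followed by a trail surrogate is one code point.
        if 0xD800 <= unit <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            i += 1
        count += 1
        if count > number:
            return True  # stop as soon as the answer is known
    return False


if __name__ == "__main__":
    sample = utf16_units("a\U0001F600b")  # 4 code units, 3 code points
    print(has_more_char32_than(sample, 2))
    print(has_more_char32_than(sample, 3))
```

The two length checks decide most cases in constant time; the scan runs only when the length alone cannot settle the answer.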
2003-03-22 00:44:05 +01:00
|
|
|
man/man1/gencnval.1
|
2007-03-23 13:51:13 +01:00
|
|
|
man/man1/genctd.1
|
2003-03-22 00:44:05 +01:00
|
|
|
man/man1/genrb.1
|
|
|
|
man/man1/icu-config.1
|
|
|
|
man/man1/makeconv.1
|
|
|
|
man/man1/pkgdata.1
|
|
|
|
man/man1/uconv.1
|
|
|
|
man/man8/genccode.8
|
|
|
|
man/man8/gencmn.8
|
2004-04-04 18:58:16 +02:00
|
|
|
man/man8/gensprep.8
|
2007-03-23 13:51:13 +01:00
|
|
|
man/man8/icupkg.8
|
2000-12-20 19:27:59 +01:00
|
|
|
sbin/genccode
|
|
|
|
sbin/gencmn
|
2010-12-07 18:58:13 +01:00
|
|
|
sbin/gennorm2
|
2004-04-04 18:58:16 +02:00
|
|
|
sbin/gensprep
|
2007-03-23 13:51:13 +01:00
|
|
|
sbin/icupkg
|
2004-04-06 18:36:00 +02:00
|
|
|
share/icu/${PKGVERSION}/config/${MH_NAME}
|
Update from version 3.6nb2 to 4.0.1.
Pkgsrc changes:
o New MASTER_SITE
o Adjust PLIST
o Remove no-longer-needed patches, since corresponding changes
have been adopted upstream
o BUILDLINK_ABI_DEPENDS bumped to >=4.0, since a new shared library
version is installed
o Fixes security vulnerability, ref. below.
Dependent pkgsrc packages will have their revisions bumped shortly
due to the (possibly/probably) changed ABI.
Upstream changes:
4.0.1:
ICU4C 4.0.1 is a maintenance release of ICU4C 4.0. The primary
changes of this release were:
* Updated time zone data to 2008i
* Technical preview of a string search implementation using the
Boyer-Moore algorithm (#6286). For details, see the upstream
tech note.
* #5691 Conversion: consistent illegal sequences
* #6435 Bad @stable ICU4.0 tags
* #6597 TestDisplayNamesMeta failure
* #6670 Test failure in format/TimeZoneTest/TestShortZoneIDs
4.0:
Major changes in ICU 4.0 include the following:
* Common Changes
o Unicode 5.1 (#5696)
o Locale Data: ICU uses and supports data from the Common
Locale Data Repository (CLDR) 1.6, which includes many
improvements in quality and quantity of data.
o add/removeLikelySubtags (#6124)
o Charset converter file size improvement (#5987)
o Date Interval Formatting (#6157). Note: in this release,
this feature supports only the Gregorian calendar.
o Improved Plural support
* ICU4C Specific Changes
Additional Calendars
+ Chinese (#4081)
+ Coptic/Ethiopic (#4571)
* ICU4J Specific Changes
o Charset
+ Graduated from Technology Preview status
+ ICU2022 Converter (#5791)
+ HZ Converter (#6128)
+ SCSU/BOCU-1 Converter (#2147)
+ Charset Converter Callback (#6144)
o Thai Dictionary break iterator (#5385)
o JDK TimeZone support (#5975)
o Locale Service Provider (#5976)
o More convenient formatting of year+month, day+month,
and other combinations (#6304)
o Simple Duration Formatting (#6303)
* ICU4C Security Fixes
ICU4C 4.0 resolves the vulnerabilities CVE-2007-4770 and
CVE-2007-4771, which were found in earlier versions of ICU.
The standard ICU tests verify that these have been corrected;
however, the updated versions of the previous tests may be
run by applying the following patch to ICU 4.0: r24324. In
addition, ICU4C and ICU4J 4.0 resolve the issue underlying
CVE-2008-1036.
2009-03-25 23:30:19 +01:00
|
|
|
share/icu/${PKGVERSION}/install-sh
|
2003-12-03 17:52:48 +01:00
|
|
|
share/icu/${PKGVERSION}/license.html
|
2003-03-22 00:44:05 +01:00
|
|
|
share/icu/${PKGVERSION}/mkinstalldirs
|