The package was created by Hiramatsu Yoshifumi, with some very small changes
by me:
- different wording in COMMENT
- BUILD_DEPENDS instead of DEPENDS. The required version of
p5-ExtUtils-CBuilder is not specified in Build.PL, so I left it that way.
- Added USE_LANGUAGES.
=====================================
This Perl module is an Encode::Encoding subclass that uses
Encode::Detect::Detector to determine the charset of the input data
and then decodes it using the encoder of the detected charset.
It is similar to Encode::Guess, but does not require the configuration
of a set of expected encodings. Like Encode::Guess, it only supports
decoding--it cannot encode.
Major changes in ICU 3.6 include the following:
- Unicode: ICU uses and supports Unicode 5.0, which is the latest major release of Unicode. Unicode 5.0 will be used in many operating systems and applications, and this version of ICU is important for maintaining interoperability with these new operating systems and applications. More information about Unicode 5.0 can be found in the Unicode press release.
- Locale Data: ICU uses and supports data from Common Locale Data Repository (CLDR) 1.4, which includes many improvements in quality and quantity of data. There is 25% more CLDR locale data in 245 locales in ICU.
- ICU4C Specific Changes
- Charset Detection: A charset detection framework was added, which provides heuristics for detecting the charset for unlabeled sequences of bytes.
- Layout: The font layout engine has support added for Tibetan, Sinhala and Old Hangul.
- BiDi: The BiDi algorithm was enhanced to be more flexible and efficient.
- ICU Data Management: The new icupkg tool provides an easier way to manage ICU's data library. This tool allows you to add, update or remove data from ICU's data archive.
- Time Zones: The time zone data is modularized to allow easier building and updating of the data.
- Word Boundaries: The Thai word break iteration was improved to be more accurate. Also dictionary based detection of Thai word boundaries is now active for all locales.
- UText
  - The BreakIterator uses UText for abstract text processing.
  - 64-bit indexing is now used to allow access to larger chunks of text.
  - API for read-only locking for security and robustness was added.
- Performance
  - The u_sprintf/u_sscanf performance from the icuio library has been improved for number formatting/parsing.
  - Constructing a DateFormat is significantly faster for many locales.
  - Opening and closing a charset converter is significantly faster.
  - The UTF-8 transformation functions and macros are faster.
  - The UText API was improved for performance.
  - The collation open and close functions have a small performance improvement.
Overview of Changes in Namazu 2.0.17 - Mar 12, 2007
* filter/win32/ole*.pl: Support for Office 2007. [for Windows]
* filter/win32/olevisio.pl: Support for another type of Visio 2000 file.
Support for Visio 2007/.vdx files. [for Windows]
* OOo bug fix.
* Support for Office Open XML files. [for Windows]
* nmzcat: SJIS output. [For Windows]
* mailutime: Bug correction related to passing.
* To the code in which it considers after 2038(In the direction that doesn't
correspond).
* File-MMagic: Imported 1.27.
* Support for eml files.
* libnmz: Speed-up of retrieval.
* nmzchkw.pl: New addition. (contrib)
* libnmz: Fixed a memory-related bug. (users-ja#821).
* namazu and namazu.cgi: Fixed a bug that caused an infinite loop.
* namazu and namazu.cgi: Corrected the HTML emphasis tag. (for Windows)
* gcnmz and nmzmerge: Corrected the log output and its format.
* namazu and namazu.cgi: Fixed a possible buffer overflow in
template handling.
* filter/mp3.pl: MP3-Info 1.21.
* namazu.spec.in: add nmzcat, nmzegrep.
* namazu.spec.in: fix filter-requires-namazu.sh.
* conf/namazurc-sample: Added a note to the comment that Suicide_Time
is UNIX-only.
* scripts/mknmz.in: Corrected the number of dummy arguments of
process_file().
* filter/pdf.pl: 'Unable to convert pdf file (maybe copying
protection)' was corrected at option --debug.
* filter/msofficexml.pl: Added a new filter.
* filter/visio.pl: Added a new filter.
* filter/mp3.pl: Support MP3-Info 1.21's behavior.
* tests/*: Worked around make check failures in the
Mac + gettext 0.14.2 environment.
* tests/data/ja/*: Added new file.
* Fix some bugs.
Packaged by Aleksey Cheusov and requested in PR 35469.
This distribution contains the Net::Dict module for Perl.
Net::Dict is a class implementing a simple client API
for the DICT protocol defined in RFC2229.
Sed 4.1.5
* fix parsing of a negative character class not including a closed bracket,
like [^]] or [^]a-z].
* fix parsing of [ inside a y command, like y/[/A/.
* output the result of commands a, r, R when a q command is found.
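
The bracket-expression rule behind the first fix can be illustrated outside sed. The sketch below uses Python's re module, which is an assumption made for illustration only: it is not sed's regex engine, but it follows the same convention that a "]" placed first in a (negated) character class is a literal "]".

```python
import re

# "]" directly after "[^" is taken literally, so these classes mean
# "any character except ]" and "any character except ] or a-z".
not_close_bracket = re.compile(r"[^]]")
not_bracket_or_lower = re.compile(r"[^]a-z]")

replaced = not_close_bracket.sub("x", "a]b")
found = not_bracket_or_lower.findall("a]B1")

print(replaced)
print(found)
```

Here only "a" and "b" are replaced in the first call, and only "B" and "1" are reported by the second.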
----------------------------------------------------------------------------
Sed 4.1.4
* \B correctly means "not on a word boundary" rather than "inside a word"
* bugfixes for platforms without internationalization
* more thorough testing framework for tarballs (`make full-distcheck')
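
The difference between "not on a word boundary" and "inside a word" is easy to miss. The following sketch shows it with Python's re module rather than sed itself (an assumption for illustration; Python's \B has the "not on a word boundary" meaning). The lookaround pattern is a hypothetical stand-in for the older "inside a word" reading.

```python
import re

text = "ab  cd"  # two words separated by two spaces

# Positions that are not on a word boundary.
not_on_boundary = [m.start() for m in re.finditer(r"\B", text)]

# Positions strictly inside a word: a word character on both sides.
inside_word = [m.start() for m in re.finditer(r"(?<=\w)(?=\w)", text)]

print(not_on_boundary)
print(inside_word)
```

Position 3, between the two spaces, is not on a word boundary but is not inside a word either, so it appears only in the first list.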
----------------------------------------------------------------------------
Sed 4.1.3
* regex addresses do not use leftmost-longest matching. In other words,
/.\+/ only looks for a single character, and does not try to find as
many of them as possible like it used to do.
* added a note to BUGS and the manual about changed interpretation
of `s|abc\|def||', and about localization issues.
* fixed --disable-nls build problems on Solaris.
* fixed `make check' in non-English locales.
* `make check' tests the regex library by default if the included regex
is used (regex tests had to be enabled separately up to now).
----------------------------------------------------------------------------
Sed 4.1.2
* fix bug in 'y' command in multi-byte character sets
* fix severe bug in parsing of ranges with an embedded open bracket
* fix off-by-one error when printing a "bad command" error
REG_STARTEND macro, it doesn't work as expected. A simple test case is:
printf '\0\n\0\n' | nbsed /a/d
This test does not yet work as expected, but at least it doesn't cause
segmentation faults anymore. Handling of '\0' bytes must be improved.
to the versions corresponding to the 2006-09 release. This should have
been done when the main hugs package was updated to this version back in
October of last year.
Changes:
2006-11-11 Mikio Hirabayashi
* estraier.c (est_set_ecode): new function.
* estraier.c (est_search_union): scoring of ASIS mode was modified.
* estraier.c (est_resmap_add, est_resmap_dump_keys): new functions.
* estraier.c (est_narrow_scores): efficiency of narrowing index was improved.
* estraier.c (est_utime): new function.
* estraier.c (est_cond_scores, est_cond_set_narrowing_scores): new functions.
* estraier.c (est_rescc_put): a memory leak bug was fixed.
* estnode.c: the function "est_datum_printf" was replaced by "cbdatumprintf".
* estmaster.c (sendnodecmdsearch): accuracy of hints was improved.
* estfraud.c (sendnodecmdputdoc): accuracy of hints was improved.
* estfraud.c (sendnodecmdputdoc): morphological analyzer support was added.
* estfraud.c (sendnodecmdputdoc): accuracy of hints was improved.
* estwaver.c (runinit, procinit): "-apn", "-acc", "-sv", "-si", "-sa" options were added.
* estscout.c: new file.
* estsupt.c: new file.
- Release: 1.4.10
Collection.
The Perl 5 module Text::RewriteRules uses a simplified syntax for
regexp-based rules for rewriting text. You define a set of rules,
and the system applies them until no more rules can be applied.
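
The idea of applying rules until none applies can be sketched as a fixed-point loop. This is a minimal Python illustration of the concept only; it does not use the module's rule syntax, and the function name, the rule order, and the max_steps guard are assumptions.

```python
import re

def rewrite(text, rules, max_steps=10_000):
    """Apply (pattern, replacement) rules until none changes the text."""
    for _ in range(max_steps):
        for pattern, replacement in rules:
            new_text, count = re.subn(pattern, replacement, text, count=1)
            if count and new_text != text:
                text = new_text
                break  # restart from the first rule after each rewrite
        else:
            return text  # no rule changed the text: fixed point reached
    raise RuntimeError("rules did not reach a fixed point")

# Hypothetical rules: collapse doubled "a", then turn a run of "b" into "c".
rules = [(r"aa", "a"), (r"b+", "c")]
result = rewrite("aaaabbb", rules)
print(result)
```

The step limit guards against rule sets that never terminate, such as a pair of rules that undo each other.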
1.62
- interface to libxml2's pull-parser XML::LibXML::Reader
(initiated by Heiko Klein)
- make error messages intended to the user report the line of the
application call rather than that of the internal XS call
- XML::LibXML::Attr->serializeContent added (convenience function)
- fix getAttributeNode etc. w.r.t. #FIXED attributes (as well as some
cases with old buggy libxml2 versions)
- warn if runtime libxml2 is older than the one used at compile time
- if compiled against libxml2 >= 2.6.27, new parse_html_* implementation is used
allowing encoding and other options to be passed to the parser
- DOM-compliant nodeNames: #comment, #text, #cdata, #document, #document-fragment
- toString on empty text node returns empty string, not undef
- cloneNode copies attributes on an element as required by the DOM spec
0.15 08 Feb 2007 Grant McLean
- Fixed handling of entities in attribute values
- Cleaned up some benign warnings
0.14 23 Apr 2006 Matt Sergeant
- Fixed CDATA section parsing (Uwe Voelker)
- Fix Makefile.PL for VMS
- Support calling set_handler() mid-parse
- Fix for when random modules overload UNIVERSAL::AUTOLOAD()
- Fix case when ParserDetails.ini isn't being updated but we are doing an
upgrade.
0.13 24 Oct 2005 Matt Sergeant
- Complete re-write of XML::SAX::PurePerl for performance
- Support Encoding & XMLVersion in DocumentLocator interface
- A few conformance tweaks to match perl SAX 2.1.
Enca is an Extremely Naive Charset Analyser. It detects character
set and encoding of text files and can also convert them to other
encodings. The charset detecting functionality is also available as
a library.
Enca can currently determine 8-bit charsets of Belarusian, Bulgarian,
Croatian, Czech, Estonian, Hungarian, Latvian, Lithuanian, Polish,
Russian, Slovak, Slovene, Ukrainian, and Chinese texts and also
some multibyte encodings, independent of language (provided it's
some European language).
XML-RPC is a quick-and-easy way to make procedure calls over the
Internet. It converts the procedure call into an XML document, sends
it to a remote server using HTTP, and gets back the response as
XML.
This library provides a modular implementation of XML-RPC for C
and C++.
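
To show what the XML form of a call looks like, here is a small sketch using Python's standard xmlrpc.client module instead of this C/C++ library (an assumption for illustration). The method name "sample.add" and its arguments are hypothetical, and no network request is made.

```python
import xmlrpc.client

# Serialize a call to a hypothetical remote method with two integer arguments.
request_xml = xmlrpc.client.dumps((2, 3), methodname="sample.add")
print(request_xml)

# Parse the document back, as a server would do on receipt.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

In a real exchange the document would be sent as the body of an HTTP POST, and the response would come back as a similar XML document.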
XXX: build system is custom-made (i.e. not using libtool), PLIST
will be wrong for many OPSYS -- please fix!
Version 2.5
* fixed lang.map for php files
* fixed url.lang
* --debug-langdef can be interactive
* nohilite.lang that does not perform any highlighting, but
only formats the input file into the output format (dealing
with output format special characters)
* default.lang to which source-highlight falls back when no
input language is specified or available
* infer script languages
* --header and --footer options do not require --doc option
* --statistics print elapsed time
* highlight cls, dtx and sty LaTeX files
* language definition for Tcl
* language definition for Sql
* language definition for bibtex
* infer language of script files
Version 2.4
* language definition for C# (thanks to Hemmi Shigeru)
* language definition for XML (thanks to Andy Buckley)
* language definition for shell scripts (thanks to Dirk Jagdmann)
* fixed language definition for HTML (tags with numbers are highlighted,
e.g., <h1>)
* updated language definition for logtalk (thanks to Paulo Moura)
* produces the list of elements of a language definition file
(--show-lang-elements)
* output format definition for HTML where fonts by default are
not fixed width.
* bug fix in url regular expressions
* bug fix with nonsensitive keywords (thanks to Andrea Ercolino)
* improved documentation concerning installation of Boost regex library
Version 2.3
* the regex automaton is printed on the standard output
(instead of the standard error)
* language definition for postscript
* DocBook output format
* fixed bug in .map files with \r characters
* fixed expression for email
Version 2.2
* fixed a bug that caused a segmentation fault when more than one
input file is provided
* fixed a compilation error with gcc 4.0
* generate references using ctags
* fixed a bug with LaTeX output of " with some inputencs
and with latexcolor
* handle direct color specifications independently from HTML
* fixed conversion of hexadecimal characters in output language
definitions
* fixed compilation error with regex 1.33
* include man page
* language definition for diff output
* fixed bugs in generation of the regular expression automaton
* extended documentation with some tutorials on input language
definitions
* generate more compact output (reduced size)
* in LaTeX output longtable is not used anymore
the NetBSD packages Collection.
This Perl 5 module provides a compromise between SAX and DOM
processing by allowing the DOM API to be used on only reasonably
small parts of an XML document.
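
The same compromise can be sketched with Python's standard xml.dom.pulldom module, which is not this Perl module's API. The document is read as a stream of events, and only the elements of interest are expanded into small DOM subtrees. The element names and the helper functions below are assumptions made for the example.

```python
import io
from xml.dom import pulldom

def iter_records(stream, tag):
    """Yield each <tag> element as a small, fully built DOM subtree."""
    events = pulldom.parse(stream)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            events.expandNode(node)  # build the DOM for this element only
            yield node

def text_of(element):
    """Concatenate the direct text children of a DOM element."""
    return "".join(n.data for n in element.childNodes
                   if n.nodeType == n.TEXT_NODE)

doc = (b"<log><entry id='1'><msg>first</msg></entry>"
       b"<entry id='2'><msg>second</msg></entry></log>")

records = []
for entry in iter_records(io.BytesIO(doc), "entry"):
    msg = entry.getElementsByTagName("msg")[0]
    records.append((entry.getAttribute("id"), text_of(msg)))

print(records)
```

Only one entry subtree needs to be held at a time, so memory use depends on the size of a record rather than on the size of the whole document.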
0.04 Wed Jan 11 10:33:35 UTC 2006
- Oops, fixed the typo in imgbase default URL
0.03 Wed Jan 11 10:17:37 UTC 2006
- Now it requires Text::Emoticon 0.03
- Use MSN site as a default imgbase now