Change log:
2014-06-02 Németh László <nemeth at numbertext dot org>:
* escape spaces in paths of ODF files
2014-05-28 Németh László <nemeth at numbertext dot org>:
* add long path/Unicode path support in WIN32 environment:
- hunspell#233 (reported by mahak gark) and LibreOffice fdo#48017
* flat ODF support, eg.:
hunspell doc.fodt
cat doc.fodt | hunspell -l -O
* new options:
- -X (XML) input format
- -O (ODF or flat ODF) input format
- --check-apostrophe: check and force Unicode apostrophe usage
(ASCII or Unicode apostrophe has to be in the
WORDCHARS section of the affix file)
* fix ODF support:
- break 1-line XML of ODT documents at </style:style>, too,
not only at </text:p> (limiting tokenization problems, when
fgets stops within an XML tag)
- show ODF file path on the UI instead of the temporary file
* fix XML support:
- ', ", &, < and > in replacements converted to XML entities
- recognize &apos; during tokenization, depending on WORDCHARS
- &apos; in tokens converted to ' before spell checking and
in the output of the pipe interface
* better apostrophe usage:
- WORDCHARS containing only one of the Unicode and ASCII apostrophes
results in extended word tokenization: both of them will be part of
the words (if they are word-internal, e.g. word's, but not words').
- convert Unicode apostrophes to ASCII ones for 8-bit dictionaries
(e.g. English dictionaries), or for UTF-8 dictionaries that support
only the ASCII apostrophe (e.g. French dictionaries).
* updated manual:
- hunspell.4 renamed to hunspell.5, see
hunspell#241 reported by Cristopher Yeleighton
- updated translations
- note about long/Unicode paths in WIN32 (hunspell.3)
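The two XML conversions listed under "fix XML support" in this entry can be illustrated with a minimal Python sketch. This is not Hunspell's own C++ code; the helper names are hypothetical and only the standard library is used.

```python
from xml.sax.saxutils import escape

# Hypothetical helpers for illustration; Hunspell's real code is C++ and differs.
# escape() already handles &, < and >; the two quote characters are added here.
_QUOTES = {"'": "&apos;", '"': "&quot;"}

def escape_replacement(text):
    """Convert ', ", &, < and > in a replacement to XML entities."""
    return escape(text, _QUOTES)

def token_for_checking(token):
    """Turn &apos; inside a token back into ' before spell checking."""
    return token.replace("&apos;", "'")
```

For example, token_for_checking("word&apos;s") gives the plain token word's, which is what the dictionary lookup expects.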
2014-04-25 Németh László <nemeth at numbertext dot org>:
* OpenDocument support, eg.
hunspell *.odt
hunspell -l *.odt
* always load default personal dictionary (fix
filtering bad words - reduce this word list - using
it as a personal dictionary workflow)
* fix parsing/URL recognition problem (bad tokens
with apostrophes)
2013-07-25 pchang9@cs.wisc.edu
* moz#897255 Wasted work in line_uniq
* moz#897780 Wasted work in SuggestMgr::twowords
2013-07-25 Caolán McNamara <caolanm at LibO>:
* hunspell#167 layout problems with long lines
- based on the original fix by xorho
adapted to HEAD
* rhbz#925562 upgrade config.guess for aarch64
2013-07-24 pchang9@cs.wisc.edu
* moz#896301 Wasted work in SfxEntry::checkword
* moz#896844 Wasted work in AffixMgr::defcpd_check
2013-06-13 Konstantin Khlebniko
* #49 HashMgr::add_word computes wrong size for struct hentry
2013-06-13 Ville Skyttä
* #53 Man page syntax fixes
2013-04-19 John Thomson <john thomson at SIL>
* win_api: add remove() of Hunspell API (hun#3606435)
2013-04-19 Rouslan Solomokhin <at sf.net>
* fix crash in suggestions for 99-character long words
by extending arrays of SuggestMgr::forgotchar_*
(hun#3595024, also http://crbug.com/130128),
thanks also to Paweł Hajdan for reporting the patch
2013-04-01 Caolán McNamara <caolanm at LibO>:
* hunspell: -Werror=undef
2013-03-13 Caolán McNamara <caolanm at LibO>:
* rhbz#918938 crash in interaction with danish thesaurus
2012-09-18 Németh László <nemeth at numbertext dot org>:
* src/hunspell/affixmgr.*: - fix morphological analysis of
compound words (hun#3544994, reported by Dávid Nemeskey, fdo#55045)
2012-06-29 Caolán McNamara <caolanm at LibO>:
* fix various coverity warnings
2012-01-10 Ehsan Akhgari <ehsan at mozilla dot com>
* moz#710940 Firefox Crash [@ AffixMgr::parse_file(char const*, char
const*) ]
2011-12-16 Jared Wein <jwein at mozilla dot com>
* moz#710967 Incorrect argument passed to strncmp in
AffixMgr::parse_convtable
2011-12-06 Caolán McNamara <caolanm at LibO>:
* rhbz#759647 fixed tempname of hunSPELL.bak collides with other users
when multiple edits in one dir
2011-10-13 Caolán McNamara <caolanm at LibO>:
* moz#694002 crash in hunspell affixmgr on exit with bad .aff
* leak in hunspell affixmgr with bad .aff
2011-09-19 Caolán McNamara <caolanm at LibO>:
* make libparsers.a not installed thanks to Tomáš Chvátal
2011-06-23 Caolán McNamara <caolanm at LibO>:
* fix some windows compiler warnings
2011-05-24 Németh László <nemeth at numbertext dot org>:
* src/hunspell/affixmgr.*: allow twofold suffixes in compounds
by extended version of Arno Teigseth's patch, see hun#3288562.
- new option for this feature: COMPOUNDMORESUFFIXES
2011-02-16 Németh László <nemeth at numbertext dot org>:
* src/*/Makefile.am: fix library versioning, the problem reported by
Rene Engerhald and Simon Brouwer.
* man/hunspell.4: new version based on the revised version of Ruud Baars
Do it for all packages that
* mention perl, or
* have a directory name starting with p5-*, or
* depend on a package starting with p5-
like last time, for 5.18, where this didn't lead to complaints.
Let me know if you have any this time.
a) refer to 'perl' in their Makefile, or
b) have a directory name of p5-*, or
c) have any dependency on any p5-* package
Like last time, where this caused no complaints.
This changes the buildlink3.mk files to use an include guard for the
recursive include. The use of BUILDLINK_DEPTH, BUILDLINK_DEPENDS,
BUILDLINK_PACKAGES and BUILDLINK_ORDER is handled by a single new
variable BUILDLINK_TREE. Each buildlink3.mk file adds a pair of
enter/exit marker, which can be used to reconstruct the tree and
to determine first level includes. Avoiding := for large variables
(BUILDLINK_ORDER) speeds up parse time as += has linear complexity.
The include guard reduces system time by avoiding reading files over and
over again. For complex packages this reduces both %user and %sys time to
half of the former time.
No longer needs ncurses (at least on NetBSD 5.0).
Official changelog:
2008-11-01: Hunspell 1.2.8 release:
- Default BREAK feature and better hyphenated word suggestion to accept
and fix (compound) words with hyphen characters by spell checker
instead of by word breaking code of OpenOffice.org. With this feature
instead of by work breaking code of OpenOffice.org. With this feature
it's possible to accept hyphenated compound words, such as "scot-free",
where "scot" is not a correct English word.
- ICONV & OCONV: input and output conversion tables for optional character
handling or using special inner format. Example:
# Accepting de facto replacements of the Romanian comma acuted letters
SET UTF-8
ICONV 4
ICONV ş ș
ICONV ţ ț
ICONV Ş Ș
ICONV Ţ Ț
Typical usage of ICONV/OCONV is to manage an inner format for a segmental
writing system, like the Ethiopic script of the Amharic language.
- Extended CHECKCOMPOUNDPATTERN to handle compound word alternations, like
the sandhi feature of Telugu and other writing systems.
- SIMPLIFIEDTRIPLE compound word feature: allow simplified Swedish and
Norwegian compound word forms, like tillåta (till|låta) and
bussjåfør (buss|sjåfør)
- wordforms: word generator script for dictionary developers (Hunspell
version of unmunch).
- bug fixes
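The ICONV idea from this release can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table is the Romanian cedilla to comma-below mapping that the ICONV example describes, the function name apply_iconv is hypothetical, and only single-character patterns are handled here, which is a simplification made for this sketch.

```python
# Input conversion table: cedilla letters -> comma-below letters.
# Written with escapes so the code points are unambiguous.
ICONV = {
    "\u015f": "\u0219",  # s with cedilla (small)   -> s with comma below
    "\u0163": "\u021b",  # t with cedilla (small)   -> t with comma below
    "\u015e": "\u0218",  # S with cedilla (capital) -> S with comma below
    "\u0162": "\u021a",  # T with cedilla (capital) -> T with comma below
}

def apply_iconv(text, table=ICONV):
    """Map each input character through the table before dictionary lookup.

    Hypothetical helper: handles single-character patterns only.
    """
    return text.translate(str.maketrans(table))
```

A word typed with the de facto cedilla letters is thus looked up in its comma-below form, so the dictionary itself needs only one spelling.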
2008-08-15: Hunspell 1.2.7 release:
- FULLSTRIP: new option for affix handling. With FULLSTRIP, affix rules can
strip a full word, not only up to one character less than the word.
- COMPOUNDRULE works with all flag types. (COMPOUNDRULE is for pattern
matching. For example, en_US dictionary of OpenOffice.org uses COMPOUNDRULE
for ordinal number recognition: 1st, 2nd, 11th, 12th, 22nd, 112th, 1000122nd
etc.).
- optimized suggestions:
- modified 1-character distance suggestion algorithms: search a TRY character
in all positions instead of all TRY characters in a character position
(it can give more readable suggestion order, also better suggestions
in the first positions, when TRY characters are sorted by frequency.)
For example, suggestions for "moze":
ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
- extended compound word checking for better COMPOUNDRULE related
suggestions, for example English ordinal numbers: 121323th -> 121323rd
(it needs also a th->rd REP definition).
- bug fixes
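The change in loop order for 1-character substitutions can be sketched as follows. This is a toy Python model, not Hunspell's SuggestMgr code: the function names, the word list and the TRY string are made up for illustration, and only substitution is modelled.

```python
def suggest_by_position(word, try_chars, dictionary):
    """Older order: for each position, try every TRY character."""
    found = []
    for i in range(len(word)):
        for c in try_chars:
            cand = word[:i] + c + word[i + 1:]
            if cand != word and cand in dictionary and cand not in found:
                found.append(cand)
    return found

def suggest_by_character(word, try_chars, dictionary):
    """Newer order: for each TRY character, try every position."""
    found = []
    for c in try_chars:
        for i in range(len(word)):
            cand = word[:i] + c + word[i + 1:]
            if cand != word and cand in dictionary and cand not in found:
                found.append(cand)
    return found

# Hypothetical data: a tiny word list and a frequency-sorted TRY string.
WORDS = {"ooze", "doze", "maze", "more", "mote", "mole"}
TRY = "artlod"
```

With this toy data, suggest_by_position("moze", TRY, WORDS) starts with the words that differ in the first letter, while suggest_by_character("moze", TRY, WORDS) starts with the words built from the most frequent TRY characters.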
2008-07-15: Hunspell 1.2.6 release:
- bug fix release (fix affix rule condition checking of sk_SK dictionary,
iconv support in stemming and morphological analysis of the Hunspell
utility, see also Changelog)
2008-07-09: Hunspell 1.2.5 release:
- bug fix release (fix affix rule condition checking of en_GB dictionary,
also morphological analysis by dictionaries with two-level suffixes)
2008-06-18: Hunspell 1.2.4-2 release:
- fix GCC compiler warnings
2008-06-17: Hunspell 1.2.4 release:
- add free_list() for C, C++ interfaces to deallocate suggestion lists
- bug fixes
2008-06-17: Hunspell 1.2.3 release:
- extended XML interface to use morphological functions by standard
spell checking interface, spell() and suggest(). See hunspell.3 manual page.
- default dash suggestions for compound words: newword-> new word and new-word
- new manual pages: hunspell.3, hzip.1, hunzip.1.
- bug fixes
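The default dash suggestions for compound words can be pictured with a small Python sketch. It assumes a plain set of known words and a hypothetical helper name; Hunspell's real suggestion code is C++ and applies more checks.

```python
def dash_suggestions(word, dictionary):
    """For each split into two known words, offer a spaced and a dashed form."""
    found = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in dictionary and right in dictionary:
            found.append(left + " " + right)
            found.append(left + "-" + right)
    return found
```

For the example in the entry above, the input newword with new and word in the word list yields "new word" and "new-word".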
pkgsrc change:
buildlink3.mk:
Bump API_DEPENDS, since shlib name changed. No dependencies in pkgsrc.
Release notes:
2008-04-12: Hunspell 1.2.2 release:
- extended dictionary (dic file) support to use multiple base and
special dictionaries.
- new and improved options of command line hunspell:
-m: morphological analysis or flag debug mode (without affix
rule data it shows the flags of the affix rules)
-s: stemming mode
-D: list available dictionaries and search path
-d: support extra dictionaries by comma separated list. Example:
hunspell -d en_US,en_med,de_DE,de_med,de_geo UNESCO.txt
- forbidding words in the personal dictionary (with an asterisk; a slash
marks affixation)
- optional compressed dictionary format "hzip" for aff and dic files
usage:
hzip example.aff example.dic
mv example.aff example.dic /tmp
hunspell -d example
hunzip example.aff.hz >example.aff
hunzip example.dic.hz >example.dic
- new affix compression tool "affixcompress": compression tool for
large (millions of words) dictionaries.
- support encrypted dictionaries for closed OpenOffice.org extensions or
other commercial programs
- improved manual
- bug fixes
2007-11-01: Hunspell 1.2.1 release:
- new memory efficient condition checking algorithm for affix rules
- new morphological functions:
- stem() for stemming
- analyze() for morphological analysis
- generate() for morphological generation
- new demos:
- analyze: stemming, morphological analysis and generation
- chmorph: morphological conversion of texts
"ncurses" option. "wide-curses" now just toggles whether we use
wide or narrow curses, which is a much simpler knob for users.
Bump the PKGREVISION to 2.
updated to latest version by me:
Hunspell is the default spell checker of OpenOffice.org office suite
and expectant spell checker of Mozilla Firefox and Thunderbird.
Main features:
* Unicode support.
* Conditional and multiple affixes for languages with rich morphology.
* Extended compound word support.
* Morphological analysis (in custom item and arrangement style).
* Hunspell is based on MySpell and works also with MySpell dictionaries.
* GPL/LGPL/MPL tri-license