Update textproc/p5-Lingua-EN-Tagger from 0.16nb2 to 0.20.
pkgsrc changes:
- add newly introduced dependency on www/p5-HTML-Tagset
Upstream changes since 0.16:
0.20 Aaron Coburn 7/6/12
Escaped curly braces in regex patterns.
In perl 5.17 this becomes necessary.
0.19 Aaron Coburn 5/28/12
Added missing metadata fields to
Makefile.PL
0.18 Aaron Coburn 5/11/12
Added requirement for 5.8 for proper
unicode support. Modified get_sentences
routine for $ chars as with preceding
issue.
0.17 Aaron Coburn 5/10/12
Added better error handling for loading
YAML files. Fixed error in get_sentences
routine related to (, [ and { characters
being fused to the preceding word.
Bump the PKGREVISION for all packages which depend directly on perl,
to trigger/signal a rebuild for the transition 5.10.1 -> 5.12.1.
The list of packages is computed by finding all packages which end
up having either of PERL5_USE_PACKLIST, BUILDLINK_API_DEPENDS.perl,
or PERL5_PACKLIST defined in their make setup (tested via
"make show-vars VARNAMES=..."), minus the packages updated after
the perl package update.
sno@ was right after all, obache@ kindly asked and he@ led the
way. Thanks!
Import textproc/p5-Lingua-EN-Tagger as a dependency of the scheduled import
of Lingua::EN::Inflect::Phrase, which is a dependency of the scheduled update
of DBIx::Class::Schema::Loader.
The module is a probability-based, corpus-trained tagger that assigns POS
tags to English text based on a lookup dictionary and a set of probability
values. The tagger assigns appropriate tags based on conditional
probabilities - it examines the preceding tag to determine the appropriate
tag for the current word. Unknown words are classified according to word
morphology or can be set to be treated as nouns or other parts of speech.
The tagger also extracts as many nouns and noun phrases as it can, using a
set of regular expressions.
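
The tagging approach described above can be illustrated with a small,
self-contained sketch. This is not the module's code or its API: the
language is Python rather than Perl, the names LEXICON, TRANSITIONS,
guess_unknown and tag are hypothetical, and every probability is invented
for the example. It only shows the general idea of choosing each tag from
the lexicon probability combined with the probability of that tag
following the preceding tag, with a morphology-based fallback for unknown
words.

```python
# Toy sketch of a greedy tagger driven by the preceding tag.
# All names and numbers here are invented for illustration; they are
# NOT taken from Lingua::EN::Tagger.

# Hypothetical lexicon: word -> {tag: probability of that tag for the word}
LEXICON = {
    "the": {"DET": 1.0},
    "dog": {"NN": 0.9, "VB": 0.1},
    "can": {"MD": 0.7, "NN": 0.2, "VB": 0.1},
    "run": {"VB": 0.7, "NN": 0.3},
}

# Hypothetical transition table: previous tag -> {tag: P(tag | previous tag)}
TRANSITIONS = {
    "START": {"DET": 0.6, "NN": 0.3, "MD": 0.05, "VB": 0.05},
    "DET": {"NN": 0.9, "VB": 0.05, "MD": 0.05},
    "NN": {"MD": 0.4, "VB": 0.4, "NN": 0.2},
    "MD": {"VB": 0.9, "NN": 0.1},
    "VB": {"DET": 0.5, "NN": 0.4, "VB": 0.1},
}

# Small probability used when a tag pair is missing from TRANSITIONS.
UNSEEN = 0.01


def guess_unknown(word):
    """Classify a word missing from the lexicon by its suffix, else as a noun."""
    if word.endswith("ing"):
        return {"VBG": 1.0}
    if word.endswith("ly"):
        return {"RB": 1.0}
    return {"NN": 1.0}


def tag(words):
    """Greedily pick, for each word, the tag with the highest combined score."""
    prev = "START"
    tagged = []
    for word in words:
        key = word.lower()
        candidates = LEXICON.get(key) or guess_unknown(key)
        following = TRANSITIONS.get(prev, {})
        # Score = lexicon probability * probability of the tag after `prev`.
        best = max(
            candidates,
            key=lambda t: candidates[t] * following.get(t, UNSEEN),
        )
        tagged.append((word, best))
        prev = best
    return tagged


if __name__ == "__main__":
    print(tag(["The", "can"]))
    print(tag(["dog", "can", "run"]))
```

With this toy data, "can" is tagged as a noun after a determiner but as a
modal after a noun, which is the effect of examining the preceding tag.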