freebsd-ports/chinese/p5-Lingua-ZH-WordSegmenter/pkg-descr
Cheng-Lung Sung 14e6325838 Add p5-Lingua-ZH-WordSegmenter 0.01, simplified Chinese Word
Segmentation.

PR:		ports/113476
Submitted by:	Gea-Suan Lin <gslin at gslin.org>
2007-07-02 02:08:49 +00:00

This is a Perl implementation of simplified Chinese word segmentation.
The segmenter searches for the longest word at each point from both the
left and the right, and chooses the result with the higher frequency
product.
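
As a rough illustration of the idea above (not the module's actual code),
the following Python sketch runs a longest-match pass from the left and
another from the right, then keeps the segmentation whose word
frequencies multiply to the larger value. The dictionary, its
frequencies, the treatment of unknown characters (frequency 1), the tie
rule, and all function names are assumptions made for this example.

```python
from math import prod  # Python 3.8+

# Toy dictionary with made-up frequencies (hypothetical, for illustration).
FREQ = {"研究": 50, "研究生": 10, "生命": 40, "命": 5, "起源": 20}
MAX_LEN = max(len(w) for w in FREQ)
UNKNOWN_FREQ = 1  # assumption: words missing from the dictionary count as 1


def forward_match(text):
    """Greedy longest match, scanning left to right."""
    words, i = [], 0
    while i < len(text):
        for n in range(min(MAX_LEN, len(text) - i), 0, -1):
            cand = text[i:i + n]
            # Fall back to a single character when nothing longer matches.
            if n == 1 or cand in FREQ:
                words.append(cand)
                i += n
                break
    return words


def backward_match(text):
    """Greedy longest match, scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for n in range(min(MAX_LEN, j), 0, -1):
            cand = text[j - n:j]
            if n == 1 or cand in FREQ:
                words.append(cand)
                j -= n
                break
    words.reverse()
    return words


def score(words):
    """Product of the dictionary frequencies of the words."""
    return prod(FREQ.get(w, UNKNOWN_FREQ) for w in words)


def segment(text):
    """Keep whichever direction yields the higher frequency product."""
    fwd, bwd = forward_match(text), backward_match(text)
    # Assumption: ties go to the forward result.
    return fwd if score(fwd) >= score(bwd) else bwd


print(" ".join(segment("研究生命起源")))
```

With this toy dictionary the forward pass gives 研究生 / 命 / 起源
(product 10 * 5 * 20 = 1000) and the backward pass gives 研究 / 生命 /
起源 (product 50 * 40 * 20 = 40000), so the backward result is kept.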
The original program comes from the CPAN module Lingua::ZH::WordSegment
(http://search.cpan.org/~chenyr/). I made the following changes: 1) made
the interface object-oriented; 2) converted internal strings to utf8;
3) used Sogou's dictionary (http://www.sogou.com/labs/dl/w.html) as
the default dictionary.
WWW: http://search.cpan.org/dist/Lingua-ZH-WordSegmenter/