freebsd-ports/textproc/rubygem-whatlanguage/pkg-descr
Joseph Mingrone 8a4e16c9ca textproc/rubygem-whatlanguage: Natural language detection for text samples
Adding textproc/rubygem-whatlanguage, because it is a dependency for the
upcoming port, net-im/mastodon.

Approved by:	swills (mentor, implicit)
2017-04-26 19:33:45 +00:00


WhatLanguage, written in pure Ruby, detects the human language of supplied text.
It uses Bloom filters, so it is fast and memory efficient. It works well on
text longer than 10 words (e.g. blog posts or comments) and very poorly on
short or Twitter-esque text.
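As a rough illustration of why Bloom filters suit this task and why short text
is hard to classify, the sketch below builds one small filter per language and
counts word hits. It is not the gem's actual implementation: TinyBloomFilter,
WORD_LISTS, FILTERS, and guess_language are hypothetical names, the word lists
are tiny stand-ins, and MD5 is used only as a convenient stdlib hash.

```ruby
require 'digest'

# Minimal Bloom filter: a fixed-size bit array plus several hash positions
# per word. Lookups may yield false positives but never false negatives.
class TinyBloomFilter
  def initialize(size_bits: 1 << 16, hashes: 3)
    @size = size_bits
    @hashes = hashes
    @bits = Array.new(size_bits, false)
  end

  def add(word)
    positions(word).each { |i| @bits[i] = true }
  end

  def include?(word)
    positions(word).all? { |i| @bits[i] }
  end

  private

  # Derive several bit positions by salting one hash with a seed.
  def positions(word)
    (0...@hashes).map do |seed|
      Digest::MD5.hexdigest("#{seed}:#{word}").to_i(16) % @size
    end
  end
end

# Hypothetical miniature word lists; a real detector needs far larger ones.
WORD_LISTS = {
  english: %w[the and is of to in that it was for],
  french:  %w[le la et est de les des un une dans]
}.freeze

# One filter per language, filled with that language's words.
FILTERS = WORD_LISTS.transform_values do |words|
  filter = TinyBloomFilter.new
  words.each { |w| filter.add(w) }
  filter
end

# Score each language by how many words of the text hit its filter.
# Returns nil when no word matches any filter.
def guess_language(text)
  scores = Hash.new(0)
  text.downcase.scan(/\p{L}+/).each do |word|
    FILTERS.each { |lang, filter| scores[lang] += 1 if filter.include?(word) }
  end
  scores.max_by { |_, count| count }&.first
end
```

Each filter stores only bits, not the words themselves, which keeps memory use
low. Because the result is a vote over word hits, a text with only a few words
gives the vote little to work with.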
It works with Arabic, Dutch, English, Farsi, Finnish, French, German, Greek,
Hebrew, Hungarian, Italian, Korean, Norwegian, Pinyin, Polish, Portuguese,
Russian, Spanish, and Swedish out of the box.
WWW: https://github.com/peterc/whatlanguage