part). These patches, released under a BSD license, seem to improve the
accuracy of language detection, especially for languages that don't use a
Latin script.
technique described in Cavnar & Trenkle, "N-Gram-Based Text Categorization".
It was primarily developed for language guessing, a task on which it is known to
perform with near-perfect accuracy.
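
The ranked n-gram idea from Cavnar & Trenkle can be sketched in a few lines of
Python. This is only an illustration of the published technique, not the
library's own code or API: the function names, the "_" padding character, the
whitespace tokenization, and the limits (n-grams up to length 5, top 300 per
profile) are assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first.

    Assumption: tokens are split on whitespace and padded with "_";
    the paper's tokenization is stricter than this.
    """
    counts = Counter()
    for token in text.lower().split():
        padded = "_" + token + "_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def guess_language(text, profiles):
    """Pick the name of the profile with the smallest out-of-place distance."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda name: out_of_place(doc, profiles[name]))
```

In use, one profile is built per language from sample text, and an unknown
document is assigned to the profile at the smallest distance. Real profiles
need far more training text than a toy example would contain.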
WWW: http://software.wise-guys.nl/libtextcat/