Commit graph

2 commits

Author SHA1 Message Date
Sergey Svishchev
c68285ed14 Follow redirects (HOMEPAGE, MASTER_SITES) 2006-11-10 23:16:04 +00:00
Thomas Klausner
922e8744f0 Initial import of tesseract-1.02:
This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO
OUTPUT FORMATTING, and NO UI. It can only process an image of a
single column and create text from it. It can detect fixed pitch
vs proportional text.  Having said that, in 1995, this engine was
in the top 3 in terms of character accuracy, and it compiles and
runs on both Linux and Windows. Another current limitation is that
it only recognizes English and its character set is only US-ASCII.
Training code IS included in the open source release however, and
will be included in a future release.

TODO:

Compiles fine, but dumps core on NetBSD-4.99.3/amd64. Backtrace:
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004c1c70 in reverse32 ()
(gdb) bt
#0  0x00000000004c1c70 in reverse32 ()
#1  0x00000000004aed12 in read_squished_dawg ()
#2  0x00000000004aaded in init_permute ()
#3  0x0000000000485779 in program_editup ()
#4  0x0000000000485869 in start_recog ()
#5  0x0000000000403d04 in init_tesseract ()
#6  0x000000000040309b in main ()
(gdb)
2006-10-27 22:30:56 +00:00