Albert Cervera i Areny
2d2910d49a
Improvements in date and page number detection.
2009-03-25 02:02:17 +01:00
Albert Cervera i Areny
642cf462e7
Call callback function only once per Tag. Added several words for invoice number.
2009-03-25 02:01:29 +01:00
Albert Cervera i Areny
201109c913
Remove commented code in InvoiceRecognizer.
2009-03-23 23:17:06 +01:00
Albert Cervera i Areny
ccee97f718
Added an algorithm to detect fixed pitch fonts.
The text-formatting algorithm (the one that adds spaces) now takes into
account whether the font is fixed pitch. Detection seems to work reasonably
well. The formatting algorithm still needs to evaluate fixedPitch on a
per-block basis instead of applying the same criterion to the whole line.
2009-03-23 23:14:58 +01:00
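The log does not show how fixed-pitch detection is implemented. As a rough sketch of one possible approach, the snippet below treats a run of characters as fixed pitch when the horizontal advance between consecutive character boxes is nearly constant. The function name `is_fixed_pitch`, its parameters, and the thresholds are assumptions made for illustration and are not taken from the project.

```python
from statistics import median


def is_fixed_pitch(left_edges, tolerance=0.15, min_share=0.9):
    """Guess whether a run of characters uses a fixed-pitch font.

    left_edges: x coordinates of the left edge of each consecutive
    character box in one block (hypothetical input format).
    tolerance: allowed relative deviation from the typical advance.
    min_share: fraction of advances that must fall within tolerance.
    """
    # Advance = distance between the left edges of neighbouring characters.
    advances = [b - a for a, b in zip(left_edges, left_edges[1:])]
    if len(advances) < 3:
        # Too little evidence to decide; assume proportional.
        return False
    typical = median(advances)
    if typical <= 0:
        return False
    within = sum(1 for adv in advances if abs(adv - typical) <= tolerance * typical)
    return within / len(advances) >= min_share
```

Calling such a function once per block rather than once per line would match the per-block evaluation that the commit message lists as remaining work.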
Albert Cervera i Areny
9cb1e8870a
Remove a couple of print statements.
2009-03-23 23:14:33 +01:00
Albert Cervera i Areny
41224310da
Several improvements in invoice recognition:
Added new types, improved date recognition, improved performance.
2009-03-23 16:09:11 +01:00
Albert Cervera i Areny
4119ec747c
Important performance improvement: store images as BMP instead of PNG
before processing them with external tools. A sample image took 13
seconds to be stored as PNG while BMP took less than a second.
2009-03-16 23:45:12 +01:00
Albert Cervera i Areny
02241981b6
Improved Block class implementation. Make Ocr use it internally.
First steps towards Block finding in documents.
2009-03-16 23:18:56 +01:00
Albert Cervera i Areny
4cafa912ce
Added Block, PdfReader, Range and TextPatterns.
2009-03-14 18:11:20 +01:00
Albert Cervera i Areny
0eb944c512
- Fixed doxygen file.
- Added invoice recognition module. Still missing block detection.
2009-03-14 18:10:23 +01:00
Albert Cervera i Areny
2cbba682f0
Make region optional in textInRegion functions.
2009-03-14 18:08:32 +01:00
Albert Cervera i Areny
fe3f8c1cb8
Use Levenshtein module when available for performance reasons
(it's 300 times faster).
2008-12-30 20:25:59 +01:00
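The "use it when available" pattern from this commit can be sketched as an optional import with a pure-Python fallback. This is a minimal illustration, not the project's actual code: it assumes the optional C extension is the `Levenshtein` package and that it exposes `distance()`. The helper names `pure_python_distance` and `edit_distance` are hypothetical.

```python
try:
    # Optional C extension; assumed to provide distance(a, b).
    from Levenshtein import distance as _fast_distance
except ImportError:
    _fast_distance = None


def pure_python_distance(a, b):
    """Levenshtein distance using the classic row-by-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution (free on a match)
            ))
        previous = current
    return previous[-1]


def edit_distance(a, b):
    """Use the fast extension when it was importable, else the fallback."""
    if _fast_distance is not None:
        return _fast_distance(a, b)
    return pure_python_distance(a, b)
```

Callers only ever use `edit_distance`, so the rest of the code does not need to know which implementation is active.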
Albert Cervera i Areny
7d8111e99c
Implemented the Ocr Cuneiform backend, which is now fully working.
2008-12-30 02:30:59 +01:00
Albert Cervera i Areny
021d5df12d
More uppercase renaming.
2008-12-30 00:44:53 +01:00
Albert Cervera i Areny
c2fb42ebbd
Renamed:
- Unwritable NaNScaN to NanScan.
- Capitalized file names.
2008-12-29 01:53:29 +01:00