15 Commits

Author SHA1 Message Date
Albert Cervera i Areny 2d2910d49a Improvements in date and page number detection. 2009-03-25 02:02:17 +01:00
Albert Cervera i Areny 642cf462e7 Call callback function only once per Tag. Added several words for invoice number. 2009-03-25 02:01:29 +01:00
Albert Cervera i Areny 201109c913 Remove commented code in InvoiceRecognizer. 2009-03-23 23:17:06 +01:00
Albert Cervera i Areny ccee97f718 Added an algorithm to detect fixed pitch fonts.
Now the algorithm to format text (adding spaces) takes into account whether the
font is fixed pitch. Detection seems to work reasonably well. The format-text
algorithm still needs to evaluate fixedPitch on a per-block basis instead of
using the same criterion for the whole line.
2009-03-23 23:14:58 +01:00
Albert Cervera i Areny 9cb1e8870a Remove a couple of print statements. 2009-03-23 23:14:33 +01:00
Albert Cervera i Areny 41224310da Several improvements in invoice recognition:
Added new types, improved date recognition, improved performance.
2009-03-23 16:09:11 +01:00
Albert Cervera i Areny 4119ec747c Important performance improvement: store images as BMP instead of PNG
before processing them with external tools. A sample image took 13
seconds to be stored as PNG while BMP took less than a second.
2009-03-16 23:45:12 +01:00
Albert Cervera i Areny 02241981b6 Improved Block class implementation. Make Ocr use it internally.
First steps towards Block finding in documents.
2009-03-16 23:18:56 +01:00
Albert Cervera i Areny 4cafa912ce Added Block, PdfReader, Range and TextPatterns. 2009-03-14 18:11:20 +01:00
Albert Cervera i Areny 0eb944c512 - Fixed doxygen file.
- Added invoice recognition module. Still missing block detection.
2009-03-14 18:10:23 +01:00
Albert Cervera i Areny 2cbba682f0 Make region optional in textInRegion functions. 2009-03-14 18:08:32 +01:00
Albert Cervera i Areny fe3f8c1cb8 Use Levenshtein module when available for performance reasons
(it's 300 times faster).
2008-12-30 20:25:59 +01:00
Albert Cervera i Areny 7d8111e99c Implemented and fully working Ocr Cuneiform backend. 2008-12-30 02:30:59 +01:00
Albert Cervera i Areny 021d5df12d More uppercase renaming. 2008-12-30 00:44:53 +01:00
Albert Cervera i Areny c2fb42ebbd Renamed:
- Unwritable NaNScaN to NanScan.
- Capitalized file names.
2008-12-29 01:53:29 +01:00