mirror of
https://github.com/NaN-tic/nanscan.git
synced 2023-12-14 03:12:57 +01:00
37 lines
1.7 KiB
Plaintext
37 lines
1.7 KiB
Plaintext
|
Some ideas where nanscan could help/implement:
|
||
|
|
||
|
- The queue order of scanned documents should be kept somehow. For example,
|
||
|
in Open ERP even if one document has gone as an attachment to one customer
|
||
|
it could be interesting to know which document (page) was scanned just before
|
||
|
and which just after. This could be used by algorithms trying to identify
|
||
|
related pages. For multi page document handling we could use djvu as it seems
|
||
|
the right document format. It allows attaching meta information, ocr
|
||
|
information with characters positions is supported (perl script available to
|
||
|
feed from tesseract output), it can also handle annotations.
|
||
|
Image compression is supposed to be very good. A first test with c44 (the
|
||
|
compression application) shows results similar to JPEG.
|
||
|
|
||
|
It should be easy to attach concepts such as page, volume or collection to an image.
|
||
|
Templates should ease that too. So by simply setting an input as 'page' or 'volume'
|
||
|
there should be some logic to handle and understand them and put them in the same
|
||
|
djvu (or PDF, TIFF, whatever) document.
|
||
|
|
||
|
djvu has some interesting features regarding speed. You can even have each page (or I
|
||
|
suppose a set of pages) in different files which can be queried on demand. The index
|
||
|
file just links to the appropiate URL for each of the pages.
|
||
|
|
||
|
- Logo/image detection/recognition
|
||
|
|
||
|
- Document segmentation
|
||
|
|
||
|
- Multiline values
|
||
|
|
||
|
- API consistency: Should we use python properties (python standard) or setProperty()
|
||
|
property() functions (PyQt4 standard). The latter has a couple of problems: the need
|
||
|
to create both functions for each property and that if we want to embed a widget in
|
||
|
QtDesigner properties should be set in Python style.
|
||
|
|
||
|
- Add simple scanning dialog, with the possibility of storing the image in a file,
|
||
|
server, etc.
|
||
|
|