pkgsrc/graphics/tesseract/distinfo

$NetBSD: distinfo,v 1.15 2017/06/14 14:41:26 fhajny Exp $
Update graphics/tesseract to 3.04.01. Move to new home at Github. Clean up.

2015-02-17 - V3.04.01
- Added OSD renderer for psm 0. Works for single page and multi-page images.
- Improve tesstrain.sh script.
- Simplify build and run of ScrollView.
- Improved PDF output for OS X Preview utility.
- INCOMPATIBLE fix to hOCR line height information - commit 134ebc3.
- Added option to build Tesseract without Cube OCR engine (-DNO_CUBE_BUILD).
- Enable OpenMP support.
- Many bug fixes.

2015-07-11 - V3.04.00
- Tesseract development is now done with Git and hosted at github.com (Previously we used Subversion as a VCS and code.google.com for hosting).
- Tesseract now requires leptonica 1.71 or a higher version.
- Removed official support for VS 2008.
- Added support for 39 additional scripts/languages, including: amh, asm, aze_cyrl, bod, bos, ceb, cym, dzo, fas, gle, guj, hat, iku, jav, kat, kat_old, kaz, khm, kir, kur, lao, lat, mar, mya, nep, ori, pan, pus, san, sin, srp_latn, syr, tgk, tir, uig, urd, uzb, uzb_cyrl, yid
- Major updates to training system as a result of extensive testing on 100 languages.
- New training data for over 100 languages
- Improved performance with PIC compilation option.
- Significant change to invisible font system in pdf output to improve correctness and compatibility with external programs, particularly ghostscript.
- Improved font identification.
- Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc.
- Fixed problems with shifted baselines so recognition can recover from layout analysis errors.
- Major refactor to improve speed on difficult images, especially when running a heap checker.
- Moved params from global in page layout to tesseractclass.
- Improved single column layout analysis.
- Allow ocr output to multiple formats using tesseract command line executable.
- Fixed issues with mixed eng+ara scripts.
- Improved script consistency in numbers.
- Major refactor of control.cpp to enable line recognition.
- Added tesstrain.sh - a master training script.
- Added ability to text2image training tool to just list available fonts.
- Added ability to text2image to underline words.
- Improved efficiency of image processing for PDF output.
- Added parameter description for each parameter listed with 'print-parameters' command line option.
- Added font info to hOCR output.
- Enabled streaming input and output of multi-page documents.
- Many bug fixes.

2014-02-04 - V3.03(rc1)
- Added new training tool text2image to generate box/tif file pairs from text and truetype fonts.
- Added support for PDF output with searchable text.
- Removed entire IMAGE class and all code in image directory.
- Tesseract executable: support for output to stdout; limited support for one page images from stdin (especially on Windows)
- Added Renderer to API to allow document-level processing and output of document formats, like hOCR, PDF.
- Major refactor of word-level recognition, beam search, eliminating dead code.
- Refactored classifier to make it easier to add new ones.
- Generalized feature extractor to allow feature extraction from greyscale.
- Improved sub/superscript treatment.
- Improved baseline fit.
- Added set_unicharset_properties to training tools.
- Many bug fixes.
- More training source data included.
2016-03-17 13:51:14 +01:00
SHA1 (tessdata-3.04.00.tar.gz) = 6ea24cccf0e823da98589ccc75d51f0950618236
RMD160 (tessdata-3.04.00.tar.gz) = 0a3c3b3c127b6031e2e037d78e3a6f159fb9e869
SHA512 (tessdata-3.04.00.tar.gz) = 4fbb66137c729e16c7a9e35b09916a45c1bb5ec5a7002a22647e0b10975362cb44c6d6c0c997baf25866f78749ec2d4a86317ec3fb664bd963243e230516d162
Size (tessdata-3.04.00.tar.gz) = 499088801 bytes
SHA1 (tesseract-3.05.01.tar.gz) = a9a70bf84a597cb3c228d73c70a590e7b032b6ce
RMD160 (tesseract-3.05.01.tar.gz) = 11fae540fdd0ec4f6f9388fae4bbde790b17ee4d
SHA512 (tesseract-3.05.01.tar.gz) = a49c20c98386684cd89582e57b772811204fad8e5ff18214fb0da109f73629c70845054985e31e8deeb49107fbcf56e546aff661f08eb5dd60fbf83dbe976e81
Size (tesseract-3.05.01.tar.gz) = 3574810 bytes
SHA1 (patch-tessdata_Makefile.am) = 013c9b4bbf64a0948a362d334e6b86a240aa944f
SHA1 (patch-viewer_scrollview.cpp) = 05a9ff5d2a9e302b3a682144db54c612fd4eccc2
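
The entries above follow a regular `ALGORITHM (filename) = value` layout, so they can be checked mechanically. The following is only a minimal sketch of such a check and is not how pkgsrc itself verifies distfiles. The helper names `parse_distinfo` and `verify` are hypothetical, and RMD160 entries are skipped because Python's `hashlib` does not expose a digest under that name.

```python
import hashlib
import re

# Matches lines such as "SHA1 (file.tar.gz) = abc123" or "Size (file.tar.gz) = 42 bytes".
ENTRY = re.compile(r"^(\w+) \((.+)\) = (\S+)(?: bytes)?$")


def parse_distinfo(text):
    """Return {filename: {algorithm: value}} for every checksum or size line."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:  # the $NetBSD$ tag and blank lines do not match
            algorithm, filename, value = match.groups()
            entries.setdefault(filename, {})[algorithm] = value
    return entries


def verify(data, expected):
    """Check bytes against one file's entries; return the fields that mismatch."""
    mismatches = []
    for algorithm, value in expected.items():
        if algorithm == "Size":
            actual = str(len(data))
        elif algorithm.lower() in hashlib.algorithms_available:
            actual = hashlib.new(algorithm.lower(), data).hexdigest()
        else:
            continue  # e.g. RMD160: not available under that name, so not checked
        if actual != value:
            mismatches.append(algorithm)
    return mismatches
```

A caller would read the distfile as bytes, look up its name in the parsed dictionary, and treat an empty list from `verify` as a pass for the fields that were checked.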