freebsd-ports/textproc/py-ocrmypdf/pkg-descr
Kai Knoblich 16091b3e6b New port: textproc/py-ocrmypdf
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be
searched or copy-pasted.

Main features:

* Generates a searchable PDF/A file from a regular PDF
* Places OCR text accurately below the image to ease copy / paste
* Keeps the exact resolution of the original embedded images
* When possible, inserts OCR information as a "lossless" operation without
  disrupting any other content
* Optimizes PDF images, often producing files smaller than the input file
* If requested, deskews and/or cleans the image before performing OCR
* Validates input and output files
* Distributes work across all available CPU cores
* Uses the Tesseract OCR engine to recognize more than 100 languages
* Scales properly to handle files with thousands of pages
* Battle-tested on millions of PDFs
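
As a rough illustration of how the features above map onto an invocation, the
following sketch assembles an ocrmypdf command line. The helper name
build_ocrmypdf_command and the file names are hypothetical; the flags
(--output-type, -l, --deskew, --clean, --jobs) are the upstream command-line
options as commonly documented, so check "ocrmypdf --help" for the installed
version before relying on them.

```python
import os
import shlex


def build_ocrmypdf_command(input_pdf, output_pdf, languages=("eng",),
                           deskew=False, clean=False, jobs=None):
    """Return an argv list for the ocrmypdf command-line tool (hypothetical helper)."""
    # Request PDF/A output, matching the "searchable PDF/A" feature.
    cmd = ["ocrmypdf", "--output-type", "pdfa"]
    # Tesseract language codes are joined with "+", e.g. "eng+deu".
    cmd += ["-l", "+".join(languages)]
    if deskew:
        cmd.append("--deskew")
    if clean:
        # Assumption: cleaning relies on the external "unpaper" tool being installed.
        cmd.append("--clean")
    if jobs is not None:
        # Limit the number of worker processes; left unset, the tool picks its own default.
        cmd += ["--jobs", str(jobs)]
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; this only prints the command, it does not run it.
    argv = build_ocrmypdf_command("scan.pdf", "scan-ocr.pdf",
                                  languages=("eng", "deu"),
                                  deskew=True, jobs=os.cpu_count())
    print(shlex.join(argv))
```

The printed line can be pasted into a shell, or the list can be passed to
subprocess.run() once the port and the required Tesseract language data are
installed.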

WWW: https://github.com/jbarlow83/OCRmyPDF

Reviewed by:	0mp, koobs
Differential Revision:	https://reviews.freebsd.org/D20927
2019-07-12 15:08:03 +00:00
