Problems found locating distfiles:
Package acroread7: missing distfile AdobeReader_enu-7.0.9-1.i386.tar.gz
Package acroread8: missing distfile AdobeReader_enu-8.1.7-1.sparc.tar.gz
Package cups-filters: missing distfile cups-filters-1.1.0.tar.xz
Package dvidvi: missing distfile dvidvi-1.0.tar.gz
Package lgrind: missing distfile lgrind.tar.bz2
Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden). All existing
SHA1 digests retained for now as an audit trail.
v1.3.0 (30th December 2012)
- Numerous performance optimisations (thanks Alex Dowad)
- Improved text extraction (thanks Nathaniel Madura)
- Load less of the hashery gem to reduce core monkey patches
- various bug fixes
v1.2.0 (28th August 2012)
- Feature: correctly extract text using surrogate pairs and ligatures
(thanks Nathaniel Madura)
- Speed optimisation: cache tokenised Form XObjects to avoid re-parsing them
- Feature: support opening documents with some junk bytes prepended to file
(thanks Paul Gallagher)
- Acrobat does this, so it seemed reasonable to add support
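As a rough illustration of the junk-byte tolerance described above, the sketch below searches for the "%PDF-" header instead of assuming it sits at byte 0. The helper name pdf_header_offset and the 1024-byte search window are assumptions made for this example; they are not taken from PDF::Reader's code.

```ruby
# Hypothetical helper, not part of PDF::Reader's API: return the byte
# offset of the "%PDF-" header, tolerating junk bytes in front of it.
# Assumption: only the first search_limit bytes are inspected.
def pdf_header_offset(bytes, search_limit = 1024)
  # Work on a binary copy so invalid UTF-8 in the junk cannot raise.
  bytes.byteslice(0, search_limit).b.index("%PDF-".b)
end

clean = "%PDF-1.4\n1 0 obj\n".b
junk  = "\x00\xFFjunk".b + clean

puts pdf_header_offset(clean).inspect    # offset of the header in a clean file
puts pdf_header_offset(junk).inspect     # offset after the prepended junk
puts pdf_header_offset("hello").inspect  # nil when no header is present
```

A reader built this way would treat the returned offset as the logical start of the file; how PDF::Reader itself adjusts later offsets is not shown here.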
v1.1.0 (25th March 2012)
- new PageState class for handling common state tracking in page receivers
- see PageTextReceiver for example usage
- various bugfixes to support reading more PDF dialects
v1.0.0 (16th January 2012)
- support a new encryption variation
- bugfix in PageTextReceiver (thanks Paul Gallagher)
v1.0.0.rc1 (19th December 2011)
- performance optimisations (all by Bernerd Schaefer)
- some improvements to text extraction from form xobjects
- assume invalid font encodings are StandardEncoding
- use binary mode when opening PDFs to stop Ruby being helpful and transcoding
bytes for us
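The binary-mode point above can be sketched with Ruby's standard library alone. The helper read_pdf_bytes is a hypothetical name for this example, not a PDF::Reader method; it only shows that the "rb" mode yields untranscoded bytes tagged as binary.

```ruby
require "tempfile"

# Hypothetical helper: read a file as raw bytes. The "rb" mode turns off
# newline conversion and tags the result as ASCII-8BIT (binary), so Ruby
# does not try to interpret the data as text.
def read_pdf_bytes(path)
  File.open(path, "rb") { |io| io.read }
end

# Sample data with high bytes, similar to the comment line many PDFs
# carry after the header.
raw = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

Tempfile.create("sample") do |f|
  f.binmode
  f.write(raw)
  f.flush

  data = read_pdf_bytes(f.path)
  puts data.encoding   # the binary encoding
  puts data == raw     # bytes round-trip unchanged
end
```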
v1.0.0.beta1 (6th October 2011)
- ensure inline images that contain "EI" are correctly parsed
(thanks Bernerd Schaefer)
- fix parsing of inline image data
v0.12.0.alpha (28th August 2011)
- small breaking changes to the page-based API - it's alpha for a reason
- resource related methods on Page object return raw PDF objects
- if the caller wants the resources wrapped in a more convenient
Ruby object (like PDF::Reader::Font or PDF::Reader::FormXObject), they will
need to do so themselves
- add support for RunLengthDecode filters (thanks Bernerd Schaefer)
- add support for standard PDF encryption (thanks Evan Brunner)
- add support for decoding streams with TIFF prediction
- new PDF::Reader::FormXObject class to simplify working with form XObjects
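For readers unfamiliar with the RunLengthDecode filter mentioned above, here is a minimal stand-alone sketch of the decoding rule from the PDF specification: a length byte of 0 to 127 means "copy the next length + 1 bytes", 129 to 255 means "repeat the next byte 257 - length times", and 128 marks end of data. The method name run_length_decode is an assumption for this example; PDF::Reader's own filter implementation may be structured differently.

```ruby
# Hypothetical stand-alone decoder, not PDF::Reader's filter class.
def run_length_decode(data)
  out = String.new(encoding: Encoding::BINARY)
  bytes = data.bytes
  pos = 0

  while pos < bytes.length
    length = bytes[pos]
    pos += 1

    case length
    when 0..127
      # Literal run: copy the next length + 1 bytes unchanged.
      count = length + 1
      out << bytes[pos, count].pack("C*")
      pos += count
    when 128
      # End-of-data marker: ignore anything that follows.
      break
    else
      # Repeated run (129..255): emit the next byte 257 - length times.
      break if pos >= bytes.length
      out << ([bytes[pos]] * (257 - length)).pack("C*")
      pos += 1
    end
  end

  out
end

# "\x02" copies three literal bytes, "\xFE" repeats "z" three times,
# and "\x80" ends the data.
puts run_length_decode("\x02abc\xFEz\x80".b)
```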
v0.11.0.alpha (19th July 2011)
- introduce experimental new page-based API
- old API is deprecated but will continue to work with no warnings
- add transparent caching of common objects to ObjectHash
The PDF::Reader library implements a PDF parser conforming as much as
possible to the PDF specification from Adobe.
It provides programmatic access to the contents of a PDF file with
a high degree of flexibility.
The PDF 1.7 specification is a weighty document and not all aspects
are currently supported. I welcome submission of PDF files that
exhibit unsupported aspects of the spec to assist with improving our
support.