v1.3.0 (30th December 2012)
- Numerous performance optimisations (thanks Alex Dowad)
- Improved text extraction (thanks Nathaniel Madura)
- Load less of the hashery gem to reduce core monkey patches
- various bug fixes
v1.2.0 (28th August 2012)
- Feature: correctly extract text using surrogate pairs and ligatures
(thanks Nathaniel Madura)
- Speed optimisation: cache tokenised Form XObjects to avoid re-parsing them
- Feature: support opening documents with some junk bytes prepended to file
(thanks Paul Gallagher)
- Acrobat does this, so it seemed reasonable to add support
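
As a rough illustration of tolerating junk bytes before the header, the sketch below searches the start of the data for the "%PDF-" marker. It is not PDF::Reader's actual implementation: `pdf_header_offset` is a hypothetical helper and the 1024-byte search window is an assumption.

```ruby
# Hypothetical helper, not part of PDF::Reader: find where the real PDF
# data starts when a file has junk bytes before the "%PDF-" header.
def pdf_header_offset(data, window = 1024)
  # Search a binary copy of the leading bytes so no encoding errors occur.
  head = data.byteslice(0, window).to_s.b
  head.index("%PDF-".b) # nil when no header is found inside the window
end

junk   = "\x00\x01garbage\n".b
doc    = junk + "%PDF-1.4\n%%EOF\n".b
offset = pdf_header_offset(doc)

# Everything from the offset onwards is the document proper.
clean = doc.byteslice(offset, doc.bytesize - offset)
```

A parser taking this approach would typically treat the offset as the new start of file, since byte offsets in the cross-reference table are relative to the header.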
v1.1.0 (25th March 2012)
- new PageState class for handling common state tracking in page receivers
- see PageTextReceiver for example usage
- various bugfixes to support reading more PDF dialects
v1.0.0 (16th January 2012)
- support a new encryption variation
- bugfix in PageTextReceiver (thanks Paul Gallagher)
v1.0.0.rc1 (19th December 2011)
- performance optimisations (all by Bernerd Schaefer)
- some improvements to text extraction from form xobjects
- assume invalid font encodings are StandardEncoding
- use binary mode when opening PDFs to stop ruby being helpful and transcoding
bytes for us
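
The effect of binary mode can be seen with plain Ruby file I/O. This is a minimal sketch using only the standard library; the sample bytes and temp file are made up for the example and say nothing about how PDF::Reader opens files internally.

```ruby
require "tempfile"

# Bytes that are not valid UTF-8, as commonly found near the top of a PDF.
raw = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

file = Tempfile.new("sample")
file.binmode
file.write(raw)
file.close

# "rb" opens the file in binary mode: no newline conversion, and the
# string that comes back is tagged ASCII-8BIT, so nothing is transcoded.
binary = File.open(file.path, "rb") { |io| io.read }

same_bytes = (binary.bytes == raw.bytes)
file.unlink
```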
v1.0.0.beta1 (6th October 2011)
- ensure inline images that contain "EI" are correctly parsed
(thanks Bernerd Schaefer)
- fix parsing of inline image data
v0.12.0.alpha (28th August 2011)
- small breaking changes to the page-based API - it's alpha for a reason
- resource related methods on Page object return raw PDF objects
- if the caller wants the resources wrapped in a more convenient
Ruby object (like PDF::Reader::Font or PDF::Reader::FormXObject), they
will need to do so themselves
- add support for RunLengthDecode filters (thanks Bernerd Schaefer)
- add support for standard PDF encryption (thanks Evan Brunner)
- add support for decoding stream with TIFF prediction
- new PDF::Reader::FormXObject class to simplify working with form XObjects
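
For readers unfamiliar with the RunLengthDecode filter mentioned above, the following sketch decodes the format as described in the PDF specification: a length byte of 0 to 127 means "copy the next length + 1 bytes", 129 to 255 means "repeat the next byte 257 - length times", and 128 marks end of data. The method name `run_length_decode` is an assumption for illustration and this is not PDF::Reader's own code.

```ruby
# Illustrative decoder for the PDF RunLengthDecode filter.
def run_length_decode(data)
  bytes = data.bytes
  out   = []
  pos   = 0
  while pos < bytes.length
    length = bytes[pos]
    pos += 1
    case length
    when 128     # end-of-data marker
      break
    when 0..127  # literal run: copy the next length + 1 bytes
      out.concat(bytes[pos, length + 1] || [])
      pos += length + 1
    else         # 129..255: repeat the next byte 257 - length times
      out.concat([bytes[pos]] * (257 - length)) if bytes[pos]
      pos += 1
    end
  end
  out.pack("C*") # binary string of decoded bytes
end

# Literal run "abc", then "x" repeated 4 times, then end of data.
encoded = [2, 97, 98, 99, 253, 120, 128].pack("C*")
decoded = run_length_decode(encoded)
```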
v0.11.0.alpha (19th July 2011)
- introduce experimental new page-based API
- old API is deprecated but will continue to work with no warnings
- add transparent caching of common objects to ObjectHash
The PDF::Reader library implements a PDF parser conforming as much as
possible to the PDF specification from Adobe.
It provides programmatic access to the contents of a PDF file with
a high degree of flexibility.
The PDF 1.7 specification is a weighty document and not all aspects
are currently supported. I welcome submission of PDF files that
exhibit unsupported aspects of the spec to assist with improving our
support.