14 commits

Author SHA1 Message Date
taca
e5f6ee50db print/ruby-pdf-reader: update to 2.9.1
2.8.0 (2021-12-28)

* Add PDF::Reader::Page#runs for extracting text from a page with
  positioning metadata (http://github.com/yob/pdf-reader/pull/411)
* Add options to PDF::Reader::Page#text to make some behaviour configurable
  (http://github.com/yob/pdf-reader/pull/411)
  - including extracting the text for only part of the page
* Improve text positioning and extraction for Type3 fonts
  (http://github.com/yob/pdf-reader/pull/412)
* Skip extracting text that is positioned outside the page
  (http://github.com/yob/pdf-reader/pull/413)
* Fix occasional crash when reading some streams
  (http://github.com/yob/pdf-reader/pull/405)

2.9.0 (2022-01-24)

* Support additional encryption standards
  (http://github.com/yob/pdf-reader/pull/419)
* Return CropBox correctly from Page#rectangles
  (https://github.com/yob/pdf-reader/pull/420)
* For sorbet users, additional type annotations are included in the gem

2.9.1 (2022-02-04)

* Fix exception in Page#walk introduced in 2.9.0
  (http://github.com/yob/pdf-reader/pull/442)
* Other small bug fixes
2022-02-14 14:15:28 +00:00
taca
d5c66fe724 print/ruby-pdf-reader: update to 2.7.0
2.7.0 (2021-12-13)

* Include RBI type files in the gem

  - Downstream users of pdf-reader who also use sorbet *should* find many
    parts of the API will now be type checked by sorbet

* Fix glyph positioning in some rotation scenarios
  (http://github.com/yob/pdf-reader/pull/403)

  - Improved text extraction on some rotated pages, and rotated text on
    normal pages

* Add PDF::Reader::Page#rectangles
  (http://github.com/yob/pdf-reader/pull/402)

  - Returns page boxes (MediaBox, etc) with rotation applied, and as PORO
    rather than arrays of numbers

* Add PDF::Reader::Page#origin (http://github.com/yob/pdf-reader/pull/400)

* Add PDF::Reader::Page#{height,width}
  (http://github.com/yob/pdf-reader/pull/399)

* Overlap filter should only drop characters that overlap *and* match
  (http://github.com/yob/pdf-reader/pull/401)
2021-12-13 14:58:57 +00:00
taca
c788b62755 print/ruby-pdf-reader: update to 2.6.0
2.6.0 (2021-11-12)

* Text extraction improvements

  - Improved text layout on pages with a variety of font sizes
    (http://github.com/yob/pdf-reader/pull/355)
  - Fixed text positioning for some rotated pages
    (http://github.com/yob/pdf-reader/pull/356)
  - Improved character width calculation for PDFs using built-in
    (non-embedded) ZapfDingbats (http://github.com/yob/pdf-reader/pull/373)
  - Skip zero-width characters (http://github.com/yob/pdf-reader/pull/372)

* Performance improvements

  - Reduced memory pressure when decoding TIFF images
    (http://github.com/yob/pdf-reader/pull/360)
  - Optional dependency on ascii85_native gem for faster processing of files
    using the ascii85 filter (http://github.com/yob/pdf-reader/pull/359)

* Successfully parse more files

  - Gracefully handle some non-spec compliant CR/LF issues
    (http://github.com/yob/pdf-reader/pull/364)
  - Fix parsing of some escape sequences in content streams
    (http://github.com/yob/pdf-reader/pull/368)
  - Increase the amount of junk bytes we detect and skip at the end of a
    file (382)
  - Ignore "/Prev 0" in trailers (http://github.com/yob/pdf-reader/pull/383)
  - Fix parsing of some inline images (BI ID EI tokens)
    (http://github.com/yob/pdf-reader/pull/389)
  - Gracefully handle some xref tables that incorrectly start with 1
    (http://github.com/yob/pdf-reader/pull/384)
2021-11-28 08:07:38 +00:00
taca
12eef851be print/ruby-pdf-reader: update to 2.4.1
v2.4.1 (24th September 2020)
- Re-vendor font metrics from Adobe to clarify their license
2021-01-11 01:06:21 +00:00
taca
fd2276bc41 print/ruby-pdf-reader: update to 2.4.0
Update ruby-pdf-reader to 2.4.0.


2.4.0 (21st November 2019)

- Optimise overlapping characters code introduced in 2.3.0. Text extraction
  of pages with thousands of characters is still slower than it was in
  2.2.1, but it might be tolerable for now.
  See https://github.com/yob/pdf-reader/pull/308 for details.

- Implement very basic font substitution for Type1 and TrueType fonts that
  aren't embedded

- Remove PDF::Hash class. It's been deprecated since 2010, and it's hard to
  believe anyone is still using it.

- Several small bug fixes

2.3.0 (7th November 2019)

- Text extraction now makes an effort to skip duplicate characters that
  overlap, a common approach used for a fake "bold" effect. This will make
  text extraction a bit slower - if that turns out to be an issue I'll look
  into further optimisations or provide a toggle to turn it off

- Several small bug fixes
2020-03-24 15:40:56 +00:00
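
The 2.3.0 note above describes skipping duplicate characters that overlap to fake a bold effect. The following is a minimal sketch of that idea, not pdf-reader's actual implementation: the `Glyph` struct, the `drop_fake_bold_duplicates` helper and the 0.5 point tolerance are all hypothetical, and "overlap" is simplified here to "almost the same origin". A glyph is dropped only when it overlaps an already kept glyph and carries the same text.

```ruby
# Hypothetical value object for illustration; pdf-reader's own classes are
# not used here.
Glyph = Struct.new(:text, :x, :y) do
  # Simplified overlap test: two glyphs "overlap" when their origins are
  # within `tolerance` points of each other on both axes.
  def overlaps?(other, tolerance)
    (x - other.x).abs <= tolerance && (y - other.y).abs <= tolerance
  end
end

# Keep the first glyph of every group that shares the same text and sits at
# (almost) the same position. Glyphs with different text are never dropped,
# even when they overlap.
def drop_fake_bold_duplicates(glyphs, tolerance: 0.5)
  kept = []
  glyphs.each do |glyph|
    duplicate = kept.any? do |other|
      other.text == glyph.text && other.overlaps?(glyph, tolerance)
    end
    kept << glyph unless duplicate
  end
  kept
end

glyphs = [
  Glyph.new("A", 10.0, 20.0),
  Glyph.new("A", 10.2, 20.0), # fake-bold copy of the first "A"
  Glyph.new("B", 10.1, 20.0), # overlaps, but the text differs
  Glyph.new("A", 30.0, 20.0)  # same text, different position
]
result = drop_fake_bold_duplicates(glyphs)
puts result.map(&:text).join
```

This naive version compares every glyph with every kept glyph, so its cost grows quadratically with the number of characters on a page.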
taca
28448f9975 print/ruby-pdf-reader: update to 2.1.0
pkgsrc change:

* Add the missing ALTERNATIVES file, forgotten since 2015.

v2.1.0 (15th February 2018)
- Support extra encrypted PDF variants (thanks to Gyuchang Jun)
- various bug fixes
2018-03-14 15:31:31 +00:00
taca
32f8f50c43 Update ruby-pdf-reader to 2.0.0.
v2.0.0 (25th February 2017)
- various bug fixes

v2.0.0.beta1 (15th February 2017)
- BREAKING CHANGE: remove all methods that were deprecated in 1.0.0
- Bug: Support extra encrypted PDF variants (thanks to Gyuchang Jun)
- various bug fixes

v1.4.1 (2nd January 2017)
- improve compatibility with ruby 2.4 (thanks Akira Matsuda)
- various bug fixes
2017-03-20 15:05:43 +00:00
taca
e1c57134e6 Update ruby-pdf-reader to 1.4.0.
v1.4.0 (22nd February 2016)
- raise minimum ruby version to 1.9.3
- print warnings to stderr when deprecated methods are used. These methods have been
  deprecated for 4 years, so hopefully few people are depending on them
- Fix exception when a non-breaking space (character 160) is used with a
  built-in font (helvetica, etc)
- various bug fixes
2016-03-15 15:01:16 +00:00
taca
a2e21f3209 * Allow this package to build on Ruby 2.2.
* Add support for pkg_alternatives.

Bump PKGREVISION.
2015-06-08 16:02:54 +00:00
taca
de0fc45d26 Update ruby-pdf-reader to 1.3.0.
v1.3.0 (30th December 2012)
- Numerous performance optimisations (thanks Alex Dowad)
- Improved text extraction (thanks Nathaniel Madura)
- Load less of the hashery gem to reduce core monkey patches
- various bug fixes
2013-02-11 08:58:50 +00:00
taca
4362cc5187 Update ruby-pdf-reader to 1.2.0.
v1.2.0 (28th August 2012)
- Feature: correctly extract text using surrogate pairs and ligatures
  (thanks Nathaniel Madura)
- Speed optimisation: cache tokenised Form XObjects to avoid re-parsing them
- Feature: support opening documents with some junk bytes prepended to file
  (thanks Paul Gallagher)
  - Acrobat does this, so it seemed reasonable to add support
2012-09-16 08:18:36 +00:00
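
The v1.2.0 entry above mentions opening documents that have junk bytes prepended to the file. One way to tolerate this, sketched below under stated assumptions, is to search the start of the file for the `%PDF-` marker and treat its position as the real beginning of the document. The helper name `pdf_header_offset` and the 1024-byte search window are assumptions made for illustration; they are not taken from pdf-reader.

```ruby
require "stringio"

# Returns the byte offset of the "%PDF-" marker within the first `limit`
# bytes of the stream, or nil when no marker is found.
# The 1024-byte window is an assumption, not pdf-reader's documented value.
def pdf_header_offset(io, limit: 1024)
  io.rewind
  head = io.read(limit) || ""
  # Work on a binary copy so the index is a byte offset.
  head.b.index("%PDF-".b)
end

clean   = StringIO.new("%PDF-1.4\n...".b)
junk    = StringIO.new("\x00\x00garbage\n%PDF-1.7\n...".b)
not_pdf = StringIO.new("hello")

clean_offset   = pdf_header_offset(clean)
junk_offset    = pdf_header_offset(junk)
not_pdf_offset = pdf_header_offset(not_pdf)

p clean_offset
p junk_offset
p not_pdf_offset
```

A reader built this way would add the returned offset to every byte position it later resolves, since the positions recorded inside the file were calculated without the junk prefix.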
taca
4ed129e6df Update ruby-pdf-reader to 1.1.0.
v1.1.0 (25th March 2012)
- new PageState class for handling common state tracking in page receivers
  - see PageTextReceiver for example usage
- various bugfixes to support reading more PDF dialects
2012-04-29 14:18:50 +00:00
taca
411af244a2 Update ruby-pdf-reader to 1.0.0.
v1.0.0 (16th January 2012)
- support a new encryption variation
- bugfix in PageTextRender (thanks Paul Gallagher)

v1.0.0.rc1 (19th December 2011)
- performance optimisations (all by Bernerd Schaefer)
- some improvements to text extraction from form xobjects
- assume invalid font encodings are StandardEncoding
- use binary mode when opening PDFs to stop ruby being helpful and transcoding
    bytes for us
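
The last item above is about opening PDFs in binary mode. A small sketch of what that looks like in plain Ruby follows; the `read_pdf_bytes` helper and the sample bytes are hypothetical and only illustrate the `"rb"` mode, which yields an ASCII-8BIT (binary) string and disables newline and encoding conversion.

```ruby
require "tempfile"

# Read a whole file in binary mode. "rb" disables newline conversion and
# tags the result as ASCII-8BIT instead of the default external encoding.
def read_pdf_bytes(path)
  File.open(path, "rb") { |f| f.read }
end

# A PDF-like header followed by four high-bit bytes, as many PDFs carry on
# their second line.
sample = "%PDF-1.4\n\xE2\xE3\xCF\xD3\n".b

data = nil
Tempfile.create("sample") do |file|
  file.binmode
  file.write(sample)
  file.flush
  data = read_pdf_bytes(file.path)
end

puts data.encoding
puts data.bytesize
```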

v1.0.0.beta1 (6th October 2011)
- ensure inline images that contain "EI" are correctly parsed
  (thanks Bernerd Schaefer)
- fix parsing of inline image data

v0.12.0.alpha (28th August 2011)
- small breaking changes to the page-based API - it's alpha for a reason
  - resource related methods on Page object return raw PDF objects
  - if the caller wants the resources wrapped in a more convenient
    Ruby object (like PDF::Reader::Font or PDF::Reader::FormXObject) they
    will need to do so themselves
- add support for RunLengthDecode filters (thanks Bernerd Schaefer)
- add support for standard PDF encryption (thanks Evan Brunner)
- add support for decoding stream with TIFF prediction
- new PDF::Reader::FormXObject class to simplify working with form XObjects
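
For readers unfamiliar with the RunLengthDecode filter mentioned above, here is a minimal standalone decoder following the scheme in the PDF specification. It is a sketch for illustration, not the code shipped in pdf-reader, and the method name `run_length_decode` is hypothetical.

```ruby
# Decode RunLengthDecode data as described in the PDF specification:
#   length byte 0..127   -> copy the next (length + 1) bytes literally
#   length byte 129..255 -> repeat the next byte (257 - length) times
#   length byte 128      -> end of data
def run_length_decode(data)
  bytes = data.bytes
  out = []
  pos = 0
  while pos < bytes.length
    length = bytes[pos]
    pos += 1
    break if length == 128

    if length < 128
      # Literal run; tolerate truncated input by copying what is available.
      out.concat(bytes[pos, length + 1] || [])
      pos += length + 1
    else
      # Repeated run of a single byte.
      out.concat([bytes[pos]] * (257 - length)) if bytes[pos]
      pos += 1
    end
  end
  out.pack("C*")
end

# Literal run "abc", then "x" repeated four times, then the end-of-data byte.
encoded = [2, 97, 98, 99, 253, 120, 128].pack("C*")
decoded = run_length_decode(encoded)
puts decoded
```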

v0.11.0.alpha (19th July 2011)
- introduce experimental new page-based API
  - old API is deprecated but will continue to work with no warnings
- add transparent caching of common objects to ObjectHash
2012-03-20 13:08:33 +00:00
taca
08aa15a881 Importing ruby-pdf-reader package version 0.9.2.
The PDF::Reader library implements a PDF parser conforming as much as
possible to the PDF specification from Adobe.

It provides programmatic access to the contents of a PDF file with
a high degree of flexibility.

The PDF 1.7 specification is a weighty document and not all aspects
are currently supported. I welcome submission of PDF files that
exhibit unsupported aspects of the spec to assist with improving our
support.
2011-06-19 14:18:58 +00:00