1.5.0
=====
### Significant changes relative to 1.5 beta1:
1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast
path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory.
2. Added libjpeg-turbo version and build information to the global string table
of the libjpeg and TurboJPEG API libraries. This is a common practice in other
infrastructure libraries, such as OpenSSL and libpng, because it makes it easy
to examine an application binary and determine which version of the library the
application was linked against.
3. Fixed a couple of issues in the PPM reader that would cause buffer overruns
in cjpeg if one of the values in a binary PPM/PGM input file exceeded the
maximum value defined in the file's header. libjpeg-turbo 1.4.2 already
included a similar fix for ASCII PPM/PGM files. Note that these issues were
not security bugs, since they were confined to the cjpeg program and did not
affect any of the libjpeg-turbo libraries.
4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt
header using the `tjDecompressToYUV2()` function would cause the function to
abort without returning an error and, under certain circumstances, corrupt the
stack. This only occurred if `tjDecompressToYUV2()` was called prior to
calling `tjDecompressHeader3()`, or if the return value from
`tjDecompressHeader3()` was ignored (both cases represent incorrect usage of
the TurboJPEG API.)
5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that
prevented the code from assembling properly with clang.
6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and
`jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a
source/destination manager has already been assigned to the compress or
decompress object by a different function or by the calling program. This
prevents these functions from attempting to reuse a source/destination manager
structure that was allocated elsewhere, because there is no way to ensure that
it would be big enough to accommodate the new source/destination manager.
1.4.90 (1.5 beta1)
==================
### Significant changes relative to 1.4.2:
1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX
(128-bit SIMD) instructions. Although the performance of libjpeg-turbo on
PowerPC was already good, due to the increased number of registers available
to the compiler vs. x86, it was still possible to speed up compression by about
3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the
use of AltiVec instructions.
2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and
`jpeg_crop_scanline()`) that can be used to partially decode a JPEG image. See
[libjpeg.txt](libjpeg.txt) for more details.
3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now
implement the Closeable interface, so those classes can be used with a
try-with-resources statement.
4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions
(IllegalArgumentException, IllegalStateException) for unrecoverable errors
caused by incorrect API usage, and those classes throw a new checked exception
type (TJException) for errors that are passed through from the C library.
5. Source buffers for the TurboJPEG C API functions, as well as the
`jpeg_mem_src()` function in the libjpeg API, are now declared as const
pointers. This facilitates passing read-only buffers to those functions and
ensures the caller that the source buffer will not be modified. This should
not create any backward API or ABI incompatibilities with prior libjpeg-turbo
releases.
6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1
FPUs.
7. Fixed additional negative left shifts and other issues reported by the GCC
and Clang undefined behavior sanitizers. Most of these issues affected only
32-bit code, and none of them was known to pose a security threat, but removing
the warnings makes it easier to detect actual security issues, should they
arise in the future.
8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code.
This directive was preventing the code from assembling using the clang
integrated assembler.
9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit
libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora
distributions. This was due to the addition of a macro in jconfig.h that
allows the Huffman codec to determine the word size at compile time. Since
that macro differs between 32-bit and 64-bit builds, this caused a conflict
between the i386 and x86_64 RPMs (any differing files, other than executables,
are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
Since the macro is used only internally, it has been moved into jconfigint.h.
10. The x86-64 SIMD code can now be disabled at run time by setting the
`JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations
already had this capability.)
11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the
benchmark from outputting any images. This removes any potential operating
system overhead that might be caused by lazy writes to disk and thus improves
the consistency of the performance measurements.
12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64
platforms. This speeds up the compression of full-color JPEGs by about 10-15%
on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD
CPUs. Additionally, this works around an issue in the clang optimizer that
prevents it (as of this writing) from achieving the same performance as GCC
when compiling the C version of the Huffman encoder
(<https://llvm.org/bugs/show_bug.cgi?id=16035>). For the purposes of
benchmarking or regression testing, SIMD-accelerated Huffman encoding can be
disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`.
13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
downsampling algorithms, which are not accelerated in the 32-bit NEON
implementation.) This speeds up the compression of full-color JPEGs by about
75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
Cortex-A53 and Cortex-A57 cores.
14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
and 64-bit platforms.
For 32-bit code, this speeds up the compression of full-color JPEGs by
about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by
about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and
Cortex-A57), relative to libjpeg-turbo 1.4.x. Note that the larger speedup
under iOS is due to the fact that iOS builds use LLVM, which does not optimize
the C Huffman encoder as well as GCC does.
For 64-bit code, NEON-accelerated Huffman encoding speeds up the
compression of full-color JPEGs by about 40% on average on a typical iOS device
(iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device
(Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in
[13] above.
For the purposes of benchmarking or regression testing, SIMD-accelerated
Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment
variable to `1`.
15. pkg-config (.pc) scripts are now included for both the libjpeg and
TurboJPEG API libraries on Un*x systems. Note that if a project's build system
relies on these scripts, then it will not be possible to build that project
with libjpeg or with a prior version of libjpeg-turbo.
16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
improve performance on CPUs with in-order pipelines. This speeds up the
decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
processor and by about 15% on average on a Cortex-A53 core.
17. Fixed an issue in the accelerated Huffman decoder that could have caused
the decoder to read past the end of the input buffer when a malformed,
specially-crafted JPEG image was being decompressed. In prior versions of
libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
if there were > 128 bytes of data in the input buffer. However, it is possible
to construct a JPEG image in which a single Huffman block is over 430 bytes
long, so this version of libjpeg-turbo activates the accelerated Huffman
decoder only if there are > 512 bytes of data in the input buffer.
18. Fixed a memory leak in tjunittest encountered when running the program
with the `-yuv` option.
1.4.2
=====
### Significant changes relative to 1.4.1:
1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a
negative width or height was used as an input image (Windows bitmaps can have
a negative height if they are stored in top-down order, but such files are
rare and not supported by libjpeg-turbo.)
2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would
incorrectly encode certain JPEG images when quality=100 and the fast integer
forward DCT were used. This was known to cause `make test` to fail when the
library was built with `-march=haswell` on x86 systems.
3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest
& greatest development version of the Clang/LLVM compiler. This was caused by
an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD
routines. Those routines were incorrectly using a 64-bit `mov` instruction to
transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper
(unused) 32 bits of a 32-bit argument's register to be undefined. The new
Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
structure members into a single 64-bit register, and this exposed the ABI
conformance issue.
4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
upsampling routine that caused a buffer overflow (and subsequent segfault) when
decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
pixels. The "plain" upsampling routines are normally only used when
decompressing a non-YCbCr JPEG image, but they are also used when decompressing
a JPEG image whose scaled output height is 1.
5. Fixed various negative left shifts and other issues reported by the GCC and
Clang undefined behavior sanitizers. None of these was known to pose a
security threat, but removing the warnings makes it easier to detect actual
security issues, should they arise in the future.
Problems found with existing digests:
Package fotoxx distfile fotoxx-14.03.1.tar.gz
ac2033f87de2c23941261f7c50160cddf872c110 [recorded]
118e98a8cc0414676b3c4d37b8df407c28a1407c [calculated]
Package ploticus-examples distfile ploticus-2.00/plnode200.tar.gz
34274a03d0c41fae5690633663e3d4114b9d7a6d [recorded]
da39a3ee5e6b4b0d3255bfef95601890afd80709 [calculated]
Problems found locating distfiles:
Package AfterShotPro: missing distfile AfterShotPro-1.1.0.30/AfterShotPro_i386.deb
Package pgraf: missing distfile pgraf-20010131.tar.gz
Package qvplay: missing distfile qvplay-0.95.tar.gz
Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden). All existing
SHA1 digests retained for now as an audit trail.
Changes:
1.4.0
=====
[2] The non-SIMD RGB565 color conversion code did not work correctly on big
endian machines. This has been fixed.
[3] Fixed an issue in tjPlaneSizeYUV() whereby it would erroneously return 1
instead of -1 if componentID was > 0 and subsamp was TJSAMP_GRAY.
[3] Fixed an issue in tjBufSizeYUV2() wherby it would erroneously return 0
instead of -1 if width was < 1.
[5] The Huffman encoder now uses clz and bsr instructions for bit counting on
ARM64 platforms (see 1.4 beta1 [5].)
[6] The close() method in the TJCompressor and TJDecompressor Java classes is
now idempotent. Previously, that method would call the native tjDestroy()
function even if the TurboJPEG instance had already been destroyed. This
caused an exception to be thrown during finalization, if the close() method had
already been called. The exception was caught, but it was still an expensive
operation.
[7] The TurboJPEG API previously generated an error ("Could not determine
subsampling type for JPEG image") when attempting to decompress grayscale JPEG
images that were compressed with a sampling factor other than 1 (for instance,
with 'cjpeg -grayscale -sample 2x2'). Subsampling technically has no meaning
with grayscale JPEGs, and thus the horizontal and vertical sampling factors
for such images are ignored by the decompressor. However, the TurboJPEG API
was being too rigid and was expecting the sampling factors to be equal to 1
before it treated the image as a grayscale JPEG.
[8] cjpeg, djpeg, and jpegtran now accept an argument of -version, which will
print the library version and exit.
[9] Referring to 1.4 beta1 [15], another extremely rare circumstance was
discovered under which the Huffman encoder's local buffer can be overrun
when a buffered destination manager is being used and an
extremely-high-frequency block (basically junk image data) is being encoded.
Even though the Huffman local buffer was increased from 128 bytes to 136 bytes
to address the previous issue, the new issue caused even the larger buffer to
be overrun. Further analysis reveals that, in the absolute worst case (such as
setting alternating AC coefficients to 32767 and -32768 in the JPEG scanning
order), the Huffman encoder can produce encoded blocks that approach double the
size of the unencoded blocks. Thus, the Huffman local buffer was increased to
256 bytes, which should prevent any such issue from re-occurring in the future.
1.3.90 (1.4 beta1)
==================
[1] New features in the TurboJPEG API:
-- YUV planar images can now be generated with an arbitrary line padding
(previously only 4-byte padding, which was compatible with X Video, was
supported.)
-- The decompress-to-YUV function has been extended to support image scaling.
-- JPEG images can now be compressed from YUV planar source images.
-- YUV planar images can now be decoded into RGB or grayscale images.
-- 4:1:1 subsampling is now supported. This is mainly included for
compatibility, since 4:1:1 is not fully accelerated in libjpeg-turbo and has no
significant advantages relative to 4:2:0.
-- CMYK images are now supported. This feature allows CMYK source images to be
compressed to YCCK JPEGs and YCCK or CMYK JPEGs to be decompressed to CMYK
destination images. Conversion between CMYK/YCCK and RGB or YUV images is not
supported. Such conversion requires a color management system and is thus out
of scope for a codec library.
-- The handling of YUV images in the Java API has been significantly refactored
and should now be much more intuitive.
-- The Java API now supports encoding a YUV image from an arbitrary position in
a large image buffer.
-- All of the YUV functions now have a corresponding function that operates on
separate image planes instead of a unified image buffer. This allows for
compressing/decoding from or decompressing/encoding to a subregion of a larger
YUV image. It also allows for handling YUV formats that swap the order of the
U and V planes.
[2] Added SIMD acceleration for DSPr2-capable MIPS platforms. This speeds up
the compression of full-color JPEGs by 70-80% on such platforms and
decompression by 25-35%.
[3] If an application attempts to decompress a Huffman-coded JPEG image whose
header does not contain Huffman tables, libjpeg-turbo will now insert the
default Huffman tables. In order to save space, many motion JPEG video frames
are encoded without the default Huffman tables, so these frames can now be
successfully decompressed by libjpeg-turbo without additional work on the part
of the application. An application can still override the Huffman tables, for
instance to re-use tables from a previous frame of the same video.
[5] The Huffman encoder now uses clz and bsr instructions for bit counting on
ARM platforms rather than a lookup table. This reduces the memory footprint
by 64k, which may be important for some mobile applications. Out of four
Android devices that were tested, two demonstrated a small overall performance
loss (~3-4% on average) with ARMv6 code and a small gain (also ~3-4%) with
ARMv7 code when enabling this new feature, but the other two devices
demonstrated a significant overall performance gain with both ARMv6 and ARMv7
code (~10-20%) when enabling the feature. Actual mileage may vary.
[7] Improved the accuracy and performance of the non-SIMD implementation of the
floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
The accuracy of this implementation now matches the accuracy of the SSE/SSE2
implementation. Note, however, that the floating point DCT/IDCT algorithms are
mainly a legacy feature. They generally do not produce significantly better
accuracy than the slow integer DCT/IDCT algorithms, and they are quite a bit
slower.
[8] Added a new output colorspace (JCS_RGB565) to the libjpeg API that allows
for decompressing JPEG images into RGB565 (16-bit) pixels. If dithering is not
used, then this code path is SIMD-accelerated on ARM platforms.
[13] Restored 12-bit-per-component JPEG support. A 12-bit version of
libjpeg-turbo can now be built by passing an argument of --with-12bit to
configure (Unix) or -DWITH_12BIT=1 to cmake (Windows.) 12-bit JPEG support is
included only for convenience. Enabling this feature disables all of the
performance features in libjpeg-turbo, as well as arithmetic coding and the
TurboJPEG API. The resulting library still contains the other libjpeg-turbo
features (such as the colorspace extensions), but in general, it performs no
faster than libjpeg v6b.
[14] Added ARM 64-bit SIMD acceleration for the YCC-to-RGB color conversion
and IDCT algorithms (both are used during JPEG decompression.) For unknown
reasons (probably related to clang), this code cannot currently be compiled for
iOS.
[15] Fixed an extremely rare bug that could cause the Huffman encoder's local
buffer to overrun when a very high-frequency MCU is compressed using quality
100 and no subsampling, and when the JPEG output buffer is being dynamically
resized by the destination manager. This issue was so rare that, even with a
test program specifically designed to make the bug occur (by injecting random
high-frequency YUV data into the compressor), it was reproducible only once in
about every 25 million iterations.
[16] Fixed an oversight in the TurboJPEG C wrapper: if any of the JPEG
compression functions was called repeatedly with the same
automatically-allocated destination buffer, then TurboJPEG would erroneously
assume that the jpegSize parameter was equal to the size of the buffer, when in
fact that parameter was probably equal to the size of the most recently
compressed JPEG image. If the size of the previous JPEG image was not as large
as the current JPEG image, then TurboJPEG would unnecessarily reallocate the
destination buffer.
1.3.1
=====
[3] Fixed a bug whereby attempting to encode a progressive JPEG with arithmetic
entropy coding (by passing arguments of -progressive -arithmetic to cjpeg or
jpegtran, for instance) would result in an error, "Requested feature was
omitted at compile time".
[4] Fixed a couple of issues whereby malformed JPEG images would cause
libjpeg-turbo to use uninitialized memory during decompression.
[5] Fixed an error ("Buffer passed to JPEG library is too small") that occurred
when calling the TurboJPEG YUV encoding function with a very small (< 5x5)
source image, and added a unit test to check for this error.
=====
[1] 'make test' now works properly on FreeBSD, and it no longer requires the
md5sum executable to be present on other Un*x platforms.
[2] Overhauled the packaging system:
-- To avoid conflict with vendor-supplied libjpeg-turbo packages, the
official RPMs and DEBs for libjpeg-turbo have been renamed to
"libjpeg-turbo-official".
-- The TurboJPEG libraries are now located under /opt/libjpeg-turbo in the
official Linux and Mac packages, to avoid conflict with vendor-supplied
packages and also to streamline the packaging system.
-- Release packages are now created with the directory structure defined
by the configure variables "prefix", "bindir", "libdir", etc. (Un*x) or by the
CMAKE_INSTALL_PREFIX variable (Windows.) The exception is that the docs are
always located under the system default documentation directory on Un*x and Mac
systems, and on Windows, the TurboJPEG DLL is always located in the Windows
system directory.
-- To avoid confusion, official libjpeg-turbo packages on Linux/Unix platforms
(except for Mac) will always install the 32-bit libraries in
/opt/libjpeg-turbo/lib32 and the 64-bit libraries in /opt/libjpeg-turbo/lib64.
-- Fixed an issue whereby, in some cases, the libjpeg-turbo executables on Un*x
systems were not properly linking with the shared libraries installed by the
same package.
-- Fixed an issue whereby building the "installer" target on Windows when
WITH_JAVA=1 would fail if the TurboJPEG JAR had not been previously built.
-- Building the "install" target on Windows now installs files into the same
places that the installer does.
[3] Fixed a Huffman encoder bug that prevented I/O suspension from working
properly.
COMMENT should not be longer than 70 characters.
COMMENT should not begin with 'A'.
COMMENT should not begin with 'An'.
COMMENT should not begin with 'a'.
COMMENT should not end with a period.
COMMENT should start with a capital letter.
pkglint warnings. Some files also got minor formatting, spelling, and style
corrections.
it should provide the jpeg-8 API/ABI rather than the -6b one.
So switch to 1.1.0 with the jpeg8 "configure" option.
Tested to be binary compatible on i386, at least for simple
image viewers.
being here, add "test" target
instructions to accelerate baseline JPEG compression/decompression by about
2-4x on x86 and x86-64 platforms.
XXX Conflicts with graphics/jpeg - which rather demands a solution.