Author: adam
SHA1: 3b1f9eeee4
Message: libjpeg-turbo: updated to 2.1.0
2.1.0

Significant changes relative to 2.1 beta1:

Fixed a regression introduced by 2.1 beta1[6(b)] whereby attempting to decompress certain progressive JPEG images with one or more component planes of width 8 or less caused a buffer overrun.

Fixed a regression introduced by 2.1 beta1[6(b)] whereby attempting to decompress a specially-crafted malformed progressive JPEG image caused the block smoothing algorithm to read from uninitialized memory.

Fixed an issue in the Arm Neon SIMD Huffman encoders that caused the encoders to generate incorrect results when using the Clang compiler with Visual Studio.

Fixed a floating point exception (CVE-2021-20205) that occurred when attempting to compress a specially-crafted malformed GIF image with a specified image width of 0 using cjpeg.
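The class of bug described above can be avoided by validating the image dimensions before any arithmetic depends on them. The following is a minimal sketch of that idea, not the actual cjpeg fix: the function name `gif_read_dimensions` is hypothetical, and the sketch only assumes the standard GIF header layout (a 6-byte signature followed by little-endian 16-bit width and height).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of cjpeg: reads the logical screen size from
 * a GIF header and rejects empty images before any later code can divide by
 * the width or height.
 * Returns 0 on success, or -1 if the header is truncated, is not a GIF, or
 * declares a zero width or height. */
static int gif_read_dimensions(const unsigned char *buf, size_t len,
                               unsigned int *width, unsigned int *height)
{
  unsigned int w, h;

  if (buf == NULL || len < 10)
    return -1;
  if (memcmp(buf, "GIF87a", 6) != 0 && memcmp(buf, "GIF89a", 6) != 0)
    return -1;

  /* Width and height are stored as little-endian 16-bit values. */
  w = buf[6] | ((unsigned int)buf[7] << 8);
  h = buf[8] | ((unsigned int)buf[9] << 8);

  if (w == 0 || h == 0)
    return -1;               /* empty image: refuse to continue */

  *width = w;
  *height = h;
  return 0;
}
```

A caller would treat a return value of -1 as a fatal input error and stop before allocating row buffers.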

Fixed a regression introduced by 2.0 beta1[15] whereby attempting to generate a progressive JPEG image on an SSE2-capable CPU using a scan script containing one or more scans with lengths divisible by 32 and non-zero successive approximation low bit positions would, under certain circumstances, result in an error ("Missing Huffman code table entry") and an invalid JPEG image.

Introduced a new flag (TJFLAG_LIMITSCANS in the TurboJPEG C API and TJ.FLAG_LIMIT_SCANS in the TurboJPEG Java API) and a corresponding TJBench command-line argument (-limitscans) that causes the TurboJPEG decompression and transform functions/operations to return/throw an error if a progressive JPEG image contains an unreasonably large number of scans. This allows applications that use the TurboJPEG API to guard against an exploit of the progressive JPEG format described in the report "Two Issues with the JPEG Standard".
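To illustrate the kind of guard this flag provides, here is a rough sketch that counts start-of-scan (SOS) markers in a JPEG byte stream and gives up once a caller-chosen limit is exceeded. It is an assumption-laden illustration, not libjpeg-turbo's implementation: the function name `count_scans_limited` and the `max_scans` parameter are hypothetical, and the changelog does not state what threshold the library itself uses.

```c
#include <stddef.h>

/* Hypothetical helper, not libjpeg-turbo code. Walks the marker segments of a
 * JPEG stream and counts SOS (0xFFDA) markers.
 * Returns the number of scans, or -1 if the stream is malformed, ends before
 * EOI, or contains more than max_scans scans. */
static int count_scans_limited(const unsigned char *buf, size_t len,
                               int max_scans)
{
  size_t i = 2;
  int scans = 0;

  if (buf == NULL || len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
    return -1;                                   /* missing SOI */

  while (i + 1 < len) {
    unsigned char m;
    size_t seglen;

    if (buf[i] != 0xFF)
      return -1;                                 /* expected a marker */
    m = buf[i + 1];
    if (m == 0xFF) { i++; continue; }            /* fill byte before marker */
    i += 2;
    if (m == 0xD9)
      return scans;                              /* EOI */
    if (m == 0x01 || (m >= 0xD0 && m <= 0xD7))
      continue;                                  /* markers without a length */

    if (i + 1 >= len)
      return -1;
    seglen = ((size_t)buf[i] << 8) | buf[i + 1]; /* includes the 2 length bytes */
    if (seglen < 2 || seglen > len - i)
      return -1;
    i += seglen;

    if (m == 0xDA) {
      if (++scans > max_scans)
        return -1;                               /* too many scans */
      /* Skip entropy-coded data. Inside it, 0xFF is only followed by 0x00
       * (byte stuffing), 0xFF (fill), or RSTn; anything else is a marker. */
      while (i + 1 < len) {
        unsigned char n = buf[i + 1];
        if (buf[i] == 0xFF && n != 0x00 && n != 0xFF &&
            !(n >= 0xD0 && n <= 0xD7))
          break;
        i++;
      }
    }
  }
  return -1;                                     /* no EOI found */
}
```

Applications that use the TurboJPEG API would normally pass the new flag rather than parse markers themselves; the sketch only shows why bounding the scan count is cheap to enforce.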

The PPM reader now throws an error, rather than segfaulting (due to a buffer overrun) or generating incorrect pixels, if an application attempts to use the tjLoadImage() function to load a 16-bit binary PPM file (a binary PPM file with a maximum value greater than 255) into a grayscale image buffer or to load a 16-bit binary PGM file into an RGB image buffer.
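The two rejected combinations can be expressed as a small check on the file type, the maximum sample value, and the destination buffer type. The sketch below is illustrative only and is not the tjLoadImage() implementation: `pnm_load_allowed` and `enum dst_kind` are hypothetical names, header comments ("# ...") are not handled, and every combination other than the two named above is simply accepted.

```c
#include <stdio.h>

/* Hypothetical destination categories for this sketch. */
enum dst_kind { DST_GRAY, DST_RGB };

/* Hypothetical helper, not the tjLoadImage() implementation. Parses a binary
 * PGM ("P5") or PPM ("P6") header from a NUL-terminated string.
 * Returns 1 if the load is allowed, 0 if it is one of the rejected 16-bit
 * combinations, or -1 if the header is malformed or not P5/P6. */
static int pnm_load_allowed(const char *header, enum dst_kind dst)
{
  char type;
  unsigned int width, height, maxval;
  int is_16bit, is_gray_file;

  if (sscanf(header, "P%c %u %u %u", &type, &width, &height, &maxval) != 4)
    return -1;
  if ((type != '5' && type != '6') || width == 0 || height == 0)
    return -1;
  if (maxval == 0 || maxval > 65535)
    return -1;

  is_16bit = (maxval > 255);        /* "16-bit" means maxval above 255 */
  is_gray_file = (type == '5');

  /* Reject 16-bit PPM -> grayscale buffer and 16-bit PGM -> RGB buffer. */
  if (is_16bit && is_gray_file != (dst == DST_GRAY))
    return 0;
  return 1;
}
```

Returning an error code here, rather than proceeding, mirrors the behavior change described above: the caller gets a clean failure instead of a crash or wrong pixels.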

Fixed an issue in the PPM reader that caused incorrect pixels to be generated when using the tjLoadImage() function to load a 16-bit binary PPM file into an extended RGB image buffer.

Fixed an issue whereby, if a JPEG buffer was automatically re-allocated by one of the TurboJPEG compression or transform functions and an error subsequently occurred during compression or transformation, the JPEG buffer pointer passed by the application was not updated when the function returned.
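The underlying pattern is that a function which may re-allocate a caller-owned buffer has to write the new pointer back on every exit path, including error paths. Here is a minimal sketch of that pattern under stated assumptions: `compress_into` is a hypothetical stand-in, not a TurboJPEG function, and the `simulate_error` parameter exists only to exercise the failure path.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a compression call; not a TurboJPEG function.
 * Grows *buf when it is smaller than `needed` bytes, then fails if
 * `simulate_error` is non-zero. *buf and *size are updated as soon as
 * realloc() succeeds, so the caller's pointer stays valid even when an error
 * is reported afterwards.
 * Returns 0 on success, -1 on failure. */
static int compress_into(unsigned char **buf, size_t *size, size_t needed,
                         int simulate_error)
{
  if (needed > *size) {
    unsigned char *grown = realloc(*buf, needed);
    if (grown == NULL)
      return -1;              /* *buf is unchanged and still valid */
    *buf = grown;             /* publish the new pointer immediately */
    *size = needed;
  }

  if (simulate_error)
    return -1;                /* error after re-allocation: *buf already updated */

  memset(*buf, 0, needed);    /* placeholder for writing compressed output */
  return 0;
}
```

With this shape, a caller can always free or reuse `*buf` after the call, whatever the return value. Without the write-back, the caller would be left holding a pointer that realloc() may already have released.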

2.0.90 (2.1 beta1)

Significant changes relative to 2.0.6:

The build system, x86-64 SIMD extensions, and accelerated Huffman codec now support the x32 ABI on Linux, which allows for using x86-64 instructions with 32-bit pointers. The x32 ABI is generally enabled by adding -mx32 to the compiler flags.

Caveats:

CMake 3.9.0 or later is required in order for the build system to automatically detect an x32 build.
Java does not support the x32 ABI, and thus the TurboJPEG Java API will automatically be disabled with x32 builds.

Added Loongson MMI SIMD implementations of the RGB-to-grayscale, 4:2:2 fancy chroma upsampling, 4:2:2 and 4:2:0 merged chroma upsampling/color conversion, and fast integer DCT/IDCT algorithms. Relative to libjpeg-turbo 2.0.x, this speeds up:

the compression of RGB source images into grayscale JPEG images by approximately 20%
the decompression of 4:2:2 JPEG images by approximately 40-60% when using fancy upsampling
the decompression of 4:2:2 and 4:2:0 JPEG images by approximately 15-20% when using merged upsampling
the compression of RGB source images by approximately 30-45% when using the fast integer DCT
the decompression of JPEG images into RGB destination images by approximately 2x when using the fast integer IDCT

The overall decompression speedup for RGB images is now approximately 2.3-3.7x (compared to 2-3.5x with libjpeg-turbo 2.0.x).

32-bit (Armv7 or Armv7s) iOS builds of libjpeg-turbo are no longer supported, and the libjpeg-turbo build system can no longer be used to package such builds. 32-bit iOS apps cannot run in iOS 11 and later, and the App Store no longer allows them.

32-bit (i386) OS X/macOS builds of libjpeg-turbo are no longer supported, and the libjpeg-turbo build system can no longer be used to package such builds. 32-bit Mac applications cannot run in macOS 10.15 "Catalina" and later, and the App Store no longer allows them.

The SSE2 (x86 SIMD) and C Huffman encoding algorithms have been significantly optimized, resulting in a measured average overall compression speedup of 12-28% for 64-bit code and 22-52% for 32-bit code on various Intel and AMD CPUs, as well as a measured average overall compression speedup of 0-23% on platforms that do not have a SIMD-accelerated Huffman encoding implementation.

The block smoothing algorithm that is applied by default when decompressing progressive Huffman-encoded JPEG images has been improved in the following ways:

The algorithm is now more fault-tolerant. Previously, if a particular scan was incomplete, then the smoothing parameters for the incomplete scan would be applied to the entire output image, including the parts of the image that were generated by the prior (complete) scan. Visually, this had the effect of removing block smoothing from lower-frequency scans if they were followed by an incomplete higher-frequency scan. libjpeg-turbo now applies block smoothing parameters to each iMCU row based on which scan generated the pixels in that row, rather than always using the block smoothing parameters for the most recent scan.
When applying block smoothing to DC scans, a Gaussian-like kernel with a 5x5 window is used to reduce the "blocky" appearance.

Added SIMD acceleration for progressive Huffman encoding on Arm platforms. This speeds up the compression of full-color progressive JPEGs by about 30-40% on average (relative to libjpeg-turbo 2.0.x) when using modern Arm CPUs.

Added configure-time and run-time auto-detection of Loongson MMI SIMD instructions, so that the Loongson MMI SIMD extensions can be included in any MIPS64 libjpeg-turbo build.

Added fault tolerance features to djpeg and jpegtran, mainly to demonstrate methods by which applications can guard against the exploits of the JPEG format described in the report "Two Issues with the JPEG Standard".

Both programs now accept a -maxscans argument, which can be used to limit the number of allowable scans in the input file.
Both programs now accept a -strict argument, which can be used to treat all warnings as fatal.

CMake package config files are now included for both the libjpeg and TurboJPEG API libraries. This facilitates using libjpeg-turbo with CMake's find_package() function. For example:

find_package(libjpeg-turbo CONFIG REQUIRED)

add_executable(libjpeg_program libjpeg_program.c)
target_link_libraries(libjpeg_program PUBLIC libjpeg-turbo::jpeg)

add_executable(libjpeg_program_static libjpeg_program.c)
target_link_libraries(libjpeg_program_static PUBLIC
  libjpeg-turbo::jpeg-static)

add_executable(turbojpeg_program turbojpeg_program.c)
target_link_libraries(turbojpeg_program PUBLIC
  libjpeg-turbo::turbojpeg)

add_executable(turbojpeg_program_static turbojpeg_program.c)
target_link_libraries(turbojpeg_program_static PUBLIC
  libjpeg-turbo::turbojpeg-static)

Since the Unisys LZW patent has long expired, cjpeg and djpeg can now read/write both LZW-compressed and uncompressed GIF files (feature ported from jpeg-6a and jpeg-9d).

jpegtran now includes the -wipe and -drop options from jpeg-9a and jpeg-9d, as well as the ability to expand the image size using the -crop option. Refer to jpegtran.1 or usage.txt for more details.

Added a complete intrinsics implementation of the Arm Neon SIMD extensions, thus providing SIMD acceleration on Arm platforms for all of the algorithms that are SIMD-accelerated on x86 platforms. This new implementation is significantly faster in some cases than the old GAS implementation, depending on the algorithms used, the type of CPU core, and the compiler. GCC, as of this writing, does not provide a full or optimal set of Neon intrinsics, so for performance reasons, the default when building libjpeg-turbo with GCC is to continue using the GAS implementation of the following algorithms:

32-bit RGB-to-YCbCr color conversion
32-bit fast and accurate inverse DCT
64-bit RGB-to-YCbCr and YCbCr-to-RGB color conversion
64-bit accurate forward and inverse DCT
64-bit Huffman encoding

A new CMake variable (NEON_INTRINSICS) can be used to override this default.

Since the new intrinsics implementation includes SIMD acceleration for merged upsampling/color conversion, 1.5.1[5] is no longer necessary and has been reverted.

The Arm Neon SIMD extensions can now be built using Visual Studio.

The build system can now be used to generate a universal x86-64 + Armv8 libjpeg-turbo SDK package for both iOS and macOS.

Date: 2021-04-26 08:18:48 +00:00
Renamed from graphics/libjpeg-turbo/patches/patch-simd_arm_jsimd.c