pkgsrc/graphics/libjpeg-turbo/Makefile

29 lines
746 B
Makefile
Raw Normal View History

Updated libjpeg-turbo to 1.5.0. 1.5.0 ===== ### Significant changes relative to 1.5 beta1: 1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory. 2. Added libjpeg-turbo version and build information to the global string table of the libjpeg and TurboJPEG API libraries. This is a common practice in other infrastructure libraries, such as OpenSSL and libpng, because it makes it easy to examine an application binary and determine which version of the library the application was linked against. 3. Fixed a couple of issues in the PPM reader that would cause buffer overruns in cjpeg if one of the values in a binary PPM/PGM input file exceeded the maximum value defined in the file's header. libjpeg-turbo 1.4.2 already included a similar fix for ASCII PPM/PGM files. Note that these issues were not security bugs, since they were confined to the cjpeg program and did not affect any of the libjpeg-turbo libraries. 4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt header using the `tjDecompressToYUV2()` function would cause the function to abort without returning an error and, under certain circumstances, corrupt the stack. This only occurred if `tjDecompressToYUV2()` was called prior to calling `tjDecompressHeader3()`, or if the return value from `tjDecompressHeader3()` was ignored (both cases represent incorrect usage of the TurboJPEG API.) 5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that prevented the code from assembling properly with clang. 6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and `jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a source/destination manager has already been assigned to the compress or decompress object by a different function or by the calling program. This prevents these functions from attempting to reuse a source/destination manager structure that was allocated elsewhere, because there is no way to ensure that it would be big enough to accommodate the new source/destination manager. 1.4.90 (1.5 beta1) ================== ### Significant changes relative to 1.4.2: 1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX (128-bit SIMD) instructions. Although the performance of libjpeg-turbo on PowerPC was already good, due to the increased number of registers available to the compiler vs. x86, it was still possible to speed up compression by about 3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the use of AltiVec instructions. 2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and `jpeg_crop_scanline()`) that can be used to partially decode a JPEG image. See [libjpeg.txt](libjpeg.txt) for more details. 3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now implement the Closeable interface, so those classes can be used with a try-with-resources statement. 4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions (IllegalArgumentException, IllegalStateException) for unrecoverable errors caused by incorrect API usage, and those classes throw a new checked exception type (TJException) for errors that are passed through from the C library. 5. Source buffers for the TurboJPEG C API functions, as well as the `jpeg_mem_src()` function in the libjpeg API, are now declared as const pointers. This facilitates passing read-only buffers to those functions and ensures the caller that the source buffer will not be modified. This should not create any backward API or ABI incompatibilities with prior libjpeg-turbo releases. 6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1 FPUs. 7. Fixed additional negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. Most of these issues affected only 32-bit code, and none of them was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future. 8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code. This directive was preventing the code from assembling using the clang integrated assembler. 9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora distributions. This was due to the addition of a macro in jconfig.h that allows the Huffman codec to determine the word size at compile time. Since that macro differs between 32-bit and 64-bit builds, this caused a conflict between the i386 and x86_64 RPMs (any differing files, other than executables, are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.) Since the macro is used only internally, it has been moved into jconfigint.h. 10. The x86-64 SIMD code can now be disabled at run time by setting the `JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations already had this capability.) 11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the benchmark from outputting any images. This removes any potential operating system overhead that might be caused by lazy writes to disk and thus improves the consistency of the performance measurements. 12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64 platforms. This speeds up the compression of full-color JPEGs by about 10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD CPUs. Additionally, this works around an issue in the clang optimizer that prevents it (as of this writing) from achieving the same performance as GCC when compiling the C version of the Huffman encoder (<https://llvm.org/bugs/show_bug.cgi?id=16035>). For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used compression algorithms (including the slow integer forward DCT and h2v2 & h2v1 downsampling algorithms, which are not accelerated in the 32-bit NEON implementation.) This speeds up the compression of full-color JPEGs by about 75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on Cortex-A53 and Cortex-A57 cores. 14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit and 64-bit platforms. For 32-bit code, this speeds up the compression of full-color JPEGs by about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), relative to libjpeg-turbo 1.4.x. Note that the larger speedup under iOS is due to the fact that iOS builds use LLVM, which does not optimize the C Huffman encoder as well as GCC does. For 64-bit code, NEON-accelerated Huffman encoding speeds up the compression of full-color JPEGs by about 40% on average on a typical iOS device (iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in [13] above. For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 15. pkg-config (.pc) scripts are now included for both the libjpeg and TurboJPEG API libraries on Un*x systems. Note that if a project's build system relies on these scripts, then it will not be possible to build that project with libjpeg or with a prior version of libjpeg-turbo. 16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to improve performance on CPUs with in-order pipelines. This speeds up the decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX processor and by about 15% on average on a Cortex-A53 core. 17. Fixed an issue in the accelerated Huffman decoder that could have caused the decoder to read past the end of the input buffer when a malformed, specially-crafted JPEG image was being decompressed. In prior versions of libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only if there were > 128 bytes of data in the input buffer. However, it is possible to construct a JPEG image in which a single Huffman block is over 430 bytes long, so this version of libjpeg-turbo activates the accelerated Huffman decoder only if there are > 512 bytes of data in the input buffer. 18. Fixed a memory leak in tjunittest encountered when running the program with the `-yuv` option. 1.4.2 ===== ### Significant changes relative to 1.4.1: 1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a negative width or height was used as an input image (Windows bitmaps can have a negative height if they are stored in top-down order, but such files are rare and not supported by libjpeg-turbo.) 2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would incorrectly encode certain JPEG images when quality=100 and the fast integer forward DCT were used. This was known to cause `make test` to fail when the library was built with `-march=haswell` on x86 systems. 3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest & greatest development version of the Clang/LLVM compiler. This was caused by an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD routines. Those routines were incorrectly using a 64-bit `mov` instruction to transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper (unused) 32 bits of a 32-bit argument's register to be undefined. The new Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit structure members into a single 64-bit register, and this exposed the ABI conformance issue. 4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged) upsampling routine that caused a buffer overflow (and subsequent segfault) when decompressing a 4:2:0 JPEG image whose scaled output width was less than 16 pixels. The "plain" upsampling routines are normally only used when decompressing a non-YCbCr JPEG image, but they are also used when decompressing a JPEG image whose scaled output height is 1. 5. Fixed various negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. None of these was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future.
2016-06-14 14:07:57 +02:00
# $NetBSD: Makefile,v 1.14 2016/06/14 12:07:57 wiz Exp $
Updated libjpeg-turbo to 1.5.0. 1.5.0 ===== ### Significant changes relative to 1.5 beta1: 1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory. 2. Added libjpeg-turbo version and build information to the global string table of the libjpeg and TurboJPEG API libraries. This is a common practice in other infrastructure libraries, such as OpenSSL and libpng, because it makes it easy to examine an application binary and determine which version of the library the application was linked against. 3. Fixed a couple of issues in the PPM reader that would cause buffer overruns in cjpeg if one of the values in a binary PPM/PGM input file exceeded the maximum value defined in the file's header. libjpeg-turbo 1.4.2 already included a similar fix for ASCII PPM/PGM files. Note that these issues were not security bugs, since they were confined to the cjpeg program and did not affect any of the libjpeg-turbo libraries. 4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt header using the `tjDecompressToYUV2()` function would cause the function to abort without returning an error and, under certain circumstances, corrupt the stack. This only occurred if `tjDecompressToYUV2()` was called prior to calling `tjDecompressHeader3()`, or if the return value from `tjDecompressHeader3()` was ignored (both cases represent incorrect usage of the TurboJPEG API.) 5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that prevented the code from assembling properly with clang. 6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and `jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a source/destination manager has already been assigned to the compress or decompress object by a different function or by the calling program. This prevents these functions from attempting to reuse a source/destination manager structure that was allocated elsewhere, because there is no way to ensure that it would be big enough to accommodate the new source/destination manager. 1.4.90 (1.5 beta1) ================== ### Significant changes relative to 1.4.2: 1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX (128-bit SIMD) instructions. Although the performance of libjpeg-turbo on PowerPC was already good, due to the increased number of registers available to the compiler vs. x86, it was still possible to speed up compression by about 3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the use of AltiVec instructions. 2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and `jpeg_crop_scanline()`) that can be used to partially decode a JPEG image. See [libjpeg.txt](libjpeg.txt) for more details. 3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now implement the Closeable interface, so those classes can be used with a try-with-resources statement. 4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions (IllegalArgumentException, IllegalStateException) for unrecoverable errors caused by incorrect API usage, and those classes throw a new checked exception type (TJException) for errors that are passed through from the C library. 5. Source buffers for the TurboJPEG C API functions, as well as the `jpeg_mem_src()` function in the libjpeg API, are now declared as const pointers. This facilitates passing read-only buffers to those functions and ensures the caller that the source buffer will not be modified. This should not create any backward API or ABI incompatibilities with prior libjpeg-turbo releases. 6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1 FPUs. 7. Fixed additional negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. Most of these issues affected only 32-bit code, and none of them was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future. 8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code. This directive was preventing the code from assembling using the clang integrated assembler. 9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora distributions. This was due to the addition of a macro in jconfig.h that allows the Huffman codec to determine the word size at compile time. Since that macro differs between 32-bit and 64-bit builds, this caused a conflict between the i386 and x86_64 RPMs (any differing files, other than executables, are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.) Since the macro is used only internally, it has been moved into jconfigint.h. 10. The x86-64 SIMD code can now be disabled at run time by setting the `JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations already had this capability.) 11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the benchmark from outputting any images. This removes any potential operating system overhead that might be caused by lazy writes to disk and thus improves the consistency of the performance measurements. 12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64 platforms. This speeds up the compression of full-color JPEGs by about 10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD CPUs. Additionally, this works around an issue in the clang optimizer that prevents it (as of this writing) from achieving the same performance as GCC when compiling the C version of the Huffman encoder (<https://llvm.org/bugs/show_bug.cgi?id=16035>). For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used compression algorithms (including the slow integer forward DCT and h2v2 & h2v1 downsampling algorithms, which are not accelerated in the 32-bit NEON implementation.) This speeds up the compression of full-color JPEGs by about 75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on Cortex-A53 and Cortex-A57 cores. 14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit and 64-bit platforms. For 32-bit code, this speeds up the compression of full-color JPEGs by about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), relative to libjpeg-turbo 1.4.x. Note that the larger speedup under iOS is due to the fact that iOS builds use LLVM, which does not optimize the C Huffman encoder as well as GCC does. For 64-bit code, NEON-accelerated Huffman encoding speeds up the compression of full-color JPEGs by about 40% on average on a typical iOS device (iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in [13] above. For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 15. pkg-config (.pc) scripts are now included for both the libjpeg and TurboJPEG API libraries on Un*x systems. Note that if a project's build system relies on these scripts, then it will not be possible to build that project with libjpeg or with a prior version of libjpeg-turbo. 16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to improve performance on CPUs with in-order pipelines. This speeds up the decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX processor and by about 15% on average on a Cortex-A53 core. 17. Fixed an issue in the accelerated Huffman decoder that could have caused the decoder to read past the end of the input buffer when a malformed, specially-crafted JPEG image was being decompressed. In prior versions of libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only if there were > 128 bytes of data in the input buffer. However, it is possible to construct a JPEG image in which a single Huffman block is over 430 bytes long, so this version of libjpeg-turbo activates the accelerated Huffman decoder only if there are > 512 bytes of data in the input buffer. 18. Fixed a memory leak in tjunittest encountered when running the program with the `-yuv` option. 1.4.2 ===== ### Significant changes relative to 1.4.1: 1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a negative width or height was used as an input image (Windows bitmaps can have a negative height if they are stored in top-down order, but such files are rare and not supported by libjpeg-turbo.) 2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would incorrectly encode certain JPEG images when quality=100 and the fast integer forward DCT were used. This was known to cause `make test` to fail when the library was built with `-march=haswell` on x86 systems. 3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest & greatest development version of the Clang/LLVM compiler. This was caused by an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD routines. Those routines were incorrectly using a 64-bit `mov` instruction to transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper (unused) 32 bits of a 32-bit argument's register to be undefined. The new Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit structure members into a single 64-bit register, and this exposed the ABI conformance issue. 4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged) upsampling routine that caused a buffer overflow (and subsequent segfault) when decompressing a 4:2:0 JPEG image whose scaled output width was less than 16 pixels. The "plain" upsampling routines are normally only used when decompressing a non-YCbCr JPEG image, but they are also used when decompressing a JPEG image whose scaled output height is 1. 5. Fixed various negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. None of these was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future.
2016-06-14 14:07:57 +02:00
DISTNAME= libjpeg-turbo-1.5.0
CATEGORIES= graphics
MASTER_SITES= ${MASTER_SITE_SOURCEFORGE:=libjpeg-turbo/}
MAINTAINER= dsainty@NetBSD.org
HOMEPAGE= http://libjpeg-turbo.virtualgl.org/
COMMENT= Accelerated libjpeg with SIMD instructions
LICENSE= gnu-lgpl-v2 # OR wxWindows Library Licence, Version 3.1
CONFLICTS= jpeg-[0-9]*
.if ${MACHINE_ARCH} == x86_64 || ${MACHINE_ARCH} == i386
2010-12-12 13:36:39 +01:00
BUILD_DEPENDS+= nasm-[0-9]*:../../devel/nasm
.endif
2010-12-12 13:36:39 +01:00
GNU_CONFIGURE= yes
# compatibility with pkgsrc/graphics/jpeg
CONFIGURE_ARGS+= --with-jpeg8
USE_LIBTOOL= yes
USE_LANGUAGES= c c++
Updated libjpeg-turbo to 1.5.0. 1.5.0 ===== ### Significant changes relative to 1.5 beta1: 1. Fixed an issue whereby a malformed motion-JPEG frame could cause the "fast path" of libjpeg-turbo's Huffman decoder to read from uninitialized memory. 2. Added libjpeg-turbo version and build information to the global string table of the libjpeg and TurboJPEG API libraries. This is a common practice in other infrastructure libraries, such as OpenSSL and libpng, because it makes it easy to examine an application binary and determine which version of the library the application was linked against. 3. Fixed a couple of issues in the PPM reader that would cause buffer overruns in cjpeg if one of the values in a binary PPM/PGM input file exceeded the maximum value defined in the file's header. libjpeg-turbo 1.4.2 already included a similar fix for ASCII PPM/PGM files. Note that these issues were not security bugs, since they were confined to the cjpeg program and did not affect any of the libjpeg-turbo libraries. 4. Fixed an issue whereby attempting to decompress a JPEG file with a corrupt header using the `tjDecompressToYUV2()` function would cause the function to abort without returning an error and, under certain circumstances, corrupt the stack. This only occurred if `tjDecompressToYUV2()` was called prior to calling `tjDecompressHeader3()`, or if the return value from `tjDecompressHeader3()` was ignored (both cases represent incorrect usage of the TurboJPEG API.) 5. Fixed an issue in the ARM 32-bit SIMD-accelerated Huffman encoder that prevented the code from assembling properly with clang. 6. The `jpeg_stdio_src()`, `jpeg_mem_src()`, `jpeg_stdio_dest()`, and `jpeg_mem_dest()` functions in the libjpeg API will now throw an error if a source/destination manager has already been assigned to the compress or decompress object by a different function or by the calling program. This prevents these functions from attempting to reuse a source/destination manager structure that was allocated elsewhere, because there is no way to ensure that it would be big enough to accommodate the new source/destination manager. 1.4.90 (1.5 beta1) ================== ### Significant changes relative to 1.4.2: 1. Added full SIMD acceleration for PowerPC platforms using AltiVec VMX (128-bit SIMD) instructions. Although the performance of libjpeg-turbo on PowerPC was already good, due to the increased number of registers available to the compiler vs. x86, it was still possible to speed up compression by about 3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the use of AltiVec instructions. 2. Added two new libjpeg API functions (`jpeg_skip_scanlines()` and `jpeg_crop_scanline()`) that can be used to partially decode a JPEG image. See [libjpeg.txt](libjpeg.txt) for more details. 3. The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now implement the Closeable interface, so those classes can be used with a try-with-resources statement. 4. The TurboJPEG Java classes now throw unchecked idiomatic exceptions (IllegalArgumentException, IllegalStateException) for unrecoverable errors caused by incorrect API usage, and those classes throw a new checked exception type (TJException) for errors that are passed through from the C library. 5. Source buffers for the TurboJPEG C API functions, as well as the `jpeg_mem_src()` function in the libjpeg API, are now declared as const pointers. This facilitates passing read-only buffers to those functions and ensures the caller that the source buffer will not be modified. This should not create any backward API or ABI incompatibilities with prior libjpeg-turbo releases. 6. The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1 FPUs. 7. Fixed additional negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. Most of these issues affected only 32-bit code, and none of them was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future. 8. Removed the unnecessary `.arch` directive from the ARM64 NEON SIMD code. This directive was preventing the code from assembling using the clang integrated assembler. 9. Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora distributions. This was due to the addition of a macro in jconfig.h that allows the Huffman codec to determine the word size at compile time. Since that macro differs between 32-bit and 64-bit builds, this caused a conflict between the i386 and x86_64 RPMs (any differing files, other than executables, are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.) Since the macro is used only internally, it has been moved into jconfigint.h. 10. The x86-64 SIMD code can now be disabled at run time by setting the `JSIMD_FORCENONE` environment variable to `1` (the other SIMD implementations already had this capability.) 11. Added a new command-line argument to TJBench (`-nowrite`) that prevents the benchmark from outputting any images. This removes any potential operating system overhead that might be caused by lazy writes to disk and thus improves the consistency of the performance measurements. 12. Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and x86-64 platforms. This speeds up the compression of full-color JPEGs by about 10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and AMD CPUs. Additionally, this works around an issue in the clang optimizer that prevents it (as of this writing) from achieving the same performance as GCC when compiling the C version of the Huffman encoder (<https://llvm.org/bugs/show_bug.cgi?id=16035>). For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used compression algorithms (including the slow integer forward DCT and h2v2 & h2v1 downsampling algorithms, which are not accelerated in the 32-bit NEON implementation.) This speeds up the compression of full-color JPEGs by about 75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on Cortex-A53 and Cortex-A57 cores. 14. Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit and 64-bit platforms. For 32-bit code, this speeds up the compression of full-color JPEGs by about 30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by about 6-7% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), relative to libjpeg-turbo 1.4.x. Note that the larger speedup under iOS is due to the fact that iOS builds use LLVM, which does not optimize the C Huffman encoder as well as GCC does. For 64-bit code, NEON-accelerated Huffman encoding speeds up the compression of full-color JPEGs by about 40% on average on a typical iOS device (iPhone 5S, Apple A7) and by about 7-8% on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57), in addition to the speedup described in [13] above. For the purposes of benchmarking or regression testing, SIMD-accelerated Huffman encoding can be disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`. 15. pkg-config (.pc) scripts are now included for both the libjpeg and TurboJPEG API libraries on Un*x systems. Note that if a project's build system relies on these scripts, then it will not be possible to build that project with libjpeg or with a prior version of libjpeg-turbo. 16. Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to improve performance on CPUs with in-order pipelines. This speeds up the decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX processor and by about 15% on average on a Cortex-A53 core. 17. Fixed an issue in the accelerated Huffman decoder that could have caused the decoder to read past the end of the input buffer when a malformed, specially-crafted JPEG image was being decompressed. In prior versions of libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only if there were > 128 bytes of data in the input buffer. However, it is possible to construct a JPEG image in which a single Huffman block is over 430 bytes long, so this version of libjpeg-turbo activates the accelerated Huffman decoder only if there are > 512 bytes of data in the input buffer. 18. Fixed a memory leak in tjunittest encountered when running the program with the `-yuv` option. 1.4.2 ===== ### Significant changes relative to 1.4.1: 1. Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a negative width or height was used as an input image (Windows bitmaps can have a negative height if they are stored in top-down order, but such files are rare and not supported by libjpeg-turbo.) 2. Fixed an issue whereby, under certain circumstances, libjpeg-turbo would incorrectly encode certain JPEG images when quality=100 and the fast integer forward DCT were used. This was known to cause `make test` to fail when the library was built with `-march=haswell` on x86 systems. 3. Fixed an issue whereby libjpeg-turbo would crash when built with the latest & greatest development version of the Clang/LLVM compiler. This was caused by an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD routines. Those routines were incorrectly using a 64-bit `mov` instruction to transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper (unused) 32 bits of a 32-bit argument's register to be undefined. The new Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit structure members into a single 64-bit register, and this exposed the ABI conformance issue. 4. Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged) upsampling routine that caused a buffer overflow (and subsequent segfault) when decompressing a 4:2:0 JPEG image whose scaled output width was less than 16 pixels. The "plain" upsampling routines are normally only used when decompressing a non-YCbCr JPEG image, but they are also used when decompressing a JPEG image whose scaled output height is 1. 5. Fixed various negative left shifts and other issues reported by the GCC and Clang undefined behavior sanitizers. None of these was known to pose a security threat, but removing the warnings makes it easier to detect actual security issues, should they arise in the future.
2016-06-14 14:07:57 +02:00
PKGCONFIG_OVERRIDE= release/libturbojpeg.pc.in release/libjpeg.pc.in
TEST_TARGET= test
.include "../../mk/bsd.pkg.mk"