FFTW 3.3.7:
* Experimental support for CMake.
The primary build mechanism for FFTW remains GNU autoconf/automake.
CMake support is meant to offer an easy way to compile FFTW on
Windows, and as such it does not cover all the features of the
automake build system, such as exotic cycle counters,
cross-compiling, or build of binaries for a mixture of ISA's
(e.g., amd64 vs amd64+avx vs amd64+avx2). Patches are welcome.
* Fixes for armv7a cycle counter.
* Official support for aarch64, now that we have hardware to test it.
* Tweak usage of FMA instructions in a way that favors newer processors
(Skylake and Ryzen) over older processors (Haswell).
* tests/bench: use 64-bit precision to compute mflops.
FFTW 3.3.6-pl2:
* Bugfix: MPI Fortran-03 headers were missing in FFTW 3.3.6-pl1.
* Reduced planning time in estimate mode for sizes with large prime factors.
* Added AVX autodetection under Visual Studio.
* Modern Fortran interface now uses a separate fftw3l.f03 interface file for
the long double interface, which is not supported by some Fortran compilers.
Provided new fftw3q.f03 interface file to access the quadruple-precision FFTW
routines with recent versions of gcc/gfortran.
* Added support for the NEON extensions to the ARM ISA.
* MPI code now compiles even if mpicc is a C++ compiler.
* Compiling OpenMP support (--enable-openmp) now installs a fftw3_omp library,
instead of fftw3_threads, so that OpenMP and POSIX threads (--enable-threads)
libraries can be built and installed at the same time.
* Various minor compilation fixes, corrections of manual typos, and
improvements to the benchmark test program.
* Add support for the AVX extensions to x86 and x86-64. The AVX code works with
16-byte alignment (as opposed to 32-byte alignment), so there is no ABI
change compared to FFTW 3.2.2.
* Added Fortran 2003 interface, which should be usable on most modern Fortran
compilers (e.g. gfortran) and provides type-checked access to the the C FFTW
interface. (The legacy Fortran-77 interface is still included also.)
* Added MPI distributed-memory transforms. Compared to 3.3alpha, the major
changes in the MPI transforms are:
* Fixed some deadlock and crashing bugs.
* Added Fortran 2003 interface.
* Added new-array execute functions for MPI plans.
* Eliminated use of large MPI tags, since Cray MPI requires tags < 224.
* Expanded documentation.
* make check now runs MPI tests
* Some ABI changes — not binary-compatible with 3.3alpha MPI.
* Add support for quad-precision __float128 in gcc 4.6 or later (on x86.
x86-64, and Itanium). The new routines use the fftwq_ prefix.
* Temporarily removed MIPS paired-single support due to lack of available
hardware for testing. We hope to add it back before the final FFTW 3.3
release; meanwhile, users who want this functionality should continue using
FFTW 3.2.x.
* Removed support for the Cell Broadband Engine. Cell users should use FFTW
3.2.x.
* New convenience functions fftw_alloc_real and fftw_alloc_complex to use
fftw_malloc for real and complex arrays without typecasts or sizeof.
All library names listed by *.la files no longer need to be listed
in the PLIST, e.g., instead of:
lib/libfoo.a
lib/libfoo.la
lib/libfoo.so
lib/libfoo.so.0
lib/libfoo.so.0.1
one simply needs:
lib/libfoo.la
and bsd.pkg.mk will automatically ensure that the additional library
names are listed in the installed package +CONTENTS file.
Also make LIBTOOLIZE_PLIST default to "yes".
* Some speed improvements in SIMD code.
* --without-cycle-counter option is removed. If no cycle counter is found,
then the estimator is always used. A --with-slow-timer option is provided
to force the use of lower-resolution timers.
* Added missing static keyword that prevented simultaneous linkage
of different-precision versions; thanks to Rasmus Larson for the bug report.
* Corrected accidental omission of f77_wisdom.f file; thanks to Alan Watson.
* Removed non-portable use of 'tempfile' in fftw-wisdom-to-conf script;
thanks to Nicolas Decoster for the patch.
* Added 'make smallcheck' target in tests/ directory, at the request of
James Treacy.
Major goals of this release:
* Speed: often 20% or more faster than FFTW 2.x, even without SIMD (see below).
* Complete rewrite, to make it easier to add new algorithms and transforms.
* New API, to support more general semantics.
Other enhancements:
* SIMD acceleration on supporting CPUs (SSE, SSE2, 3DNow!, and AltiVec).
(With special thanks to Franz Franchetti for many experimental prototypes
and to Stefan Kral for the vectorizing generator from fftwgel.)
* True in-place 1d transforms of large sizes (as well as compressed
twiddle tables for additional memory/cache savings).
* More arbitrary placement of real & imaginary data, e.g. including
interleaved (as in FFTW 2.x) as well as separate real/imag arrays.
* Efficient prime-size transforms of real data.
* Multidimensional transforms can operate on a subset of a larger matrix,
and/or transform selected dimensions of a multidimensional array.
* By popular demand, simultaneous linking to double precision (fftw),
single precision (fftwf), and long-double precision (fftwl) versions
of FFTW is now supported.
* Cycle counters (on all modern CPUs) are exploited to speed planning.
* Efficient transforms of real even/odd arrays, a.k.a. discrete
cosine/sine transforms (types I-IV). (Currently work via pre/post
processing of real transforms, ala FFTPACK, so are not optimal.)
* DHTs (Discrete Hartley Transforms), again via post-processing
of real transforms (and thus suboptimal, for now).
* Support for linking to just those parts of FFTW that you need,
greatly reducing the size of statically linked programs when
only a limited set of transform sizes/types are required.
* Canonical global wisdom file (/etc/fftw/wisdom) on Unix, along
with a command-line tool (fftw-wisdom) to generate/update it.
* Fortran API can be used with both g77 and non-g77 compilers
simultaneously.
* Multi-threaded version has optional OpenMP support.
* Authors' good looks have greatly improved with age.
Summary of changes:
- removal of USE_GTEXINFO
- addition of mk/texinfo.mk
- inclusion of this file in package Makefiles requiring it
- `install-info' substituted by `${INSTALL_INFO}' in PLISTs
- tuning of mk/bsd.pkg.mk:
removal of USE_GTEXINFO
INSTALL_INFO added to PLIST_SUBST
`${INSTALL_INFO}' replace `install-info' in target rules
print-PLIST target now generate `${INSTALL_INFO}' instead of `install-info'
- a couple of new patch files added for a handful of packages
- setting of the TEXINFO_OVERRIDE "switch" in packages Makefiles requiring it
- devel/cssc marked requiring texinfo 4.0
- a couple of packages Makefiles were tuned with respect of INFO_FILES and
makeinfo command usage
See -newly added by this commit- section 10.24 of Packages.txt for
further information.