Change Log
2011-10-12: version 1.4.682 (Download)
- The previous release was wrongly marked as version 1.3.319 on some websites promoting open-source software, so this new version starts from 1.4.x to avoid any confusion.
- MAJOR NEW FEATURE - new experimental API to support overlap of communications and computations in applications. More details here.
- Much improved IO library with refactored code, new functions and bug fixes. Please note there are minor adjustments of the parameter lists of several I/O routines. Refer to the IO API page for more details.
- Much improved halo-cell communication code, now supporting arbitrary global data sizes, data structures defined using global coordinates, and periodic boundary conditions.
- More sample applications.
- Many bug fixes and minor improvements.
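
For readers unfamiliar with the halo-cell item above, here is a minimal serial sketch of what a periodic halo exchange produces. It is not the library's API: the function name `add_periodic_halo` and the list-of-blocks layout are hypothetical, and real code would exchange the cells by message passing between neighbouring processes. The blocks are deliberately of unequal size, and `width` is assumed to be no larger than the smallest block.

```python
def add_periodic_halo(blocks, width=1):
    """Pad each 1D block with `width` halo cells copied from its neighbours.

    The first and last blocks are treated as neighbours, which is what a
    periodic boundary condition means for the halo cells.
    Assumes every block holds at least `width` points.
    """
    n = len(blocks)
    padded = []
    for i, block in enumerate(blocks):
        left = blocks[(i - 1) % n][-width:]   # last cells of the left neighbour
        right = blocks[(i + 1) % n][:width]   # first cells of the right neighbour
        padded.append(left + block + right)
    return padded


# A global array of 8 points split unevenly over 3 blocks.
blocks = [[0, 1, 2], [3, 4, 5], [6, 7]]
for padded_block in add_periodic_halo(blocks):
    print(padded_block)
```

The first block receives the value 7 on its left and the last block receives 0 on its right, because the domain wraps around.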
2011-07-08: version 1.1.319 (Download)
- Reintroduced the IBM ESSL implementation of the FFT library, to be used on IBM hardware such as Blue Gene systems and other PowerPC-based machines where ESSL is available.
- A new FFTW implementation using the Fortran 2003 interface provided by the latest FFTW 3.3-beta1. The old implementation using the legacy Fortran interface remains. The main benefit of the new Fortran 2003 interface is the guaranteed memory alignment, which may offer a performance improvement on certain hardware/compiler combinations, although this is not seen on my test hardware.
- A new sample application that cross-checks the parallel FFT result against P3DFFT.
2011-06-12: version 1.1.273 (Download)
- Better handling of the MPI buffers using global data structures in the 2D decomposition library, resulting in a significant speed-up (more than 20% on a Cray XE6) for all communication code.
- A special branch of code to optimise the FFT performance when a 1D decomposition is in use as a special case of the 2D decomposition, again resulting in a significant speed-up. Read this note if you want to parallelise applications using the 1D decomposition.
- Optimisation of several FFT engines by using in-place transforms when computing the underlying 1D FFTs.
- Introduced an option that allows overwriting the FFT input. This can reduce the memory footprint of the library and may improve performance.
2011-05-04: version 1.0.246 (Download)
- Initial public release.
- General-purpose 2D pencil decomposition module.
- Distributed 3D Fast Fourier Transform.
- Halo-cell support allowing explicit message passing between neighbouring blocks.
- Parallel I/O module using MPI-IO to handle the input/output of data sets.
- System V IPC shared-memory optimisation of the communication code.
- Interface with most popular external FFT libraries: FFTW, ACML, MKL, FFTE and FFTPACK.
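
As a rough illustration of the 2D pencil decomposition listed above, the sketch below computes which part of a global 3D grid one process would own in an x-pencil. It is not the library's interface: the names `split_1d` and `x_pencil`, the 0-based half-open index ranges, and the rule for distributing leftover points are all assumptions made for this example. It also assumes the x-pencil keeps the x dimension whole while y is split over `p_row` and z over `p_col`.

```python
def split_1d(n, parts):
    """Split n points into `parts` contiguous blocks.

    Returns a list of (start, end) pairs, 0-based with `end` exclusive.
    Leftover points go to the first blocks (an assumption of this sketch).
    """
    base, extra = divmod(n, parts)
    ranges = []
    start = 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def x_pencil(global_shape, p_row, p_col, row, col):
    """Index ranges owned by process (row, col) in an x-pencil layout."""
    nx, ny, nz = global_shape
    y_ranges = split_1d(ny, p_row)   # y is divided among the process rows
    z_ranges = split_1d(nz, p_col)   # z is divided among the process columns
    return ((0, nx), y_ranges[row], z_ranges[col])


# An 8x8x8 grid on a 2x2 process grid: process (1, 0) owns a 8x4x4 pencil.
print(x_pencil((8, 8, 8), 2, 2, 1, 0))

# With p_row = 1 the same code yields slabs, i.e. a 1D decomposition.
print(x_pencil((8, 8, 8), 1, 4, 0, 2))
```

Setting `p_row` to 1 shows why a 1D (slab) decomposition can be treated as a special case of the 2D one: every process then holds full x-y planes and only z is divided.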