What's new - IMSL® Fortran Numerical Library 7.1
Performance improvements from CUDA BLAS integration
As GPU hardware becomes more prevalent in both research and commercial institutions, demand is growing for software that takes advantage of this specialized hardware. In many cases it is infeasible to rewrite an existing program to run entirely on the GPU, so the goal is often to offload as much work as possible. The IMSL Fortran Library offloads work from the CPU to NVIDIA GPU hardware, where the CUDA BLAS library is used. Users with supported hardware can link the IMSL Fortran Library with CUDA version 6.0 to gain significant performance improvements for many linear algebra functions. This parallels the way the library can link with the hardware-optimized Intel® Math Kernel Library (Intel® MKL) on the CPU. The calling sequences for IMSL functions are unchanged, so there is no learning curve and users can be productive immediately.
This graph shows the speedup of double precision matrix multiply (DGEMM) across several problem sizes (500 square to 8000 square) using the NVIDIA CUBLAS algorithm and Intel® MKL. Compared with pure Fortran code, execution on the GPU is over 800 times faster. Compared with hardware-optimized BLAS running on 4 CPU threads, performance improves by up to a factor of 16.
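As a minimal sketch of what "calling sequences are unchanged" means in practice: the standard BLAS DGEMM interface below is the same whether the program is linked against a CPU BLAS such as Intel® MKL or against the CUDA BLAS path on supported NVIDIA hardware. The program name and problem size here are illustrative, not from the product documentation.

```fortran
! Sketch: computing C = alpha*A*B + beta*C via the standard BLAS
! DGEMM calling sequence. The source code does not change when the
! underlying BLAS is swapped for a GPU-accelerated implementation;
! only the link step differs.
program dgemm_demo
  implicit none
  integer, parameter :: n = 1000
  double precision :: a(n,n), b(n,n), c(n,n)

  call random_number(a)
  call random_number(b)
  c = 0.0d0

  ! Standard BLAS DGEMM: no transpose, alpha = 1, beta = 0.
  call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)

  print *, 'c(1,1) =', c(1,1)
end program dgemm_demo
```

Because the interface is fixed by the BLAS standard, retargeting an existing application to the GPU is a build-time decision rather than a code change.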
The many updated algorithms in the IMSL Fortran Library version 7.1 provide unique numerical analysis techniques to customers in major corporations, academic institutions, and research laboratories worldwide. Upgrades include:
- The CUDA Toolkit 6.0 libraries are now supported
- The internally used ScaLAPACK mapping functions were improved
- The product is no longer license-managed for users who have purchased the product
- A number of code fixes and improvements are included in this release
- Supported platforms are updated to cover the latest operating systems, compilers, and chipsets
For more details, see the release notes.