08b3e08a91
LAPACK 3.6.1: What’s new. [Mark Gates, UTK] Blocked back-transformation for the non-symmetric eigenvalue problem. It blocks NB gemv calls into one gemm call inside trevc. To do that, it needs a new routine, trevc3, because lwork was not passed into trevc. Attached is the performance speedup for dgeev. It gives a 1.5x speedup for N=20000, and the speedup appears to still be increasing with N. This does not include the improvements that Greg Henry recently provided for doing the triangular solves as BLAS-3 instead of BLAS-1. Those will take a while to process, but we expect another, even larger increase in performance when those changes are applied. This also does not include doing multiple (BLAS-1) triangular solves in parallel, which is available in MAGMA, since that requires OpenMP or pthreads. |
||
---|---|---|
patch-aa | ||
patch-ac | ||
patch-ad | ||
patch-BLAS_SRC_Makefile |
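
The commit message above describes blocking NB gemv calls into one gemm call. The following is a minimal pure-Python sketch of that general idea only. It is not the trevc3 code, and the helper names (`gemv`, `gemm`, `back_transform_unblocked`, `back_transform_blocked`) and the parameter `nb` are hypothetical names chosen for illustration. It assumes the back-transformation can be pictured as multiplying a matrix `V` by a set of column vectors stored in `X`.

```python
def gemv(A, x):
    """Matrix-vector product y = A x (BLAS-2 style); A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def gemm(A, B):
    """Matrix-matrix product C = A B (BLAS-3 style); A and B are lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def back_transform_unblocked(V, X):
    """Illustrative only: one gemv per column of X, so V is traversed once per column."""
    columns = [gemv(V, list(col)) for col in zip(*X)]
    # Transpose the list of result columns back into a list of rows.
    return [list(row) for row in zip(*columns)]


def back_transform_blocked(V, X, nb):
    """Illustrative only: group nb columns of X and apply V to each group with one gemm."""
    n_cols = len(X[0])
    result = [[] for _ in V]
    for start in range(0, n_cols, nb):
        # Panel of up to nb columns of X.
        panel = [row[start:start + nb] for row in X]
        block = gemm(V, panel)
        for out_row, block_row in zip(result, block):
            out_row.extend(block_row)
    return result


if __name__ == "__main__":
    V = [[1, 2], [3, 4]]
    X = [[1, 0, 2], [0, 1, 1]]
    print(back_transform_unblocked(V, X))
    print(back_transform_blocked(V, X, nb=2))
```

Both functions compute the same product. The blocked form makes fewer, larger calls, which is what lets an optimized BLAS-3 gemm reuse data in cache. In this pure-Python sketch there is no speed difference to observe, only the restructuring of the calls.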