* Linear Least Squares / Minimum Norm solution
* Symmetric-indefinite Factorization: Aasen’s tridiagonalization
* Symmetric-indefinite Factorization: New storage format for the L factor in the Rook Pivoting and Bunch-Kaufman variants of LDLT
* Symmetric eigenvalue problem: Two-stage algorithm for reduction to tridiagonal form
* Improved Complex Jacobi SVD
* LAPACKE interfaces
LAPACK 3.6.1: What’s new
[Mark Gates, UTK] blocked back-transformation for the non-symmetric eigenvalue problem
It combines NB gemv calls into a single gemm call inside trevc. Doing
so requires a new routine, trevc3, because trevc does not take lwork
as an argument. Attached is the performance speedup for dgeev: a
1.5x speedup for N=20000, which appears to still be increasing with N.
This does not include the improvements that Greg Henry recently
provided, which perform the triangular solves as BLAS-3 instead of
BLAS-1. Those will take a while to process, but we expect another,
even larger increase in performance when they are applied. It also
does not include running multiple (BLAS-1) triangular solves in
parallel, which is available in MAGMA, since that requires OpenMP
or pthreads.
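
The blocking idea can be illustrated with a small sketch. This is not the trevc3 code: the functions `gemv`, `gemm`, `back_transform_unblocked`, and `back_transform_blocked` are hypothetical pure-Python stand-ins for the BLAS-2 and BLAS-3 operations, and the matrices are made-up data. It only shows that NB matrix-vector products against the same matrix give the same result as one matrix-matrix product on the NB columns taken together.

```python
# Illustrative sketch only: pure-Python stand-ins for BLAS gemv/gemm.
# None of these names come from LAPACK; they are assumptions for the example.

def gemv(A, x):
    """Matrix-vector product y = A * x (BLAS-2 style)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def gemm(A, B):
    """Matrix-matrix product C = A * B (BLAS-3 style)."""
    cols = list(zip(*B))  # columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def back_transform_unblocked(V, X):
    """Apply V to each column of X separately: NB gemv calls."""
    nb = len(X[0])
    out_cols = [gemv(V, [row[j] for row in X]) for j in range(nb)]
    # Reassemble the result columns into a row-major matrix.
    return [list(r) for r in zip(*out_cols)]

def back_transform_blocked(V, X):
    """Apply V to all NB columns of X at once: a single gemm call."""
    return gemm(V, X)

V = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
X = [[1, 0],
     [2, 1],
     [0, 5]]  # NB = 2 columns

unblocked = back_transform_unblocked(V, X)
blocked = back_transform_blocked(V, X)
print(unblocked == blocked)
```

The arithmetic is identical in both paths. In an optimized BLAS, the gain comes from the gemm form reusing each loaded block of the matrix across all NB columns, at the cost of workspace to hold those columns, which is why a workspace size argument is needed.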
* Added Symmetric/Hermitian LDLT factorization routines with the rook pivoting algorithm
* 2-by-1 CSD to be used for tall and skinny matrices with orthonormal columns (in LAPACK 3.4.0, we already integrated the CSD of a full square orthogonal matrix)
* New stopping criteria for balancing.