LAPACK Working Note 172: Benefits of IEEE-754 Features in Modern Symmetric Tridiagonal Eigensolvers
Marques, Osni A.
Riedy, E. Jason
Technical Report Identifier: CSD-05-1414
September 30, 2005
Abstract: Bisection is one of the most common methods used to compute the eigenvalues of symmetric tridiagonal matrices. Bisection relies on the Sturm count: for a given shift sigma, the number of negative pivots in the factorization T - sigma I = LDL^T equals the number of eigenvalues of T that are smaller than sigma. In IEEE-754 arithmetic, the value infinity permits the computation to continue past a zero pivot, producing a correct Sturm count when T is unreduced. Demmel and Li showed in the 90s that using infinity rather than testing for zero pivots within the loop could improve performance significantly on certain architectures.
When eigenvalues are to be computed to high relative accuracy, it is often preferable to work with LDL^T factorizations instead of the original tridiagonal T, see for example the MRRR algorithm. In these cases, the Sturm count has to be computed from LDL^T. The differential stationary and progressive qds algorithms are the methods of choice.
While it seems trivial to replace T by LDL^T, in reality these algorithms are more complicated: in IEEE-754 arithmetic, a zero pivot produces an overflow, followed by an invalid exception (NaN), that renders the Sturm count incorrect.
We present alternative, safe formulations that are guaranteed to produce the correct result.
Benchmarking these algorithms on a variety of platforms shows that the original formulation without tests is always faster provided no exception occurs. The transforms see speed-ups of up to 2.6 times over the careful formulations.
Tests on industrial matrices show that encountering exceptions in practice is rare. This leads to the following design: First, compute the Sturm count by the fast but unsafe algorithm. Then, if an exception occurred, recompute the count by a safe, slower alternative.
The improved Sturm count algorithms improve the speed of bisection by up to 2 times on our test matrices. Furthermore, unlike the traditional tiny-pivot substitution, proper use of IEEE-754 features provides a careful formulation that imposes no input range restrictions.