I have a matrix that I know is triangular, and I want to diagonalize it. Since the eigenvalues of a triangular matrix are easy to find (they are just the diagonal entries), this problem is theoretically exactly solvable, so I figured that using numpy.linalg.eig might be overkill and might be slower or less accurate than it could be. That is, of course, unless eig is smart enough to work efficiently with a triangular matrix, but I couldn't find any documentation that assures this.
Technically I COULD use numpy.linalg.solve to find the eigenvectors by solving (A - lambda I)x = 0 for each eigenvalue lambda, but I have a decent number of matrices (~200), each with a decent number of eigenvalues (also ~200), so I suspect this wouldn't be optimal either. A rough sketch of what I mean is below.
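For concreteness, here is roughly the per-eigenvalue calculation I have in mind, as a sketch only: since A - lambda I is singular, numpy.linalg.solve can't be applied to it directly, so this version back-substitutes instead, and it assumes an upper triangular A with distinct diagonal entries:

import numpy as np

def upper_tri_eig(A):
    """Sketch: eigenvalues/eigenvectors of an upper triangular A with
    distinct diagonal entries, by per-eigenvalue back substitution.
    No safeguards against overflow or nearly-equal eigenvalues."""
    n = A.shape[0]
    lam = np.diag(A).copy()
    V = np.zeros((n, n), dtype=np.result_type(A.dtype, np.float64))
    for k in range(n):
        V[k, k] = 1.0
        # Solve rows k-1 .. 0 of (A - lam[k] * I) @ x = 0 for x[j],
        # given x[k] = 1 and x[j] = 0 for j > k.
        for j in range(k - 1, -1, -1):
            s = A[j, j + 1:k + 1] @ V[j + 1:k + 1, k]
            V[j, k] = -s / (A[j, j] - lam[k])
        V[:, k] /= np.linalg.norm(V[:, k])
    return lam, V

Each eigenvector costs O(n^2) here, so the whole thing is O(n^3) in pure Python loops, which is part of why I doubt it beats a library call.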
> Since the eigenvalues of a triangular matrix are easy to find (just the diagonal values), this problem is theoretically exactly solvable
I don't know if that really helps you. All of the ways I can think of for efficiently calculating eigenvectors also give you the eigenvalues for free.
Honestly, I would suggest just using numpy.linalg.eig(A). My testing showed that it runs about 10x-20x faster when A is upper triangular than when A is a full (dense) matrix, so it already benefits from the structure.
Reading the source code, the LAPACK driver (as built in OpenBLAS) appears to run the QR algorithm to reduce the matrix to upper triangular Schur form, then use the strevc3_ LAPACK routine to find the eigenvectors from that triangular form. That QR-based reduction is going to be very fast for a matrix that is already upper triangular.
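One way to see why: a matrix that is already upper triangular is its own QR factorization (with Q = I), so LAPACK's Householder reflections degenerate to the identity and the reduction has almost nothing to do. A quick sanity check of that claim:

import numpy as np
A = np.triu(np.random.normal(size=(5, 5)))
Q, R = np.linalg.qr(A)
# For triangular input, Householder QR should be a no-op:
print(np.allclose(Q, np.eye(5)), np.allclose(R, A))  # expect: True True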
Benchmark:
# Run in IPython/Jupyter: %timeit is an IPython magic.
import numpy as np
import scipy.linalg

np.random.seed(42)

for N in [100, 1000]:
    print(f"trying matrices of size {N}")

    print("full matrix")
    A = np.random.normal(size=(N, N))
    %timeit np.linalg.eig(A)
    %timeit np.linalg.qr(A)
    %timeit scipy.linalg.lu_factor(A)

    print("upper tri")
    A = np.triu(np.random.normal(size=(N, N)))
    %timeit np.linalg.eig(A)
    %timeit np.linalg.qr(A)
    %timeit scipy.linalg.lu_factor(A)
Results (within each section, the three timings correspond to eig, qr, and lu_factor, in that order):
trying matrices of size 100
full matrix
11.7 ms ± 56.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
737 µs ± 1.55 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
91.7 µs ± 854 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
upper tri
438 µs ± 1.39 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
94.5 µs ± 1.78 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
81.6 µs ± 1.36 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
trying matrices of size 1000
full matrix
1.31 s ± 22.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
94 ms ± 393 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
18.4 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
upper tri
157 ms ± 1.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
80.1 ms ± 946 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
16.7 ms ± 425 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
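If accuracy is the concern, you can also check eig against the exact answer that the triangular structure gives you (using the upper triangular A from the benchmark above): the returned eigenvalues should match the diagonal of A, and the residual of the eigenvector equation should be small relative to the norm of A.

lam, V = np.linalg.eig(A)
# Eigenvalues of a triangular matrix are exactly its diagonal entries:
print(np.max(np.abs(np.sort(lam) - np.sort(np.diag(A)))))
# Residual of A @ v = lam * v over all eigenpairs (columns of V):
print(np.max(np.abs(A @ V - V * lam)))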
As an alternative, you might think about calling strevc3_ directly. However, I couldn't find a simple way to do this. Although both SciPy and NumPy include a strevc3_ (single precision) or dtrevc3 (double precision) routine internally, neither exposes a Python-accessible wrapper for it.