Some Facts about Eigenvalues

1. Localizing Eigenvalues

  • It is useful when we want to know the rough location of the eigenvalues of a matrix.
  • Gershgorin’s Theorem says that all the eigenvalues of an `n times n` matrix `A` lie within the union of `n` disks. The `k`-th of these `n` disks is centered at `a_{kk}` and its radius is `sum_{j ne k}|a_{kj}|`, where `a_{ij}` is the `(i, j)` entry of `A`.
  • To see why, suppose that `x` is an eigenvector of `A` with corresponding eigenvalue `lambda`, scaled so that `||x||_{infty}=1`. Then, after multiplying `x` by a suitable scalar, some component of `x` satisfies `x_k=1`.
\begin{align} Ax &= 
\begin{pmatrix} \  & \  & \vdots & \  & \  \\ a_{k1} & \cdots & a_{kk} & \cdots & a_{kn} \\ \  & \  & \vdots & \  & \  \end{pmatrix} 
\begin{pmatrix} x_1 \\ \vdots \\ x_k \\ \vdots \\ x_n \end{pmatrix} = \lambda 
\begin{pmatrix} x_1 \\ \vdots \\ x_k \\ \vdots \\ x_n \end{pmatrix} \\ \\ &\Rightarrow
a_{k1}x_1 + \cdots + a_{kk}x_k + \cdots + a_{kn}x_n = \lambda x_k \\ \\ &\Rightarrow
(\lambda - a_{kk})x_k = \lambda - a_{kk} = \sum_{j \ne k} a_{kj}x_j \\ \\ &\Rightarrow
\left| \lambda - a_{kk} \right| = \left| \sum_{j \ne k} a_{kj}x_j \right| \leq \sum_{j \ne k} |a_{kj}||x_j| \leq \sum_{j \ne k} |a_{kj}| = \text{radius}
\end{align}
  • For example, consider the following matrices `A_1` and `A_2`:
\begin{align} A_1=
\begin{pmatrix} \color{blue}{4.0} & -0.5 & 0.0 \\ 0.6 & \color{blue}{5.0} & -0.6 \\ 0.0 & 0.5 & \color{blue}{3.0} \end{pmatrix}, \quad A_2=
\begin{pmatrix} \color{red}{4.0} & 0.5 & 0.0 \\ 0.6 & \color{red}{5.0} & 0.6 \\ 0.0 & 0.5 & \color{red}{3.0} \end{pmatrix}
\end{align}

   `A_1` and `A_2` have the same diagonal elements and the same sum of absolute values of the off-diagonal elements in each row, so their Gershgorin disks are identical. In the plot of these disks, the blue circles mark the eigenvalues of `A_1` and the red circles those of `A_2`.
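As a quick numerical check, the following sketch (assuming NumPy; the helper `gershgorin_disks` is ours, not a library function) computes the disks of `A_1` and `A_2` and verifies that every eigenvalue falls inside at least one of them:

```python
# Minimal sketch of Gershgorin's theorem, assuming NumPy is available.
import numpy as np

def gershgorin_disks(A):
    """Return (center, radius) pairs: center a_kk, radius sum_{j != k} |a_kj|."""
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return list(zip(centers, radii))

A1 = np.array([[4.0, -0.5, 0.0], [0.6, 5.0, -0.6], [0.0, 0.5, 3.0]])
A2 = np.array([[4.0,  0.5, 0.0], [0.6, 5.0,  0.6], [0.0, 0.5, 3.0]])

# Both matrices yield the same disks: centers 4, 5, 3 with radii 0.5, 1.2, 0.5.
disks = gershgorin_disks(A1)
assert gershgorin_disks(A2) == disks

# Every eigenvalue lies in at least one disk.
for A in (A1, A2):
    for lam in np.linalg.eigvals(A):
        assert any(abs(lam - c) <= r + 1e-9 for c, r in disks)
```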

2. Sensitivity

  • This concerns how sensitive the eigenvalues and eigenvectors of an `n times n` matrix `A` are to small changes in `A`.
  • Suppose that `X=(x_{(1)} cdots x_{(n)})` is the `n times n` matrix whose columns `x_{(i)}` are eigenvectors of `A`, and `D=diag(lambda_1 cdots lambda_n)` is the `n times n` diagonal matrix where `lambda_i` is the eigenvalue corresponding to `x_{(i)}`. Then `AX=XD`.
  • When `A` is nondefective, its eigenvectors are linearly independent, so `X` is nonsingular and `X^{-1}AX=D`. Now suppose some small change turns `A` into `A+E`, where `E` denotes the error. Then `X^{-1}(A+E)X=X^{-1}AX+X^{-1}EX=D+F` where `F=X^{-1}EX`. Since `A+E` and `D+F` are similar, they must have the same eigenvalues. Let `mu` be an eigenvalue of `A+E`; then for the eigenvector `v` of `D+F` corresponding to `mu`,
\begin{align} (D+F)v=\mu v \Rightarrow v=(\mu I-D)^{-1}Fv
\end{align}
   For this to hold, `(mu I-D)` must be nonsingular, which happens when none of its diagonal elements is zero, that is, when `mu` equals no diagonal element of `D`. Otherwise `mu` would already be an eigenvalue of `D` itself, so the perturbed eigenvalue would coincide with an original one and there would be nothing to bound. Since we consider `E ne 0` with `mu` distinct from every `lambda_i`, `(mu I-D)` is nonsingular. Accordingly,
\begin{align} \left\| v \right\|_2 \leq \left\| (\mu I-D)^{-1} \right\|_2 \left\| F \right\|_2 \left\| v \right\|_2 \Rightarrow \left\| (\mu I-D)^{-1} \right\|_2^{-1} \leq \left\| F \right\|_2
\end{align}
   By definition, `|| (mu I-D)^{-1} ||_2` is the largest singular value of `(mu I-D)^{-1}`. Since `(mu I-D)` is diagonal, `|| (mu I-D)^{-1} ||_2=frac{1}{| mu - lambda_k |}` where `lambda_k` is the eigenvalue of `D` closest to `mu`.
\begin{align} \left\| (\mu I-D)^{-1} \right\|_2^{-1} &= \left|  \mu - \lambda_k \right| \leq \left\| F \right\|_2 = \left\| X^{-1}EX \right\|_2 \\ &\leq \left\| X^{-1} \right\|_2 \left\| E \right\|_2 \left\| X \right\|_2 = \text{condition number}(X) \left\| E \right\|_2
\end{align}
   So, the effect of a small change in `A` on its eigenvalues depends on the condition number of `X`, not on the condition number of `A`.
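This bound for nondefective `A` (the Bauer–Fike theorem) can be checked numerically. The sketch below uses an arbitrary random test matrix and a tiny random perturbation, both purely illustrative assumptions:

```python
# Sketch: every eigenvalue mu of A+E is within cond(X) * ||E||_2 of some
# eigenvalue of A. The matrix and perturbation are arbitrary test choices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))          # generically nondefective
lam, X = np.linalg.eig(A)                # columns of X are right eigenvectors
E = 1e-6 * rng.standard_normal((4, 4))   # a tiny perturbation
mu = np.linalg.eigvals(A + E)            # perturbed eigenvalues

bound = np.linalg.cond(X, 2) * np.linalg.norm(E, 2)
# Each perturbed eigenvalue lies within `bound` of some eigenvalue of A.
for m in mu:
    assert np.min(np.abs(m - lam)) <= bound
```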
  • Another way to check the sensitivity of eigenvalues and eigenvectors, which works even when `A` is defective, is to use right and left eigenvectors together. Let `x` and `y` be right and left eigenvectors of `A`, so there exist `lambda` and `mu` such that `Ax=lambda x` and `y^tA=mu y^t`. This yields `y^tAx=lambda y^tx=mu y^tx`, so `lambda=mu` or `y^tx=0`. Assume that `y^tx ne 0`, so `lambda=mu`. If `A` is changed by an error `E`, and `x` and `lambda` change accordingly, then
\begin{align} (A+E)(x+\Delta x) &= (\lambda+\Delta \lambda)(x+\Delta x) \\ \\
Ax+Ex+A\Delta x+E\Delta x &= \lambda x+\Delta \lambda x+\lambda \Delta x + \Delta \lambda \Delta x \\ \\
Ex+A\Delta x+E\Delta x &= \Delta \lambda x+\lambda \Delta x + \Delta \lambda \Delta x \\ \\
Ex+A\Delta x &\approx \Delta \lambda x+\lambda \Delta x
\end{align}
   because `E Delta x` and `Delta lambda Delta x` are second-order terms, small enough to be negligible. Since `lambda=mu`, multiplying on the left by `y^t` (and using `y^tA=lambda y^t`) gives:
\begin{align} y^tEx+ y^t A\Delta x &\approx \Delta \lambda y^t x+ \lambda y^t\Delta x \\ \\
y^tEx &\approx \Delta \lambda y^t x \\ \\
\Delta \lambda &\approx \frac{y^tEx}{y^tx} \\ \\
\left| \Delta \lambda \right| &\lessapprox \frac{\left\| y \right\|_2 \left\| x \right\|_2}{\left| y^tx \right|} \left\| E \right\|_2 = \frac{1}{\cos \theta} \left\| E \right\|_2
\end{align}
   where `theta` is the angle between `x` and `y`. So the eigenvalue becomes more sensitive as `theta` increases toward `pi/2`.
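The first-order estimate `Delta lambda approx y^tEx / y^tx` can be sketched numerically. Everything below is an illustrative assumption: a small arbitrary 2×2 matrix and perturbation, with NumPy assumed available.

```python
# Sketch: first-order eigenvalue perturbation estimate vs. exact recomputation.
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])
lam, X = np.linalg.eig(A)                # right eigenvectors (columns of X)
lamT, Y = np.linalg.eig(A.T)             # left eigenvectors of A = right eigenvectors of A^T

x = X[:, np.argmin(np.abs(lam - 2.0))]   # right eigenvector for lambda = 2
y = Y[:, np.argmin(np.abs(lamT - 2.0))]  # left eigenvector for lambda = 2

E = 1e-4 * np.array([[2.0, 1.0],
                     [1.0, -1.0]])       # a small, arbitrary perturbation
est = (y @ E @ x) / (y @ x)              # first-order estimate of Delta lambda

mu = np.linalg.eigvals(A + E)
actual = mu[np.argmin(np.abs(mu - 2.0))] - 2.0
assert abs(est - actual) < 1e-6          # they agree up to second-order terms
```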
  • For example, 
\begin{align} A &=
\begin{pmatrix} -149 & -50 & -154 \\ 537 & 180 & 546 \\ -27 & -9 & -25 \end{pmatrix} \\ \\
\text{eigenvalues: } &\lambda_1=1, \lambda_2=2, \lambda_3=3 \\ \\
\text{normalized right eigenvectors: } &X=
\begin{pmatrix} x_{(1)} & x_{(2)} & x_{(3)} \end{pmatrix}=
\begin{pmatrix} 0.316 & 0.404 & 0.139 \\ -0.949 & -0.909 & -0.974 \\ 0.000 & -0.101 & 0.179 \end{pmatrix} \\ \\
\text{normalized left eigenvectors: } &Y=
\begin{pmatrix} y_{(1)} & y_{(2)} & y_{(3)} \end{pmatrix}=
\begin{pmatrix} 0.681 & -0.676 & -0.688 \\ 0.225 & -0.225 & -0.229 \\ 0.697 & -0.701 & -0.688 \end{pmatrix}
\end{align}
   First, the condition number of `X` is `1289`, so the eigenvalues of `A` are sensitive. Second, `y_{(1)}^tx_{(1)}=0.0017`, `y_{(2)}^tx_{(2)}=0.0025`, and `y_{(3)}^tx_{(3)}=0.0046`, so the angles between `x_{(i)}` and `y_{(i)}` are close to `pi/2`. Therefore, `A` is expected to be sensitive to small changes. The following are the eigenvalues that result from tiny changes to `A`.
\begin{align} A+E &=
\begin{pmatrix} -149 & -50 & -154 \\ 537 & \color{red}{180.01} & 546 \\ -27 & -9 & -25 \end{pmatrix} \Rightarrow
\begin{cases} \lambda_1=0.207, \\ \lambda_2=2.301, \\ \lambda_3=3.502 \end{cases} \\ \\ A+E &=
\begin{pmatrix} -149 & -50 & -154 \\ 537 & \color{red}{179.99} & 546 \\ -27 & -9 & -25 \end{pmatrix} \Rightarrow
\begin{cases} \lambda_1=1.664+1.054i, \\ \lambda_2=1.664-1.054i, \\ \lambda_3=2.662 \end{cases}
\end{align}
   This shows how small changes in `A` can produce large changes in its eigenvalues.
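This example can be reproduced numerically. The sketch below (assuming NumPy) perturbs the `(2, 2)` entry by `0.01` in each direction and checks that the eigenvalues move far from `1, 2, 3`, and even become complex:

```python
# Sketch reproducing the example above: a 0.01 change in one entry of A
# moves the eigenvalues by O(1) and can make them complex.
import numpy as np

A = np.array([[-149.0,  -50.0, -154.0],
              [ 537.0,  180.0,  546.0],
              [ -27.0,   -9.0,  -25.0]])
# Unperturbed eigenvalues are 1, 2, 3.
assert np.allclose(np.sort(np.linalg.eigvals(A).real), [1.0, 2.0, 3.0], atol=1e-6)

Ap = A.copy(); Ap[1, 1] += 0.01          # 180 -> 180.01
Am = A.copy(); Am[1, 1] -= 0.01          # 180 -> 179.99

mu_p = np.linalg.eigvals(Ap)
mu_m = np.linalg.eigvals(Am)

# The perturbed eigenvalues are far from the originals ...
assert np.max(np.abs(np.sort(mu_p.real) - np.array([1.0, 2.0, 3.0]))) > 0.1
# ... and the second perturbation produces a complex-conjugate pair.
assert np.max(np.abs(mu_m.imag)) > 1.0
```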

3. Properties

  • Suppose that `x` is an eigenvector of an `n times n` matrix `A` and `lambda` is the eigenvalue corresponding to `x`.
  • `A` has a zero eigenvalue.
    `Leftrightarrow` There exists `x ne 0` such that `Ax=0`
    `Leftrightarrow` `dim Ker A > 0`  `Leftrightarrow` `A` is singular
  • For `alpha ne 0 in mathbb{R}`, `alpha x` is also an eigenvector of `A`, with the same eigenvalue `lambda`.
  • `x` is the eigenvector of `alpha A` for `alpha in mathbb{R}`, and its eigenvalue is `alpha lambda`.
  • `x` is the eigenvector of `A+alpha I` for `alpha in mathbb{R}`, and its eigenvalue is ` lambda +alpha`.
  • For `k in mathbb{N}`, `x` is the eigenvector of `A^k`, and its eigenvalue is `lambda^k`.
  • If `A` is invertible, `x` is the eigenvector of `A^{-1}`, and its eigenvalue is `frac{1}{lambda}`.
  • For a diagonal matrix `D=diag(a_1 cdots a_n)`, its eigenvalues are `a_1, cdots, a_n` and its eigenvectors are `e_1, cdots, e_n` where all components of a standard basis vector `e_i in mathbb{R^n}` are `0` except the `i`-th element which is `1`.
  • For upper or lower triangular matrices, the eigenvalues are their diagonal elements.
  • For an `n times n` nonsingular matrix `S`, `S^{-1}x` is an eigenvector of `S^{-1}AS` and the eigenvalue corresponding to `S^{-1}x` is `lambda`: eigenvalues are not changed by similarity transformations.
  • `det A=lambda_1 cdots lambda_n`.
  • For `k leq n`, if the eigenvalues `lambda_1, cdots, lambda_k` are distinct, the eigenvectors `x_{(1)}, cdots, x_{(k)}` corresponding to them are linearly independent.
  • If all the eigenvalues are distinct, all the eigenvectors are linearly independent, so `X=(x_{(1)}, cdots, x_{(n)})` is nonsingular and `A` is diagonalizable as `X^{-1}AX=diag(lambda_1 cdots lambda_n)`. However, a diagonalizable matrix need not have all distinct eigenvalues; consider `A=I`. That is, even when the eigenvalues of `A` are not all distinct, `A` may still be diagonalizable. For example,
\begin{align} A &=
\begin{pmatrix} 3 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{pmatrix} , \quad
\begin{cases} \lambda_1=2, \\ \lambda_2=\lambda_3=3 \end{cases} \\ \\
&\Rightarrow X=
\begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \text{: nonsingular}
\end{align}
  • Singular values of `A` are the nonnegative square roots of eigenvalues of `A^tA`.
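Several of the properties above can be spot-checked numerically. The sketch below (assuming NumPy; the random test matrix is an arbitrary illustrative choice) verifies the scaling, shift, power, inverse, determinant, and singular-value facts:

```python
# Sketch: spot-checking eigenvalue properties on a random test matrix.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
lam, X = np.linalg.eig(A)
x, l = X[:, 0], lam[0]                   # one eigenpair of A

assert np.allclose(A @ x, l * x)                              # A x = lambda x
assert np.allclose((3.0 * A) @ x, (3.0 * l) * x)              # alpha A -> alpha lambda
assert np.allclose((A + 2.0 * np.eye(3)) @ x, (l + 2.0) * x)  # A + alpha I -> lambda + alpha
assert np.allclose(A @ A @ x, l**2 * x)                       # A^k -> lambda^k
assert np.allclose(np.linalg.inv(A) @ x, (1.0 / l) * x)       # A^{-1} -> 1/lambda
assert np.isclose(np.linalg.det(A), np.prod(lam).real)        # det A = prod of eigenvalues

sv = np.linalg.svd(A, compute_uv=False)                       # singular values, descending
ev = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]               # eigenvalues of A^T A, descending
assert np.allclose(sv, np.sqrt(ev))                           # sigma_i = sqrt(eig_i(A^T A))
```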

