A b o u t   M e      |       P r o j e c t s     |       N o t e s       |       T h e   D a y    ︎ ︎

# 1. Localizing Eigenvalues

• It is useful when we want to know the rough location of eigenvalues of a matrix.
• Gershgorin’s Theorem says that all the eigenvalues of a ntimes n matrix A is within n disks. The k-th disk of these n disks is centered at a_{kk} and its radius is sum_{j ne k}|a_{kj}| where a_{ij} is (i, j) element of A.
• Suppose that  x is the eigenvector of A and lambda is the eigenvalue of x such that ||x||_{infty}=1. Then there exists x_k=1 in x.
\begin{align} Ax=\lambda x &=  \begin{pmatrix} \  & \  &  \vdots \\ a_{k1} & \cdots & a_{kk} & \cdots & a_{kn} \\ \  &\  & \vdots \end{pmatrix}  \begin{pmatrix} x_1 \\ \vdots \\ x_k \\ \vdots \\ \ x_n \end{pmatrix} = \lambda  \begin{pmatrix} x_1 \\ \vdots \\ x_k \\ \vdots \\ \ x_n \end{pmatrix} \\ \\ &\Rightarrow a_{k1}x_1 + \cdots + a_{kk}x_k + \cdots + a_{kn}x_n = \lambda x_k \\ \\ &\Rightarrow (\lambda - a_{kk})x_k = \lambda - a_{kk} = \sum_{j \ne k} a_{kj}x_j \\ \\ &\Rightarrow | \lambda - a_{kk} | = \left| \sum_{j \ne k} a_{kj}x_j \right| \leq \sum_{j \ne k} |a_{kj}||x_j| \leq \sum_{j \ne k} |a_{kj}| = \text{radius} \end{align}
• For example, the eigenvalues of A_1 and A_2 are as follows:
\begin{align} A_1= \begin{pmatrix} \color{blue}{4.0} & -0.5 & 0.0 \\ 0.6 & \color{blue}{5.0} & -0.6 \\ 0.0 & 0.5 & \color{blue}{3.0} \end{pmatrix}, \quad A_2= \begin{pmatrix} \color{red}{4.0} & 0.5 & 0.0 \\ 0.6 & \color{red}{5.0} & 0.6 \\ 0.0 & 0.5 & \color{red}{3.0} \end{pmatrix} \end{align}

A_1 and A_2 have the same diagonal elements and sum of absolute values in each row except the diagonal element, so their disks are the same. As above, the blue circles are the eigenvalues of A_1 and the red circles are those of A_2.

# 2. Sensitivity

• It is about the sensitivity of the eigenvectors and eigenvalues to small changes in a ntimes n matrix A.
• Suppose that a ntimes n matrix X=(x_{(1)} cdots x_{(n)}) where x_{(i)} is the eigenvector of A and ntimes n diagonal matrix D=diag(lambda_1 cdots lambda_n) where lambda_i is the eigenvalue corresponding to x_{(i)}. Then AX=XD.
• When all the eigenvectors of A are nondefective, which means they are linearly independent, X^{-1}AX=D. After A goes through some small changes, assume that A turns into A+E where E denotes errors. Then X^{-1}(A+I)X=X^{-1}AX+X^{-1}EX=D+F. Since A+E and D+F are similar, they must have the same eigenvalues. Let mu be a eigenvalue of A+E, then for the eigenvector v corresponding to mu,
\begin{align} (D+F)v=\mu v \Rightarrow v=(\mu I-D)^{-1}Fv.    \end{align}
For this to be true, (mu I-D) should be nonsingular which happens when all its  diagonal elements are not zero. It means that any elements of D should not have mu. Otherwise, D has the eigenvalue mu, so F=0 and E=0 which means there are no erros. However, we consider E ne 0, so (mu I-D) is nonsingular. Accordingly,
\begin{align} \left\| v \right\|_2 \leq \left\| (\mu I-D)^{-1} \right\|_2 \left\| F \right\|_2 \left\| v \right\|_2 \Rightarrow \left\| (\mu I-D)^{-1} \right\|_2^{-1} \leq \left\| F \right\|_2 \end{align}
By definition, || (mu I-D)^{-1} ||_2 is the largest singluar value. Therefore || (mu I-D)^{-1} ||_2=frac{1}{|mu - lambda_k |} where lambda_k is the eigenvalue of D closest to mu.
\begin{align} \left\| (\mu I-D)^{-1} \right\|_2^{-1} &= \left|  \mu - \lambda_k \right| \leq \left\| F \right\|_2 = \left\| X^{-1}EX \right\|_2 \\ &\leq \left\| X^{-1} \right\|_2 \left\| E \right\|_2 \left\| X \right\|_2 = \text{condition number}(X) \left\| E \right\|_2 \end{align}
So, the effect from a small change of A depends on the condition number of X, not the condition number of A.
• Another way to check the sensitivity of eigenvalues and eigenvectors including even when A is defective is using right and left eigenvectors together. Let x and y be right and left eigenvectors, then there exist lambda and mu such that Ax=lambda x and y^tA=mu y^t. It yields that y^tAx=lambda y^tx=mu y^tx, so lambda=mu or y^tx=0. Assume that y^tx ne 0. If A has been changed by an error E and x and lambda are also changed, then
\begin{align} (A+E)(x+\Delta x) &= (\lambda+\Delta \lambda)(x+\Delta x) \\ \\ Ax+Ex+A\Delta x+E\Delta x &= \lambda x+\Delta \lambda x+\lambda \Delta x + \Delta \lambda \Delta x \\ \\ Ex+A\Delta x+E\Delta x &= \Delta \lambda x+\lambda \Delta x + \Delta \lambda \Delta x \\ \\ Ex+A\Delta x &\approx \Delta \lambda x+\lambda \Delta x \end{align}
because E Delta x and Delta lambda Delta x are small enough and negligible. Since lambda=mu, we can add y as follows:
\begin{align}  y^tEx+ y^t A\Delta x &\approx \Delta \lambda y^t x+ \lambda y^t\Delta x \\ \\ y^tEx &\approx \Delta \lambda y^t x \\ \\\Delta \lambda &\approx \frac{y^tEx}{y^tx} \\ \\\left| \Delta \lambda \right|&\lessapprox \frac{\left\| y \right\|_2 \left\| x \right\|_2}{\left\| y^tx \right\|_2} \left\| E \right\|_2 = \frac{1}{\cos \theta} \left\| E \right\|_2 \end{align}
where theta is the angle between x and y. So, it is sensitive as theta increases.
• For example,
\begin{align} A &= \begin{pmatrix} -149 & -50 & -154 \\ 537 & 180 & 546 \\ -27 & -9 & -25 \end{pmatrix} \\ \\ \text{eigenvalues: } &\lambda_1=1, \lambda_2=2, \lambda_3=3 \\ \\ \text{normalized right eigenvectors: } &X=\begin{pmatrix} x_{(1)} & x_{(2)} & x_{(3)} \end{pmatrix}= \begin{pmatrix} 0.316 & 0.404 & 0.139 \\ -0.949 & -0.909 & -0.974 \\ 0.000 & -0.101 & 0.179 \end{pmatrix} \\ \\ \text{normalized left eigenvectors: } &Y=\begin{pmatrix} y_{(1)} & y_{(2)} & y_{(3)} \end{pmatrix}=\begin{pmatrix} 0.681 & -0.676 & -0.688 \\ 0.225 & -0.225 & -0.229 \\ 0.697 & -0.701 & -0.688 \end{pmatrix} \end{align}
First, the condition number of X is 1289, so the eigenvalues of A are sensitive. Second, y_{(1)}^tx_ {(1)} =0.0017, y_ {(2)} ^tx_ {(2)} =0.0025, and y_ {(3)} ^tx_ {(3)} =0.0046, so the angles between x_ {(i)}  and y_ {(i)}  are large. Therefore, it is expected that A is sensitive to small changes. The following are eigenvalues changed from tiny changes of A.
\begin{align} A+E &=\begin{pmatrix} -149 & -50 & -154 \\ 537 & \color{red}{180.01} & 546 \\ -27 & -9 & -25 \end{pmatrix} \Rightarrow \begin{cases} \lambda_1=0.207, \\ \lambda_2=2.301, \\\lambda_3=3.502 \end{cases} \\ \\ A+E &= \begin{pmatrix} -149 & -50 & -154 \\ 537 & \color{red}{179.99} & 546 \\ -27 & -9 & -25 \end{pmatrix} \Rightarrow \begin{cases} \lambda_1=1.664+1.054i, \\ \lambda_2=1.664-1.054i, \\ \lambda_3=2.662 \end{cases} \end{align}
It shows the large changes of eigenvalues about the small changes of A.

# 3. Properties

• Suppose that x is the eigenvector of a ntimes n matrix A and lambda is the eigenvalue corresponding to x.
• A has the eigenvalue which is zero.
Leftrightarrow There exists x ne 0 such that Ax=0
Leftrightarrow dim Ker A > 0  Leftrightarrow A is singular
• For alpha ne 0 in mathbb{R}, alpha x is also the eigenvector of A.
• x is the eigenvector of alpha A for alpha in mathbb{R}, and its eigenvalue is alpha lambda.
• x is the eigenvector of A+alpha I for alpha in mathbb{R}, and its eigenvalue is  lambda +alpha.
• For k in mathbb{N}, x is the eigenvector of A^k, and its eigenvalue is lambda^k.
• If A is invertible, x is the eigenvector of A^{-1}, and its eigenvalue is frac{1}{lambda}.
• For a diagonal matrix D=diag(a_1 cdots a_n), its eigenvalues are a_1, cdots, a_n and its eigenvectors are e_1, cdots, e_n where all components of a standard basis vector e_i in mathbb{R^n} are 0 except the i-th element which is 1.
• For upper or lower triangular matrices, the eigenvalues are their diagonal elements.
• For a ntimes n nonsingular matrix S, S^{-1}x is the eigenvector of S^{-1}AS and the eigenvalue corresponding to S^{-1}x is lambda. Eigenvalues are not changed by similar transformations.
• det A=lambda_1 cdots lambda_n.
• For k leq n, if eigenvalues lambda_1, cdots, lambda_k are distint, the eigenvectors x_{(1)}, cdots, x_{(n)} corresponding to them are linearly independent.
• If all the eigenvalues are distint, all the eigenvectors are linearly independent, so X=(x_{(1)}, cdots, x_{(n)}) is nonsingular and diagonalizable as X^{-1}AX=diag(lambda_1 cdots lambda_n). However, although X is diagonalizable, its all eigenvalues may not be distint such A=I. Moreover, although the eigenvalues of A are not unique, A may be diagonalizable. For example,
\begin{align} A &=\begin{pmatrix} 3 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{pmatrix} ,\begin{cases} \lambda_1=2, \\\lambda_2=3, \\\lambda_3=3 \end{cases}\\ \\ &\Rightarrow X= \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} \text{: nonsingular}\end{align}
• Singular values of A are the nonnegative square roots of eigenvalues of A^tA.

︎Reference

 Michael T. Heath, Scientific Computing: An Introductory Survey. 2nd Edition, McGraw-Hill Higher Education.
 Hiraoka Kazuyuki, Hori Gen, Programming No Tame No Senkei Daisu, Ohmsha.

emoy.net