Small Residuals Do Not Imply Small Errors
Consider the following linear system:
\begin{align} Ax= \begin{pmatrix} 0.913 & 0.659 \\ 0.457 & 0.330 \end{pmatrix} x= \begin{pmatrix} 0.254 \\ 0.127 \end{pmatrix}= b. \end{align}
Let the estimated solutions be `hat{x}_1` and `hat{x}_2`,
\begin{align} \hat{x}_1 = \begin{pmatrix} -0.0827 \\ 0.5 \end{pmatrix}, \quad \hat{x}_2 = \begin{pmatrix} 0.999 \\ -1.001 \end{pmatrix}, \end{align}
whose residuals have the 1-norms
\begin{align} \left\| r_1 \right\|_1 &= \left\| b-A\hat{x}_1 \right\|_1 =2.1\times 10^{-4} \\ \left\| r_2 \right\|_1 &= \left\| b-A\hat{x}_2 \right\|_1 =2.4\times 10^{-3}. \end{align}
Since `||r_1||_1 < ||r_2||_1`, `hat{x}_1` appears to be the better solution. The exact solution, however, is `x=(1, -1)^T`, so `hat{x}_2` is in fact far closer to it than `hat{x}_1`.
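The comparison can be checked numerically. The following is a minimal sketch using NumPy; the variable names (`candidates`, `results`, `x_true`) are chosen here for illustration and are not part of the original example.

```python
import numpy as np

# System from the example above.
A = np.array([[0.913, 0.659],
              [0.457, 0.330]])
b = np.array([0.254, 0.127])

# Exact solution x = (1, -1)^T, since A @ x_true reproduces b.
x_true = np.array([1.0, -1.0])

candidates = {
    "x_hat_1": np.array([-0.0827, 0.5]),
    "x_hat_2": np.array([0.999, -1.001]),
}

results = {}
for name, x_hat in candidates.items():
    residual = np.linalg.norm(b - A @ x_hat, 1)   # ||b - A x_hat||_1
    error = np.linalg.norm(x_true - x_hat, 1)     # ||x - x_hat||_1
    results[name] = (residual, error)
    print(f"{name}: residual = {residual:.1e}, error = {error:.1e}")
```

The residual ranks `x_hat_1` first, while the error ranks `x_hat_2` first, which is exactly the mismatch described above.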
This happens because `A` is close to singular. In general, a small residual does not guarantee a small error when `A` is ill-conditioned, that is, when the condition number of `A` is large (here it exceeds `10^4`).
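As a rough check of the condition number, the sketch below computes it in the 1-norm with `np.linalg.cond` and compares the relative error of `hat{x}_1` against the standard bound `||x - hat{x}|| / ||x|| <= cond(A) * ||r|| / ||b||`. That bound is a textbook result that the original text does not state, and the variable names are assumptions made for illustration.

```python
import numpy as np

A = np.array([[0.913, 0.659],
              [0.457, 0.330]])
b = np.array([0.254, 0.127])
x_true = np.array([1.0, -1.0])      # exact solution of A x = b
x_hat = np.array([-0.0827, 0.5])    # the estimate with the smaller residual

# Condition number in the 1-norm: ||A||_1 * ||A^{-1}||_1.
cond_1 = np.linalg.cond(A, 1)

# Relative residual and relative error, both in the 1-norm.
rel_residual = np.linalg.norm(b - A @ x_hat, 1) / np.linalg.norm(b, 1)
rel_error = np.linalg.norm(x_true - x_hat, 1) / np.linalg.norm(x_true, 1)

# The relative error can be as large as cond(A) times the relative residual.
bound = cond_1 * rel_residual

print(f"cond_1(A)         = {cond_1:.2e}")
print(f"relative residual = {rel_residual:.2e}")
print(f"relative error    = {rel_error:.2e}")
print(f"error bound       = {bound:.2e}")
```

Because the condition number multiplies the relative residual, a tiny residual still leaves room for a relative error larger than 1.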
When `A` is close to singular, it nearly collapses a line in the original space onto a single point in the image space. So `hat{x}_1`, which is far from the true solution `x` in the original space, can be mapped very close to `b`, while `hat{x}_2`, which is close to `x`, may be mapped farther from `b`.
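To make this concrete, here is a short worked computation. It is not part of the original example, and the vector `v` is introduced only for illustration; it points along a direction that `A` nearly annihilates:
\begin{align} v = \begin{pmatrix} 0.659 \\ -0.913 \end{pmatrix}, \quad Av = \begin{pmatrix} 0.913\cdot 0.659 - 0.659\cdot 0.913 \\ 0.457\cdot 0.659 - 0.330\cdot 0.913 \end{pmatrix} = \begin{pmatrix} 0 \\ -1.27\times 10^{-4} \end{pmatrix}. \end{align}
With the exact solution `x=(1, -1)^T`, the error of the first estimate is `hat{x}_1 - x = (-1.0827, 1.5)^T`, which is approximately `-1.643 v`. So `hat{x}_1` differs from `x` almost exactly along this collapsed direction, and `A(hat{x}_1 - x)` is on the order of `10^{-4}` even though `hat{x}_1 - x` itself is larger than `x`.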
Reference
[1] Michael T. Heath, Scientific Computing: An Introductory Survey. 2nd Edition, McGraw-Hill Higher Education.