Section 38.4 Theory

Subsection 38.4.1 Properties of orthogonal projection

First, we formally state the existence of projection vectors. As discussed in Subsection 38.1.1, this fact follows immediately from combining Proposition 28.6.8 with Corollary 37.5.19.

Next, we'll formally record our formulas for projection onto \(U\) and onto \(\orthogcmp{U}\) from Subsection 38.1.1, again without proof.

The following properties of orthogonal projection echo those from Proposition 14.5.2.

Subsection 38.4.2 Best approximation is best

Because of Pythagoras, the orthogonal projection \(\proj_U \uvec{v}\) is the vector in \(U\) that is at the smallest distance from \(\uvec{v}\text{.}\)

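First, if \(\uvec{v}\) is in \(U\text{,}\) then Rule 4 of Proposition 38.4.3 gives \(\proj_U \uvec{v} = \uvec{v}\text{.}\) This vector is at distance \(0\) from \(\uvec{v}\text{,}\) while every other vector in \(U\) is at positive distance from \(\uvec{v}\text{,}\) so \(\proj_U \uvec{v}\) is the unique vector in \(U\) that is closest to \(\uvec{v}\text{.}\)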

So assume \(\uvec{v}\) is not in \(U\text{,}\) and suppose \(\uvec{u}\) is a vector in \(U\) with \(\uvec{u} \neq \proj_U \uvec{v}\text{.}\) Then the three vectors \(\uvec{v}, \uvec{u}, \proj_U \uvec{v}\) form a “right triangle” to which we can apply the Pythagorean formula (Theorem 37.5.4).

Diagram illustrating an orthogonal projection in \(\R^3\) as a best approximation.

In this “right triangle”, the vector \(\uvec{v} - \uvec{u}\) is the “hypotenuse”, \(\uvec{v} - \proj_U \uvec{v}\) is the component of \(\uvec{v}\) orthogonal to \(U\text{,}\) \(\proj_U \uvec{v} - \uvec{u}\) is parallel to \(U\text{,}\) and we have

\begin{equation*} \uvec{v} - \uvec{u} = (\uvec{v} - \proj_U \uvec{v}) + (\proj_U \uvec{v} - \uvec{u}) \text{.} \end{equation*}

The two vectors in the sum on the right, being orthogonal and parallel to \(U\text{,}\) respectively, must be orthogonal to one another. So we may apply Theorem 37.5.4 to get

\begin{align*} \bigl[\dist (\uvec{v}, \uvec{u})\bigr]^2 \amp = \norm{\uvec{v} - \uvec{u}}^2\\ \amp = \norm{\uvec{v} - \proj_U \uvec{v}}^2 + \norm{\proj_U \uvec{v} - \uvec{u}}^2\\ \amp = \bigl[\dist(\uvec{v},\proj_U \uvec{v})\bigr]^2 + \bigl[\dist(\proj_U \uvec{v},\uvec{u})\bigr]^2\text{.} \end{align*}

Since we have assumed \(\uvec{u} \neq \proj_U \uvec{v}\text{,}\) we have

\begin{equation*} \bigl[\dist(\proj_U \uvec{v},\uvec{u})\bigr]^2 \gt 0 \text{,} \end{equation*}

so

\begin{equation*} \bigl[\dist (\uvec{v}, \uvec{u})\bigr]^2 \gt \bigl[\dist(\uvec{v},\proj_U \uvec{v})\bigr]^2 \text{.} \end{equation*}

But nonnegative numbers \(x_1,x_2\) satisfying \(x_1^2 \gt x_2^2\) must also satisfy \(x_1 \gt x_2\text{,}\) hence

\begin{equation*} \dist (\uvec{v}, \uvec{u}) \gt \dist(\uvec{v},\proj_U \uvec{v}) \text{,} \end{equation*}

as desired.
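
To see the inequality in a concrete case, here is a small worked example. The vectors are an illustrative choice, not taken from the discussion above, and we assume the standard dot product on \(\R^3\text{.}\) Take \(U\) to be the \(xy\)-plane and \(\uvec{v} = (1,2,3)\text{,}\) so that \(\proj_U \uvec{v} = (1,2,0)\text{,}\) and compare against the vector \(\uvec{u} = (3,1,0)\) in \(U\text{.}\) Then

\begin{align*} \norm{\uvec{v} - \uvec{u}}^2 \amp = \norm{(-2,1,3)}^2 = 14 \text{,}\\ \norm{\uvec{v} - \proj_U \uvec{v}}^2 + \norm{\proj_U \uvec{v} - \uvec{u}}^2 \amp = \norm{(0,0,3)}^2 + \norm{(-2,1,0)}^2 = 9 + 5 = 14 \text{,} \end{align*}

in agreement with the Pythagorean formula, and

\begin{equation*} \dist (\uvec{v}, \uvec{u}) = \sqrt{14} \gt 3 = \dist(\uvec{v},\proj_U \uvec{v}) \text{.} \end{equation*}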

Looking back at our definition of distance between a vector \(\uvec{v}\) and a subspace \(U\text{,}\) the preceding theorem tells us the value of such a distance.
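
As a rough numerical illustration, the following Python sketch computes the distance from a vector to a subspace as the distance to its orthogonal projection. It assumes the standard dot product on \(\R^n\) and that \(U\) is described by a basis of mutually orthogonal vectors. The function names `project` and `dist_to_subspace` and the sample vectors are hypothetical choices made for this example.

```python
import math


def dot(a, b):
    """Standard dot product of two vectors given as sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))


def project(v, orthogonal_basis):
    """Orthogonal projection of v onto the span of the given vectors.

    Assumption: the basis vectors are nonzero and mutually orthogonal,
    so the projection is the sum of the projections onto each of them.
    """
    proj = [0.0] * len(v)
    for e in orthogonal_basis:
        coeff = dot(v, e) / dot(e, e)
        proj = [p + coeff * x for p, x in zip(proj, e)]
    return proj


def dist(a, b):
    """Distance between two vectors: the norm of their difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dist_to_subspace(v, orthogonal_basis):
    """Distance from v to the subspace, taken as dist(v, proj_U v)."""
    return dist(v, project(v, orthogonal_basis))


# Illustrative data: U is spanned by two orthogonal vectors in R^3.
basis = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
v = (2.0, 3.0, 4.0)

p = project(v, basis)
d = dist_to_subspace(v, basis)

# A few other vectors a*e1 + b*e2 in U, none of them equal to p.
competitors = [
    [a * x + b * y for x, y in zip(*basis)]
    for a in (-1.0, 0.0, 1.0)
    for b in (0.0, 2.0)
]
all_farther = all(dist(v, u) > d for u in competitors)

print(p, d, all_farther)
```

For these sample vectors the projection is \((2,3,0)\) and the distance is \(4\text{,}\) and each of the sampled competitors in \(U\) is strictly farther from \(\uvec{v}\text{.}\) Checking finitely many competitors is of course only a sanity check, not a substitute for the proof.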

Subsection 38.4.3 Normal system is consistent

Finally, we record the fact that there are always solutions to the normal system associated to a linear system, even if the original system is inconsistent.

Let \(U\) represent the column space of \(A\text{.}\) As discussed in Subsection 38.1.4, every vector in \(U\) is of the form \(A \uvec{x}\) for some column vector \(\uvec{x}\text{.}\) So the system

\begin{equation*} A \uvec{x} = \proj_U \uvec{b} \end{equation*}

is consistent, since \(\proj_U \uvec{b}\) lies in \(U\text{.}\) But then any solution to this system will also solve the normal system

\begin{equation*} \utrans{A} A \uvec{x} = \utrans{A} \uvec{b} \text{,} \end{equation*}

as discussed in Subsection 38.1.4.
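
The argument can be checked on a small example. The Python sketch below uses an illustrative inconsistent system, chosen for this example only, forms the normal system, and solves it exactly with rational arithmetic. It assumes the columns of \(A\) are linearly independent, so that \(\utrans{A} A\) is an invertible \(2 \times 2\) matrix and Cramer's rule applies. The helper names are hypothetical.

```python
from fractions import Fraction


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(M, N):
    """Matrix product M N for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]


def matvec(M, x):
    """Matrix-vector product M x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def solve_2x2(M, c):
    """Solve the 2x2 system M x = c by Cramer's rule (assumes det M != 0)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is singular; Cramer's rule does not apply")
    x0 = (c[0] * M[1][1] - M[0][1] * c[1]) / det
    x1 = (M[0][0] * c[1] - c[0] * M[1][0]) / det
    return [x0, x1]


# Illustrative inconsistent system A x = b: the three equations
# x = 6, x + y = 0, x + 2y = 0 have no common solution.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
b = [Fraction(6), Fraction(0), Fraction(0)]

At = transpose(A)
AtA = matmul(At, A)   # coefficient matrix of the normal system
Atb = matvec(At, b)   # right-hand side of the normal system

x_hat = solve_2x2(AtA, Atb)   # solution of the normal system
proj_b = matvec(A, x_hat)     # A x_hat, a vector in the column space of A
residual = [bi - pi for bi, pi in zip(b, proj_b)]

# The residual b - A x_hat is orthogonal to every column of A,
# which is what identifies A x_hat as the projection of b.
print(x_hat, proj_b, matvec(At, residual))
```

Here the normal system has the solution \(\uvec{x} = (5,-3)\) even though the original system is inconsistent, and \(A \uvec{x} = (5,2,-1)\) differs from \(\uvec{b}\) by a vector orthogonal to the column space of \(A\text{.}\) When the columns of \(A\) are dependent the normal system is still consistent, but it has infinitely many solutions and this \(2 \times 2\) Cramer's rule shortcut no longer applies.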