Section 38.4 Theory
Subsection 38.4.1 Properties of orthogonal projection
First, we formally state the existence of projection vectors. As discussed in Subsection 38.1.1, this fact follows immediately from combining Proposition 28.6.8 with Corollary 37.5.19.
Theorem 38.4.1.
Suppose \(U\) is a subspace in an inner product space \(V\text{.}\) Then every vector in \(V\) can be decomposed uniquely into a sum
\begin{equation*}
\uvec{v} = \uvec{u} + \uvec{u}' \text{,}
\end{equation*}
where \(\uvec{u}\) is in \(U\) and \(\uvec{u}'\) is in \(\orthogcmp{U}\text{.}\)
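For example, in \(\R^2\) with the standard inner product, take \(U\) to be the \(x\)-axis, so that \(\orthogcmp{U}\) is the \(y\)-axis. The vector \(\uvec{v} = (3,4)\) then decomposes uniquely as
\begin{equation*}
(3,4) = (3,0) + (0,4) \text{,}
\end{equation*}
with \((3,0)\) in \(U\) and \((0,4)\) in \(\orthogcmp{U}\text{.}\)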
Next, we’ll formally record our formulas for projection onto \(U\) and onto \(\orthogcmp{U}\) from Subsection 38.1.1, again without proof.
Theorem 38.4.2.
Suppose \(U\) is a subspace of a finite-dimensional inner product space \(V\text{,}\)
\begin{equation*}
\basisfont{B}_U = \{ \uvec{e}_1, \dotsc, \uvec{e}_\ell \}
\end{equation*}
is an orthogonal basis for \(U\text{,}\) and
\begin{equation*}
\basisfont{B}_V = \{ \uvec{e}_1, \dotsc, \uvec{e}_\ell, \uvec{e}_{\ell + 1}, \dotsc, \uvec{e}_n \}
\end{equation*}
is an enlargement of \(\basisfont{B}_U\) to an orthogonal basis for \(V\text{.}\) Then we have the following formulas.
-
For \(\uvec{v}\) in \(V\text{,}\)\begin{align*} \proj_U \uvec{v} \amp = \frac{\inprod{\uvec{v}}{\uvec{e}_1}}{\norm{\uvec{e}_1}^2} \, \uvec{e}_1 + \dotsb + \frac{\inprod{\uvec{v}}{\uvec{e}_\ell}}{\norm{\uvec{e}_\ell}^2} \, \uvec{e}_\ell \text{,}\\ \\ \proj_{\orthogcmp{U}} \uvec{v} \amp = \frac{\inprod{\uvec{v}}{\uvec{e}_{\ell + 1}}}{\norm{\uvec{e}_{\ell + 1}}^2} \, \uvec{e}_{\ell + 1} + \dotsb + \frac{\inprod{\uvec{v}}{\uvec{e}_n}}{\norm{\uvec{e}_n}^2} \, \uvec{e}_n\text{.} \end{align*}
-
If \(\basisfont{B}_V\) is orthonormal, then\begin{align*} \proj_U \uvec{v} \amp = \inprod{\uvec{v}}{\uvec{e}_1} \, \uvec{e}_1 + \dotsb + \inprod{\uvec{v}}{\uvec{e}_\ell} \, \uvec{e}_\ell \text{,}\\ \\ \proj_{\orthogcmp{U}} \uvec{v} \amp = \inprod{\uvec{v}}{\uvec{e}_{\ell + 1}} \, \uvec{e}_{\ell + 1} + \dotsb + \inprod{\uvec{v}}{\uvec{e}_n} \, \uvec{e}_n\text{.} \end{align*}
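To illustrate these formulas, consider \(\R^3\) with the standard inner product, and let \(U\) be the \(xy\)-plane with orthogonal basis
\begin{equation*}
\basisfont{B}_U = \{ \uvec{e}_1, \uvec{e}_2 \} = \{ (1,1,0), (1,-1,0) \} \text{,}
\end{equation*}
enlarged to an orthogonal basis for \(\R^3\) by \(\uvec{e}_3 = (0,0,1)\text{.}\) For \(\uvec{v} = (3,1,2)\text{,}\) we compute \(\inprod{\uvec{v}}{\uvec{e}_1} = 4\text{,}\) \(\inprod{\uvec{v}}{\uvec{e}_2} = 2\text{,}\) \(\inprod{\uvec{v}}{\uvec{e}_3} = 2\text{,}\) along with \(\norm{\uvec{e}_1}^2 = \norm{\uvec{e}_2}^2 = 2\) and \(\norm{\uvec{e}_3}^2 = 1\text{,}\) so that
\begin{align*}
\proj_U \uvec{v} \amp = \frac{4}{2} \, (1,1,0) + \frac{2}{2} \, (1,-1,0) = (3,1,0) \text{,}\\
\proj_{\orthogcmp{U}} \uvec{v} \amp = \frac{2}{1} \, (0,0,1) = (0,0,2) \text{.}
\end{align*}
As Theorem 38.4.1 requires, these two projections sum to \(\uvec{v}\) itself.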
The following properties of orthogonal projection echo those from Proposition 14.5.2.
Proposition 38.4.3. Properties of orthogonal projection.
Suppose \(U\) is a subspace of an inner product space \(V\text{,}\) \(\uvec{v}\) and \(\uvec{w}\) are vectors in \(V\text{,}\) and \(k\) is a scalar. Then the following hold.
-
\(\proj_U \zerovec = \zerovec\text{.}\)
-
\(\proj_U (k\uvec{v}) = k (\proj_U \uvec{v})\text{.}\)
-
\(\proj_U (\uvec{v} + \uvec{w}) = \proj_U \uvec{v} + \proj_U \uvec{w}\text{.}\)
-
If \(\uvec{v}\) is in \(U\text{,}\) then \(\proj_U \uvec{v} = \uvec{v}\text{.}\)
-
\(\proj_U (\proj_U \uvec{v}) = \proj_U \uvec{v} \text{.}\)
-
If \(\uvec{v}\) is in \(\orthogcmp{U}\text{,}\) then \(\proj_U \uvec{v} = \zerovec\text{.}\)
-
\(\proj_{\orthogcmp{U}} (\proj_U \uvec{v}) = \zerovec \text{.}\)
-
\(\proj_U (\proj_{\orthogcmp{U}} \uvec{v}) = \zerovec \text{.}\)
-
If \(\basisfont{B}_U = \{ \uvec{e}_1, \dotsc, \uvec{e}_\ell \} \) is an orthogonal basis for \(U\text{,}\) then\begin{equation*} \norm{\proj_U \uvec{v}} = \sqrt{ \frac{\abs{\inprod{\uvec{v}}{\uvec{e}_1}}^2}{\norm{\uvec{e}_1}^2} + \dotsb + \frac{\abs{\inprod{\uvec{v}}{\uvec{e}_\ell}}^2}{\norm{\uvec{e}_\ell}^2} }\text{,} \end{equation*}where the absolute values are redundant in the real context, but indicate modulus in the complex context.
Proof idea.
Most of these statements follow immediately from Theorem 38.4.1 or from previous statements in the present proposition. We leave the details to you, the reader, but here are two hints.
Statement 1.
Apply the uniqueness of the decomposition in Theorem 38.4.1 to \(\zerovec = \zerovec + \zerovec\text{.}\)
Statement 9.
Similarly to the proof of Statement 2 of Proposition 37.5.6, apply Theorem 37.5.4 to the expansion of \(\proj_U \uvec{v}\) relative to \(\basisfont{B}_U\) from Theorem 38.4.2, first noting Proposition 37.5.3.
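For instance, Statement 9 applied to the \(\R^3\) example following Theorem 38.4.2 gives
\begin{equation*}
\norm{\proj_U \uvec{v}} = \sqrt{ \frac{\abs{4}^2}{2} + \frac{\abs{2}^2}{2} } = \sqrt{10} \text{,}
\end{equation*}
which agrees with computing \(\norm{(3,1,0)} = \sqrt{9 + 1} = \sqrt{10}\) directly.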
Subsection 38.4.2 Best approximation is best
As a consequence of the Pythagorean formula (Theorem 37.5.4), the orthogonal projection \(\proj_U \uvec{v}\) is the vector in \(U\) at the smallest distance from \(\uvec{v}\text{.}\)
Theorem 38.4.4.
Suppose \(U\) is a subspace of a finite-dimensional inner product space \(V\text{,}\) and \(\uvec{v}\) is a vector in \(V\text{.}\) Then \(\proj_U \uvec{v}\) is the unique vector in \(U\) at minimum distance from \(\uvec{v}\text{.}\) That is,
\begin{equation*}
\dist(\uvec{v}, \proj_U \uvec{v}) \lt \dist(\uvec{v}, \uvec{u})
\end{equation*}
for every \(\uvec{u} \neq \proj_U \uvec{v}\) in \(U\text{.}\)
Proof.
First, if \(\uvec{v}\) is in \(U\text{,}\) then Statement 4 of Proposition 38.4.3 gives \(\proj_U \uvec{v} = \uvec{v}\text{,}\) which is clearly the unique vector in \(U\) closest to \(\uvec{v}\) (at distance \(0\)).
So assume \(\uvec{v}\) is not in \(U\text{,}\) and suppose \(\uvec{u}\) is a vector in \(U\) with \(\uvec{u} \neq \proj_U \uvec{v}\text{.}\) Then the three vectors \(\uvec{v}, \uvec{u}, \proj_U \uvec{v}\) form a “right triangle” to which we can apply the Pythagorean formula (Theorem 37.5.4).
[Figure: A vector \(\uvec{v}\) in \(\R^3\) is projected onto a plane subspace \(U\text{.}\) The dashed segment from the terminal point of \(\uvec{v}\) to the terminal point of \(\proj_U \uvec{v}\) meets \(U\) at a right angle and realizes the minimum distance from \(\uvec{v}\) to \(U\text{;}\) a second dashed segment from the terminal point of an arbitrary vector \(\uvec{u}\) in \(U\) to the terminal point of \(\uvec{v}\) forms the hypotenuse of a right triangle.]
In this “right triangle”, the vector \(\uvec{v} - \uvec{u}\) is the “hypotenuse”, \(\uvec{v} - \proj_U \uvec{v}\) is the component of \(\uvec{v}\) orthogonal to \(U\text{,}\) \(\proj_U \uvec{v} - \uvec{u}\) is parallel to \(U\text{,}\) and we have
\begin{equation*}
\uvec{v} - \uvec{u} = (\uvec{v} - \proj_U \uvec{v}) + (\proj_U \uvec{v} - \uvec{u}) \text{.}
\end{equation*}
The two vectors in the sum on the right, being orthogonal and parallel to \(U\text{,}\) respectively, must be orthogonal to one another. So we may apply Theorem 37.5.4 to get
\begin{align*}
\bigl[\dist (\uvec{v}, \uvec{u})\bigr]^2
\amp = \norm{\uvec{v} - \uvec{u}}^2\\
\amp = \norm{\uvec{v} - \proj_U \uvec{v}}^2 + \norm{\proj_U \uvec{v} - \uvec{u}}^2\\
\amp = \bigl[\dist(\uvec{v},\proj_U \uvec{v})\bigr]^2 + \bigl[\dist(\proj_U \uvec{v},\uvec{u})\bigr]^2\text{.}
\end{align*}
Since we have assumed \(\uvec{u} \neq \proj_U \uvec{v}\text{,}\) we have
\begin{equation*}
\bigl[\dist(\proj_U \uvec{v},\uvec{u})\bigr]^2 \gt 0 \text{,}
\end{equation*}
so
\begin{equation*}
\bigl[\dist (\uvec{v}, \uvec{u})\bigr]^2 \gt \bigl[\dist(\uvec{v},\proj_U \uvec{v})\bigr]^2 \text{.}
\end{equation*}
But nonnegative numbers \(x_1,x_2\) satisfying \(x_1^2 \gt x_2^2\) must also satisfy \(x_1 \gt x_2\text{,}\) hence
\begin{equation*}
\dist (\uvec{v}, \uvec{u}) \gt \dist(\uvec{v},\proj_U \uvec{v}) \text{,}
\end{equation*}
as desired.
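To see the theorem in action, return to the \(\R^3\) example of Subsection 38.4.1, where \(\uvec{v} = (3,1,2)\) and \(\proj_U \uvec{v} = (3,1,0)\) for \(U\) the \(xy\)-plane. The minimum distance is
\begin{equation*}
\dist(\uvec{v}, \proj_U \uvec{v}) = \norm{(0,0,2)} = 2 \text{,}
\end{equation*}
while, for example, the vector \(\uvec{u} = (2,2,0)\) in \(U\) satisfies
\begin{equation*}
\dist(\uvec{v}, \uvec{u}) = \norm{(1,-1,2)} = \sqrt{6} \gt 2 \text{.}
\end{equation*}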
Looking back at our definition of the distance between a vector \(\uvec{v}\) and a subspace \(U\text{,}\) the preceding theorem tells us the value of that distance.
Corollary 38.4.5. Distance between a vector and a subspace.
For vector \(\uvec{v}\) and subspace \(U\) in an inner product space,
\begin{equation*}
\dist (\uvec{v}, U) = \dist(\uvec{v},\proj_U \uvec{v}) = \norm{\uvec{v} - \proj_U \uvec{v}} = \norm{\proj_{\orthogcmp{U}} \uvec{v}} \text{.}
\end{equation*}
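In the running example, all three expressions evaluate to the same number:
\begin{equation*}
\dist \bigl( (3,1,2), U \bigr) = \norm{(3,1,2) - (3,1,0)} = \norm{(0,0,2)} = 2 \text{.}
\end{equation*}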
Subsection 38.4.3 Normal system is consistent
Finally, we record the fact that there are always solutions to the normal system associated to a linear system, even if the original system is inconsistent.
Theorem 38.4.6.
For a linear system \(A \uvec{x} = \uvec{b}\text{,}\) the associated normal system
\begin{equation*}
\utrans{A} A \uvec{x} = \utrans{A} \uvec{b}
\end{equation*}
is always consistent.
Proof outline.
Let \(U\) represent the column space of \(A\text{.}\) As discussed in Subsection 38.1.4, every vector in \(U\) is of the form \(A \uvec{x}\) for some column vector \(\uvec{x}\text{.}\) So the system
\begin{equation*}
A \uvec{x} = \proj_U \uvec{b}
\end{equation*}
is consistent, since \(\proj_U \uvec{b}\) lies in \(U\text{.}\) But then any solution to this system will also solve the normal system
\begin{equation*}
\utrans{A} A \uvec{x} = \utrans{A} \uvec{b} \text{,}
\end{equation*}
as discussed in Subsection 38.1.4.
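For a concrete illustration, take
\begin{equation*}
A = \begin{bmatrix} 1 \amp 1 \\ 1 \amp 2 \\ 1 \amp 3 \end{bmatrix} \text{,} \qquad
\uvec{b} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \text{.}
\end{equation*}
The system \(A \uvec{x} = \uvec{b}\) is inconsistent: the first two equations force \(\uvec{x} = (0,1)\text{,}\) which fails the third. Nevertheless, the normal system
\begin{equation*}
\utrans{A} A \uvec{x}
= \begin{bmatrix} 3 \amp 6 \\ 6 \amp 14 \end{bmatrix} \uvec{x}
= \begin{bmatrix} 5 \\ 11 \end{bmatrix}
= \utrans{A} \uvec{b}
\end{equation*}
is consistent, with unique solution \(\uvec{x} = (2/3, 1/2)\text{.}\) Indeed, for this solution \(A \uvec{x} = (7/6, 5/3, 13/6)\) is exactly \(\proj_U \uvec{b}\text{,}\) the vector in the column space \(U\) of \(A\) closest to \(\uvec{b}\text{.}\)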