Section 38.1 Concepts
Subsection 38.1.1 Orthogonal projection
In a finite-dimensional inner product space \(V\text{,}\) a subspace \(U\) and its orthogonal complement \(\orthogcmp{U}\) form a complete set of independent subspaces (Corollary 37.5.19). So every vector \(\uvec{v}\) in \(V\) can be decomposed uniquely into a sum of two vectors, one in \(U\) and one in \(\orthogcmp{U}\) (Proposition 28.6.8):
\begin{equation*}
\uvec{v} = \uvec{u} + \uvec{u}'\text{,}
\end{equation*}
with \(\uvec{u}\) in \(U\) and \(\uvec{u}'\) in \(\orthogcmp{U}\text{.}\)
A diagram in three-dimensions of a vector-addition triangle sitting atop a plane to illustrate the geometric decomposition a vector relative to a plane subspace and the line that forms its orthogonal complement. A parallelogram with a shaded-in interior is drawn. The interior of this parallelogram should be imagined as if it is a two-dimensional, solid, rectangular surface suspended within a three-dimensional space (similar to a tabletop “suspended” above the floor in a room), but viewed at an angle from above. Embedded within this two-dimensional surface is a point representing the zero vector, and this surface is labelled as representing the plane subspace \(U\text{.}\) A line is drawn through the point zero vector at a right angle, and is labelled as representing the orthogonal complement subspace \(\orthogcmp{U}\text{.}\)
A directed line segment representing a vector \(\uvec{v}\) is drawn with initial point at the point zero vector, but rising up out of the shaded surface at an acute angle. Two more directed line segments are drawn with their initial points at the point zero vector: one along (and parallel to) the shaded surface, labelled as representing the vector \(\uvec{u}\text{,}\) and one along (and parallel to) the line perpendicular to the surface, labelled as representing the vector \(\uvec{u}'\text{.}\) Another directed line segment, labelled as also representing vector \(\uvec{u}'\text{,}\) runs from the terminal point of \(\uvec{u}\) to the terminal point of \(\uvec{v}\) to complete a vector-addition triangle with those two vectors where \(\uvec{v}\) represents the sum vector. Finally, a dashed line between the terminal points of the two copies of \(\uvec{u}'\) completes a rectangle where the other three sides are \(\uvec{u}\) and the two copies of \(\uvec{u}'\) and \(\uvec{v}\) is a diagonal.
The vector \(\uvec{u}\) in the decomposition is called the orthogonal projection of \(\uvec{v}\) onto \(U\), and we’ll write
\begin{equation*}
\proj_U \uvec{v}
\end{equation*}
to mean this vector. Note that by the symmetry of orthogonal complements (Corollary 37.5.20), we could also say that the complementary vector \(\uvec{u}'\) in \(\orthogcmp{U}\) is
\begin{equation*}
\proj_{\orthogcmp{U}} \uvec{v} \text{,}
\end{equation*}
the orthogonal projection of \(\uvec{v}\) onto \(\orthogcmp{U}\text{.}\)
The vectors \(\uvec{u}\) and \(\uvec{u}'\) in this decomposition are sometimes called the component of \(\uvec{v}\) parallel to \(U\) and the component of \(\uvec{v}\) orthogonal to \(U\), respectively.
The Expansion theorem (Theorem 37.5.5), combined with Proposition 37.5.18, tells us how to compute orthogonal projections. An orthogonal basis
\begin{equation*}
\basisfont{B}_U = \{ \uvec{e}_1, \dotsc, \uvec{e}_\ell \}
\end{equation*}
for subspace \(U\) can be enlarged to an orthogonal basis
\begin{equation*}
\basisfont{B}_V = \{ \uvec{e}_1, \dotsc, \uvec{e}_\ell, \uvec{e}_{\ell + 1}, \dotsc, \uvec{e}_n \}
\end{equation*}
for the whole inner product space \(V\) (Corollary 37.5.10). But the extra vectors in the enlarged basis then form a basis for \(\orthogcmp{U}\text{:}\)
\begin{equation*}
\basisfont{B}_{\orthogcmp{U}} = \{ \uvec{e}_{\ell + 1}, \dotsc, \uvec{e}_n \} \text{.}
\end{equation*}
So if we expand \(\uvec{v}\) relative to \(\basisfont{B}_V\text{,}\) we will simultaneously obtain expansions for the components parallel and orthogonal to \(U\) relative to \(\basisfont{B}_U\) and \(\basisfont{B}_{\orthogcmp{U}}\text{,}\) respectively:
\begin{equation*}
\uvec{v} =
\underbrace{
\frac{\inprod{\uvec{v}}{\uvec{e}_1}}{\norm{\uvec{e}_1}^2} \, \uvec{e}_1
+ \dotsb
+ \frac{\inprod{\uvec{v}}{\uvec{e}_\ell}}{\norm{\uvec{e}_\ell}^2} \, \uvec{e}_\ell
}_{\proj_U \uvec{v}}
+
\underbrace{
\frac{\inprod{\uvec{v}}{\uvec{e}_{\ell + 1}}}{\norm{\uvec{e}_{\ell + 1}}^2} \, \uvec{e}_{\ell + 1}
+ \dotsb
+ \frac{\inprod{\uvec{v}}{\uvec{e}_n}}{\norm{\uvec{e}_n}^2} \, \uvec{e}_n
}_{\proj_{\orthogcmp{U}} \uvec{v}}\text{.}
\end{equation*}
Note that we don’t actually need to know the extra vectors in \(\basisfont{B}_V\) that form \(\basisfont{B}_{\orthogcmp{U}}\) in order to compute either \(\proj_U \uvec{v}\) or \(\proj_{\orthogcmp{U}} \uvec{v}\text{.}\) All we need is the orthogonal basis \(\basisfont{B}_U\text{,}\) computed using the Gram-Schmidt orthogonalization process (Procedure 37.3.2), if necessary, and then
\begin{gather}
\proj_U \uvec{v} =
\frac{\inprod{\uvec{v}}{\uvec{e}_1}}{\norm{\uvec{e}_1}^2} \, \uvec{e}_1
+ \dotsb
+ \frac{\inprod{\uvec{v}}{\uvec{e}_\ell}}{\norm{\uvec{e}_\ell}^2} \, \uvec{e}_\ell \text{,}\tag{✶}
\end{gather}
\begin{gather}
\proj_{\orthogcmp{U}} \uvec{v} = \uvec{v} - \proj_U \uvec{v}\text{.}\tag{✶✶}
\end{gather}
See Example 38.3.1 in Subsection 38.3.1 for an example.
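As a computational illustration of formulas (✶) and (✶✶), here is a minimal numpy sketch, assuming the standard inner product on \(\R^3\text{;}\) the orthogonal basis and vector below are chosen only for convenience and are not taken from Example 38.3.1.
\begin{verbatim}
import numpy as np

def proj(v, orth_basis):
    """Orthogonal projection of v onto span(orth_basis), per formula (*).

    Assumes the vectors in orth_basis are nonzero and mutually orthogonal
    with respect to the standard inner product on R^n.
    """
    return sum((v @ e) / (e @ e) * e for e in orth_basis)

# Orthogonal basis for a plane U in R^3, and a vector v to project
# (all chosen purely for illustration).
e1 = np.array([1.0, 1.0, 0.0])
e2 = np.array([1.0, -1.0, 0.0])
v = np.array([2.0, 3.0, 5.0])

proj_U = proj(v, [e1, e2])      # component of v parallel to U
proj_U_perp = v - proj_U        # component of v orthogonal to U, per (**)
print(proj_U, proj_U_perp)      # [2. 3. 0.] [0. 0. 5.]
\end{verbatim}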
Orthogonal projection onto a vector in \(\R^n\).
In Chapter 14, we defined the orthogonal projection of one vector \(\uvec{v}\) onto another \(\uvec{a}\) in \(\R^n\) as
\begin{equation*}
\proj_{\uvec{a}} \uvec{v} = \frac{\udotprod{v}{a}}{\unorm{a}^2}\, \uvec{a} \text{.}
\end{equation*}
This is consistent with our current definition of orthogonal projection, in the case of a one-dimensional subspace \(U\) of \(\R^n\text{,}\) and the standard inner product on \(\R^n\text{.}\) In that case, every basis
\begin{equation*}
\basisfont{B}_U = \{ \uvec{a} \}
\end{equation*}
for \(U\) can be considered an orthogonal basis, and the formula
\begin{equation*}
\proj_U \uvec{v} = \frac{\uvecinprod{v}{a}}{\unorm{a}^2} \, \uvec{a}
\end{equation*}
agrees with our previous formula for \(\proj_{\uvec{a}} \uvec{v}\text{.}\)
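For instance (a quick illustrative computation, with vectors chosen here for convenience), taking \(\uvec{a} = (1,2,2)\text{,}\) \(U = \Span \{\uvec{a}\}\text{,}\) and \(\uvec{v} = (3,0,3)\) in \(\R^3\) gives \(\uvecinprod{v}{a} = 9\) and \(\unorm{a}^2 = 9\text{,}\) so that
\begin{equation*}
\proj_U \uvec{v} = \frac{9}{9} \, (1,2,2) = (1,2,2) \text{,}
\end{equation*}
the same vector that the Chapter 14 formula produces for \(\proj_{\uvec{a}} \uvec{v}\text{.}\)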
Subsection 38.1.2 Gram-Schmidt process versus orthogonal projection
The steps in the Gram-Schmidt orthogonalization process (Procedure 37.3.2) create a sequence of orthogonal bases
\begin{equation*}
\basisfont{B}_{U_1}, \basisfont{B}_{U_2}, \dotsc, \basisfont{B}_{U_n}
\end{equation*}
for a nested sequence of subspaces
\begin{equation*}
U_1 \subseteq U_2 \subseteq \dotsb \subseteq U_n = V \text{,}
\end{equation*}
where \(\dim U_j = j\) for each index \(j\text{.}\) The process begins by setting \(\uvec{e}_1 = \uvec{v}_1\text{,}\) the first vector in the starting (possibly non-orthogonal) basis for the space, and taking \(\basisfont{B}_{U_1} = \{\uvec{e}_1\}\text{,}\) so that \(U_1 = \Span \{\uvec{e}_1\}\text{.}\) Continuing inductively, to enlarge basis \(\basisfont{B}_{U_j}\) to basis \(\basisfont{B}_{U_{j+1}}\text{,}\) the process appends the vector
\begin{equation*}
\uvec{e}_{j+1} = \uvec{v}_{j+1} - \left[
\frac{\inprod{\uvec{v}_{j+1}}{\uvec{e}_1}}{\norm{\uvec{e}_1}^2} \, \uvec{e}_1
+ \dotsb
+ \frac{\inprod{\uvec{v}_{j+1}}{\uvec{e}_j}}{\norm{\uvec{e}_j}^2} \, \uvec{e}_j
\right]\text{,}
\end{equation*}
where
\begin{equation*}
\basisfont{B}_{U_j} = \{ \uvec{e}_1,\dotsc,\uvec{e}_j \}
\end{equation*}
is the collection of orthogonal vectors created to that point in the process, and \(\uvec{v}_{j+1}\) is the \(\nth[(j+1)]\) vector in the initial basis used as the starting point in the process.
Comparing the expression in the brackets being subtracted from \(\uvec{v}_{j+1}\) with formulas (✶) and (✶✶), we see that
\begin{equation*}
\uvec{e}_{j+1} = \uvec{v}_{j+1} - \proj_{U_j} \uvec{v}_{j+1} = \proj_{\orthogcmp{U}_j} \uvec{v}_{j+1} \text{.}
\end{equation*}
A diagram in three-dimensions illustrating how the Gram-Schmidt process is ultimately a process of repeated orthogonal projections. A parallelogram with a shaded-in interior is drawn. The interior of this parallelogram should be imagined as if it is a two-dimensional, solid, rectangular surface suspended within a three-dimensional space (similar to a tabletop “suspended” above the floor in a room), but viewed at an angle from above. Embedded within this two-dimensional surface is a point representing the zero vector, and two directed line segments parallel to the shaded surface, representing the vectors \(\uvec{e}_1\) and \(\uvec{e}_2\text{,}\) emanate from this point, so that they appear to lie along the shaded surface. Within the shaded surface, these two vectors are at a right angle, and the shaded surface represents a portion of the plane spanned by them, which is labelled as \(U_2\text{.}\) A line is drawn through the point zero vector at a right angle, and is labelled as representing the orthogonal complement subspace \(\orthogcmp{U}_2\text{.}\)
A directed line segment representing the third vector \(\uvec{v}_3\) of the original input basis for the space is drawn with initial point at the point zero vector, but rising up out of the shaded surface at an acute angle. Another directed line segment runs along (and parallel to) the shaded surface from the point zero vector to the point directly below the terminal point of \(\uvec{v}_3\text{,}\) and is labelled as representing \(\proj_{U_2} \uvec{v}_3\text{.}\) A final directed line segment representing vector \(\uvec{e}_3\) is drawn along (and parallel to) the line for \(\orthogcmp{U}_2\text{,}\) with initial point at the point zero vector and terminal point at the same height above the shaded surface as the terminal point of \(\uvec{v}_3\text{.}\) Dashed lines are drawn from the terminal point of \(\uvec{v}_3\) to the terminal points of \(\uvec{e}_3\) and \(\proj_{U_2} \uvec{v}_3\text{,}\) completing a rectangle where those two vectors are the other two sides, and \(\uvec{v}_3\) is a diagonal.
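Viewed this way, the entire process is just a loop of orthogonal projections. Here is a minimal numpy sketch under that interpretation, assuming the standard inner product on \(\R^n\) and a linearly independent input list; the starting basis below is illustrative only.
\begin{verbatim}
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of vectors in R^n.

    At each step, the next input vector v_{j+1} has its projection onto
    U_j = span(e_1, ..., e_j) subtracted off, leaving e_{j+1} equal to
    the projection of v_{j+1} onto the orthogonal complement of U_j.
    """
    orthogonal = []
    for v in vectors:
        proj_Uj = sum((v @ e) / (e @ e) * e for e in orthogonal)
        orthogonal.append(v - proj_Uj)
    return orthogonal

# An illustrative (non-orthogonal) starting basis for R^3.
basis = [np.array([1.0, 1.0, 0.0]),
         np.array([1.0, 0.0, 1.0]),
         np.array([0.0, 1.0, 1.0])]
print(gram_schmidt(basis))
\end{verbatim}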
Subsection 38.1.3 Best approximation and distance between vector and subspace
As in \(\R^n\text{,}\) we can measure distance between vectors by measuring the norm of a difference vector:
\begin{equation*}
\dist(\uvec{u},\uvec{v}) = \norm{\uvec{u} - \uvec{v}} \text{.}
\end{equation*}
Three non-collinear points are plotted in the plane: one in the lower-left corner of the diagram, one at the top edge of the diagram but left of center, and one at the right edge of the diagram but below center. Two directed line segments with initial point at the point in the lower left corner are drawn, representing vectors labelled \(\uvec{u}\) and \(\uvec{v}\text{.}\) Vector \(\uvec{u}\) has its terminal point at the plotted point at the top edge of the diagram, and vector \(\uvec{v}\) has its terminal point at the plotted point at the right edge of the diagram. A third vector is drawn from the terminal point of \(\uvec{v}\) to the terminal point of \(\uvec{u}\) to complete the vector-subtraction triangle, and so is labelled as representing the difference vector \(\uvec{u} - \uvec{v}\text{.}\) A dashed line is drawn in parallel with this difference vector to represent the distance between those points, and is labelled \(\dist(\uvec{u},\uvec{v})\text{.}\)
We originally developed the concept of orthogonal projection in Chapter 14 to help us answer an approximation question (Question 14.3.4), and it will do the same in an abstract inner product space. If \(U\) is a subspace of a finite-dimensional inner product space, and \(\uvec{v}\) is a vector not in \(U\text{,}\) then amongst all vectors \(\uvec{u}\) in \(U\) the distance
\begin{equation*}
\dist (\uvec{v},\uvec{u}) = \norm{\uvec{v} - \uvec{u}}
\end{equation*}
will be minimized at
\begin{equation*}
\uvec{u} = \proj_U \uvec{v} \text{.}
\end{equation*}
(See Theorem 38.4.4.)
For this reason, the vector \(\proj_U \uvec{v}\) is called the best approximation to \(\uvec{v}\) in \(U\).
A diagram in three-dimensions illustrating how an orthogonal projection onto a subspace relates to the minimum distance between the vector being projected and the subspace. A parallelogram with a shaded-in interior is drawn. The interior of this parallelogram should be imagined as if it is a two-dimensional, solid, rectangular surface suspended within a three-dimensional space (similar to a tabletop “suspended” above the floor in a room), but viewed at an angle from above. Embedded within this two-dimensional surface is a point representing the zero vector, and the shaded surface is labelled as representing a portion of a plane subspace \(U\) of \(\R^3\text{.}\)
A directed line segment representing a vector \(\uvec{v}\) is drawn with initial point at the point zero vector, but rising up out of the shaded surface at an acute angle. Another directed line segment runs along (and parallel to) the shaded surface from the point zero vector to the point directly below the terminal point of \(\uvec{v}\text{,}\) and is labelled as representing \(\proj_U \uvec{v}\text{.}\) A dashed line is drawn between the terminal points of these two vectors, so that it meets the shaded surface at the terminal point of the projection vector at a right angle. This dashed line is labelled as representing the minimum distance from the terminal point of \(\uvec{v}\) to the plane.
An arbitrary third point is plotted in the shaded surface, and a dashed line is drawn from it to each of the terminal points of the two vectors, creating a right triangle where the hypotenuse is the dashed line from this arbitrary point to the terminal point of \(\uvec{v}\text{.}\)
Because of this, orthogonal projection can be used to solve approximation and optimization problems. See Example 38.3.2 in Subsection 38.3.2 for an example.
And since there is a unique smallest distance \(\dist (\uvec{v},\uvec{u})\) amongst all \(\uvec{u}\) in \(U\text{,}\) achieved when \(\uvec{u} = \proj_U \uvec{v}\text{,}\) we can define this to be the distance between \(\uvec{v}\) and \(U\):
\begin{equation*}
\dist (\uvec{v},U) = \dist (\uvec{v},\proj_U \uvec{v}) = \norm{\uvec{v} - \proj_U \uvec{v}} = \norm{\proj_{\orthogcmp{U}} \uvec{v}} \text{.}
\end{equation*}
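For instance (an illustrative computation), if \(U\) is the \(xy\)-plane in \(\R^3\) with the standard inner product and \(\uvec{v} = (2,3,5)\text{,}\) then \(\proj_U \uvec{v} = (2,3,0)\) and
\begin{equation*}
\dist (\uvec{v},U) = \norm{\uvec{v} - \proj_U \uvec{v}} = \norm{(0,0,5)} = 5 \text{.}
\end{equation*}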
Suppose \(\basisfont{B} = \{\uvec{e}_1, \uvec{e}_2, \dotsc, \uvec{e}_n\}\) is an orthonormal basis for our inner product space. (However, we do not assume here that some initial portion of this basis forms a basis for the subspace \(U\text{.}\)) We can then express the vector \(\uvec{v}\) and any vector \(\uvec{u}\) in \(U\) as coordinate vectors relative to this basis:
\begin{align*}
\rmatrixOf{\uvec{v}}{B} \amp = (v_1,v_2,\dotsc,v_n) \text{,} \amp
\rmatrixOf{\uvec{u}}{B} \amp = (u_1,u_2,\dotsc,u_n) \text{.}
\end{align*}
Applying Statement 2 of Proposition 37.5.6, we have
\begin{equation*}
\bbrac{\dist (\uvec{v},\uvec{u})}^2 = {\norm{\uvec{v} - \uvec{u}}}^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2 + \dotsb + (v_n - u_n)^2 \text{.}
\end{equation*}
Since we are talking about nonnegative quantities, minimizing \(\dist (\uvec{v},\uvec{u})\) is the same as minimizing the square of that distance, and so we can say that setting \(\uvec{u} = \proj_U \uvec{v}\) minimizes the sum-of-squares expression on the right above. We could interpret this sum-of-squares formula as a measure of the “error” when we “approximate” \(\uvec{v}\) by a vector \(\uvec{u}\) in \(U\text{,}\) and so \(\uvec{u} = \proj_U \uvec{v}\) could be called the least squares approximation to \(\uvec{v}\) within \(U\text{.}\)
Subsection 38.1.4 Least squares solutions to a linear system
Here is a particular kind of linear approximation problem that can be solved with the help of orthogonal projection.
Question 38.1.5.
Suppose \(A\) is an \(m \times n\) coefficient matrix and \(\uvec{b}_0\) is a column vector in \(\R^m\) so that the linear system \(A \uvec{x} = \uvec{b}_0\) is inconsistent.
What vector \(\uvec{x}_0\) in \(\R^n\) comes closest to being a solution to the system? In other words, for what vector \(\uvec{x}_0\) in \(\R^n\) will \(A \uvec{x}_0\) be as close as possible to \(\uvec{b}_0\text{?}\)
We can pursue this question by “pushing it forward” through multiplication by \(A\text{,}\) so that it is a question about vectors in \(\R^m\text{,}\) where \(\uvec{b}_0\) and each \(A \uvec{x}\) live, instead of a question about vectors in \(\R^n\text{,}\) where each \(\uvec{x}\) lives (and where the answer \(\uvec{x}_0\) lives, if it exists).
We have seen before that a matrix-times-column product can be expanded as a linear combination of the columns of the matrix:
\begin{equation*}
A \uvec{x} = x_1 \uvec{a}_1 + x_2 \uvec{a}_2 + \dotsb + x_n \uvec{a}_n \text{,}
\end{equation*}
where the \(\uvec{a}_j\) are the columns of \(A\text{.}\) (For example, see Subsection 22.3.2.) It is this pattern that led us to the definition of column space of a matrix as the span of the columns of the matrix. Furthermore, we have seen that system \(A \uvec{x} = \uvec{b}\) is consistent precisely when the vector of constants \(\uvec{b}\) lies in the column space of \(A\text{.}\) (See Subsection 21.3.1.) Turning the above matrix-times-column pattern around, we also see that the column space of \(A\) is made up of all possible products \(A \uvec{x}\) for \(\uvec{x}\) in \(\R^n\text{.}\) (But note that it is possible for different vectors \(\uvec{x}\) in \(\R^n\) to produce the same vector \(\uvec{b}\) in the column space of \(A\text{,}\) since we know that a consistent system \(A \uvec{x} = \uvec{b}\) could have an infinite number of solutions.)
We have assumed that our system is inconsistent, so that \(\uvec{b}_0\) must not lie in the column space of \(A\text{,}\) whereas every result \(A \uvec{x}\) lies in the column space of \(A\text{.}\) We can now re-frame Question 38.1.5 in a way that will allow us to apply orthogonal projection: what vector(s) \(\uvec{x}\) will produce an “output” vector \(A \uvec{x}\) in the column space of \(A\) that is closest to \(\uvec{b}_0\text{?}\) The answer is: any vector \(\uvec{x}_0\) so that
\begin{gather}
A \uvec{x}_0 = \proj_U \uvec{b}_0 \text{,}\tag{†}
\end{gather}
where \(U\) is the column space of \(A\) in \(\R^m\text{.}\)
A diagram in three-dimensions illustrating a geometrically approximate solution to an inconsistent system. A parallelogram with a shaded-in interior is drawn. The interior of this parallelogram should be imagined as if it is a two-dimensional, solid, rectangular surface suspended within a three-dimensional space (similar to a tabletop “suspended” above the floor in a room), but viewed at an angle from above. Embedded within this two-dimensional surface is a point representing the zero vector, and the shaded surface is labelled as representing the column space of a \(3 \times n\) matrix \(A\) in \(\R^3\text{.}\)
A directed line segment representing a vector \(\uvec{b}_0\) is drawn with initial point at the point zero vector, but rising up out of the shaded surface at an acute angle. Another directed line segment runs along (and parallel to) the shaded surface from the point zero vector to the point directly below the terminal point of \(\uvec{b}_0\text{,}\) and is labelled as representing a column space vector \(A \uvec{x}_0\text{.}\) A dashed line is drawn between the terminal points of these two vectors, so that it meets the shaded surface at the terminal point of the column space vector at a right angle. Because of this right angle, it is appropriate to write \(A \uvec{x}_0 \approx \uvec{b}_0\text{.}\)
One way to solve this problem is to use Procedure 21.3.2 to determine a basis for the column space of matrix \(A\text{,}\) apply the Gram-Schmidt orthogonalization process (Procedure 37.3.2) to produce an orthogonal basis for the column space, and then use that basis to compute \(\proj_U \uvec{b}_0\) (where \(U\) represents the column space of \(A\)). Finally, we could solve the system \(A \uvec{x} = \proj_U \uvec{b}_0\text{,}\) which must be consistent because \(\proj_U \uvec{b}_0\) will lie in \(U\text{,}\) the column space of \(A\text{.}\)
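Here is a rough numpy sketch of this indirect route, under the simplifying assumption that the columns of \(A\) are linearly independent (so that the columns themselves form a basis for the column space and Procedure 21.3.2 is not needed); the small \(3 \times 2\) system below is invented purely for illustration.
\begin{verbatim}
import numpy as np

def proj_onto_column_space(A, b0):
    """Return the projection of b0 onto the column space of A.

    Assumes the columns of A are linearly independent, so applying
    Gram-Schmidt to them yields an orthogonal basis for the column space.
    """
    orth = []
    for j in range(A.shape[1]):
        c = A[:, j].astype(float)
        c = c - sum((c @ e) / (e @ e) * e for e in orth)
        orth.append(c)
    return sum((b0 @ e) / (e @ e) * e for e in orth)

# An illustrative inconsistent system: b0 is not in the column space of A.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b0 = np.array([1.0, 1.0, 3.0])
p = proj_onto_column_space(A, b0)   # the system A x = p is now consistent
print(p)                            # [1.333... 1.333... 2.666...]
\end{verbatim}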
But by applying some theory, we can develop a more direct procedure. Take pairing \(\inprod{\blank}{\blank}\) to mean the standard inner product on \(\R^m\) (or, later, on \(\R^n\)). Now, the vector
\begin{equation*}
\uvec{b}_0' = \proj_{\orthogcmp{U}} \uvec{b}_0 = \uvec{b}_0 - \proj_{U} \uvec{b}_0
\end{equation*}
lies in \(\orthogcmp{U}\text{,}\) and so is orthogonal to every vector in \(U\text{,}\) the column space of \(A\text{.}\) So for every \(\uvec{x}\) in \(\R^n\text{,}\) we have
\begin{align*}
0 \amp = \inprod{\uvec{b}_0'}{A \uvec{x}}\\
\amp = \utrans{(A \uvec{x})} (\uvec{b}_0') \\
\amp = \utrans{\uvec{x}} (\utrans{A} \uvec{b}_0') \\
\amp = \inprod{\utrans{A} \uvec{b}_0'}{\uvec{x}} \text{.}
\end{align*}
As the above calculation holds for every \(\uvec{x}\) in \(\R^n\text{,}\) the result says that \(\utrans{A} \uvec{b}_0'\) lies in \(\orthogcmp{(\R^n)}\text{.}\) However, in an inner product space, only the zero vector is orthogonal to every vector (Statement 2 of Proposition 37.5.15). Hence
\begin{equation*}
\zerovec = \utrans{A} \uvec{b}_0'
= \utrans{A} (\uvec{b}_0 - \proj_{U} \uvec{b}_0)
= \utrans{A} (\uvec{b}_0 - A \uvec{x}_0)
\end{equation*}
(using (†)). Re-arranging, we have
\begin{equation*}
\utrans{A} A \uvec{x}_0 = \utrans{A} \uvec{b}_0 \text{.}
\end{equation*}
In other words, the special \(\uvec{x}_0\) that we are looking for is a solution to
\begin{equation*}
\utrans{A} A \uvec{x} = \utrans{A} \uvec{b}_0 \text{,}
\end{equation*}
called the normal system associated to \(A \uvec{x} = \uvec{b}_0\). Solutions to this system are referred to as least-squares solutions for the original inconsistent system \(A \uvec{x} = \uvec{b}_0\text{.}\)
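As a quick numpy sketch (reusing the small illustrative system from the sketch above, and noting that in practice numpy's own least-squares routine is the more numerically robust choice), the normal system can be formed and solved directly:
\begin{verbatim}
import numpy as np

# The same illustrative inconsistent system as above.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b0 = np.array([1.0, 1.0, 3.0])

# Normal system: (A^T A) x = A^T b0.
x0 = np.linalg.solve(A.T @ A, A.T @ b0)

print(x0)       # a least-squares solution; here [4/3, 4/3]
print(A @ x0)   # equals the projection of b0 onto the column space of A
print(np.linalg.lstsq(A, b0, rcond=None)[0])   # numpy's built-in agrees
\end{verbatim}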
Because \(\proj_{U} \uvec{b}_0\) lies in the column space of \(A\text{,}\) there is always some \(\uvec{x}_0\) that satisfies (†), and hence there is always a solution to the normal system. If \(A\) is square and invertible, then so is \(\utrans{A}A\text{,}\) and in this case there is a unique solution
\begin{gather}
\uvec{x}_0 = \inv{(\utrans{A} A)} \utrans{A} \uvec{b}_0 \text{.}\tag{††}
\end{gather}
But even if \(A\) is not square, the coefficient matrix of the normal system, \(\utrans{A} A\text{,}\) is always square: we have assumed \(A\) is \(m \times n\text{,}\) so \(\utrans{A} A\) is \(n \times n\text{.}\) If this matrix is invertible, then again there is a unique solution as above. In this case,
\begin{equation*}
\inv{(\utrans{A} A)} \utrans{A}
\end{equation*}
is called the pseudo-inverse of \(A\text{,}\) by analogy between (††) and the fact that a square, invertible coefficient matrix affords solution
\begin{equation*}
A \uvec{x} = \uvec{b} \quad\implies\quad \uvec{x} = \inv{A} \uvec{b} \text{.}
\end{equation*}
But recall that in this analysis, we have not assumed that \(A\) is square, hence the pseudo in pseudo-inverse.
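As a final hedged numpy note: when \(\utrans{A} A\) is invertible, the matrix \(\inv{(\utrans{A} A)} \utrans{A}\) agrees with the Moore-Penrose pseudo-inverse that numpy computes (via the singular value decomposition) with np.linalg.pinv.
\begin{verbatim}
import numpy as np

# The same illustrative system as in the sketches above.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b0 = np.array([1.0, 1.0, 3.0])

pseudo = np.linalg.inv(A.T @ A) @ A.T            # (A^T A)^{-1} A^T, as in (++)
print(np.allclose(pseudo, np.linalg.pinv(A)))    # True
print(pseudo @ b0)                               # the least-squares solution x0
\end{verbatim}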

