Section 20.5 Theory
Subsection 20.5.1 Column space
First we’ll record three facts concerning how multiplying each vector in a set of column vectors in \(\R^n\) by a common matrix affects linear dependence and independence, leading to our conclusion about how to determine a basis for the column space of a matrix by examining its RREF.
Proposition 20.5.1. Dependence/independence versus matrix transformation.
Suppose \(\uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_\ell\) are column vectors in \(\R^n\) and \(E\) is an \(m \times n\) matrix.
- If \(\{ E \uvec{v}_1, E \uvec{v}_2, \dotsc, E \uvec{v}_\ell \} \) is a linearly independent set of vectors then so too is \(\{ \uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_\ell \} \text{.}\)
- If \(E\) is square and invertible and \(\{ \uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_\ell \} \) is a linearly independent set, then so too is \(\{ E \uvec{v}_1, E \uvec{v}_2, \dotsc, E \uvec{v}_\ell \} \text{.}\)
- If \(E\) is square and invertible and \(\uvec{w}\) is another column vector in \(\R^n\) so that vector \(E \uvec{w}\) is linearly dependent with the vectors \(E \uvec{v}_1, E \uvec{v}_2, \dotsc, E \uvec{v}_\ell\) by the dependence relation\begin{equation*} E \uvec{w} = k_1 E \uvec{v}_1 + k_2 E \uvec{v}_2 + \dotsb + k_\ell E \uvec{v}_\ell \text{,} \end{equation*}then \(\uvec{w}\) is linearly dependent with \(\uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_\ell\) by a dependence relation involving the same scalars,\begin{equation*} \uvec{w} = k_1 \uvec{v}_1 + k_2 \uvec{v}_2 + \dotsb + k_\ell \uvec{v}_\ell \text{.} \end{equation*}
Proof of Statement 1.
Let’s apply the Test for Linear Dependence/Independence to the vectors \(\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_\ell\text{:}\) suppose that \(k_1,k_2,\dotsc,k_\ell\) are scalars so that
\begin{gather}
k_1\uvec{v}_1 + k_2\uvec{v}_2 + \dotsb + k_\ell\uvec{v}_\ell = \zerovec.\tag{✶}
\end{gather}
Multiplying both sides of this equation by the matrix \(E\text{,}\) and using some matrix algebra, we get
\begin{equation*}
k_1E\uvec{v}_1 + k_2E\uvec{v}_2 + \dotsb + k_\ell E\uvec{v}_\ell = E\zerovec = \zerovec.
\end{equation*}
But we have assumed that the vectors \(E\uvec{v}_1,E\uvec{v}_2,\dotsc,E\uvec{v}_\ell\) are linearly independent, so the only way this linear combination could equal the zero vector is if all the scalars \(k_1,k_2,\dotsc,k_\ell\) are zero. Thus, the linear combination in (✶) must be the trivial one, and so the vectors \(\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_\ell\) are linearly independent.
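The key algebraic step in this proof is that multiplying by \(E\) distributes over a linear combination. Here is a minimal numerical check of that identity, a sketch using the SymPy library with made-up data (the matrix and vectors below are illustrative assumptions, not taken from the text):

```python
from sympy import Matrix

# Illustrative data only: a 2x3 matrix E and two vectors in R^3.
E = Matrix([[1, 0, 2],
            [3, 1, 0]])
v1 = Matrix([1, 2, 3])
v2 = Matrix([0, 1, 4])
k1, k2 = 5, -2

# E(k1 v1 + k2 v2) = k1 (E v1) + k2 (E v2), the "matrix algebra" used above
print(E * (k1 * v1 + k2 * v2) == k1 * (E * v1) + k2 * (E * v2))  # True
```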
Proof of Statement 2.
We assume \(\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_\ell\) are linearly independent. Since we also assume \(E\) to be invertible, we can restate this as saying that vectors
\begin{equation*}
\inv{E} (E \uvec{v}_1), \inv{E} (E \uvec{v}_2), \dotsc, \inv{E} (E \uvec{v}_\ell)
\end{equation*}
are linearly independent. Now we can apply Statement 1 with \(E\) replaced by \(\inv{E}\) and \(\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_\ell\) replaced by \(E \uvec{v}_1, E \uvec{v}_2, \dotsc, E \uvec{v}_\ell\text{.}\)
Proof of Statement 3.
Simply apply the inverse \(\inv{E}\) to both sides of
\begin{equation*}
E\uvec{w} = k_1 E\uvec{v}_1 + k_2 E\uvec{v}_2 + \dotsb + k_\ell E\uvec{v}_\ell
\end{equation*}
to obtain
\begin{equation*}
\uvec{w} = k_1\uvec{v}_1 + k_2\uvec{v}_2 + \dotsb + k_\ell\uvec{v}_\ell.
\end{equation*}
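To see the proposition in action numerically, here is a small sketch using the SymPy library (the specific matrix \(E\) and the vectors below are made up for illustration): an invertible \(E\) preserves linear independence, and a dependence relation transfers with the same scalars.

```python
from sympy import Matrix

# Illustrative data: an invertible 3x3 matrix E and independent v1, v2 in R^3.
E = Matrix([[1, 2, 0],
            [0, 1, 1],
            [1, 0, 1]])
v1 = Matrix([1, 0, 2])
v2 = Matrix([0, 1, 1])
assert E.det() != 0  # E is invertible

# Statements 1 and 2: {E v1, E v2} is independent exactly when {v1, v2} is.
assert Matrix.hstack(v1, v2).rank() == 2
assert Matrix.hstack(E * v1, E * v2).rank() == 2

# Statement 3: a dependence relation among the E-images uses the same scalars.
w = 3 * v1 - 2 * v2
assert E * w == 3 * (E * v1) - 2 * (E * v2)
print("all checks passed")
```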
Corollary 20.5.2. Column space basis and dimension.
- A basis for the column space of an \(m\times n\) matrix \(A\) can be formed from those columns of \(A\) (as column vectors in \(\R^m\)) in positions corresponding to the locations of the leading ones in the RREF of \(A\text{.}\)
- The dimension of the column space of a matrix is equal to the rank of the matrix.
Proof of Statement 1.
By definition, the columns of \(A\) are a spanning set for the column space of \(A\text{.}\) By Proposition 18.5.1, this spanning set can be reduced to a basis; it’s a matter of determining the largest possible linearly independent set of these spanning column vectors.
Let \(E = E_t E_{t-1} \dotsm E_1\) be the product of elementary matrices corresponding to some sequence of row operations that reduces \(A\) to its RREF, so that \(E\) is invertible and \(\RREF(A) = E A\text{.}\) Because of the nature of RREF, each column of \(\RREF(A)\) that contains a leading one will be a standard basis vector in \(\R^m\text{,}\) no two such leading-one columns will be the same standard basis vector, and each column that does not contain a leading one will be a linear combination of those leading-one columns that appear to its left. Therefore, the leading-one columns form the largest set of linearly independent vectors that can be formed from the columns of \(\RREF(A)\text{.}\) Since each column of \(\RREF(A)\) is the product of \(E\) with the corresponding column of \(A\text{,}\) and since \(E\) is invertible, Statements 1 and 3 of Proposition 20.5.1 tell us that the columns of \(A\) will have the same relationships: those columns of \(A\) in positions where the leading ones occur in \(\RREF(A)\) will be linearly independent, and that will be the largest possible collection of linearly independent columns of \(A\text{,}\) because each of the other columns will be linearly dependent with the leading-one-position columns of \(A\) to its left.
Thus, we can reduce the spanning set made up of all columns of \(A\) to a basis for its column space by discarding the linearly dependent columns and keeping only those columns in positions corresponding to the locations where the leading ones occur in \(\RREF(A)\text{.}\)
Proof of Statement 2.
Since Statement 1 tells us that we obtain a basis for the column space of a matrix by taking those columns of the matrix in positions corresponding to the leading ones in its RREF, the number of vectors in such a basis is equal to the number of leading ones, which is precisely the rank of the matrix.
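As a concrete illustration of this corollary, here is a sketch using the SymPy library with a made-up \(3 \times 4\) matrix: the pivot positions reported along with the RREF indicate which columns of the original matrix to keep, and the count of those positions is the rank.

```python
from sympy import Matrix

# Illustrative matrix: columns 2 and 3 are multiples of column 1.
A = Matrix([[1, 2, 3, 1],
            [2, 4, 6, 3],
            [1, 2, 3, 0]])

R, pivot_cols = A.rref()                 # RREF and positions of its leading ones
basis = [A.col(j) for j in pivot_cols]   # the corresponding columns of A itself

print(pivot_cols)                        # (0, 3): first and fourth columns of A
print(len(basis) == A.rank())            # dimension of column space = rank -> True
# SymPy's A.columnspace() returns this same list of pivot columns of A.
```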
Subsection 20.5.2 Row space
Next we’ll record our observations concerning how the row operations affect the row space of a matrix, leading to our conclusion about how to obtain a basis for the row space of a matrix from its RREF.
Proposition 20.5.3. Row space versus row operations.
If an elementary row operation is applied to a matrix, then the row space of the new matrix is the same as the row space of the old matrix.
Proof.
Consider an \(m \times n\) matrix \(A\) as a collection of row vectors in \(\R^n\text{:}\)
\begin{equation*}
A = \begin{bmatrix}
\leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\
\leftrightlinesubstitute \amp \uvec{a}_2 \amp \leftrightlinesubstitute\\
\amp\vdots\\
\leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\
\end{bmatrix}.
\end{equation*}
Then the row space of \(A\) is, by definition, the subspace \(\Span\{\uvec{a}_1,\uvec{a}_2,\dotsc,\uvec{a}_m\}\) of \(\R^n\text{.}\)
As we did in Discovery 20.5, we will make repeated use of Statement 2 of Proposition 16.5.6, which tells us how to determine when two spanning sets generate the same subspace.
Let’s consider each type of elementary row operation in turn.
- Suppose we swap two rows in \(A\text{:}\)\begin{equation*} A = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_i \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_j \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix} \to A' = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_j \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_i \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix}. \end{equation*}The row space of the new matrix, \(A'\text{,}\) is the span of its row vectors. But every row vector in \(A'\) is equal to one of the row vectors in \(A\text{,}\) and vice versa. So clearly the conditions of the above-referenced Statement 2 are satisfied, and the row spaces of the two matrices are the same space.
-
Suppose we multiply one of the rows in \(A\) by a nonzero constant \(k\text{:}\)\begin{equation*} A = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_i \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix} \to A'' = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp k\uvec{a}_i \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix}. \end{equation*}Again, most of the row vectors in the new matrix \(A''\) are equal to one of the row vectors in \(A\text{,}\) and vice versa. So to fully satisfy the conditions of the above-referenced Statement 2, we need to verify that \(k\uvec{a}_i\) is somehow a linear combination of row vectors from \(A\) and that \(\uvec{a}_i\) is somehow a linear combination of row vectors from \(A''\text{.}\) But \(k\uvec{a}_i\) is already expressed as a scalar multiple of a row vector from \(A\text{,}\) and since \(k\) is nonzero we can also write\begin{equation*} \uvec{a}_i = \frac{1}{k} \cdot (k\uvec{a}_i), \end{equation*}so that \(\uvec{a}_i\) is also a scalar multiple of a row vector from \(A''\text{.}\) With the conditions of the above-referenced Statement 2 now fully satisfied, we can conclude that the row spaces of the two matrices are the same space.
-
Suppose we replace one row vector in \(A\) by the sum of that row and a scalar multiple of another:\begin{equation*} A = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_i \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix} \to A''' = \begin{bmatrix} \leftrightlinesubstitute \amp \uvec{a}_1 \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_i + k\uvec{a}_j \amp \leftrightlinesubstitute\\ \amp\vdots\\ \leftrightlinesubstitute \amp \uvec{a}_m \amp \leftrightlinesubstitute\\ \end{bmatrix}. \end{equation*}Once again, most of the row vectors in the new matrix \(A'''\) are equal to one of the row vectors in \(A\text{,}\) and vice versa. So to fully satisfy the conditions of the above-referenced Statement 2, we need to verify that \(\uvec{a}_i + k\uvec{a}_j\) is somehow a linear combination of row vectors from \(A\) and that \(\uvec{a}_i\) is somehow a linear combination of row vectors from \(A'''\text{.}\) But \(\uvec{a}_i + k\uvec{a}_j\) is already expressed as a linear combination of row vectors from \(A\text{,}\) and for \(\uvec{a}_i\) we can write\begin{equation*} \uvec{a}_i = 1(\uvec{a}_i + k\uvec{a}_j) + (-k)\uvec{a}_j, \end{equation*}a linear combination of row vectors from \(A'''\text{.}\) With the conditions of the above-referenced Statement 2 now fully satisfied, we can conclude that the row spaces of the two matrices are the same space.
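A quick numerical check of this proposition, sketched with the SymPy library and an arbitrary example matrix: the row space of the matrix formed by stacking the old rows and the new rows together is the sum of the two row spaces, so if all three ranks agree then the two row spaces must coincide.

```python
from sympy import Matrix

# Illustrative matrix A and the row operation R3 -> R3 + (-2)R1 giving B.
A = Matrix([[1, 2, 0],
            [0, 1, 3],
            [2, 5, 3]])
B = Matrix.vstack(A[0, :], A[1, :], A[2, :] - 2 * A[0, :])

# Stacking the rows of A and B together does not enlarge either span:
# all three ranks agree, so the two row spaces are the same subspace.
stacked = Matrix.vstack(A, B)
print(A.rank() == B.rank() == stacked.rank())  # True
```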
Corollary 20.5.4. Row space basis and dimension.
Let \(A\) represent a matrix.
- If \(E\) is an invertible square matrix of compatible size, then \(A\) and \(EA\) have the same row space.
- The row space of each REF for \(A\) (including the RREF of \(A\)) is always the same as that of \(A\text{.}\)
- The nonzero rows of each REF for \(A\) form a basis for the row space of \(A\text{.}\)
- The dimension of the row space of \(A\) is equal to the rank of \(A\text{.}\)
Proof of Statement 1.
Since \(E\) is invertible, it can be expressed as a product of elementary matrices (Theorem 6.5.2), and the product \(EA\) has the same result as applying to \(A\) the sequence of row operations represented by those elementary matrices. But Proposition 20.5.3 tells us that applying those operations does not change the row space.
Proof of Statement 2.
Let \(F\) be an REF for \(A\text{,}\) and let \(E_1,E_2,\dotsc,E_\ell\) be elementary matrices corresponding to some sequence of row operations that reduces \(A\) to \(F\text{.}\) Set \(E = E_\ell \dotsm E_2 E_1\text{.}\) Then \(E\) is an invertible matrix and \(F = EA\text{.}\) Therefore, \(F\) has the same row space as \(A\) by Statement 1 of this corollary.
Proof of Statement 3.
Let \(F\) be an REF for \(A\text{.}\) By Statement 2 of this corollary, the rows of \(F\) are a spanning set for the row space of \(A\text{.}\) Clearly we can discard any zero rows from this spanning set, so it just remains to verify that the nonzero rows of \(F\) are linearly independent. For this, we will use Proposition 17.5.6, building up our linearly independent spanning set one vector at a time. Let \(\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_\ell\) represent the nonzero rows of \(F\text{,}\) from top to bottom. Start with \(\uvec{v}_\ell\text{;}\) all by itself, this one nonzero vector is linearly independent. Now, \(\uvec{v}_{\ell-1}\) cannot be in \(\Span\{\uvec{v}_\ell\}\text{,}\) because the leading one in \(\uvec{v}_{\ell-1}\) appears to the left of the leading one in \(\uvec{v}_\ell\text{,}\) and so no scalar multiple of \(\uvec{v}_\ell\) will have a nonzero entry in the component where \(\uvec{v}_{\ell-1}\) has its leading one. From this, Proposition 17.5.6 tells us that \(\{\uvec{v}_{\ell-1},\uvec{v}_\ell\}\) is linearly independent. Moving on, \(\uvec{v}_{\ell-2}\) cannot be in \(\Span\{\uvec{v}_{\ell-1},\uvec{v}_\ell\}\text{,}\) because the leading one in \(\uvec{v}_{\ell-2}\) appears to the left of the leading ones in both \(\uvec{v}_{\ell-1}\) and \(\uvec{v}_{\ell}\text{,}\) and so no linear combination of those two vectors will have a nonzero entry in the component where \(\uvec{v}_{\ell-2}\) has its leading one. From this, Proposition 17.5.6 tells us that \(\{\uvec{v}_{\ell-2},\uvec{v}_{\ell-1},\uvec{v}_\ell\}\) is linearly independent. Repeating this argument as we move up the rows of \(F\text{,}\) we see that the nonzero rows of \(F\) are linearly independent when taken all together.
Proof of Statement 4.
Applying Statement 3 of this corollary to the RREF for \(A\text{,}\) the nonzero rows of \(\RREF(A)\) form a basis for the row space of \(A\text{.}\) But the nonzero rows of \(\RREF(A)\) must all contain leading ones, so the number of vectors in a basis for the row space of \(A\) is equal to the number of leading ones in \(\RREF(A)\text{,}\) which is precisely the rank of \(A\text{,}\) as desired.
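The following sketch, again using the SymPy library with a made-up matrix, illustrates Statements 3 and 4: the nonzero rows of the RREF (an REF for \(A\)) are linearly independent, and their number is the rank of \(A\text{.}\)

```python
from sympy import Matrix

# Illustrative matrix: the third row is the sum of the first two.
A = Matrix([[1, 2, 1, 0],
            [2, 4, 3, 1],
            [3, 6, 4, 1]])

R, _ = A.rref()

# Keep only the nonzero rows of the RREF; these form a basis for the row space.
basis = [R.row(i) for i in range(R.rows) if any(R.row(i))]

print(Matrix.vstack(*basis).rank() == len(basis))  # rows are independent -> True
print(len(basis) == A.rank())                      # dimension = rank -> True
```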
Subsection 20.5.3 Column and row spaces versus rank and invertibility
As discovered in Discovery 20.4, we can use our observations recorded in Proposition 20.5.1 to connect column space to invertibility. We can similarly use Corollary 20.5.4 to also connect row space to invertibility.
First, we will extend the list of properties that are equivalent to invertibility of a square matrix, begun in Theorem 6.5.2 and continued in Theorem 10.5.3.
Theorem 20.5.5. Characterizations of invertibility.
For a square matrix \(A\text{,}\) the following are equivalent.
- Matrix \(A\) is invertible.
- Every linear system that has \(A\) as a coefficient matrix has a unique solution.
- The homogeneous system \(A\uvec{x} = \zerovec\) has only the trivial solution.
- There is some linear system that has \(A\) as a coefficient matrix and has a unique solution.
- The rank of \(A\) is equal to the size of \(A\text{.}\)
- The RREF of \(A\) is the identity.
- Matrix \(A\) can be expressed as a product of some number of elementary matrices.
- The determinant of \(A\) is nonzero.
- The null space of \(A\) consists of only the zero vector.
- The columns of \(A\) are linearly independent.
- The columns of \(A\) form a basis for \(\R^n\text{,}\) where \(n\) is the size of \(A\text{.}\)
- The rows of \(A\) are linearly independent.
- The rows of \(A\) form a basis for \(\R^n\text{,}\) where \(n\) is the size of \(A\text{.}\)
In particular, an \(n\times n\) matrix is invertible if and only if its columns form a basis for \(\R^n\text{.}\)
Proof.
We have previously encountered the equivalence of many of these statements, most recently in Theorem 10.5.3. So here we need only concern ourselves with the new statements. For each of these, if we can establish equivalence of the new statement to one of the old, then the new statement must be equivalent to all of the old, by the transitivity of logical equivalence.
Statement 9.
This is just a restatement of Statement 3 using the concept of null space.
Statement 10.
From our reinterpretation of Proposition 17.5.1, stated in Procedure 20.3.4, we know that all of the columns of \(A\) will be linearly independent if and only if every column of \(\RREF(A)\) has a leading one. Therefore, this statement is equivalent to Statement 5.
Statement 11.
This statement is equivalent to Statement 10, since Proposition 19.5.5 tells us that we need exactly \(n\) linearly independent vectors to form a basis for \(\R^n\text{.}\)
Statement 12.
From the row space version of the Test for Linear Dependence/Independence stated in Procedure 20.3.7, we know that all of the rows of \(A\) will be linearly independent if and only if every row of \(\RREF(A)\) is nonzero. Therefore, this statement is also equivalent to Statement 5.
Statement 13.
This statement is equivalent to Statement 12, again since Proposition 19.5.5 tells us that we need exactly \(n\) linearly independent vectors to form a basis for \(\R^n\text{.}\)
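For a specific square matrix, these characterizations can be compared computationally. Here is a brief sketch using the SymPy library (the matrix is an arbitrary invertible example, chosen only for illustration) that checks several of the statements side by side.

```python
from sympy import Matrix

# An arbitrary invertible 3x3 example.
A = Matrix([[2, 1, 0],
            [1, 1, 1],
            [0, 1, 3]])
n = A.rows

checks = {
    "determinant is nonzero":  A.det() != 0,
    "rank equals size":        A.rank() == n,
    "null space is trivial":   len(A.nullspace()) == 0,
    "columns are independent": len(A.columnspace()) == n,
    "rows are independent":    len(A.rowspace()) == n,
}
print(checks)  # for an invertible matrix, every value is True
```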
Finally, we’ll record an observation from Discovery 20.9, which is just a reframing of Proposition 2.5.8.
Theorem 20.5.6. Rank-Nullity Theorem.
If \(A\) is an \(m \times n\) matrix, then
\begin{equation*}
n = \rank(A) + \nullity(A).
\end{equation*}
That is,
\begin{equation*}
\dim\R^n = \dim(\text{column space of } A) + \dim(\text{null space of } A).
\end{equation*}
Proof.
The dimension of the column space of \(A\) is equal to the number of leading ones in its RREF, while the dimension of the null space of \(A\) is equal to the number of free variables, which is equal to the number of columns in the RREF that do not have a leading one. These two numbers must add up to the total number of columns in \(A\text{,}\) which is \(n\text{.}\)
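Finally, the counting in this proof can be verified directly; here is a short sketch using the SymPy library with a made-up \(3 \times 4\) matrix.

```python
from sympy import Matrix

# Illustrative 3x4 matrix (so n = 4); its third row is the sum of the first two.
A = Matrix([[1, 2, 0, 3],
            [0, 0, 1, 4],
            [1, 2, 1, 7]])

n = A.cols
rank = A.rank()                # number of leading ones in the RREF
nullity = len(A.nullspace())   # one basis vector per free variable

print(rank, nullity, n)        # 2 2 4
print(rank + nullity == n)     # True
```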