Section 33.6 Theory
Here we'll formally state some of the patterns we've encountered in this chapter.
First, let's note that a nilpotent matrix can always be put into triangular-block nilpotent form.
Theorem 33.6.1.
If \(A\) is a nilpotent matrix, then there exists an invertible matrix \(P\) such that \(\inv{P} A P\) is in triangular-block nilpotent form.
Proof idea.
Essentially, the existence of \(P\) is implied by the fact that we can write down procedures for computing such a \(P\text{.}\) (See Subsection 33.4.3.) However, it might take some convincing that the procedures we have written down do what they claim to do.
For a different perspective, see the topic of cyclic decomposition in [3, 6].
Once we know the form \(N = \inv{P} A P\) exists, it is possible to use the properties of \(N\) to identify properties of \(A\) that will help us find \(P\text{.}\)
Theorem 33.6.2.
Suppose that \(A\) is an \(n \times n\) nilpotent matrix that is similar to the triangular-block nilpotent form \(N\text{.}\) Then the following hold.
- The number of blocks in \(N\) is equal to \(\nullity A\text{,}\) the dimension of the null space of \(A\text{.}\)
- The largest (i.e. first) block in \(N\) has size \(k \times k\text{,}\) where \(k\) is the degree of nilpotency of \(A\text{,}\) and \(N\) has \(\rank (A^{k-1})\) blocks of this size.
Proof.
- For this statement, consider that \(N\) is almost in RREF; we would only need to swap some rows to put it in RREF. Therefore, the rank of \(N\) is equal to the number of ones that appear in it. Each elementary block of \(N\) is just a single one away from having full rank. That is, if \(N\) has a block of size \(m \times m\text{,}\) that block contributes exactly \((m-1)\) to the rank of \(N\text{.}\) Thus, the number of blocks in \(N\) is equal to how far \(N\) is from having full rank \(n\text{:}\)
\begin{equation*}
\rank N = n - (\text{\# of blocks in } N) \text{.}
\end{equation*}
We now have that the number of blocks in \(N\) is equal to \(n - \rank N\text{,}\) which is equal to \(\nullity N\) by the Rank-Nullity Theorem. Finally, since \(A\) and \(N\) are similar, they have the same nullity (Corollary 26.5.6), and so we could say that the number of blocks in \(N\) is equal to \(\nullity A\text{.}\)
- For this statement, first recall that similar nilpotent matrices have the same degree of nilpotency (Proposition 31.5.2), so \(N\) first becomes zero when \(A\) does. Powers of \(N\) can be computed by taking powers of its blocks (Statement 1 of Proposition 28.6.1), and we know that powers of each elementary block will become zero precisely at the block's size (see Discovery 32.3). Since a power of \(N\) will be zero precisely when each of its blocks is zero at that power, we can conclude that the degree of nilpotency of \(N\) (and hence of \(A\)) is equal to the size of the largest block in \(N\text{.}\)
It still remains to determine the number of blocks of largest size \(k \times k\text{,}\) where \(k\) is the common degree of nilpotency of \(N\) and \(A\text{.}\) For this, consider the matrix \(N^{k-1}\text{.}\) At this power, all blocks of smaller size will still become zero. But each block of size \(k \times k\text{,}\) raised to the power \(k-1\text{,}\) will consist of all zeros except for a single one in the lower left entry (again see Discovery 32.3). Therefore, each block of size \(k \times k\) contributes a single one to the rank of \(N^{k-1}\text{,}\) and each block of smaller size contributes nothing. Thus, \(\rank(N^{k-1})\) counts the number of blocks of \(N\) of size \(k \times k\text{.}\) Finally, since \(A\) is similar to \(N\text{,}\) so also \(A^{k-1}\) is similar to \(N^{k-1}\) (Proposition 26.5.4). Therefore, \(\rank(A^{k-1}) = \rank(N^{k-1})\text{,}\) and so we could also say that \(\rank(A^{k-1})\) counts the number of blocks of \(N\) of size \(k \times k\text{.}\)
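As a concrete illustration of both statements, consider (purely as an example, not a matrix from earlier in the chapter) the \(5 \times 5\) matrix in triangular-block nilpotent form built from elementary blocks of sizes \(3 \times 3\) and \(2 \times 2\text{:}\)
\begin{equation*}
N =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix},
\qquad
N^2 =
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
N^3 = 0 \text{.}
\end{equation*}
Here \(\rank N = 3\text{,}\) so \(\nullity N = 5 - 3 = 2\text{,}\) matching the two blocks; the degree of nilpotency is \(3\text{,}\) the size of the largest block; and \(\rank(N^2) = 1\text{,}\) matching the single block of largest size \(3 \times 3\text{.}\)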
Remark 33.6.3.
As we have seen in Subsection 33.4.2 and Subsection 33.5.1, it is possible to expand the above list of properties: the ranks of the powers of \(A\) tell us exactly the form of \(N\text{.}\)
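For instance, suppose a \(6 \times 6\) nilpotent matrix \(A\) (hypothetical, chosen only to illustrate the pattern) has rank data
\begin{equation*}
\rank A = 3, \qquad \rank (A^2) = 1, \qquad \rank (A^3) = 0 \text{.}
\end{equation*}
Then \(N\) must have \(\nullity A = 6 - 3 = 3\) blocks, the largest of size \(3 \times 3\) (the degree of nilpotency), and exactly \(\rank (A^2) = 1\) block of that size. The remaining two blocks must together account for the remaining three rows and columns, forcing sizes \(2 \times 2\) and \(1 \times 1\text{,}\) consistent with \(\rank A = (3-1) + (2-1) + (1-1) = 3\text{.}\)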
The following fact makes it easier to check whether our \(A\)-cyclic spaces are independent. Note that the statement is true even in the case that \(A\) is not nilpotent.
Proposition 33.6.4.
Suppose that \(A\) is an \(n \times n\) matrix and \(W_1, W_2, \dotsc, W_\ell\) is a collection of \(A\)-cyclic subspaces of \(\R^n\) (or \(\C^n\text{,}\) as appropriate), each of which has a cyclic basis whose last element lies in the null space of \(A\text{.}\) Then the collection \(W_1,W_2,\dotsc,W_\ell\) is independent if and only if the collection of final vectors in each of these cyclic bases forms a linearly independent set in the null space of \(A\text{.}\)
Proof outline.
If the spaces \(W_1,W_2,\dotsc,W_\ell\) are independent, then by definition their bases taken all together remain linearly independent. Since a subcollection of an independent set is still independent (Statement 2 of Proposition 18.5.3), it follows that the collection of final null space vectors taken from each cyclic basis will remain independent.
Now consider the reverse implication. We will prove this statement in the case of two cyclic spaces, \(W_1\) and \(W_2\text{.}\) (The proof in the case of more than two spaces is similar, but much more tedious.) The proof we give is similar to the independence argument in the proof of Theorem 32.6.3.
Suppose
\begin{equation*}
\uvec{u}, \; A\uvec{u}, \; A^2\uvec{u}, \; \dotsc, \; A^{k-1}\uvec{u}
\end{equation*}
is a cyclic basis for \(W_1\) and
\begin{equation*}
\uvec{v}, \; A\uvec{v}, \; A^2\uvec{v}, \; \dotsc, \; A^{m-1}\uvec{v}
\end{equation*}
is a cyclic basis for \(W_2\text{.}\) Further assume that
\begin{equation*}
\{ A^{k-1}\uvec{u}, A^{m-1}\uvec{v} \}
\end{equation*}
is a linearly independent set contained in the null space of \(A\text{.}\) Note that these vectors being in the null space means that
\begin{equation*}
A^k \uvec{u} = \zerovec \qquad \text{and} \qquad A^m \uvec{v} = \zerovec \text{.}
\end{equation*}
Case \(m=k\).
We wish to show that the union of the two cyclic bases forms a linearly independent set. Using the Test for Linear Dependence/Independence, assume that
\begin{equation*}
a_0 \uvec{u} + a_1 A\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_0 \uvec{v} + b_1 A\uvec{v} + \dotsb + b_{k-1} A^{k-1}\uvec{v} = \zerovec \text{.}
\end{equation*}
We wish to show that all of the scalars in the above linear combination are zero. First, multiply each side of the equation by \(A^{k-1}\text{.}\) Since
\begin{equation*}
A^{k-1} (A^j \uvec{u}) = A^{k-1+j} \uvec{u} = \zerovec
\qquad \text{and} \qquad
A^{k-1} (A^j \uvec{v}) = A^{k-1+j} \uvec{v} = \zerovec
\end{equation*}
for every \(j \ge 1\text{,}\) this eliminates all of the terms except the \(a_0\) and \(b_0\) terms, leaving
\begin{equation*}
a_0 A^{k-1} \uvec{u} + b_0 A^{k-1} \uvec{v} = \zerovec \text{.}
\end{equation*}
However, we have assumed that \(A^{k-1} \uvec{u}\) and \(A^{k-1} \uvec{v}\) are linearly independent. Therefore, we must have \(a_0 = b_0 = 0\text{,}\) leaving
\begin{equation*}
a_1 A\uvec{u} + a_2 A^2\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_1 A\uvec{v} + b_2 A^2\uvec{v} + \dotsb + b_{k-1} A^{k-1}\uvec{v} = \zerovec
\end{equation*}
in the original homogeneous vector equation. Multiplying both sides of this new equation by \(A^{k-2}\text{,}\) we can use the same argument to conclude that \(a_1 = b_1 = 0\text{.}\) Continuing in this fashion, we can argue that all the coefficients are zero.
Case \(m \gt k\).
Again, assume that
\begin{equation*}
a_0 \uvec{u} + a_1 A\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_0 \uvec{v} + b_1 A\uvec{v} + \dotsb + b_{m-1} A^{m-1}\uvec{v} = \zerovec \text{.}
\end{equation*}
Multiply both sides of this equation by \(A^{m-1}\text{.}\) Since \(m \gt k\text{,}\) this will eliminate all of the \(\uvec{u}\) terms, and just leave \(b_0 A^{m-1} \uvec{v} = \zerovec\text{.}\) Since \(A^{m-1} \uvec{v}\) is part of a basis, it cannot be zero, hence \(b_0 = 0\text{.}\) We are then left with
\begin{equation*}
a_0 \uvec{u} + a_1 A\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_1 A\uvec{v} + b_2 A^2\uvec{v} + \dotsb + b_{m-1} A^{m-1}\uvec{v} = \zerovec
\end{equation*}
in the original homogeneous vector equation. Continuing in this fashion, we can argue that \(b_0 = b_1 = \dotsb = b_{m-k-1} = 0\text{,}\) leaving
\begin{equation*}
a_0 \uvec{u} + a_1 A\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_{m-k} A^{m-k}\uvec{v} + b_{m-k+1} A^{m-k+1}\uvec{v} + \dotsb + b_{m-1} A^{m-1}\uvec{v} = \zerovec \text{.}
\end{equation*}
If we set \(\uvec{v}' = A^{m-k}\uvec{v}\text{,}\) we then have
\begin{equation*}
a_0 \uvec{u} + a_1 A\uvec{u} + \dotsb + a_{k-1} A^{k-1}\uvec{u} + b_{m-k} \uvec{v}' + b_{m-k+1} A\uvec{v}' + \dotsb + b_{m-1} A^{k-1}\uvec{v}' = \zerovec \text{,}
\end{equation*}
and from here we can repeat the argument of the \(m=k\) case.
Case \(m \lt k\).
Argue exactly as in the \(m \gt k\) case with the roles of \(m\) and \(k\) reversed.
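To see the proposition at work in a small case, suppose (purely as an illustration) that \(A\) is the \(4 \times 4\) matrix in triangular-block nilpotent form with two elementary blocks of size \(2 \times 2\text{,}\) so that
\begin{equation*}
A \uvec{e}_1 = \uvec{e}_2, \qquad A \uvec{e}_2 = \zerovec, \qquad A \uvec{e}_3 = \uvec{e}_4, \qquad A \uvec{e}_4 = \zerovec
\end{equation*}
on the standard basis vectors of \(\R^4\text{.}\) The \(A\)-cyclic spaces \(W_1\) and \(W_2\) generated by \(\uvec{e}_1\) and \(\uvec{e}_3\) have cyclic bases \(\uvec{e}_1, \uvec{e}_2\) and \(\uvec{e}_3, \uvec{e}_4\text{,}\) and the final vectors \(\uvec{e}_2\) and \(\uvec{e}_4\) form a linearly independent subset of the null space of \(A\text{.}\) The proposition then tells us that \(W_1\) and \(W_2\) are independent subspaces, as is easily verified directly.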
As a corollary, we can justify our method for choosing new generating vectors at each step of Procedure 33.4.2.
Corollary 33.6.5.
Suppose \(\uvec{w}_1, \uvec{w}_2, \dotsc, \uvec{w}_\ell\) are vectors from the null space of \(A^m\) such that the span of \(\uvec{w}_1, \uvec{w}_2, \dotsc, \uvec{w}_\ell\) and the null space of \(A^{m-1}\) form a pair of independent subspaces. Then the \(A\)-cyclic spaces generated by these vectors also form a collection of independent subspaces.
Proof.
Write \(W_j\) to represent the \(A\)-cyclic space generated by \(\uvec{w}_j\text{.}\) By our assumption, no individual vector \(\uvec{w}_j\) from our collection can lie in the null space of \(A^{m-1}\) (Theorem 28.6.4). In that case, Theorem 32.6.3 tells us that
\begin{equation*}
\uvec{w}_j, \; A \uvec{w}_j, \; A^2 \uvec{w}_j, \; \dotsc, \; A^{m-1} \uvec{w}_j
\end{equation*}
is a (cyclic) basis for \(W_j\text{,}\) and all of these spaces have the same dimension.
Since each \(\uvec{w}_j\) is in the null space of \(A^m\text{,}\) each final cyclic basis vector \(A^{m-1} \uvec{w}_j\) is in the null space of \(A\text{.}\) Therefore, Proposition 33.6.4 applies, and to demonstrate that the \(W_j\) form an independent collection of subspaces we just need to demonstrate that the collection
\begin{equation*}
A^{m-1} \uvec{w}_1, \; A^{m-1} \uvec{w}_2, \; \dotsc, \; A^{m-1} \uvec{w}_\ell
\end{equation*}
of final vectors is linearly independent.
So set up the Test for Linear Dependence/Independence:
\begin{equation*}
c_1 A^{m-1} \uvec{w}_1 + c_2 A^{m-1} \uvec{w}_2 + \dotsb + c_\ell A^{m-1} \uvec{w}_\ell = \zerovec \text{.}
\end{equation*}
Factoring out the common \(A^{m-1}\) yields
\begin{equation*}
A^{m-1} \bigl( c_1 \uvec{w}_1 + c_2 \uvec{w}_2 + \dotsb + c_\ell \uvec{w}_\ell \bigr) = \zerovec \text{,}
\end{equation*}
which gives us a linear combination of the \(\uvec{w}_j\) in the null space of \(A^{m-1}\text{.}\) By assumption, this linear combination must be the trivial one, so that each coefficient \(c_j\) is equal to \(0\text{.}\)
The assumption of the corollary is precisely the situation at each step of Procedure 33.4.2. If \(\{\uvec{u}_1, \uvec{u}_2, \dotsc, \uvec{u}_s \}\) is a basis for the null space of \(A^{m-1}\) and we extend that to a basis
for the null space of \(A^m\text{,}\) then no non-trivial linear combination of the new \(\uvec{w}_j\) vectors can be in the null space of \(A^{m-1}\text{,}\) as such a combination would have to also be equal to a linear combination of the \(\uvec{u}_j\) vectors, violating the linear independence of the full collection.
However, notice that it was not necessary to explicitly assume that the \(\uvec{w}_j\) in the corollary are linearly independent. Since the zero vector is in the null space of \(A^{m-1}\text{,}\) the stated assumption in the corollary implies that no non-trivial linear combination of the \(\uvec{w}_j\) can equal \(\zerovec\text{.}\)
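For instance, with the \(4 \times 4\) two-block matrix \(A\) used to illustrate Proposition 33.6.4 above, take \(m = 2\text{.}\) The null space of \(A\) has basis \(\{\uvec{e}_2, \uvec{e}_4\}\text{,}\) and extending to the basis \(\{\uvec{e}_2, \uvec{e}_4, \uvec{e}_1, \uvec{e}_3\}\) of the null space of \(A^2\) (all of \(\R^4\text{,}\) since \(A^2 = 0\)) produces the new generating vectors \(\uvec{w}_1 = \uvec{e}_1\) and \(\uvec{w}_2 = \uvec{e}_3\text{.}\) The corollary then guarantees that the cyclic spaces they generate are independent, in agreement with the earlier illustration.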