Section 33.6 Theory

Here we'll formally state some of the patterns we've encountered in this chapter.

First, let's note that a nilpotent matrix can always be put into triangular-block nilpotent form.

Essentially, the existence of \(P\) is implied by the fact that we can write down procedures for computing such a \(P\text{.}\) (See Subsection 33.4.3.) However, it might take some convincing that the procedures we have written down do what they claim to do.

For a different perspective, see the topic of cyclic decomposition in [3, 6].

Once we know the form \(N = \inv{P} A P\) exists, it is possible to use the properties of \(N\) to identify properties of \(A\) that will help us find \(P\text{.}\)

  1. For this statement, consider that \(N\) is almost in RREF; we would only need to swap some rows to put it in RREF. Therefore, the rank of \(N\) is equal to the number of ones that appear in it. Each elementary block of \(N\) is just a single one away from having full rank. That is, if \(N\) has a block of size \(m \times m\text{,}\) that block contributes exactly \((m-1)\) to the rank of \(N\text{.}\) Thus, the number of blocks in \(N\) is equal to how far \(N\) is from having full rank \(n\text{:}\)
    \begin{equation*} \rank N = n - (\text{ # of blocks in } N) \text{.} \end{equation*}
    We now have that the number of blocks in \(N\) is equal to \(n - \rank N\text{,}\) which is equal to \(\nullity N\) by the Rank-Nullity Theorem. Finally, since \(A\) and \(N\) are similar, they have the same nullity (Corollary 26.5.6), and so we could say that the number of blocks in \(N\) is equal to \(\nullity A\text{.}\)
  2. For this statement, first recall that similar nilpotent matrices have the same degree of nilpotency (Proposition 31.5.2), so \(N\) first becomes zero at the same power as \(A\) does. Powers of \(N\) can be computed by taking powers of its blocks (Statement 1 of Proposition 28.6.1), and we know that powers of each elementary block will become zero precisely at the block's size (see Discovery 32.3). Since a power of \(N\) will be zero precisely when each of its blocks is zero at that power, we can conclude that the degree of nilpotency of \(N\) (and hence of \(A\)) is equal to the size of the largest block in \(N\text{.}\)

    It still remains to determine the number of blocks of largest size \(k \times k\text{,}\) where \(k\) is the common degree of nilpotency of \(N\) and \(A\text{.}\) For this, consider the matrix \(N^{k-1}\text{.}\) At this power, all blocks of smaller size will still become zero. But each block of size \(k \times k\text{,}\) raised to the power \(k-1\text{,}\) will consist of all zeros except for a single one in the lower left entry (again see Discovery 32.3). Therefore, each block of size \(k \times k\) contributes a single one to the rank of \(N^{k-1}\text{,}\) and each block of smaller size contributes nothing. Thus, \(\rank(N^{k-1})\) counts the number of blocks of \(N\) of size \(k \times k\text{.}\) Finally, since \(A\) is similar to \(N\text{,}\) so also \(A^{k-1}\) is similar to \(N^{k-1}\) (Proposition 26.5.4). Therefore, \(\rank(A^{k-1}) = \rank(N^{k-1})\text{,}\) and so we could also say that \(\rank(A^{k-1})\) counts the number of blocks of \(N\) of size \(k \times k\text{.}\)
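Both statements are easy to check numerically. The following sketch (a hypothetical example, using NumPy; the block sizes are our choice, not from the text) builds a triangular-block nilpotent \(N\) with blocks of sizes \(3, 3, 2\) and verifies that the number of blocks is \(n - \rank N\) and that \(\rank(N^{k-1})\) counts the blocks of largest size \(k\text{:}\)

```python
import numpy as np

def elem_block(m):
    """Elementary nilpotent block: ones on the subdiagonal, zeros elsewhere."""
    return np.eye(m, k=-1)

def block_diag(blocks):
    """Assemble a block-diagonal matrix from a list of square blocks."""
    n = sum(b.shape[0] for b in blocks)
    N = np.zeros((n, n))
    i = 0
    for b in blocks:
        m = b.shape[0]
        N[i:i + m, i:i + m] = b
        i += m
    return N

sizes = [3, 3, 2]                     # block sizes (hypothetical example)
N = block_diag([elem_block(m) for m in sizes])
n = N.shape[0]                        # n = 8
rank = np.linalg.matrix_rank

# Statement 1: number of blocks = n - rank N (= nullity N)
assert n - rank(N) == len(sizes)

# Statement 2: degree of nilpotency = largest block size k,
# and rank(N^{k-1}) counts the blocks of size k x k
k = max(sizes)
assert np.all(np.linalg.matrix_power(N, k) == 0)
assert rank(np.linalg.matrix_power(N, k - 1)) == sizes.count(k)
```

Here each \(3 \times 3\) block contributes \(2\) to \(\rank N\) and the \(2 \times 2\) block contributes \(1\text{,}\) so \(\rank N = 5 = 8 - 3\text{,}\) matching the three blocks.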

Remark 33.6.3.

As we have seen in Subsection 33.4.2 and Subsection 33.5.1, it is possible to expand the above list of properties: the ranks of the powers of \(A\) tell us exactly the form of \(N\text{.}\)
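The remark can be made concrete with a short computation. The sketch below (our own function name and example, assuming only that \(A\) is nilpotent) recovers the block sizes of the form \(N\) from the ranks of the powers of \(A\text{,}\) using the counting pattern that follows from the block-power observations above: the number of blocks of size at least \(j\) is \(\rank(A^{j-1}) - \rank(A^j)\text{.}\)

```python
import numpy as np

rank = np.linalg.matrix_rank

def block_sizes(A):
    """Recover the block sizes in the triangular-block nilpotent form of a
    nilpotent matrix A, using the pattern
    (# of blocks of size >= j) = rank(A^(j-1)) - rank(A^j)."""
    n = A.shape[0]
    ranks = [n]                       # rank(A^0) = n
    P = np.eye(n)
    for _ in range(n):                # A nilpotent, so A^n = 0
        P = P @ A
        ranks.append(rank(P))
        if ranks[-1] == 0:
            break
    ge = [ranks[j - 1] - ranks[j] for j in range(1, len(ranks))]
    sizes = []
    for j in range(1, len(ge) + 1):
        exactly_j = ge[j - 1] - (ge[j] if j < len(ge) else 0)
        sizes.extend([j] * exactly_j)
    return sorted(sizes, reverse=True)

# Hide a known form N (blocks of sizes 3, 2, 1) behind a similarity
# A = S N S^(-1), then recover the block sizes from A alone.
N = np.zeros((6, 6))
for i in [1, 2, 4]:
    N[i, i - 1] = 1.0
S = np.tril(np.ones((6, 6))) @ np.triu(np.ones((6, 6)))  # det S = 1
A = S @ N @ np.round(np.linalg.inv(S))                    # integer entries
assert block_sizes(A) == [3, 2, 1]
```

Choosing \(S\) with determinant \(1\) and integer entries keeps all powers of \(A\) exact, so the rank computations are reliable.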

The following fact makes it easier to check whether our \(A\)-cyclic spaces are independent. Note that the statement is true even in the case that \(A\) is not nilpotent.

If the spaces \(W_1,W_2,\dotsc,W_\ell\) are independent, then by definition their bases taken all together remain linearly independent. Since a subcollection of an independent set is still independent (Statement 2 of Proposition 18.5.3), it follows that the collection of final null space vectors taken from each cyclic basis will remain independent.

Now consider the reverse implication. We will prove this statement in the case of two cyclic spaces, \(W_1\) and \(W_2\text{.}\) (The proof in the case of more than two spaces is similar, but much more tedious.) The proof we give is similar to the independence argument in the proof of Theorem 32.6.3.

Suppose

\begin{equation*} \{ \uvec{u}, A\uvec{u}, \dotsc, A^{k-1} \uvec{u} \} \end{equation*}

is a cyclic basis for \(W_1\) and

\begin{equation*} \{ \uvec{v}, A\uvec{v}, \dotsc, A^{m-1} \uvec{v} \} \end{equation*}

is a cyclic basis for \(W_2\text{.}\) Further assume that

\begin{equation*} \{ A^{k-1} \uvec{u}, A^{m-1} \uvec{v} \} \end{equation*}

is a linearly independent set contained in the null space of \(A\text{.}\) Note that these vectors being in the null space means that

\begin{equation*} A^k \uvec{u} = \zerovec, \qquad A^m \uvec{v} = \zerovec. \end{equation*}
Case \(m=k\).

We wish to show that the union of the two cyclic bases forms a linearly independent set. Using the Test for Linear Dependence/Independence, assume that

\begin{equation*} a_0 \uvec{u} + a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u} + b_0 \uvec{v} + b_1 A \uvec{v} + \dotsb + b_{k-1} A^{k-1} \uvec{v} = \zerovec\text{.} \end{equation*}

We wish to show that all of the scalars in the above linear combination are zero. First, multiply each side of the equation by \(A^{k-1}\text{.}\) Since

\begin{equation*} A^k \uvec{u} = A^k \uvec{v} = \zerovec\text{,} \end{equation*}

this eliminates all of the terms except the \(a_0\) and \(b_0\) terms, leaving

\begin{equation*} a_0 A^{k-1} \uvec{u} + b_0 A^{k-1} \uvec{v} = \zerovec \text{.} \end{equation*}

However, we have assumed that \(A^{k-1} \uvec{u}\) and \(A^{k-1} \uvec{v}\) are linearly independent. Therefore, we must have \(a_0 = b_0 = 0\text{,}\) leaving

\begin{equation*} a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u} + b_1 A \uvec{v} + \dotsb + b_{k-1} A^{k-1} \uvec{v} = \zerovec \end{equation*}

in the original homogeneous vector equation. Multiplying both sides of this new equation by \(A^{k-2}\text{,}\) we can use the same argument to conclude that \(a_1 = b_1 = 0\text{.}\) Continuing in this fashion, we can argue that all the coefficients are zero.

Case \(m \gt k\).

Again, assume that

\begin{equation*} a_0 \uvec{u} + a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u} + b_0 \uvec{v} + b_1 A \uvec{v} + \dotsb + b_{m-1} A^{m-1} \uvec{v} = \zerovec\text{.} \end{equation*}

Multiply both sides of this equation by \(A^{m-1}\text{.}\) Since \(m \gt k\text{,}\) this will eliminate all of the \(\uvec{u}\) terms, and just leave \(b_0 A^{m-1} \uvec{v} = \zerovec\text{.}\) Since \(A^{m-1} \uvec{v}\) is part of a basis, it cannot be zero, hence \(b_0 = 0\text{.}\) We are then left with

\begin{equation*} a_0 \uvec{u} + a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u} + b_1 A \uvec{v} + \dotsb + b_{m-1} A^{m-1} \uvec{v} = \zerovec \end{equation*}

in the original homogeneous vector equation. Continuing in this fashion, we can argue that \(b_0 = b_1 = \dotsb = b_{m-k-1} = 0\text{,}\) leaving

\begin{align*} a_0 \uvec{u} \amp + a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u}\\ \amp + b_{m-k} A^{m-k} \uvec{v} + b_{m-k+1} A^{m-k+1} \uvec{v} + \dotsb + b_{m-1} A^{m-1} \uvec{v} = \zerovec\text{.} \end{align*}

If we set \(\uvec{v}' = A^{m-k}\uvec{v}\text{,}\) we then have

\begin{align*} a_0 \uvec{u} \amp + a_1 A \uvec{u} + \dotsb + a_{k-1} A^{k-1} \uvec{u}\\ \amp + b_{m-k} \uvec{v}' + b_{m-k+1} A \uvec{v}' + \dotsb + b_{m-1} A^{k-1} \uvec{v}' = \zerovec\text{,} \end{align*}

and from here we can repeat the argument of the \(m=k\) case.

Case \(m \lt k\).

Argue exactly as in the \(m \gt k\) case with the roles of \(m\) and \(k\) reversed.
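The proposition is also easy to test numerically on a small example (a hypothetical one; the matrix and generating vectors are our choices): take \(A\) with blocks of sizes \(3\) and \(2\text{,}\) cyclic bases of sizes \(k = 3\) and \(m = 2\text{,}\) check that the two final vectors are independent vectors in the null space of \(A\text{,}\) and confirm that the union of the cyclic bases is then independent.

```python
import numpy as np

rank = np.linalg.matrix_rank

# Nilpotent A in triangular-block form, blocks of sizes 3 and 2
A = np.zeros((5, 5))
for i in [1, 2, 4]:
    A[i, i - 1] = 1.0

u = np.array([1., 0., 0., 0., 0.])    # cyclic basis {u, Au, A^2 u}, k = 3
v = np.array([0., 0., 0., 1., 0.])    # cyclic basis {v, Av},       m = 2

# The final vectors A^(k-1) u and A^(m-1) v lie in the null space of A ...
fu, fv = A @ A @ u, A @ v
assert np.all(A @ fu == 0) and np.all(A @ fv == 0)

# ... and are linearly independent ...
assert rank(np.column_stack([fu, fv])) == 2

# ... so the two cyclic bases taken together form an independent set
B = np.column_stack([u, A @ u, A @ A @ u, v, A @ v])
assert rank(B) == 5
```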

As a corollary, we can justify our method for choosing new generating vectors at each step of Procedure 33.4.2.

Write \(W_j\) to represent the \(A\)-cyclic space generated by \(\uvec{w}_j\text{.}\) By our assumption, no individual vector \(\uvec{w}_j\) from our collection can lie in the null space of \(A^{m-1}\) (Theorem 28.6.4). In that case, Theorem 32.6.3 tells us that

\begin{equation*} \{ \uvec{w}_j, A \uvec{w}_j, A^2 \uvec{w}_j, \dotsc, A^{m-1} \uvec{w}_j \} \end{equation*}

is a (cyclic) basis for \(W_j\text{,}\) and all of these spaces have the same dimension.

Since each \(\uvec{w}_j\) is in the null space of \(A^m\text{,}\) each final cyclic basis vector \(A^{m-1} \uvec{w}_j\) is in the null space of \(A\text{.}\) Therefore, Proposition 33.6.4 applies, and to demonstrate that the \(W_j\) form an independent collection of subspaces we just need to demonstrate that the collection

\begin{equation*} \{ A^{m-1} \uvec{w}_1, A^{m-1} \uvec{w}_2, \dotsc, A^{m-1} \uvec{w}_\ell \} \end{equation*}

of final vectors is linearly independent.

So set up the Test for Linear Dependence/Independence:

\begin{equation*} c_1 A^{m-1} \uvec{w}_1 + c_2 A^{m-1} \uvec{w}_2 + \dotsb + c_\ell A^{m-1} \uvec{w}_\ell = \zerovec \text{.} \end{equation*}

Factoring out the common \(A^{m-1}\) gives us

\begin{equation*} A^{m-1} (c_1 \uvec{w}_1 + c_2 \uvec{w}_2 + \dotsb + c_\ell \uvec{w}_\ell) = \zerovec \text{,} \end{equation*}

which gives us a linear combination of the \(\uvec{w}_j\) in the null space of \(A^{m-1}\text{.}\) By assumption, this linear combination must be the trivial one, so that each coefficient \(c_j\) is equal to \(0\text{.}\)

The assumption of the corollary is precisely the situation at each step of Procedure 33.4.2. If \(\{\uvec{u}_1, \uvec{u}_2, \dotsc, \uvec{u}_s \}\) is a basis for the null space of \(A^{m-1}\) and we extend that to a basis

\begin{equation*} \{\uvec{u}_1, \uvec{u}_2, \dotsc, \uvec{u}_s, \uvec{w}_1, \uvec{w}_2, \dotsc, \uvec{w}_\ell \} \end{equation*}

for the null space of \(A^m\text{,}\) then no non-trivial linear combination of the new \(\uvec{w}_j\) vectors can be in the null space of \(A^{m-1}\text{,}\) as such a combination would have to also be equal to a linear combination of the \(\uvec{u}_j\) vectors, violating the linear independence of the full collection.
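This extension step can be checked numerically as well (a sketch with a hypothetical matrix and generators of our choosing): with two blocks of size \(3\text{,}\) so \(m = 3\text{,}\) the new generators added when extending a basis of the null space of \(A^2\) to one of the null space of \(A^3\) satisfy the hypothesis of the corollary, and their cyclic spaces come out independent.

```python
import numpy as np

rank = np.linalg.matrix_rank

# Nilpotent A with two blocks of size 3, so degree of nilpotency m = 3
A = np.zeros((6, 6))
for i in [1, 2, 4, 5]:
    A[i, i - 1] = 1.0
A2 = A @ A

# null(A^2) has dimension 6 - rank(A^2) = 4; null(A^3) is all of R^6
assert 6 - rank(A2) == 4
assert np.all(A @ A2 == 0)

# New generators extending a basis of null(A^2) to one of null(A^3)
w1 = np.array([1., 0., 0., 0., 0., 0.])
w2 = np.array([0., 0., 0., 1., 0., 0.])

# No nontrivial combination of w1, w2 lies in null(A^2):
# A^2 (c1 w1 + c2 w2) = 0 forces c1 = c2 = 0
assert rank(np.column_stack([A2 @ w1, A2 @ w2])) == 2

# So by the corollary, the cyclic bases taken together are independent
B = np.column_stack([w1, A @ w1, A2 @ w1, w2, A @ w2, A2 @ w2])
assert rank(B) == 6
```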

However, notice that it was not necessary to explicitly assume that the \(\uvec{w}_j\) in the corollary are linearly independent. Since the zero vector is in the null space of \(A^{m-1}\text{,}\) the stated assumption in the corollary implies that no non-trivial linear combination of the \(\uvec{w}_j\) can equal \(\zerovec\text{.}\)