Section 32.4 Concepts

Subsection 32.4.1 Moving past triangular block form

In Discovery 32.1, we first reminded ourselves that similarity is transitive, and applied this fact with triangular block form as an intermediate form: if a matrix is similar to a triangular block matrix, and that triangular block matrix is similar to another (hopefully simpler) matrix, then the first matrix is also similar to the third matrix. So in our quest to determine a uniform answer to Question 28.1.1, we can attempt to move past triangular block form by concentrating on that form itself.

As in Discovery 32.1, consider triangular block form matrix

\begin{equation*} T = \begin{bmatrix} T_1 \\ \amp T_2 \\ \amp \amp \ddots \\ \amp \amp \amp T_\ell \end{bmatrix} \text{,} \end{equation*}

where each block \(T_j\) is a scalar-triangular form. We would like to construct a transition matrix \(Q\) so that \(\inv{Q} T Q\) is somehow even simpler in form than \(T\text{.}\) There are aspects of triangular block form that make it particularly simple, and we don't want to mess those up. But if we restrict ourselves to working with transition matrices

\begin{equation*} Q = \begin{bmatrix} Q_1 \\ \amp Q_2 \\ \amp \amp \ddots \\ \amp \amp \amp Q_\ell \end{bmatrix} \end{equation*}

that are also in block-diagonal form, with the same number and sizes of blocks as \(T\text{,}\) then the algebra of block-diagonal matrices (Statement 1 of Proposition 28.6.1) tells us that

\begin{equation*} \inv{Q} T Q = \begin{bmatrix} \inv{Q}_1 T_1 Q_1 \\ \amp \inv{Q}_2 T_2 Q_2 \\ \amp \amp \ddots \\ \amp \amp \amp \inv{Q}_\ell T_\ell Q_\ell \end{bmatrix} \text{.} \end{equation*}

So instead of focusing on how to make triangular block form simpler, we can focus on how to make each scalar-triangular block simpler.
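The blockwise effect of conjugating by a block-diagonal transition matrix can be checked numerically. The following NumPy sketch uses a small hypothetical example with two blocks (the particular matrices are invented for illustration, not taken from the text):

```python
import numpy as np

# A 3x3 triangular block matrix T with two diagonal blocks (sizes 2 and 1),
# and a block-diagonal transition matrix Q with matching block sizes.
T1 = np.array([[2.0, 5.0],
               [0.0, 2.0]])
T2 = np.array([[7.0]])
T = np.block([[T1, np.zeros((2, 1))],
              [np.zeros((1, 2)), T2]])

Q1 = np.array([[1.0, 3.0],
               [0.0, 1.0]])
Q2 = np.array([[4.0]])
Q = np.block([[Q1, np.zeros((2, 1))],
              [np.zeros((1, 2)), Q2]])

# Conjugating T by Q acts block by block, as the algebra of
# block-diagonal matrices predicts.
left = np.linalg.inv(Q) @ T @ Q
blockwise = np.block([[np.linalg.inv(Q1) @ T1 @ Q1, np.zeros((2, 1))],
                      [np.zeros((1, 2)), np.linalg.inv(Q2) @ T2 @ Q2]])
assert np.allclose(left, blockwise)
```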

A scalar-triangular matrix \(S\) can be decomposed as a sum of a scalar matrix and an upper triangular matrix,

\begin{equation*} S = \lambda I + N \text{,} \end{equation*}

where the upper triangular matrix has zeros down the diagonal. (See (\(\star\)) in Subsection 31.3.2.) If \(Q\) is any transition matrix, then

\begin{equation*} \inv{Q} S Q = \inv{Q} (\lambda I + N) Q = \lambda \inv{Q} I Q + \inv{Q} N Q = \lambda I + \inv{Q} N Q \text{,} \end{equation*}

so again we can narrow our focus onto how to make matrices of the form of \(N\) simpler through similarity. And since an upper triangular matrix of this form is always nilpotent (Lemma 31.5.1), we should turn our attention to the similarity patterns of nilpotent matrices.
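The computation above, and the nilpotency of \(N\text{,}\) can both be verified numerically. Here is a small NumPy sketch with a hypothetical \(3 \times 3\) scalar-triangular matrix (the specific entries are invented for illustration):

```python
import numpy as np

# A scalar-triangular S = lambda*I + N, where N is upper triangular
# with zeros down the diagonal (hence nilpotent).
lam = 3.0
N = np.array([[0.0, 1.0, 4.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
S = lam * np.eye(3) + N

# For any invertible Q, the scalar part is untouched by conjugation:
# inv(Q) S Q = lambda*I + inv(Q) N Q.
Q = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 5.0],
              [0.0, 0.0, 1.0]])
conj = np.linalg.inv(Q) @ S @ Q
assert np.allclose(conj, lam * np.eye(3) + np.linalg.inv(Q) @ N @ Q)

# And N really is nilpotent: N^3 is the zero matrix.
assert np.allclose(np.linalg.matrix_power(N, 3), np.zeros((3, 3)))
```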

Subsection 32.4.2 The similarity pattern of elementary nilpotent form

As uncovered in Discovery 32.2, nilpotent matrices can only be similar to other nilpotent matrices. And as discussed in Section 32.2, it might be fruitful to first examine the case of nilpotent matrices that are similar to elementary nilpotent form.

We'll follow the example of Discovery 32.4 and consider the \(4 \times 4\) case. Suppose that \(A\) is similar to

\begin{equation*} N = \begin{bmatrix} 0 \amp 0 \amp 0 \amp 0 \\ 1 \amp 0 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \amp 0 \end{bmatrix} \text{.} \end{equation*}

Using Pattern 26.3.1, we know that \(\inv{P} A P = N\) can only be true if

\begin{align*} A\uvec{p}_1 \amp = 0\uvec{p}_1 + 1 \uvec{p}_2 + 0\uvec{p}_3 + 0\uvec{p}_4 \text{,}\\ A\uvec{p}_2 \amp = 0\uvec{p}_1 + 0 \uvec{p}_2 + 1\uvec{p}_3 + 0\uvec{p}_4 \text{,}\\ A\uvec{p}_3 \amp = 0\uvec{p}_1 + 0\uvec{p}_2 + 0\uvec{p}_3 + 1\uvec{p}_4 \text{,}\\ A\uvec{p}_4 \amp = 0\uvec{p}_1 + 0\uvec{p}_2 + 0\uvec{p}_3 + 0\uvec{p}_4 \text{,} \end{align*}

where the \(\uvec{p}_j\) are the columns of \(P\text{,}\) as usual. Rewriting this to make it a little more clear, we have

\begin{align*} A\uvec{p}_1 \amp = \uvec{p}_2 \text{,}\\ A\uvec{p}_2 \amp = \uvec{p}_3 \text{,}\\ A\uvec{p}_3 \amp = \uvec{p}_4 \text{,}\\ A\uvec{p}_4 \amp = \zerovec \text{.} \end{align*}

The effect of multiplication by \(A\) on the columns of \(P\) seems to follow a cyclic pattern, though the pattern ends in the zero vector instead of wrapping back around to \(\uvec{p}_1\text{.}\) This pattern becomes more evident if we repeatedly apply \(A\) to \(\uvec{p}_1\text{.}\)

\begin{align*} A \uvec{p}_1 \amp = \uvec{p}_2 \text{,}\\ A^2 \uvec{p}_1 \amp = A(A\uvec{p}_1) = A\uvec{p}_2 = \uvec{p}_3 \text{,}\\ A^3 \uvec{p}_1 \amp = A(A^2\uvec{p}_1) = A\uvec{p}_3 = \uvec{p}_4 \text{,}\\ A^4 \uvec{p}_1 \amp = A(A^3\uvec{p}_1) = A\uvec{p}_4 = \zerovec \text{.} \end{align*}

Of course, the pattern ended at \(A^4 \uvec{p}_1 = \zerovec\) because \(A\) is a \(4 \times 4\) nilpotent matrix (Statement 4 of Theorem 31.5.3). In particular, the fact that the cycle ends in the zero vector tells us that the vector before it, \(\uvec{p}_4\text{,}\) must be in the null space of \(A\text{.}\)

It should be clear that the pattern will unfold in a similar manner for an \(n \times n\) matrix \(A\) that is similar to an elementary nilpotent form matrix.

Pattern 32.4.1. Similarity with an elementary nilpotent matrix.

For \(n \times n\) matrix \(A\) to be similar to an elementary nilpotent matrix \(N\) via similarity relation \(\inv{P} A P\text{,}\) the columns \(\uvec{p}_1,\uvec{p}_2,\dotsc,\uvec{p}_n\) of the transition matrix \(P\) must satisfy the pattern

\begin{equation*} \uvec{p}_{j+1} = A \uvec{p}_j = A^j \uvec{p}_1 \text{,} \end{equation*}

where the last vector \(\uvec{p}_n = A^{n-1} \uvec{p}_1\) must be in the null space of \(A\text{.}\)
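This pattern can be realized concretely with NumPy. In the sketch below, the matrix \(A\) is manufactured as \(B N \inv{B}\) for an invented invertible \(B\) (just so that we have a matrix known to be similar to elementary nilpotent form); the columns of \(P\) are then forced by the pattern, and the similarity is confirmed:

```python
import numpy as np

# Elementary nilpotent form: ones on the subdiagonal, zeros elsewhere.
n = 4
N = np.diag(np.ones(n - 1), k=-1)

# A hypothetical matrix A similar to N, built here as B N B^{-1}
# purely for demonstration purposes.
B = np.array([[1., 1., 0., 0.],
              [0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [0., 0., 0., 1.]])
A = B @ N @ np.linalg.inv(B)

# Choose a first column p_1 with A^{n-1} p_1 != 0; the remaining
# columns of P are then forced: p_{j+1} = A^j p_1.
p1 = np.array([1., 0., 0., 0.])
cols = [np.linalg.matrix_power(A, j) @ p1 for j in range(n)]
P = np.column_stack(cols)

# The pattern realizes the similarity: P^{-1} A P = N.
assert np.allclose(np.linalg.inv(P) @ A @ P, N)
```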

Subsection 32.4.3 Elementary nilpotent procedure

According to Pattern 32.4.1, choosing the first column \(\uvec{p}_1\) of a transition matrix \(P\) automatically chooses all the other columns by

\begin{equation*} \uvec{p}_{j+1} = A^j \uvec{p}_1 \text{.} \end{equation*}

However, none of the columns of \(P\) may be zero, since we need \(P\) to be invertible. In particular, we need

\begin{gather} \uvec{p}_n = A^{n-1} \uvec{p}_1 \ne \zerovec\text{.}\label{equation-elem-nilpotent-concepts-transition-pattern}\tag{\(\star\)} \end{gather}

So while \(A\) must be nilpotent in order to be similar to a nilpotent matrix, and so will satisfy \(A^n = \zerovec\) (Statement 4 of Theorem 31.5.3), we must have \(A^{n-1} \ne \zerovec\). This echoes Task 32.3.b, where we recognized that an \(n \times n\) elementary nilpotent matrix has maximum degree of nilpotency \(n\text{.}\) It follows that for \(A\) to be similar to an elementary nilpotent matrix, it must also have maximum degree of nilpotency (Proposition 31.5.2).

Note that condition (\(\star\)) is enough to guarantee that all the other columns of \(P\) are nonzero, since if

\begin{equation*} \uvec{p}_j = A^{j-1} \uvec{p}_1 = \zerovec \end{equation*}

were true for some \(j \lt n\text{,}\) then

\begin{equation*} A^{n-1} \uvec{p}_1 = A^{n-j} (A^{j-1} \uvec{p}_1) = \zerovec \end{equation*}

would be true as well.

Remark 32.4.3.
  1. Such a vector \(\uvec{v}\) as in the procedure must exist because the only \(n \times n\) matrix whose null space is all of \(\R^n\) (or \(\C^n\text{,}\) as appropriate) is the zero matrix. Since we have assumed \(A^{n-1} \ne \zerovec\text{,}\) there must exist vectors that are not in the null space of \(A^{n-1}\text{.}\) In particular, if the \(\nth[j]\) column of \(A^{n-1}\) is nonzero, then we could take \(\uvec{v}\) to be \(\uvec{e}_j\text{,}\) the \(\nth[j]\) standard basis vector.
  2. While we have justified above that choosing \(\uvec{v}\) so that \(A^{n-1} \uvec{v} \ne \zerovec\) will guarantee that none of the columns of \(P\) will be zero, this is not enough to guarantee that those columns will be linearly independent, which is required for \(P\) to be invertible. We leave this to be verified in Section 32.6.
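The recipe in Statement 1 of the remark, choosing \(\uvec{v}\) to be a standard basis vector corresponding to a nonzero column of \(A^{n-1}\text{,}\) can be sketched in NumPy. Here \(A\) is taken to be an elementary nilpotent matrix itself, just to have a concrete maximum-degree nilpotent example:

```python
import numpy as np

# A concrete 4x4 nilpotent matrix of maximum degree: A^{n-1} != 0.
n = 4
A = np.diag(np.ones(n - 1), k=-1)

# Since A^{n-1} is not the zero matrix, some column j of A^{n-1} is
# nonzero, and then v = e_j lies outside the null space of A^{n-1}.
M = np.linalg.matrix_power(A, n - 1)
j = np.flatnonzero(np.any(M != 0, axis=0))[0]  # first nonzero column
v = np.eye(n)[:, j]
assert np.any(M @ v != 0)  # A^{n-1} v != 0, as required
```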

Subsection 32.4.4 Cyclic subspaces

In Procedure 32.4.2, for the transition matrix \(P\) to be invertible it must be true that the vectors

\begin{equation*} \uvec{v}, A \uvec{v}, A^2 \uvec{v}, \dotsc, A^{n-1} \uvec{v} \end{equation*}

are linearly independent and so form a basis of \(\R^n\) (or \(\C^n\text{,}\) as appropriate). It is reasonable to turn the “cyclic” pattern exhibited by these basis vectors into a general concept, for all matrices instead of just for nilpotent ones.

Given an \(n \times n\) matrix \(A\text{,}\) a subspace \(W\) of \(\R^n\) (or \(\C^n\text{,}\) as appropriate) that can be described by a “cyclic” spanning set as

\begin{equation*} W = \Span \{ \uvec{w}, A \uvec{w}, A^2 \uvec{w}, \dotsc \} \end{equation*}

is called an \(A\)-cyclic subspace, and the generating vector \(\uvec{w}\) is called an \(A\)-cyclic vector for \(W\text{.}\) We will explore the theory of \(A\)-cyclic subspaces in Section 32.6, but we will say a couple of things here.

First, being contained in a finite-dimensional space, the infinite spanning set for \(W\) cannot be linearly independent. So there will be a point in the sequence

\begin{equation*} \uvec{w}, A \uvec{w}, A^2 \uvec{w}, \dotsc \end{equation*}

where \(A^k \uvec{w}\) is linearly dependent on the preceding vectors, and then we will have

\begin{equation*} W = \Span \{ \uvec{w}, A \uvec{w}, A^2 \uvec{w}, \dotsc, A^{k-1} \uvec{w} \} \text{.} \end{equation*}

(See Lemma 32.6.2.)

Second, an \(A\)-cyclic subspace is always \(A\)-invariant. Every vector in such a subspace \(W\) is a linear combination of the cyclic spanning vectors \(A^j \uvec{w}\text{,}\) and multiplying such a linear combination by \(A\) will obviously still be a linear combination of those cyclic spanning vectors, just with each power of \(A\) in the linear combination increased by one.
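This invariance is easy to test numerically: a vector lies in \(W\) exactly when appending it to the cyclic spanning vectors does not increase the rank. The NumPy sketch below uses a hypothetical \(3 \times 3\) matrix whose cyclic subspace generated by \(\uvec{w}\) is a proper, two-dimensional subspace:

```python
import numpy as np

# A hypothetical matrix and cyclic vector; here A^2 w = 0, so the
# A-cyclic subspace is W = span{w, Aw}, a plane in R^3.
A = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 5.]])
w = np.array([0., 1., 0.])
basis = np.column_stack([w, A @ w])

x = basis @ np.array([2., -1.])  # an arbitrary vector in W
y = A @ x                        # its image under multiplication by A

# y is in W iff appending it to the spanning vectors leaves the rank
# unchanged -- so W is A-invariant, as claimed.
r = np.linalg.matrix_rank(basis)
assert np.linalg.matrix_rank(np.column_stack([basis, y])) == r
```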

Warning 32.4.4.

A subspace of \(\R^n\) (or \(\C^n\text{,}\) as appropriate) does not necessarily contain any \(A\)-cyclic vectors, even if it is \(A\)-invariant.

Subsection 32.4.5 Properties of elementary nilpotent matrices

In Discovery 32.3, we explored properties of elementary nilpotent matrices and their powers. Here we will simply summarize our two main discoveries, which are both a consequence of a simple observation: in powers of an elementary nilpotent matrix, the diagonal of ones marches down the sub-diagonals of the matrix, taking one step down per increase in exponent.

Suppose \(N\) is an \(n \times n\) matrix in elementary nilpotent form.

  • Matrix \(N\) has maximum degree of nilpotency \(n\text{.}\) That is, the first power of \(N\) that becomes the zero matrix is the \(\nth\) power.
  • The rank of \(N\) is one less than full, \(n - 1\text{,}\) and ranks of powers of \(N\) decrease from there:
    \begin{equation*} \rank N = n - 1, \quad \rank N^2 = n - 2, \quad \rank N^3 = n - 3, \quad \dotsc \text{,} \end{equation*}
    until we reach \(\rank N^n = 0\text{.}\)
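Both properties are easy to confirm numerically for a small case. The following NumPy sketch checks the rank pattern for the \(4 \times 4\) elementary nilpotent matrix:

```python
import numpy as np

# The 4x4 elementary nilpotent matrix: ones on the subdiagonal.
n = 4
N = np.diag(np.ones(n - 1), k=-1)

# Ranks of successive powers decrease by one each time:
# rank N = 3, rank N^2 = 2, rank N^3 = 1, rank N^4 = 0.
ranks = [int(np.linalg.matrix_rank(np.linalg.matrix_power(N, k)))
         for k in range(1, n + 1)]
assert ranks == [3, 2, 1, 0]
```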

Since rank and degree of nilpotency are both preserved by similarity, any matrix that is similar to a matrix in elementary nilpotent form will share the same properties.

A look ahead.

We will not need these properties to further explore similarity to elementary nilpotent form, but they will become important in the next chapter for being able to determine similarity to a matrix built out of elementary nilpotent blocks, without actually having to determine the transition matrix to realize such similarity.