
Section 31.3 Concepts

We have already used triangular block form to establish some important properties of generalized eigenspaces (Statement 2 of Theorem 30.5.5). In Discovery guide 31.1, we further explored what triangular block form tells us about the connections between a matrix and its characteristic polynomial.

Subsection 31.3.1 Matrix polynomials

Just as we can evaluate polynomials at specific numeric values of the indeterminate \(x\text{,}\) matrix algebra allows us to evaluate polynomials at “values” of \(x\) that are square matrices, if we regard the constant term of the polynomial as a scalar multiple of the identity matrix.

For example, the polynomial

\begin{equation*} p(x) = x^2 - 3x + 6 \end{equation*}

from Discovery 31.1 can be rewritten as

\begin{equation*} p(X) = X^2 - 3X + 6I \text{,} \end{equation*}

where now we interpret the indeterminate \(X\) to represent a square matrix (of whatever size is appropriate to the context).
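
For instance, for an illustrative choice of \(2 \times 2\) matrix (not one taken from the discovery guide, just an example), we could compute

\begin{equation*} A = \begin{bmatrix} 1 \amp 2 \\ 0 \amp 3 \end{bmatrix} \text{,} \qquad p(A) = A^2 - 3A + 6I = \begin{bmatrix} 1 \amp 8 \\ 0 \amp 9 \end{bmatrix} - \begin{bmatrix} 3 \amp 6 \\ 0 \amp 9 \end{bmatrix} + \begin{bmatrix} 6 \amp 0 \\ 0 \amp 6 \end{bmatrix} = \begin{bmatrix} 4 \amp 2 \\ 0 \amp 6 \end{bmatrix}\text{.} \end{equation*}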

Similarity and matrix polynomials.

In Task 31.1.b, we conjectured that

\begin{equation*} p(\inv{P} A P) = \inv{P} \bigl(p(A)\bigr) P \end{equation*}

is always true, so that similar matrices evaluate to similar results in a matrix polynomial. Even more, the same transition matrix that realizes the similarity of the original matrices will realize the similarity of the evaluation results. We will state and provide the outline of a proof for this fact as Lemma 31.5.6 in Subsection 31.5.2.
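
One way to see why this should hold is to note that powers of \(\inv{P} A P\) telescope: for example,

\begin{equation*} (\inv{P} A P)^2 = \inv{P} A P \inv{P} A P = \inv{P} A^2 P \text{,} \end{equation*}

and similarly \((\inv{P} A P)^k = \inv{P} A^k P\) for every positive exponent \(k\text{,}\) so each term of the polynomial transforms by the same transition matrix \(P\text{.}\)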

Diagonal and block-diagonal matrices in matrix polynomials.

Powers, scalar multiples, and sums of diagonal matrices all result in diagonal matrices, and can all be calculated by performing those operations on the diagonal entries. So, as we discovered in Task 31.1.c, for a diagonal matrix \(D\) and polynomial \(p(x)\text{,}\) we can calculate

\begin{equation*} D = \begin{bmatrix} d_1 \\ \amp d_2\\ \amp \amp \ddots \\ \amp \amp \amp d_n \end{bmatrix} \quad\implies\quad p(D) = \begin{bmatrix} p(d_1) \\ \amp p(d_2)\\ \amp \amp \ddots \\ \amp \amp \amp p(d_n) \end{bmatrix}\text{.} \end{equation*}

In other words, a matrix polynomial of a diagonal matrix can be evaluated by evaluating the polynomial at each diagonal entry.
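
For instance, pairing the polynomial \(p(x) = x^2 - 3x + 6\) from above with an illustrative \(2 \times 2\) diagonal matrix gives

\begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -1 \end{array}\right] \quad\implies\quad p(D) = \begin{bmatrix} p(2) \amp 0 \\ 0 \amp p(-1) \end{bmatrix} = \begin{bmatrix} 4 \amp 0 \\ 0 \amp 10 \end{bmatrix}\text{.} \end{equation*}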

Powers, scalar multiples, and sums of block-diagonal matrices can be calculated in a similar fashion by working with the blocks on the diagonal, so that if \(D\) is in block-diagonal form, then

\begin{equation*} D = \begin{bmatrix} D_1 \\ \amp D_2\\ \amp \amp \ddots \\ \amp \amp \amp D_m \end{bmatrix} \quad\implies\quad p(D) = \begin{bmatrix} p(D_1) \\ \amp p(D_2)\\ \amp \amp \ddots \\ \amp \amp \amp p(D_m) \end{bmatrix}\text{.} \end{equation*}
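
For instance, with the same illustrative polynomial \(p(x) = x^2 - 3x + 6\) and a block-diagonal matrix built from a \(1 \times 1\) block and a \(2 \times 2\) block, we get

\begin{equation*} D = \begin{bmatrix} 2 \amp 0 \amp 0 \\ 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 1 \end{bmatrix} \quad\implies\quad p(D) = \left[\begin{array}{rrr} 4 \amp 0 \amp 0 \\ 0 \amp 4 \amp -1 \\ 0 \amp 0 \amp 4 \end{array}\right] \text{,} \end{equation*}

where the lower \(2 \times 2\) block of \(p(D)\) is the result of evaluating \(p\) at the corresponding block of \(D\text{.}\)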

Subsection 31.3.2 Nilpotent matrices

In Discovery 31.2 and Discovery 31.4 we noticed a common pattern for a matrix \(T\) in triangular block form: if \(\lambda\) is an eigenvalue of \(T\text{,}\) then \(\lambda I - T\) will have zeros down the diagonal of one of its upper triangular blocks. A scalar-triangular matrix is just a triangular block matrix with a single block, and in this case \(\lambda I - T\) has zeros down the entire main diagonal. In Subsection 29.4.1 we observed an alternate form of this pattern: a scalar-triangular matrix can always be decomposed into a sum of a scalar matrix and a scalar-triangular matrix with zeros down the diagonal. In general, this can be visualized as

\begin{align} \begin{bmatrix} \lambda \amp \ast \amp \cdots \amp \ast \\ \phantom{\ddots} \amp \lambda \amp \ddots \amp \vdots \\ \amp \phantom{\ddots} \amp \ddots \amp \ast \\ \amp \amp \amp \lambda \end{bmatrix} = \lambda I + \begin{bmatrix} 0 \amp \ast \amp \cdots \amp \ast \\ \phantom{\ddots} \amp 0 \amp \ddots \amp \vdots \\ \amp \phantom{\ddots} \amp \ddots \amp \ast \\ \amp \amp \amp 0 \end{bmatrix}\text{.}\label{equation-cayley-hamilton-scalar-triang-decomp}\tag{\(\star\)} \end{align}

In Discovery 31.3 we focused specifically on matrices of the same form as the last matrix on the right above. A matrix of this form is called nilpotent, and we soon discovered why: in successive powers of such a matrix, the zeros march up the “super” diagonals above the main diagonal, until the whole matrix becomes zero once the exponent reaches the size of the matrix.
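
For instance, in the illustrative \(3 \times 3\) case

\begin{equation*} N = \begin{bmatrix} 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 1 \\ 0 \amp 0 \amp 0 \end{bmatrix} \text{,} \qquad N^2 = \begin{bmatrix} 0 \amp 0 \amp 1 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \end{bmatrix} \text{,} \qquad N^3 = \begin{bmatrix} 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \end{bmatrix} \text{,} \end{equation*}

the nonzero entries retreat to higher and higher “super” diagonals with each successive power, until nothing is left.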

This form of matrix is just a special case of scalar-triangular form, with the single eigenvalue \(0\) of multiplicity equal to the size of the matrix, and characteristic polynomial \(c(\lambda) = \lambda^n\text{.}\)

Generalizing from this nilpotent form of matrix, every square matrix with the property that some power of it is the zero matrix is also called nilpotent, and we will study the similarity patterns of such matrices in the next two chapters. But we will have a few things to say about the eigenvalues and characteristic polynomials of nilpotent matrices in Section 31.5, similar to our analysis above of the nilpotent scalar-triangular form.
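
Note that a matrix need not be in triangular form to be nilpotent in this broader sense. For instance, the illustrative matrix

\begin{equation*} A = \left[\begin{array}{rr} 1 \amp -1 \\ 1 \amp -1 \end{array}\right] \end{equation*}

satisfies \(A^2 = \zerovec\text{,}\) even though none of its entries is zero.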

Subsection 31.3.3 Matrices and their characteristic polynomials

In Discovery 31.4, the pattern of powers of nilpotent matrices led us to the conclusion that a triangular block matrix evaluates to zero in its own characteristic polynomial. Combining this with the facts that

  • every matrix is similar (over \(\C\)) to a triangular block matrix,
  • similar matrices evaluate to similar results in a polynomial, and
  • the only matrix that is similar to the zero matrix is the zero matrix itself,

we are led to the conclusion that every matrix evaluates to zero in its own characteristic polynomial. This fact is usually referred to as the Cayley-Hamilton Theorem.
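
For example, for the illustrative matrix

\begin{equation*} A = \begin{bmatrix} 1 \amp 2 \\ 3 \amp 4 \end{bmatrix} \text{,} \end{equation*}

the characteristic polynomial is \(c_A(\lambda) = \lambda^2 - 5 \lambda - 2\text{,}\) and indeed

\begin{equation*} A^2 - 5 A - 2 I = \begin{bmatrix} 7 \amp 10 \\ 15 \amp 22 \end{bmatrix} - \begin{bmatrix} 5 \amp 10 \\ 15 \amp 20 \end{bmatrix} - \begin{bmatrix} 2 \amp 0 \\ 0 \amp 2 \end{bmatrix} = \begin{bmatrix} 0 \amp 0 \\ 0 \amp 0 \end{bmatrix}\text{.} \end{equation*}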

Roots and field extensions.

The Cayley-Hamilton Theorem is an extraordinary result! It says that every matrix is a root of its own characteristic polynomial, which suggests an algebraic way to realize roots of polynomials concretely, rather than just writing them symbolically with a \(\sqrt{}\) symbol. It is beyond the scope of this book to go into exactly what this means, but we can give a flavour of it here with an example.

In Subsection 7.3.1, we saw how scalar matrices allow us to embed the algebra of numbers into the algebra of matrices — that is, a scalar matrix is essentially the same as a number.

What if we further expand the concept of number to include the \(2 \times 2\) matrix

\begin{equation*} C = \left[\begin{array}{rr} 0 \amp -1 \\ 1 \amp 0 \end{array}\right] \text{?} \end{equation*}

It is straightforward to compute the characteristic polynomial

\begin{equation*} c_C(\lambda) = \lambda^2 + 1 \text{,} \end{equation*}

so the Cayley-Hamilton Theorem tells us that

\begin{equation*} C^2 + I = \zerovec \text{.} \end{equation*}
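
Indeed, a direct computation confirms this:

\begin{equation*} C^2 = \left[\begin{array}{rr} 0 \amp -1 \\ 1 \amp 0 \end{array}\right] \left[\begin{array}{rr} 0 \amp -1 \\ 1 \amp 0 \end{array}\right] = \left[\begin{array}{rr} -1 \amp 0 \\ 0 \amp -1 \end{array}\right] = -I \text{.} \end{equation*}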

This is precisely the defining condition of the “imaginary number” \(\ci\text{.}\) It turns out that \(\ci\) is not so imaginary after all, as here we have a matrix embodiment of it that only requires the numbers \(0,1,-1\) to construct!

Combining this special matrix with \(2 \times 2\) scalar matrices, we can even construct a realization of the field of complex numbers inside \(\matrixring_2(\R)\text{.}\) Every complex number \(z\) can be expressed as

\begin{equation*} z = a + b \ci \end{equation*}

for some real numbers \(a,b\text{.}\) If we replace \(a\) by the scalar matrix \(a I\) and \(\ci\) by the special matrix \(C\text{,}\) we can realize the complex number \(z\) as the \(2 \times 2\) matrix

\begin{equation*} Z = \left[\begin{array}{rr} a \amp -b \\ b \amp a \end{array}\right] \text{.} \end{equation*}
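
Indeed, this \(Z\) is nothing more than the combination \(a I + b C\text{:}\)

\begin{equation*} a \begin{bmatrix} 1 \amp 0 \\ 0 \amp 1 \end{bmatrix} + b \left[\begin{array}{rr} 0 \amp -1 \\ 1 \amp 0 \end{array}\right] = \left[\begin{array}{rr} a \amp -b \\ b \amp a \end{array}\right] \text{.} \end{equation*}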

We leave it as an exercise for you, the reader, to verify that adding, multiplying, and inverting matrices of the form \(Z\) produce the same results as adding, multiplying, and inverting the corresponding complex numbers. You might also get a surprise if you compute \(\det Z\text{!}\)

Companion matrices.

A slightly less precise way to phrase the Cayley-Hamilton Theorem would be to say that, given any square matrix, we can construct a polynomial that has that matrix as a root. Turning this statement around, we could also ask: given any polynomial, can we construct a matrix root of that polynomial? Getting more precise again, we ask the following question.

Question 31.3.1.

Given a monic, degree \(n\) polynomial, can we construct an \(n \times n\) matrix that has that polynomial as its characteristic polynomial?

We leave it as an exercise for you, the reader, to verify that the matrix

\begin{equation*} \begin{bmatrix} 0 \amp 0 \amp 0 \amp \cdots \amp 0 \amp -a_0 \\ 1 \amp 0 \amp 0 \amp \cdots \amp 0 \amp -a_1 \\ 0 \amp 1 \amp 0 \amp \cdots \amp 0 \amp -a_2 \\ \vdots \amp \vdots \amp \vdots \amp \amp \vdots \amp \vdots \\ 0 \amp 0 \amp 0 \amp \cdots \amp 1 \amp -a_{n-1} \end{bmatrix} \end{equation*}

answers Question 31.3.1 in the affirmative. The matrix above is called the companion matrix of the monic polynomial

\begin{equation*} p(x) = x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dotsb + a_1 x + a_0 \text{.} \end{equation*}
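
For example, taking the illustrative cubic \(p(x) = x^3 - 2 x^2 + 3 x - 4\) (so that \(a_2 = -2\text{,}\) \(a_1 = 3\text{,}\) \(a_0 = -4\)), the companion matrix is

\begin{equation*} \left[\begin{array}{rrr} 0 \amp 0 \amp 4 \\ 1 \amp 0 \amp -3 \\ 0 \amp 1 \amp 2 \end{array}\right] \text{,} \end{equation*}

and a direct computation of its characteristic polynomial returns \(\lambda^3 - 2 \lambda^2 + 3 \lambda - 4\text{,}\) as desired.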

Subsection 31.3.4 Connection to determinants and traces of matrices

Finally, in Discovery 31.6 we encountered some surprising connections between properties of a matrix, its eigenvalues, and its characteristic polynomial by looking more closely at triangular block form. We will leave it to Subsection 31.5.3 to restate and prove these properties.