Section 30.5 Theory
Subsection 30.5.1 Similarity to triangular block form
Every similarity class contains a triangular block matrix, at least if we are willing to use complex scalars, vectors, and matrices. It is possible to prove this fact without the aid of generalized eigenvectors, and then from it deduce the required properties of generalized eigenspaces to make Procedure 30.4.2 work.
Theorem 30.5.1. Triangular block form of a matrix.
If the characteristic polynomial of square matrix \(A\) factors completely as
\begin{equation*} c_A(\lambda) = (\lambda-\lambda_1)^{m_1} (\lambda-\lambda_2)^{m_2} \dotsm (\lambda-\lambda_\ell)^{m_\ell}, \end{equation*}
then there exists an invertible matrix \(P\) such that \(\inv{P} A P\) has the block-diagonal form
\begin{equation*} \inv{P} A P = \begin{bmatrix} U_1 \\ & U_2 \\ & & \ddots \\ & & & U_\ell \end{bmatrix}, \end{equation*}
where each \(U_j\) is an \(m_j \times m_j\) matrix in scalar-triangular form that has scalar \(\lambda_j\) down the diagonal.
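For example (an illustration supplied here, not part of the theorem itself), any \(3 \times 3\) matrix \(A\) with characteristic polynomial \(c_A(\lambda) = (\lambda - 2)^2 (\lambda + 1)\) is similar to a triangular block matrix of the form
\begin{equation*} \inv{P} A P = \begin{bmatrix} 2 & \ast & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{bmatrix}, \end{equation*}
with one \(2 \times 2\) scalar-triangular block for the eigenvalue \(2\) (of algebraic multiplicity \(2\)) and one \(1 \times 1\) block for the eigenvalue \(-1\text{,}\) where the entry \(\ast\) may or may not be zero.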
Proof idea.
The proof is by induction on \(n\text{,}\) the size of \(A\text{.}\) We will assume we are working with a real matrix and real scalars. But there is no detail of the proof that depends on this assumption, and so the proof is valid for complex matrices and scalars as well.
Base case.
Every \(1 \times 1\) matrix is trivially in triangular-block form.
Induction step.
First, we make the induction hypothesis: let \(k\) represent a fixed positive integer, and assume that every \(k \times k\) matrix whose characteristic polynomial factors completely as in the statement of the theorem is similar to a matrix in triangular-block form.
We must now use the induction hypothesis to prove that every matrix \(A\) of size \((k+1) \times (k+1)\) whose characteristic polynomial factors completely as in the statement of the theorem is similar to a matrix in triangular-block form.
Suppose \(\uvec{x}\) is an eigenvector of \(A\) corresponding to \(\lambda_1\text{.}\) Choose a basis of \(\R^{k+1}\) that contains \(\uvec{x}\) as its first element, and let \(Q\) be the invertible \((k+1) \times (k+1)\) matrix made up of these basis vectors as columns. Then \(\inv{Q} A Q\) has block form
\begin{equation*} \inv{Q} A Q = \begin{bmatrix} \lambda_1 & B \\ \zerovec & A' \end{bmatrix}, \end{equation*}
where \(A'\) is \(k \times k\text{,}\) and \(B\) is \(1 \times k\text{.}\)
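To see why the first column of \(\inv{Q} A Q\) has this form, recall that the first column of \(Q\) is \(\uvec{x}\text{,}\) so that \(Q \uvec{e}_1 = \uvec{x}\text{,}\) and compute
\begin{equation*} \inv{Q} A Q \uvec{e}_1 = \inv{Q} A \uvec{x} = \inv{Q} (\lambda_1 \uvec{x}) = \lambda_1 \inv{Q} \uvec{x} = \lambda_1 \uvec{e}_1 . \end{equation*}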
Similar to Statement 6 of Proposition 28.6.1, the characteristic polynomial of \(\inv{Q} A Q\) must be
\begin{equation*} c_{\inv{Q} A Q}(\lambda) = (\lambda - \lambda_1) c_{A'}(\lambda) . \end{equation*}
But similar matrices have the same characteristic polynomial (Theorem 26.5.8), and so from
\begin{equation*} (\lambda-\lambda_1)^{m_1} (\lambda-\lambda_2)^{m_2} \dotsm (\lambda-\lambda_\ell)^{m_\ell} = c_A(\lambda) = c_{\inv{Q} A Q}(\lambda) = (\lambda - \lambda_1) c_{A'}(\lambda) \end{equation*}
we may conclude that
\begin{equation*} c_{A'}(\lambda) = (\lambda-\lambda_1)^{m_1 - 1} (\lambda-\lambda_2)^{m_2} \dotsm (\lambda-\lambda_\ell)^{m_\ell} . \end{equation*}
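The expression for \(c_{\inv{Q} A Q}(\lambda)\) can also be verified directly: expanding the determinant along the first column gives
\begin{equation*} c_{\inv{Q} A Q}(\lambda) = \det \begin{bmatrix} \lambda - \lambda_1 & -B \\ \zerovec & \lambda I - A' \end{bmatrix} = (\lambda - \lambda_1) \det (\lambda I - A') = (\lambda - \lambda_1) c_{A'}(\lambda) . \end{equation*}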
As the characteristic polynomial \(c_{A'}\) also factors completely, we may apply the induction hypothesis to obtain an invertible \(k \times k\) matrix \(Q'\) such that \(\inv{(Q')} A' Q'\) is in triangular-block form
\begin{equation*} \inv{(Q')} A' Q' = \begin{bmatrix} U'_1 \\ & U'_2 \\ & & \ddots \\ & & & U'_\ell \end{bmatrix}, \end{equation*}
where each block \(U'_j\) has size \(m_j \times m_j\text{,}\) except block \(U'_1\text{,}\) which has size \((m_1-1)\times(m_1-1)\text{.}\) Thus, if we set \(P\) to be the invertible \((k+1) \times (k+1)\) matrix
\begin{equation*} P = Q \begin{bmatrix} 1 & \zerovec \\ \zerovec & Q' \end{bmatrix}, \end{equation*}
we get
\begin{align*} \inv{P} A P &= \begin{bmatrix} 1 & \zerovec \\ \zerovec & \inv{(Q')} \end{bmatrix} \inv{Q} A Q \begin{bmatrix} 1 & \zerovec \\ \zerovec & Q' \end{bmatrix} = \begin{bmatrix} 1 & \zerovec \\ \zerovec & \inv{(Q')} \end{bmatrix} \begin{bmatrix} \lambda_1 & B \\ \zerovec & A' \end{bmatrix} \begin{bmatrix} 1 & \zerovec \\ \zerovec & Q' \end{bmatrix} \\ &= \begin{bmatrix} \lambda_1 & B Q' \\ \zerovec & \inv{(Q')} A' Q' \end{bmatrix} = \begin{bmatrix} \lambda_1 & C_1 & C_2 \\ \zerovec & U'_1 & \\ & & U'_2 \\ & & & \ddots \\ & & & & U'_\ell \end{bmatrix} . \end{align*}
In the last matrix above, we have broken the row matrix \(B Q'\) up into row matrices \(C_1\) and \(C_2\text{,}\) where \(C_1\) has size \(1 \times (m_1-1)\text{.}\) This is almost in triangular-block form — the remaining difficulty is that \(C_2\) might have nonzero entries. The remainder of the proof revolves around showing that one can choose the matrix \(Q'\) so that \(C_2 = \zerovec\text{;}\) see [5] for the details.
Finally, note that the blocks of \(\inv{P} A P\) have sizes \(m_1,m_2,\dotsc,m_\ell\text{,}\) respectively.
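Indeed, once \(C_2 = \zerovec\text{,}\) the upper-left corner assembles into a single scalar-triangular block
\begin{equation*} U_1 = \begin{bmatrix} \lambda_1 & C_1 \\ \zerovec & U'_1 \end{bmatrix} \end{equation*}
of size \(1 + (m_1 - 1) = m_1\text{,}\) while the remaining blocks \(U'_2, \dotsc, U'_\ell\) already have sizes \(m_2, \dotsc, m_\ell\text{.}\)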
Remark 30.5.2.
- If our field of scalars is \(\C\text{,}\) then The Fundamental Theorem of Algebra (Complex Version) tells us that the characteristic polynomial of every matrix factors completely. However, if our field of scalars is \(\R\text{,}\) then the characteristic polynomials of some matrices will contain irreducible quadratic factors.
- Notice that in triangular-block form, to each distinct eigenvalue of \(A\) there corresponds precisely one scalar-triangular block, with that eigenvalue down the diagonal, of size equal to the algebraic multiplicity of the eigenvalue.
In the case of a matrix with a single repeated eigenvalue, as in Chapter 29, we can be more specific.
Corollary 30.5.3. Triangular block form in case of a single eigenvalue.
Every matrix with a single repeated eigenvalue is similar (over \(\C\)) to a scalar-triangular matrix with the eigenvalue repeated down the main diagonal.
Proof.
In this case, the characteristic polynomial must factor as
\begin{equation*} c_A(\lambda) = (\lambda - \lambda_0)^n, \end{equation*}
where \(\lambda_0\) is the single repeated eigenvalue, and so Theorem 30.5.1 tells us that the matrix is similar to a triangular block matrix with a single scalar-triangular block.
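For instance (a small example supplied here for illustration), the matrix
\begin{equation*} A = \begin{bmatrix} 3 & 1 \\ -1 & 1 \end{bmatrix} \end{equation*}
has \(c_A(\lambda) = (\lambda - 2)^2\text{,}\) with single repeated eigenvalue \(\lambda_0 = 2\text{.}\) Taking the eigenvector \((1,-1)\) as the first column of a transition matrix, we obtain
\begin{equation*} P = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \qquad \inv{P} A P = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \end{equation*}
which is scalar-triangular with the eigenvalue repeated down the main diagonal.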
Subsection 30.5.2 Properties of generalized eigenvectors
We previously stated some properties of generalized eigenspaces as Theorem 29.6.1, and we are now in a position to prove Statement 4 of that theorem. As well, we know from Chapter 29 that it is generalized eigenspaces that produce matrices in scalar-triangular form. All we need further is to establish that the generalized eigenspaces of a matrix satisfy the requirements of the block-diagonalization procedure: invariance and independence.
Lemma 30.5.4. Invariance of generalized eigensubspaces.
Each generalized eigensubspace of a square matrix is invariant under that matrix.
Proof.
Suppose \(A\) is a square matrix, \(\lambda\) is an eigenvalue of \(A\text{,}\) and \(k\) is a positive integer. We would like to verify that the generalized eigensubspace \(E_\lambda^k(A)\) is \(A\)-invariant.
First, note that the matrices \((\lambda I - A)\) and \(A\) commute, since
\begin{equation*} (\lambda I - A) A = \lambda A - A^2 = A (\lambda I - A) . \end{equation*}
From this, we can conclude that \(A\) commutes with \((\lambda I - A)^k\text{,}\) since
\begin{equation*} (\lambda I - A)^k A = (\lambda I - A)^{k-1} A (\lambda I - A) = \dotsb = A (\lambda I - A)^k . \end{equation*}
Now, to show that \(E_\lambda^k(A)\) is \(A\)-invariant, we must show that, given vector \(\uvec{u}\) in \(E_\lambda^k(A)\text{,}\) the vector \(A\uvec{u}\) is again in \(E_\lambda^k(A)\text{.}\) So assume that \(\uvec{u} \in E_\lambda^k(A)\text{.}\) By definition, this means that \((\lambda I - A)^k \uvec{u} = \zerovec\text{.}\) But then, using the fact that \(A\) commutes with \((\lambda I - A)^k\) proved above, we have
\begin{equation*} (\lambda I - A)^k (A \uvec{u}) = A (\lambda I - A)^k \uvec{u} = A \zerovec = \zerovec . \end{equation*}
Since \((\lambda I - A)^k (A \uvec{u}) = \zerovec\text{,}\) we have that \(A\uvec{u}\) is in \(E_\lambda^k(A)\) as well, as desired.
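A quick concrete check (our illustration, not part of the proof): for
\begin{equation*} A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix} \end{equation*}
with \(\lambda = 2\) and \(k = 2\text{,}\) we compute
\begin{equation*} (2 I - A)^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \end{equation*}
so that \(E_2^2(A) = \operatorname{span} \{ \uvec{e}_1, \uvec{e}_2 \}\text{.}\) And indeed \(A \uvec{e}_2 = \uvec{e}_1 + 2 \uvec{e}_2\) lies in \(E_2^2(A)\) again.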
Theorem 30.5.5. More properties of generalized eigenspaces.
Suppose \(A\) is an \(n \times n\) matrix.
- If \(\lambda\) is an eigenvalue of \(A\text{,}\) then the generalized eigenspace \(G_\lambda(A)\) is an \(A\)-invariant subspace of \(\R^n\) (or \(\C^n\text{,}\) as appropriate).
- Suppose the characteristic polynomial of \(A\) factors completely as\begin{equation*} c_A(\lambda) = (\lambda-\lambda_1)^{m_1} (\lambda-\lambda_2)^{m_2} \dotsm (\lambda-\lambda_\ell)^{m_\ell}. \end{equation*}Then each generalized eigenspace \(G_{\lambda_j}(A)\) has dimension \(m_j\text{.}\) Furthermore, if we have a basis for each generalized eigenspace of \(A\text{,}\) taking these bases all together gives a basis for \(\R^n\) (or \(\C^n\text{,}\) as appropriate). That is, the generalized eigenspaces of \(A\) are a complete set of independent subspaces.
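As a quick illustration (ours, echoing the example following Theorem 30.5.1): for a \(3 \times 3\) matrix \(A\) with \(c_A(\lambda) = (\lambda - 2)^2 (\lambda + 1)\text{,}\) Statement 2 says that
\begin{equation*} \dim G_2(A) = 2, \qquad \dim G_{-1}(A) = 1, \end{equation*}
and that a basis for \(G_2(A)\) combined with a basis for \(G_{-1}(A)\) yields a basis for \(\R^3\text{.}\)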
Proof of Statement 1.
We already know that \(G_\lambda(A)\) is a subspace of \(\R^n\) or \(\C^n\text{,}\) as appropriate (Statement 1 of Theorem 29.6.1). So we only need to verify that it is \(A\)-invariant.
If \(\uvec{x}\) is a vector in the generalized eigenspace \(G_\lambda(A)\text{,}\) then by definition it is in a generalized eigensubspace \(E_\lambda^k(A)\) for some \(k\text{.}\) But we have already shown that each \(E_\lambda^k(A)\) is \(A\)-invariant (Lemma 30.5.4), so \(A\uvec{x}\) must again lie in \(E_\lambda^k(A)\text{,}\) which puts it back in \(G_\lambda(A)\text{.}\)
Outline of proof of Statement 2.
We give a proof for \(A\) a real matrix, and assuming real scalars. The proof for the complex context is identical.
By Theorem 30.5.1, there exists an invertible matrix \(P\) such that \(\inv{P} A P\) is in triangular-block form. From the block form of \(\inv{P} A P\text{,}\) confirm that
\begin{align*} G_{\lambda_1}(\inv{P} A P) &= \operatorname{span} \{ \uvec{e}_1, \dotsc, \uvec{e}_{m_1} \}, \\ G_{\lambda_2}(\inv{P} A P) &= \operatorname{span} \{ \uvec{e}_{m_1+1}, \dotsc, \uvec{e}_{m_1+m_2} \}, \\ & \;\; \vdots \\ G_{\lambda_\ell}(\inv{P} A P) &= \operatorname{span} \{ \uvec{e}_{n-m_\ell+1}, \dotsc, \uvec{e}_n \}, \end{align*}
where each spanning set above is obviously linearly independent.
As usual, the transition matrix \(P\) transfers properties of \(\inv{P} A P\) to the equivalent properties of \(A\text{.}\) In particular, it can be shown that the above spanning sets for the generalized eigenspaces of \(\inv{P} A P\) can be transferred to spanning sets for the generalized eigenspaces of \(A\text{:}\)
\begin{align*} G_{\lambda_1}(A) &= \operatorname{span} \{ P \uvec{e}_1, \dotsc, P \uvec{e}_{m_1} \}, \\ & \;\; \vdots \\ G_{\lambda_\ell}(A) &= \operatorname{span} \{ P \uvec{e}_{n-m_\ell+1}, \dotsc, P \uvec{e}_n \} . \end{align*}
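To see why the transfer works, note that \(\inv{P} (\lambda I - A) P = \lambda I - \inv{P} A P\text{,}\) and hence
\begin{equation*} (\lambda I - A)^m P = P \, (\lambda I - \inv{P} A P)^m \end{equation*}
for every positive integer \(m\text{,}\) so that \((\lambda I - \inv{P} A P)^m \uvec{e} = \zerovec\) exactly when \((\lambda I - A)^m (P \uvec{e}) = \zerovec\text{.}\)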
Since \(P\) is invertible, each of these transformed spanning sets is also linearly independent (Statement 2 of Proposition 21.5.1). Thus, \(\dim G_{\lambda_j}(A) = m_j\text{.}\) Similarly, since the standard basis of \(\R^n\) is linearly independent, and \(P\) is invertible, taking all these spanning sets together forms a basis of \(\R^n\text{.}\)