
Section 22.4 Concepts

Subsection 22.4.1 The transition matrix and the diagonal form

The columns of the transition matrix.

In Discovery 22.1, we transformed the equation \(\inv{P}AP = D\) into the equivalent equation \(AP = PD\text{.}\) Thinking of \(P\) as being made up of column vectors, multiplying \(P\) on the left by \(A\) multiplies each column of \(P\) by \(A\text{,}\) and multiplying \(P\) on the right by \(D\) multiplies each column of \(P\) by the corresponding diagonal entry. So if we view \(P\) and \(D\) as having forms
\begin{align*} P \amp= \begin{bmatrix} | \amp | \amp \amp | \\ \uvec{p}_1 \amp \uvec{p}_2 \amp \cdots \amp \uvec{p}_n \\ | \amp | \amp \amp | \end{bmatrix}, \amp D \amp= \begin{bmatrix} \lambda_1 \\ \amp \lambda_2 \\ \amp\amp \ddots \\ \amp\amp\amp \lambda_n \end{bmatrix}, \end{align*}
then we can view \(AP\) and \(PD\) as having forms
\begin{align*} AP \amp= \begin{bmatrix} | \amp | \amp \amp | \\ A\uvec{p}_1 \amp A\uvec{p}_2 \amp \cdots \amp A\uvec{p}_n \\ | \amp | \amp \amp | \end{bmatrix}, \amp PD \amp= \begin{bmatrix} | \amp | \amp \amp | \\ \lambda_1\uvec{p}_1 \amp \lambda_2\uvec{p}_2 \amp \cdots \amp \lambda_n\uvec{p}_n \\ | \amp | \amp \amp | \end{bmatrix}. \end{align*}
The only way these two matrices can be equal is if they have equal columns, so that
\begin{align*} A\uvec{p}_1 \amp= \lambda_1\uvec{p}_1, \amp A\uvec{p}_2 \amp= \lambda_2\uvec{p}_2, \amp \amp\dotsc, \amp A\uvec{p}_n \amp= \lambda_n\uvec{p}_n. \end{align*}
These column vector equalities exhibit the eigenvector-eigenvalue pattern. That is, the only way to make \(\inv{P}AP\) diagonal is to use eigenvectors of \(A\) as the columns of the transition matrix \(P\).
Moreover, \(P\) needs to be invertible, so the columns of \(P\) need to be linearly independent (Theorem 20.5.5).
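As a concrete check of this pattern, here is a minimal numerical sketch in Python using NumPy. (The code is an illustration only, not part of this text's development; the matrix below is an arbitrary choice of ours. Conveniently, np.linalg.eig returns eigenvectors as the columns of a matrix, which is exactly the transition matrix \(P\) described above.)

import numpy as np

# An arbitrary diagonalizable example matrix with real eigenvalues.
A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are
# corresponding eigenvectors -- that matrix plays the role of P.
eigvals, P = np.linalg.eig(A)

# Conjugating A by P produces a diagonal matrix whose diagonal entries
# are the eigenvalues, in the same order as the columns of P.
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 12))

# Column-by-column verification of the pattern A p_j = lambda_j p_j.
for j in range(A.shape[1]):
    assert np.allclose(A @ P[:, j], eigvals[j] * P[:, j])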

The diagonal form matrix \(\inv{P}AP\).

In Discovery 22.3, we analyzed the pattern of the diagonal matrix \(D = \inv{P}AP\text{.}\) If \(\lambda_j\) is its \(\nth[j]\) diagonal entry, the condition \(A\uvec{p}_j = \lambda_j\uvec{p}_j\) from our analysis above says that \(\lambda_j\) is an eigenvalue for \(A\text{,}\) and the \(\nth[j]\) column of \(P\) is a corresponding eigenvector. So
  • \(D\) will have the eigenvalues of \(A\) for its diagonal entries,
  • the number of times an eigenvalue of \(A\) is repeated as a diagonal entry in \(D\) will correspond to the number of linearly independent eigenvectors for that eigenvalue that were used in the columns of \(P\text{,}\) and
  • the order of the entries down the diagonal of \(D\) corresponds to the order of eigenvectors in the columns of \(P\text{,}\) as the computational sketch following this list illustrates.
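Continuing the sketch above (again assuming NumPy, with an example matrix of our own choosing), the following illustrates the last two points: a repeated eigenvalue appears on the diagonal of \(D\) once for each independent eigenvector used in \(P\text{,}\) and permuting the columns of \(P\) permutes the diagonal of \(D\) in the same way.

import numpy as np

# This matrix has eigenvalue 5 (once) and eigenvalue 2 (twice), and the
# eigenspace for eigenvalue 2 is two-dimensional, so a full set of three
# linearly independent eigenvectors exists.
A = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 3.0]])

eigvals, P = np.linalg.eig(A)

# The eigenvalue 2 shows up twice on the diagonal, matching the two
# independent eigenvectors for it among the columns of P.
print(np.round(np.diag(np.linalg.inv(P) @ A @ P), 12))

# Reordering the columns of P reorders the diagonal of D identically.
perm = [2, 0, 1]
P2 = P[:, perm]
D2 = np.linalg.inv(P2) @ A @ P2
print(np.round(np.diag(D2), 12))   # matches eigvals[perm], up to round-off
print(eigvals[perm])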

Subsection 22.4.2 Diagonalizable matrices

Is every \(n\times n\) matrix similar to a diagonal one? In Discovery 22.4, we discovered that the answer is no. For some matrices, it will not be possible to collect together enough linearly independent eigenvectors to fill all \(n\) columns of the transition matrix \(P\text{.}\) The largest number of linearly independent eigenvectors we can obtain for a particular eigenvalue is the dimension of the corresponding eigenspace. In Discovery 22.6, we discovered that eigenvectors from different eigenspaces of the same matrix are automatically linearly independent. So the limiting factor is the dimension of each eigenspace, and whether these dimensions add up to \(n\text{,}\) the required number of linearly independent columns in \(P\text{.}\)
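Here is a small numerical illustration of such a failure (a sketch assuming NumPy; the shear matrix below is a standard example, not one drawn from this text). Its single eigenvalue has a one-dimensional eigenspace, so two linearly independent eigenvectors cannot be found and no invertible \(P\) can be built.

import numpy as np

# A shear matrix: its only eigenvalue is 1, with algebraic multiplicity 2,
# but its eigenspace is only one-dimensional.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigvals, P = np.linalg.eig(A)
print(eigvals)   # [1. 1.]

# Both computed eigenvector columns lie along the same line, so P is
# (numerically) singular: it cannot serve as a transition matrix.
print(np.linalg.matrix_rank(P))   # 1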
An eigenvalue of an \(n \times n\) matrix \(A\) has two important numbers attached to it — its algebraic multiplicity and its geometric multiplicity.
If the roots of the characteristic polynomial are all real numbers, then the characteristic polynomial will factor completely as
\begin{equation*} c_A(\lambda) = (\lambda-\lambda_1)^{m_1}(\lambda-\lambda_2)^{m_2}\dotsm(\lambda-\lambda_\ell)^{m_\ell}, \end{equation*}
where the \(\lambda_j\) are the distinct eigenvalues of \(A\) and the \(m_j\) are the corresponding algebraic multiplicities. Since \(c_A(\lambda)\) is always a degree \(n\) polynomial, the algebraic multiplicities add up to \(n\text{.}\) To obtain enough linearly independent eigenvectors for \(A\) to fill the columns of \(P\text{,}\) we also need the geometric multiplicities to add up to \(n\text{.}\) We will learn in Subsection 22.6.3 that the algebraic multiplicity of each eigenvalue is the “best-case scenario”: the geometric multiplicity of an eigenvalue can be no greater than its algebraic multiplicity. Thus, if any eigenvalue of \(A\) is “defective,” in the sense that its geometric multiplicity is strictly less than its algebraic multiplicity, we will not obtain enough linearly independent eigenvectors for that eigenvalue to fill its “portion” of the required columns of \(P\text{.}\) To summarize, a square matrix is diagonalizable precisely when each of its eigenvalues has geometric multiplicity equal to its algebraic multiplicity.
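To see the two multiplicities side by side, here is a short symbolic sketch using SymPy (again an illustration under our own choice of example matrices, not part of the text). SymPy's eigenvects() reports each eigenvalue with its algebraic multiplicity and a basis for its eigenspace, and the size of that basis is the geometric multiplicity.

import sympy as sp

# The shear matrix is defective; the second matrix has a repeated
# eigenvalue whose eigenspace has the full dimension.
shear = sp.Matrix([[1, 1],
                   [0, 1]])
nice = sp.Matrix([[3, 1, 1],
                  [1, 3, 1],
                  [1, 1, 3]])

for M in (shear, nice):
    for lam, alg_mult, basis in M.eigenvects():
        # len(basis) is the geometric multiplicity of lam.
        print(lam, alg_mult, len(basis))
    print(M.is_diagonalizable())

# shear: eigenvalue 1 has algebraic mult. 2 but geometric mult. 1,
#        so it is not diagonalizable.
# nice:  every eigenvalue has equal multiplicities, so it is diagonalizable.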

Subsection 22.4.3 Diagonalization procedure