Section 22.6 Theory
Subsection 22.6.1 Similar matrices
First, we’ll record just a few of the facts about general similar matrices from Section 22.3.
Proposition 22.6.1. Properties of similar matrices.
- Similar matrices have the same determinant.
- Similar matrices have the same characteristic polynomial.
- Similar matrices have the same eigenvalues, with the same algebraic multiplicities.
Proof of Statement 1.
Suppose square matrices $A$ and $B$ are similar, and $P$ is a transition matrix that realizes the similarity, so that
\[
B = P^{-1} A P.
\]
We know from Proposition 10.5.6 that the determinant of a product is the product of the determinants. And we also know from Proposition 10.5.8 that the determinant of an inverse is the inverse of the determinant. So we can compute $\det B$ as
\[
\det B = \det(P^{-1} A P) = \det(P^{-1}) \det A \det P = (\det P)^{-1} \det A \det P = \det A.
\]
Thus, the similar matrices $A$ and $B$ have the same determinant.
Careful: In this proof, it would have been incorrect to cancel the $P^{-1}$ with the $P$ immediately, because order of matrix multiplication matters! It was only after we split the determinant into a product of determinants that we could cancel $\det(P^{-1})$ with $\det P$, because all three of the determinants are numbers, and order of number multiplication does not matter.
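For readers who like to experiment, here is a minimal numerical check of Statement 1 (not part of the original text; it assumes NumPy is available, and the matrices $A$ and $P$ are arbitrary choices):

```python
import numpy as np

# Arbitrary example matrices: A is any square matrix, P any invertible matrix.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

B = np.linalg.inv(P) @ A @ P   # B is similar to A

print(np.linalg.det(A))        # -2.0
print(np.linalg.det(B))        # also -2.0, up to floating-point error
```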
Proof of Statement 2.
Suppose square matrices $A$ and $B$ are similar, and $P$ is a transition matrix that realizes the similarity, so that
\[
B = P^{-1} A P.
\]
The characteristic polynomials of these two matrices are computed as
\[
c_A(\lambda) = \det(\lambda I - A), \qquad c_B(\lambda) = \det(\lambda I - B).
\]
Using our assumption $B = P^{-1} A P$ along with the identity $\lambda I = P^{-1} (\lambda I) P$, we can express the matrix involved in the characteristic polynomial for $B$ as
\[
\lambda I - B = P^{-1} (\lambda I) P - P^{-1} A P = P^{-1} (\lambda I - A) P,
\]
where in the last step we have factored the common $P^{-1}$ and $P$ factors out of the difference (making sure to factor each to the correct side, because order of matrix multiplication matters). We have now shown that the matrices $\lambda I - A$ and $\lambda I - B$ are also similar, via the same transition matrix $P$, and so by Statement 1 they have the same determinant. That is,
\[
\det(\lambda I - B) = \det(\lambda I - A),
\]
and thus the similar matrices $A$ and $B$ have the same characteristic polynomial.
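As a quick numerical illustration of Statement 2 (again an aside, assuming NumPy and using arbitrary example matrices), NumPy's `np.poly` returns the coefficients of the characteristic polynomial of a square matrix, so similar matrices should produce the same coefficients:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
P = np.array([[2.0, 1.0],
              [1.0, 1.0]])     # invertible, det P = 1

B = np.linalg.inv(P) @ A @ P

# np.poly of a square matrix returns the coefficients of its (monic)
# characteristic polynomial, highest degree first.
print(np.poly(A))   # [ 1. -5. -2.], i.e. lambda^2 - 5 lambda - 2
print(np.poly(B))   # the same coefficients, up to floating-point error
```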
Proof of Statement 3.
Statement 3 follows immediately from Statement 2, as the eigenvalues of a matrix are precisely the roots of the characteristic polynomial of the matrix, and the algebraic multiplicity of an eigenvalue is the number of times that value is repeated as a root of the characteristic polynomial.
Subsection 22.6.2 Diagonalizable matrices
We start with the justification that a transition matrix made up of linearly independent eigenvectors will diagonalize a matrix.
Theorem 22.6.2. Characterization of diagonalizability.
An $n \times n$ matrix $A$ is diagonalizable if and only if there exists a set of $n$ linearly independent vectors in $\mathbb{R}^n$, each of which is an eigenvector of $A$. If $P$ is an $n \times n$ matrix whose columns are $n$ linearly independent eigenvectors of $A$, then $P$ diagonalizes $A$.
Proof.
This fact follows from our analysis of the transition matrix and the diagonal form matrix in Subsection 22.4.1.
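The following short NumPy sketch (an aside, not from the original text; the matrix $A$ is an arbitrary choice) shows the theorem in action: the eigenvector matrix returned by `np.linalg.eig` plays the role of the transition matrix $P$.

```python
import numpy as np

# An arbitrary 2x2 example with two different eigenvalues (5 and 2).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, P = np.linalg.eig(A)

# Since the columns of P are linearly independent eigenvectors of A,
# P should diagonalize A: P^{-1} A P is (numerically) diagonal.
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))     # approximately diag(5, 2), in some order
print(eigenvalues)
```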
We will refine this theorem using our more sophisticated notions of algebraic and geometric multiplicity in the next subsection. But first, here is a surprising result that demonstrates how central eigenvalues are in matrix theory.
Proposition 22.6.3. Determinant versus eigenvalues.
If a square matrix is diagonalizable, then its determinant is equal to the product of its eigenvalues (including multiplicities).
Proof.
Suppose $A$ is a diagonalizable matrix. Then it is similar to some diagonal matrix $D$. The eigenvalues of a diagonal matrix are precisely the diagonal entries, and the algebraic multiplicity of each of these eigenvalues is the number of times that eigenvalue is repeated down the diagonal. So if $\lambda_1, \lambda_2, \dotsc, \lambda_m$ are all of the distinct eigenvalues of $D$ (i.e. there are no repeats in this list of eigenvalues), and $a_1, a_2, \dotsc, a_m$ are the corresponding algebraic multiplicities of these eigenvalues (i.e. each $a_j$ is equal to the number of times $\lambda_j$ appears on the main diagonal of $D$), then
\[
\det D = \lambda_1^{a_1} \lambda_2^{a_2} \dotsm \lambda_m^{a_m},
\]
because the determinant of a diagonal matrix is just the product of its diagonal entries (Statement 1 of Proposition 8.5.2). But from Statements 1 and 3 of Proposition 22.6.1 we know that the similar matrices $A$ and $D$ have the same determinant, and have all the same eigenvalues with the same corresponding algebraic multiplicities. Thus, the expression
\[
\lambda_1^{a_1} \lambda_2^{a_2} \dotsm \lambda_m^{a_m}
\]
can be viewed as an expression for $\det A$ as a product of the eigenvalues of $A$, including multiplicities.
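Here is a small numerical check of this proposition (an aside, assuming NumPy; the triangular matrix below is an arbitrary example whose eigenvalues can be read off the diagonal):

```python
import numpy as np

# A triangular example matrix, so its eigenvalues (2, 3, 4) can be read off
# the diagonal; having three different eigenvalues, it is diagonalizable.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 4.0]])

eigenvalues = np.linalg.eigvals(A)
print(np.linalg.det(A))       # 24.0
print(np.prod(eigenvalues))   # also 24.0, up to floating-point error
```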
Remark 22.6.4.
The above fact is actually true about all square matrices, if you allow complex eigenvalues. In a second linear algebra course, you may learn that diagonalizable matrices are a special case of a more general theory, in which every matrix can be triangularized. That is, every square matrix is similar to a special form of triangular matrix (either upper or lower), though for many matrices both the transition matrix and the triangular form matrix might need to contain complex numbers in their entries. In this more general theory, it is again the case that the diagonal entries of the triangular form matrix will be precisely the eigenvalues of the original matrix, with each eigenvalue repeated down the diagonal according to its algebraic multiplicity, so the proof provided for the fact above can be adapted to work in this slightly more general setting.
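One concrete instance of such a triangularization is the complex Schur decomposition. The sketch below (an aside, assuming NumPy and SciPy are available; the rotation matrix is an arbitrary example with no real eigenvalues) shows the eigenvalues appearing on the diagonal of the triangular form:

```python
import numpy as np
from scipy.linalg import schur

# A rotation matrix with no real eigenvalues; its eigenvalues are i and -i.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Complex Schur decomposition: A = Z T Z^*, with T upper triangular.
# (Here the similarity is realized by a unitary matrix Z, a special case
# of the general triangularization described above.)
T, Z = schur(A, output='complex')
print(np.round(np.diag(T), 10))            # approximately [ i, -i ], in some order
print(np.round(np.linalg.eigvals(A), 10))  # the same eigenvalues
```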
Subsection 22.6.3 The geometry of eigenvectors
We require that the columns of a transition matrix $P$ be linearly independent, so that $P$ is invertible. Basis vectors for a particular eigenspace are linearly independent by definition of basis. But when we lump basis vectors from different eigenspaces together, will they all remain linearly independent together? The next fact answers this question with a more general version of what we explored in Discovery 22.6.
Proposition 22.6.5. Eigenvectors from different eigenvalues are independent.
Suppose $A$ is an $n \times n$ matrix, and $S$ is a linearly independent set of vectors in $\mathbb{R}^n$, each of which is an eigenvector for $A$. Further suppose that $\mathbf{w}$ is another eigenvector for $A$ that is linearly independent from those vectors in $S$ that are from the same eigenspace as $\mathbf{w}$. Then the enlarged collection of eigenvectors consisting of all vectors in $S$ along with $\mathbf{w}$ is also linearly independent.
Proof.
Let’s write $S = \{\mathbf{u}_1, \dotsc, \mathbf{u}_k, \mathbf{v}_1, \dotsc, \mathbf{v}_\ell\}$, where the $\mathbf{u}_i$ are those eigenvectors in $S$ that are in the same eigenspace as $\mathbf{w}$, and the $\mathbf{v}_j$ are those that are not. Write $\lambda$ for the eigenvalue of $A$ corresponding to $\mathbf{w}$ (hence also to each $\mathbf{u}_i$), and write $\mu_j$ for the eigenvalue corresponding to $\mathbf{v}_j$. We have assumed that the full set $S$ is linearly independent, and therefore so are the subsets $\{\mathbf{u}_1, \dotsc, \mathbf{u}_k\}$ and $\{\mathbf{v}_1, \dotsc, \mathbf{v}_\ell\}$ (Statement 2 of Proposition 17.5.3). In addition, we have assumed that the set $\{\mathbf{u}_1, \dotsc, \mathbf{u}_k, \mathbf{w}\}$ remains linearly independent.
The strategy in this proof is essentially the same as explored in Discovery 22.6. To prove independence, we must prove that the assumption
\[
a_1 \mathbf{u}_1 + \dotsb + a_k \mathbf{u}_k + b_1 \mathbf{v}_1 + \dotsb + b_\ell \mathbf{v}_\ell + c \mathbf{w} = \mathbf{0} \tag{✶}
\]
leads to the conclusion that each of the scalars $a_1, \dotsc, a_k, b_1, \dotsc, b_\ell, c$ is $0$.
Since each of the vectors in the combination above is an eigenvector for $A$, if we multiply both sides of equation (✶) by the matrix $A$, we may substitute $A \mathbf{u}_i = \lambda \mathbf{u}_i$, $A \mathbf{v}_j = \mu_j \mathbf{v}_j$, and $A \mathbf{w} = \lambda \mathbf{w}$. Making these substitutions, we obtain
\[
a_1 \lambda \mathbf{u}_1 + \dotsb + a_k \lambda \mathbf{u}_k + b_1 \mu_1 \mathbf{v}_1 + \dotsb + b_\ell \mu_\ell \mathbf{v}_\ell + c \lambda \mathbf{w} = \mathbf{0}.
\]
Now also multiply both sides of equation (✶) by the scalar $\lambda$. Notice that the $\mathbf{u}_i$ and $\mathbf{w}$ terms of both the “$A$ times (✶)” equation and the “$\lambda$ times (✶)” equation are identical, so if we subtract these equations and collect like $\mathbf{v}_j$-terms, we obtain
\[
b_1 (\mu_1 - \lambda) \mathbf{v}_1 + \dotsb + b_\ell (\mu_\ell - \lambda) \mathbf{v}_\ell = \mathbf{0}.
\]
Since the collection of vectors $\{\mathbf{v}_1, \dotsc, \mathbf{v}_\ell\}$ is linearly independent, the scalar coefficient expressions in this new linear combination must all be zero. That is, each scalar expression
\[
b_j (\mu_j - \lambda)
\]
must be zero. However, none of the $\mathbf{v}_j$ is from the same eigenspace as $\mathbf{w}$, so each difference $\mu_j - \lambda$ is nonzero, which forces each of the scalars $b_j$ to be zero.
Substituting this new information into equation (✶), we have
\[
a_1 \mathbf{u}_1 + \dotsb + a_k \mathbf{u}_k + c \mathbf{w} = \mathbf{0}.
\]
But the collection $\{\mathbf{u}_1, \dotsc, \mathbf{u}_k, \mathbf{w}\}$ is assumed independent, so each of the scalars in the remaining combination on the left above is also zero.
We have now successfully shown that the only way equation (✶) can be true is if each of the scalars involved is $0$, as required.
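The following small numerical sketch (an aside, assuming NumPy; the diagonal matrix and the particular eigenvectors are arbitrary choices) illustrates the proposition by checking the rank of the matrix whose columns are the collected eigenvectors:

```python
import numpy as np

# A diagonal example matrix with eigenvalues 2 (twice) and 5.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])

u1 = np.array([1.0, 0.0, 0.0])   # eigenvector for eigenvalue 2
u2 = np.array([1.0, 1.0, 0.0])   # another eigenvector for 2, independent from u1
w  = np.array([0.0, 0.0, 1.0])   # eigenvector for the different eigenvalue 5

# The enlarged collection {u1, u2, w} should be linearly independent:
# the matrix with these columns has full rank.
S = np.column_stack([u1, u2, w])
print(np.linalg.matrix_rank(S))  # 3
```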
The proposition above is somewhat similar in effect to Proposition 17.5.6, in that it lets us build up a linearly independent set of eigenvectors one-by-one. But the above fact is a little stronger, in that when we look to add a new eigenvector to our collection, we only need to worry about it being linearly independent from the eigenvectors we already have from that eigenspace. This leads to the following corollary.
Corollary 22.6.6. Eigenspaces are independent.
Given a collection of bases for the different eigenspaces of a matrix, the collection of all these eigenspace basis vectors together will still be linearly independent.
Proof.
Let $A$ be a square matrix, and write $\lambda_1, \lambda_2, \dotsc, \lambda_m$ for its eigenvalues. Suppose we have a basis $\mathcal{B}_1$ for eigenspace $E_{\lambda_1}(A)$, a basis $\mathcal{B}_2$ for eigenspace $E_{\lambda_2}(A)$, and so on. Begin with $\mathcal{B}_1$, which is linearly independent because it is a basis for a subspace. Enlarge $\mathcal{B}_1$ with vectors from $\mathcal{B}_2$, one at a time. At each step we may apply Proposition 22.6.5, because each new vector from $\mathcal{B}_2$ is both
- from a different eigenspace than the vectors in $\mathcal{B}_1$, and
- linearly independent from the previous vectors from $\mathcal{B}_2$ already included in the new enlarged collection.
Proposition 22.6.5 tells us that at each step of enlarging our collection by one, the new, larger collection will remain linearly independent. Once we run out of vectors in $\mathcal{B}_2$, we begin enlarging our collection with vectors from $\mathcal{B}_3$, one at a time. Again, Proposition 22.6.5 applies at each enlargement step, so that each collection of eigenvectors along the way remains linearly independent. Carry this process through to the end, until finally all vectors from $\mathcal{B}_m$ are also included, and Proposition 22.6.5 will still apply at the last step to tell us that the complete set of basis eigenvectors is linearly independent.
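As a numerical illustration of the corollary (an aside, assuming NumPy and SciPy; the example matrix is an arbitrary choice with a repeated eigenvalue), we can compute eigenspace bases as null space bases and check that the combined collection is still independent:

```python
import numpy as np
from scipy.linalg import null_space

# An example matrix with eigenvalue 1 (algebraic multiplicity 2) and eigenvalue 4.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# Eigenspace bases, computed as null space bases of (lambda I - A).
B1 = null_space(1.0 * np.eye(3) - A)   # two basis vectors for eigenvalue 1
B4 = null_space(4.0 * np.eye(3) - A)   # one basis vector for eigenvalue 4

combined = np.hstack([B1, B4])
print(B1.shape[1], B4.shape[1])           # 2 1
print(np.linalg.matrix_rank(combined))    # 3: all together, still independent
```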
In the next subsection, we will use this corollary to refine our initial characterization of diagonalizability stated in Theorem 22.6.2. In the meantime, we will formally state the relationship between geometric and algebraic multiplicities that we discussed in Subsection 22.4.2.
Theorem 22.6.7. Geometric versus algebraic multiplicity.
The geometric multiplicity of an eigenvalue is always less than or equal to its algebraic multiplicity.
Proof.
We will not include the proof of this statement here — you may encounter it in further study of matrix forms, perhaps in a second course in linear algebra.
Remark 22.6.8.
As we’ve noted already, the geometric multiplicity of an eigenvalue is always at least one, since otherwise it wouldn’t have any corresponding nonzero eigenvectors!
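A standard example of the strict inequality in Theorem 22.6.7 is sketched below (an aside, assuming NumPy and SciPy; the matrix is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import null_space

# Eigenvalue 1 has algebraic multiplicity 2 here, since the characteristic
# polynomial is (lambda - 1)^2 ...
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# ... but its eigenspace is only one-dimensional, so the geometric
# multiplicity is 1, strictly less than the algebraic multiplicity.
E1 = null_space(1.0 * np.eye(2) - A)
print(E1.shape[1])   # 1
```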
Subsection 22.6.4 More about diagonalizable matrices
Corollary 22.6.6 tells us that when collecting eigenvectors to make up the transition matrix $P$, we only have to worry about linear independence inside eigenspaces; linear independence between eigenspaces is automatic. But linear independence inside an eigenspace is taken care of for us when we row reduce $\lambda I - A$ to determine a basis for that eigenspace. So our initial characterization of diagonalizability in Theorem 22.6.2 can be refined so that we don’t actually have to worry about linear independence of eigenvectors at all; we just have to worry about having enough eigenspace basis vectors. It turns out that the algebraic multiplicity of each eigenvalue is exactly the necessary number of basis vectors for the corresponding eigenspace, and the next statements record this thinking.
Corollary 22.6.9. More characterizations of diagonalizability.
- A square matrix with all real eigenvalues is diagonalizable if and only if each eigenvalue has geometric multiplicity equal to its algebraic multiplicity.
- An $n \times n$ matrix that has $n$ different real eigenvalues must be diagonalizable.
Proof of Statement 1.
We need $n$ linearly independent eigenvectors to make up the columns of the transition matrix $P$. The maximum number of linearly independent eigenvectors we can get from a single eigenspace is the geometric multiplicity of the corresponding eigenvalue. So the maximum number of linearly independent eigenvectors we can get in total is the sum of the geometric multiplicities of the eigenvalues. But the characteristic polynomial $c_A(\lambda)$ has degree $n$, and $n$ is the sum of the algebraic multiplicities of the eigenvalues, because if $A$ has all real eigenvalues, then $c_A(\lambda)$ factors as
\[
c_A(\lambda) = (\lambda - \lambda_1)^{a_1} (\lambda - \lambda_2)^{a_2} \dotsm (\lambda - \lambda_m)^{a_m},
\]
where $\lambda_1, \lambda_2, \dotsc, \lambda_m$ are the distinct eigenvalues of $A$, and the algebraic multiplicities $a_1, a_2, \dotsc, a_m$ must add up to the degree $n$.
So if even one eigenvalue is deficient, in the sense that its geometric multiplicity is strictly less than its algebraic multiplicity, we won't obtain enough linearly independent eigenvectors from that eigenspace to contribute to the $n$ linearly independent eigenvectors we need in total.
On the other hand, if each eigenvalue has geometric multiplicity equal to its algebraic multiplicity, then forming eigenspace bases and collecting them all together will provide us with exactly $n$ eigenvectors, and Proposition 22.6.5 tells us that these eigenvectors will be linearly independent.
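The contrast in Statement 1 can be seen numerically in the sketch below (an aside, assuming NumPy and SciPy; both example matrices are arbitrary choices with eigenvalue $1$ of algebraic multiplicity $2$):

```python
import numpy as np
from scipy.linalg import null_space

# Two example matrices, each with eigenvalues 1, 1, 4 (so eigenvalue 1 has
# algebraic multiplicity 2 in both cases).
A_full = np.array([[2.0, 1.0, 1.0],
                   [1.0, 2.0, 1.0],
                   [1.0, 1.0, 2.0]])
A_deficient = np.array([[1.0, 1.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 4.0]])

for A in (A_full, A_deficient):
    # Geometric multiplicity of eigenvalue 1 = dimension of its eigenspace.
    geo_mult = null_space(1.0 * np.eye(3) - A).shape[1]
    print(geo_mult)   # 2 for A_full (diagonalizable), 1 for A_deficient (not)
```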
Proof of Statement 2.
In the case that an $n \times n$ matrix has $n$ different real eigenvalues, each of these eigenvalues must have algebraic multiplicity $1$, since otherwise these algebraic multiplicities would add up to more than $n$, the degree of the characteristic polynomial. So each geometric multiplicity is no greater than $1$. But also, as noted in Remark 22.6.8, each geometric multiplicity must be at least $1$. Thus, each geometric multiplicity for this matrix is exactly $1$, and so is equal to the corresponding algebraic multiplicity.
The result now follows from the first statement of this corollary.
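Finally, here is a small numerical sketch of Statement 2 (an aside, assuming NumPy; the triangular matrix is an arbitrary choice with three different real eigenvalues):

```python
import numpy as np

# A triangular example matrix with three different real eigenvalues 1, 2, 3,
# so Statement 2 guarantees it is diagonalizable.
A = np.array([[1.0, 5.0, -2.0],
              [0.0, 2.0,  7.0],
              [0.0, 0.0,  3.0]])

eigenvalues, P = np.linalg.eig(A)
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 8))            # numerically diagonal
print(np.round(eigenvalues, 8))  # 1, 2, 3 in some order
```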