Section 26.5 Theory
Subsection 26.5.1 Similarity is an equivalence relation
We will record here the result of Discovery 26.1 without proof, as we have already justified it in that activity.
Theorem 26.5.1.
Similarity is an equivalence relation on square matrices of a particular size. That is, similarity is reflexive, symmetric, and transitive.
As discussed in Subsection 26.3.3, Theorem 26.5.1 implies that similarity partitions \(\matrixring_n(\C)\) into similarity classes. There are some special similarity classes, which explains our initial focus on diagonal form (Chapter 25) as the first “simple” form of interest.
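Before turning to special classes, an informal numerical illustration may help. The following NumPy sketch (with arbitrary random choices of \(A\text{,}\) \(P\text{,}\) \(Q\text{,}\) not part of the text's formal development) spot-checks the symmetry and transitivity claims of Theorem 26.5.1; note that a random square matrix is invertible with probability \(1\text{.}\)

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
P = rng.standard_normal((n, n))  # invertible with probability 1
Q = rng.standard_normal((n, n))  # invertible with probability 1

B = np.linalg.inv(P) @ A @ P     # B is similar to A via P

# Symmetry: A is similar to B via the inverse transition matrix.
assert np.allclose(P @ B @ np.linalg.inv(P), A)

# Transitivity: if C is similar to B via Q, then C is similar to A via PQ.
C = np.linalg.inv(Q) @ B @ Q
assert np.allclose(np.linalg.inv(P @ Q) @ A @ (P @ Q), C)
```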
Proposition 26.5.2. Scalar similarity classes.
The similarity class of a scalar matrix contains only that matrix.
Proof idea.
All that is required is to prove that a scalar matrix \(k I\) can only be similar to itself. That is, if \(B \similar k I\) is true, then it must in fact be that \(B = kI\text{.}\) We leave the details to you, the reader.
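As a quick informal check of the proposition (a sketch with an arbitrary random \(P\) and an arbitrary scalar \(k\text{,}\) not a substitute for the proof details), conjugating \(k I\) by any invertible matrix returns \(k I\) itself:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 3, 2.5
P = rng.standard_normal((n, n))  # invertible with probability 1

# Conjugating the scalar matrix kI collapses back to kI, since
# inv(P) @ (k * I) @ P = k * inv(P) @ P = k * I.
B = np.linalg.inv(P) @ (k * np.eye(n)) @ P
assert np.allclose(B, k * np.eye(n))
```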
Remark 26.5.3.
Note that this proposition applies to both the zero matrix \(\zerovec = 0 I\) and the identity matrix \(I = 1 I\text{.}\)
Subsection 26.5.2 Properties of similar matrices
Now we'll state and prove several properties shared by similar matrices, where information about one matrix can be transferred to the other via a transition matrix that realizes the similarity.
Proposition 26.5.4. Powers of similar are similar.
If \(A\) and \(B\) are similar, then so are \(A^k\) and \(B^k\) for every positive integer \(k\text{.}\) If, in addition, we assume that one of \(A,B\) is invertible, then the other must be as well, and we may expand our first claim to include every integer \(k\text{.}\)
Proof.
Assume \(B = \inv{P} A P\text{.}\)
Case of positive \(k\).
We have
\begin{equation*}
B^k = (\inv{P} A P) (\inv{P} A P) (\inv{P} A P) \dotsm (\inv{P} A P)\text{,}
\end{equation*}
where the expression on the right has \(k\) factors of \((\inv{P} A P)\text{.}\) By associativity, rearrange the brackets:
\begin{equation*}
B^k = \inv{P} A (P \inv{P}) A (P \inv{P}) A \dotsm A (P \inv{P}) A P\text{.}
\end{equation*}
Each factor of \(P \inv{P}\) is equal to the identity matrix, and the identity matrix has no effect in matrix multiplication. Thus,
\begin{equation*}
B^k = \inv{P} A A \dotsm A P\text{,}
\end{equation*}
where there are \(k\) factors of \(A\text{,}\) since we started with \(k\) factors of \((\inv{P} A P)\text{.}\) We now have \(B^k = \inv{P} A^k P\text{,}\) showing that \(A^k\) and \(B^k\) are similar, as desired.
If we assume that \(A\) is invertible, then so is \(B\) since it is equal to the product of invertible matrices \(\inv{P}\text{,}\) \(A\text{,}\) and \(P\) (Statement 4 of Proposition 5.5.5). If we instead assume that \(B\) is invertible, then we can make the same argument using the expression \(A = P B \inv{P}\) to conclude that \(A\) is invertible as well.
For the remainder of the cases of showing similarity of powers of \(A\) and \(B\text{,}\) assume that \(A\) and \(B\) are both invertible.
Case of \(k = 0\).
By convention, both \(A^0 = I\) and \(B^0 = I\text{,}\) and the identity matrix is similar to itself.
Case of \(k = -1\).
Demonstrating similarity of \(\inv{A}\) and \(\inv{B}\) is straightforward, and we leave it to you, the reader.
Case of \(k < -1\).
Write \(k = - m\) for \(m\) a positive integer. If \(A\) and \(B\) are similar, then so are \(\inv{A}\) and \(\inv{B}\) by a previous case. But then so are positive powers of \(\inv{A}\) and \(\inv{B}\text{,}\) also by a previous case. In particular, the matrices
\begin{equation*}
(\inv{A})^m = A^{-m} = A^k \qquad \text{and} \qquad (\inv{B})^m = B^{-m} = B^k
\end{equation*}
are similar.
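To complement the proof, here is an informal NumPy sketch of the full claim; the random \(A\) and \(P\) are arbitrary choices, invertible with probability \(1\text{,}\) and matrix_power handles negative exponents by inverting first.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))  # invertible with probability 1
P = rng.standard_normal((n, n))  # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

# B^k = P^{-1} A^k P for negative, zero, and positive integers k.
for k in (-3, -1, 0, 1, 2, 5):
    Ak = np.linalg.matrix_power(A, k)
    Bk = np.linalg.matrix_power(B, k)
    assert np.allclose(Bk, np.linalg.inv(P) @ Ak @ P)
```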
Proposition 26.5.5.
Assume \(B = \inv{P} A P\text{.}\) Then
- The transition matrix \(P\) transforms null space vectors of \(B\) into null space vectors of \(A\text{.}\)
- The transition matrix \(P\) transforms column space vectors of \(B\) into column space vectors of \(A\text{.}\)
Proof idea for Statement 1.
Assume \(\uvec{x} = \uvec{x}_0\) is a solution to \(B \uvec{x} = \zerovec\text{,}\) so that \(B \uvec{x}_0 = \zerovec\) is true. Substitute the similarity relation \(B = \inv{P} A P\) into this vector equality and rearrange to demonstrate that \(\uvec{x} = P \uvec{x}_0\) is a solution to \(A \uvec{x} = \zerovec\text{.}\)
Proof idea for Statement 2.
If vector \(\uvec{b}\) is in the column space of \(B\text{,}\) then the system \(B \uvec{x} = \uvec{b}\) is consistent. (See the discussion in Subsection 21.3.1.) So suppose \(\uvec{x} = \uvec{x}_0\) is a solution to this system, so that \(B \uvec{x}_0 = \uvec{b}\) is true. Substitute the similarity relation \(B = \inv{P} A P\) into this vector equality and rearrange to demonstrate that \(\uvec{x} = P \uvec{x}_0\) is a solution to the system \(A \uvec{x} = P \uvec{b}\text{.}\) Hence this system is consistent, which means that \(P \uvec{b}\) is in the column space of \(A\text{.}\)
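Both transfer statements can be illustrated informally. In the sketch below (an arbitrary construction, not from the text), \(A\) is built as a product of random \(4 \times 2\) and \(2 \times 4\) factors so that it has a nontrivial null space, and a null space vector is extracted via the SVD.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 4, 2
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank 2
P = rng.standard_normal((n, n))  # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

# Statement 1: a null space vector x0 of B gives P x0 in the null space of A.
_, s, Vt = np.linalg.svd(B)
x0 = Vt[-1]                       # right singular vector for smallest singular value
assert np.allclose(B @ x0, 0)
assert np.allclose(A @ (P @ x0), 0)

# Statement 2: b in the column space of B gives P b in the column space of A.
b = B[:, 0]                        # certainly in col(B)
x, *_ = np.linalg.lstsq(A, P @ b, rcond=None)
assert np.allclose(A @ x, P @ b)   # the system A x = P b is consistent
```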
The following corollary follows from the proposition because the transition matrix \(P\) is invertible.
Corollary 26.5.6. Rank and nullity of similar matrices.
If \(A\) and \(B\) are similar matrices, then the dimensions of the null spaces of the two matrices are equal, and also the dimensions of the column spaces of the two matrices are equal. In other words, similar matrices have the same nullity and rank.
Proof idea.
We will outline the proof of equal nullities by considering maximal linearly independent sets of null space vectors; the proof of equal ranks can be obtained similarly by considering maximal linearly independent sets of column space vectors.
We have already seen in Proposition 26.5.5 that if \(B = \inv{P} A P\text{,}\) then the transition matrix \(P\) transforms null space vectors of \(B\) into null space vectors of \(A\text{.}\) The symmetry of the similarity relation tells us that \(\inv{P}\) will do the reverse: transform null space vectors of \(A\) into null space vectors of \(B\text{.}\) Applying Statement 2 of Proposition 21.5.1 with \(E = P\) to a maximal linearly independent set of null space vectors of \(B\text{,}\) or with \(E = \inv{P}\) to a maximal linearly independent set of null space vectors of \(A\text{,}\) we see that such maximal linearly independent sets must be the same size for \(A\) as for \(B\text{.}\) As dimension is the size of a basis, and a basis is a maximal linearly independent set (see Statement 2.a of Theorem 19.5.2), the dimensions of the two null spaces must be the same.
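A hedged numerical illustration of the corollary (with an arbitrary rank-\(3\) construction for \(A\text{,}\) our own choice) might look like the following; equal ranks force equal nullities as well, since nullity equals \(n\) minus rank for an \(n \times n\) matrix.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r = 5, 3
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank 3
P = rng.standard_normal((n, n))  # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

# Similar matrices share rank, hence also nullity (= n - rank).
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B) == r
```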
Now we record some facts relating eigenvalues and eigenvectors between similar matrices. Note that we have already recorded (and proved) some of these facts in Subsection 25.6.1.
Proposition 26.5.7.
Similar matrices have the same determinant.
Proof.
We have already stated and proved this fact as Statement 1 of Proposition 25.6.1.
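Informally, this is easy to spot-check numerically; in the sketch below \(A\) and \(P\) are arbitrary random matrices, and the equality holds because \(\det(\inv{P}) \det(A) \det(P) = \det(A)\text{.}\)

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))  # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

# det(B) = det(inv(P)) * det(A) * det(P) = det(A).
assert np.isclose(np.linalg.det(A), np.linalg.det(B))
```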
Theorem 26.5.8.
Similar matrices have the same characteristic polynomial, hence also the same eigenvalues with the same algebraic multiplicities. Moreover, if \(B = \inv{P} A P\text{,}\) then for each shared eigenvalue the transition matrix \(P\) transforms eigenvectors of \(B\) into eigenvectors of \(A\text{.}\)
Proof idea.
Note that we have already stated and proved the fact about eigenvalues and their multiplicities as Statement 3 of Proposition 25.6.1. So all that remains is to prove the proposed fact about eigenvectors.
So assume \(B = \inv{P} A P\text{,}\) and suppose \(\uvec{x}_0\) is an eigenvector of \(B\) corresponding to eigenvalue \(\lambda\text{.}\) By definition, this means
\begin{equation*}
B \uvec{x}_0 = \lambda \uvec{x}_0\text{.}
\end{equation*}
Substitute the assumed similarity relation for \(A\) and \(B\) and rearrange to
\begin{equation*}
A (P \uvec{x}_0) = \lambda (P \uvec{x}_0)\text{,}
\end{equation*}
which says that \(P \uvec{x}_0\) is an eigenvector of \(A\) corresponding to the shared eigenvalue \(\lambda\text{.}\) (Note that \(P \uvec{x}_0 \neq \zerovec\) because \(P\) is invertible and \(\uvec{x}_0 \neq \zerovec\text{,}\) so \(P \uvec{x}_0\) is indeed an eigenvector.)
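The theorem can also be spot-checked numerically; the sketch below uses arbitrary random matrices, np.poly for characteristic polynomial coefficients, and np.linalg.eig for an eigenpair of \(B\) (which may be complex even for a real matrix).

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))  # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

# Same characteristic polynomial (np.poly returns its coefficients).
assert np.allclose(np.poly(A), np.poly(B))

# P transforms an eigenvector of B into an eigenvector of A
# for the shared eigenvalue lam.
evals, evecs = np.linalg.eig(B)
lam, x0 = evals[0], evecs[:, 0]
assert np.allclose(B @ x0, lam * x0)
assert np.allclose(A @ (P @ x0), lam * (P @ x0))
```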
Just as with null spaces and column spaces of similar matrices above, the following corollary follows from the theorem because the transition matrix \(P\) is invertible.
Corollary 26.5.9.
If \(A\) and \(B\) are similar matrices and \(\lambda\) is a shared eigenvalue of these two matrices, then the eigenspace of \(A\) corresponding to \(\lambda\) has the same dimension as the eigenspace of \(B\) corresponding to \(\lambda\text{.}\) In other words, the geometric multiplicity of \(\lambda\) as an eigenvalue of \(A\) is the same as the geometric multiplicity of \(\lambda\) as an eigenvalue of \(B\text{.}\)
Proof idea.
This is essentially the same as the proof of Corollary 26.5.6: if \(B = \inv{P} A P\text{,}\) then from Theorem 26.5.8 we know that the transition matrix \(P\) will transform vectors in the eigenspace \(E_\lambda(B)\) into vectors in the eigenspace \(E_\lambda(A)\) for any shared eigenvalue of \(A\) and \(B\text{,}\) and \(\inv{P}\) will do the reverse. Applying Statement 2 of Proposition 21.5.1 with \(E = P\) to a maximal linearly independent set of vectors in \(E_\lambda(B)\text{,}\) or with \(E = \inv{P}\) to a maximal linearly independent set of vectors in \(E_\lambda(A)\text{,}\) we see that such maximal linearly independent sets must be the same size for \(A\) as for \(B\text{.}\)
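As a final informal check, one can compare geometric multiplicities directly, computing \(\dim E_\lambda(M)\) as the nullity of \(\lambda I - M\text{.}\) In the sketch below, \(A\) is an arbitrary diagonalizable construction with eigenvalue \(2\) of geometric multiplicity \(2\text{,}\) and the helper geo_mult is a hypothetical name introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
S = rng.standard_normal((n, n))        # invertible with probability 1
D = np.diag([2.0, 2.0, 3.0, 5.0])      # eigenvalue 2 repeated
A = S @ D @ np.linalg.inv(S)
P = rng.standard_normal((n, n))        # invertible with probability 1
B = np.linalg.inv(P) @ A @ P

def geo_mult(M, lam):
    # dim E_lam(M) = nullity of (lam I - M) = n - rank(lam I - M)
    return n - np.linalg.matrix_rank(lam * np.eye(n) - M)

for lam in (2.0, 3.0, 5.0):
    assert geo_mult(A, lam) == geo_mult(B, lam)
assert geo_mult(A, 2.0) == 2
```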