Section 26.3 Concepts
Subsection 26.3.1 The geometry of similarity
In Discovery 26.2, we explored the geometric nature of similarity between matrices by examining how each matrix in a similar pair “transformed” vectors in \(\R^n\text{.}\) In effect, the columns of the transition matrix \(P\) form a new basis \(\basisfont{B}\) of \(\R^n\text{,}\) making \(P\) and \(\inv{P}\) transition matrices from \(\basisfont{B}\) to the standard basis \(\basisfont{S}\) and vice versa (justifying our double definition of the term transition matrix in Section 22.2 and Section 25.2). The new basis \(\basisfont{B}\) forms a new set of axes in \(\R^n\text{,}\) and we then find that the way one matrix transforms a standard-coordinate vector, viewed relative to the new axes, agrees with the way the other matrix transforms the same vector after it is first converted to the new coordinates.
The goal is to understand the first matrix and how it transforms vectors in \(\R^n\) by finding a second similar matrix that is significantly simpler to understand, and then to use the transition matrices \(P\) and \(\inv{P}\) to convert information about the simpler matrix into information about the original matrix.
Relating this back to Discovery 26.2, when we plotted the vectors
\[ \matrixOf{\uvec{v}}{B} \qquad \text{and} \qquad B \matrixOf{\uvec{v}}{B} \]
on their own set of \(wz\)-axes, we observed that the simple diagonal form of \(B\) made it easy to determine how \(\matrixOf{\uvec{v}}{B}\) is transformed by \(B\text{:}\) a stretch by a factor of \(4\) in the \(w\)-direction and both a reflection and a stretch by a factor of \(2\) in the \(z\)-direction. After superimposing a set of \(wz\)-axes onto the \(xy\)-axes using the \(\basisfont{B}\) vectors, we observed that \(\uvec{v}\) was transformed by \(A\) in exactly the same way relative to the new superimposed axes. This should not have been a surprise, since the formula
\[ A \uvec{v} = P B \inv{P} \uvec{v} \]
(where \(A = P B \inv{P}\) by the reversal of the similarity relation \(\inv{P} A P = B \)) can be translated as:
to transform \(\uvec{v}\) by \(A\text{,}\) one can instead convert \(\uvec{v}\) to \(\basisfont{B}\)-coordinates, transform the result by \(B\text{,}\) and then convert that transformed vector back to standard coordinates.
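The three-step reading of \(A \uvec{v} = P B \inv{P} \uvec{v}\) can be checked numerically. The following is a minimal sketch in Python using exact fractions. The matrices `P` and `B` and the vector `v` are hypothetical choices made for illustration (they are not claimed to be the ones from Discovery 26.2), and `matmul` and `inv2` are small helpers written only for this example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical data, assumed for illustration only.
P = [[1, 1], [1, 2]]      # columns of P are the new basis vectors
B = [[4, 0], [0, -2]]     # diagonal: stretch by 4, reflect and stretch by 2
P_inv = inv2(P)
A = matmul(matmul(P, B), P_inv)   # A = P B P^{-1}

v = [[3], [5]]            # a vector in standard coordinates (as a column)
v_B = matmul(P_inv, v)    # step 1: convert v to coordinates relative to the new basis
w_B = matmul(B, v_B)      # step 2: transform the coordinate vector by B
w = matmul(P, w_B)        # step 3: convert back to standard coordinates
direct = matmul(A, v)     # transform v by A directly

print(w == direct)        # the two routes should agree
```

Because the arithmetic is exact, the comparison at the end is a genuine equality test rather than a floating-point approximation.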
Subsection 26.3.2 The algebra of similarity
In Discovery 26.2, we explored the algebraic connections between similar matrices \(A,B\) and the transition matrix \(P\) that realizes the similarity by \(\inv{P} A P = B\text{.}\)
Since inverses are complicated to work with, we first converted the similarity relation into \(A P = P B\text{,}\) and then we analyzed this equality of matrices by comparing corresponding columns of each side. Using the pattern of (\(\star\star\star\)) in Subsection 4.3.7, the columns of \(AP\) are
\[ A \uvec{p}_1, \; A \uvec{p}_2, \; \dotsc, \; A \uvec{p}_n \text{,} \]
where the \(\uvec{p}_j\) are the columns of \(P\text{.}\) On the other side, the same matrix multiplication pattern gives the columns of \(PB\) as
\[ P \uvec{b}_1, \; P \uvec{b}_2, \; \dotsc, \; P \uvec{b}_n \text{,} \]
where the \(\uvec{b}_j\) are the columns of \(B\text{.}\) To have both sides expressed in terms of the columns of \(P\text{,}\) we used the matrix-times-vector pattern of (\(\star\star\star\)) in Subsection 4.3.7 to write each \(P \uvec{b}_j\) as a linear combination of the columns of \(P\text{,}\) with coefficients from the column \(\uvec{b}_j\).
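Putting both sides together, for each index \(j\) we must have
\[ A \uvec{p}_j = b_{1j} \uvec{p}_1 + b_{2j} \uvec{p}_2 + \dotsb + b_{nj} \uvec{p}_n \text{,} \]
where \(b_{1j}, b_{2j}, \dotsc, b_{nj}\) are the entries of the column \(\uvec{b}_j\text{.}\)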
Since \(P\) is assumed invertible, its columns form a basis of \(\R^n\text{.}\) And linear combinations of basis vectors are precisely how we determine coordinates relative to a basis, which leads from the above equality of columns to the following pattern.
Pattern 26.3.1. Similarity by columns.
Square matrices \(A\) and \(B\) are similar via \(\inv{P} A P = B\) if and only if each column of \(B\) is the coordinate vector relative to \(\basisfont{B}_P\) of the result of transforming the corresponding column of \(P\) by \(A\text{,}\) where \(\basisfont{B}_P\) is the basis of \(\R^n\) formed by the columns of \(P\text{.}\) In symbols, we can write this condition as
\[ \uvec{b}_j = \matrixOf{A \uvec{p}_j}{\basisfont{B}_P} \]
for each column \(\uvec{b}_j\) of \(B\) and each corresponding column \(\uvec{p}_j\) of \(P\text{.}\)
This pattern will guide our exploration of matrix forms in most of the remainder of Part III of this book.
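Pattern 26.3.1 can be tested column by column on a small example. The sketch below is illustrative only: the matrices `A` and `P` are hypothetical, and the helpers `matmul`, `inv2`, and `column` are written just for this example. It relies on the fact that multiplying by \(\inv{P}\) converts a standard-coordinate vector to coordinates relative to the basis formed by the columns of \(P\text{.}\)

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def column(M, j):
    """Column j of M, returned as an n x 1 matrix."""
    return [[row[j]] for row in M]

# Hypothetical data, assumed for illustration only.
A = [[10, -6], [12, -8]]
P = [[1, 1], [1, 2]]
P_inv = inv2(P)
B = matmul(matmul(P_inv, A), P)   # B = P^{-1} A P

checks = []
for j in range(2):
    Ap_j = matmul(A, column(P, j))     # transform column j of P by A
    coords = matmul(P_inv, Ap_j)       # coordinates of A p_j relative to the columns of P
    checks.append(coords == column(B, j))

print(checks)   # each entry reports whether column j of B matches the pattern
```

In this example each \(A \uvec{p}_j\) happens to be a scalar multiple of \(\uvec{p}_j\text{,}\) which is why the resulting \(B\) is diagonal.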
Subsection 26.3.3 Similarity classes
In Discovery 26.1, we demonstrated that the similarity relation is
- symmetric
that is, a matrix is never similar to another matrix in one direction only: if \(A \similar B\) then also \(B \similar A\text{,}\) so the two matrices are similar to each other;
- reflexive
that is, every matrix is similar to itself; and
- transitive
that is, similarity can be transferred from first to third in a chain of similarities \(A \similar B \similar C\text{.}\)
These three properties together demonstrate that similarity is an equivalence relation.
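Each of the three properties comes with an explicit transition matrix, and this can be illustrated on a concrete example. The sketch below uses hypothetical matrices `A`, `P`, and `Q` and helper functions written only for this example. It assumes \(B = \inv{P} A P\) and \(C = \inv{Q} B Q\text{,}\) and then tests the candidates \(\inv{P}\) (for symmetry), the identity matrix (for reflexivity), and \(PQ\) (for transitivity).

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def conjugate(M, T):
    """Return T^{-1} M T for an invertible 2x2 matrix T."""
    return matmul(matmul(inv2(T), M), T)

# Hypothetical data, assumed for illustration only.
A = [[1, 2], [3, 4]]
P = [[1, 1], [1, 2]]
Q = [[2, 1], [1, 1]]
I = [[1, 0], [0, 1]]

B = conjugate(A, P)   # A ~ B via P
C = conjugate(B, Q)   # B ~ C via Q

symmetric_ok = conjugate(B, inv2(P)) == A        # B ~ A via P^{-1}
reflexive_ok = conjugate(A, I) == A              # A ~ A via the identity
transitive_ok = conjugate(A, matmul(P, Q)) == C  # A ~ C via PQ

print(symmetric_ok, reflexive_ok, transitive_ok)
```

A single numerical example does not prove the properties, but it shows which transition matrix does the work in each case.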
A consequence of this fact is that similarity partitions \(\matrixring_n(\C)\) into similarity classes: subsets of matrices that are all similar to each other but not to any matrix outside the subset. To see why this is so, consider a particular square matrix \(A\text{.}\) The similarity class of \(A\) is defined to be the collection of all matrices that are similar to \(A\text{.}\) By the reflexive property, this class contains \(A\) itself, so the class is always nonempty. Suppose the class also contains matrices \(B\) and \(C\text{,}\) so that both \(B \similar A\) and \(C \similar A\) are true. Then the symmetric property lets us turn this into a chain \(B \similar A \similar C\text{,}\) which in turn lets us conclude \(B \similar C\) by the transitive property. So matrices in a particular similarity class are all similar to each other. Furthermore, a matrix \(B\) in the similarity class of \(A\) cannot be similar to a matrix \(D\) outside of that class, for if \(B \similar D\) is true then again we can form a chain \(A \similar B \similar D\text{,}\) so that \(A \similar D\) by the transitive property and \(D\) must be in the class after all.
Subsection 26.3.4 The similarities of similar matrices
The remaining activities in Discovery guide 26.1 provided justification for using the adjective similar in the terminology similar matrices. Just as we were able to use the transition matrices \(P\) and \(\inv{P}\) to relate the geometric action of \(A\) on \(\R^n\) relative to one set of coordinates to the geometric action of \(B\) on \(\R^n\) relative to another set of coordinates in Subsection 26.3.1, those same transition matrices allow us to relate algebraic properties of the similar matrices.
We will formally state and prove many of the statements from those remaining discovery activities in Subsection 26.5.2.
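As a small preview, two standard examples of properties shared by similar matrices are the determinant and the trace. The text above does not list which properties the discovery activities covered, so these two are chosen here as an assumption. The sketch below compares them for a hypothetical similar pair, using helper functions written only for this example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def det2(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

# Hypothetical data, assumed for illustration only.
A = [[10, -6], [12, -8]]
P = [[1, 1], [1, 2]]
B = matmul(matmul(inv2(P), A), P)   # B = P^{-1} A P, so A and B are similar

print(det2(A) == det2(B), trace(A) == trace(B))
```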