
Discover Linear Algebra

Section 26.5 Theory

Subsection 26.5.1 Properties of trace

We have introduced a new matrix operation in this chapter, so we will record some of its algebraic properties. Most of these are straightforward to verify from the definition of the trace of a square matrix (see the Terminology and notation section for this chapter), and we leave that task to you, the reader.

Proof of Statement 6.

Suppose that \(A,B\) are \(n \times n\) matrices. Set \(C = A B\) and \(D = B A\text{.}\) Since \(C\) is a product matrix, each of its entries is obtained from the product of a row of \(A\) with a column of \(B\text{.}\) In particular, the diagonal entries of \(C\) are:
\begin{equation*} \begin{array}{rcccccccc} c_{11} \amp = \amp a_{11} b_{11} \amp + \amp a_{12} b_{21} \amp + \amp \dotsb \amp + \amp a_{1n} b_{n1} \text{,} \\ c_{22} \amp = \amp a_{21} b_{12} \amp + \amp a_{22} b_{22} \amp + \amp \dotsb \amp + \amp a_{2n} b_{n2} \text{,} \\ \amp \vdots \\ c_{nn} \amp = \amp a_{n1} b_{1n} \amp + \amp a_{n2} b_{2n} \amp + \amp \dotsb \amp + \amp a_{nn} b_{nn} \text{,} \end{array} \end{equation*}
and \(\trace C\) is the sum of these sums. If you form a similar expression for each diagonal entry of \(D\) as a row of \(B\) times the corresponding column of \(A\text{,}\) you will find that, as a grid of products of matrix entries, it is simply the transpose of the grid above. Since transposing a grid does not change its collection of entries, the total sums of the two grids are the same, and so \(\trace D = \trace C\text{.}\)
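For example, in the \(2 \times 2\) case the two grids total to
\begin{align*} \trace (A B) \amp = a_{11} b_{11} + a_{12} b_{21} + a_{21} b_{12} + a_{22} b_{22} \text{,} \\ \trace (B A) \amp = b_{11} a_{11} + b_{12} a_{21} + b_{21} a_{12} + b_{22} a_{22} \text{,} \end{align*}
and comparing term-by-term shows that the two totals agree.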

Subsection 26.5.2 Similarity is an equivalence relation

We will record here the result of Discovery 26.1 without proof, as we have already justified it in that activity.
As discussed in Subsection 26.3.3, Theorem 26.5.2 implies that similarity partitions \(\matrixring_n(\C)\) into similarity classes. There are some special similarity classes, which explains our initial focus on diagonal form (Chapter 25) as the first “simple” form of interest.

Proof idea.

All that is required is to prove that a scalar matrix \(k I\) can only be similar to itself. That is, if \(B \similar k I\) is true, then it must in fact be that \(B = kI\text{.}\) We leave the details to you, the reader.
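As a hint, note that for any invertible matrix \(Q\) we have
\begin{equation*} \inv{Q} (k I) Q = k (\inv{Q} I Q) = k (\inv{Q} Q) = k I \text{,} \end{equation*}
so that every matrix similar to \(k I\) must be equal to \(k I\) itself.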

Remark 26.5.4.

Note that this proposition applies to both the zero matrix \(\zerovec = 0 I\) and the identity matrix \(I = 1 I\text{.}\)

Subsection 26.5.3 Properties of similar matrices

Now we’ll state and prove the properties of similar matrices that can be transferred from one matrix to the other via a transition matrix realizing the similarity.

Proof.

Assume \(B = \inv{P} A P\text{.}\)
Case of positive \(k\).
We have
\begin{equation*} B^k = (\inv{P} A P)^k = (\inv{P} A P) (\inv{P} A P) (\inv{P} A P) \dotsm (\inv{P} A P), \end{equation*}
where the expression on the right has \(k\) factors of \((\inv{P} A P)\text{.}\) By associativity, rearrange the brackets:
\begin{equation*} B^k = \inv{P} A (P \inv{P}) A (P \inv{P}) A \dotsm A (P \inv{P}) A P \text{.} \end{equation*}
Each factor of \(P \inv{P}\) is equal to the identity matrix, and the identity matrix has no effect in matrix multiplication. Thus,
\begin{equation*} B^k = \inv{P} A A A \dotsm A P, \end{equation*}
where there are \(k\) factors of \(A\text{,}\) since we started with \(k\) factors of \((\inv{P} A P)\text{.}\) We now have \(B^k = \inv{P} A^k P\text{,}\) showing that \(A^k\) and \(B^k\) are similar, as desired.
If we assume that \(A\) is invertible, then so is \(B\text{,}\) since it is equal to the product of the invertible matrices \(\inv{P}\text{,}\) \(A\text{,}\) and \(P\) (Statement 4 of Proposition 5.5.5). If we instead assume that \(B\) is invertible, then we can make the same argument using the expression \(A = P B \inv{P}\) to conclude that \(A\) is invertible as well.
For the remaining cases concerning similarity of powers of \(A\) and \(B\text{,}\) assume that \(A\) and \(B\) are both invertible.
Case of \(k = 0\).
By convention, both \(A^0 = I\) and \(B^0 = I\text{,}\) and the identity matrix is similar to itself.
Case of \(k = -1\).
Demonstrating similarity of \(\inv{A}\) and \(\inv{B}\) is straightforward, and we leave it to you, the reader.
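As a hint, recall that the inverse of a product of invertible matrices is the product of the inverses in the reverse order, so that
\begin{equation*} \inv{B} = (\inv{P} A P)^{-1} = \inv{P} \, \inv{A} \, \bigl(\inv{P}\bigr)^{-1} = \inv{P} \inv{A} P \text{,} \end{equation*}
which exhibits the similarity of \(\inv{A}\) and \(\inv{B}\) via the same transition matrix \(P\text{.}\)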
Case of \(k \lt -1\).
Write \(k = - m\) for \(m\) a positive integer. If \(A\) and \(B\) are similar, then so are \(\inv{A}\) and \(\inv{B}\) by a previous case. But then so are positive powers of \(\inv{A}\) and \(\inv{B}\text{,}\) also by a previous case. In particular, the matrices
\begin{align*} \bigl(\inv{A}\bigr)^m \amp = A^{-m} = A^k \text{,} \amp \bigl(\inv{B}\bigr)^m \amp = B^{-m} = B^k \end{align*}
are similar.

Proof idea for Statement 1.

Assume \(\uvec{x} = \uvec{x}_0\) is a solution to \(B \uvec{x} = \zerovec\text{,}\) so that \(B \uvec{x}_0 = \zerovec\) is true. Substitute the similarity relation \(B = \inv{P} A P\) into this vector equality and rearrange to demonstrate that \(\uvec{x} = P \uvec{x}_0\) is a solution to \(A \uvec{x} = \zerovec\text{.}\)
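In detail, the substitution and rearrangement run as
\begin{align*} \inv{P} A P \uvec{x}_0 \amp = \zerovec \\ A (P \uvec{x}_0) \amp = P \zerovec = \zerovec \text{,} \end{align*}
where we have multiplied both sides on the left by \(P\) in the second step.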

Proof idea for Statement 2.

If vector \(\uvec{b}\) is in the column space of \(B\text{,}\) then the system \(B \uvec{x} = \uvec{b}\) is consistent. (See the discussion in Subsection 21.3.1.) So suppose \(\uvec{x} = \uvec{x}_0\) is a solution to this system, so that \(B \uvec{x}_0 = \uvec{b}\) is true. Substitute the similarity relation \(B = \inv{P} A P\) into this vector equality and rearrange to demonstrate that \(\uvec{x} = P \uvec{x}_0\) is a solution to the system \(A \uvec{x} = P \uvec{b}\text{.}\) Hence this system is consistent, which means that \(P \uvec{b}\) is in the column space of \(A\text{.}\)
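In detail, substituting gives \(\inv{P} A P \uvec{x}_0 = \uvec{b}\text{,}\) and multiplying both sides on the left by \(P\) yields
\begin{equation*} A (P \uvec{x}_0) = P \uvec{b} \text{,} \end{equation*}
so that \(\uvec{x} = P \uvec{x}_0\) is indeed a solution to the system \(A \uvec{x} = P \uvec{b}\text{.}\)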
The following corollary follows from the proposition because the transition matrix \(P\) is invertible.

Proof idea.

We will outline the proof of equal nullities by considering maximal linearly independent sets of null space vectors; the proof of equal ranks can be obtained similarly by considering maximal linearly independent sets of column space vectors.
We have already seen in Proposition 26.5.6 that if \(B = \inv{P} A P\text{,}\) then the transition matrix \(P\) transforms null space vectors of \(B\) into null space vectors of \(A\text{.}\) The symmetry of the similarity relation tells us that \(\inv{P}\) will do the reverse: transform null space vectors of \(A\) into null space vectors of \(B\text{.}\) Applying Statement 2 of Proposition 21.5.1 with \(E = P\) to a maximal linearly independent set of null space vectors of \(B\text{,}\) or with \(E = \inv{P}\) to a maximal linearly independent set of null space vectors of \(A\text{,}\) we see that such maximal linearly independent sets must be the same size for \(A\) as for \(B\text{.}\) As dimension is the size of a basis, and a basis is a maximal linearly independent set (see Statement 2.a of Theorem 19.5.2), the dimensions of the two null spaces must be the same.
Now we record some facts relating eigenvalues and eigenvectors between similar matrices. Note that we have already recorded (and proved) some of these facts in Subsection 25.6.1.

Proof.

We have already stated and proved the fact about determinants as Statement 1 of Proposition 25.6.1.
To prove that similar matrices have the same trace, we will use Statement 6 of Proposition 26.5.1. If \(B = \inv{P} A P\) for some invertible matrix \(P\text{,}\) then
\begin{align*} \trace B \amp = \trace (\inv{P} A P) \\ \amp = \trace \bbrac{\inv{P} (A P)} \\ \amp = \trace \bbrac{(A P) \inv{P}} \\ \amp = \trace \bbrac{A (P \inv{P})} \\ \amp = \trace (A I) \\ \amp = \trace A \text{.} \end{align*}
Note that in the above manipulations, we are not asserting that \(\inv{P} (A P)\) is equal to \((A P) \inv{P}\text{.}\) Instead, we are using the \(\trace (M N) = \trace (N M)\) pattern from Proposition 26.5.1, applied to \(M = \inv{P}\) and \(N = A P\text{.}\)

Proof idea.

Note that we have already stated and proved the fact about eigenvalues and their multiplicities as Statement 3 of Proposition 25.6.1. So all that remains is to prove the proposed fact about eigenvectors.
Assume \(B = \inv{P} A P\text{,}\) and suppose \(\uvec{x}_0\) is an eigenvector of \(B\) corresponding to eigenvalue \(\lambda\text{.}\) By definition, this means
\begin{equation*} B \uvec{x}_0 = \lambda \uvec{x}_0\text{.} \end{equation*}
Substituting the assumed similarity relation gives \(\inv{P} A P \uvec{x}_0 = \lambda \uvec{x}_0\text{,}\) and multiplying both sides on the left by \(P\) rearranges this to
\begin{equation*} A (P \uvec{x}_0) = \lambda (P \uvec{x}_0)\text{,} \end{equation*}
which says that \(P \uvec{x}_0\) is an eigenvector of \(A\) corresponding to the shared eigenvalue \(\lambda\text{.}\) (Note that \(P \uvec{x}_0\) is nonzero, as required of an eigenvector, since \(\uvec{x}_0\) is nonzero and \(P\) is invertible.)
As with the null spaces and column spaces of similar matrices above, the following corollary follows from the theorem because the transition matrix \(P\) is invertible.

Proof idea.

This is essentially the same as the proof of Corollary 26.5.7. If \(B = \inv{P} A P\text{,}\) then from Theorem 26.5.9 we know that the transition matrix \(P\) will transform vectors in the eigenspace \(E_\lambda(B)\) into vectors in the eigenspace \(E_\lambda(A)\) for any shared eigenvalue \(\lambda\) of \(A\) and \(B\text{,}\) and \(\inv{P}\) will do the reverse. Applying Statement 2 of Proposition 21.5.1 with \(E = P\) to a maximal linearly independent set of vectors in \(E_\lambda(B)\text{,}\) or with \(E = \inv{P}\) to a maximal linearly independent set of vectors in \(E_\lambda(A)\text{,}\) we see that such maximal linearly independent sets must be the same size for \(A\) as for \(B\text{.}\)