Discover Linear Algebra

Section 40.1 Motivation

Recall that we can use a square matrix \(A\) to transform column vectors by multiplication. But if we have a transition matrix \(P\) that puts \(A\) in diagonal form, we can consider
\begin{equation*} P = \ucobmtrx{B}{S}\text{,} \end{equation*}
where \(\basisfont{S}\) is the standard basis (of either \(\R^n\) or \(\C^n\text{,}\) as appropriate) and \(\basisfont{B}\) is the basis formed by the columns of \(P\text{.}\) And then also
\begin{equation*} \inv{P} = \ucobmtrx{S}{B}\text{.} \end{equation*}
So if we have
\begin{equation*} A = P D \inv{P} \end{equation*}
for some diagonal matrix \(D\text{,}\) and we consider how \(A\) transforms column vectors by multiplication, we can instead think of \(\inv{P}\) as first converting standard coordinates to \(\basisfont{B}\)-coordinates, the diagonal matrix \(D\) transforming those converted \(\basisfont{B}\)-column vectors by simply scaling each coordinate by the corresponding diagonal entry, and then \(P\) converting the result back to standard coordinates.
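The three-step reading of \(A = P D \inv{P}\) can be checked numerically. The following is a minimal sketch using exact rational arithmetic from Python's standard library. The specific matrices \(A\text{,}\) \(P\text{,}\) \(D\) and the helper names `matvec`, `inverse_2x2`, and `apply_via_diagonalization` are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a column vector (a list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix, assumed to be invertible."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative values (assumed for this sketch, not from the text).
A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]  # columns are eigenvectors of A
D = [[5, 0], [0, 2]]   # corresponding eigenvalues on the diagonal


def apply_via_diagonalization(P, D, v):
    """Compute P D P^{-1} v in the three stages described above."""
    b_coords = matvec(inverse_2x2(P), v)  # standard -> B-coordinates
    scaled = matvec(D, b_coords)          # scale each B-coordinate
    return matvec(P, scaled)              # B-coordinates -> standard


v = [3, 0]
# Both routes should produce the same vector.
print(apply_via_diagonalization(P, D, v) == matvec(A, v))
```

Here the middle step is the only one that "does" anything to the vector; the outer two steps merely change which coordinates are used to describe it.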
We explored this in detail for a real \(2 \times 2\) matrix in Discovery 26.2, where we used the columns of \(P\) to represent a new set of \(wz\)-axes (just as standard vectors in \(\R^2\) represent the \(xy\)-axes).
Diagram of a change of variables via transition from one set of two-dimensional axes to another set.
A diagram illustrating how a \(2 \times 2\) transition matrix \(P\) can be used to transform geometry from one set of two-dimensional axes to another.
On the left appears a set of \(xy\)-axes along with their representative standard basis vectors \(\uvec{e}_1, \uvec{e}_2\text{.}\) Another set of orthogonal vectors \(\uvec{p}_1, \uvec{p}_2\) is drawn with their initial points at the origin, representing the columns of \(P\text{.}\) Underneath and in parallel with \(\uvec{p}_1\) is drawn a slant axis using a dashed line, labelled as the \(w\)-axis, and similarly, underneath and in parallel with \(\uvec{p}_2\) is drawn another slant axis using a dashed line, labelled as the \(z\)-axis. In this example, these slant axes are perpendicular to each other. An arbitrary vector \(\uvec{v}\) is also pictured, with its initial point at the origin, and a rectangle is drawn with two sides along the \(wz\)-axes so that \(\uvec{v}\) is a diagonal within the rectangle, indicating the representation of \(\uvec{v}\) in \(wz\)-coordinates.
On the right appears another set of perpendicular axes, but now the horizontal axis is labelled as the \(w\)-axis and the vertical axis is labelled as the \(z\)-axis. A unit vector with initial point at the origin points along each of these axes, labelled as representing vectors \({\widetilde{\uvec{p}}}_1 = \inv{P} \uvec{p}_1\) and \({\widetilde{\uvec{p}}}_2 = \inv{P} \uvec{p}_2\text{,}\) respectively. The transformed vectors \({\widetilde{\uvec{e}}}_1 = \inv{P} \uvec{e}_1\) and \({\widetilde{\uvec{e}}}_2 = \inv{P} \uvec{e}_2\) are also drawn with their initial points at the origin. Slant axes using dashed lines are drawn “through” and in parallel with these vectors, with the axis parallel to \({\widetilde{\uvec{e}}}_1\) labelled as the \(x\)-axis and the axis parallel to \({\widetilde{\uvec{e}}}_2\) labelled as the \(y\)-axis. Again, in this example, these slant axes are perpendicular to each other. The transformed vector \(\widetilde{\uvec{v}} = \inv{P} \uvec{v}\) is also drawn with its initial point at the origin, and again a rectangle is drawn with two sides along the \(wz\)-axes so that \(\widetilde{\uvec{v}}\) is a diagonal within the rectangle. This rectangle represents the transformed result of the rectangle from the set of axes on the left of the diagram.
Finally, curved arrows appear between the two sets of axes on each side of the diagram, one pointing left-to-right and labelled \(\inv{P}\text{,}\) and one pointing right-to-left and labelled \(P\text{,}\) representing how the transition matrix and its inverse transform from one set of axes to the other.
Figure 40.1.1. A transition matrix as change-of-variables. For each vector \(\uvec{x}\) on the \(xy\)-axes, \(\widetilde{\uvec{x}}\) represents the corresponding transformed vector \(\inv{P} \uvec{x}\) on the \(wz\)-axes.
One problem with this geometric analysis is that, while the geometric effect of multiplication by the diagonal matrix \(D\) is much simpler to analyze, the transition matrix \(P\) and its inverse might have complicated geometric effects in the conversion between standard and \(\basisfont{B}\)-coordinates. The axes that the columns of \(P\) represent might not even be perpendicular!
Diagram of a change of variables via transition matrix that skews geometry from one set of two-dimensional axes to another set.
A diagram illustrating how a \(2 \times 2\) transition matrix \(P\) can be used to transform yet skew geometry from one set of two-dimensional axes to another.
On the left appears a set of \(xy\)-axes along with their representative standard basis vectors \(\uvec{e}_1, \uvec{e}_2\text{.}\) Another set of linearly independent vectors \(\uvec{p}_1, \uvec{p}_2\) is drawn with their initial points at the origin, representing the columns of \(P\text{.}\) Underneath and in parallel with \(\uvec{p}_1\) is drawn a slant axis using a dashed line, labelled as the \(w\)-axis, and similarly, underneath and in parallel with \(\uvec{p}_2\) is drawn another slant axis using a dashed line, labelled as the \(z\)-axis. In this example, these slant axes are not perpendicular to each other. An arbitrary vector \(\uvec{v}\) is also pictured, with its initial point at the origin, and a parallelogram is drawn with two sides along the \(wz\)-axes so that \(\uvec{v}\) is a diagonal within the parallelogram, indicating the representation of \(\uvec{v}\) in \(wz\)-coordinates.
On the right appears another set of perpendicular axes, but now the horizontal axis is labelled as the \(w\)-axis and the vertical axis is labelled as the \(z\)-axis. A unit vector with initial point at the origin points along each of these axes, labelled as representing vectors \({\widetilde{\uvec{p}}}_1 = \inv{P} \uvec{p}_1\) and \({\widetilde{\uvec{p}}}_2 = \inv{P} \uvec{p}_2\text{,}\) respectively. The transformed vectors \({\widetilde{\uvec{e}}}_1 = \inv{P} \uvec{e}_1\) and \({\widetilde{\uvec{e}}}_2 = \inv{P} \uvec{e}_2\) are also drawn with their initial points at the origin. Slant axes using dashed lines are drawn “through” and in parallel with these vectors, with the axis parallel to \({\widetilde{\uvec{e}}}_1\) labelled as the \(x\)-axis and the axis parallel to \({\widetilde{\uvec{e}}}_2\) labelled as the \(y\)-axis. Again, in this example, these slant axes are not perpendicular to each other. The transformed vector \(\widetilde{\uvec{v}} = \inv{P} \uvec{v}\) is also drawn with its initial point at the origin, and a rectangle is drawn with two sides along the \(wz\)-axes so that \(\widetilde{\uvec{v}}\) is a diagonal within the rectangle. This rectangle represents the transformed result of the parallelogram from the set of axes on the left of the diagram.
Finally, curved arrows appear between the two sets of axes on each side of the diagram, one pointing left-to-right and labelled \(\inv{P}\text{,}\) and one pointing right-to-left and labelled \(P\text{,}\) representing how the transition matrix and its inverse transform from one set of axes to the other.
Figure 40.1.2. A transition matrix as geometry-skewing change-of-variables. For each vector \(\uvec{x}\) on the \(xy\)-axes, \(\widetilde{\uvec{x}}\) represents the corresponding transformed vector \(\inv{P} \uvec{x}\) on the \(wz\)-axes.
But if \(P\) is a product-preserving matrix (i.e. orthogonal in the real case or unitary in the complex case), then \(P\) does not skew geometry in moving from standard coordinates to \(\basisfont{B}\)-coordinates (Proposition 39.5.4). And since the columns of \(P\) will form an orthonormal basis (Theorem 39.5.6), the system of axes these columns represent will be an orthogonal system, with each basis vector representing displacement by exactly one unit along its corresponding axis, just as the standard basis vectors do.
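To make the contrast between the two figures concrete, here is a small sketch that compares an orthogonal transition matrix with a skewing one. It checks whether the columns are orthonormal and whether the dot product of two vectors survives conversion to \(\basisfont{B}\)-coordinates by \(\inv{P}\text{.}\) The matrices `Q` and `S`, the test vectors, and all helper names are assumptions chosen for illustration. Exact fractions are used so that equality tests are reliable.

```python
from fractions import Fraction


def dot(u, v):
    """Standard (real) dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inverse_2x2(M):
    """Exact inverse of a 2x2 matrix, assumed to be invertible."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def has_orthonormal_columns(M):
    """True when the columns of M are mutually orthogonal unit vectors."""
    cols = transpose(M)
    n = len(cols)
    return all(
        dot(cols[i], cols[j]) == (1 if i == j else 0)
        for i in range(n)
        for j in range(n)
    )


# Illustrative matrices (assumed): Q is orthogonal, S is a shear.
Q = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]
S = [[1, 1],
     [0, 1]]

u, v = [2, 1], [1, 3]

for name, P in (("Q", Q), ("S", S)):
    P_inv = inverse_2x2(P)
    before = dot(u, v)
    after = dot(matvec(P_inv, u), matvec(P_inv, v))
    print(name, has_orthonormal_columns(P), before == after)
```

For the orthogonal matrix the inverse coincides with the transpose, so no separate inversion would be needed in practice; the generic inverse is used here only so that both matrices go through the same code path.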
With the advantages of an orthogonal/unitary transition matrix in mind, in the remainder of this chapter we attempt to answer the following.

Question 40.1.3.

When is a real square matrix orthogonally diagonalizable? When is a complex square matrix unitarily diagonalizable?