
Section 40.1 Motivation

Recall that we can use a square matrix \(A\) to transform column vectors by multiplication. If we have a transition matrix \(P\) that puts \(A\) in diagonal form, we can view \(P\) as a change-of-basis matrix

\begin{equation*} P = \ucobmtrx{B}{S}\text{,} \end{equation*}

where \(\basisfont{S}\) is the standard basis (of either \(\R^n\) or \(\C^n\text{,}\) as appropriate) and \(\basisfont{B}\) is the basis formed by the columns of \(P\text{.}\) Consequently,

\begin{equation*} \inv{P} = \ucobmtrx{S}{B}\text{.} \end{equation*}

So if we have

\begin{equation*} A = P D \inv{P} \end{equation*}

for some diagonal matrix \(D\text{,}\) and we consider how \(A\) transforms column vectors by multiplication, we can instead think of \(\inv{P}\) as first converting standard coordinates to \(\basisfont{B}\)-coordinates, the diagonal matrix \(D\) transforming those converted \(\basisfont{B}\)-column vectors by simply scaling each coordinate by the corresponding diagonal entry, and then \(P\) converting the result back to standard coordinates.
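
To see the three steps on concrete numbers, here is a small worked illustration. The matrices and the vector below are chosen purely for this example and are not taken from Discovery 26.2. Take

\begin{equation*} A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}, \qquad P = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}, \qquad D = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}, \qquad \inv{P} = \frac{1}{3} \begin{bmatrix} 2 & 1 \\ 1 & -1 \end{bmatrix}\text{,} \end{equation*}

for which one can check by direct multiplication that \(A = P D \inv{P}\text{.}\) For the vector \(\uvec{x} = (3,0)\text{,}\) the three steps give

\begin{equation*} \inv{P} \uvec{x} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \qquad D \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 2 \end{bmatrix}, \qquad P \begin{bmatrix} 10 \\ 2 \end{bmatrix} = \begin{bmatrix} 12 \\ 6 \end{bmatrix}\text{,} \end{equation*}

which agrees with computing \(A \uvec{x} = (12,6)\) directly. The middle step is the only one that does any "transforming", and it merely scales the first \(\basisfont{B}\)-coordinate by \(5\) and the second by \(2\text{.}\)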

We explored this in detail for a real \(2 \times 2\) matrix in Discovery 26.2, where we used the columns of \(P\) to represent a new set of \(wz\)-axes (just as the standard basis vectors in \(\R^2\) represent the \(xy\)-axes).

Diagram of a change of variables via transition matrix.
Figure 40.1.1. A transition matrix as change-of-variables. For each vector \(\uvec{x}\) on the \(xy\)-axes, the corresponding vector \(\widetilde{\uvec{x}}\) on the \(wz\)-axes indicates the converted vector \(\inv{P} \uvec{x}\text{.}\)

One problem with this geometric analysis is that, while the geometric effect of multiplication by the diagonal matrix \(D\) is much simpler to analyze, the transition matrix \(P\) and its inverse might have complicated geometric effects in the conversion between standard and \(\basisfont{B}\)-coordinates. The axes that the columns of \(P\) represent might not even be perpendicular!

Diagram of a change of variables via transition matrix that skews geometry.
Figure 40.1.2. A transition matrix as geometry-skewing change-of-variables. For each vector \(\uvec{x}\) on the \(xy\)-axes, the corresponding vector \(\widetilde{\uvec{x}}\) on the \(wz\)-axes indicates the converted vector \(\inv{P} \uvec{x}\text{.}\)

But if \(P\) is a product-preserving matrix (i.e. orthogonal in the real case or unitary in the complex case), then \(P\) does not skew geometry in moving from standard coordinates to \(\basisfont{B}\)-coordinates (Proposition 39.5.4). And since the columns of \(P\) will form an orthonormal basis (Theorem 39.5.6), the system of axes these columns represent will be an orthogonal system, with each basis vector representing displacement by exactly one unit along its corresponding axis, just as the standard basis vectors do.
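
As a small numerical check of this claim (the matrix and vectors here are assumptions chosen only for illustration), consider the real orthogonal matrix

\begin{equation*} P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \qquad \inv{P} = P^{\mathrm{T}} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}\text{,} \end{equation*}

whose columns are obtained by rotating the standard basis vectors of \(\R^2\) by \(45^\circ\text{.}\) For \(\uvec{x} = (1,0)\) and \(\uvec{y} = (1,1)\) we have \(\uvec{x} \cdot \uvec{y} = 1\text{,}\) and the converted vectors are

\begin{equation*} \inv{P} \uvec{x} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad \inv{P} \uvec{y} = \frac{1}{\sqrt{2}} \begin{bmatrix} 2 \\ 0 \end{bmatrix}\text{,} \end{equation*}

so that

\begin{equation*} (\inv{P} \uvec{x}) \cdot (\inv{P} \uvec{y}) = \frac{1}{2} \bigl( (1)(2) + (-1)(0) \bigr) = 1 = \uvec{x} \cdot \uvec{y}\text{.} \end{equation*}

The lengths are also unchanged: \(\lVert \inv{P} \uvec{x} \rVert = 1 = \lVert \uvec{x} \rVert\) and \(\lVert \inv{P} \uvec{y} \rVert = \sqrt{2} = \lVert \uvec{y} \rVert\text{.}\) Since both the dot product and the lengths are preserved, so is the angle between the two vectors.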

With the advantages of an orthogonal/unitary transition matrix in mind, in the remainder of this chapter we attempt to answer the following.

Question 40.1.3.

When is a real square matrix orthogonally diagonalizable? When is a complex square matrix unitarily diagonalizable?