
Section 40.6 Theory

Subsection 40.6.1 Eigenvalues and eigenvectors

Since diagonalizability is dependent on eigenvalues and eigenvectors, we begin by relating those concepts to the inner product. Recall that for orthogonal/unitary diagonalization, we are most concerned with having eigenvectors that are orthogonal.

Theorem 40.6.1.

Suppose \(H\) is a self-adjoint matrix: Hermitian in the complex case, symmetric in the real case. Then:

  1. every eigenvalue of \(H\) is a real number; and
  2. eigenvectors of \(H\) from different eigenspaces are orthogonal.

The proofs of these statements in the Hermitian complex case are outlined in Discovery 40.1 and Discovery 40.5. For the symmetric real case, the results follow from the fact that every such matrix can also be considered as a complex Hermitian matrix (as in Discovery 40.2), and the fact that the complex dot product applied to real vectors gives the same result as the real dot product.
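
As a numerical illustration (not part of the proof), the following NumPy sketch checks both statements on an arbitrary Hermitian example; the matrix \(H\) below is our own choice.

    import numpy as np

    # An arbitrary 2x2 Hermitian example: H equals its conjugate transpose.
    H = np.array([[2.0, 1 - 1j],
                  [1 + 1j, 3.0]])
    assert np.allclose(H, H.conj().T)

    # np.linalg.eigh is designed for Hermitian matrices.
    eigvals, eigvecs = np.linalg.eigh(H)

    # Statement 1: the eigenvalues are real (here, 1 and 4).
    print(eigvals)

    # Statement 2: the eigenvalues are distinct, so eigenvectors from the two
    # eigenspaces must be orthogonal (np.vdot conjugates its first argument).
    print(np.vdot(eigvecs[:, 0], eigvecs[:, 1]))  # approximately 0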

Remark 40.6.2.

Another way to read Statement 2 of the theorem is to say that if \(\lambda_1 \neq \lambda_2\) are distinct eigenvalues of a self-adjoint matrix \(H\text{,}\) then \(E_{\lambda_1}(H)\) is a subspace of \(\orthogcmp{E_{\lambda_2}(H)}\) and \(E_{\lambda_2}(H)\) is a subspace of \(\orthogcmp{E_{\lambda_1}(H)}\text{.}\)

The proof of the statement for unitary complex matrices is outlined in Discovery 40.8. For the orthogonal real case, the results follow from the fact that every such matrix can also be considered as a complex unitary matrix (as in Discovery 40.9), and the fact that the complex modulus applied to a real number gives the same result as the real absolute value.

Remark 40.6.5.

Note that the statement for orthogonal matrices does not say that both of \(\pm 1\) must be eigenvalues. For example, the identity matrix is orthogonal with only \(\lambda = 1\) as an eigenvalue, and the negative of the identity matrix is also orthogonal with only \(\lambda = -1\) as an eigenvalue.

And an orthogonal real matrix may have no real eigenvalues at all. For example, the rotation matrix

\begin{equation*} \left[\begin{array}{cr} \cos \theta \amp -\sin \theta \\ \sin \theta \amp \cos \theta \end{array}\right] \end{equation*}

has no real eigenvalues. But the complex eigenvalues of an orthogonal real matrix must have complex modulus \(1\text{,}\) just as for the eigenvalues of a unitary complex matrix.
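
A quick numerical sketch of this last point, with an arbitrary angle of our choosing:

    import numpy as np

    theta = 1.0  # an arbitrary angle for this sketch
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    eigvals = np.linalg.eigvals(Q)
    print(eigvals)          # cos(theta) +/- i sin(theta): not real
    print(np.abs(eigvals))  # but both moduli are (approximately) 1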

To investigate eigenvalues and eigenvectors of normal matrices, we'll first relate the property of normality back to the inner product.

Lemma 40.6.6.

For a normal matrix \(N\text{,}\)

\begin{equation*} {\inprod{\adjoint{N} \uvec{u}}{\adjoint{N} \uvec{v}}}_{\C} = {\inprod{N \uvec{u}}{N \uvec{v}}}_{\C} \end{equation*}

for all column vectors \(\uvec{u},\uvec{v}\) in \(\C^n\text{.}\)

Starting on the right in the proposed equality, we have

\begin{align*} {\inprod{N \uvec{u}}{N \uvec{v}}}_{\C} \amp = {\inprod{\adjoint{N} N \uvec{u}}{\uvec{v}}}_{\C} \amp \amp\text{(i)}\\ \amp = {\inprod{N \adjoint{N} \uvec{u}}{\uvec{v}}}_{\C} \amp \amp\text{(ii)}\\ \amp = {\inprod{\adjoint{N} \uvec{u}}{\adjoint{N} \uvec{v}}}_{\C} \amp \amp\text{(iii)}\text{,} \end{align*}

with justifications

  1. definition of adjoint relative to the inner product;
  2. definition of normal matrix; and
  3. definition of adjoint relative to the inner product.

Corollary 40.6.7.

For a normal matrix \(N\text{,}\) \(\norm{\adjoint{N} \uvec{u}} = \norm{N \uvec{u}}\) for every column vector \(\uvec{u}\) in \(\C^n\text{.}\)

This follows immediately from using \(\uvec{u} = \uvec{v}\) in Lemma 40.6.6, and we leave verification of the details to you, the reader.
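
Both the lemma and the corollary are easy to spot-check numerically. In the sketch below, the example matrix \(N\) (our own choice) is a scaled rotation: it is normal but neither Hermitian nor unitary.

    import numpy as np

    # A scaled rotation: adjoint(N) N = N adjoint(N) = 2I, so N is normal.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    Nh = N.conj().T
    assert np.allclose(Nh @ N, N @ Nh)

    rng = np.random.default_rng(0)
    u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    v = rng.standard_normal(2) + 1j * rng.standard_normal(2)

    # Lemma 40.6.6: <adjoint(N) u, adjoint(N) v> = <N u, N v>.
    # (np.vdot conjugates its first argument; the identity holds in
    # either convention for the complex inner product.)
    print(np.vdot(Nh @ u, Nh @ v), np.vdot(N @ u, N @ v))

    # Corollary 40.6.7: the norms agree as well.
    print(np.linalg.norm(Nh @ u), np.linalg.norm(N @ u))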

Next, we'll relate normality to the eigenvalue coefficient matrix pattern \(\lambda I - N\text{.}\) Note that the following lemma is true for all scalars \(\lambda\text{,}\) not just eigenvalues of the normal matrix \(N\text{.}\)

Lemma 40.6.8.

If \(N\) is a normal matrix, then so is \(\lambda I - N\) for every scalar \(\lambda\text{.}\)

Assuming \(\adjoint{N} N = N \adjoint{N}\) is true, use the algebra rules of the adjoint to verify that

\begin{equation*} \adjoint{(\lambda I - N)} (\lambda I - N) = (\lambda I - N) \adjoint{(\lambda I - N)} \end{equation*}

is also true.
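
One way to carry out the verification: using the rules \(\adjoint{(A - B)} = \adjoint{A} - \adjoint{B}\) and \(\adjoint{(\lambda I)} = \cconj{\lambda} I\text{,}\) we have \(\adjoint{(\lambda I - N)} = \cconj{\lambda} I - \adjoint{N}\text{,}\) and expanding both products gives

\begin{align*} \adjoint{(\lambda I - N)} (\lambda I - N) \amp = (\cconj{\lambda} I - \adjoint{N}) (\lambda I - N) = \cconj{\lambda} \lambda I - \cconj{\lambda} N - \lambda \adjoint{N} + \adjoint{N} N \text{,}\\ (\lambda I - N) \adjoint{(\lambda I - N)} \amp = (\lambda I - N) (\cconj{\lambda} I - \adjoint{N}) = \lambda \cconj{\lambda} I - \lambda \adjoint{N} - \cconj{\lambda} N + N \adjoint{N} \text{.} \end{align*}

Since \(\adjoint{N} N = N \adjoint{N}\) by assumption, the two expansions agree.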

Finally, we can use what we've established about normal matrices so far to investigate properties of their eigenvalues and eigenvectors.

Theorem 40.6.9.

Suppose \(N\) is a normal matrix.

  1. If \(\lambda\) is an eigenvalue of \(N\text{,}\) then every eigenvector of \(N\) corresponding to \(\lambda\) is also an eigenvector of \(\adjoint{N}\) corresponding to the eigenvalue \(\cconj{\lambda}\text{,}\) and in fact \(E_{\cconj{\lambda}}(\adjoint{N}) = E_\lambda(N)\text{.}\)
  2. Eigenvectors of \(N\) from different eigenspaces are orthogonal.

We outline proofs of both statements.

  1. Suppose \(\lambda\) is an eigenvalue of normal matrix \(N\) and \(\uvec{x}\) is a corresponding eigenvector. Then,

    \begin{equation*} (\lambda I - N) \uvec{x} = \zerovec \text{,} \end{equation*}

    so that

    \begin{equation*} \norm{(\lambda I - N) \uvec{x}} = 0 \text{.} \end{equation*}

    But since \(N\) is normal, so is \(\lambda I - N\) (Lemma 40.6.8), and so

    \begin{equation*} \norm{\adjoint{(\lambda I - N)} \uvec{x}} = \norm{(\lambda I - N) \uvec{x}} = 0 \end{equation*}

    (Corollary 40.6.7). But the only vector with a norm of zero is the zero vector, so we have

    \begin{equation*} \adjoint{(\lambda I - N)} \uvec{x} = \zerovec \text{.} \end{equation*}

    Using the algebra rules of the adjoint, we can instead write

    \begin{equation*} (\cconj{\lambda} I - \adjoint{N}) \uvec{x} = \zerovec \text{,} \end{equation*}

    so that \(\uvec{x}\) is also an eigenvector of \(\adjoint{N}\text{,}\) but relative to the eigenvalue \(\cconj{\lambda}\text{.}\)

    So far we have established that every vector in \(E_\lambda(N)\) is also in \(E_{\cconj{\lambda}}(\adjoint{N})\text{.}\) But since \(\adjoint{(\adjoint{N})} = N\) and \(\lcconj{\cconj{\lambda}} = \lambda\text{,}\) we can symmetrically say that every vector in \(E_{\cconj{\lambda}}(\adjoint{N})\) is also in \(E_\lambda(N)\text{.}\) Putting both together establishes that

    \begin{equation*} E_{\cconj{\lambda}}(\adjoint{N}) = E_\lambda(N) \text{.} \end{equation*}
  2. Suppose \(\uvec{x}_1,\uvec{x}_2\) are eigenvectors of \(N\) from different eigenspaces \(E_{\lambda_1}(N),E_{\lambda_2}(N)\text{,}\) \(\lambda_1 \neq \lambda_2\text{.}\) Similar to the proof of Statement 2 of Theorem 40.6.1, compare the two sides of the equality

    \begin{equation*} \inprod{N \uvec{x}_1}{\uvec{x}_2} = \inprod{\uvec{x}_1}{\adjoint{N} \uvec{x}_2} \end{equation*}

    (applying Statement 1 on the right) to eventually conclude that \(\inprod{\uvec{x}_1}{\uvec{x}_2} = 0\text{.}\)
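
Here is a brief numerical illustration of both statements, again using the scaled rotation as an arbitrary example of a normal matrix with distinct eigenvalues:

    import numpy as np

    # Normal, but not Hermitian: adjoint(N) N = N adjoint(N) = 2I.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    Nh = N.conj().T

    eigvals, eigvecs = np.linalg.eig(N)  # eigenvalues 1 + 1j and 1 - 1j
    lam1, lam2 = eigvals
    x1, x2 = eigvecs[:, 0], eigvecs[:, 1]

    # Statement 1: each eigenvector of N is an eigenvector of adjoint(N),
    # relative to the conjugate eigenvalue.
    print(np.allclose(Nh @ x1, np.conj(lam1) * x1))  # True
    print(np.allclose(Nh @ x2, np.conj(lam2) * x2))  # True

    # Statement 2: eigenvectors from the two eigenspaces are orthogonal.
    print(np.vdot(x1, x2))  # approximately 0
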
Remark 40.6.10.

As remarked for self-adjoint matrices, Statement 2 of the theorem says that if \(\lambda_1 \neq \lambda_2\) are distinct eigenvalues of a normal matrix \(N\text{,}\) then \(E_{\lambda_1}(N)\) is a subspace of \(\orthogcmp{E_{\lambda_2}(N)}\) and \(E_{\lambda_2}(N)\) is a subspace of \(\orthogcmp{E_{\lambda_1}(N)}\text{.}\)

Subsection 40.6.2 Characterizations of orthogonal/unitary diagonalization

We start with the basic requirement for diagonalizing a matrix with an orthogonal or unitary transition matrix.

Theorem 40.6.11.

A square matrix is orthogonally diagonalizable (in the real case) or unitarily diagonalizable (in the complex case) if and only if there exists an orthonormal basis of \(\R^n\) or \(\C^n\text{,}\) as appropriate, consisting of eigenvectors of the matrix.

We know from Theorem 25.6.3 (and its complex version) that to diagonalize a matrix \(A\text{,}\) the columns of the transition matrix must be a basis of \(\R^n\) (in the real case) or \(\C^n\) (in the complex case) consisting of eigenvectors of \(A\text{.}\) If we also want the transition matrix to be product-preserving, those eigenvector columns must be an orthonormal set (Statement 4 of Theorem 39.5.6).
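
In computational terms (a sketch, using NumPy's conventions): for a symmetric real example, np.linalg.eigh returns exactly such a transition matrix, with an orthonormal basis of eigenvectors as its columns.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])  # an arbitrary symmetric example

    eigvals, P = np.linalg.eigh(A)  # columns of P: orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(2)))  # True: P is orthogonal
    print(np.round(P.T @ A @ P, 10))        # diagonal, with eigvals as entries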

And now we identify specific classes of orthogonally/unitarily diagonalizable matrices.

Proposition 40.6.12.

Every Hermitian matrix can be diagonalized by a unitary matrix.

Suppose \(H\) is a Hermitian matrix. By The Fundamental Theorem of Algebra (Complex Version), the matrix \(H\) has at least one eigenvalue, and so has a nonzero eigenvector, say \(\uvec{x}\text{.}\) By normalizing, we may assume that \(\uvec{x}\) is a unit vector.

Let \(W = \Span \{ \uvec{x} \}\text{.}\) Since \(\uvec{x}\) is an eigenvector, \(W\) is an \(H\)-invariant subspace of \(\C^n\text{.}\) Let \(W^\perp\) be the orthogonal complement of \(W\) inside \(\C^n\text{.}\) Since \(H\) is self-adjoint, Proposition 39.5.3 says that \(W^\perp\) is also \(H\)-invariant, and so the pair of subspaces \(W, W^\perp\) satisfies the properties required for the block-diagonalization procedure (Corollary 37.5.19).

If

\begin{equation*} \{ \uvec{y}_1, \dotsc, \uvec{y}_{n-1} \} \end{equation*}

is an orthonormal basis for \(W^\perp\text{,}\) then

\begin{equation*} \basisfont{B} = \{ \uvec{x}, \uvec{y}_1, \dotsc, \uvec{y}_{n-1} \} \end{equation*}

is an orthonormal basis for \(\C^n\text{,}\) and so the matrix \(U_1\) formed by taking the vectors of \(\basisfont{B}\) as the columns in \(U_1\) is unitary (Statement 4 of Theorem 39.5.6). The matrix \(U_1\) will put \(H\) into block-diagonal form, and because the basis of \(W\) used in the block-diagonalization procedure consists of a single eigenvector of \(H\text{,}\) the first block of \(\adjoint{U}_1 H U_1\) (corresponding to \(W\)) is a \(1 \times 1\) block with the corresponding eigenvalue of \(H\) as its entry. Write

\begin{equation*} \inv{U}_1 H U_1 = \begin{bmatrix} \lambda \amp \\ \amp H' \end{bmatrix} \text{,} \end{equation*}

where \(\lambda\) is the eigenvalue of \(H\) corresponding to \(\uvec{x}\text{,}\) and \(H'\) is the \((n-1) \times (n-1)\) block corresponding to \(W^\perp\text{.}\) Since \(H\) is Hermitian, so is \(\adjoint{U}_1 H U_1\text{,}\) and therefore so is \(H'\text{.}\)

By mathematical induction, we may assume that the smaller Hermitian matrix \(H'\) is unitarily diagonalizable by some \((n-1) \times (n-1)\) unitary matrix \(U'\text{.}\) Therefore, if we take

\begin{align*} U_2 \amp = \begin{bmatrix} 1 \\ \amp U' \end{bmatrix} \text{,} \amp U \amp = U_1 U_2 \text{,} \end{align*}

then both \(U_2\) and \(U\) are also unitary, and \(\inv{U} H U = \adjoint{U} H U\) will be diagonal.
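
The induction in this proof translates directly into a recursive procedure. The following NumPy sketch (the function name unitary_diagonalize is our own, and this is a teaching illustration rather than a numerically robust algorithm; in practice one would simply call np.linalg.eigh) mirrors the proof step by step: find one unit eigenvector, extend it to an orthonormal basis via a QR factorization, block-diagonalize, and recurse on the smaller Hermitian block.

    import numpy as np

    def unitary_diagonalize(H):
        # Returns (U, D) with U unitary and D = adjoint(U) H U diagonal,
        # following the inductive proof for a Hermitian matrix H.
        H = np.asarray(H, dtype=complex)
        n = H.shape[0]
        if n == 1:
            return np.eye(1, dtype=complex), H.copy()

        # One eigenvalue/unit-eigenvector pair exists (Fundamental Theorem
        # of Algebra); take any one.
        _, eigvecs = np.linalg.eig(H)
        x = eigvecs[:, 0] / np.linalg.norm(eigvecs[:, 0])

        # Extend x to an orthonormal basis of C^n: the QR factorization of
        # [x | e_1 ... e_{n-1}] has a unitary Q whose first column is a
        # unit-scalar multiple of x (still an eigenvector of H).
        U1, _ = np.linalg.qr(np.column_stack([x, np.eye(n, n - 1)]))

        # Block-diagonalize: the first block is 1x1, and the remaining
        # (n-1) x (n-1) block H' is again Hermitian.
        B = U1.conj().T @ H @ U1
        Hp = B[1:, 1:]

        # Induction: diagonalize H' by a smaller unitary matrix U'.
        Up, _ = unitary_diagonalize(Hp)
        U2 = np.eye(n, dtype=complex)
        U2[1:, 1:] = Up

        U = U1 @ U2
        return U, U.conj().T @ H @ U

    # Quick check on an arbitrary 3x3 Hermitian example.
    H = np.array([[2.0, 1 - 1j, 0.0],
                  [1 + 1j, 3.0, 1j],
                  [0.0, -1j, 1.0]])
    U, D = unitary_diagonalize(H)
    print(np.allclose(U.conj().T @ U, np.eye(3)))  # True: U is unitary
    print(np.round(D, 8))                          # diagonal with real entries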

As noted in Subsection 40.4.1, this is the end of the story in the real case.

Orthogonally diagonalizable \(\implies\) symmetric.

If real matrix \(A\) is orthogonally diagonalizable, then there exists orthogonal \(P\) so that

\begin{equation*} \utrans{P} A P = D \text{,} \end{equation*}

a diagonal matrix. Solving for \(A\) gives \(A = P D \utrans{P}\text{.}\) Since every diagonal matrix is symmetric, we have

\begin{equation*} \utrans{A} = \utrans{(P D \utrans{P})} = P \utrans{D} \utrans{P} = P D \utrans{P} = A\text{,} \end{equation*}

as desired.
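
The same computation can be watched numerically: build \(A = P D \utrans{P}\) from an arbitrary orthogonal \(P\) and diagonal \(D\) of our choosing, and observe that \(A\) is symmetric.

    import numpy as np

    rng = np.random.default_rng(2)
    P, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix
    D = np.diag([1.0, 2.0, 3.0])

    A = P @ D @ P.T             # orthogonally diagonalizable by construction
    print(np.allclose(A, A.T))  # True: A is symmetric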

Symmetric \(\implies\) orthogonally diagonalizable.

The proof is essentially identical to the proof of Proposition 40.6.12, replacing “Hermitian” with “symmetric,” “unitary” with “orthogonal,” and \(\C^n\) with \(\R^n\text{.}\) The one extra ingredient needed in the real case is Statement 1 of Theorem 40.6.1: the eigenvalues of a symmetric real matrix are real, so a real eigenvector exists at each step of the induction.

But in the complex case, the class of unitarily diagonalizable matrices is larger than just the class of Hermitian matrices.

Unitarily diagonalizable \(\implies\) normal.

We considered this statement in Discovery 40.7, but we will provide the full proof here. So suppose \(N\) is unitarily diagonalizable, and \(U\) is a unitary matrix so that \(\adjoint{U} N U = D\text{,}\) a diagonal matrix. Now, \(\adjoint{D}\) is also diagonal, and diagonal matrices commute. So we have

\begin{equation*} \adjoint{D} D = D \adjoint{D} \text{,} \end{equation*}

to begin.

Now check \(N\) against its adjoint:

\begin{align*} \adjoint{N} N \amp = \adjoint{(U D \adjoint{U})} (U D \adjoint{U})\\ \amp = U \adjoint{D} \cancelto{I}{\adjoint{U} U} D \adjoint{U}\\ \amp = U \adjoint{D} D \adjoint{U}\\ \amp = U D \adjoint{D} \adjoint{U}\\ \amp = U D \adjoint{U} U \adjoint{D} \adjoint{U}\\ \amp = N \adjoint{N}\text{,} \end{align*}

as desired.
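
Numerically, one can run this implication forward: build \(N = U D \adjoint{U}\) from an arbitrary unitary \(U\) and an arbitrary complex diagonal \(D\) (both our own choices here), and confirm that \(N\) is normal without being Hermitian.

    import numpy as np

    rng = np.random.default_rng(1)

    # A random unitary U (QR of a random complex matrix) and a complex
    # diagonal D give a unitarily diagonalizable N = U D adjoint(U).
    Z = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    U, _ = np.linalg.qr(Z)
    D = np.diag([1 + 2j, -1.0, 3j])
    N = U @ D @ U.conj().T

    print(np.allclose(N.conj().T @ N, N @ N.conj().T))  # True: N is normal
    print(np.allclose(N, N.conj().T))                   # False: not Hermitian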

Normal \(\implies\) unitarily diagonalizable.

Assume that \(N\) is a normal matrix. We wish to prove that \(N\) is then unitarily diagonalizable. The argument is essentially the same as in the proof of Proposition 40.6.12.

By The Fundamental Theorem of Algebra (Complex Version), the matrix \(N\) has at least one eigenvalue, and so has a nonzero eigenvector, say \(\uvec{x}\text{.}\) By normalizing, we may assume that \(\uvec{x}\) is a unit vector.

By Statement 1 of Theorem 40.6.9, \(\uvec{x}\) is also an eigenvector of \(\adjoint{N}\text{.}\) Therefore, the space \(W = \Span \{ \uvec{x} \}\) is both \(N\)-invariant and \(\adjoint{N}\)-invariant. The same can then also be said about the space \(W^\perp\) (Proposition 39.5.3).

From this point, the remainder of the proof is identical to the proof of Proposition 40.6.12, replacing all occurrences of the word “Hermitian” by the word “normal”, all occurrences of the matrix \(H\) by the matrix \(N\text{,}\) and, if you like, using the notation \(N'\) in place of \(H'\text{.}\)
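
When the normal matrix has distinct eigenvalues, the conclusion can be seen concretely without any recursion: by Theorem 40.6.9, the unit eigenvectors returned by np.linalg.eig are automatically orthogonal, so collecting them as columns already yields a unitary diagonalizing matrix. A sketch with the scaled-rotation example:

    import numpy as np

    # Normal, but neither Hermitian nor unitary.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])

    eigvals, V = np.linalg.eig(N)
    # Distinct eigenvalues (1 + 1j and 1 - 1j), so the unit eigenvector
    # columns of V are orthogonal and V is unitary.
    print(np.allclose(V.conj().T @ V, np.eye(2)))  # True
    print(np.round(V.conj().T @ N @ V, 10))        # diagonal with 1 +/- 1j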

Subsection 40.6.3 Special instances of normal matrices

Finally, we verify two special instances of normal matrices: Hermitian matrices and unitary matrices are both normal.

If \(H\) is Hermitian, then both of \(\adjoint{H} H\) and \(H \adjoint{H}\) are equal to \(H^2\text{,}\) hence are equal to each other.

If \(U\) is unitary, then both of \(\adjoint{U} U\) and \(U \adjoint{U}\) are equal to \(I\text{,}\) hence are equal to each other.
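
Both instances are easily confirmed numerically with arbitrary examples of each type:

    import numpy as np

    # A Hermitian example: adjoint(H) H and H adjoint(H) both equal H^2.
    H = np.array([[2.0, 1 - 1j],
                  [1 + 1j, 3.0]])
    print(np.allclose(H.conj().T @ H, H @ H.conj().T))  # True: H is normal

    # A unitary example (a rotation): adjoint(U) U and U adjoint(U) both
    # equal the identity.
    theta = 0.7
    U = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.allclose(U.conj().T @ U, U @ U.conj().T))  # True: U is normal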