
Section 40.6 Theory

Subsection 40.6.1 Eigenvalues and eigenvectors

Since diagonalizability is dependent on eigenvalues and eigenvectors, we begin by relating those concepts to the inner product. Recall that for orthogonal/unitary diagonalization, we are most concerned with having eigenvectors that are orthogonal.

Theorem 40.6.1.

Suppose \(H\) is a self-adjoint matrix: Hermitian in the complex case, symmetric in the real case. Then:

  1. every eigenvalue of \(H\) is a real number; and
  2. eigenvectors of \(H\) from different eigenspaces are orthogonal.

The proofs of these statements in the Hermitian complex case are outlined in Discovery 40.1 and Discovery 40.5. For the symmetric real case, the results follow from the fact that every such matrix can also be considered as a complex Hermitian matrix (as in Discovery 40.2), and the fact that the complex dot product applied to real vectors gives the same result as the real dot product.
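
As a numerical illustration (not part of the proof), the following NumPy sketch checks both statements on an arbitrary Hermitian example; the matrix \(H\) below is our own choice.

    import numpy as np

    # An arbitrary 2x2 Hermitian example: H equals its conjugate transpose.
    H = np.array([[2.0, 1 - 1j],
                  [1 + 1j, 3.0]])
    assert np.allclose(H, H.conj().T)

    # np.linalg.eigh is designed for Hermitian matrices.
    eigvals, eigvecs = np.linalg.eigh(H)

    # Statement 1: the eigenvalues are real (here, 1 and 4).
    print(eigvals)

    # Statement 2: the eigenvalues are distinct, so eigenvectors from the two
    # eigenspaces must be orthogonal (np.vdot conjugates its first argument).
    print(np.vdot(eigvecs[:, 0], eigvecs[:, 1]))  # approximately 0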

Remark 40.6.2.

Another way to read Statement 2 of the theorem is to say that if \(\lambda_1 \neq \lambda_2\) are distinct eigenvalues of a self-adjoint matrix \(H\text{,}\) then \(E_{\lambda_1}(H)\) is a subspace of \(\orthogcmp{E_{\lambda_2}(H)}\) and \(E_{\lambda_2}(H)\) is a subspace of \(\orthogcmp{E_{\lambda_1}(H)}\text{.}\)

The proof of the statement for unitary complex matrices is outlined in Discovery 40.8. For the orthogonal real case, the results follow from the fact that every such matrix can also be considered as a complex unitary matrix (as in Discovery 40.9), and the fact that the complex modulus applied to a real number gives the same result as the real absolute value.

Remark 40.6.5.

Note that the statement for orthogonal matrices does not say that both of \(\pm 1\) must be eigenvalues. For example, the identity matrix is orthogonal with only \(\lambda = 1\) as an eigenvalue, and the negative of the identity matrix is also orthogonal with only \(\lambda = -1\) as an eigenvalue.

And an orthogonal real matrix may have no real eigenvalues at all. For example, the rotation matrix

\begin{equation*} \left[\begin{array}{cr} \cos \theta \amp -\sin \theta \\ \sin \theta \amp \cos \theta \end{array}\right] \end{equation*}

has no real eigenvalues. But the complex eigenvalues of an orthogonal real matrix must have complex modulus \(1\text{,}\) just as for the eigenvalues of a unitary complex matrix.
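
A quick numerical sketch of this last point, with an arbitrary angle of our choosing:

    import numpy as np

    theta = 1.0  # an arbitrary angle for this sketch
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    eigvals = np.linalg.eigvals(Q)
    print(eigvals)          # cos(theta) +/- i sin(theta): not real
    print(np.abs(eigvals))  # but both moduli are (approximately) 1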

To investigate eigenvalues and eigenvectors of normal matrices, we'll first relate the property of normality back to the inner product.

Lemma 40.6.6.

For a normal matrix \(N\text{,}\)

\begin{equation*} {\inprod{\adjoint{N} \uvec{u}}{\adjoint{N} \uvec{v}}}_{\C} = {\inprod{N \uvec{u}}{N \uvec{v}}}_{\C} \end{equation*}

for all column vectors \(\uvec{u},\uvec{v}\) in \(\C^n\text{.}\)

Starting on the right in the proposed equality, we have

\begin{align*} {\inprod{N \uvec{u}}{N \uvec{v}}}_{\C} \amp = {\inprod{\adjoint{N} N \uvec{u}}{\uvec{v}}}_{\C} \amp \amp\text{(i)}\\ \amp = {\inprod{N \adjoint{N} \uvec{u}}{\uvec{v}}}_{\C} \amp \amp\text{(ii)}\\ \amp = {\inprod{\adjoint{N} \uvec{u}}{\adjoint{N} \uvec{v}}}_{\C} \amp \amp\text{(iii)}\text{,} \end{align*}

with justifications

  1. definition of adjoint relative to the inner product;
  2. definition of normal matrix; and
  3. definition of adjoint relative to the inner product.

Corollary 40.6.7.

For a normal matrix \(N\text{,}\) \(\norm{\adjoint{N} \uvec{u}} = \norm{N \uvec{u}}\) for every column vector \(\uvec{u}\) in \(\C^n\text{.}\)

This follows immediately from using \(\uvec{u} = \uvec{v}\) in Lemma 40.6.6, and we leave verification of the details to you, the reader.
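
Both the lemma and the corollary are easy to spot-check numerically. In the sketch below, the example matrix \(N\) (our own choice) is a scaled rotation: it is normal but neither Hermitian nor unitary.

    import numpy as np

    # A scaled rotation: adjoint(N) N = N adjoint(N) = 2I, so N is normal.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    Nh = N.conj().T
    assert np.allclose(Nh @ N, N @ Nh)

    rng = np.random.default_rng(0)
    u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    v = rng.standard_normal(2) + 1j * rng.standard_normal(2)

    # Lemma 40.6.6: <adjoint(N) u, adjoint(N) v> = <N u, N v>.
    # (np.vdot conjugates its first argument; the identity holds in
    # either convention for the complex inner product.)
    print(np.vdot(Nh @ u, Nh @ v), np.vdot(N @ u, N @ v))

    # Corollary 40.6.7: the norms agree as well.
    print(np.linalg.norm(Nh @ u), np.linalg.norm(N @ u))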

Next, we'll relate normality to the eigenvalue coefficient matrix pattern \(\lambda I - N\text{.}\) Note that the following lemma is true for all scalars \(\lambda\text{,}\) not just eigenvalues of the normal matrix \(N\text{.}\)

Lemma 40.6.8.

If \(N\) is a normal matrix, then so is \(\lambda I - N\) for every scalar \(\lambda\text{.}\)

Assuming \(\adjoint{N} N = N \adjoint{N}\) is true, use the algebra rules of the adjoint to verify that

\begin{equation*} \adjoint{(\lambda I - N)} (\lambda I - N) = (\lambda I - N) \adjoint{(\lambda I - N)} \end{equation*}

is also true.
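
One way to carry out the verification: using the rules \(\adjoint{(A - B)} = \adjoint{A} - \adjoint{B}\) and \(\adjoint{(\lambda I)} = \cconj{\lambda} I\text{,}\) we have \(\adjoint{(\lambda I - N)} = \cconj{\lambda} I - \adjoint{N}\text{,}\) and expanding both products gives

\begin{align*} \adjoint{(\lambda I - N)} (\lambda I - N) \amp = (\cconj{\lambda} I - \adjoint{N}) (\lambda I - N) = \cconj{\lambda} \lambda I - \cconj{\lambda} N - \lambda \adjoint{N} + \adjoint{N} N \text{,}\\ (\lambda I - N) \adjoint{(\lambda I - N)} \amp = (\lambda I - N) (\cconj{\lambda} I - \adjoint{N}) = \lambda \cconj{\lambda} I - \lambda \adjoint{N} - \cconj{\lambda} N + N \adjoint{N} \text{.} \end{align*}

Since \(\adjoint{N} N = N \adjoint{N}\) by assumption, the two expansions agree.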

Finally, we can use what we've established about normal matrices so far to investigate properties of their eigenvalues and eigenvectors.

Theorem 40.6.9.

Suppose \(N\) is a normal matrix.

  1. If \(\lambda\) is an eigenvalue of \(N\text{,}\) then every eigenvector of \(N\) corresponding to \(\lambda\) is also an eigenvector of \(\adjoint{N}\) corresponding to the eigenvalue \(\cconj{\lambda}\text{,}\) and in fact \(E_{\cconj{\lambda}}(\adjoint{N}) = E_\lambda(N)\text{.}\)
  2. Eigenvectors of \(N\) from different eigenspaces are orthogonal.

We outline proofs of both statements.

  1. Suppose \(\lambda\) is an eigenvalue of normal matrix \(N\) and \(\uvec{x}\) is a corresponding eigenvector. Then,

    \begin{equation*} (\lambda I - N) \uvec{x} = \zerovec \text{,} \end{equation*}

    so that

    \begin{equation*} \norm{(\lambda I - N) \uvec{x}} = 0 \text{.} \end{equation*}

    But since \(N\) is normal, so is \(\lambda I - N\) (Lemma 40.6.8), and so

    \begin{equation*} \norm{\adjoint{(\lambda I - N)} \uvec{x}} = \norm{(\lambda I - N) \uvec{x}} = 0 \end{equation*}

    (Corollary 40.6.7). But the only vector with a norm of zero is the zero vector, so we have

    \begin{equation*} \adjoint{(\lambda I - N)} \uvec{x} = \zerovec \text{.} \end{equation*}

    Using the algebra rules of the adjoint, we can instead write

    \begin{equation*} (\cconj{\lambda} I - \adjoint{N}) \uvec{x} = \zerovec \text{,} \end{equation*}

    so that \(\uvec{x}\) is also an eigenvector of \(\adjoint{N}\text{,}\) but relative to the eigenvalue \(\cconj{\lambda}\text{.}\)

    So far we have established that every vector in \(E_\lambda(N)\) is also in \(E_{\cconj{\lambda}}(\adjoint{N})\text{.}\) But since \(\adjoint{(\adjoint{N})} = N\) and \(\lcconj{\cconj{\lambda}} = \lambda\text{,}\) we can symmetrically say that every vector in \(E_{\cconj{\lambda}}(\adjoint{N})\) is also in \(E_\lambda(N)\text{.}\) Putting both together establishes that

    \begin{equation*} E_{\cconj{\lambda}}(\adjoint{N}) = E_\lambda(N) \text{.} \end{equation*}
  2. Suppose \(\uvec{x}_1,\uvec{x}_2\) are eigenvectors of \(N\) from different eigenspaces \(E_{\lambda_1}(N),E_{\lambda_2}(N)\text{,}\) \(\lambda_1 \neq \lambda_2\text{.}\) Similar to the proof of Statement 2 of Theorem 40.6.1, compare the two sides of the equality

    \begin{equation*} \inprod{N \uvec{x}_1}{\uvec{x}_2} = \inprod{\uvec{x}_1}{\adjoint{N} \uvec{x}_2} \end{equation*}

    (applying Statement 1 on the right) to eventually conclude that \(\inprod{\uvec{x}_1}{\uvec{x}_2} = 0\text{.}\)
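
Here is a brief numerical illustration of both statements, again using the scaled rotation as an arbitrary example of a normal matrix with distinct eigenvalues:

    import numpy as np

    # Normal, but not Hermitian: adjoint(N) N = N adjoint(N) = 2I.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])
    Nh = N.conj().T

    eigvals, eigvecs = np.linalg.eig(N)  # eigenvalues 1 + 1j and 1 - 1j
    lam1, lam2 = eigvals
    x1, x2 = eigvecs[:, 0], eigvecs[:, 1]

    # Statement 1: each eigenvector of N is an eigenvector of adjoint(N),
    # relative to the conjugate eigenvalue.
    print(np.allclose(Nh @ x1, np.conj(lam1) * x1))  # True
    print(np.allclose(Nh @ x2, np.conj(lam2) * x2))  # True

    # Statement 2: eigenvectors from the two eigenspaces are orthogonal.
    print(np.vdot(x1, x2))  # approximately 0
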
Remark 40.6.10.

As remarked for self-adjoint matrices, Statement 2 of the theorem says that if \(\lambda_1 \neq \lambda_2\) are distinct eigenvalues of a normal matrix \(N\text{,}\) then \(E_{\lambda_1}(N)\) is a subspace of \(\orthogcmp{E_{\lambda_2}(N)}\) and \(E_{\lambda_2}(N)\) is a subspace of \(\orthogcmp{E_{\lambda_1}(N)}\text{.}\)

Subsection 40.6.2 Characterizations of orthogonal/unitary diagonalization

We start with the basic requirement for diagonalizing a matrix with an orthogonal or unitary transition matrix.

Theorem 40.6.11.

A square matrix is orthogonally diagonalizable (in the real case) or unitarily diagonalizable (in the complex case) if and only if there exists an orthonormal basis of \(\R^n\) or \(\C^n\text{,}\) as appropriate, consisting of eigenvectors of the matrix.

We know from Theorem 25.6.3 (and its complex version) that to diagonalize a matrix \(A\text{,}\) the columns of the transition matrix must be a basis of \(\R^n\) (in the real case) or \(\C^n\) (in the complex case) consisting of eigenvectors of \(A\text{.}\) If we also want the transition matrix to be product-preserving, those eigenvector columns must be an orthonormal set (Statement 4 of Theorem 39.5.6).
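
In computational terms (a sketch, using NumPy's conventions): for a symmetric real example, np.linalg.eigh returns exactly such a transition matrix, with an orthonormal basis of eigenvectors as its columns.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])  # an arbitrary symmetric example

    eigvals, P = np.linalg.eigh(A)  # columns of P: orthonormal eigenvectors
    print(np.allclose(P.T @ P, np.eye(2)))  # True: P is orthogonal
    print(np.round(P.T @ A @ P, 10))        # diagonal, with eigvals as entries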

And now we identify specific classes of orthogonally/unitarily diagonalizable matrices.

Proposition 40.6.12.

Every Hermitian matrix can be diagonalized by a unitary matrix.

Suppose \(H\) is a Hermitian matrix. By The Fundamental Theorem of Algebra (Complex Version), the matrix \(H\) has at least one eigenvalue, and so has a nonzero eigenvector, say \(\uvec{x}\text{.}\) By normalizing, we may assume that \(\uvec{x}\) is a unit vector.

Let \(W = \Span \{ \uvec{x} \}\text{.}\) Since \(\uvec{x}\) is an eigenvector, \(W\) is an \(H\)-invariant subspace of \(\C^n\text{.}\) Let \(W^\perp\) be the orthogonal complement of \(W\) inside \(\C^n\text{.}\) Since \(H\) is self-adjoint, Proposition 39.5.3 says that \(W^\perp\) is also \(H\)-invariant, and so the pair of subspaces \(W, W^\perp\) satisfies the properties required for the block-diagonalization procedure (Corollary 37.5.19).

If

\begin{equation*} \{ \uvec{y}_1, \dotsc, \uvec{y}_{n-1} \} \end{equation*}

is an orthonormal basis for \(W^\perp\text{,}\) then

\begin{equation*} \basisfont{B} = \{ \uvec{x}, \uvec{y}_1, \dotsc, \uvec{y}_{n-1} \} \end{equation*}

is an orthonormal basis for \(\C^n\text{,}\) and so the matrix \(U_1\) formed by taking the vectors of \(\basisfont{B}\) as the columns in \(U_1\) is unitary (Statement 4 of Theorem 39.5.6). The matrix \(U_1\) will put \(H\) into block-diagonal form, and because the basis of \(W\) used in the block-diagonalization procedure consists of a single eigenvector of \(H\text{,}\) the first block of \(\adjoint{U}_1 H U_1\) (corresponding to \(W\)) is a \(1 \times 1\) block with the corresponding eigenvalue of \(H\) as its entry. Write

\begin{equation*} \inv{U}_1 H U_1 = \begin{bmatrix} \lambda \amp \\ \amp H' \end{bmatrix} \text{,} \end{equation*}

where \(\lambda\) is the eigenvalue of \(H\) corresponding to \(\uvec{x}\text{,}\) and \(H'\) is the \((n-1) \times (n-1)\) block corresponding to \(W^\perp\text{.}\) Since \(H\) is Hermitian, so is \(\adjoint{U}_1 H U_1\text{,}\) and therefore so is \(H'\text{.}\)

By mathematical induction, we may assume that the smaller Hermitian matrix \(H'\) is unitarily diagonalizable by some \((n-1) \times (n-1)\) unitary matrix \(U'\text{.}\) Therefore, if we take

\begin{align*} U_2 \amp = \begin{bmatrix} 1 \\ \amp U' \end{bmatrix} \text{,} \amp U \amp = U_1 U_2 \text{,} \end{align*}

then both \(U_2\) and \(U\) are also unitary, and \(\inv{U} H U = \adjoint{U} H U\) will be diagonal.
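
The induction in this proof translates directly into a recursive procedure. The following NumPy sketch (the function name unitary_diagonalize is our own, and this is a teaching illustration rather than a numerically robust algorithm; in practice one would simply call np.linalg.eigh) mirrors the proof step by step: find one unit eigenvector, extend it to an orthonormal basis via a QR factorization, block-diagonalize, and recurse on the smaller Hermitian block.

    import numpy as np

    def unitary_diagonalize(H):
        # Returns (U, D) with U unitary and D = adjoint(U) H U diagonal,
        # following the inductive proof for a Hermitian matrix H.
        H = np.asarray(H, dtype=complex)
        n = H.shape[0]
        if n == 1:
            return np.eye(1, dtype=complex), H.copy()

        # One eigenvalue/unit-eigenvector pair exists (Fundamental Theorem
        # of Algebra); take any one.
        _, eigvecs = np.linalg.eig(H)
        x = eigvecs[:, 0] / np.linalg.norm(eigvecs[:, 0])

        # Extend x to an orthonormal basis of C^n: the QR factorization of
        # [x | e_1 ... e_{n-1}] has a unitary Q whose first column is a
        # unit-scalar multiple of x (still an eigenvector of H).
        U1, _ = np.linalg.qr(np.column_stack([x, np.eye(n, n - 1)]))

        # Block-diagonalize: the first block is 1x1, and the remaining
        # (n-1) x (n-1) block H' is again Hermitian.
        B = U1.conj().T @ H @ U1
        Hp = B[1:, 1:]

        # Induction: diagonalize H' by a smaller unitary matrix U'.
        Up, _ = unitary_diagonalize(Hp)
        U2 = np.eye(n, dtype=complex)
        U2[1:, 1:] = Up

        U = U1 @ U2
        return U, U.conj().T @ H @ U

    # Quick check on an arbitrary 3x3 Hermitian example.
    H = np.array([[2.0, 1 - 1j, 0.0],
                  [1 + 1j, 3.0, 1j],
                  [0.0, -1j, 1.0]])
    U, D = unitary_diagonalize(H)
    print(np.allclose(U.conj().T @ U, np.eye(3)))  # True: U is unitary
    print(np.round(D, 8))                          # diagonal with real entries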

As noted in Subsection 40.4.1, this is the end of the story in the real case.

Orthogonally diagonalizable \(\implies\) symmetric.

If real matrix \(A\) is orthogonally diagonalizable, then there exists orthogonal \(P\) so that

\begin{equation*} \utrans{P} A P = D \text{,} \end{equation*}

a diagonal matrix. Solving for \(A\) gives \(A = P D \utrans{P}\text{.}\) Since every diagonal matrix is symmetric, we have

\begin{equation*} \utrans{A} = \utrans{(P D \utrans{P})} = P \utrans{D} \utrans{P} = P D \utrans{P} = A\text{,} \end{equation*}

as desired.
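
The same computation can be watched numerically: build \(A = P D \utrans{P}\) from an arbitrary orthogonal \(P\) and diagonal \(D\) of our choosing, and observe that \(A\) is symmetric.

    import numpy as np

    rng = np.random.default_rng(2)
    P, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # a random orthogonal matrix
    D = np.diag([1.0, 2.0, 3.0])

    A = P @ D @ P.T             # orthogonally diagonalizable by construction
    print(np.allclose(A, A.T))  # True: A is symmetric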

Symmetric \(\implies\) orthogonally diagonalizable.

The proof is essentially identical to the proof of Proposition 40.6.12, replacing “Hermitian” with “symmetric,” “unitary” with “orthogonal,” and \(\C^n\) with \(\R^n\text{.}\) The one extra ingredient needed in the real case is Statement 1 of Theorem 40.6.1: the eigenvalues of a symmetric real matrix are real, so a real eigenvector exists at each step of the induction.

But in the complex case, the class of unitarily diagonalizable matrices is larger than just the class of Hermitian matrices.

Unitarily diagonalizable \(\implies\) normal.

We considered this statement in Discovery 40.7, but we will provide the full proof here. So suppose \(N\) is unitarily diagonalizable, and \(U\) is a unitary matrix so that \(\adjoint{U} N U = D\text{,}\) a diagonal matrix. Now, \(\adjoint{D}\) is also diagonal, and diagonal matrices commute. So we have

\begin{equation*} \adjoint{D} D = D \adjoint{D} \text{,} \end{equation*}

to begin.

Now check \(N\) against its adjoint:

\begin{align*} \adjoint{N} N \amp = \adjoint{(U D \adjoint{U})} (U D \adjoint{U})\\ \amp = U \adjoint{D} \cancelto{I}{\adjoint{U} U} D \adjoint{U}\\ \amp = U \adjoint{D} D \adjoint{U}\\ \amp = U D \adjoint{D} \adjoint{U}\\ \amp = U D \adjoint{U} U \adjoint{D} \adjoint{U}\\ \amp = N \adjoint{N}\text{,} \end{align*}

as desired.
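
Numerically, one can run this implication forward: build \(N = U D \adjoint{U}\) from an arbitrary unitary \(U\) and an arbitrary complex diagonal \(D\) (both our own choices here), and confirm that \(N\) is normal without being Hermitian.

    import numpy as np

    rng = np.random.default_rng(1)

    # A random unitary U (QR of a random complex matrix) and a complex
    # diagonal D give a unitarily diagonalizable N = U D adjoint(U).
    Z = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    U, _ = np.linalg.qr(Z)
    D = np.diag([1 + 2j, -1.0, 3j])
    N = U @ D @ U.conj().T

    print(np.allclose(N.conj().T @ N, N @ N.conj().T))  # True: N is normal
    print(np.allclose(N, N.conj().T))                   # False: not Hermitian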

Normal \(\implies\) unitarily diagonalizable.

Assume that \(N\) is a normal matrix. We wish to prove that \(N\) is then unitarily diagonalizable. The argument is essentially the same as in the proof of Proposition 40.6.12.

By The Fundamental Theorem of Algebra (Complex Version), the matrix \(N\) has at least one eigenvalue, and so has a nonzero eigenvector, say \(\uvec{x}\text{.}\) By normalizing, we may assume that \(\uvec{x}\) is a unit vector.

By Statement 1 of Theorem 40.6.9, \(\uvec{x}\) is also an eigenvector of \(\adjoint{N}\text{.}\) Therefore, the space \(W = \Span \{ \uvec{x} \}\) is both \(N\)-invariant and \(\adjoint{N}\)-invariant. The same can then also be said about the space \(W^\perp\) (Proposition 39.5.3).

From this point, the remainder of the proof is identical to the proof of Proposition 40.6.12, replacing all occurrences of the word “Hermitian” by the word “normal”, all occurrences of the matrix \(H\) by the matrix \(N\text{,}\) and, if you like, using the notation \(N'\) in place of \(H'\text{.}\)
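
When the normal matrix has distinct eigenvalues, the conclusion can be seen concretely without any recursion: by Theorem 40.6.9, the unit eigenvectors returned by np.linalg.eig are automatically orthogonal, so collecting them as columns already yields a unitary diagonalizing matrix. A sketch with the scaled-rotation example:

    import numpy as np

    # Normal, but neither Hermitian nor unitary.
    N = np.array([[1.0, -1.0],
                  [1.0,  1.0]])

    eigvals, V = np.linalg.eig(N)
    # Distinct eigenvalues (1 + 1j and 1 - 1j), so the unit eigenvector
    # columns of V are orthogonal and V is unitary.
    print(np.allclose(V.conj().T @ V, np.eye(2)))  # True
    print(np.round(V.conj().T @ N @ V, 10))        # diagonal with 1 +/- 1j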

Subsection 40.6.3 Special instances of normal matrices

Finally, we verify two special instances of normal matrices: Hermitian matrices and unitary matrices are both normal.

If \(H\) is Hermitian, then both of \(\adjoint{H} H\) and \(H \adjoint{H}\) are equal to \(H^2\text{,}\) hence are equal to each other.

If \(U\) is unitary, then both of \(\adjoint{U} U\) and \(U \adjoint{U}\) are equal to \(I\text{,}\) hence are equal to each other.
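
Both instances are easily confirmed numerically with arbitrary examples of each type:

    import numpy as np

    # A Hermitian example: adjoint(H) H and H adjoint(H) both equal H^2.
    H = np.array([[2.0, 1 - 1j],
                  [1 + 1j, 3.0]])
    print(np.allclose(H.conj().T @ H, H @ H.conj().T))  # True: H is normal

    # A unitary example (a rotation): adjoint(U) U and U adjoint(U) both
    # equal the identity.
    theta = 0.7
    U = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    print(np.allclose(U.conj().T @ U, U @ U.conj().T))  # True: U is normal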