Section 39.3 Concepts
Subsection 39.3.1 Adjoint matrices
Let \(V\) represent either real or complex \(n\)-dimensional space, and let \(\inprod{\blank}{\blank}\) represent either the real or complex dot product accordingly.
Thinking geometrically, for an \(n \times n\) matrix \(A\) and column vectors \(\uvec{u},\uvec{v}\) in \(V\text{,}\) the expression
\begin{equation*}
\inprod{\uvec{u}}{A \uvec{v}}
\end{equation*}
could be considered to be a pairing not of two vectors in \(V\text{,}\) but rather a pairing of a vector in \(V\) with a vector in a “transformed” version of \(V\text{,}\) with the transformation achieved through multiplication by \(A\text{.}\)
Question 39.3.1.
How would we need to transform \(V\) in order to flip this around, so that it is the first vector in the pairing that is from a “transformed” version of \(V\) and the second vector is left un-transformed, but so that the result is always the same as when it was the second vector that was transformed?
That is, for what \(n \times n\) matrix \(B\) would we have
\begin{equation*}
\inprod{B \uvec{u}}{\uvec{v}} = \inprod{\uvec{u}}{A \uvec{v}}
\end{equation*}
for every pair of column vectors \(\uvec{u},\uvec{v}\) in \(V\text{?}\)
In Task a and Task d of Discovery 39.1, we found that the answer to this question is that \(B\) should be equal to the transpose \(\utrans{A}\) in the real context, and should be equal to the conjugate-transpose \(\adjoint{A}\) in the complex context. Since the “conjugate” part of “conjugate-transpose” has no effect in the case that \(A\) is a real matrix, we will write \(B = \adjoint{A}\) and refer to \(\adjoint{A}\) as the adjoint of \(A\) in both contexts.
This means that for every pair of column vectors \(\uvec{u},\uvec{v}\text{,}\) it is always true that
\begin{equation*}
\inprod{\adjoint{A} \uvec{u}}{\uvec{v}} = \inprod{\uvec{u}}{A \uvec{v}}\text{.}
\end{equation*}
And since both the conjugate and transpose operations are self-inverse, we have \(\adjoint{(\adjoint{A})} = A\) in both the real and complex contexts. To state this property relative to the inner product, for every pair of column vectors \(\uvec{u},\uvec{v}\) it is also always true that
\begin{equation*}
\inprod{A \uvec{u}}{\uvec{v}} = \inprod{\uvec{u}}{\adjoint{A} \uvec{v}}\text{.}
\end{equation*}
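As a quick illustration (the specific matrix here is an ad hoc choice, not one from the discovery activities), the adjoint of a complex matrix is computed by conjugating every entry and then transposing:
\begin{equation*}
A = \begin{bmatrix} 1 & i \\ 2 - i & 3 \end{bmatrix}\text{,} \qquad \adjoint{A} = \begin{bmatrix} 1 & 2 + i \\ -i & 3 \end{bmatrix}\text{.}
\end{equation*}
Conjugating and transposing a second time returns the original matrix, illustrating \(\adjoint{(\adjoint{A})} = A\text{;}\) and for a real matrix the conjugation step has no effect, so the adjoint is simply the transpose.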
In Discovery 39.6, we found another geometric interpretation of the adjoint: if \(A\) is a complex \(n \times n\) matrix and \(W\) is an \(A\)-invariant subspace of \(\C^n\text{,}\) the orthogonal complement \(\orthogcmp{W}\) will be \(\adjoint{A}\)-invariant. Since the process of taking an orthogonal complement is also self-inverse (i.e. \(\orthogcmp{(\orthogcmp{W})} = W\)), we can mix and match matrices and subspaces: if subspace \(U\) is \(\adjoint{A}\)-invariant, then \(\orthogcmp{U}\) will be \(A\)-invariant.
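Here is a small real illustration of this invariance pattern (the matrix and subspace are our own choices). For
\begin{equation*}
A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}\text{,}
\end{equation*}
the subspace \(W\) of \(\R^2\) spanned by the standard basis vector \(\uvec{e}_1\) is \(A\)-invariant, since \(A \uvec{e}_1 = \uvec{e}_1\text{.}\) The orthogonal complement \(\orthogcmp{W}\) is spanned by \(\uvec{e}_2\text{,}\) and while \(\orthogcmp{W}\) is not \(A\)-invariant (the vector \(A \uvec{e}_2\) has nonzero first component), it is \(\adjoint{A}\)-invariant: here \(\adjoint{A} = \utrans{A}\text{,}\) and \(\utrans{A} \uvec{e}_2 = 2 \uvec{e}_2\) lies in \(\orthogcmp{W}\text{.}\)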
Subsection 39.3.2 Self-adjoint matrices
A matrix that is equal to its own adjoint is, fittingly, called self-adjoint. In the real context, we have seen that this occurs precisely when the matrix is symmetric, and in the complex context it occurs precisely when the matrix is Hermitian.
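For instance (with matrices chosen here purely for illustration), each of
\begin{equation*}
\begin{bmatrix} 2 & -3 \\ -3 & 7 \end{bmatrix}\text{,} \qquad \begin{bmatrix} 2 & 1 + i \\ 1 - i & 5 \end{bmatrix}
\end{equation*}
is self-adjoint: the first is a real symmetric matrix, equal to its own transpose, while the second is Hermitian, since conjugating and then transposing reproduces the same matrix. Note that the diagonal entries of a self-adjoint matrix must be real, as each diagonal entry must equal its own complex conjugate.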
We will have occasion to study the properties of self-adjoint matrices further in Chapter 40, but for now we will note how self-adjointness relates to invariance of subspaces. We noted above that an \(A\)-invariant subspace \(W\) corresponds to an \(\adjoint{A}\)-invariant orthogonal complement \(\orthogcmp{W}\text{.}\) But when \(A\) is self-adjoint, both of these subspaces are \(A\)-invariant. And since a subspace and its orthogonal complement in a finite-dimensional inner product space always form a complete set of independent subspaces (Corollary 37.5.19), a single subspace that is invariant under a self-adjoint matrix automatically sets up the ingredients for the block-diagonalization procedure, as the orthogonal complement of that subspace will always be invariant as well.
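As a small illustration of this setup (again with an ad hoc choice of matrix), the real symmetric matrix
\begin{equation*}
A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 5 \end{bmatrix}
\end{equation*}
is self-adjoint, and the subspace \(W\) of \(\R^3\) spanned by \(\uvec{e}_3\) is \(A\)-invariant, since \(A \uvec{e}_3 = 5 \uvec{e}_3\text{.}\) Its orthogonal complement \(\orthogcmp{W}\text{,}\) spanned by \(\uvec{e}_1\) and \(\uvec{e}_2\text{,}\) is then automatically \(A\)-invariant as well, and relative to the ordered basis \(\{\uvec{e}_3, \uvec{e}_1, \uvec{e}_2\}\) the corresponding transformation has a block-diagonal matrix with a \(1 \times 1\) block and a \(2 \times 2\) block.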
Subsection 39.3.3 Orthogonal and unitary matrices
Product-preserving matrices.
Here we have two different words, orthogonal in the real context and unitary in the complex context, for the same concept: an \(n \times n\) matrix \(A\) so that
\begin{equation*}
\inprod{A \uvec{u}}{A \uvec{v}} = \inprod{\uvec{u}}{\uvec{v}} \tag{\(\star\star\)}
\end{equation*}
is true for every pair of column vectors \(\uvec{u},\uvec{v}\text{.}\) In other words, an orthogonal matrix preserves the standard inner product on \(\R^n\), and a unitary matrix preserves the standard inner product on \(\C^n\). For the purposes of unifying the real and complex contexts into one discussion, for the remainder of this subsection, we will say product-preserving to mean either a real matrix that is orthogonal or a complex matrix that is unitary.
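A familiar real example (offered here as an illustration, not as part of the discovery activities) is the rotation matrix
\begin{equation*}
A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\text{,}
\end{equation*}
which rotates every vector in \(\R^2\) about the origin through the angle \(\theta\text{.}\) Rotating two vectors by the same angle changes neither their lengths nor the angle between them, so it does not change their dot product; this is exactly the product-preserving condition, and such a matrix is orthogonal.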
Matrix properties.
We can manipulate (\(\star\star\)) using the adjoint \(\adjoint{A}\text{:}\)
\begin{equation*}
\inprod{\uvec{u}}{(\adjoint{A} A) \uvec{v}} = \inprod{A \uvec{u}}{A \uvec{v}} = \inprod{\uvec{u}}{\uvec{v}}\text{.}
\end{equation*}
But now this exhibits an adjoint pattern again, not for \(A\) but for \(\adjoint{A} A\text{:}\)
\begin{equation*}
\inprod{\adjoint{(\adjoint{A} A)} \uvec{u}}{\uvec{v}} = \inprod{\uvec{u}}{(\adjoint{A} A) \uvec{v}} = \inprod{\uvec{u}}{\uvec{v}}\text{.}
\end{equation*}
Since this is true for all pairs \(\uvec{u},\uvec{v}\text{,}\) this says that
\begin{equation*}
\adjoint{(\adjoint{A} A)} = I\text{,}
\end{equation*}
where \(I\) denotes the \(n \times n\) identity matrix.
But since the identity matrix is clearly self-adjoint, we can take the adjoint of both sides of this equality to obtain a characterization of the concept of product-preserving in matrix algebra terms:
\begin{equation*}
\adjoint{A} A = I\text{.} \tag{\(\dagger\)}
\end{equation*}
Further note that (\(\dagger\)) implies that a product-preserving matrix \(A\) must be invertible, with
\begin{equation*}
A^{-1} = \adjoint{A}\text{.} \tag{\(\dagger\dagger\)}
\end{equation*}
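To see this characterization at work on a specific complex matrix (an ad hoc example), consider
\begin{equation*}
U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}\text{,} \qquad
\adjoint{U} U = \frac{1}{2} \begin{bmatrix} 1 & -i \\ -i & 1 \end{bmatrix} \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\text{.}
\end{equation*}
So \(U\) is unitary, and its inverse is obtained immediately as \(U^{-1} = \adjoint{U}\text{,}\) with no row reduction required.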
Related to invertibility, we also considered the determinant of a product-preserving matrix in Discovery 39.3 and Discovery 39.5. Applying the determinant to both sides of equality (\(\dagger\)), we can then use Lemma 9.4.3 to conclude that
\begin{equation*}
\det A = \pm 1
\end{equation*}
in the real context, and can use Proposition 11.4.3 to conclude that
\begin{equation*}
\overline{\det A} \, \det A = 1
\end{equation*}
in the complex context. We can unify both contexts by writing
\begin{equation*}
\lvert \det A \rvert = 1\text{,}
\end{equation*}
meaning absolute value on the left in the real context and meaning complex modulus on the left in the complex context.
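For example (continuing with illustrative matrices of our own choosing), the \(2 \times 2\) rotation matrix above has determinant \(\cos^2\theta + \sin^2\theta = 1\text{,}\) while the reflection matrix
\begin{equation*}
\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\end{equation*}
is also orthogonal but has determinant \(-1\text{,}\) so both signs allowed by \(\det A = \pm 1\) do occur. In the complex context, the diagonal matrix with diagonal entries \(e^{i\theta}\) and \(1\) is unitary with determinant \(e^{i\theta}\text{,}\) a complex number of modulus \(1\) that need not be real.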
Finally, consider the matrix multiplication pattern on the left-hand side of equality (\(\dagger\)). If we write \(\uvec{a}_1,\uvec{a}_2,\dotsc,\uvec{a}_n\) for the columns of \(A\text{,}\) then the \((i,j)\) entry in the product \(\adjoint{A} A\) is
\begin{equation*}
\adjoint{\uvec{a}}_i \uvec{a}_j\text{.}
\end{equation*}
Recall that \(\adjoint{\uvec{a}}_i\) will just be \(\utrans{\uvec{a}}_i\) in the real context. So in both real and complex contexts, we can say that the \((i,j)\) entry in the product \(\adjoint{A} A\) is
\begin{equation*}
\inprod{\uvec{a}_j}{\uvec{a}_i}\text{.}
\end{equation*}
But if (\(\dagger\)) holds, then we must have
\begin{equation*}
\inprod{\uvec{a}_j}{\uvec{a}_i} = \begin{cases} 1\text{,} & i = j\text{,} \\ 0\text{,} & i \neq j\text{,} \end{cases}
\end{equation*}
since these entries must agree with the entries of the identity matrix.
In other words, a matrix is product-preserving precisely when its columns are an orthonormal set. And since orthogonality implies independence (Proposition 37.5.1), we can in fact say that a product-preserving matrix must have columns that form an orthonormal basis for \(\R^n\) or \(\C^n\text{,}\) depending on the context.
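For instance, the columns of the unitary matrix \(U = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & i \\ i & 1 \end{bmatrix}\) from the earlier illustration are \(\frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}\) and \(\frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}\text{:}\) each has norm \(1\text{,}\) since \(\tfrac{1}{2} \bigl( \lvert 1 \rvert^2 + \lvert i \rvert^2 \bigr) = 1\text{,}\) and the two columns are orthogonal to each other, so together they form an orthonormal basis of \(\C^2\text{.}\)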
Geometric properties.
As discussed above, a product-preserving matrix does just that: it preserves inner product values, as in (\(\star\star\)). But every geometric concept we have (norm, distance, angle, orthogonality) can be defined in terms of the inner product. So, as we found for orthogonal matrices in Discovery 39.4, a product-preserving matrix also preserves norm, distance, angle, and orthogonality.
In particular, a product-preserving matrix will preserve both orthogonal and orthonormal sets. That is, if
\begin{equation*}
S = \{ \uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_k \}
\end{equation*}
is an orthogonal set of column vectors, then so is
\begin{equation*}
S' = \{ A \uvec{v}_1, A \uvec{v}_2, \dotsc, A \uvec{v}_k \}\text{.}
\end{equation*}
And if \(S\) is in fact orthonormal, then so is \(S'\text{.}\) We can go even further and say that transforming an orthogonal basis through multiplication by a product-preserving matrix will result in another orthogonal basis, and the same can be said with “orthogonal basis” replaced by “orthonormal basis”.
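To make this concrete (using the rotation matrix from the earlier illustration), applying a rotation through the angle \(\theta\) to the standard basis of \(\R^2\) produces
\begin{equation*}
\left\{ \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \right\}\text{,}
\end{equation*}
and a direct check confirms that each of these vectors still has norm \(1\) and that their dot product is still \(0\text{,}\) so the transformed standard basis is again an orthonormal basis of \(\R^2\text{.}\)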
As transition matrices.
Consider again Discovery 39.7. The columns of a transition matrix are coordinate vectors for expanding each “old” basis vector relative to the “new” basis. If we are working in a finite-dimensional inner product space \(V\text{,}\) and the “new” basis
\begin{equation*}
\basisfont{B} = \{ \uvec{v}_1, \uvec{v}_2, \dotsc, \uvec{v}_n \}
\end{equation*}
for \(V\) is orthonormal, then we can use the Expansion theorem to express each vector in the “old” basis
\begin{equation*}
\basisfont{B}' = \{ \uvec{v}'_1, \uvec{v}'_2, \dotsc, \uvec{v}'_n \}
\end{equation*}
as
\begin{equation*}
\uvec{v}'_j = \inprod{\uvec{v}'_j}{\uvec{v}_1} \uvec{v}_1 + \inprod{\uvec{v}'_j}{\uvec{v}_2} \uvec{v}_2 + \dotsb + \inprod{\uvec{v}'_j}{\uvec{v}_n} \uvec{v}_n\text{.}
\end{equation*}
The \(\nth[j]\) column of \(\ucobmtrx{B'}{B}\) contains the coefficients in the expression above, and so we have \((i,j)\) entry
\begin{equation*}
\inprod{\uvec{v}'_j}{\uvec{v}_i}\text{.}
\end{equation*}
By the same reasoning, if \(\basisfont{B}'\) is also orthonormal, then the \((i,j)\) entry of \(\ucobmtrx{B}{B'}\) is
\begin{equation*}
\inprod{\uvec{v}_j}{\uvec{v}'_i}\text{.}
\end{equation*}
If we instead say that the \((j,i)\) entry of \(\ucobmtrx{B}{B'}\) is
\begin{equation*}
\inprod{\uvec{v}_i}{\uvec{v}'_j}\text{,}
\end{equation*}
we immediately see (using the conjugate symmetry of the inner product) that
\begin{equation*}
\ucobmtrx{B}{B'} = \adjoint{(\ucobmtrx{B'}{B})}\text{.}
\end{equation*}
Since we also know
\begin{equation*}
\ucobmtrx{B}{B'} = (\ucobmtrx{B'}{B})^{-1}
\end{equation*}
(Rule 3 of Proposition 22.5.4), we can use the characterization (\(\dagger\dagger\)) to conclude that a transition matrix between orthonormal bases of an inner product space must be a product-preserving matrix. That is, such a matrix must be orthogonal if \(V\) is a real inner product space and unitary if \(V\) is a complex inner product space.
Every invertible matrix can be considered as a transition matrix \(\ucobmtrx{B}{S}\text{,}\) where \(\basisfont{S}\) is the standard basis and \(\basisfont{B}\) is the basis formed by the columns of the invertible matrix. (See Proposition 22.5.6, which remains true about invertible complex matrices and bases of \(\C^n\text{.}\)) But the standard basis is clearly orthonormal, and a product-preserving matrix has orthonormal columns. So a product-preserving matrix can always be regarded as a transition matrix between orthonormal bases for \(\R^n\) or \(\C^n\text{,}\) depending on the context.
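For a concrete closing illustration (our own choice of bases), take \(V = \R^2\text{,}\) let \(\basisfont{S}\) be the standard basis, and let \(\basisfont{B}\) be the orthonormal basis obtained by rotating the standard basis through an angle \(\theta\text{,}\) as displayed above. The transition matrix that converts \(\basisfont{B}\)-coordinates into standard coordinates has the two rotated vectors as its columns, so it is precisely the rotation matrix, an orthogonal matrix, just as the discussion above predicts.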