Section 10.3 Concepts
Recall that in Section 8.3 we set a goal for ourselves: given a square matrix \(A\text{,}\) determine a matrix \(A'\) so that \(A A'\) is a scalar multiple of the identity (Goal 8.3.1). The adjoint finally fulfills this goal.
Subsection 10.3.1 The classical adjoint
Before we dive in, a note about the adjective “classical.” In a second course in linear algebra, you will probably learn that square matrices have a different kind of “adjoint” matrix that is completely unrelated to the adjoint we will discuss here. (The word “adjoint” gets used a lot in mathematics for many different concepts.) So we are attaching the adjective “classical” to the adjoint matrix we define here to distinguish it from that other one.
Let’s remind ourselves how determinants are defined, by cofactor expansions. For matrix \(A = \begin{bmatrix} a_{ij} \end{bmatrix}\text{,}\) the cofactor expansion of \(\det A\) along row \(i\) is
\begin{equation*}
\det A = a_{i1} C_{i1} + a_{i2} C_{i2} + \dotsb + a_{in} C_{in} \text{,}
\end{equation*}
where the \(C_{ij}\) are the associated cofactors. This pattern of a sum of products sure looks like matrix multiplication, where we multiply the \(\nth[i]\) row of \(A\) against a column of some matrix. Since each position in \(A\) has a corresponding cofactor, we can create a matrix of cofactors \(C_A = \begin{bmatrix}C_{ij}\end{bmatrix}\text{.}\) However, the indices of the \(C_{ij}\) in the cofactor expansion above progress along a row of this cofactor matrix, whereas matrix multiplication pairs rows of the first matrix with columns of the second. Fortunately, we know a way to turn rows into columns: the transpose. We call the transpose of the matrix of cofactors the (classical) adjoint of \(A\text{,}\) and write \(\adj A\) to mean \(\utrans{C}_A\text{.}\)
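To make the construction concrete, here is a minimal computational sketch of the definition in Python with NumPy. (The language, the library, and the helper names cofactor_matrix and classical_adjoint are illustrative choices, not notation used in this text.) It builds the matrix of cofactors entry by entry from minors, then transposes.

```python
import numpy as np

def cofactor_matrix(A):
    """Matrix of cofactors C_A = [C_ij] of a square matrix A."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            # The (i,j) minor: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C

def classical_adjoint(A):
    """adj A is the transpose of the matrix of cofactors."""
    return cofactor_matrix(A).T

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 1.0, 2.0],
              [0.0, 1.0, 1.0]])
print(classical_adjoint(A))
```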
In Discovery 10.1, we explored what happens when we multiply out \(A\) times \(\adj A\text{.}\) We only worked with the \(3\times 3\) case, but the same patterns would emerge for any size matrix. Remember that in a product like \(A (\adj A)\) we get the \(\nth[(i,j)]\) entry by multiplying the \(\nth[i]\) row of the first matrix against the \(\nth[j]\) column of the second matrix. Since the second matrix is a transpose, its \(\nth[j]\) column will be the \(\nth[j]\) row of the matrix of cofactors \(C_A\text{.}\) Thus, for each diagonal entry (that is, where \(j=i\)), we will be multiplying a row of \(A\) against the corresponding row of cofactors, and we’ll get the value of \(\det A\) repeated down the diagonal of \(A (\adj A)\text{.}\) On the other hand, for an off-diagonal entry (that is, where \(j\ne i\)), we’ll get a row of \(A\) multiplied against the cofactors associated to a different row. In our analysis of the operation of combining rows in Subsection 9.2.3, we determined that a “mixed” cofactor expansion always evaluates to \(0\). So all off-diagonal entries of \(A (\adj A)\) are \(0\text{,}\) and this product matrix is diagonal. Moreover, since the same value \(\det A\) is repeated down the diagonal, this product matrix is in fact scalar:
\begin{equation*}
A (\adj A) = (\det A) I \text{.}
\end{equation*}
As mentioned at the start of this section, this fulfills Goal 8.3.1, with \(\delta = \det A\) and \(A' = \adj A\text{.}\) In particular, this gives us a formula for the inverse of any matrix that has nonzero determinant:
\begin{equation*}
\inv{A} = \frac{1}{\det A} \, \adj A \text{.}
\end{equation*}
Remark 10.3.1.
Just as cofactor expansions are an inefficient means to compute determinants, calculating an inverse using the adjoint formula above is very inefficient, since computing an adjoint for an \(n\times n\) matrix involves computing \(n^2\) determinants of \((n-1)\times(n-1)\) matrices. You are much better off computing an inverse by row reducing, as in Subsection 6.3.5 and Subsection 6.4.3. However, the above formula is useful for further developing the theory of solving systems by inverses, as we will soon see.
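That said, both the product formula and the inversion formula are easy to check numerically. The sketch below (again Python with NumPy, reusing the illustrative cofactor-based helper from above) verifies \(A (\adj A) = (\det A) I\) for a sample \(3 \times 3\) matrix with nonzero determinant, and compares \(\inv{(\det A)} \, \adj A\) against NumPy's built-in inverse.

```python
import numpy as np

def classical_adjoint(A):
    # Same cofactor-based helper sketched earlier in this subsection.
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 1.0, 2.0],
              [0.0, 1.0, 1.0]])
adjA = classical_adjoint(A)
d = np.linalg.det(A)          # -7 for this example, so A is invertible

# A (adj A) should equal (det A) I ...
print(np.allclose(A @ adjA, d * np.eye(3)))       # True
# ... and dividing adj A by det A should give the inverse.
print(np.allclose(adjA / d, np.linalg.inv(A)))    # True
```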
Subsection 10.3.2 Determinants determine invertibility
Part of our motivation for developing determinants was to make sense of the \(a d - b c\) formula that determines whether a \(2 \times 2\) matrix is invertible, and obtain a similar formula for larger square matrices. In completing Goal 8.3.1 by obtaining the formula \(A (\adj A) = (\det A) I\text{,}\) we learn that whenever \(\det A \neq 0\) then \(A \bigl[ \inv{(\det A)} (\adj A) \bigr] = I\text{,}\) and so \(A\) is invertible (Proposition 6.5.6).
To repeat, we now know that if \(\det A\neq 0\text{,}\) then \(A\) must be invertible. Logically, that raises three related questions.
Question 10.3.2.
- If \(A\) is invertible, must \(\det A\) be nonzero?
- If \(\det A = 0\text{,}\) must \(A\) be singular?
- If \(A\) is singular, must \(\det A\) be zero?
In the study of logic, the statement versions of these three questions are called the converse, inverse, and contrapositive, respectively, of the original conditional statement:
If \(\det A \neq 0\text{,}\) then \(A\) is invertible.
And the study of logic tells us that the answers to these three questions are not necessarily all affirmative just because the original statement is true. So in Discovery 10.2 and Discovery 10.3 we considered these questions, as well as the original statement, by examining the effects of row reducing on the determinant. Here is what we discovered, in the order we considered them in those two discovery activities, relying on our knowledge that a square matrix is invertible if and only if its RREF is the identity matrix (Theorem 6.5.2).
If \(\det A = 0\).
Since no elementary row operation can change a zero determinant to a nonzero one, the RREF of \(A\) must also have determinant \(0\text{.}\) But then the RREF of \(A\) cannot be \(I\text{,}\) since \(\det I = 1\text{.}\) So \(A\) is not invertible.
If \(A\) is invertible.
Then \(\det A\) cannot be zero, since then \(A\) wouldn’t be invertible, as we just argued in the previous point.
If \(\det A\) is nonzero.
Since no elementary row operation can change a nonzero determinant to a zero determinant (multiplying a row by \(0\) is not an elementary operation), the RREF for \(A\) must also have nonzero determinant. But then that RREF cannot have a row of zeros, because then its determinant would be \(0\text{.}\) Since it is square, that RREF matrix must have all of its leading ones, making it the identity matrix, and so \(A\) is invertible.
If \(A\) is singular.
Then \(\det A\) must be zero, since if it were nonzero then \(A\) would be invertible, as we just argued in the previous point.
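If you would like to see these facts in action numerically, here is a small Python/NumPy sketch (an illustrative choice of tools, not part of this text's development) contrasting a matrix with nonzero determinant against one with proportional rows, whose determinant is zero and which cannot be inverted.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])    # det A = 1, nonzero
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # second row is twice the first, so det B = 0

print(np.linalg.det(A))       # nonzero: A is invertible
print(np.linalg.det(B))       # (numerically) zero: B is singular

np.linalg.inv(A)              # succeeds
try:
    np.linalg.inv(B)          # fails: no inverse exists
except np.linalg.LinAlgError as err:
    print("B is singular:", err)
```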
Subsection 10.3.3 Determinants versus matrix multiplication: case of elementary matrices
In Discovery 10.4, we considered \(\det (EA)\) for \(E\) an elementary matrix and \(A\) a square matrix. Since there are three different kinds of elementary matrices, we had three different cases to consider. In each case, we were able to combine the appropriate part of Proposition 9.4.2 on the one hand with the appropriate part of Proposition 9.4.5 on the other, in order to verify
\begin{gather}
\det (EA) = (\det E) (\det A)\tag{✶}
\end{gather}
holds for each of the three types of elementary matrix \(E\text{.}\) (For the details of these three cases, see the proof of Lemma 10.5.5, which appears in Subsection 10.5.3.)
Expressed in words, the equality above represents the pattern that a determinant of a product is the product of the determinants, at least in the case where the first matrix in the product is elementary (for now).
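Here is a quick numerical spot-check of (✶), again as an illustrative Python/NumPy sketch: it builds one elementary matrix of each of the three types and compares \(\det (E A)\) with \((\det E)(\det A)\text{.}\)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)

I = np.eye(3)
E_swap = I[[1, 0, 2], :]              # type 1: swap rows 1 and 2 of I
E_scale = np.diag([1.0, 5.0, 1.0])    # type 2: multiply row 2 of I by 5
E_add = I.copy()
E_add[2, 0] = -2.0                    # type 3: add -2 times row 1 to row 3 of I

for E in (E_swap, E_scale, E_add):
    lhs = np.linalg.det(E @ A)
    rhs = np.linalg.det(E) * np.linalg.det(A)
    print(np.isclose(lhs, rhs))       # True for each of the three types
```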
Subsection 10.3.4 Determinants versus matrix multiplication: invertible case
In Discovery 10.5, we progressed to considering determinants of a product of matrices where the first matrix in the product is invertible. In particular, this means that the first matrix can be expressed somehow as a product of elementary matrices (Theorem 6.5.2), and so we can unravel the determinant of this product one elementary matrix at a time, using the result of the previous subsection at each step.
As in Discovery 10.5, consider matrix \(N\) and invertible matrix \(M\text{,}\) where \(M\) can be expressed as a product of three elementary matrices, \(M = E_1 E_2 E_3\text{.}\) Then we can repeatedly use our rule (✶) from the elementary matrix case in Subsection 10.3.3 above to obtain
\begin{align*}
\det (M N)
\amp= \det (E_1 E_2 E_3 N)\\
\amp= (\det E_1) \bbrac{\det (E_2 E_3 N)} \amp \amp\text{(i)}\\
\amp= (\det E_1) (\det E_2) \bbrac{\det (E_3 N)} \amp \amp\text{(ii)}\\
\amp= (\det E_1) (\det E_2) (\det E_3) (\det N) \amp \amp\text{(iii)}\\
\amp= (\det E_1) \bbrac{\det (E_2 E_3)} (\det N) \amp \amp\text{(iv)}\\
\amp= \bbrac{\det (E_1 E_2 E_3)} (\det N) \amp \amp\text{(v)}\\
\amp= (\det M)(\det N),
\end{align*}
with justifications
- (i), (ii), (iii): rule (✶), applied with the elementary matrix \(E_1\text{,}\) then \(E_2\text{,}\) then \(E_3\) at the front of the remaining product;
- (iv): rule (✶) applied in reverse, with \(E = E_2\) and \(A = E_3\text{,}\) to recombine \((\det E_2)(\det E_3)\) into \(\det (E_2 E_3)\text{;}\)
- (v): rule (✶) applied in reverse once more, with \(E = E_1\) and \(A = E_2 E_3\text{,}\) yielding \(\det (E_1 E_2 E_3) = \det M\text{.}\)
Of course, this sort of calculation could be repeated no matter how many elementary matrices went into a product expression for \(M\text{.}\) So we can make our final statement of the last subsection a little stronger: a determinant of a product is the product of the determinants, at least in the case where the first matrix in the product is invertible (for now).
Subsection 10.3.5 Determinants versus matrix multiplication: singular case
Finally, in Discovery 10.7 we considered the determinant of a product of matrices where the first matrix in the product is singular. It is fairly straightforward to verify that the determinant of a product is again the product of the determinants in this case: a singular first matrix \(M\) forces the product \(M N\) to be singular as well, so both \(\det (M N)\) and \((\det M)(\det N)\) are \(0\text{.}\) (See the proof of the singular case for Statement 1 of Proposition 10.5.6, which will appear in Subsection 10.5.3.)
Subsection 10.3.6 Determinants versus matrix multiplication: all cases
The considerations in Subsection 10.3.4 and Subsection 10.3.5 together verify the universal pattern
\begin{equation*}
\det (M N) = (\det M) (\det N)
\end{equation*}
for square matrices \(M\) and \(N\) of the same size, no matter whether \(M\) is invertible or singular, and so the pattern that a determinant of a product is the product of the determinants is true in all cases. In more sophisticated mathematical language, we say that the determinant function is multiplicative.
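Here is a small numerical illustration of multiplicativity in both cases (again an illustrative Python/NumPy sketch, with matrices chosen only for demonstration): one \(M\) that is almost surely invertible, and one that is visibly singular.

```python
import numpy as np

rng = np.random.default_rng(1)
N = rng.standard_normal((3, 3))

M_invertible = rng.standard_normal((3, 3))    # random, so almost surely invertible
M_singular = np.array([[1.0, 2.0, 3.0],
                       [2.0, 4.0, 6.0],       # twice the first row, so singular
                       [0.0, 1.0, 1.0]])

for M in (M_invertible, M_singular):
    lhs = np.linalg.det(M @ N)
    rhs = np.linalg.det(M) * np.linalg.det(N)
    print(np.isclose(lhs, rhs))               # True in both cases
```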
Subsection 10.3.7 Determinant of an inverse
In Discovery 10.6, we used the fact that the determinant is multiplicative to investigate the relationship between the determinants of an invertible matrix and its inverse. By definition of inverse, we have \(A \inv{A} = I\text{.}\) Since the product \(A \inv{A}\) is the same matrix as the identity, it must have the same determinant, so
\begin{equation*}
\det (A \inv{A}) = 1
\end{equation*}
(Statement 4 of Proposition 8.5.2).
As well, we know that \(\det A \neq 0\text{,}\) since \(A\) is invertible. So,
\begin{align*}
\det (A\inv{A}) \amp= 1\\
(\det A)\bbrac{\det (\inv{A})} \amp= 1\\
\det (\inv{A}) \amp= \frac{1}{\det A}\text{.}
\end{align*}
Thus, the determinant of an inverse is the inverse of the determinant.
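This, too, is easy to confirm numerically; the short Python/NumPy sketch below uses an illustrative invertible matrix.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])    # det A = 8, so A is invertible

print(np.isclose(np.linalg.det(np.linalg.inv(A)),
                 1.0 / np.linalg.det(A)))     # True
```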
Subsection 10.3.8 Cramer’s rule
While the adjoint inversion formula is not a good choice for computing inverses, it does have applications. Here is one, to solving systems. Remember that if \(A \uvec{x} = \uvec{b}\) is a linear system with a square, invertible coefficient matrix \(A\text{,}\) then there is exactly one solution, \(\uvec{x} = \inv{A} \uvec{b}\text{.}\) Using the adjoint inversion formula, we get
\begin{gather}
\uvec{x} = \inv{A}\uvec{b} = \frac{1}{\det A}(\adj A)\uvec{b}\text{.}\tag{✶✶}
\end{gather}
As usual, let’s consider this solution formula in the case that \(A\) is \(3\times 3\text{,}\) in which case both \(\uvec{x}\) and \(\uvec{b}\) are \(3\times 1\text{:}\)
\begin{align*}
\uvec{x} \amp= \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \text{,} \amp
\uvec{b} \amp= \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \text{.}
\end{align*}
The product \((\adj A) \uvec{b}\) will be a column matrix, whose top entry is obtained by multiplying the first row of \(\adj A\) against the column \(\uvec{b}\text{.}\) But \(\adj A\) is the transpose of the matrix of cofactors for \(A\text{,}\) so the first row of \(\adj A\) contains the cofactors from the first column of \(A\text{,}\) and we have
\begin{gather}
\bigl[(\adj A)\uvec{b}\bigr]_{11} = C_{11}b_1 + C_{21}b_2 + C_{31}b_3\text{.}\tag{✶✶✶}
\end{gather}
This looks like a cofactor expansion of some determinant! The cofactors are from the first column of \(A\text{,}\) so their values only depend on the second and third columns of \(A\text{.}\) But the entries are from \(\uvec{b}\text{,}\) so if we replace the first column in \(A\) with \(\uvec{b}\) to get a new matrix
\begin{equation*}
A_1 = \begin{bmatrix}
| \amp | \amp | \\
\uvec{b} \amp \uvec{a}_2 \amp \uvec{a}_3 \\
| \amp | \amp |
\end{bmatrix}\text{,}
\end{equation*}
then a cofactor expansion of \(\det A_1\) along the first column gives us exactly the expression in (✶✶✶) for the first entry in the product \((\adj A) \uvec{b}\text{.}\) Using this in (✶✶), and considering only the top entry on both sides, we get
\begin{equation*}
x_1 = \frac{1}{\det A} \bigl[ (\adj A) \uvec{b} \bigr]_{11} = \frac{\det A_1}{\det A} \text{.}
\end{equation*}
Similar calculations would tell us that \(x_2 = (\det A_2) / (\det A)\text{,}\) where \(A_2\) is the matrix obtained by replacing the second column of \(A\) by \(\uvec{b}\text{,}\) and similarly for \(x_3\text{.}\) And the same pattern emerges for larger systems.
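This pattern translates directly into a procedure. The sketch below (Python with NumPy; the function name cramer_solve is an illustrative choice, not standard) replaces each column of \(A\) by \(\uvec{b}\) in turn, divides each resulting determinant by \(\det A\text{,}\) and compares the answer against a standard solver.

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b by Cramer's rule: x_i = det(A_i) / det(A),
    where A_i is A with its i-th column replaced by b."""
    d = np.linalg.det(A)
    if np.isclose(d, 0.0):
        raise ValueError("Cramer's rule requires an invertible coefficient matrix")
    x = np.empty(A.shape[1])
    for i in range(A.shape[1]):
        Ai = A.copy()
        Ai[:, i] = b              # replace column i of A by b
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 1.0, 2.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 6.0, 2.0])

print(cramer_solve(A, b))
print(np.allclose(cramer_solve(A, b), np.linalg.solve(A, b)))   # True
```

As with the adjoint inversion formula itself, this is for illustration only; for systems of any appreciable size, row reduction is far more efficient.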
We will work out an example of using Cramer’s rule in Subsection 10.4.3.