
Section 6.3 Concepts

Even though the title of this chapter is Elementary matrices, it is really another chapter about matrix inverses.

Suppose \(A\) and \(B\) are square matrices of the same size, and \(A\) is invertible. Start with \(B\text{,}\) multiply on the left by \(A\) to get \(AB\text{,}\) and then multiply that result on the left by \(\inv{A}\) to get \(\inv{A}AB = IB = B\text{,}\) which is right back where we started. The point being that an inverse matrix \(\inv{A}\) undoes or reverses multiplication by \(A\text{.}\) So if we want to understand inverses, we need to understand how to reverse matrix multiplication.

Now, our motivation for defining matrix multiplication in the way that we did (i.e. rows times columns) was so that we could use matrix multiplication to represent a system of equations by a single matrix equation \(A\uvec{x}=\uvec{b}\text{,}\) with both the vector of unknowns \(\uvec{x}\) and the vector of constants \(\uvec{b}\) as column vectors. (See Discovery 4.5, and more generally Chapter 4.) Furthermore, for a system \(A\uvec{x} = \uvec{b}\) with a square invertible coefficient matrix \(A\text{,}\) we can solve the system either by row reducing or by reversing the multiplication by \(A\) and algebraically isolating \(\uvec{x}=\inv{A}\uvec{b}\text{.}\) So there must be a connection between row operations, matrix multiplication, and matrix inverses. And elementary matrices are precisely that connection.

Subsection 6.3.1 Elementary matrices

In Discovery guide 6.1–6.3, we discovered that we can create special square matrices so that multiplying another matrix by that special matrix (on the left) has the same effect as performing an elementary row operation, and we called these special matrices elementary matrices. So if \(E\) is an elementary matrix and \(A\) is another matrix of a compatible size (but not necessarily square), then the result of computing the matrix product \(EA\) is the same as performing some elementary row operation on \(A\text{.}\)

Applying this same reasoning with \(A\) replaced by \(I\text{,}\) we see that \(EI = E\) must be the same as the result of applying that elementary row operation to the identity matrix. This gives us an easy way to produce an elementary matrix for a particular elementary row operation.

See Subsection 6.4.1 for some examples.
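For instance, here is one small illustration (made up here, rather than taken from the examples in Section 6.4): to build the elementary matrix that performs the operation \(R_2 + 2 R_1\) on matrices with three rows, just perform that operation on the \(3 \times 3\) identity matrix:

\begin{equation*} I = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right] \xrightarrow{R_2 + 2 R_1 \quad} \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 2 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right]\text{.} \end{equation*}

Multiplying any three-row matrix on the left by the resulting matrix then has the same effect as performing \(R_2 + 2 R_1\) on that matrix.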

If each elementary row operation can be achieved by multiplication by an elementary matrix, then a sequence of row operations can be achieved by iterated multiplication by elementary matrices, as in Discovery 6.4. For example, suppose we were to perform the following sequence of operations on some \(3 \times \ell\) matrix \(A\text{:}\)

\begin{equation*} A \xrightarrow{R_2 + 2 R_1 \quad} A' \xrightarrow{-4 R_3 \quad} A'' \xrightarrow{R_1 \leftrightarrow R_2 \quad} A'''\text{.} \end{equation*}

The first operation is the same as that corresponding to the elementary matrix \(E\) from Discovery 6.1, so the first result \(A'\) is equal to \(E A\text{.}\) Similarly, the second operation is the same as that corresponding to the elementary matrix \(E'\) from Discovery 6.2, but this second operation is being applied to the first result \(A'\text{.}\) So the second result \(A''\) is equal to

\begin{equation*} E' A' = E' (E A) \text{.} \end{equation*}

Finally, the third operation is the same as that corresponding to the elementary matrix \(E''\) from Discovery 6.3, and this third operation is being applied to the second result \(A''\text{.}\) So the third result \(A'''\) is equal to

\begin{equation*} E'' A'' = E'' \bbrac{E' (E A)}\text{.} \end{equation*}

So our sequence of row operations is

\begin{equation*} A \xrightarrow{R_2 + 2 R_1 \quad} E A \xrightarrow{-4 R_3 \quad} E' E A \xrightarrow{R_1 \leftrightarrow R_2 \quad} E'' E' E A\text{,} \end{equation*}

where each new elementary matrix corresponds to the operation of the preceding arrow. Notice the order of the elementary matrices in the final product — the elementary matrices appear in right-to-left order compared to the order that the operations have been performed. Make sure you understand why this is so.
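To make this concrete, writing out the three-row case ourselves rather than quoting the matrices from the Discovery guide, performing each of the three operations on the \(3 \times 3\) identity matrix gives

\begin{align*} E \amp = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 2 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right], \amp E' \amp = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp -4 \end{array}\right], \amp E'' \amp = \left[\begin{array}{rrr} 0 \amp 1 \amp 0 \\ 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right], \end{align*}

and the net effect of the whole sequence on \(A\) is multiplication on the left by the single matrix

\begin{equation*} E'' E' E = \left[\begin{array}{rrr} 2 \amp 1 \amp 0 \\ 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp -4 \end{array}\right]\text{.} \end{equation*}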

In Discovery 6.5, we examined this kind of correspondence between row operations and elementary matrices in a row reduction process. It is possible to row reduce the matrix \(B\) in that activity to the identity matrix in three operations, represented by elementary matrices \(E_1,E_2,E_3\text{:}\)

\begin{equation*} B \xrightarrow[\text{operation }\phantom{x}]{\text{first}} E_1 B \xrightarrow[\text{operation }\phantom{x}]{\text{second}} E_2 E_1 B \xrightarrow[\text{operation }\phantom{x}]{\text{third}} E_3 E_2 E_1 B\text{.} \end{equation*}

See Subsection 6.4.2 for another example of determining elementary matrices corresponding to the steps in a row reduction process.

Subsection 6.3.2 Inverses by elementary matrices

As discussed above, in Discovery 6.5 we reduced a matrix \(B\) to the identity matrix in three operations. In terms of elementary matrices, this means that \(E_3 E_2 E_1 B = I\text{,}\) where \(E_1,E_2,E_3\) are the elementary matrices corresponding to the three operations in the reduction sequence.

Assuming that matrix \(B\) is invertible, we could use \(\inv{B}\) to manipulate this equality:

\begin{align*} \amp \amp I \amp= E_3 E_2 E_1 B\\ \amp \implies \amp I \inv{B} \amp= (E_3 E_2 E_1 B) \inv{B}\\ \amp \implies \amp \phantom{I}\inv{B} \amp= E_3 E_2 E_1 (B \inv{B})\\ \amp \amp \amp = E_3 E_2 E_1 I\\ \amp \amp \amp= E_3 E_2 E_1. \end{align*}

So if a matrix is invertible, we can compute its inverse by row reducing it to the identity matrix and then multiplying together the elementary matrices that correspond to the steps in that row reduction, in the proper order. But there is a more direct way, as we will see in Subsection 6.3.5 below.
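Here is a small illustration of this idea, using a made-up \(2 \times 2\) matrix (not the matrix \(B\) from Discovery 6.5). The matrix

\begin{equation*} C = \left[\begin{array}{rr} 1 \amp 2 \\ 0 \amp 3 \end{array}\right] \end{equation*}

can be reduced to \(I\) by the two operations \(\frac{1}{3} R_2\) and then \(R_1 - 2 R_2\text{,}\) whose corresponding elementary matrices are

\begin{align*} F_1 \amp = \left[\begin{array}{rr} 1 \amp 0 \\ 0 \amp \frac{1}{3} \end{array}\right], \amp F_2 \amp = \left[\begin{array}{rr} 1 \amp -2 \\ 0 \amp 1 \end{array}\right]. \end{align*}

So \(F_2 F_1 C = I\text{,}\) and therefore

\begin{equation*} \inv{C} = F_2 F_1 = \left[\begin{array}{rr} 1 \amp -\frac{2}{3} \\ 0 \amp \frac{1}{3} \end{array}\right]\text{.} \end{equation*}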

Remark 6.3.3.

There are many different sequences of row operations that could reduce a matrix to its RREF, and so when a matrix is invertible there are many different ways we could compute its inverse via a product of elementary matrices. These different ways can even involve different numbers of elementary matrices.

Subsection 6.3.3 Inverses of elementary matrices

As we explored in Discovery 6.7, every elementary row operation has a reverse operation.

Table 6.3.4. Reversing row swaps.
Operation: swap two rows, \(R_i \leftrightarrow R_j\).
Reverse operation: swap the rows again, \(R_i \leftrightarrow R_j\).
Reverse of the reverse: swap the rows again, \(R_i \leftrightarrow R_j\).

Table 6.3.5. Reversing row scales.
Operation: multiply a row by a nonzero constant, \(R_i \rightarrow kR_i\).
Reverse operation: divide that row by the constant, \(R_i \rightarrow \frac{1}{k}R_i\).
Reverse of the reverse: divide that row by the reciprocated constant, \(R_i \rightarrow \frac{1}{1/k}R_i = kR_i\).

Table 6.3.6. Reversing row combinations.
Operation: add a multiple of one row to another, \(R_i \rightarrow R_i+kR_j\).
Reverse operation: subtract that multiple of the one row from the other, \(R_i \rightarrow R_i+(-k)R_j\).
Reverse of the reverse: subtract that negative multiple of the one row from the other, \(R_i \rightarrow R_i+\bbrac{-(-k)}R_j = R_i+kR_j\).

In each case, performing an operation on a matrix and then performing the reverse operation on that result will return you to the original matrix. Also notice that in each case the reverse operation of a reverse operation is the original operation. So, if \(E\) is the elementary matrix corresponding to some operation, and \(E'\) is the elementary matrix corresponding to the reverse operation, then also \(E\) corresponds to the reverse of the operation of \(E'\text{.}\)

If we perform these operations on the identity matrix, we get

\begin{equation*} I \quad\xrightarrow{\text{operation}}\quad EI \quad\xrightarrow[\text{operation}]{\text{reverse}}\quad E'EI, \end{equation*}
\begin{equation*} I \quad\xrightarrow[\text{operation}]{\text{reverse}}\quad E'I \quad\xrightarrow{\text{operation}}\quad EE'I. \end{equation*}

But in both situations we should be back at the identity matrix, because the second operation reverses the first. Thus, \(E'E = I\) and \(EE' = I\text{,}\) which by definition says that \(E'\) is the inverse of \(E\text{.}\) Hence, every elementary matrix is invertible, and the inverse of an elementary matrix is the elementary matrix corresponding to the reverse operation.
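For example, for three-row matrices, the elementary matrix \(E\) for the operation \(R_2 + 2 R_1\) and the elementary matrix \(E'\) for its reverse operation \(R_2 - 2 R_1\) are

\begin{align*} E \amp = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 2 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right], \amp E' \amp = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ -2 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array}\right], \end{align*}

and a direct computation confirms that \(E'E = EE' = I\text{,}\) so that \(E' = \inv{E}\text{.}\)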

Subsection 6.3.4 Decomposition of invertible matrices

Let's go back to the matrix \(B\) from Discovery 6.5, for which we obtained the matrix equality \(E_3E_2E_1B=I\) for some particular elementary matrices \(E_1,E_2,E_3\text{.}\) We have just learned in the preceding subsection (Subsection 6.3.3) that elementary matrices are invertible, so we can use the algebra of matrix inverses to isolate \(B\) as

\begin{equation*} B = \inv{E}_1\inv{E}_2\inv{E}_3. \end{equation*}
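Spelled out, one way to carry out that isolation is to multiply both sides of \(E_3 E_2 E_1 B = I\) on the left by \(\inv{E}_3\text{,}\) then by \(\inv{E}_2\text{,}\) then by \(\inv{E}_1\text{,}\) collapsing each product \(\inv{E}_i E_i\) to \(I\) along the way:

\begin{align*} E_3 E_2 E_1 B \amp = I\\ E_2 E_1 B \amp = \inv{E}_3\\ E_1 B \amp = \inv{E}_2 \inv{E}_3\\ B \amp = \inv{E}_1 \inv{E}_2 \inv{E}_3. \end{align*}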

Now, from the preceding subsection we know that each of \(\inv{E}_1,\inv{E}_2,\inv{E}_3\) is also an elementary matrix. So if we describe the pattern of the formula \(B = \inv{E}_1\inv{E}_2\inv{E}_3\) in words, we might choose to ignore the inverses and say that \(B\) can be expressed as a product of elementary matrices. Since a product of elementary matrices represents performing the corresponding elementary row operations in sequence (on the identity matrix, if you like), we might say that a square matrix is invertible precisely when it represents some sequence of elementary row operations, and so inverting it is the same as trying to reverse that sequence of operations.
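To illustrate, consider again the made-up matrix \(C\) and elementary matrices \(F_1,F_2\) from Subsection 6.3.2, where \(F_2 F_1 C = I\text{.}\) The inverses \(\inv{F}_1\) and \(\inv{F}_2\) are the elementary matrices for the reverse operations \(3 R_2\) and \(R_1 + 2 R_2\text{,}\) and indeed

\begin{equation*} \inv{F}_1 \inv{F}_2 = \left[\begin{array}{rr} 1 \amp 0 \\ 0 \amp 3 \end{array}\right] \left[\begin{array}{rr} 1 \amp 2 \\ 0 \amp 1 \end{array}\right] = \left[\begin{array}{rr} 1 \amp 2 \\ 0 \amp 3 \end{array}\right] = C\text{,} \end{equation*}

expressing \(C\) as a product of elementary matrices.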

Subsection 6.3.5 Inverses by row reduction

Still working with the matrix \(B\) from Discovery 6.5, consider the formula

\begin{equation*} \inv{B} = E_3 E_2 E_1 I, \end{equation*}

from the computation in Subsection 6.3.2. We could simplify away the identity matrix (as we did above), but as is often the case in mathematics, simplifying hides patterns. Remember where the elementary matrices \(E_1,E_2,E_3\) came from — our starting point in the computation above was the formula \(E_3 E_2 E_1 B = I\text{,}\) which we obtained from the fact that these elementary matrices represented the steps taken to reduce \(B\) to the identity. So when we compare the two formulas

\begin{align*} E_3 E_2 E_1 B \amp = I, \amp E_3 E_2 E_1 I \amp = \inv{B}, \end{align*}

we realize that the same sequence of operations that reduces \(B\) to \(I\) can be used to “unreduce” \(I\) to \(\inv{B}\).

Now, it is inefficient to first row reduce the matrix \(B\) to \(I\text{,}\) and then unreduce \(I\) to \(\inv{B}\) afterward, because we would be performing the same operations, in the same order, in both parts of the process. It would be faster to do both at once, one operation at a time, which is exactly what the following procedure does.

Procedure 6.3.7.

To compute the inverse of a square matrix \(A\) (if it exists), augment \(A\) on the right with the identity matrix \(I\) of the same size, and row reduce the augmented matrix. If the left half of the augmented matrix can be reduced to \(I\text{,}\) then \(A\) is invertible, and the matrix appearing in the right half of the reduced augmented matrix is \(\inv{A}\text{.}\) If the left half cannot be reduced to \(I\text{,}\) then \(A\) is not invertible.

The last statement of the procedure will be justified by Theorem 6.5.2 in Subsection 6.5.2. See Subsection 6.4.3 for an example of carrying out this procedure.

Pattern.

Note that Procedure 6.3.7 keeps track of the elementary matrices involved in row reducing a matrix \(A\) for us, and automatically applies them to the identity (effectively multiplying them together) to produce the inverse on the right:

\begin{align} \left[\begin{array}{c|c} A \amp I \end{array}\right] \amp \xrightarrow[\text{operation}]{\text{first}} \left[\begin{array}{c|c} E_1 A \amp E_1 I \end{array}\right]\notag\\ \notag\\ \amp \xrightarrow[\text{operation}]{\text{second}} \left[\begin{array}{c|c} E_2 E_1 A \amp E_2 E_1 I \end{array}\right]\notag\\ \notag\\ \amp \qquad \vdots\notag\\ \notag\\ \amp \xrightarrow[\text{operation}]{\text{last}} \left[\begin{array}{c|c} E_{\ell} \dotsm E_2 E_1 A \amp E_{\ell} \dotsm E_2 E_1 I \end{array}\right] = \left[\begin{array}{c|c} I \amp \inv{A} \end{array}\right]\text{.}\label{equation-elem-matrices-concepts-inversion-by-row-red-pattern}\tag{\(\star\)} \end{align}
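As a concrete instance of this pattern, here is the made-up \(2 \times 2\) matrix \(C\) from earlier run through Procedure 6.3.7:

\begin{equation*} \left[\begin{array}{rr|rr} 1 \amp 2 \amp 1 \amp 0 \\ 0 \amp 3 \amp 0 \amp 1 \end{array}\right] \xrightarrow{\frac{1}{3} R_2 \quad} \left[\begin{array}{rr|rr} 1 \amp 2 \amp 1 \amp 0 \\ 0 \amp 1 \amp 0 \amp \frac{1}{3} \end{array}\right] \xrightarrow{R_1 - 2 R_2 \quad} \left[\begin{array}{rr|rr} 1 \amp 0 \amp 1 \amp -\frac{2}{3} \\ 0 \amp 1 \amp 0 \amp \frac{1}{3} \end{array}\right]\text{,} \end{equation*}

so that the right half of the final augmented matrix is \(\inv{C}\text{,}\) in agreement with the product of elementary matrices \(F_2 F_1\) computed earlier.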