
Section 6.5 Theory

As mentioned, elementary matrices are precisely the connection we need between systems of equations and row operations on one hand and matrix multiplication and inverses on the other.

Subsection 6.5.1 Inverses of elementary matrices

Let’s first record an important property of elementary matrices we encountered in Section 6.3.

Proof.
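Independently of how the lemma is proved, here is a quick numerical illustration of it (a minimal sketch in Python with NumPy; the particular row operations and the 3×3 size are our own illustrative choices, not taken from the text): the elementary matrix for a row operation is inverted by the elementary matrix for the reverse operation.

    import numpy as np

    # Elementary matrix for the operation "multiply row 1 by 3" (3x3 case).
    E_scale = np.eye(3)
    E_scale[0, 0] = 3.0
    # Its inverse is the elementary matrix for "multiply row 1 by 1/3".
    E_scale_inv = np.eye(3)
    E_scale_inv[0, 0] = 1.0 / 3.0
    print(np.allclose(np.linalg.inv(E_scale), E_scale_inv))  # True

    # Elementary matrix for the operation "add 2 times row 1 to row 2".
    E_add = np.eye(3)
    E_add[1, 0] = 2.0
    # Its inverse is the elementary matrix for "add -2 times row 1 to row 2".
    E_add_inv = np.eye(3)
    E_add_inv[1, 0] = -2.0
    print(np.allclose(np.linalg.inv(E_add), E_add_inv))      # True

In both cases the inverse of an elementary matrix is again an elementary matrix, corresponding to the row operation that undoes the original one.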

Subsection 6.5.2 Inverses versus row operations

Now let’s connect inverses to row reduction via elementary matrices.

Proof.

We will show that each statement of the theorem implies the next.
Whenever Statement 1 is true, then so is Statement 2.
We have already seen in Proposition 5.5.10 that when A is invertible, every system with A as coefficient matrix has one unique solution.
Whenever Statement 2 is true, then so is Statement 3.
If Statement 2 is true about A, then every system Ax=b with A as coefficient matrix has one unique solution. In particular, the homogeneous system Ax=0 (i.e. in the case that b=0) has one unique solution. But we know that a homogeneous system always has the trivial solution x=0, so that must be the one unique solution.
Whenever Statement 3 is true, then so is Statement 4.
We need to verify that there is at least one example of a system Ax=b that has one unique solution. But we are already assuming that the homogeneous system Ax=0 has one unique solution, so the required example is provided by taking b=0.
Whenever Statement 4 is true, then so is Statement 5.
Suppose that Statement 4 is true, so that there is at least one example of a system Ax=b that has one unique solution. Imagine trying to solve this system by row reducing the associated augmented matrix:
[\,A \mid b\,] \xrightarrow{\text{row reduce}} [\,\mathrm{RREF}(A) \mid b'\,],
where b' is whatever appears in the “equals” column after all of our row operations. When we have arrived at the RREF of A in the coefficient matrix part on the left, and are ready to solve the simplified system, there should not be any free variables, because free variables would lead to parameters, and hence infinitely many solutions, whereas we are assuming that this particular system has only one unique solution. So every column in the RREF of A must have a leading one. By definition, the rank of A is equal to the number of leading ones in its RREF, and so for this A the rank is equal to the number of columns. But A is square, so the number of columns is the same as the size of A.
Whenever Statement 5 is true, then so is Statement 6.
If A is square, so is its RREF, and both matrices have the same size. And if the rank of A is equal to its size, then every column in the RREF of A must have a leading one, and these leading ones must march down the diagonal. In a matrix in RREF, a column that contains a leading one must have every other entry equal to zero. Thus, the RREF of A must be the identity matrix.
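To see the chain from Statement 4 through Statement 6 on a concrete matrix (a minimal sketch in Python with SymPy; the 3×3 matrix below is an arbitrary invertible example of ours, not from the text):

    import sympy as sp

    A = sp.Matrix([[1, 2, 1],
                   [0, 1, 3],
                   [2, 5, 0]])   # an arbitrary invertible 3x3 example

    R, pivots = A.rref()         # R is the RREF of A; pivots lists the pivot columns
    print(R)                     # the 3x3 identity matrix
    print(len(pivots))           # 3 leading ones, so rank(A) = 3 = size of A

Every column of the RREF contains a leading one, the rank equals the size, and the RREF is exactly the identity matrix.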
Whenever Statement 6 is true, then so is Statement 7.
Suppose that Statement 6 is true, so that A can be reduced to the identity. That is, A can be reduced to I by some sequence of elementary row operations. Each of these operations has a corresponding elementary matrix, so there is some collection of elementary matrices E_1, E_2, \dots, E_{k-1}, E_k so that
(✶)   E_k \cdots E_2 E_1 A = I.
Now, by Lemma 6.5.1, each of E_1, E_2, \dots, E_{k-1}, E_k is invertible. Therefore, so is the product
E_k \cdots E_2 E_1,
with inverse
E_1^{-1} E_2^{-1} \cdots E_{k-1}^{-1} E_k^{-1}.
Using this inverse, we can isolate A in (✶) above:
E_k \cdots E_2 E_1 A = I
\implies (E_k \cdots E_2 E_1)^{-1} (E_k \cdots E_2 E_1) A = (E_k \cdots E_2 E_1)^{-1} I
\implies I A = (E_k \cdots E_2 E_1)^{-1}
\implies A = E_1^{-1} E_2^{-1} \cdots E_k^{-1}.
So, we have A expressed as a product of the inverses of a collection of elementary matrices. But by Lemma 6.5.1, each of these inverses is actually an elementary matrix as well, and so we really have A expressed as a product of a collection of elementary matrices, as desired.
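As a concrete instance of this argument (a minimal sketch in Python with NumPy; the 2×2 matrix and the particular row operations are our own illustrative choices): reducing A = [[1, 2], [3, 4]] to I takes three row operations, and the corresponding elementary matrices express A as a product of their inverses.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    # Elementary matrices for the row operations that reduce A to I:
    E1 = np.array([[1.0, 0.0], [-3.0, 1.0]])   # R2 := R2 - 3*R1
    E2 = np.array([[1.0, 0.0], [0.0, -0.5]])   # R2 := (-1/2)*R2
    E3 = np.array([[1.0, -2.0], [0.0, 1.0]])   # R1 := R1 - 2*R2

    print(np.allclose(E3 @ E2 @ E1 @ A, np.eye(2)))     # True: E3 E2 E1 A = I

    # Isolating A writes it as a product of the inverses of these elementary
    # matrices, and each of those inverses is itself an elementary matrix.
    inv = np.linalg.inv
    print(np.allclose(inv(E1) @ inv(E2) @ inv(E3), A))  # True: A = E1^-1 E2^-1 E3^-1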
Whenever Statement 7 is true, then so is Statement 1.
If A is equal to a product of elementary matrices, then since each of those elementary matrices is invertible (Lemma 6.5.1), their product (and hence A) is also invertible (Rule 4 of Proposition 5.5.5).
Conclusion.
We now have a circle of logical deductions. Starting with the knowledge that any one of the seven statements is true for a particular matrix A, we can deduce from the logic above that the next statement is true for A, and then from that, that the next statement is true for A, and so on. When we get to the last statement, the logic above then requires that the first statement will also be true for A, and we can continue from there on to the second statement, and so on, until we are sure that all statements are true for A. Therefore, the seven statements are equivalent.

Remark 6.5.3.

  • In this theorem, the claim that these seven statements are equivalent for a particular matrix A means that if we know that any one of the statements is true for A, then it must be that all seven statements are true for A. For example, if we had a square matrix that we were able to row reduce to the identity, then the theorem tells us that that matrix must be invertible, that every linear system with that matrix as coefficient matrix has one unique solution, and so on. On the other hand, if we know that any one of the statements is false for a particular matrix A, then it must be that all seven statements are false for A. As soon as one statement is known to be false for a particular square matrix, it becomes impossible for any of the other statements to be true for that matrix, since knowing that this other statement is true implies that all seven statements are true for it, including the original statement that we already knew was false. And a statement cannot be both true and false for a particular matrix A.
  • It may seem unnecessary or even redundant to have all three of Statements 2–4 included in the list in Theorem 6.5.2, but these statements are definitely not the same. The equivalence of Statement 1 and Statement 2 tells us that when a matrix is invertible, every system corresponding to that coefficient matrix has one unique solution, and vice versa. But the reverse connection would be difficult to use in practice: would you want to check that every system with a particular square coefficient matrix has one unique solution in order to conclude that the matrix is invertible? There are infinitely many possible systems to check! The equivalence of Statement 4 and Statement 1 makes the reverse logic easier in practice: if you have just one example of a linear system with a square coefficient matrix that has one unique solution, then you can conclude that the matrix is invertible. Even better, the equivalence of Statement 3 and Statement 1 tells you that you can just check the corresponding homogeneous system as your one example of a system with that particular coefficient matrix that has only one unique solution (a small computational illustration of this shortcut follows this remark). Furthermore, the equivalence of Statement 2 and Statement 4 tells you that once you know one example of a system with that particular coefficient matrix that has only one unique solution, then you can conclude without checking that every system with that coefficient matrix has only one unique solution.
  • In the proof of Theorem 6.5.2, the most important link is the one between Statement 6 and Statement 7, as this equivalence provides the link between row reducing and elementary matrices. In practice, the link between Statement 7 and Statement 1 is also important, as it helps us to compute the inverse of a matrix. But in further developing matrix theory, the most important link is the one between Statement 1 and Statement 3, as it will allow us to obtain further general properties of inverses. In particular, these statements will figure into the proofs of the propositions in the next subsection.
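To illustrate the practical shortcut described in the second point above (a minimal sketch in Python with NumPy; both matrices are arbitrary examples of ours): for an invertible coefficient matrix the homogeneous system has only the trivial solution, while for a singular one it has nonzero solutions.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 1.0]])   # invertible
    B = np.array([[1.0, 2.0],
                  [2.0, 4.0]])   # singular (second row is twice the first)

    # For the invertible matrix, Ax = 0 has only the trivial solution.
    print(np.linalg.solve(A, np.zeros(2)))   # [0. 0.]

    # For the singular matrix, x = (2, -1) is a nonzero solution of Bx = 0,
    # so by Theorem 6.5.2 the matrix B cannot be invertible.
    print(B @ np.array([2.0, -1.0]))         # [0. 0.]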

Subsection 6.5.3 More properties of inverses

Using our new connections between inverses and row operations, we can expand our knowledge about inverses in general.

Proof.

We are assuming that we have square matrices A and B so that BA=I. We would first like to check that A is invertible. By Theorem 6.5.2, we can instead check that the homogeneous system Ax=0 has only the trivial solution. So suppose that x0 is a solution to this system, so that Ax0=0. But then we can carry out two different simplifications of BAx0, one using the assumption BA=I and one using the assumption Ax0=0:
BAx_0 = (BA)x_0 = I x_0 = x_0, \qquad\text{and}\qquad BAx_0 = B(Ax_0) = B0 = 0.
Since both simplifications are correct, we have x0=0. So what we have discovered is that because there exists a matrix B so that BA=I, then whenever we think we have a solution x0 to the system Ax=0, that solution turns out to be the trivial solution. Thus, Ax=0 must have only the trivial solution, and hence A is invertible (Theorem 6.5.2).
Now that we know that A is invertible, we can use its inverse to manipulate the equality BA=I:
BA = I
\implies (BA)A^{-1} = I A^{-1}
\implies B(AA^{-1}) = A^{-1}
\implies BI = A^{-1}
\implies B = A^{-1}.
So, we have that A is invertible and A^{-1} = B, as desired.

Remark 6.5.5.

Recall that by definition, to verify that a matrix B is the inverse of a matrix A, we would need to check that both BA=I and AB=I are true. We needed both orders of multiplication in the definition of inverse matrix because order of matrix multiplication matters, and we couldn’t be sure that both BA and AB would produce the same result. Via the theory of elementary matrices, we now have the above proposition that allows us to check an inverse by only checking one order of multiplication: BA=I.
There is nothing special about BA=I versus AB=I. The previous and following propositions combine to tell us that we need to verify only one of BA=I or AB=I to check that B is the inverse of A.
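A quick numerical check of this convenience (a minimal sketch in Python with NumPy; the matrix A is an arbitrary invertible example of ours): once BA = I holds for square matrices, AB = I follows automatically.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 5.0]])
    B = np.linalg.inv(A)                     # chosen so that BA = I

    print(np.allclose(B @ A, np.eye(2)))     # True: BA = I ...
    print(np.allclose(A @ B, np.eye(2)))     # True: ... and AB = I comes for free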

Proof.

Here, we are assuming that we have square matrices A and B so that AB=I, and we again would like to know that A is invertible and that B = A^{-1}. However, instead of appealing back to Theorem 6.5.2, we can use Proposition 6.5.4 with the roles of A and B reversed: since AB=I, Proposition 6.5.4 says that B must be invertible and that A = B^{-1}. But inverses are themselves invertible (Rule 1 of Proposition 5.5.5), so A is invertible with
A^{-1} = (B^{-1})^{-1} = B,
as desired.
In Proposition 5.5.5, we learned that products and powers of invertible matrices are always invertible. It turns out that a product of matrices can only be invertible if the matrices making up the product are all invertible, and a power of a matrix can only be invertible if the base matrix is invertible.

Proof of Statement 1.

Suppose that MN is invertible. Then it has an inverse; let’s call it X instead of (MN)^{-1}. By definition, this means that
X(MN)=I.
Using associativity of matrix multiplication (Rule 1.e), we may rewrite
(XM)N=I.
Applying Proposition 6.5.4 with B=XM and A=N, we may conclude that N is invertible with inverse
N^{-1} = XM.
Similarly, since X is the inverse of MN, we may write
(MN)X=I
and rewrite
M(NX)=I.
Applying Proposition 6.5.6 this time, we may conclude that M is invertible with inverse
M^{-1} = NX.
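The two formulas obtained in this proof can be checked numerically (a minimal sketch in Python with NumPy; M and N below are arbitrary invertible examples of ours):

    import numpy as np

    M = np.array([[1.0, 1.0], [0.0, 2.0]])
    N = np.array([[3.0, 0.0], [1.0, 1.0]])

    X = np.linalg.inv(M @ N)                       # X plays the role of (MN)^{-1}

    print(np.allclose(np.linalg.inv(N), X @ M))    # True: N^{-1} = XM
    print(np.allclose(np.linalg.inv(M), N @ X))    # True: M^{-1} = NX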

Proof of Statement 2.

We leave the proof of this statement to you, the reader.

Proof of Statement 3.

This is the special case of Statement 2 where each of M_1, M_2, \dots, M_{k-1}, M_k is equal to M.
As in Proposition 5.5.7, we can turn the statements of Proposition 6.5.7 around to create new facts about singular matrices. Note that the statements below are new statements about singular matrices, related but not equivalent to the statements in Proposition 5.5.7.

Proof of Statement 1.

If the product MN were invertible, then Statement 1 of Proposition 6.5.7 says that each of M and N would have to be invertible. But we are assuming that at least one of them is not, so it is not possible for the product MN to be invertible.
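For example (a minimal sketch in Python with NumPy; the matrices are illustrative choices of ours), multiplying an invertible matrix by a singular one produces a singular product:

    import numpy as np

    M = np.array([[1.0, 2.0], [0.0, 1.0]])   # invertible
    N = np.array([[1.0, 1.0], [1.0, 1.0]])   # singular (its rows are equal)

    print(np.linalg.det(M @ N))              # 0.0: the product MN is singular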

Proof of Statement 2.

Proof of Statement 3.

Finally, we can use the link between Statement 1 and Statement 6 of Theorem 6.5.2 to make Proposition 5.5.4 more precise.

Proof outline.

We explored this in Discovery 6.6.
Start with the matrix A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} and row reduce to see whether it is possible to get to the identity. But in the operations we choose, we need to be careful not to divide by zero, because the variable entries could be any values, including zero. So it will be necessary to break into cases, such as a = 0 versus a ≠ 0, and the row reduction steps chosen will differ in the different cases. Ultimately, it will be possible to get the identity as the RREF of A precisely when ad - bc ≠ 0, and it will be impossible when ad - bc = 0. From here, we may appeal to the equivalence of Statement 1 and Statement 6 of Theorem 6.5.2.
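For instance (a minimal sketch in Python with NumPy; the two matrices are our own examples): a nonzero value of ad - bc goes hand in hand with invertibility, and a zero value with failure to invert.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    print(1.0 * 4.0 - 2.0 * 3.0)   # ad - bc = -2.0, nonzero, so A is invertible
    print(np.linalg.inv(A))        # succeeds

    B = np.array([[1.0, 2.0],
                  [2.0, 4.0]])
    print(1.0 * 4.0 - 2.0 * 2.0)   # ad - bc = 0.0, so B is not invertible
    # np.linalg.inv(B) would raise numpy.linalg.LinAlgError: Singular matrix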

Subsection 6.5.4 Solution sets of row equivalent matrices

Elementary matrices also give us the tool we need to prove that row equivalent matrices represent systems with the same solution set. We first recorded the following as Theorem 2.5.5 in Subsection 2.5.2, but did not prove it there. We repeat the theorem here, and include a proof.

Proof.

Consider systems A_1 x = b_1 and A_2 x = b_2, where the augmented matrices
A_1' = [\,A_1 \mid b_1\,], \qquad A_2' = [\,A_2 \mid b_2\,]
are row equivalent. Then there exists a sequence of elementary row operations that can be applied to A_1' to produce A_2'. If we set E to be the product of all the elementary matrices corresponding to the operations in this sequence, then we have A_2' = E A_1'. Because of the way matrix multiplication acts on columns, we then have
[\,A_2 \mid b_2\,] = E [\,A_1 \mid b_1\,] = [\,E A_1 \mid E b_1\,],
and so we also have
A_2 = E A_1, \qquad b_2 = E b_1.
Furthermore, we know that every elementary matrix is invertible (Lemma 6.5.1), and that products of invertible matrices are invertible (Statement 4 of Proposition 5.5.5), so we conclude that E is invertible. Therefore, we also have
A_1 = E^{-1} A_2, \qquad b_1 = E^{-1} b_2.
We are now in a position to verify that a solution to one system is also a solution to the other system.
A solution to system A1x=b1 is also a solution to A2x=b2.
Suppose x=x1 solves system A1x=b1, so that A1x1=b1 is true. Then,
A_2 x_1 = (E A_1) x_1 = E (A_1 x_1) = E b_1 = b_2.
Thus, x=x1 is also a solution to system A2x=b2.
A solution to system A2x=b2 is also a solution to A1x=b1.
Suppose x=x2 solves system A2x=b2, so that A2x2=b2 is true. Then,
A_1 x_2 = (E^{-1} A_2) x_2 = E^{-1} (A_2 x_2) = E^{-1} b_2 = b_1.
Thus, x=x2 is also a solution to system A1x=b1.
Conclusion.
Since we have now shown that every solution of one system is a solution to the other system, both systems must have exactly the same solution set.
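Here is a small numerical illustration of this theorem (a minimal sketch in Python with NumPy; the system and the row operation are illustrative choices of ours): applying an elementary matrix E to an augmented matrix changes the system but not its solution.

    import numpy as np

    A1 = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
    b1 = np.array([5.0, 6.0])

    # Elementary matrix for the row operation R2 := R2 - 3*R1.
    E = np.array([[1.0, 0.0],
                  [-3.0, 1.0]])

    A2 = E @ A1    # coefficient matrix of the row equivalent system
    b2 = E @ b1    # its right-hand side

    x1 = np.linalg.solve(A1, b1)
    x2 = np.linalg.solve(A2, b2)
    print(np.allclose(x1, x2))   # True: both systems have the same solution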