Section 6.5 Theory
As mentioned, elementary matrices are precisely the connection we need between systems of equations and row operations on one hand and matrix multiplication and inverses on the other.
Subsection 6.5.1 Inverses of elementary matrices
Let’s first record an important property of elementary matrices we encountered in Section 6.3.
Lemma 6.5.1. Elementary is invertible.
Every elementary matrix is invertible, and its inverse is also an elementary matrix.
Proof.
We have already essentially proved this statement in Subsection 6.3.3.
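For example, for $2 \times 2$ matrices, the row operation $R_2 \to R_2 + 3R_1$ (a sample operation chosen just for illustration) and its reversal $R_2 \to R_2 - 3R_1$ have elementary matrices

$$E = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}, \qquad E^{-1} = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix},$$

and indeed $E E^{-1} = I$. Note that $E^{-1}$ is itself an elementary matrix, corresponding to the reverse operation.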
Subsection 6.5.2 Inverses versus row operations
Now let’s connect inverses to row reduction via elementary matrices.
Theorem 6.5.2. Characterizations of invertibility.
For a square matrix $A$, the following are equivalent.

1. Matrix $A$ is invertible.
2. Every linear system that has $A$ as a coefficient matrix has one unique solution.
3. The homogeneous system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
4. There is some linear system that has $A$ as a coefficient matrix and has one unique solution.
5. The rank of $A$ is equal to the size of $A$.
6. The RREF of $A$ is the identity.
7. Matrix $A$ can be expressed as a product of some number of elementary matrices.
Proof.
We will show that each statement of the theorem implies the next.
Whenever Statement 1 is true, then so is Statement 2.
We have already seen in Proposition 5.5.10 that when $A$ is invertible, then also every system with $A$ as coefficient matrix will have one unique solution.
Whenever Statement 2 is true, then so is Statement 3.
If Statement 2 is true about $A$, then every system with $A$ as coefficient matrix has one unique solution. In particular, the homogeneous system $A\mathbf{x} = \mathbf{0}$ (i.e. the case $\mathbf{b} = \mathbf{0}$) has one unique solution. But we know that a homogeneous system always has the trivial solution $\mathbf{x} = \mathbf{0}$, so that must be the one unique solution.
Whenever Statement 3 is true, then so is Statement 4.
We need to verify that there is at least one example of a system with coefficient matrix $A$ that has one unique solution. But we are already assuming that the homogeneous system $A\mathbf{x} = \mathbf{0}$ has one unique solution (the trivial one), so the required example is provided by taking $\mathbf{b} = \mathbf{0}$.
Whenever Statement 4 is true, then so is Statement 5.
Suppose that Statement 4 is true, so that there is at least one example of a system $A\mathbf{x} = \mathbf{b}$ that has one unique solution. Imagine trying to solve this system by row reducing the associated augmented matrix:

$$\left[\begin{array}{c|c} A & \mathbf{b} \end{array}\right] \longrightarrow \left[\begin{array}{c|c} \text{RREF of } A & \mathbf{b}' \end{array}\right],$$

where $\mathbf{b}'$ is whatever appears in the "equals" column after all of our row operations. When we have arrived at the RREF of $A$ in the coefficient-matrix part on the left and are ready to solve the simplified system, there should not be any free variables: free variables would lead to parameters, and hence infinitely many solutions, whereas we are assuming that this particular system has only one unique solution. So every column in the RREF of $A$ must have a leading one. By definition, the rank of $A$ is equal to the number of leading ones in its RREF, and so for this $A$ the rank is equal to the number of columns. But $A$ is square, so the number of columns is the same as the size of $A$.
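For instance, the small sample system $x_1 + x_2 = 3$, $x_1 - x_2 = 1$ (chosen here purely for illustration) has the unique solution $x_1 = 2$, $x_2 = 1$, and its reduction plays out as

$$\left[\begin{array}{rr|r} 1 & 1 & 3 \\ 1 & -1 & 1 \end{array}\right] \longrightarrow \left[\begin{array}{rr|r} 1 & 0 & 2 \\ 0 & 1 & 1 \end{array}\right].$$

Every column of the RREF of the coefficient matrix has a leading one, so the rank is $2$, which is the size of the matrix.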
Whenever Statement 5 is true, then so is Statement 6.
If $A$ is square, so is its RREF, and both matrices have the same size. And if the rank of $A$ is equal to its size, then every column in the RREF of $A$ must have a leading one, and these leading ones must march down the diagonal of the RREF. In a RREF matrix, a column that contains a leading one must have every other entry equal to zero. Thus, the RREF of $A$ must be the identity matrix.
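For example, in the $3 \times 3$ case, a RREF with a leading one in every column, leading ones marching down the diagonal, and zeros in every other entry of those columns is forced to be

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I_3.$$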
Whenever Statement 6 is true, then so is Statement 7.
Suppose that Statement 6 is true, so that $A$ can be reduced to the identity. That is, $A$ can be reduced to $I$ by some sequence of elementary row operations. Each of these operations has a corresponding elementary matrix, so there is some collection of elementary matrices $E_1, E_2, \dots, E_k$ so that

$$E_k \cdots E_2 E_1 A = I. \tag{✶}$$

Now, by Lemma 6.5.1, each of $E_1, E_2, \dots, E_k$ is invertible. Therefore, so is the product $E_k \cdots E_2 E_1$, with inverse

$$(E_k \cdots E_2 E_1)^{-1} = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$$

(Rule 4 of Proposition 5.5.5). Using this inverse, we can isolate $A$ in (✶) above:

$$A = (E_k \cdots E_2 E_1)^{-1} I = E_1^{-1} E_2^{-1} \cdots E_k^{-1}.$$

So, we have expressed $A$ as a product of the inverses of a collection of elementary matrices. But by Lemma 6.5.1, each of these inverses is actually an elementary matrix as well, and so we really have expressed $A$ as a product of a collection of elementary matrices, as desired.
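To make this concrete, consider the sample matrix $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ (chosen only for illustration). It reduces to $I$ by the operations $R_2 \to R_2 - 3R_1$, then $R_2 \to -\tfrac{1}{2}R_2$, then $R_1 \to R_1 - 2R_2$, with elementary matrices

$$E_1 = \begin{bmatrix} 1 & 0 \\ -3 & 1 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 1 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix},$$

so that $E_3 E_2 E_1 A = I$, and hence

$$A = E_1^{-1} E_2^{-1} E_3^{-1} = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix},$$

a product of elementary matrices, as the theorem promises.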
Whenever Statement 7 is true, then so is Statement 1.
If $A$ is equal to a product of elementary matrices, then since each of those elementary matrices is invertible (Lemma 6.5.1), their product (and hence $A$) is also invertible (Rule 4 of Proposition 5.5.5).
Conclusion.
We now have a circle of logical deductions. Starting with the knowledge that any one of the seven statements is true for a particular matrix $A$, we can deduce from the logic above that the next statement is true for $A$, and then from that, that the next statement is true for $A$, and so on. When we get to the last statement, the logic above then requires that the first statement will also be true for $A$, and we can continue from there on to the second statement, and so on, until we are sure that all statements are true for $A$. Therefore, the seven statements are equivalent.
Remark 6.5.3.
- In this theorem, the claim that these seven statements are equivalent for a particular matrix $A$ means that if we know that any one of the statements is true for $A$, then it must be that all seven statements are true for $A$. For example, if we had a square matrix that we were able to row reduce to the identity, then the theorem tells us that that matrix must be invertible, that every linear system with that matrix as coefficient matrix has one unique solution, and so on. On the other hand, if we know that any one of the statements is false for a particular matrix $A$, then it must be that all seven statements are false for $A$. As soon as one statement is known to be false for a particular square matrix, it becomes impossible for any of the other statements to be true for that matrix, since knowing that this other statement is true would imply that all seven statements are true for it, including the original statement that we already knew was false. And a statement cannot be both true and false for a particular matrix.
- It may seem unnecessary or even redundant to have all three of Statements 2–4 included in the list in Theorem 6.5.2, but these statements are definitely not the same. The equivalence of Statement 1 and Statement 2 tells us that when a matrix is invertible, then every system corresponding to that coefficient matrix has one unique solution, and vice versa. But the reverse connection would be difficult to use in practice: would you want to check that every system with a particular square coefficient matrix has one unique solution in order to conclude that the matrix is invertible? There are infinitely many possible systems to check! The equivalence of Statement 4 and Statement 1 makes the reverse logic easier in practice: if you have just one example of a linear system with a square coefficient matrix that has one unique solution, then you can conclude that the matrix is invertible. Even better, the equivalence of Statement 3 and Statement 1 tells you that you can just check the corresponding homogeneous system as your one example of a system with that particular coefficient matrix that has only one unique solution. Furthermore, the equivalence of Statement 2 and Statement 4 tells you that once you know one example of a system with that particular coefficient matrix that has only one unique solution, then you can conclude without checking that every system with that coefficient matrix has only one unique solution.
- In the proof of Theorem 6.5.2, the most important link is the one between Statement 6 and Statement 7, as this equivalence provides the link between row reducing and elementary matrices. In practice, the link between Statement 7 and Statement 1 is also important, as it helps us to compute the inverse of a matrix. But in further developing matrix theory, the most important link is the one between Statement 1 and Statement 3, as it will allow us to obtain further general properties of inverses. In particular, these statements will figure into the proofs of the propositions in the next subsection.
Subsection 6.5.3 More properties of inverses
Using our new connections between inverses and row operations, we can expand our knowledge about inverses in general.
Proposition 6.5.4. Left inverse is inverse.

Suppose $A$ and $B$ are square matrices of the same size. If $BA = I$, then $A$ is invertible with $A^{-1} = B$.
Proof.
We are assuming that we have square matrices $A$ and $B$ so that $BA = I$. We would first like to check that $A$ is invertible. By Theorem 6.5.2, we can instead check that the homogeneous system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution. So suppose that $\mathbf{x} = \mathbf{x}_0$ is a solution to this system, so that $A\mathbf{x}_0 = \mathbf{0}$. But then we can carry out two different simplifications of $BA\mathbf{x}_0$, one using the assumption $BA = I$ and one using the assumption $A\mathbf{x}_0 = \mathbf{0}$:

$$BA\mathbf{x}_0 = (BA)\mathbf{x}_0 = I\mathbf{x}_0 = \mathbf{x}_0, \qquad BA\mathbf{x}_0 = B(A\mathbf{x}_0) = B\mathbf{0} = \mathbf{0}.$$

Since both simplifications are correct, we have $\mathbf{x}_0 = \mathbf{0}$. So what we have discovered is that because there exists a matrix $B$ so that $BA = I$, whenever we think we have a solution to the system $A\mathbf{x} = \mathbf{0}$, that solution turns out to be the trivial solution. Thus, $A\mathbf{x} = \mathbf{0}$ must have only the trivial solution, and hence $A$ is invertible (Theorem 6.5.2).

Now that we know that $A$ is invertible, we can use its inverse to manipulate the equality $BA = I$:

$$BA = I \quad \implies \quad (BA)A^{-1} = IA^{-1} \quad \implies \quad B(AA^{-1}) = A^{-1} \quad \implies \quad B = A^{-1}.$$

So, we have that $A$ is invertible and $A^{-1} = B$, as desired.
Remark 6.5.5.
Recall that, by definition, to verify that a matrix $B$ is the inverse of a matrix $A$, we would need to check that both $BA = I$ and $AB = I$ are true. We needed both orders of multiplication in the definition of inverse matrix because order of matrix multiplication matters, and we couldn't be sure that both $BA$ and $AB$ would produce the same result. Via the theory of elementary matrices, we now have the above proposition that allows us to check an inverse by checking only one order of multiplication:

$$BA = I \quad \implies \quad B = A^{-1}.$$

There is nothing special about $BA = I$ versus $AB = I$. The previous and following propositions combine to tell us we need to verify only one of $BA = I$ or $AB = I$ to check that $B$ is the inverse of $A$.
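For instance, with the illustrative pair

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix},$$

a single computation verifying $BA = I$ is enough to conclude that $B = A^{-1}$; the propositions guarantee $AB = I$ without checking it (and indeed it holds).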
Proposition 6.5.6. Right inverse is inverse.

Suppose $A$ and $B$ are square matrices of the same size. If $AB = I$, then $A$ is invertible with $A^{-1} = B$.
Proof.
Here, we are assuming that we have square matrices $A$ and $B$ so that $AB = I$, and we again would like to know that $A$ is invertible and that $A^{-1} = B$. However, instead of appealing back to Theorem 6.5.2, we can use Proposition 6.5.4 with the roles of $A$ and $B$ reversed: since $AB = I$, Proposition 6.5.4 says that $B$ must be invertible and that $B^{-1} = A$. But inverses are themselves invertible (Rule 1 of Proposition 5.5.5), so $A$ is invertible with

$$A^{-1} = (B^{-1})^{-1} = B,$$

as desired.
In Proposition 5.5.5, we learned that products and powers of invertible matrices are always invertible. It turns out that a product of matrices can only be invertible if the matrices making up the product are all invertible, and a power of a matrix can only be invertible if the base matrix is invertible.
Proposition 6.5.7. Invertible products have invertible factors.
1. If the product $AB$ is invertible, where $A$ and $B$ are square matrices of the same size, then both $A$ and $B$ must be invertible.
2. If the product $A_1 A_2 \cdots A_m$ is invertible, where $A_1, A_2, \dots, A_m$ are square matrices all of the same size, then each of $A_1, A_2, \dots, A_m$ must be invertible.
3. If the power $A^k$ is invertible, where $A$ is a square matrix and $k$ is a positive integer, then $A$ must be invertible.
Proof of Statement 1.
Suppose that $AB$ is invertible. Then it has an inverse; let's call it $C$ instead of $(AB)^{-1}$. By definition, this means that

$$(AB)C = I \qquad \text{and} \qquad C(AB) = I.$$

Using associativity of matrix multiplication, we can regroup the second equality as

$$(CA)B = I.$$

Applying Proposition 6.5.4, we may conclude that $B$ is invertible with inverse $CA$. Similarly, since $C$ is the inverse of $AB$, we may write

$$(AB)C = I$$

and rewrite

$$A(BC) = I.$$

Applying Proposition 6.5.6 this time, we may conclude that $A$ is invertible with inverse $BC$.
Proof of Statement 2.
We leave the proof of this statement to you, the reader.
Proof of Statement 3.

Since the power $A^k$ is the product $A A \cdots A$ with $k$ factors, this is the special case of Statement 2 in which each of the factors is taken to be equal to $A$.
As in Proposition 5.5.7, we can turn the statements of Proposition 6.5.7 around to create new facts about singular matrices. Note that the statements below are new statements about singular matrices, related but not equivalent to the statements in Proposition 5.5.7.
Proposition 6.5.8. Product of singular is singular.
1. If one or both of $A$ or $B$ are singular, where $A$ and $B$ are square matrices of the same size, then the product $AB$ will also be singular.
2. If one or more of the matrices $A_1, A_2, \dots, A_m$ are singular, where $A_1, A_2, \dots, A_m$ are square matrices all of the same size, then the product $A_1 A_2 \cdots A_m$ will also be singular.
3. If $A$ is a singular square matrix, then every power $A^k$ ($k$ a positive integer) will also be singular.
Proof of Statement 1.
If the product $AB$ were invertible, then Statement 1 of Proposition 6.5.7 says that each of $A$ and $B$ would have to be invertible. But we are assuming that at least one of them is not, so it is not possible for the product $AB$ to be invertible.
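As a quick illustrative check, the singular matrix $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ produces

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix},$$

and the zero row in the product means its RREF cannot be the identity, so the product is singular (Theorem 6.5.2).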
Proof of Statement 2.
The proof of this statement is similar to the one above for Statement 1, relying on Statement 2 of Proposition 6.5.7 instead. We leave the details to you, the reader.
Proof of Statement 3.
This proof again is similar to that above for Statement 1, relying on Statement 3 of Proposition 6.5.7 instead. Alternatively, one could view this as the special case of Statement 2 of the current proposition, where each factor is taken to be equal to $A$.
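For instance, the singular sample matrix $N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ has

$$N^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},$$

and the zero matrix, like every higher power $N^k$, is certainly singular.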
Finally, we can use the link between Statement 1 and Statement 6 of Theorem 6.5.2 to make Proposition 5.5.4 more precise.
Proposition 6.5.9. $2 \times 2$ invertibility.

The $2 \times 2$ matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

is invertible if and only if $ad - bc \neq 0$.
Proof outline.
We explored this in Discovery 6.6.
Start with the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and row reduce to see whether it is possible to get to the identity. But in the operations we choose, we need to be careful not to divide by zero, because the variable entries could have any values, including zero. So it will be necessary to break into cases, such as $a \neq 0$ versus $a = 0$, and the row reduction steps chosen will differ in the different cases. Ultimately, it will be possible to get the identity as the RREF of $A$ precisely when $ad - bc \neq 0$, and it will be impossible when $ad - bc = 0$. From here, we may appeal to the equivalence of Statement 1 and Statement 6 of Theorem 6.5.2.
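For example (with entries chosen purely as a sample), $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ has $ad - bc = 1 \cdot 4 - 2 \cdot 3 = -2 \neq 0$ and so is invertible, while $\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ has $ad - bc = 1 \cdot 4 - 2 \cdot 2 = 0$ and so is singular: its second row is twice its first, so its RREF cannot be the identity.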
Subsection 6.5.4 Solution sets of row equivalent matrices
Elementary matrices also give us the tool we need to prove that row equivalent matrices represent systems with the same solution set. We first recorded the following as Theorem 2.5.5 in Subsection 2.5.2, but did not prove it there. We repeat the theorem here, and include a proof.
Theorem 6.5.10.
Row equivalent matrices represent systems of equations that have the same solution set.
Proof.
Consider systems $A\mathbf{x} = \mathbf{b}_1$ and $B\mathbf{x} = \mathbf{b}_2$, where the augmented matrices

$$\left[\begin{array}{c|c} A & \mathbf{b}_1 \end{array}\right] \qquad \text{and} \qquad \left[\begin{array}{c|c} B & \mathbf{b}_2 \end{array}\right]$$

are row equivalent. Then there exists a sequence of elementary row operations that can be applied to $\left[\begin{array}{c|c} A & \mathbf{b}_1 \end{array}\right]$ to produce $\left[\begin{array}{c|c} B & \mathbf{b}_2 \end{array}\right]$. If we set $E$ to be the product of all the elementary matrices corresponding to the operations in this sequence, then we have

$$E \left[\begin{array}{c|c} A & \mathbf{b}_1 \end{array}\right] = \left[\begin{array}{c|c} B & \mathbf{b}_2 \end{array}\right].$$

Because of the way matrix multiplication acts on columns, we then have

$$E \left[\begin{array}{c|c} A & \mathbf{b}_1 \end{array}\right] = \left[\begin{array}{c|c} EA & E\mathbf{b}_1 \end{array}\right],$$

and so we also have

$$EA = B \qquad \text{and} \qquad E\mathbf{b}_1 = \mathbf{b}_2.$$

Furthermore, we know that every elementary matrix is invertible (Lemma 6.5.1), and that products of invertible matrices are invertible (Rule 4 of Proposition 5.5.5), so we conclude that $E$ is invertible. Therefore, we also have

$$E^{-1}B = A \qquad \text{and} \qquad E^{-1}\mathbf{b}_2 = \mathbf{b}_1.$$
We are now in a position to verify that a solution to one system is also a solution to the other system.
A solution to system $A\mathbf{x} = \mathbf{b}_1$ is also a solution to $B\mathbf{x} = \mathbf{b}_2$.

Suppose $\mathbf{x} = \mathbf{x}_0$ solves system $A\mathbf{x} = \mathbf{b}_1$, so that $A\mathbf{x}_0 = \mathbf{b}_1$ is true. Then,

$$B\mathbf{x}_0 = (EA)\mathbf{x}_0 = E(A\mathbf{x}_0) = E\mathbf{b}_1 = \mathbf{b}_2.$$

Thus, $\mathbf{x}_0$ is also a solution to system $B\mathbf{x} = \mathbf{b}_2$.
A solution to system $B\mathbf{x} = \mathbf{b}_2$ is also a solution to $A\mathbf{x} = \mathbf{b}_1$.

Suppose $\mathbf{x} = \mathbf{x}_0$ solves system $B\mathbf{x} = \mathbf{b}_2$, so that $B\mathbf{x}_0 = \mathbf{b}_2$ is true. Then,

$$A\mathbf{x}_0 = (E^{-1}B)\mathbf{x}_0 = E^{-1}(B\mathbf{x}_0) = E^{-1}\mathbf{b}_2 = \mathbf{b}_1.$$

Thus, $\mathbf{x}_0$ is also a solution to system $A\mathbf{x} = \mathbf{b}_1$.
Conclusion.
Since we have now shown that every solution of one system is a solution to the other system, both systems must have exactly the same solution set.
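To illustrate with a small sample system (invented here just for this purpose), the augmented matrices

$$\left[\begin{array}{rr|r} 1 & 2 & 5 \\ 3 & 4 & 11 \end{array}\right] \qquad \text{and} \qquad \left[\begin{array}{rr|r} 1 & 0 & 1 \\ 0 & 1 & 2 \end{array}\right]$$

are row equivalent, and the two systems they represent have exactly the same solution set: the single solution $x_1 = 1$, $x_2 = 2$.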