Section 22.3 Concepts

Subsection 22.3.1 Linearity of coordinate vectors

As a first step in our goal of exploring the concept of transition matrix, we re-familiarized ourselves with coordinate vectors and reminded ourselves of their linearity properties:

\begin{align*} \rmatrixOf{\uvec{v} + \uvec{w}}{B} \amp = \rmatrixOf{\uvec{v}}{B} + \rmatrixOf{\uvec{w}}{B} \text{,} \amp \rmatrixOf{k \uvec{v}}{B} \amp = k \rmatrixOf{\uvec{v}}{B} \text{.} \end{align*}

We first encountered these properties in Discovery 19.6 and Subsubsection 19.3.5.2. We will record these properties formally in Subsection 22.5.1, but for now what is most important for our development of the concept of transition matrix is the extension of these properties to linear combinations:

\begin{gather} \rmatrixOf{k_1 \uvec{v}_1 + k_2 \uvec{v}_2 + \dotsb + k_m \uvec{v}_m}{B} = k_1 \rmatrixOf{\uvec{v}_1}{B} + k_2 \rmatrixOf{\uvec{v}_2}{B} + \dotsb + k_m \rmatrixOf{\uvec{v}_m}{B}\text{.}\label{equation-change-of-basis-concepts-linearity-coordvecs}\tag{\(\star\)} \end{gather}
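For readers who like to experiment, the linearity pattern (\(\star\)) is easy to check numerically. The following Python/NumPy sketch uses an invented basis of \(\R^2\) (not a basis from the text); each coordinate vector is computed by solving a linear system, as in the examples of Subsection 19.4.3.

```python
import numpy as np

# Hypothetical basis B of R^2, written as the columns of M.
M = np.array([[1.0,  1.0],
              [1.0, -1.0]])

def coord(v):
    # Coordinate vector [v]_B: solve M @ c = v for the coefficients c.
    return np.linalg.solve(M, v)

v = np.array([3.0, 1.0])
w = np.array([-2.0, 5.0])
k = 4.0

# [v + w]_B = [v]_B + [w]_B and [k v]_B = k [v]_B
assert np.allclose(coord(v + w), coord(v) + coord(w))
assert np.allclose(coord(k * v), k * coord(v))
```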

Subsection 22.3.2 Matrix-times-vector as a linear combination

Discovery 22.3 was a retread of Discovery 21.2, but from a different point of view. In Discovery 21.2, we discovered that a system \(A \uvec{x} = \uvec{b}\) is consistent if and only if the column vector \(\uvec{b}\) is somehow a linear combination of the columns of the coefficient matrix \(A\text{.}\) But in Discovery 22.3, we focused on exactly how the product \(A \uvec{x}\) is a linear combination of the columns of \(A\text{.}\)

Suppose \(A\) is an \(m \times n\) matrix, considered as a collection of column vectors:

\begin{equation*} A = \begin{bmatrix} | \amp | \amp \amp | \\ \uvec{a}_1 \amp \uvec{a}_2 \amp \dots \amp \uvec{a}_n \\ | \amp | \amp \amp | \\ \end{bmatrix}\text{.} \end{equation*}

We have seen a few times that a matrix times a standard basis vector is equal to the corresponding column of the matrix:

\begin{equation*} A \uvec{e}_j = \uvec{a}_j \text{.} \end{equation*}

So if we expand the column vector

\begin{equation*} \uvec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \end{equation*}

as a linear combination of standard basis vectors,

\begin{equation*} \uvec{x} = x_1 \uvec{e}_1 + x_2 \uvec{e}_2 + \dotsb + x_n \uvec{e}_n \text{,} \end{equation*}

then we can compute \(A\uvec{x}\) by multiplying \(A\) through this linear combination:

\begin{align} \begin{array}{ccccccccc} A \uvec{x} \amp = \amp x_1 A \uvec{e}_1 \amp + \amp x_2 A \uvec{e}_2 \amp + \amp \dotsb \amp + \amp x_n A \uvec{e}_n \\ \amp = \amp x_1 \uvec{a}_1 \amp + \amp x_2 \uvec{a}_2 \amp + \amp \dotsb \amp + \amp x_n \uvec{a}_n \end{array}\text{.}\label{equation-change-of-basis-concepts-matrix-times-vec-pattern}\tag{\(\star\star\)} \end{align}

From this, we see that a matrix times a vector can be expressed as a linear combination of the columns of the matrix with coefficients taken from the vector.
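The pattern (\(\star\star\)) can be confirmed with a quick NumPy check; the matrix and vector below are randomly generated and have no significance beyond illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4)).astype(float)
x = rng.integers(-5, 5, size=4).astype(float)

# Linear combination of the columns of A with coefficients taken from x.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# The product A x equals that linear combination.
assert np.allclose(A @ x, combo)
```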

Subsection 22.3.3 Converting coordinate vectors

In Discovery 22.4, we explored how we might convert from coordinate vectors relative to one basis to those relative to another basis.

Suppose we have a basis \(\basisfont{B}\) for some particular finite-dimensional vector space \(V\text{.}\) Every vector \(\uvec{w}\) in \(V\) can be expressed uniquely as a linear combination

\begin{gather} \uvec{w} = k_1 \uvec{u}_1 + k_2 \uvec{u}_2 + \dotsb + k_n \uvec{u}_n,\label{equation-change-of-basis-concepts-lincomb}\tag{\(\star\star\star\)} \end{gather}

where the \(\uvec{u}_j\) are the basis vectors in \(\basisfont{B}\) (Theorem 19.5.3), and the \(n\)-dimensional vector

\begin{equation*} \rmatrixOf{\uvec{w}}{B} = (k_1,k_2,\dotsc,k_n) \end{equation*}

made up of the coefficients in that linear combination is precisely what we call the coordinate vector of \(\uvec{w}\) relative to basis \(\basisfont{B}\text{.}\) Now, what if we have a second basis \(\basisfont{B}'\) for \(V\text{?}\) We may have already gone to the work of computing the coordinate vectors for a number of vectors in \(V\) relative to \(\basisfont{B}\text{.}\) Can we easily convert these coordinate vectors to new ones relative to \(\basisfont{B}'\text{?}\)

To see how this can be done with a matrix, let's switch to coordinate vectors as column vectors. Just as in Task 22.4.b, we can apply linearity property (\(\star\)) to (\(\star\star\star\)) to obtain an expression for the new coordinate vector \(\matrixOf{\uvec{w}}{B'}\) as a linear combination in \(\R^n\) of coordinate vectors:

\begin{gather} \matrixOf{\uvec{w}}{B'} = k_1 \matrixOf{\uvec{u}_1}{B'} + k_2 \matrixOf{\uvec{u}_2}{B'} + \dotsb + k_n \matrixOf{\uvec{u}_n}{B'}.\label{equation-change-of-basis-concepts-second-basis-coordvec-lincomb}\tag{\(\dagger\)} \end{gather}

Now, the pattern of Task 22.3.c tells us that for every matrix \(P\) of appropriate size to multiply against the column vector \(\matrixOf{\uvec{w}}{B}\text{,}\) the product \(P \matrixOf{\uvec{w}}{B} \) can be expressed as

\begin{gather} P \matrixOf{\uvec{w}}{B} = P \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix} = k_1 \uvec{p}_1 + k_2 \uvec{p}_2 + \dotsb + k_n \uvec{p}_n,\label{equation-change-of-basis-concepts-matrix-times-col-pattern}\tag{\(\dagger\dagger\)} \end{gather}

where the \(\uvec{p}_j\) are the columns of \(P\text{.}\) Combining (\(\dagger\)) with (\(\dagger\dagger\)) gives us a way to use matrix multiplication to convert coordinate vector \(\matrixOf{\uvec{w}}{B}\) into \(\matrixOf{\uvec{w}}{B'}\text{:}\) if we take \(P\) to be the matrix whose columns are the coordinate vectors relative to \(\basisfont{B}'\) of the basis vectors of \(\basisfont{B}\text{,}\)

\begin{align} P = \begin{bmatrix} | \amp | \amp \amp | \\ \matrixOf{\uvec{u}_1}{B'} \amp \matrixOf{\uvec{u}_2}{B'} \amp \dots \amp \matrixOf{\uvec{u}_n}{B'} \\ | \amp | \amp \amp | \end{bmatrix}\text{,}\label{equation-change-of-basis-concepts-transition-matrix-def}\tag{\(\dagger\dagger\dagger\)} \end{align}

then

\begin{equation*} P \matrixOf{\uvec{w}}{B} = k_1 \matrixOf{\uvec{u}_1}{B'} + k_2 \matrixOf{\uvec{u}_2}{B'} + \dotsb + k_n \matrixOf{\uvec{u}_n}{B'} = \matrixOf{\uvec{w}}{B'}\text{.} \end{equation*}

We call the special matrix \(P\) defined in (\(\dagger\dagger\dagger\)) the transition matrix from basis \(\basisfont{B}\) to basis \(\basisfont{B}'\text{,}\) and write \(\ucobmtrx{B}{B'}\) to denote it.

As usual, it is helpful to express the pattern of (\(\dagger\dagger\dagger\)) in words, and you were asked to do this in Discovery 22.5. If we refer to \(\basisfont{B}\) as the “old basis” and \(\basisfont{B}'\) as the “new basis”, then the transition matrix \(\ucobmtrx{B}{B'}\) is formed as the \(n \times n\) matrix whose columns are the coordinate vectors of the old basis relative to the new basis.
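The construction in (\(\dagger\dagger\dagger\)) can be tried out numerically. The NumPy sketch below uses two invented bases of \(\R^2\), written as the columns of matrices; solving \(B' P = B\) computes all the columns \(\matrixOf{\uvec{u}_j}{B'}\) of \(P\) at once, and we then verify the defining property on a sample vector.

```python
import numpy as np

# Hypothetical bases of R^2, each written as the columns of a matrix.
B  = np.array([[1.0,  1.0],
               [1.0, -1.0]])    # old basis
Bp = np.array([[2.0, 1.0],
               [1.0, 1.0]])     # new basis

# Columns of P are the B'-coordinate vectors of the old basis vectors:
# solving Bp @ P = B does this for all columns at once.
P = np.linalg.solve(Bp, B)

# Check the defining property P [w]_B = [w]_{B'} on a sample vector.
w = np.array([5.0, -3.0])
w_B  = np.linalg.solve(B, w)    # [w]_B
w_Bp = np.linalg.solve(Bp, w)   # [w]_{B'}
assert np.allclose(P @ w_B, w_Bp)
```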

We worked through an example of forming and using a transition matrix in Discovery 22.6, and you can also look through Example 22.4.1 in Subsection 22.4.1, which is similar.

Subsection 22.3.4 Properties of transition matrices

In Discovery 22.7, we explored properties of transition matrices via the defining property

\begin{equation*} \ucobmtrx{B}{B'} \matrixOf{\uvec{v}}{B} = \matrixOf{\uvec{v}}{B'} \text{.} \end{equation*}

We will state and prove each of the discovered properties in Subsection 22.5.2, but we briefly recount them here.

No change of basis.

First, we considered the transition matrix for the case where the “new” basis is the same as the “old”. In this case, we want to “convert” each coordinate vector \(\matrixOf{\uvec{w}}{B}\) to itself. That is, we want the conversion to have no effect at all. We already know a matrix that has no effect when it is multiplied against other matrices: the identity matrix. So we expect

\begin{equation*} \ucobmtrx{B}{B} = I. \end{equation*}
Chains of basis changes.

Next, we considered the transition matrices \(\ucobmtrx{B}{B'} \text{,}\) \(\ucobmtrx{B'}{B''}\text{,}\) and \(\ucobmtrx{B}{B''}\) associated to three bases of a particular vector space. We could consider the transition from the first basis \(\basisfont{B}\) to the third basis \(\basisfont{B}''\) as a two-step process, first converting \(\matrixOf{\uvec{w}}{B}\) to \(\matrixOf{\uvec{w}}{B'}\) by

\begin{equation*} \ucobmtrx{B}{B'} \matrixOf{\uvec{w}}{B} = \matrixOf{\uvec{w}}{B'} \text{,} \end{equation*}

and then converting \(\matrixOf{\uvec{w}}{B'}\) to \(\matrixOf{\uvec{w}}{B''}\) by

\begin{equation*} \ucobmtrx{B'}{B''} \matrixOf{\uvec{w}}{B'} = \matrixOf{\uvec{w}}{B''} \text{.} \end{equation*}

Substituting the first expression for \(\matrixOf{\uvec{w}}{B'}\) into the second equality, we find

\begin{equation*} \ucobmtrx{B'}{B''} \ucobmtrx{B}{B'} \matrixOf{\uvec{w}}{B} = \matrixOf{\uvec{w}}{B''} \text{,} \end{equation*}

and so we expect

\begin{equation*} \ucobmtrx{B'}{B''} \ucobmtrx{B}{B'} = \ucobmtrx{B}{B''} \text{.} \end{equation*}
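As a numerical sanity check on the chain property, here is a NumPy sketch with three invented bases of \(\R^2\); the helper `transition(X, Y)` builds the transition matrix from basis `X` to basis `Y` (bases as columns) by solving \(Y P = X\).

```python
import numpy as np

# Three hypothetical bases of R^2, written as columns.
B   = np.array([[1.0, 0.0], [1.0, 1.0]])
Bp  = np.array([[2.0, 1.0], [1.0, 1.0]])
Bpp = np.array([[1.0, 1.0], [0.0, 2.0]])

def transition(X, Y):
    # Transition matrix from basis X to basis Y: solve Y @ P = X.
    return np.linalg.solve(Y, X)

# Converting B -> B' and then B' -> B'' agrees with converting B -> B''.
lhs = transition(Bp, Bpp) @ transition(B, Bp)
assert np.allclose(lhs, transition(B, Bpp))
```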
Reversing change of basis.

Finally, we considered the relationship between transition matrices \(\ucobmtrx{B}{B'} \) and \(\ucobmtrx{B'}{B}\text{.}\) Transitioning \(\basisfont{B}' \to \basisfont{B}\) should be the reverse process of transitioning \(\basisfont{B} \to \basisfont{B}'\text{,}\) and so we expect

\begin{equation*} \ucobmtrx{B'}{B} = \uinvcobmtrx{B}{B'} \text{.} \end{equation*}

This relationship agrees with our expectations for no change of basis and chains of basis changes above, since if we consider basis \(\basisfont{B}\) to be both the first and third basis in a chain of basis changes, then

\begin{equation*} \ucobmtrx{B'}{B} \ucobmtrx{B}{B'} = \ucobmtrx{B}{B} = I \end{equation*}

as expected if matrices \(\ucobmtrx{B}{B'} \) and \(\ucobmtrx{B'}{B}\) are inverses.
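The inverse relationship can be confirmed the same way; again the two bases of \(\R^2\) below are invented for illustration.

```python
import numpy as np

# Hypothetical bases of R^2, written as columns.
B  = np.array([[1.0,  1.0], [1.0, -1.0]])
Bp = np.array([[2.0, 1.0], [1.0, 1.0]])

def transition(X, Y):
    # Transition matrix from basis X to basis Y: solve Y @ P = X.
    return np.linalg.solve(Y, X)

P_fwd = transition(B, Bp)   # B  -> B'
P_rev = transition(Bp, B)   # B' -> B

# Reversing a change of basis inverts the transition matrix ...
assert np.allclose(P_rev, np.linalg.inv(P_fwd))
# ... so the round trip is the identity, agreeing with P_{B,B} = I.
assert np.allclose(P_rev @ P_fwd, np.eye(2))
```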

Subsection 22.3.5 Change of basis in \(\R^n\)

Subsubsection 22.3.5.1 Changing to the standard basis

In Discovery 22.8, we first reminded ourselves that, relative to the standard basis \(\basisfont{S}\) of \(\R^n\text{,}\) a vector is equal to its coordinate vector. In particular, if \(\basisfont{B}\) is another basis of \(\R^n\text{,}\) then each basis vector in \(\basisfont{B}\) is equal to its own coordinate vector relative to the standard basis. So when we form a transition matrix \(\ucobmtrx{B}{S}\text{,}\) the columns of this matrix are precisely the vectors of basis \(\basisfont{B}\text{,}\) which makes \(\ucobmtrx{B}{S}\) particularly easy to produce.
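In code, forming \(\ucobmtrx{B}{S}\) is literally just stacking the basis vectors as columns. The sketch below uses an invented basis of \(\R^3\) and checks that multiplying a coordinate vector by this matrix recovers the original vector.

```python
import numpy as np

# Hypothetical basis B of R^3.
u1 = np.array([1.0, 2.0, 0.0])
u2 = np.array([0.0, 1.0, 1.0])
u3 = np.array([1.0, 0.0, 1.0])

# Relative to the standard basis S, each u_j equals its own coordinate
# vector, so the transition matrix B -> S is just the basis vectors
# written as columns.
P_B_to_S = np.column_stack([u1, u2, u3])

# For w = 2 u1 - u2 + 3 u3, applying P to [w]_B = (2,-1,3) recovers w.
k = np.array([2.0, -1.0, 3.0])
w = 2.0 * u1 - 1.0 * u2 + 3.0 * u3
assert np.allclose(P_B_to_S @ k, w)
```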

Subsubsection 22.3.5.2 Computing transition matrices using the standard basis as an intermediate

Since transition matrices with the standard basis as the “new” basis are easy to produce, we can use the standard basis as an intermediate in a chain of basis changes to obtain any transition matrix \(\ucobmtrx{B}{B'}\) for bases of \(\R^n\text{.}\) Using what we learned in Subsection 22.3.4, we can write

\begin{gather} \ucobmtrx{B}{B'} = \ucobmtrx{S}{B'} \ucobmtrx{B}{S} = \uinvcobmtrx{B'}{S} \ucobmtrx{B}{S}\text{,}\label{equation-change-of-basis-concepts-inv-prod-formula-Rn}\tag{\(\maltese\)} \end{gather}

and so we can compute \(\ucobmtrx{B}{B'}\) from the easy-to-produce transition matrices \(\ucobmtrx{B}{S}\) and \(\ucobmtrx{B'}{S}\text{.}\)
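Formula (\(\maltese\)) translates directly into a computation; in the NumPy sketch below, the two invented bases of \(\R^2\) are themselves the easy transition matrices \(\ucobmtrx{B}{S}\) and \(\ucobmtrx{B'}{S}\), and the inverse-times-matrix product yields \(\ucobmtrx{B}{B'}\).

```python
import numpy as np

# Hypothetical bases of R^2 as columns: these matrices ARE the
# easy-to-produce transition matrices P_{B->S} and P_{B'->S}.
P_B_S  = np.array([[1.0,  1.0], [1.0, -1.0]])
P_Bp_S = np.array([[2.0, 1.0], [1.0, 1.0]])

# Formula: P_{B->B'} = inv(P_{B'->S}) @ P_{B->S}.
P_B_Bp = np.linalg.inv(P_Bp_S) @ P_B_S

# Sanity check of the defining property on a sample vector w.
w = np.array([3.0, 4.0])
w_B  = np.linalg.solve(P_B_S, w)    # [w]_B
w_Bp = np.linalg.solve(P_Bp_S, w)   # [w]_{B'}
assert np.allclose(P_B_Bp @ w_B, w_Bp)
```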

Subsubsection 22.3.5.3 Computing transition matrices using row reduction

We use row reduction to do just about everything, so of course we can use it to compute transition matrices! Now, computing a transition matrix using Procedure 22.3.1 would already be carried out by (a lot of) row reducing, as computing a coordinate vector is equivalent to solving a linear system. (See the coordinate vector examples in Subsection 19.4.3.) But from (\(\maltese\)), we can use what we learned about computing inverses in Subsection 6.3.5 to develop a procedure for computing a transition matrix for \(\R^n\) by row reducing a single matrix.

In Subsection 6.3.5, we learned that if a matrix can be reduced to the identity by some sequence of row operations, then the exact same sequence of row operations will “unreduce” the identity matrix to the inverse of the original matrix. Look back at (\(\star\)) in Subsection 6.3.5, which describes the pattern of Procedure 6.3.7. Here is that pattern again, but in abbreviated form:

\begin{equation*} \left[\begin{array}{c|c} A \amp I \end{array}\right] \qquad \rowredarrow \qquad \left[\begin{array}{c|c} E_{\ell} \dotsm E_2 E_1 A \amp E_{\ell} \dotsm E_2 E_1 I \end{array}\right] = \left[\begin{array}{c|c} I \amp \inv{A} \end{array}\right]\text{,} \end{equation*}

where the \(E_j\) are the elementary matrices associated to a sequence of operations that reduces \(A\) to \(I\text{.}\)

What if we augmented \(A\) with a matrix other than \(I\) and row reduced? Then the pattern would be

\begin{align} \amp \left[\begin{array}{c|c} A \amp B \end{array}\right] \notag\\ \amp \qquad \rowredarrow \qquad \left[\begin{array}{c|c} E_{\ell} \dotsm E_2 E_1 A \amp E_{\ell} \dotsm E_2 E_1 B \end{array}\right] = \left[\begin{array}{c|c} I \amp \inv{A} B \end{array}\right]\text{.}\label{equation-change-of-basis-concepts-row-red-aug-transfer-inv}\tag{\(\maltese\maltese\)} \end{align}

This pattern reveals that we can also carry out an inverse-times-matrix computation by row reducing! And that kind of computation is exactly what we would carry out if we used (\(\maltese\)) to compute a transition matrix for \(\R^n\text{.}\)
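The pattern (\(\maltese\maltese\)) can be imitated in code. The sketch below implements a small Gauss-Jordan reduction of the augmented matrix \(\left[\begin{smallmatrix} A \mid B \end{smallmatrix}\right]\) (the function name and the pivoting details are our own, not from the text) and checks that the right-hand block comes out to \(\inv{A} B\).

```python
import numpy as np

def row_reduce_augmented(A, B):
    """Gauss-Jordan reduce [A | B] to [I | inv(A) B] (A assumed invertible)."""
    M = np.hstack([A.astype(float), B.astype(float)])
    n = A.shape[0]
    for j in range(n):
        # Partial pivoting: swap up a row with the largest pivot entry.
        p = j + np.argmax(np.abs(M[j:, j]))
        M[[j, p]] = M[[p, j]]
        M[j] /= M[j, j]            # scale the pivot row
        for i in range(n):         # clear the rest of column j
            if i != j:
                M[i] -= M[i, j] * M[j]
    return M[:, n:]               # the right-hand block, inv(A) @ B

# Invented 2x2 example: A plays the role of the "new" basis matrix,
# B the "old", so the result is a transition matrix old -> new.
A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 1.0], [1.0, -1.0]])
assert np.allclose(row_reduce_augmented(A, B), np.linalg.inv(A) @ B)
```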

Remark 22.3.3.
  1. Since a transition matrix \(\ucobmtrx{B}{S}\) for \(\R^n\) is formed by just writing the vectors of basis \(\basisfont{B}\) as columns in a matrix, it might be easier to remember the pattern of Procedure 22.3.2 if we remove some of the transition matrix notation:

    \begin{equation*} \left[\begin{array}{c|c} \basisfont{B}' \amp \basisfont{B} \end{array}\right] \qquad\rowredarrow\qquad \left[\begin{array}{c|c} I \amp (\basisfont{B} \to \basisfont{B}') \end{array}\right] \end{equation*}

    (where we could even write \(\basisfont{S}\) instead of \(I\) on the left of the reduced matrix, if we wished).

    Even better might be to express the pattern in words:

    \begin{equation*} \left[\begin{array}{c|c} \text{New} \amp \text{Old} \end{array}\right] \qquad\rowredarrow\qquad \left[\begin{array}{c|c} I \amp (\text{Old} \to \text{New}) \end{array}\right]\text{.} \end{equation*}
  2. Procedure 22.3.2 can also be used in other vector spaces besides \(\R^n\text{,}\) particularly those that have a “standard” basis that can be used to easily form intermediate transition matrices to use in the row reducing procedure. We will comment again on this in Remark 22.4.3, after we have seen a couple of related examples.