
Section 44.3 Concepts

Subsection 44.3.1 Composition of linear transformations

As linear transformations are functions between vector spaces, they can be composed just like functions. That is, for transformations \(\funcdef{T}{U}{V}\) and \(\funcdef{S}{V}{W}\text{,}\) where the domain space of \(S\) is the same as the codomain space of \(T\text{,}\) we can define the composite \(\funcdef{ST}{U}{W}\) by

\begin{equation*} ST(\uvec{u}) = S\bigl(T(\uvec{u})\bigr) \text{.} \end{equation*}

Now, if \(T\) turns sums and scalar multiples in \(U\) into sums and scalar multiples in \(V\text{,}\) and \(S\) in turn turns sums and scalar multiples in \(V\) into sums and scalar multiples in \(W\text{,}\) then it follows that the composition \(ST\) turns sums and scalar multiples in \(U\) into sums and scalar multiples in \(W\text{.}\) In other words, the composition \(ST\) will be linear whenever \(S\) and \(T\) are both linear.
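For instance, for vectors \(\uvec{u}_1,\uvec{u}_2\) in \(U\) and any scalar \(k\text{,}\) linearity of \(T\) and then of \(S\) gives

\begin{align*} ST(\uvec{u}_1 + \uvec{u}_2) \amp = S\bigl(T(\uvec{u}_1) + T(\uvec{u}_2)\bigr) = ST(\uvec{u}_1) + ST(\uvec{u}_2) \text{,} \\ ST(k \uvec{u}_1) \amp = S\bigl(k \, T(\uvec{u}_1)\bigr) = k \, ST(\uvec{u}_1) \text{.} \end{align*}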

Warning 44.3.1.

Just as with matrix multiplication, in general the compositions \(ST\) and \(TS\) are not equal. In fact, more often than not, one of the two orders is not even defined, as domains and codomains will not match up in both orders.

Subsection 44.3.2 Composition of matrix transformations

If \(\funcdef{T_A}{\R^n}{\R^m}\) and \(\funcdef{S_B}{\R^m}{\R^\ell}\) are the matrix transformations corresponding to \(m \times n\) matrix \(A\) and \(\ell \times m\) matrix \(B\text{,}\) then chaining the input-output processes

\begin{align*} T_A(\uvec{x}) \amp = A \uvec{x} \text{,} \amp S_B(\uvec{y}) \amp = B \uvec{y} \end{align*}

by setting

\begin{equation*} \uvec{y} = T_A(\uvec{x}) = A \uvec{x} \end{equation*}

yields

\begin{equation*} (S_B T_A)(\uvec{x}) = S_B\left(T_A(\uvec{x})\right) = S_B(\uvec{y}) = B \uvec{y} = B (A \uvec{x}) = B A \uvec{x}\text{.} \end{equation*}

So clearly the composition \(S_B T_A\) is also a matrix transformation, with corresponding matrix \(B A\text{.}\) (And note that the sizes of \(A\) and \(B\) match up for this product: the composite has domain/codomain definition \(\funcdef{S_B T_A}{\R^n}{\R^\ell}\text{,}\) and for matrices we have \(\ell \times m\) times \(m \times n\) produces \(\ell \times n\text{.}\))

In other words, the standard matrix of a composition of matrix transformations is the product of the standard matrices:

\begin{equation*} \stdmatrixOf{ST} = \stdmatrixOf{S} \stdmatrixOf{T} \text{.} \end{equation*}
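For example, taking

\begin{align*} A \amp = \begin{bmatrix} 1 \amp 0 \\ 2 \amp 1 \\ 0 \amp 3 \end{bmatrix} \text{,} \amp B \amp = \begin{bmatrix} 1 \amp 1 \amp 0 \\ 0 \amp 2 \amp 1 \end{bmatrix} \text{,} \end{align*}

so that \(\funcdef{T_A}{\R^2}{\R^3}\) and \(\funcdef{S_B}{\R^3}{\R^2}\text{,}\) the composite \(S_B T_A\) is the matrix transformation \(\R^2 \to \R^2\) corresponding to

\begin{equation*} B A = \begin{bmatrix} 3 \amp 1 \\ 4 \amp 5 \end{bmatrix} \text{.} \end{equation*}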

Subsection 44.3.3 Inverse transformations

When a function is one-to-one, each output can be traced back to one particular input that produced it. So if \(\funcdef{T}{V}{W}\) is injective and \(\uvec{w}\) is a vector in \(\im T\text{,}\) there is exactly one answer to the question: which vector \(\uvec{v}\) in the domain space \(V\) produces the result

\begin{equation*} T(\uvec{v}) = \uvec{w} \text{?} \end{equation*}

This reverse output-input process of tracing an output vector \(\uvec{w}\) in \(\im T\) back to the input \(\uvec{v}\) creates an inverse function \(\funcdef{\inv{T}}{\im T}{V}\text{.}\) While we did not consider it in Discovery set 44.1, we will prove that when \(T\) is linear, then so is \(\inv{T}\text{.}\)

Reversing the input-output process of an injective function to create an inverse function just gives us a different way to create pairs of vectors, one in the domain space paired with one in the codomain space, so that writing

\begin{equation*} T(\uvec{v}) = \uvec{w} \end{equation*}

becomes the same thing as writing

\begin{equation*} \uvec{v} = \inv{T}(\uvec{w}) \end{equation*}

for every \(\uvec{w}\) in \(\im T\text{.}\) And just as in Discovery 44.4, when we compose an injective transformation with its inverse, we obtain an identity transformation: if \(T(\uvec{v}) = \uvec{w}\text{,}\) then we have

\begin{align*} (\inv{T} T)(\uvec{v}) \amp = \inv{T}(\uvec{w}) = \uvec{v} \text{,} \amp (T \inv{T})(\uvec{w}) \amp = T(\uvec{v}) = \uvec{w}\text{.} \end{align*}

When composing functions, we usually require the domain of the second to match the codomain of the first, and this works out fine for the composition order \(T \inv{T}\text{:}\)

\begin{align*} \amp \funcdef{\inv{T}}{\im T}{V} \text{,} \amp \amp \funcdef{T}{V}{W}\text{,} \end{align*}

where codomain \(V\) of \(\inv{T}\) matches domain \(V\) of \(T\text{.}\) But even though codomain and domain do not match up for the order \(\inv{T} T\text{,}\) there is no harm in effectively considering the definition of \(T\) to be

\begin{equation*} \funcdef{T}{V}{\im T} \text{,} \end{equation*}

since by definition all of the outputs of \(T\) must be contained in its image. As we found in Discovery 44.4, it's important that we don't try to “fix” this the other way, by attempting to take the domain of \(\inv{T}\) to be all of the codomain \(W\) of \(T\) — we really must define the domain of \(\inv{T}\) to be just \(\im T\text{.}\) Even though \(\inv{T}\) represents the reverse process of \(T\text{,}\) it will not be possible to trace a vector in \(W\) that is not in \(\im T\) back to an input vector in \(V\text{.}\)
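For example, the injective transformation \(\funcdef{T}{\R}{\R^2}\) defined by \(T(x) = (x, 2x)\) has as its image the line of all vectors of the form \((a, 2a)\text{,}\) and its inverse \(\funcdef{\inv{T}}{\im T}{\R}\) traces each such vector back to \(\inv{T}(a, 2a) = a\text{.}\) But a vector such as \((1, 1)\text{,}\) which lies in the codomain \(\R^2\) but not in \(\im T\text{,}\) cannot be traced back to any input, which is why the domain of \(\inv{T}\) must be restricted to \(\im T\text{.}\)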

Subsection 44.3.4 Invertibility conditions

A function is invertible precisely when it is one-to-one, and by definition a function is one-to-one when pairs of distinct inputs always produce distinct outputs. If a linear transformation \(\funcdef{T}{V}{W}\) has a pair of distinct inputs \(\uvec{v}_1,\uvec{v}_2\) that produce the same output, then

\begin{equation*} T(\uvec{v}_1 - \uvec{v}_2) = T(\uvec{v}_1) - T(\uvec{v}_2) = \zerovec \text{.} \end{equation*}

Since \(\uvec{v}_1,\uvec{v}_2\) are assumed distinct, we have

\begin{equation*} \uvec{v}_1 - \uvec{v}_2 \neq \zerovec \text{,} \end{equation*}

and so our calculation above says that when \(T\) is not one-to-one, the kernel of \(T\) contains nonzero vectors.

In fact, we will prove that the kernel tells us precisely when a transformation is one-to-one: linear \(T\) is one-to-one precisely when \(\ker T = \{\zerovec\}\).

But requiring that \(\ker T\) be trivial also says something about \(\im T\text{,}\) as their dimensions are tied together by the Dimension Theorem: if \(\ker T\) is trivial, then \(\rank T\) must equal \(\dim V\text{.}\) In particular, the codomain \(W\) must have “room” for the subspace \(\im T\) to have dimension \(\dim V\text{,}\) so that \(\dim W \lt \dim V\) implies that \(T\) cannot be one-to-one.
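In symbols, the Dimension Theorem ties these dimensions together as

\begin{equation*} \dim (\ker T) + \rank T = \dim V \text{,} \end{equation*}

so a trivial kernel leaves all of \(\dim V\) to be taken up by \(\rank T = \dim (\im T)\text{.}\)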

Warning 44.3.2.

While \(\ker T = \{\zerovec\}\) is a sufficient condition to conclude that \(T\) is one-to-one, \(\dim W \ge \dim V\) is not.
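For example, the linear operator \(\funcdef{T}{\R^2}{\R^2}\) defined by \(T(x, y) = (x, 0)\) satisfies \(\dim W \ge \dim V\text{,}\) but it is not one-to-one, since its kernel contains every vector of the form \((0, y)\text{.}\)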

Subsection 44.3.5 Inverses of matrix transformations

If \(\funcdef{T_A}{\R^n}{\R^n}\) is the matrix transformation corresponding to \(n \times n\) matrix \(A\text{,}\) then \(\ker T_A\) is precisely the null space of \(A\text{.}\) And we know that a matrix is invertible precisely when its null space is trivial (Statement 9 of Theorem 21.5.5). Transformation \(T_A\) is defined by multiplication by \(A\text{,}\) and clearly we can reverse that input-output process through multiplication by \(\inv{A}\text{.}\) That is, \(T_A\) is invertible precisely when \(A\) is invertible, and

\begin{equation*} \inv{T}_A = T_{\inv{A}}\text{.} \end{equation*}

In terms of standard matrices, a matrix transformation with a square standard matrix is invertible precisely when its standard matrix is invertible, with

\begin{equation*} \stdmatrixOf{\inv{T}} = \inv{\stdmatrixOf{T}} \text{.} \end{equation*}
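For example, the invertible matrix

\begin{align*} A \amp = \begin{bmatrix} 1 \amp 2 \\ 0 \amp 1 \end{bmatrix} \text{,} \amp \inv{A} \amp = \begin{bmatrix} 1 \amp -2 \\ 0 \amp 1 \end{bmatrix} \end{align*}

corresponds to the transformation \(T_A(x, y) = (x + 2y, y)\) on \(\R^2\text{,}\) and its inverse is \(\inv{T}_A = T_{\inv{A}}\text{,}\) given by \(T_{\inv{A}}(x, y) = (x - 2y, y)\text{;}\) composing the two in either order returns every input unchanged.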

What about \(\funcdef{T}{\R^n}{\R^m}\) for \(m \neq n\text{,}\) so that the standard matrix \(\stdmatrixOf{T}\) is not square? When \(m \gt n\text{,}\) it is still possible for \(T\) to be invertible, but the invertibility of \(\stdmatrixOf{T}\) can no longer even be considered. And when \(T\) is invertible, its inverse \(\funcdef{\inv{T}}{\im T}{\R^n}\) is no longer a matrix transformation, as \(\im T\) cannot be all of \(\R^m\text{.}\)

A look ahead.

In Chapter 45, we will see that the question of invertibility of any linear transformation with a finite-dimensional domain space can be reduced to invertibility of a square matrix, but doing so will require choosing a basis for each of the domain space and the image.

Subsection 44.3.6 Constructing invertible transformations

We know that a linear transformation \(V \to W\) can be defined by choosing a basis for the domain space \(V\) and then choosing a corresponding output vector in the codomain space \(W\) for each domain basis vector (Corollary 42.5.3). If we choose our output vectors to be linearly independent in \(W\) (assuming \(W\) has large enough dimension to do so) then those linearly independent vectors will span the image, and the rank of our transformation will be equal to \(\dim V\text{.}\) This forces the nullity to be zero by the Dimension Theorem, so, as in Subsection 44.3.4, the constructed transformation will then be invertible.
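For example, sending the standard basis vectors of \(\R^2\) to the linearly independent vectors \((1,0,0)\) and \((1,1,0)\) in \(\R^3\) defines the transformation \(\funcdef{T}{\R^2}{\R^3}\) with \(T(x, y) = (x + y, y, 0)\text{,}\) whose kernel is trivial and whose image is the two-dimensional \(xy\)-plane, so \(T\) is invertible as a map onto its image.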

Subsection 44.3.7 Isomorphisms

When a transformation \(\funcdef{T}{V}{W}\) is both one-to-one and onto (so that \(\im T = W\)), then \(T\) and \(\inv{T}\) create a one-for-one matching in each direction. And since both \(T\) and \(\inv{T}\) are linear, any calculation involving the vector operations in \(V\) corresponds through \(T\) to a calculation in \(W\text{,}\) and vice versa through \(\inv{T}\text{:}\)

\begin{align*} T (a_1 \uvec{v}_1 + a_2 \uvec{v}_2) \amp = a_1 T(\uvec{v}_1) + a_2 T(\uvec{v}_2) \text{,} \\ \inv{T} (b_1 \uvec{w}_1 + b_2 \uvec{w}_2) \amp = b_1 \inv{T}(\uvec{w}_1) + b_2 \inv{T}(\uvec{w}_2) \text{.} \end{align*}

In this case, \(T\) is called an isomorphism, and we write \(V \iso W\text{.}\) The use of the equality-like symbol \(\iso\) reflects the correspondence between vector operations in the two spaces, where \(T\) and \(\inv{T}\) can be used to transfer operations from one space to the other. Effectively, an isomorphism identifies the two spaces as being essentially the same. In particular, isomorphic vector spaces must have the same dimension.

A look ahead.

We will see that it also works the other way, so that finite-dimensional vector spaces with the same dimension are always isomorphic.

We will prove that the inverse of an isomorphism is an isomorphism, and that the composition of two isomorphisms is an isomorphism. Along with the identity isomorphism (see Subsection 44.3.9 below), these facts demonstrate that the relation of being isomorphic is an equivalence relation: that is,

reflexive

a vector space is always isomorphic to itself;

symmetric

if \(V\) is isomorphic to \(W\text{,}\) then \(W\) is isomorphic to \(V\text{;}\) and

transitive

if \(U\) is isomorphic to \(V\) and \(V\) is isomorphic to \(W\text{,}\) then \(U\) is isomorphic to \(W\text{.}\)

In fact, we will see that in the collection of all finite-dimensional vector spaces, an equivalence class under the isomorphic relation is completely determined by the dimension of its member spaces.

Subsection 44.3.8 Constructing isomorphisms

In Procedure 44.3.3, we described how an invertible transformation can be defined by sending a basis for the domain space to a linearly independent set in the codomain space. If we would like to also have surjectivity, then we will need that independent image collection to also span the entire codomain space. In other words, to construct an isomorphism we should send a basis to a basis.
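For example, the vectors \((1,1)\) and \((1,-1)\) form a basis of \(\R^2\text{,}\) so sending the standard basis vectors of \(\R^2\) to them in turn defines the isomorphism \(\funcdef{T}{\R^2}{\R^2}\) with \(T(x, y) = (x + y, x - y)\text{.}\)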

Subsection 44.3.9 Important isomorphisms

The identity operator.

It should be clear that the identity operator \(\funcdef{I_V}{V}{V}\) defined by \(I_V(\uvec{v}) = \uvec{v}\) is always an isomorphism, as its kernel is trivial and the dimensions of the domain and codomain are equal. This is the isomorphism that sends every basis of \(V\) to that same basis.

Scalar operators.

Similar to the identity operator, every scalar operator \(\funcdef{m_a}{V}{V}\) defined by \(m_a(\uvec{v}) = a \uvec{v}\) with \(a \neq 0\) is an isomorphism. In particular, the negative operator \(\funcdef{\neg_V}{V}{V}\) is an isomorphism. As nonzero scalar multiples do not affect linear independence, a scalar operator sends each basis of \(V\) to a scaled version of that same basis.

A coordinate map relative to a basis.

As described in Subsection 42.3.5, a choice of basis \(\basisfont{B}\) for a finite-dimensional vector space \(V\) creates a coordinate map \(\funcdef{\coordmap{B}}{V}{\R^n}\) or \(\funcdef{\coordmap{B}}{V}{\C^n}\) (depending on whether \(V\) is a real or complex space, and where \(n = \dim V\)), defined by \(\coordmap{B}(\uvec{v}) = \matrixOf{\uvec{v}}{B}\) for each \(\uvec{v}\) in \(V\text{.}\)

In Example 43.4.7, we found that a coordinate map always has trivial kernel and full image. Therefore, every choice of basis for a finite-dimensional space creates an isomorphism to \(\R^n\) (real case) or \(\C^n\) (complex case). In particular, once the basis \(\basisfont{B}\) for the domain space \(V\) is chosen, the coordinate map is the unique isomorphism that sends \(\basisfont{B}\) to the standard basis of \(\R^n\) or \(\C^n\text{,}\) as appropriate.
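For example, relative to the basis \(\basisfont{B} = \{ (1,1), (1,-1) \}\) of \(\R^2\text{,}\) the vector \((x, y)\) decomposes as

\begin{equation*} (x, y) = \frac{x + y}{2} \, (1,1) + \frac{x - y}{2} \, (1,-1) \text{,} \end{equation*}

so the coordinate map is the isomorphism \(\coordmap{B}(x, y) = \left( \frac{x + y}{2}, \frac{x - y}{2} \right)\text{,}\) which sends \((1,1)\) and \((1,-1)\) to the standard basis vectors of \(\R^2\text{.}\)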

Remark 44.3.5.

You probably used coordinate maps relative to familiar bases to create the isomorphisms requested of you in Discovery 44.8.

A vector space and its dual space.

Using Theorem 42.5.7, to every basis \(\basisfont{B}\) of a finite-dimensional vector space \(V\) we can associate a dual basis \(\vecdual{\basisfont{B}}\) of the dual space \(\vecdual{V}\text{.}\) So applying Procedure 44.3.4, we can send each vector in \(\basisfont{B}\) to the corresponding dual linear functional in \(\vecdual{\basisfont{B}}\) to create an isomorphism \(V \to \vecdual{V}\text{.}\) We will record the fact that every finite-dimensional vector space is isomorphic to its dual space in Subsection 44.5.3 as Corollary 44.5.19.
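For example, for the standard basis of \(\R^2\text{,}\) the dual basis consists of the coordinate functionals \(f_1(x, y) = x\) and \(f_2(x, y) = y\text{,}\) and the resulting isomorphism from \(\R^2\) to its dual space sends \((a, b)\) to the functional \(a f_1 + b f_2\text{,}\) that is, to the functional \((x, y) \mapsto a x + b y\text{.}\)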