Section 42.3 Concepts
In this section.
- Subsection 42.3.1 Matrix transformations
- Subsection 42.3.2 Linear transformations
- Subsection 42.3.3 Spanning set and basis image vectors
- Subsection 42.3.4 The standard matrix of a transformation \(\R^n \to \R^m\)
- Subsection 42.3.5 Important examples
- Subsection 42.3.6 The space of transformations \(V \to W\)
- Subsection 42.3.7 Linear functionals
- Subsection 42.3.8 Linear transformations of complex vector spaces
Subsection 42.3.1 Matrix transformations
We have previously considered the matrix-times-vector pattern with a geometric perspective, especially in the context of a transition matrix that achieves a similarity relation, as a way to transform vectors in \(\R^n\text{.}\) We can think of this as an input-output process, where an input vector \(\uvec{x}\) is transformed into an output vector \(A \uvec{x}\text{.}\) In those past explorations, we focused on the case that \(A\) is square, so that input vectors and transformed output vectors have the same dimension, but there is no reason to restrict to just the square case.
Such an input-output process creates a function
where both inputs and outputs are vectors:
If the input-output process is achieved by matrix multiplication by a matrix \(A\text{,}\) then it resembles a linear system of equations
Breaking out into individual components, we get a linear-system-like set of input-output formulas
where the \(a_{ij}\) are the entries of \(A\text{.}\) So a matrix transformation \(\R^n \to \R^m\) is equivalent to a system of linear input-output formulas.
We will write \(T_A\) to mean the matrix transformation \(\R^n \to \R^m\) corresponding to \(m \times n\) matrix \(A\text{.}\) As we found in Discovery 42.1.c, the output results of
correspond to the columns of \(A\text{,}\) where the \(\uvec{e}_j\) are the standard basis vectors in \(\R^n\text{,}\) because we know that
the \(\nth[j]\) column of \(A\text{.}\) This pattern will be important below.
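For example, with an arbitrarily chosen \(2 \times 3\) matrix
\begin{equation*}
A = \begin{bmatrix} 1 & 2 & 0 \\ 3 & -1 & 4 \end{bmatrix} \text{,}
\end{equation*}
the matrix transformation \(\funcdef{T_A}{\R^3}{\R^2}\) is described by the input-output formulas
\begin{equation*}
T_A\left(\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}\right)
= \begin{bmatrix} 1 & 2 & 0 \\ 3 & -1 & 4 \end{bmatrix}
  \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} x_1 + 2 x_2 \\ 3 x_1 - x_2 + 4 x_3 \end{bmatrix} \text{,}
\end{equation*}
and, as expected, the images of the standard basis vectors recover the columns of \(A\text{:}\)
\begin{equation*}
T_A(\uvec{e}_1) = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad
T_A(\uvec{e}_2) = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \quad
T_A(\uvec{e}_3) = \begin{bmatrix} 0 \\ 4 \end{bmatrix} \text{.}
\end{equation*}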
Subsection 42.3.2 Linear transformations
As described in the prelude to Discovery 42.2, we once again use the patterns of \(\R^n\) as a guide to create axioms for abstract vector spaces. A vector space is defined by the operations on its objects, vector addition and scalar multiplication, so we used the interactions of a matrix transformation \(\funcdef{T_A}{\R^n}{\R^m}\) with vector addition and scalar multiplication as a model for the abstract idea of a linear transformation between vector spaces. We know matrix multiplication satisfies algebra rules
(Rule 1.c and Rule 2.d of Proposition 4.5.1). Note that in each of these two algebra rules, the type of addition or scalar multiplication on the left-hand side of the equals sign is different from the type of addition or scalar multiplication on the right-hand side. On the left, the operation is being performed between \(n\)-dimensional vectors, before multiplication by \(A\text{.}\) On the right, the operation is being performed between \(m\)-dimensional vectors, after multiplication by \(A\text{.}\)
Writing these in a matrix-transformation point of view, we have
Accordingly, we require that a morphism \(\funcdef{T}{V}{W}\) between vector spaces \(V\) and \(W\) respect the vector operations of \(V\) and \(W\text{:}\)
for all vectors \(\uvec{u},\uvec{v}\) in \(V\) and all scalars \(k\text{,}\) and these two properties are the foundation for the definition of the concept of linear transformation. And just as for the matrix-transformation version of these properties, the type of addition or scalar multiplication on the left-hand side is different from the type of addition or scalar multiplication on the right-hand side. On the left, the operation is being performed between vectors in \(V\text{,}\) before being input into the transformation \(T\text{,}\) according to how the operation is defined in \(V\text{.}\) On the right, the operation is being performed between vectors in \(W\text{,}\) after being input into the transformation \(T\text{,}\) according to how the operation is defined in \(W\text{.}\)
In Discovery 42.2, we also considered the interaction of matrix transformations with other vector concepts, such as linear combinations, negatives, and the zero vector. However, in an abstract vector space, each of these concepts is defined in terms of or relative to the operations of addition and scalar multiplication, so we expect that the two linearity properties (\(\star\)) of an abstract linear transformation will also cause it to satisfy similar properties relative to these other concepts. (See Proposition 42.5.1 in Subsection 42.5.1.)
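To illustrate the linearity properties (\(\star\)) outside of \(\R^n\) (this particular example is an illustrative choice, not one taken from the discovery guide), consider the transpose map \(\funcdef{T}{\matrixring_{m \times n}(\R)}{\matrixring_{n \times m}(\R)}\) defined by \(T(M) = M^T\text{.}\) The familiar algebra rules for the transpose give
\begin{equation*}
T(M + N) = (M + N)^T = M^T + N^T = T(M) + T(N) \text{,}
\qquad
T(k M) = (k M)^T = k M^T = k\, T(M) \text{,}
\end{equation*}
so \(T\) is a linear transformation between these two spaces of matrices.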
Subsection 42.3.3 Spanning set and basis image vectors
In Discovery 42.4, we found that knowing the image vectors for each vector in a spanning set for the domain space of a linear transformation \(\funcdef{T}{V}{W}\) is enough to recover the image vectors for all input vectors in the domain space.
If
and the image vectors
are known, then the image of every linear combination of the \(\uvec{v}_j\) can be determined from the corresponding linear combination of the \(\uvec{w}_j\text{.}\) As these vectors form a spanning set for \(V\text{,}\) the output for each domain vector can be determined in this way. See Example 42.4.7 for an example of using a domain space spanning set to analyze a linear transformation.
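For example (with coefficients chosen arbitrarily for illustration), if \(\uvec{v} = 3 \uvec{v}_1 - 2 \uvec{v}_2\text{,}\) then the linearity properties (\(\star\)) force
\begin{equation*}
T(\uvec{v}) = T(3 \uvec{v}_1 - 2 \uvec{v}_2)
= 3\, T(\uvec{v}_1) - 2\, T(\uvec{v}_2)
= 3 \uvec{w}_1 - 2 \uvec{w}_2 \text{,}
\end{equation*}
so the image of \(\uvec{v}\) is completely determined by the known image vectors \(\uvec{w}_1, \uvec{w}_2\text{.}\)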
This pattern suggests a way to define a linear transformation with desired outputs.
Procedure 42.3.1. Using a domain basis to define a linear transformation.
To create a linear transformation \(\funcdef{T}{V}{W}\text{,}\) with \(V\) finite-dimensional.
- Choose a basis \(\basisfont{B} = \{\uvec{v}_1,\uvec{v}_2,\dotsc,\uvec{v}_n\}\) for \(V\text{.}\)
- Choose arbitrary vectors \(\uvec{w}_1,\uvec{w}_2,\dotsc,\uvec{w}_n\) in \(W\) (with duplicates allowed).
- Set \(T(\uvec{v}_j) = \uvec{w}_j\text{.}\)
Then every other image vector for \(T\) can be computed by
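As a brief illustration of the procedure (with the basis and image vectors chosen arbitrarily), take \(V = \R^2\) and \(W = \R^3\text{,}\) choose the basis \(\basisfont{B} = \{\uvec{v}_1, \uvec{v}_2\}\) with \(\uvec{v}_1 = (1,1)\) and \(\uvec{v}_2 = (1,-1)\text{,}\) and set
\begin{equation*}
T(\uvec{v}_1) = (1,0,2) \text{,} \qquad T(\uvec{v}_2) = (0,1,1) \text{.}
\end{equation*}
Then, for example, since \((3,1) = 2 \uvec{v}_1 + \uvec{v}_2\text{,}\) linearity forces
\begin{equation*}
T(3,1) = 2\, T(\uvec{v}_1) + T(\uvec{v}_2) = 2\,(1,0,2) + (0,1,1) = (2,1,5) \text{.}
\end{equation*}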
Remark 42.3.2.
Even though a linear transformation is completely determined by its image vectors for a spanning set of the domain space \(V\text{,}\) it's important to use a basis for \(V\) when using these patterns to define a linear transformation. The redundancy created by a dependent spanning set (see Discovery 19.2) will also create ambiguity in defining \(T(\uvec{v}_j) = \uvec{w}_j\text{,}\) since if one of the \(\uvec{v}_j\) is a linear combination of the others and the \(\uvec{w}_j\) are truly arbitrarily chosen, the \(\uvec{w}_j\) won't necessarily exhibit the same dependence relationship as the \(\uvec{v}_j\text{.}\) See Example 42.4.8 for an example of how this could go wrong if we use a dependent spanning set for the domain space to define a linear transformation.
Subsection 42.3.4 The standard matrix of a transformation \(\R^n \to \R^m\)
Suppose we apply Procedure 42.3.1 to the task of creating a linear transformation \(\funcdef{T}{\R^n}{\R^m}\) by choosing the standard basis of the domain space \(\R^n\text{,}\) and then setting
for some collection of image vectors \(\uvec{w}_j\) in \(\R^m\text{.}\) As in Discovery 42.5, we already know a matrix transformation \(\funcdef{T_A}{\R^n}{\R^m}\) that satisfies
namely, the matrix transformation associated to the \(m \times n\) matrix
And in light of Discovery 42.4, since the outputs of \(T\) and \(T_A\) agree on the standard basis vectors for the domain space \(\R^n\text{,}\) they will also agree on every linear combination of those standard basis vectors, i.e. they will also agree on every vector in the domain space \(\R^n\text{.}\) That is, \(T = T_A\text{.}\)
In effect, every linear transformation \(\R^n \to \R^m\) is a matrix transformation. (See Corollary 42.5.4 in Subsection 42.5.1.) For a given linear transformation \(\funcdef{T}{\R^n}{\R^m}\text{,}\) the matrix
so that \(T = T_A\) for \(A = \stdmatrixOf{T}\text{,}\) is called the standard matrix of \(T\). Expressing the above pattern in words, the standard matrix of a transformation is the matrix whose columns are the image vectors for the standard basis vectors under the transformation.
Procedure 42.3.3. The standard matrix of a linear transformation \(\R^n \to \R^m\).
To compute \(\stdmatrixOf{T}\) for a linear transformation \(\funcdef{T}{\R^n}{\R^m}\text{.}\)
- Compute image vector \(T(\uvec{e}_j)\) for each standard basis vector in the domain space \(\R^n\text{.}\)
- Use the computed image vectors as the columns in \(\stdmatrixOf{T}\text{.}\)
See Subsection 42.4.2 for examples of determining standard matrices.
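For instance (a made-up transformation for illustration), suppose \(\funcdef{T}{\R^2}{\R^3}\) is the linear transformation defined by the input-output formula
\begin{equation*}
T(x_1, x_2) = (x_1 + x_2,\, 2 x_1,\, x_2 - x_1) \text{.}
\end{equation*}
Computing the images of the standard basis vectors gives
\begin{equation*}
T(\uvec{e}_1) = T(1,0) = (1, 2, -1) \text{,} \qquad
T(\uvec{e}_2) = T(0,1) = (1, 0, 1) \text{,}
\end{equation*}
so that
\begin{equation*}
\stdmatrixOf{T} = \begin{bmatrix} 1 & 1 \\ 2 & 0 \\ -1 & 1 \end{bmatrix} \text{.}
\end{equation*}
One can check directly that multiplying a column vector \(\uvec{x}\) in \(\R^2\) by this matrix reproduces the formula for \(T(\uvec{x})\) above.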
Subsection 42.3.5 Important examples
The zero transformation.
Given a pair of vector spaces \(V,W\text{,}\) we can create a trivial linear transformation \(\funcdef{\zerovec}{V}{W}\) by defining
for all \(\uvec{v}\) in \(V\text{,}\) where \(\zerovec_W\) is the zero vector in \(W\text{.}\) If we want to distinguish the zero transformation from the zero vector, the zero matrix, and so on, we could write \(\funcdef{\zerovec_{V,W}}{V}{W}\text{,}\) though this notation is cumbersome and usually context alone is sufficient to determine what \(\zerovec\) means.
The fact that \(\zerovec_{V,W}\) satisfies the linearity properties (\(\star\)) is obvious from the vector identities
And clearly the standard matrix of \(\funcdef{\zerovec}{\R^n}{\R^m}\) will be the \(m \times n\) zero matrix.
The identity operator.
Given a vector space \(V\text{,}\) we can create an identity linear transformation \(\funcdef{I_V}{V}{V}\) by sending each vector in \(V\) to itself:
for all \(\uvec{v}\) in \(V\text{.}\) It should be obvious that \(I_V\) satisfies the linearity properties (\(\star\)).
And clearly the standard matrix of \(\funcdef{I}{\R^n}{\R^n}\) will be the \(n \times n\) identity matrix.
The negative operator.
Given a vector space \(V\text{,}\) we can create a negative linear transformation \(\funcdef{\neg_V}{V}{V}\) by sending each vector in \(V\) to its negative:
for all \(\uvec{v}\) in \(V\text{.}\) The fact that \(\neg_V\) satisfies the linearity properties (\(\star\)) is obvious from Rule 2.b and Rule 2.d of Proposition 16.6.2.
And clearly the standard matrix of \(\funcdef{\neg}{\R^n}{\R^n}\) will be the negative of the \(n \times n\) identity matrix.
Scalar operators.
Given a vector space \(V\) and a scalar \(a\text{,}\) we can create a scalar multiplication linear transformation \(\funcdef{m_a}{V}{V}\) by scaling each vector in \(V\) by scale factor \(a\text{:}\)
for all \(\uvec{v}\) in \(V\text{.}\)
We considered this operator in Task 42.3.d, where it should have been obvious that \(m_a\) satisfies the linearity properties (\(\star\)) from vector Axiom S 2 and vector Axiom S 4. Note that the scalar values \(a=-1,0,1\) create other special operators already discussed above:
And clearly the standard matrix of \(\funcdef{m_a}{\R^n}{\R^n}\) will be the \(n \times n\) scalar matrix \(a I\text{.}\)
The coordinate map relative to a basis.
Suppose \(\basisfont{B}\) is a basis for an \(n\)-dimensional vector space \(V\text{.}\) We have already seen that the process of determining coordinate vectors relative to \(\basisfont{B}\) is linear:
for all vectors \(\uvec{u},\uvec{v}\) in \(V\) and all scalars \(k\text{.}\) (See Discovery 22.2 and Subsection 22.3.1.) So the coordinate transformation
is linear.
If \(V = \R^n\text{,}\) then the standard matrix for \(\coordmap{B}\) is
which is precisely the transition matrix \(\ucobmtrx{S}{B}\text{,}\) where \(\basisfont{S}\) is the standard basis for \(\R^n\) (see Subsection 22.3.3).
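For example, with the (arbitrarily chosen) basis \(\basisfont{B} = \{(1,1),(1,-1)\}\) of \(\R^2\text{,}\) we have
\begin{equation*}
\uvec{e}_1 = \frac{1}{2}\,(1,1) + \frac{1}{2}\,(1,-1) \text{,} \qquad
\uvec{e}_2 = \frac{1}{2}\,(1,1) - \frac{1}{2}\,(1,-1) \text{,}
\end{equation*}
so that
\begin{equation*}
\coordmap{B}(\uvec{e}_1) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right) \text{,} \qquad
\coordmap{B}(\uvec{e}_2) = \left(\tfrac{1}{2}, -\tfrac{1}{2}\right) \text{,}
\end{equation*}
and the standard matrix of \(\coordmap{B}\) is
\begin{equation*}
\stdmatrixOf{\coordmap{B}} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{bmatrix} \text{.}
\end{equation*}
As a check, multiplying \((3,1)\) by this matrix produces the coordinate vector \((2,1)\text{,}\) which agrees with the expansion \((3,1) = 2\,(1,1) + 1\,(1,-1)\text{.}\)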
Pairing with a fixed vector in a real inner product space.
The Linearity of inner products implies that by choosing a fixed vector \(\uvec{u}_0\) in an inner product space \(V\text{,}\) we can create a transformation \(\funcdef{T_{\uvec{u}_0}}{V}{\R^1}\) by pairing with \(\uvec{u}_0\text{:}\)
for all \(\uvec{x}\) in \(V\text{.}\)
If \(V = \R^n\) with the standard Euclidean inner product, then the standard matrix for \(T_{\uvec{u}_0}\) is the \(1 \times n\) matrix
But for each index \(j\text{,}\) the result of \(\dotprod{\uvec{e}_j}{\uvec{u}_0}\) is the \(\nth[j]\) coordinate of the \(\R^n\)-vector \(\uvec{u}_0\text{.}\) That is, \(\stdmatrixOf{T_{\uvec{u}_0}}\) is just \(\uvec{u}_0\) itself, as a row vector.
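For instance, taking \(\uvec{u}_0 = (1,-2,3)\) in \(\R^3\) (an arbitrary choice for illustration), the transformation \(T_{\uvec{u}_0}\) is the linear functional
\begin{equation*}
T_{\uvec{u}_0}(\uvec{x}) = \dotprod{\uvec{x}}{\uvec{u}_0} = x_1 - 2 x_2 + 3 x_3 \text{,}
\end{equation*}
with standard matrix
\begin{equation*}
\stdmatrixOf{T_{\uvec{u}_0}} = \begin{bmatrix} 1 & -2 & 3 \end{bmatrix} \text{,}
\end{equation*}
which is just \(\uvec{u}_0\) written as a row vector.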
Subsection 42.3.6 The space of transformations \(V \to W\)
In Subsection 16.4.2, we created a vector space out of a collection of functions \(\funcdef{f}{D}{\R}\) defined on some domain \(D\) of real numbers by defining vector addition and scalar multiplication via addition and scaling of output values. Linear transformations are first and foremost functions, so we can attempt to do the same with the collection \(L(V,W)\) of all linear transformations \(V \to W\) for vector spaces \(V,W\text{.}\) As the codomain space \(W\) is equipped with vector addition and scalar multiplication operations, we can indeed use the pattern of adding and scaling transformations by adding and scaling outputs:
for all \(T,T_1,T_2\) in \(L(V,W)\) and all \(\uvec{v}\) in \(V\text{.}\)
Axiom A 1 and Axiom S 1 require that a vector space be closed under the addition and scalar multiplication operations, but we leave it to you, the reader, to verify that a sum of linear transformations is also linear, and that a scalar multiple of a linear transformation is also linear. And, in fact, all ten Vector space axioms will be satisfied by the above definitions of addition and scalar multiplication of linear transformations, making \(L(V,W)\) into a vector space.
In the case of \(L(\R^n,\R^m)\text{,}\) we saw in Subsection 42.3.4 that each linear transformation \(\R^n \to \R^m\) corresponds to multiplication by some \(m \times n\) matrix. How does adding and scalar multiplying linear transformations in \(L(\R^n,\R^m)\) affect the corresponding matrices? Suppose \(T_1 = T_{A_1}\) and \(T_2 = T_{A_2}\) for \(m \times n\) matrices \(A_1, A_2\text{.}\) Let \(B\) represent the matrix sum \(A_1 + A_2\text{.}\) Then for each vector \(\uvec{x}\) in \(\R^n\text{,}\) we have
This implies that the sum transformation \(T_1 + T_2\) is the same as the matrix transformation \(T_B\) for \(B = A_1 + A_2\text{.}\) In other words,
And one can similarly demonstrate that for each \(T\) in \(L(\R^n,\R^m)\) and each scalar \(k\text{,}\)
Combined, these two properties of the relationship between a linear transformation \(\R^n \to \R^m\) and the corresponding \(m \times n\) matrix mean that, as a vector space, \(L(\R^n,\R^m)\) can be considered as the “same” space as \(\matrixring_{m \times n}(\R)\text{.}\)
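As a small numerical illustration (the matrices here are chosen arbitrarily), if \(T_1 = T_{A_1}\) and \(T_2 = T_{A_2}\) for
\begin{equation*}
A_1 = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix} \text{,} \qquad
A_2 = \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix} \text{,}
\end{equation*}
then the sum and scalar multiple transformations are again matrix transformations, with
\begin{equation*}
\stdmatrixOf{T_1 + T_2} = A_1 + A_2 = \begin{bmatrix} 1 & 1 \\ 3 & 2 \end{bmatrix} \text{,} \qquad
\stdmatrixOf{5 T_1} = 5 A_1 = \begin{bmatrix} 5 & 0 \\ 10 & 15 \end{bmatrix} \text{.}
\end{equation*}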
A look ahead.
In Chapter 44, we will use linear transformations to make the idea of two vector spaces appearing to be the “same” space technically precise.
Subsection 42.3.7 Linear functionals
Linear functionals on \(\R^n\).
In Discovery 42.6, we considered transformations \(\R^n \to \R^1\text{.}\) Such a transformation is special for two reasons. First, the output is more naturally considered as a number rather than a vector, which is why we refer to such a transformation as a linear functional. Second, the standard matrix of such a linear transformation will be a \(1 \times n\) matrix, which itself can be considered as a (row) vector in \(\R^n\text{.}\) And multiplication of column vectors in \(\R^n\) by some fixed row vector is effectively a dot product. That is, every linear functional \(\R^n \to \R\) corresponds to taking the dot product with a fixed \(n\)-dimensional vector. This creates a correspondence between the dual space \(\vecdual{(\R^n)} = L(\R^n,\R^1)\) and the original space \(\R^n\text{,}\) where each transformation in \(\vecdual{(\R^n)}\) is matched with the row vector in \(\R^n\) which describes the transformation in terms of the dot product.
Generalizing.
To try to generalize the pattern of linear functionals on \(\R^n\text{,}\) we expect that if \(V\) is a finite-dimensional vector space, then there should be a connection between vectors in \(V\) and transformations in the dual space \(\vecdual{V} = L(V,\R^1)\text{.}\) We saw in Subsection 42.3.5 that such a connection exists if \(V\) is an inner product space, but we can do this even if \(V\) is not an inner product space by simply choosing a basis for \(V\text{.}\)
Suppose \(\basisfont{B} = \{\uvec{v}_1,\dotsc,\uvec{v}_n\}\) is a basis for finite-dimensional, real vector space \(V\text{.}\) Using the pattern of Proposition 37.5.6 as inspiration, we can create an inner product on \(V\) just by pretending \(\basisfont{B}\) is an orthonormal basis of \(V\text{,}\) and using the unique expansions
relative to \(\basisfont{B}\) to define a pairing
This pairing will be an inner product, and \(\basisfont{B}\) will indeed be orthonormal with respect to it. Then we can use a fixed vector \(\uvec{w}_0\) in \(V\) to create a linear functional \(T_{\uvec{w}_0}\) by
as before.
But the correspondence can be worked the other way as well, associating to a fixed linear functional \(\funcdef{f}{V}{\R^1}\) some vector \(\uvec{w}_0\) in \(V\) so that \(f = T_{\uvec{w}_0}\text{.}\) Indeed, as in Discovery 42.4, the functional \(f\) is completely determined by its outputs on the basis \(\basisfont{B}\text{.}\) Applying \(f\) to the basis vectors \(\uvec{v}_j\text{,}\) we obtain scalars
that can be used to create a vector
If
is any other vector in \(V\text{,}\) then the linearity of \(f\) gives us
as desired.
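For a concrete illustration (with the space, basis, and functional all chosen for the purpose of the example), take \(V\) to be the space of polynomials of degree at most \(1\text{,}\) with basis \(\basisfont{B} = \{\uvec{v}_1, \uvec{v}_2\}\) where \(\uvec{v}_1 = 1\) and \(\uvec{v}_2 = x\text{,}\) and let \(\funcdef{f}{V}{\R^1}\) be the evaluation functional \(f(p) = p(2)\text{.}\) Applying \(f\) to the basis vectors gives the scalars
\begin{equation*}
f(\uvec{v}_1) = 1 \text{,} \qquad f(\uvec{v}_2) = 2 \text{,}
\end{equation*}
which create the vector \(\uvec{w}_0 = f(\uvec{v}_1)\,\uvec{v}_1 + f(\uvec{v}_2)\,\uvec{v}_2 = 1 + 2x\text{.}\) Relative to the inner product created by pretending \(\basisfont{B}\) is orthonormal, pairing a polynomial \(p = c_1 + c_2 x\) with \(\uvec{w}_0\) pairs up corresponding coordinates, giving
\begin{equation*}
T_{\uvec{w}_0}(p) = c_1 \cdot 1 + c_2 \cdot 2 = p(2) = f(p) \text{,}
\end{equation*}
so that \(f = T_{\uvec{w}_0}\text{,}\) as the general argument above predicts.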
The double dual space.
Above we saw that choosing a basis for a finite-dimensional vector space \(V\) creates a correspondence between vectors in \(V\) and linear functionals in the dual space \(\vecdual{V} = L(V,\R^1)\text{.}\) But \(\vecdual{V}\) is also a finite-dimensional vector space, and so choosing a basis for \(\vecdual{V}\) creates a correspondence between vectors in \(\vecdual{V}\) (i.e. linear functionals on \(V\)) and linear functionals in the double dual space \(\vecddual{V} = L(\vecdual{V},\R^1)\text{.}\) We can bridge these two correspondences to create a direct correspondence between vectors in \(V\) and vectors in the double dual \(\vecddual{V}\text{.}\)
Since the individual correspondences \(V \leftrightarrow \vecdual{V}\) and \(\vecdual{V} \leftrightarrow \vecddual{V}\) depend on choosing bases for the spaces, they are coordinate-dependent. However, in more advanced study of linear algebra, you might learn that there is actually a coordinate-free correspondence directly between \(V\) and its double dual space \(\vecddual{V}\text{.}\) This correspondence is easy to describe: for \(\uvec{v}\) in \(V\text{,}\) define linear functional \(\vecdual{T}_\uvec{v}\) on \(\vecdual{V}\) by
To make sense of this definition, keep in mind that a linear functional on \(\vecdual{V}\) should take as inputs linear functionals on \(V\text{.}\) We will leave further analysis of this correspondence to your future studies in linear algebra.
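For instance (with the vector and the input functional chosen arbitrarily for illustration), take \(V = \R^2\) and \(\uvec{v} = (1,3)\text{.}\) An input to \(\vecdual{T}_{\uvec{v}}\) is a linear functional on \(\R^2\text{,}\) say \(f(x_1,x_2) = 4 x_1 + x_2\text{,}\) and \(\vecdual{T}_{\uvec{v}}\) simply evaluates that input functional at the fixed vector \(\uvec{v}\text{:}\)
\begin{equation*}
\vecdual{T}_{\uvec{v}}(f) = f(\uvec{v}) = f(1,3) = 4 \cdot 1 + 3 = 7 \text{.}
\end{equation*}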
Subsection 42.3.8 Linear transformations of complex vector spaces
While we have focused on the real case so far in this section, everything works the same way for complex vector spaces. Linear transformations \(\C^n \to \C^m\) are still matrix transformations, but now the matrices involved are complex. We still have zero and identity operators on complex vector spaces, and linear functionals, and so on.
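For example, the (arbitrarily chosen) complex matrix
\begin{equation*}
A = \begin{bmatrix} i & 1 \\ 0 & 2 - i \end{bmatrix}
\end{equation*}
defines a matrix transformation \(\funcdef{T_A}{\C^2}{\C^2}\text{,}\) and the linearity properties (\(\star\)) hold with complex scalars \(k\text{:}\) for instance,
\begin{equation*}
T_A(i \uvec{v}) = A (i \uvec{v}) = i (A \uvec{v}) = i\, T_A(\uvec{v}) \text{.}
\end{equation*}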