
Section 12.3 Concepts

Subsection 12.3.1 Geometric length of a vector: the norm

We can easily determine the length of a vector in the plane from its components using the Pythagorean Theorem.
[Diagram: using Pythagoras to determine the length of a vector in the plane.]
If we let \(\ell\) represent the length of \(\uvec{v}\text{,}\) then Pythagoras tells us that
\begin{equation*} \ell^2 = (\Delta x)^2 + (\Delta y)^2 \text{.} \end{equation*}
We write \(\unorm{v}\) to mean the length of the vector \(\uvec{v}\) in the plane. Keep in mind in all that follows that \(\unorm{v}\) is always a single number, since it measures a length. If \(\uvec{v}\) has components \(\uvec{v} = (v_1,v_2)\) (where \(v_1 = \Delta x\) and \(v_2 = \Delta y\)), then solving for \(\ell\) in the Pythagorean equation above gives us
\begin{equation*} \unorm{v} = \sqrt{v_1^2 + v_2^2} \text{.} \end{equation*}
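For example, the vector \(\uvec{v} = (3,4)\) has norm
\begin{equation*} \unorm{v} = \sqrt{3^2 + 4^2} = \sqrt{25} = 5 \text{,} \end{equation*}
in agreement with the familiar \(3\)-\(4\)-\(5\) right triangle.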
For a vector \(\uvec{v} = (v_1,v_2,v_3)\) in \(\R^3\text{,}\) consider the vector \(\uvec{v}' = (v_1,v_2,0)\) sitting in the \(xy\)-plane.
[Diagram: using Pythagoras to determine the length of a vector in space.]
Applying the Pythagorean Theorem to the vertical triangle, we find
\begin{equation*} \unorm{v}^2 = \norm{\uvec{v}'}^2 + (\Delta z)^2 \text{.} \end{equation*}
But \(\uvec{v}'\) lies flat in the \(xy\)-plane, and we have already analyzed that case above:
\begin{equation*} \norm{\uvec{v}'}^2 = (\Delta x)^2 + (\Delta y)^2 \text{.} \end{equation*}
Combining these, we get
\begin{equation*} \unorm{v}^2 = \bbrac{(\Delta x)^2 + (\Delta y)^2} + (\Delta z)^2 = v_1^2 + v_2^2 + v_3^2 \text{,} \end{equation*}
so that
\begin{equation*} \unorm{v} = \sqrt{v_1^2 + v_2^2 + v_3^2} \text{.} \end{equation*}
The word “length” ceases to have any meaning in \(\R^4\text{,}\) so in general we refer to \(\unorm{v}\) as the norm of \(\uvec{v}\) in any dimension. We imagine that if we were able to somehow measure length in \(\R^n\) for \(n\ge 4\text{,}\) then the pattern where we used length in \(\R^2\) to help us compute length in \(\R^3\) would be repeated, and we would be able to use length in \(\R^3\) to help us compute “length” in \(\R^4\text{,}\) and then we would be able to use “length” in \(\R^4\) to help us compute “length” in \(\R^5\text{,}\) and so on. So it seems reasonable to define the norm of a vector \(\uvec{v} = (v_1,v_2,\dotsc,v_n)\) in \(\R^n\) to be
\begin{equation*} \unorm{v} = \sqrt{v_1^2 + v_2^2 + \dotsb + v_n^2} \text{.} \end{equation*}
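For example, the vector \(\uvec{v} = (1,-2,2,4)\) in \(\R^4\) has norm
\begin{equation*} \unorm{v} = \sqrt{1^2 + (-2)^2 + 2^2 + 4^2} = \sqrt{25} = 5 \text{.} \end{equation*}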
Square roots are annoying to work with algebraically, so we often work with the square of a norm, for which we developed the formula
\begin{equation*} \unorm{v}^2 = v_1^2 + v_2^2 + \dotsb + v_n^2 \text{.} \end{equation*}

Subsection 12.3.2 Properties of the norm

We explored some other basic properties of the norm in Discovery 12.3. First, when we take the square root of a nonzero number, we always take the positive square root, so a norm is never a negative number. This property agrees with our conception of norm as a length in \(\R^2\) and \(\R^3\text{,}\) since in geometry we usually require lengths to be nonnegative.
Second, the zero vector \(\zerovec = (0,0,\dotsc,0)\) always has norm \(0\) in every dimension, since
\begin{equation*} \norm{\zerovec} = \sqrt{0^2 + 0^2 + \dotsb + 0^2} = \sqrt{0} = 0 \text{.} \end{equation*}
And it is the only vector that has norm \(0\text{,}\) since as soon as one of the components of a vector is nonzero, the sum of squares under the square root sign in the norm formula will be a positive number. There is no possibility of cancellation to zero under the square root, even if a vector has a mix of positive and negative components, because squaring the components will never have negative results.
Finally, we considered the effect of a scalar multiplication on norm. Geometrically, in \(\R^2\) and \(\R^3\) we think of scalar multiplication as scaling a vector’s length by some scale factor \(k\text{,}\) so we should expect the numerical norm of a vector to be multiplied by the scale factor. And that is (almost) exactly what happens:
\begin{align*} \unorm{v} \amp= \sqrt{v_1^2 + v_2^2 + \dotsb + v_n^2}, \amp \norm{k\uvec{v}} \amp= \sqrt{(kv_1)^2 + (kv_2)^2 + \dotsb + (kv_n)^2} \\ \amp\amp \amp= \sqrt{k^2v_1^2 + k^2v_2^2 + \dotsb + k^2v_n^2} \\ \amp\amp \amp= \sqrt{k^2(v_1^2 + v_2^2 + \dotsb + v_n^2)} \\ \amp\amp \amp= \sqrt{k^2}\sqrt{v_1^2 + v_2^2 + \dotsb + v_n^2} \\ \amp\amp \amp= \sqrt{k^2}\unorm{v}\text{.} \end{align*}
We need to be a little careful with the last step, because it is not always true that \(\sqrt{k^2} = k\). In particular, the result of \(\sqrt{k^2}\) is never negative, so if \(k\) is negative then it is impossible for \(\sqrt{k^2}\) to be equal to \(k\text{.}\) The proper formula for all values of \(k\) is \(\sqrt{k^2} = \abs{k}\text{,}\) so our norm formula becomes
\begin{equation*} \norm{k\uvec{v}} = \abs{k}\unorm{v} \text{.} \end{equation*}
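For example, the vector \(\uvec{v} = (1,2,2)\) has norm \(\unorm{v} = \sqrt{1^2+2^2+2^2} = 3\text{,}\) and scaling by \(k = -2\) gives
\begin{equation*} \norm{-2\uvec{v}} = \norm{(-2,-4,-4)} = \sqrt{4+16+16} = 6 = \abs{-2}\unorm{v} \text{,} \end{equation*}
as the formula predicts.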

Subsection 12.3.3 Unit vectors and normalization

In the plane or in space, a vector with length \(1\) is convenient geometrically because it can be used as a “meter stick”: every scalar multiple of that vector will have length equal to the absolute value of the scale factor. For example, if \(\uvec{u}\) has length \(1\text{,}\) then both \(3\uvec{u}\) and \(-3\uvec{u}\) have length \(3\text{.}\) The same pattern will hold in any dimension when we replace the word “length” with “norm.” A vector with norm \(1\) is called a unit vector. One of the reasons the standard basis vectors are so special is that each of them is a unit vector, as we saw in Discovery 12.4. Thus each standard basis vector can be used as a “meter stick” along the corresponding axis.
We also explored how to scale a nonzero vector to a unit vector in Discovery 12.4. For example, if a vector has norm \(1/2\text{,}\) then we can scale it up to a unit vector by multiplying it by \(2\) to double its norm. Conversely, if a vector has norm \(2\text{,}\) we can scale it down to a unit vector by multiplying it by \(1/2\) to halve its norm. In general, we can scale any nonzero vector \(\uvec{v}\) in \(\R^n\) up or down to a unit vector by multiplying it by scale factor \(k=\frac{1}{\unorm{v}}\text{,}\) since then
\begin{equation*} \norm{\frac{1}{\unorm{v}}\,\uvec{v}} = \abs{\frac{1}{\unorm{v}}}\,\unorm{v} = \frac{1}{\unorm{v}}\,\unorm{v} = 1\text{.} \end{equation*}
In the above, we have used the formula for the norm of a scalar multiple, \(\norm{k\uvec{v}} = \abs{k}\unorm{v}\text{,}\) with \(k=\frac{1}{\unorm{v}}\text{.}\) The absolute value brackets on this particular scalar \(k\) can be removed because norms are never negative, and so \(\abs{k} = k\) in this case.
In fact, every nonzero vector \(\uvec{v}\) is parallel to exactly two corresponding unit vectors, because \(k\uvec{v}\) and \(-k\uvec{v}\) always have the same norm. So
\begin{align*} \uvec{u}_1 \amp= \frac{1}{\unorm{v}}\,\uvec{v}, \amp \amp\text{and} \amp \uvec{u}_2 \amp= -\frac{1}{\unorm{v}}\,\uvec{v} \end{align*}
are always unit vectors, as long as \(\uvec{v} \neq \zerovec\text{.}\)
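For example, the vector \(\uvec{v} = (3,4)\) has norm \(\unorm{v} = 5\text{,}\) so the two unit vectors parallel to \(\uvec{v}\) are
\begin{align*} \uvec{u}_1 \amp= \frac{1}{5}\,(3,4) = \left(\frac{3}{5},\frac{4}{5}\right), \amp \uvec{u}_2 \amp= -\frac{1}{5}\,(3,4) = \left(-\frac{3}{5},-\frac{4}{5}\right), \end{align*}
and indeed \(\norm{\uvec{u}_1} = \sqrt{\frac{9}{25}+\frac{16}{25}} = \sqrt{1} = 1\text{.}\)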

Subsection 12.3.4 Distance between vectors

As we saw in Subsection 11.3.4, if we position \(\uvec{u}\) and \(\uvec{v}\) to share the same initial point, then the difference vectors \(\uvec{u}-\uvec{v}\) and \(\uvec{v}-\uvec{u}\) run between the terminal points of \(\uvec{u}\) and \(\uvec{v}\text{.}\)
[Diagram of difference vectors.]
So we can measure the distance between the terminal points of \(\uvec{u}\) and \(\uvec{v}\) by computing \(\norm{\uvec{u}-\uvec{v}}\) or \(\norm{\uvec{v}-\uvec{u}}\text{,}\) as we discovered in Discovery 12.5. This process is even more straightforward when the common initial point of \(\uvec{u}\) and \(\uvec{v}\) is chosen to be the origin, so that the components of \(\uvec{u}\) and \(\uvec{v}\) are the same as the coordinates of their respective terminal points.
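For example, if \(\uvec{u} = (5,1)\) and \(\uvec{v} = (2,-3)\) both have their initial points at the origin, then the distance between their terminal points is
\begin{equation*} \norm{\uvec{u}-\uvec{v}} = \norm{(3,4)} = \sqrt{3^2+4^2} = 5 \text{.} \end{equation*}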

Remark 12.3.1.

The analysis above illustrates a useful strategy to compute distances in the plane or in space: determine some vector that traverses the distance in question, and then compute the norm of that vector to obtain the desired distance. Combined with some of the vector geometry that we will develop in the next few chapters, this strategy is often easier than trying to determine the coordinates of the points at the endpoints of the desired distance. You should remember this strategy when we explore the geometry of lines and planes in Chapters 13–14.

Subsection 12.3.5 Angle between vectors in the plane and in space

Two vectors in the plane, when given the same initial point, have two angles between them.
[Diagram of the angles between two vectors in the plane.]
We only need to know one of these two angles, since the other can be computed from the knowledge that the sum of the two angles is \(2\pi\) radians. We generally prefer to avoid ambiguity in math, so it would be nice to have a systematic way to choose one of the two angles between a pair of vectors that we can refer to as the angle between the vectors. We will not distinguish between clockwise and counterclockwise, because those terms will become meaningless when we move up a dimension. Instead we will always choose the smaller angle to be the angle between the two vectors.
[Diagram illustrating the preference for the smaller of the two angles between vectors in the plane.]
Thus, the angle between two vectors in the plane will always be between \(0\) and \(\pi\) radians. Note that it is possible for the angle to be exactly \(0\) radians or exactly \(\pi\) radians, in the case that the two vectors are parallel.
[Diagram of the angles between parallel vectors in the plane.]
How can we measure the angle between vectors in three-dimensional space?
Figure 12.3.2. Diagram of the angle between vectors in space, embedded in a plane.
In space, two vectors that are positioned to share the same initial point can be completed to a triangle, and that triangle will lie in a plane. The angle between the two vectors can then be taken to be the smaller of the two angles between the two vectors in that shared plane.

Subsection 12.3.6 Dot product

In Discovery 12.7, we combined vector geometry with some high school geometry to determine a formula for the (cosine of the) angle between two plane vectors. Recall from Subsection 11.3.4 that a vector that runs between the terminal points of two vectors that share an initial point is a difference vector.
[Diagram of the Law of Cosines applied to a triangle of vectors.]
The lengths of the sides of the triangle formed by these three vectors are just the norms of the vectors:
\begin{align*} a \amp= \unorm{u}, \amp b \amp= \unorm{v}, \amp c \amp= \norm{\uvec{u}-\uvec{v}} \text{.} \end{align*}
The Law of Cosines applied to this triangle says that \(a^2 + b^2 - c^2 = 2 a b \cos\theta\text{.}\)
Let’s give our plane vectors some components so that we can work with this equality:
\begin{align*} \uvec{u} \amp= (u_1,u_2), \amp \uvec{v} \amp= (v_1,v_2), \amp \uvec{u}-\uvec{v} \amp= (u_1-v_1,u_2-v_2)\text{.} \end{align*}
Now we have
\begin{align*} a^2 \amp= \unorm{u}^2 \amp b^2 \amp= \unorm{v}^2 \amp c^2 \amp= \norm{\uvec{u}-\uvec{v}}^2 \\ \amp= u_1^2+u_2^2, \amp \amp= v_1^2 + v_2^2, \amp \amp= (u_1-v_1)^2 + (u_2-v_2)^2 \\ \amp\amp\amp\amp \amp= u_1^2 - 2u_1v_1 + v_1^2 + u_2^2 - 2u_2v_2 + v_2^2, \end{align*}
and so after some cancelling we have
\begin{equation*} a^2 + b^2 - c^2 = 2 u_1 v_1 + 2 u_2 v_2 \text{.} \end{equation*}
Substituting the expression on the right above for the left-hand side of the equality \(a^2 + b^2 - c^2 = 2 a b \cos\theta\text{,}\) solving for \(\cos\theta\text{,}\) and then substituting \(a=\unorm{u}\) and \(b=\unorm{v}\) leads to
\begin{gather} \cos\theta = \frac{u_1v_1 + u_2v_2}{\unorm{u}\unorm{v}}\text{.}\tag{✶} \end{gather}
The expression on the left and the denominator on the right are both familiar — we have the ordinary cosine function from trigonometry and we have some vector norms. However, before we worked through Discovery 12.7, the expression in the numerator on the right-hand side was unknown.
Earlier in this chapter, we mentioned how two vectors in space with their initial points at the origin lie inside a common flat plane (see Figure 12.3.2). If we repeated the above geometric analysis of vector angle in this flat surface inside space, we would come to a similar conclusion:
\begin{gather} \cos\theta = \frac{u_1v_1 + u_2v_2 + u_3v_3}{\unorm{u}\unorm{v}}\text{.}\tag{✶✶} \end{gather}
There is an obvious pattern to the numerators on the right-hand sides of equations (✶) and (✶✶). And it seems that the value that these numerator formulas compute is important, since it provides a link between the two most important quantities in geometry: length and angle. So we give it a name, the dot product (or the Euclidean inner product), and use the symbol \(\dotprod{}{}\) between two vectors to represent this quantity. The formula can obviously be extended to higher dimensions than just the plane \(\R^2\) and space \(\R^3\text{,}\) so we will do just that:
\begin{equation*} \udotprod{u}{v} = u_1v_1 + u_2v_2 + \dotsb + u_n v_n \text{.} \end{equation*}
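For example, for \(\uvec{u} = (1,-2,3)\) and \(\uvec{v} = (4,0,5)\) in \(\R^3\text{,}\)
\begin{equation*} \udotprod{u}{v} = 1 \cdot 4 + (-2) \cdot 0 + 3 \cdot 5 = 19 \text{.} \end{equation*}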

Warning 12.3.3.

The result of the computation \(\udotprod{u}{v}\) is a number, which is important to keep in mind if you are working algebraically with an expression containing a dot product. See Proposition 12.5.3 in Subsection 12.5.1 for algebraic rules involving the dot product.

Subsection 12.3.7 Angle between vectors in \(\R^n\)

Even though we can’t “see” geometry in \(\R^n\) for \(n>3\text{,}\) we have already seen that we can perform computations related to geometry in these spaces. We can attach the number \(\unorm{v}\) to a vector \(\uvec{v}\) in \(\R^n\) that can be interpreted as its “length.” And for two vectors \(\uvec{u}\) and \(\uvec{v}\) in \(\R^n\text{,}\) we can compute the number \(\udotprod{u}{v}\) that is somehow related to the geometric relationship between \(\uvec{u}\) and \(\uvec{v}\text{.}\) We have seen that in the plane and in space, \(\udotprod{u}{v}\) links the lengths of \(\uvec{u}\) and \(\uvec{v}\) to the angle between them. But do higher-dimensional vectors have angles between them? Is there some number that we can attach to \(\uvec{u}\) and \(\uvec{v}\) that “measures” the angle between them, even if we can’t see or measure this angle directly?
The equalities in (✶) and (✶✶) suggest a pattern we can copy into \(\R^n\) in general. We define the angle between \(\uvec{u}\) and \(\uvec{v}\) to be the unique angle \(\theta\text{,}\) between \(0\) and \(\pi\text{,}\) that makes
\begin{gather} \cos\theta \qquad \text{and} \qquad \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}}\tag{✶✶✶} \end{gather}
equal.
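For example, for \(\uvec{u} = (1,1,0,0)\) and \(\uvec{v} = (0,1,1,0)\) in \(\R^4\) we compute \(\udotprod{u}{v} = 1\) and \(\unorm{u} = \unorm{v} = \sqrt{2}\text{,}\) so that
\begin{equation*} \cos\theta = \frac{1}{\sqrt{2}\sqrt{2}} = \frac{1}{2} \text{,} \end{equation*}
and we take the angle between \(\uvec{u}\) and \(\uvec{v}\) to be \(\theta = \pi/3\text{.}\)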

Question 12.3.4.

  • For every pair of vectors \(\uvec{u}\) and \(\uvec{v}\) in \(\R^n\text{,}\) can we always determine a suitable angle \(\theta\) in the domain \(0\le\theta\le\pi\) that works (i.e. that makes the two quantities in (✶✶✶) equal)?
  • For some pair of vectors \(\uvec{u}\) and \(\uvec{v}\) in \(\R^n\text{,}\) might it be possible that there are several values of \(\theta\) in the domain \(0\le\theta\le\pi\) that work?
Fortunately, for a pair of (nonzero) plane vectors or space vectors, there is exactly one number (once we restrict to the domain \(0\le\theta\le\pi\)) that gets to call itself the angle between the vectors. It would not bode well for the possibility of somehow doing geometry in higher-dimensional spaces if there were sometimes two numbers that could be reasonably called “the angle” between a pair of vectors, or sometimes none at all. Luckily neither of these is possible.
First, for a pair of nonzero vectors in \(\R^n\text{,}\) the formula
\begin{equation*} \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}} \end{equation*}
can always be computed, and the result of the computation is always a single, definite number.
Second, looking at the graph of \(y = \cos\theta\) below, there are no instances in the domain \(0 \le \theta \le \pi\) where \(\cos\theta\) computes to the same value for two different values of \(\theta\text{.}\)
[Graph of \(y=\cos\theta\text{.}\)]
On this domain, we say the function \(\cos\theta\) is one-to-one. So a pair of vectors in \(\R^n\) can never have two angles between them in the domain \(0\le\theta\le\pi\text{,}\) because there are never two solutions to the equation
\begin{gather} \cos\theta = \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}}\tag{†} \end{gather}
in that domain.
But is there always some solution to equation (†)? No matter what domain you work on, \(\cos\theta\) never evaluates to a number greater than \(1\) or less than \(-1\text{.}\) Perhaps if we tried hard enough we could discover some unlucky pair of vectors \(\uvec{u}\) and \(\uvec{v}\) in \(\R^{13}\) where
\begin{equation*} \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}} \end{equation*}
computed to a number greater than \(1\) or to a number less than \(-1\text{.}\) In that case, it would be impossible for \(\cos\theta\) to be equal to that number, and \(\uvec{u}\) and \(\uvec{v}\) would have no angle between them. It turns out that forming such an unlucky pair of vectors is impossible, and we know this courtesy of a couple of dead guys.
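The dead guys in question are Augustin-Louis Cauchy and Hermann Schwarz, and the result that rules out such unlucky pairs is the Cauchy–Schwarz inequality: for every pair of vectors \(\uvec{u}\) and \(\uvec{v}\) in \(\R^n\text{,}\)
\begin{equation*} \abs{\udotprod{u}{v}} \le \unorm{u}\unorm{v} \text{,} \end{equation*}
so that for nonzero \(\uvec{u}\) and \(\uvec{v}\) the fraction above always lies between \(-1\) and \(1\text{.}\)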
Since the graph \(y=\cos\theta\) passes through every possible \(y\)-value in the range \(-1 \le y \le 1\text{,}\) and does so only once, equation (†) always has a unique solution for a pair of nonzero vectors.

Subsection 12.3.8 Dot product versus norm

We have already seen that the dot product is intimately tied to the geometry of \(\R^n\text{,}\) acting as a link between norm (length) and angle. But as we discovered in Discovery 12.9, it is also directly linked to the norm by the observation
\begin{align*} \unorm{v}^2 \amp= (\sqrt{v_1^2 + v_2^2 + \dotsb + v_n^2})^2 \amp \udotprod{v}{v} \amp= v_1v_1 + v_2v_2 + \dotsb + v_n v_n \\ \amp= v_1^2 + v_2^2 + \dotsb + v_n^2, \amp \amp= v_1^2 + v_2^2 + \dotsb + v_n^2. \end{align*}
So we obtain a very convenient formula: \(\unorm{v}^2 = \udotprod{v}{v}\text{.}\)
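As a quick check, for \(\uvec{v} = (1,2,2)\) we have \(\unorm{v} = 3\) and
\begin{equation*} \udotprod{v}{v} = 1^2 + 2^2 + 2^2 = 9 = \unorm{v}^2 \text{.} \end{equation*}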

Remark 12.3.6.

Really, this “new” link between dot product and norm is just the special case of equation (†) where \(\uvec{u}\) is taken to be equal to \(\uvec{v}\text{,}\) since in this case the angle between \(\uvec{u}\) and \(\uvec{v}\) (i.e. between \(\uvec{v}\) and itself) is zero, and \(\cos 0 = 1\text{.}\)

Subsection 12.3.9 Dot product as matrix multiplication

The pattern in the formula for the dot product of two vectors should look vaguely familiar to you — it is a sum of products, which is exactly the pattern of the left-hand side of a linear equation, and so also the pattern in our “row-times-column” view of matrix multiplication in Subsection 4.3.7. In fact, the dot product can be defined in terms of matrix multiplication if we take our vectors to be column vectors and use the transpose to turn one of the columns into a row. Indeed, for
\begin{align*} \uvec{u} \amp= \begin{bmatrix}u_1\\u_2\\\vdots\\u_n\end{bmatrix}, \amp \uvec{v} \amp= \begin{bmatrix}v_1\\v_2\\\vdots\\v_n\end{bmatrix}, \end{align*}
we have
\begin{align*} \utrans{\uvec{v}} \uvec{u} \amp= \begin{bmatrix} v_1 \amp v_2 \amp \cdots \amp v_n \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}\\ \amp= v_1 u_1 + v_2 u_2 + \dotsb + v_n u_n \\ \amp= u_1 v_1 + u_2 v_2 + \dotsb + u_n v_n \\ \amp= \udotprod{u}{v} \text{.} \end{align*}
So we obtain a matrix formula for dot product: \(\udotprod{u}{v} = \utrans{\uvec{v}} \uvec{u}\text{.}\)
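For example, with \(\uvec{u} = (1,-2,3)\) and \(\uvec{v} = (4,0,5)\) as before,
\begin{equation*} \utrans{\uvec{v}} \uvec{u} = \begin{bmatrix} 4 \amp 0 \amp 5 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} = 4 \cdot 1 + 0 \cdot (-2) + 5 \cdot 3 = 19 = \udotprod{u}{v} \text{.} \end{equation*}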

Remark 12.3.7.

  • Technically, the result of multiplying the \(1 \times n\) matrix \(\utrans{\uvec{v}}\) and the \(n \times 1\) matrix \(\uvec{u}\) should be a \(1 \times 1\) matrix. But algebraically there is no difference between numbers and \(1 \times 1\) matrices with respect to the operations of addition, subtraction, and multiplication, so it is common to think of a \(1 \times 1\) matrix as just a number, as we did above.
  • It might seem more natural to use
    \begin{equation*} \udotprod{u}{v} \,=\, \utrans{\uvec{u}} \uvec{v} \,=\, \begin{bmatrix} u_1 \amp u_2 \amp \cdots \amp u_n \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} \,=\, u_1 v_1 + u_2 v_2 + \dotsb + u_n v_n \end{equation*}
    (as we did in Discovery 12.11), instead of the seemingly pointless reversal of order in the formula \(\udotprod{u}{v} = \utrans{\uvec{v}} \uvec{u} \text{.}\) However, if you continue on in your study of linear algebra beyond this course, you will discover that this reversal of order is necessary when studying complex vectors (that is, vectors where the components are complex numbers). Since this reversal of order is harmless here, we will start using it now so as to avoid confusion later.