
Section 14.3 Concepts

Subsection 14.3.1 Values of \(\udotprod{u}{v}\)

In Discovery 14.1, we compared the graph of the cosine function on the domain \(0\le\theta\le\pi\) with the formula

\begin{gather} \cos\theta = \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}}\text{,}\label{equation-orthog-concepts-cos-vs-dot-prod-eqn}\tag{\(\star\)} \end{gather}

where \(\theta\) is the angle between nonzero vectors \(\uvec{u}\) and \(\uvec{v}\text{.}\) On the right of equation (\(\star\)), the denominator is always positive, so whether the whole fraction is positive, negative, or zero depends entirely on the dot product in the numerator. On the left, the cosine function is positive, negative, or zero precisely when the angle \(\theta\) is acute, obtuse, or right, respectively. So we come to the following conclusions.

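\begin{equation*} \begin{array}{ll} \theta \amp \udotprod{u}{v} \\ \hline \text{acute: } 0\le \theta \lt \pi/2 \amp \text{positive} \\ \text{right: } \theta = \pi/2 \amp \text{zero} \\ \text{obtuse: } \pi/2 \lt \theta \le \pi \amp \text{negative} \end{array} \end{equation*}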
Figure 14.3.1.
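
The sign test in Figure 14.3.1 can be tried out numerically. The following is a minimal Python sketch, not part of the original text; the function names `dot`, `angle_type`, and `angle_between` are illustrative choices, and vectors are assumed to be nonzero tuples of numbers.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def angle_type(u, v):
    """Classify the angle between nonzero u and v using only the sign of u . v."""
    d = dot(u, v)
    if d > 0:
        return "acute"
    if d < 0:
        return "obtuse"
    return "right"

def angle_between(u, v):
    """Angle in radians from cos(theta) = (u . v) / (|u| |v|); u and v must be nonzero."""
    cos_theta = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    # Clamp to guard against rounding that lands slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))

print(angle_type((1, 2), (3, 1)))    # dot product is 5
print(angle_type((1, 2), (2, -1)))   # dot product is 0
print(angle_type((1, 2), (-3, 0)))   # dot product is -3
```

Note that `angle_type` never computes a norm or an inverse cosine: as the table says, the sign of the dot product alone decides the case.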

Subsection 14.3.2 Orthogonal vectors

Right angles are extremely important in geometry, and from Figure 14.3.1 we see that the dot product gives us a very convenient way to tell when the angle \(\theta\) between two nonzero vectors \(\uvec{u}\) and \(\uvec{v}\) is right: we have \(\theta = \pi/2\) precisely when \(\udotprod{u}{v} = 0\). In the plane or in space, \(\uvec{u}\) and \(\uvec{v}\) will be perpendicular when \(\theta = \pi/2\) and \(\udotprod{u}{v} = 0\text{.}\) Since we can't “see” right angles and perpendicular lines in higher dimensions, in general we say that \(\uvec{u}\) and \(\uvec{v}\) are orthogonal when \(\udotprod{u}{v} = 0\text{.}\)

In Discovery 14.2, we tried to find a pattern to the task of choosing some vector that is orthogonal to a given one in the plane. Rather than struggle with the geometry, we unleash the power of algebra: given vector \(\uvec{u} = (a,b)\text{,}\) we are looking for a vector \(\uvec{v}\) so that \(\udotprod{u}{v} = 0\text{.}\) Expanding out the dot product, we are looking to fill in the blanks in the following equation with components for \(\uvec{v}\text{:}\)

\begin{equation*} a\cdot\boxed{\phantom{X}} + b\cdot\boxed{\phantom{X}} = 0 \text{.} \end{equation*}

Two numbers add to zero only if one is the negative of the other. We can make both terms in the sum the same number by entering \(b\) in the first blank and \(a\) in the second, so we can make the sum cancel to zero by also making one of those entries negative:

\begin{equation*} a\cdot\boxed{b} + b\cdot\boxed{(-a)} = 0 \text{.} \end{equation*}

We have now answered the question in Discovery 14.2.c: given vector \(\uvec{u} = (a,b)\) in the plane, one example of a vector that is orthogonal to \(\uvec{u}\) is \(\uvec{v} = (b,-a)\text{.}\)
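
As a quick check of this pattern, here is a small Python sketch (the helper name `orthogonal_in_plane` is made up for illustration) that builds \((b,-a)\) from \((a,b)\) and confirms the dot product is zero for a few sample vectors.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def orthogonal_in_plane(u):
    """Given u = (a, b) in the plane, return (b, -a), which is orthogonal to u."""
    a, b = u
    return (b, -a)

for u in [(2, 3), (-1, 4), (5, 0)]:
    v = orthogonal_in_plane(u)
    # a*b + b*(-a) always cancels to zero.
    print(u, v, dot(u, v))
```

Any scalar multiple of the result, such as \((-b,a)\), would work equally well; the sketch returns just one convenient choice.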

Subsection 14.3.3 Orthogonal projection

Orthogonal projection is a vector solution to a problem in geometry.

Question 14.3.2.

Given a line through the origin in the plane, and a point not on the line, what point on the line is closest to the given point?

In Question 14.3.2, write \(\ell\) for the line through the origin and \(Q\) for the point not on that line. Consider the point \(P\) on \(\ell\) at the foot of the perpendicular to \(\ell\) from \(Q\text{.}\) Any other point \(P'\) on \(\ell\) will form a right triangle with \(P\) and \(Q\text{,}\) making \(P'\) farther from \(Q\) than \(P\) is, since the distance \(P'Q\) is the length of the hypotenuse in that right triangle.

Diagram illustrating the shortest distance from a point to a line in the plane.

All we know about \(P\) is that it is on line \(\ell\) and it is at the vertex of a right angle with \(\ell\) and \(Q\text{.}\) But if we introduce some vectors to help tackle this problem, then maybe we can use what we know about the dot product and right angles to help determine \(P\text{.}\)

Diagram illustrating orthogonal projection onto a line through the origin.

In this diagram, \(\uvec{u}\) is the vector corresponding to directed line segment \(\abray{OQ}\text{,}\) and \(\uvec{w}\) is the vector corresponding to the directed line segment \(\abray{OP}\text{,}\) where \(P\) is our unknown closest point. Since \(\uvec{w}\) is placed with its tail at the origin, the components of \(\uvec{w}\) are precisely the coordinates of \(P\text{.}\) So determining \(\uvec{w}\) will solve the problem.

We are assuming that the line \(\ell\) is known, and it would be nice to also have a vector means of describing it. But the vectors created by the points on this line (using the origin as a universal tail point) will all be parallel to each other, so (as we discovered in Discovery 14.3.a) line \(\ell\) could be described as all scalar multiples of a particular vector \(\uvec{a}\text{.}\) This vector can be arbitrarily chosen as any vector parallel to the line. Once we have chosen \(\uvec{a}\text{,}\) we have reduced our problem from determining the two unknown components of the vector \(\uvec{w}\) to determining a single unknown scalar \(k\) so that \(\uvec{w} = k\uvec{a}\text{.}\)

As mentioned, since \(P\) is the closest point, the directed line segment \(\abray{PQ}\) must be perpendicular to \(\ell\text{.}\) On the diagram above, we have used the vector \(\uvec{n}\) to represent this directed line segment. As in Discovery 14.3.b, we know that \(\udotprod{n}{a}\) must be zero: this is the perpendicular condition. However, the vector \(\uvec{n}\) is unknown as well, since we don't know its initial point. But we can also use the triangle formed by \(\uvec{u}\text{,}\) \(\uvec{n}\text{,}\) and \(\uvec{w}\) to replace \(\uvec{n}\text{:}\)

\begin{align*} \uvec{w} + \uvec{n} \amp= \uvec{u} \amp \amp\implies \amp \uvec{n} \amp= \uvec{u} - \uvec{w} = \uvec{u} - k\uvec{a} \end{align*}

Replacing \(\uvec{n}\) by this expression in the condition \(\udotprod{n}{a} = 0\) gives \(\dotprod{(\uvec{u} - k\uvec{a})}{\uvec{a}} = \udotprod{u}{a} - k\unorm{a}^2 = 0\text{,}\) an equation of numbers that we can solve for the unknown scale factor \(k\text{,}\) as we did in Discovery 14.3.d:

\begin{equation*} k = \frac{\udotprod{u}{a}}{\unorm{a}^2} \text{.} \end{equation*}

This vector \(\uvec{w} = k\uvec{a}\) pointing from the origin to the desired closest point \(P\) is called the projection of \(\uvec{u}\) onto \(\uvec{a}\) or sometimes the vector component of \(\uvec{u}\) parallel to \(\uvec{a}\), and we write \(\proj_{\uvec{a}} \uvec{u}\) to represent it. That is,

\begin{equation*} \proj_{\uvec{a}}{\uvec{u}} = \frac{\udotprod{u}{a}}{\unorm{a}^2}\, \uvec{a} \text{.} \end{equation*}
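
To see the formula answer Question 14.3.2 on a concrete case, here is a minimal Python sketch. The sample point \(Q(3,4)\) and direction \(\uvec{a} = (2,1)\) are made-up values, the function names are illustrative, and exact arithmetic with `Fraction` assumes integer (or rational) components.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def proj(u, a):
    """Projection of u onto nonzero a: ((u . a) / |a|^2) a, in exact arithmetic."""
    k = Fraction(dot(u, a), dot(a, a))
    return tuple(k * ai for ai in a)

def dist_sq(p, q):
    """Squared distance between points p and q."""
    d = tuple(pi - qi for pi, qi in zip(p, q))
    return dot(d, d)

q = (3, 4)        # the point Q, also the vector u = OQ
a = (2, 1)        # a vector parallel to the line through the origin
p = proj(q, a)    # k = 10/5 = 2, so p = (4, 2)

# Sample other points s*a on the line and confirm none is closer to Q than p.
others = [tuple(Fraction(t, 4) * ai for ai in a) for t in range(-8, 17)]
print(all(dist_sq(p, q) <= dist_sq(x, q) for x in others))
```

Sampling points is only a sanity check, of course; the right-triangle argument above is what actually guarantees that \(P\) is closest.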

The normal vector \(\uvec{n}\) is sometimes called the vector component of \(\uvec{u}\) orthogonal to \(\uvec{a}\). These two vectors are called components of \(\uvec{u}\) (relative to \(\uvec{a}\)) because they represent an orthogonal decomposition of \(\uvec{u}\):

\begin{equation*} \uvec{u} = \uvec{w} + \uvec{n} \text{,} \end{equation*}

where \(\uvec{w}\) is parallel to \(\uvec{a}\) and \(\uvec{n}\) is orthogonal to \(\uvec{a}\text{.}\) While this decomposition is relative to \(\uvec{a}\text{,}\) it is really only the direction of \(\uvec{a}\) that matters — if \(\uvec{a}'\) is parallel to \(\uvec{a}\) (even possibly opposite to \(\uvec{a}\)), then both

\begin{align*} \uvec{w} \amp= \proj_{\uvec{a}} \uvec{u} = \proj_{\uvec{a}'} \uvec{u}, \amp \uvec{n} \amp= \uvec{u} - \proj_{\uvec{a}} \uvec{u} = \uvec{u} - \proj_{\uvec{a}'} \uvec{u} \end{align*}

will be true.
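
The orthogonal decomposition, and its independence from the particular choice of parallel vector, can be checked with a short Python sketch. The vectors below are sample values chosen for illustration, and `decompose` is a hypothetical helper name; components are assumed to be integers or rationals so that the arithmetic is exact.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def proj(u, a):
    """Projection of u onto nonzero a: ((u . a) / |a|^2) a."""
    k = Fraction(dot(u, a), dot(a, a))
    return tuple(k * ai for ai in a)

def decompose(u, a):
    """Return (w, n) with u = w + n, w parallel to a, and n orthogonal to a."""
    w = proj(u, a)
    n = tuple(ui - wi for ui, wi in zip(u, w))
    return w, n

u = (3, 4)
w, n = decompose(u, (2, 1))        # w = (4, 2), n = (-1, 2)
w2, n2 = decompose(u, (-6, -3))    # (-6, -3) = -3 * (2, 1), opposite direction

print(dot(n, (2, 1)) == 0)         # n is orthogonal to a
print(w == w2 and n == n2)         # same decomposition for a parallel a'
```

The scale factor \(k\) changes when \(\uvec{a}\) is replaced by \(\uvec{a}'\) (here from \(2\) to \(-2/3\)), but the product \(k\uvec{a}\) does not.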

Remark 14.3.3.

All of these calculations can be performed in higher dimensions as well, the only difference being that there is no longer one unique perpendicular direction to a given vector \(\uvec{a}\text{.}\)

Subsection 14.3.4 Normal vectors of lines in the plane

Consider the line \(2x+3y=0\) that we investigated in Discovery 14.6. The point \((3,-2)\) is on this line, since

\begin{gather} 2\cdot 3 + 3\cdot (-2) = 0\text{.}\label{equation-orthog-concepts-point-normal-lines-1-point-calc}\tag{\(\star\star\)} \end{gather}

The left-hand side of this calculation looks a lot like a dot product — we could reinterpret equation (\(\star\star\)) as

\begin{equation*} \dotprod{(2,3)}{(3,-2)} = 0 \text{.} \end{equation*}

So verifying that the point \((3,-2)\) is on the line is equivalent to checking that the corresponding vector \(\uvec{v} = (3,-2)\) (with its tail at the origin) is orthogonal to the vector \(\uvec{n} = (2,3)\) whose components are the coefficients from our line equation.

Diagram of a normal vector to a line in the plane (homogeneous case).

Every other point \(\uvec{x} = (x,y)\) on the line satisfies the same relationship, as the equation for the line could be rewritten in a vector form as

\begin{gather} \udotprod{n}{x} = 0\text{.}\label{equation-orthog-concepts-point-normal-lines-1-normal-form}\tag{\(\star\star\star\)} \end{gather}

The vector \(\uvec{n}\) is called a normal vector for the line. Note that normal vectors for a line are not unique — every nonzero scalar multiple of \(\uvec{n}\) will also be normal to the line, and this is equivalent to noting that we could multiply the equation \(2 x + 3 y = 0\) by any nonzero factor to obtain a different equation that represents the same line in the plane.

In Discovery 14.7 we considered a line defined by a nonhomogeneous equation \(2 x + 3 y = 8\text{.}\) This line has the same slope as the line defined by \(2 x + 3 y = 0\) that we investigated above, and so the vector \(\uvec{n} = (2,3)\) obtained from the coefficients on \(x\) and \(y\) in the equation must still be normal. The constant \(8\) just changes the \(y\)-intercept.

Diagram of a normal vector to a line in the plane (nonhomogeneous case).

In the homogeneous case, vectors from the origin determined by a point on the line were also parallel to the line. Since things have shifted away from the origin in the nonhomogeneous case, to get a vector parallel to the line we need to consider two vectors from the origin to points on the line. Two convenient points on this line are \(Q(1,2)\) and \(R(4,0)\text{,}\) with corresponding vectors \(\uvec{x}_0 = (1,2)\) and \(\uvec{x}_1 = (4,0)\text{.}\) Then the difference vector

\begin{equation*} \uvec{v} = \uvec{x}_1 - \uvec{x}_0 = (3,-2) \end{equation*}

is parallel to the line, as in the diagram above. In fact, this vector \(\uvec{v}\) is the same as the previous vector \(\uvec{v}\) that appears parallel to the line through the origin in the diagram for the homogeneous case above, so we know it satisfies \(\udotprod{n}{v} = 0\text{.}\)

Is there a way to use the normal vector \(\uvec{n}\) to create a vector condition by which we can tell if a vector \(\uvec{x}\) represents a point on the line, as we did with equation (\(\star\star\star\)) in the homogeneous case? We need two points on the line to create a parallel difference vector, but we could compare the variable vector \(\uvec{x}\) with an arbitrarily chosen fixed vector representing a point on the line (like \(\uvec{x}_0\text{,}\) say).

Diagram of a normal vector to a line in the plane (nonhomogeneous case).

Every such difference vector \(\uvec{x}-\uvec{x}_0\) is parallel to the line and hence orthogonal to the normal vector \(\uvec{n}\text{,}\) so that we can describe the line as all points where the corresponding vector \(\uvec{x}\) satisfies

\begin{gather} \dotprod{\uvec{n}}{(\uvec{x} - \uvec{x}_0)} = 0\text{.}\label{equation-orthog-concepts-point-normal-form}\tag{\(\dagger\)} \end{gather}

This is called the point-normal form for the line, referring to the point on the line at the terminal point of \(\uvec{x}_0\) and the normal vector \(\uvec{n}\text{.}\)
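
Point-normal form translates directly into a membership test. Below is a minimal Python sketch for the line \(2x + 3y = 8\) from above, using \(\uvec{n} = (2,3)\) and \(\uvec{x}_0 = (1,2)\text{;}\) the function name `on_line` is an illustrative choice, not notation from the text.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def on_line(x, n, x0):
    """True when n . (x - x0) = 0, i.e. x satisfies the point-normal form."""
    diff = tuple(xi - x0i for xi, x0i in zip(x, x0))
    return dot(n, diff) == 0

n = (2, 3)     # normal vector from the coefficients of 2x + 3y = 8
x0 = (1, 2)    # a fixed point on the line

print(on_line((4, 0), n, x0))    # difference (3, -2): 2*3 + 3*(-2) = 0
print(on_line((-2, 4), n, x0))   # difference (-3, 2): 2*(-3) + 3*2 = 0
print(on_line((0, 0), n, x0))    # difference (-1, -2): 2*(-1) + 3*(-2) = -8
```

Expanding the dot product shows why this agrees with the original equation: \(\dotprod{\uvec{n}}{(\uvec{x} - \uvec{x}_0)} = 0\) is the same as \(\udotprod{n}{x} = \dotprod{\uvec{n}}{\uvec{x}_0}\text{,}\) and here \(\dotprod{\uvec{n}}{\uvec{x}_0} = 2\cdot 1 + 3\cdot 2 = 8\text{.}\)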

Subsection 14.3.5 Normal vectors of planes in space

A similar analysis can be made for an equation \(a x + b y + c z = d\) describing a plane in space. The coefficients form a normal vector \(\uvec{n} = (a,b,c)\text{.}\) If vectors \(\uvec{x}_0\) and \(\uvec{x}_1\) both have initial point at the origin and terminal points on the plane, then the difference vector \(\uvec{x}_1-\uvec{x}_0\) is parallel to the plane, hence orthogonal to \(\uvec{n}\text{.}\) If we keep a fixed choice of \(\uvec{x}_0\) but replace \(\uvec{x}_1\) by a variable vector \(\uvec{x}\text{,}\) we can describe the plane as all points for which the difference vector \(\uvec{x}-\uvec{x}_0\) is orthogonal to \(\uvec{n}\text{,}\) giving us a point-normal form for a plane just as in equation (\(\dagger\)).
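
Going the other way, a normal vector and one point on a plane are enough to recover an algebraic equation for it. The Python sketch below illustrates this under made-up sample data (\(\uvec{n} = (1,2,3)\) and the point \((1,1,1)\)); `plane_coefficients` is a hypothetical helper name.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def plane_coefficients(n, x0):
    """Return (a, b, c, d) so that a x + b y + c z = d matches n . (x - x0) = 0."""
    a, b, c = n
    # Expanding n . (x - x0) = 0 gives n . x = n . x0, so d = n . x0.
    return (a, b, c, dot(n, x0))

n = (1, 2, 3)     # sample normal vector
x0 = (1, 1, 1)    # sample point on the plane

a, b, c, d = plane_coefficients(n, x0)
print(a, b, c, d)  # coefficients of x + 2y + 3z = 6

# Two other points satisfying the equation give a difference parallel to the plane.
x1, x2 = (6, 0, 0), (0, 0, 2)
diff = tuple(p - q for p, q in zip(x2, x1))
print(dot(n, diff))
```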

Remark 14.3.4.

A line in space does not have a point-normal form, because it does not have one unique normal “direction” like a line in the plane or a plane in space does. To describe a line in space in a similar fashion you would need two normal vectors. We will see several more convenient ways to describe a line in space in the next chapter.

Subsection 14.3.6 The cross product

Seeing how the algebraic equation for a plane in \(\R^3\) is connected to a normal vector to the plane, a basic problem is how to quickly obtain a normal vector. If we know two vectors that are parallel to the plane in question, the problem reduces to the following.

Question 14.3.5.

Given two nonzero, nonparallel vectors in \(\R^3\text{,}\) determine a third vector that is orthogonal to each of the first two.

Diagram of the setup for the cross product problem (Question 14.3.5 above).

So if \(\uvec{u} = (u_1,u_2,u_3)\) and \(\uvec{v} = (v_1,v_2,v_3)\) are our starting vectors, we would like to simultaneously solve the equations

\begin{align*} \udotprod{u}{x} \amp= 0, \amp \udotprod{v}{x} \amp= 0, \end{align*}

for the unknown vector \(\uvec{x} = (x,y,z)\text{.}\) Expanding out the dot products, we get (surprise!) a system of linear equations:

\begin{equation*} \left\{ \begin{array}{lclclcr} u_1 x \amp + \amp u_2 y \amp + \amp u_3 z \amp = \amp 0, \\ v_1 x \amp + \amp v_2 y \amp + \amp v_3 z \amp = \amp 0. \end{array} \right. \end{equation*}

Specifically, we get a homogeneous system of two equations in the three unknown coordinates \(x,y,z\text{.}\) Now, since this system is homogeneous, it is consistent. But its general solution will also require at least one parameter, since its rank is at most \(2\text{,}\) while we have three variables. In the diagram above, we can see what the “freedom” of a parameter corresponds to — we can make \(\uvec{x}\) longer or shorter, or turn it around to be opposite of the way it is pictured, and it will remain orthogonal to \(\uvec{u}\) and \(\uvec{v}\text{.}\) Our end goal is a calculation formula and procedure that will compute one particular solution to this problem, so let's introduce a somewhat arbitrary additional equation to eliminate the need for a parameter in the solution.

\begin{equation*} \left\{ \begin{array}{rcrcrcr} x \amp + \amp y \amp + \amp z \amp = \amp 1, \\ u_1 x \amp + \amp u_2 y \amp + \amp u_3 z \amp = \amp 0, \\ v_1 x \amp + \amp v_2 y \amp + \amp v_3 z \amp = \amp 0. \end{array} \right. \end{equation*}

In matrix form, this system can be expressed as \(A \uvec{x} = \uvec{b}\text{,}\) with

\begin{align} A \amp= \begin{bmatrix} 1 \amp 1 \amp 1 \\ u_1 \amp u_2 \amp u_3 \\ v_1 \amp v_2 \amp v_3 \end{bmatrix}, \amp \uvec{b} \amp= \begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}.\label{equation-orthog-concepts-cross-product-system-matrices}\tag{\(\dagger\dagger\)} \end{align}

Assuming that \(\det A \neq 0\text{,}\) Cramer's rule tells us the solution to this system.

\begin{align*} x \amp= \frac{1}{\det A} \begin{vmatrix} 1 \amp 1 \amp 1 \\ 0 \amp u_2 \amp u_3 \\ 0 \amp v_2 \amp v_3 \end{vmatrix} \amp y \amp= \frac{1}{\det A} \begin{vmatrix} 1 \amp 1 \amp 1 \\ u_1 \amp 0 \amp u_3 \\ v_1 \amp 0 \amp v_3 \end{vmatrix} \amp z \amp= \frac{1}{\det A} \begin{vmatrix} 1 \amp 1 \amp 1 \\ u_1 \amp u_2 \amp 0 \\ v_1 \amp v_2 \amp 0 \end{vmatrix}\\ \amp= \frac{1}{\det A} \begin{vmatrix} u_2 \amp u_3 \\ v_2 \amp v_3 \end{vmatrix} \amp \amp= \frac{-1}{\det A} \begin{vmatrix} u_1 \amp u_3 \\ v_1 \amp v_3 \end{vmatrix} \amp \amp= \frac{1}{\det A} \begin{vmatrix} u_1 \amp u_2 \\ v_1 \amp v_2 \end{vmatrix} \end{align*}

Now, each of \(x,y,z\) has a common factor of \(1/\det A\text{,}\) and all this common factor does is scale the length of our solution vector \(\uvec{x}\) without affecting orthogonality with \(\uvec{u}\) and \(\uvec{v}\text{.}\) Even worse, \(\det A\) depends on that extra equation we threw in, and we would like our solution to depend only on \(\uvec{u}\) and \(\uvec{v}\text{.}\) So let's remove it and use the solution

\begin{equation*} \uvec{x} = \left( \begin{vmatrix} u_2 \amp u_3 \\ v_2 \amp v_3 \end{vmatrix}, \; -\begin{vmatrix} u_1 \amp u_3 \\ v_1 \amp v_3 \end{vmatrix}, \; \begin{vmatrix} u_1 \amp u_2 \\ v_1 \amp v_2 \end{vmatrix} \right). \end{equation*}

We call this the cross product of \(\uvec{u}\) and \(\uvec{v}\), and write \(\ucrossprod{u}{v}\) instead of \(\uvec{x}\text{.}\) There is a trick to remembering how to compute the cross product: if we replace the top row of \(A\) by the standard basis vectors \(\ivec{},\jvec{},\kvec{}\) in \(\R^3\text{,}\) then the cross product will be equal to its determinant expanded by cofactors along the first row. That is, setting

\begin{align} \ucrossprod{u}{v} \;=\; \begin{vmatrix} \ivec{} \amp \jvec{} \amp \kvec{} \\ u_1 \amp u_2 \amp u_3 \\ v_1 \amp v_2 \amp v_3 \end{vmatrix}\label{equation-orthog-concepts-cross-prod-det-formula}\tag{\(\dagger\dagger\dagger\)} \end{align}

and expanding the determinant along the first row yields

\begin{equation*} \ucrossprod{u}{v} \;=\; \begin{vmatrix} u_2 \amp u_3 \\ v_2 \amp v_3 \end{vmatrix} \ivec{} \; - \; \begin{vmatrix} u_1 \amp u_3 \\ v_1 \amp v_3 \end{vmatrix} \jvec{} \; + \; \begin{vmatrix} u_1 \amp u_2 \\ v_1 \amp v_2 \end{vmatrix} \kvec{}, \end{equation*}

as desired. See Example 14.4.4 in Subsection 14.4.3 for an example of using formula (\(\dagger\dagger\dagger\)) to compute cross products.
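
The cofactor expansion can be transcribed almost symbol for symbol into code. The following Python sketch is one way to do so; the helper names `det2` and `cross` and the sample vectors are illustrative assumptions, not taken from the text's examples.

```python
def dot(u, v):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def det2(a, b, c, d):
    """Determinant of the 2x2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def cross(u, v):
    """Cross product of u and v in R^3 via cofactor expansion along the first row."""
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (
        det2(u2, u3, v2, v3),     # coefficient of i
        -det2(u1, u3, v1, v3),    # coefficient of j (note the minus sign)
        det2(u1, u2, v1, v2),     # coefficient of k
    )

u = (1, 2, 3)
v = (4, 5, 6)
x = cross(u, v)                   # (2*6 - 3*5, -(1*6 - 3*4), 1*5 - 2*4) = (-3, 6, -3)
print(x, dot(u, x), dot(v, x))    # both dot products should be zero
```

The minus sign on the middle component is the one most easily forgotten when computing by hand; it comes from the cofactor sign pattern along the first row.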

The cross product follows the right-hand rule — if you orient your right hand so that your fingers point in the direction of \(\uvec{u}\) and curl towards \(\uvec{v}\text{,}\) then your thumb will point in the direction of \(\ucrossprod{u}{v}\text{.}\)

Computing \(\ucrossprod{v}{u}\) instead of \(\ucrossprod{u}{v}\) should still produce a vector that is orthogonal to both \(\uvec{u}\) and \(\uvec{v}\text{,}\) but the right-hand rule tells us that the two should be opposite to each other. From equation (\(\dagger\dagger\dagger\)) we can be even more specific. Computing \(\ucrossprod{v}{u}\) would swap the second and third rows of the special matrix in equation (\(\dagger\dagger\dagger\)), and we know that the resulting determinant would be the negative of that for the matrix for computing \(\ucrossprod{u}{v}\text{,}\) and so

\begin{equation*} \ucrossprod{v}{u} = -\ucrossprod{u}{v} \text{.} \end{equation*}

See Proposition 14.5.5 in Subsection 14.5.3 for more properties of the cross product.

Remark 14.3.6.

There is one more thing to say about our development of the cross product — Cramer's rule can only be applied if \(\det A\) is not zero, where \(A\) is the matrix in (\(\dagger\dagger\)). However, the coefficients in the extra equation we introduced did not figure into our final solution. So if \(\det A\) ended up being zero for some particular vectors \(\uvec{u}\) and \(\uvec{v}\text{,}\) we could just change the variable coefficients in that extra equation (but keep the \(1\) in the equals column) so that \(\det A\) is not zero, and we would still come to the same formula for \(\ucrossprod{u}{v}\text{.}\) And it follows from concepts we will learn in Chapter 21 that it is always possible to fill in the top row of this matrix \(A\) so that its determinant is nonzero, as long as we start with nonparallel vectors \(\uvec{u}\) and \(\uvec{v}\text{.}\)
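
The point of this remark can be illustrated with a small experiment. The Python sketch below solves the three-equation system by Cramer's rule for different choices of top row; the sample vectors and the helper names `det3` and `cramer_solution` are assumptions made for illustration, and integer inputs are assumed so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer_solution(top, u, v):
    """Solve top . x = 1, u . x = 0, v . x = 0 by Cramer's rule; return (det A, x)."""
    A = [list(top), list(u), list(v)]
    d = det3(A)
    if d == 0:
        raise ValueError("det A is zero; choose a different top row")
    rhs = (1, 0, 0)
    sol = []
    for j in range(3):
        Aj = [row[:] for row in A]       # copy A, then replace column j by rhs
        for i in range(3):
            Aj[i][j] = rhs[i]
        sol.append(Fraction(det3(Aj), d))
    return d, tuple(sol)

u, v = (1, 2, 3), (4, 5, 6)

# For these vectors the top row (1, 1, 1) happens to give det A = 0,
# so we try other top rows, as the remark suggests.
for top in [(1, 0, 0), (0, 1, 0)]:
    d, x = cramer_solution(top, u, v)
    # Multiplying by det A removes the dependence on the top row.
    print(top, d, tuple(d * xi for xi in x))
```

In both cases the rescaled solution is \((-3, 6, -3)\text{,}\) which is what the determinant formula gives for these two vectors, even though the Cramer's rule solutions themselves differ.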