Section 13.3 Concepts
Subsection 13.3.1 Values of \(\udotprod{u}{v}\)
In Discovery 13.1, we compared the graph of the cosine function on the domain \(0\le\theta\le\pi\) with the formula
\begin{gather}
\cos\theta = \frac{\udotprod{u}{v}}{\unorm{u}\unorm{v}}\text{,}\tag{✶}
\end{gather}
where \(\theta\) is the angle between nonzero vectors \(\uvec{u}\) and \(\uvec{v}\text{.}\) On the right of equation (✶), the denominator is always positive, so whether the whole fraction is positive, negative, or zero depends entirely on the dot product in the numerator. On the left, the cosine function is positive, negative, or zero precisely when the angle \(\theta\) is acute, obtuse, or right. So we come to the following conclusions.
Figure 13.3.1. The sign of \(\udotprod{u}{v}\) versus the angle \(\theta\) between \(\uvec{u}\) and \(\uvec{v}\text{.}\)
\begin{equation*}
\begin{array}{lll}
 \amp \theta \amp \udotprod{u}{v} \\ \hline
\text{acute:} \amp 0 \le \theta \lt \pi/2 \amp \text{positive} \\
\text{right:} \amp \theta = \pi/2 \amp \text{zero} \\
\text{obtuse:} \amp \pi/2 \lt \theta \le \pi \amp \text{negative}
\end{array}
\end{equation*}
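For readers who like to experiment, here is a minimal Python sketch of this correspondence (not part of the text's development; the helper names dot and classify_angle are our own) that classifies the angle between two nonzero vectors using only the sign of their dot product, exactly as in the table above.
\begin{verbatim}
def dot(u, v):
    # Dot product of two vectors given as tuples of numbers.
    return sum(ui * vi for ui, vi in zip(u, v))

def classify_angle(u, v):
    # Classify the angle between nonzero vectors u and v as
    # "acute", "right", or "obtuse" from the sign of the dot product.
    d = dot(u, v)
    if d > 0:
        return "acute"
    if d < 0:
        return "obtuse"
    return "right"

print(classify_angle((1, 2), (3, 1)))   # dot = 5 > 0, so "acute"
print(classify_angle((1, 2), (-2, 1)))  # dot = 0, so "right"
print(classify_angle((1, 2), (-3, 1)))  # dot = -1 < 0, so "obtuse"
\end{verbatim}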
Subsection 13.3.2 Orthogonal vectors
Right angles are extremely important in geometry, and from Figure 13.3.1 we see that the dot product gives us a very convenient way to tell when the angle \(\theta\) between two nonzero vectors \(\uvec{u}\) and \(\uvec{v}\) is right: we have \(\theta = \pi/2\) precisely when \(\udotprod{u}{v} = 0\text{.}\) So in the plane or in space, \(\uvec{u}\) and \(\uvec{v}\) are perpendicular exactly when \(\udotprod{u}{v} = 0\text{.}\) Since we can’t “see” right angles and perpendicular lines in higher dimensions, in general we say that \(\uvec{u}\) and \(\uvec{v}\) are orthogonal when \(\udotprod{u}{v} = 0\text{.}\)
Subsubsection 13.3.2.1 Orthogonal vectors in \(\R^2\)
In Discovery 13.2, we tried to find a pattern to the task of choosing some vector that is orthogonal to a given one in the plane. Rather than struggle with the geometry, we unleash the power of algebra: given vector \(\uvec{u} = (a,b)\text{,}\) we are looking for a vector \(\uvec{v}\) so that \(\udotprod{u}{v} = 0\text{.}\) Expanding out the dot product, we are looking to fill in the blanks in the following equation with components for \(\uvec{v}\text{:}\)
\begin{equation*}
a\cdot\boxed{\phantom{X}} + b\cdot\boxed{\phantom{X}} = 0 \text{.}
\end{equation*}
Two numbers add to zero only if one is the negative of the other. Entering \(b\) in the first blank and \(a\) in the second makes both terms equal to the same number \(ab\text{,}\) so flipping the sign of one of those entries makes the sum cancel to zero. For example,
\begin{equation*}
a\cdot\boxed{b} + b\cdot\boxed{(-a)} = 0 \text{.}
\end{equation*}
We have now answered the question in Discovery 13.2.c.
Pattern 13.3.2. Orthogonal vectors in the plane.
Given vector \(\uvec{u} = (a,b)\) in the plane, two examples of vectors that are orthogonal to \(\uvec{u}\) are \(\uvec{v} = (b,-a)\) and \(-\uvec{v} = (-b, a)\text{,}\) and every vector that is orthogonal to \(\uvec{u}\) is some scalar multiple of this example \(\uvec{v}\text{.}\)
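As a quick numerical check of Pattern 13.3.2, here is a short Python sketch (the helper names dot and orthogonal_in_plane are our own, chosen for illustration):
\begin{verbatim}
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def orthogonal_in_plane(u):
    # Given u = (a, b), return the orthogonal vector (b, -a)
    # from Pattern 13.3.2.
    a, b = u
    return (b, -a)

u = (3, 4)
v = orthogonal_in_plane(u)
print(v, dot(u, v))  # (4, -3) 0 -- orthogonal, as expected
\end{verbatim}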
Note 13.3.3.
For patterns of orthogonal vectors in \(\R^3\text{,}\) see Subsection 13.3.6.
Subsection 13.3.3 Orthogonal projection
Orthogonal projection is a vector solution to a problem in geometry.
Question 13.3.4.
Given a line through the origin in the plane, and a point not on the line, what point on the line is closest to the given point?
In Question 13.3.4, write \(\ell\) for the line through the origin and \(Q\) for the point not on that line. Consider the point \(P\) on \(\ell\) at the foot of the perpendicular to \(\ell\) from \(Q\text{.}\) Any other point \(P'\) on \(\ell\) will form a right triangle with \(P\) and \(Q\text{,}\) making it farther from \(Q\) than \(P\) is, since the distance \(P'Q\) is the length of the hypotenuse of that right triangle.
All we know about \(P\) is that it is on line \(\ell\) and it is at the vertex of a right angle with \(\ell\) and \(Q\text{.}\) But if we introduce some vectors to help tackle this problem, then maybe we can use what we know about the dot product and right angles to help determine \(P\text{.}\)
To set up vectors for this problem, let \(\uvec{u}\) be the vector corresponding to the directed line segment \(\abray{OQ}\text{,}\) and let \(\uvec{p}\) be the vector corresponding to the directed line segment \(\abray{OP}\text{,}\) where \(P\) is our unknown closest point. Since \(\uvec{p}\) is placed with its tail at the origin, the components of \(\uvec{p}\) are precisely the coordinates of \(P\text{.}\) So determining \(\uvec{p}\) will solve the problem.
We are assuming that the line \(\ell\) is known, and it would be nice to also have a vector means of describing it. But the vectors created by the points on this line (using the origin as a universal tail point) will all be parallel to each other, so (as we discovered in Discovery 13.3.a) line \(\ell\) could be described as all scalar multiples of a particular vector \(\uvec{a}\text{.}\) This vector can be arbitrarily chosen as any vector parallel to the line. Once we have chosen \(\uvec{a}\text{,}\) we have reduced our problem from determining the two unknown components of the vector \(\uvec{p}\) to determining a single unknown scalar \(k\) so that \(\uvec{p} = k\uvec{a}\text{.}\)
As mentioned, since \(P\) is the closest point, the directed line segment \(\abray{PQ}\) must be perpendicular to \(\ell\text{.}\) Write \(\uvec{n}\) for the vector corresponding to this directed line segment. As in Discovery 13.3.b, we know that \(\udotprod{n}{a}\) must be zero — this is the perpendicularity condition. However, the vector \(\uvec{n}\) is unknown as well, since we don’t know its initial point \(P\text{.}\) But we can use the triangle formed by \(\uvec{u}\text{,}\) \(\uvec{n}\text{,}\) and \(\uvec{p}\) to replace \(\uvec{n}\text{:}\)
\begin{align*}
\uvec{p} + \uvec{n} \amp= \uvec{u} \amp \amp\implies \amp \uvec{n} \amp= \uvec{u} - \uvec{p} = \uvec{u} - k\uvec{a}
\end{align*}
Replacing \(\uvec{n}\) by this expression in the condition \(\udotprod{n}{a} = 0\) gives us an equation of numbers that we can solve for the unknown scale factor \(k\text{,}\) as we did in Discovery 13.3.d:
\begin{align*}
\dotprod{(\uvec{u} - k\uvec{a})}{\uvec{a}} \amp= 0
\amp \amp\implies \amp
\udotprod{u}{a} - k \, (\udotprod{a}{a}) \amp= 0
\amp \amp\implies \amp
k \amp= \frac{\udotprod{u}{a}}{\unorm{a}^2} \text{,}
\end{align*}
using the fact that \(\udotprod{a}{a} = \unorm{a}^2\text{.}\)
This vector \(\uvec{p} = k\uvec{a}\) pointing from the origin to the desired closest point \(P\) is called the projection of \(\uvec{u}\) onto \(\uvec{a}\) or sometimes the vector component of \(\uvec{u}\) parallel to \(\uvec{a}\), and we write \(\uproj{u}{a}\) to represent it.
Procedure 13.3.5. Closest point on a line (orthogonal projection).
Given a line \(\ell\) through the origin and point \(Q\) that does not lie on \(\ell\text{,}\) compute the point \(P\) on \(\ell\) that is closest to \(Q\) as follows.
- Choose any point \(P'\) on the line (excluding the origin), and form the parallel vector \(\uvec{a} = \abray{OP'}\text{.}\)
- Form the vector \(\uvec{u} = \abray{OQ}\text{.}\)
- Compute the projection vector\begin{equation*} \uvec{p} = \uproj{u}{a} = \frac{\udotprod{u}{a}}{\unorm{a}^2}\, \uvec{a} \text{.} \end{equation*}
This projection vector will now point from the origin to the desired closest point \(P\text{,}\) parallel to the line \(\ell\), so that \(\uvec{p} = \abray{OP}\text{.}\)
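Here is a minimal Python sketch of Procedure 13.3.5, assuming vectors are represented as tuples of numbers; the helper names dot and proj are hypothetical, not standard:
\begin{verbatim}
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def proj(u, a):
    # Orthogonal projection of u onto a:
    # proj_a(u) = (u . a / ||a||^2) a.
    k = dot(u, a) / dot(a, a)
    return tuple(k * ai for ai in a)

a = (3, 1)         # vector parallel to the line through the origin
u = (2, 4)         # u = OQ for the point Q(2, 4)
print(proj(u, a))  # (3.0, 1.0), so the closest point is P(3, 1)
\end{verbatim}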
Remark 13.3.6.
It is not actually necessary that \(Q\) be external to the line. If you were to carry out the procedure above in the case that \(Q\) lies on \(\ell\text{,}\) the calculations would end up with \(\uvec{p} = \uvec{u}\text{,}\) confirming that \(Q\) was already the point on the line that is closest to itself.
The vector \(\uvec{n}\) introduced above is sometimes called the vector component of \(\uvec{u}\) orthogonal to \(\uvec{a}\). Together, the projection vector and corresponding normal vector are called components of \(\uvec{u}\) (relative to \(\uvec{a}\)) because they represent an orthogonal decomposition of \(\uvec{u}\):
\begin{equation*}
\uvec{u} = \uvec{p} + \uvec{n} \text{,}
\end{equation*}
where \(\uvec{p}\) is parallel to \(\uvec{a}\) and \(\uvec{n}\) is orthogonal to \(\uvec{a}\text{.}\) While this decomposition is relative to \(\uvec{a}\text{,}\) it is really only the direction of \(\uvec{a}\) that matters — if \(\uvec{a}'\) is parallel to \(\uvec{a}\) (even possibly opposite to \(\uvec{a}\)), then both
\begin{align*}
\uvec{p} \amp= \uproj{u}{a} = \proj_{\uvec{a}'} \uvec{u}, \amp
\uvec{n} \amp= \uvec{u} - \uvec{p} = \uvec{u} - \uproj{u}{a} = \uvec{u} - \proj_{\uvec{a}'} \uvec{u}
\end{align*}
will be true.
Procedure 13.3.7. Shortest distance to a line.
Given a line \(\ell\) through the origin and point \(Q\) that does not lie on \(\ell\text{,}\) compute the shortest distance from \(Q\) to the line as follows.
- Compute the projection vector \(\uvec{p} = \uproj{u}{a}\) as in Procedure 13.3.5.
- Compute the normal vector \(\uvec{n} = \uvec{u} - \uvec{p}\text{.}\)
- Compute the norm \(\unorm{n}\text{.}\)
The computed norm is the distance from the closest point \(P\) to the point \(Q\text{.}\)
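Continuing the sketch above, Procedure 13.3.7 takes only a few more lines (again with our own hypothetical helper names):
\begin{verbatim}
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def distance_to_line(u, a):
    # Shortest distance from the point with position vector u to the
    # line through the origin parallel to a, via the normal vector n.
    k = dot(u, a) / dot(a, a)
    p = tuple(k * ai for ai in a)               # p = proj_a(u)
    n = tuple(ui - pi for ui, pi in zip(u, p))  # n = u - p
    return math.sqrt(dot(n, n))                 # ||n||

print(distance_to_line((2, 4), (3, 1)))  # sqrt(10), approx. 3.1623
\end{verbatim}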
Remark 13.3.8.
- These procedures and calculations can be easily modified to work for lines that do not pass through the origin: simply choose some arbitrary “initial” point \(R\) on the line to “act” as the origin.
- All of these calculations can be performed in higher dimensions as well. In higher dimensions there is no longer one unique perpendicular direction to a given vector \(\uvec{a}\text{,}\) but the calculation of \(\uvec{n}\) as above will pick out the correct direction to extend from the line to the point \(Q\) at a right angle to the line.
Subsection 13.3.4 Normal vectors of lines in the plane
Consider the line \(2x+3y=0\) that we investigated in Discovery 13.6. The point \((3,-2)\) is on this line, since
\begin{gather}
2\cdot 3 + 3\cdot (-2) = 0\text{.}\tag{✶✶}
\end{gather}
The left-hand side of this calculation looks a lot like a dot product — we could reinterpret equation (✶✶) as
\begin{equation*}
\dotprod{(2,3)}{(3,-2)} = 0 \text{.}
\end{equation*}
So verifying that the point \((3,-2)\) is on the line is equivalent to checking that the corresponding vector \(\uvec{v} = (3,-2)\) (with its tail at the origin) is orthogonal to the vector \(\uvec{n} = (2,3)\) whose components are the coefficients from our line equation.
Every other point \(\uvec{x} = (x,y)\) on the line satisfies the same relationship, as the equation for the line could be rewritten in a vector form as
\begin{gather}
\udotprod{n}{x} = 0\text{.}\tag{✶✶✶}
\end{gather}
The vector \(\uvec{n}\) is called a normal vector for the line. Note that normal vectors for a line are not unique — every nonzero scalar multiple of \(\uvec{n}\) will also be normal to the line, and this is equivalent to noting that we could multiply the equation \(2 x + 3 y = 0\) by any nonzero factor to obtain a different equation that represents the same line in the plane.
In Discovery 13.7 we considered a line defined by a nonhomogeneous equation \(2 x + 3 y = 8\text{.}\) This line has the same slope as the line defined by \(2 x + 3 y = 0\) that we investigated above, and so the vector \(\uvec{n} = (2,3)\) obtained from the coefficients on \(x\) and \(y\) in the equation must still be normal. The constant \(8\) just changes the \(y\)-intercept.
In the homogeneous case, vectors from the origin determined by a point on the line were also parallel to the line. Since things have shifted away from the origin in the nonhomogeneous case, to get a vector parallel to the line we need to consider two vectors from the origin to points on the line. Two convenient points on this line are \(Q(1,2)\) and \(R(4,0)\text{,}\) with corresponding vectors \(\uvec{x}_0 = (1,2)\) and \(\uvec{x}_1 = (4,0)\text{.}\) Then the difference vector
\begin{equation*}
\uvec{v} = \uvec{x}_1 - \uvec{x}_0 = (3,-2)
\end{equation*}
is parallel to the line. In fact, this vector \(\uvec{v}\) is the same as the vector \(\uvec{v}\) that appeared parallel to the line through the origin in the homogeneous case above, so we know it satisfies \(\udotprod{n}{v} = 0\text{.}\)
Is there a way to use the normal vector \(\uvec{n}\) to create a vector condition by which we can tell if a vector \(\uvec{x}\) represents a point on the line, as we did with equation (✶✶✶) in the homogeneous case? We need two points on the line to create a parallel difference vector, but we could compare the variable vector \(\uvec{x}\) with an arbitrarily chosen fixed vector representing a point on the line (like \(\uvec{x}_0\text{,}\) say).
Every such difference vector \(\uvec{x}-\uvec{x}_0\) is parallel to the line and hence orthogonal to the normal vector \(\uvec{n}\text{,}\) so that we can describe the line as all points where the corresponding vector \(\uvec{x}\) satisfies
\begin{gather}
\dotprod{\uvec{n}}{(\uvec{x} - \uvec{x}_0)} = 0\text{.}\tag{†}
\end{gather}
This is called the point-normal form for the line, referring to the point on the line at the terminal point of \(\uvec{x}_0\) and the normal vector \(\uvec{n}\text{.}\)
Pattern 13.3.9. Point-normal form for a line in \(\R^2\).
If \((x_0,y_0)\) is a point on the line \(\ell \colon a x + b y = d\) (that is, \(a x_0 + b y_0 = d\) is true), then \(\ell\) can alternatively be described as all points \((x,y)\) that satisfy
\begin{equation*}
\dotprod{(a,b)}{\bbrac{(x,y) - (x_0,y_0)}} = 0 \text{.}
\end{equation*}
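A short sketch testing the point-normal condition of Pattern 13.3.9 for the line \(2 x + 3 y = 8\) through the point \((1,2)\text{.}\) The helper names are our own; exact integer arithmetic is used here, while with floating-point components one would test against a small tolerance instead:
\begin{verbatim}
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def on_line(x, n, x0):
    # Point-normal test: x is on the line through x0 with normal
    # vector n exactly when n . (x - x0) = 0.
    diff = tuple(xi - x0i for xi, x0i in zip(x, x0))
    return dot(n, diff) == 0

n = (2, 3)    # coefficients of 2x + 3y = 8
x0 = (1, 2)   # known point on the line
print(on_line((4, 0), n, x0))  # True:  2*4 + 3*0 = 8
print(on_line((1, 1), n, x0))  # False: 2*1 + 3*1 = 5
\end{verbatim}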
Remark 13.3.10.
It may seem like the line parameter \(d\) has disappeared in converting from algebraic form \(a x + b y = d\) to point-normal form. But it has merely been replaced by the point \((x_0,y_0)\text{,}\) since \(d = a x_0 + b y_0\text{.}\) In fact, if we use the algebraic properties of the dot product to expand the left-hand side of the point-normal form equation, we can recover the original algebraic equation:
\begin{gather*}
\dotprod{(a,b)}{\bbrac{(x,y) - (x_0,y_0)}} = 0 \\
\dotprod{(a,b)}{(x,y)} - \dotprod{(a,b)}{(x_0,y_0)} = 0 \\
(a x + b y) - (a x_0 + b y_0) = 0 \\
(a x + b y) - d = 0 \\
a x + b y = d \text{.}
\end{gather*}
Subsection 13.3.5 Normal vectors of planes in space
A similar analysis can be made for an equation \(a x + b y + c z = d\) describing a plane in space. The coefficients form a normal vector \(\uvec{n} = (a,b,c)\text{.}\) If vectors \(\uvec{x}_0\) and \(\uvec{x}_1\) both have initial point at the origin and terminal points on the plane, then the difference vector \(\uvec{x}_1-\uvec{x}_0\) is parallel to the plane, hence orthogonal to \(\uvec{n}\text{.}\) If we keep a fixed choice of \(\uvec{x}_0\) but replace \(\uvec{x}_1\) by a variable vector \(\uvec{x}\text{,}\) we can describe the plane as all points whose difference vector with \(\uvec{x}_0\) is orthogonal to \(\uvec{n}\text{,}\) giving us a point-normal form for a plane just as in equation (†).
Pattern 13.3.11. Point-normal form for a plane in \(\R^3\).
If \((x_0,y_0,z_0)\) is a point on the plane \(\Pi \colon a x + b y + c z = d\) (that is, \(a x_0 + b y_0 + c z_0 = d\) is true), then \(\Pi\) can alternatively be described as all points \((x,y,z)\) that satisfy
\begin{equation*}
\dotprod{(a,b,c)}{\bbrac{(x,y,z) - (x_0,y_0,z_0)}} = 0 \text{.}
\end{equation*}
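The same computation also runs in reverse: from a normal vector and a single known point we can recover the algebraic equation of the plane, since \(d = \dotprod{\uvec{n}}{\uvec{x}_0}\text{.}\) A brief sketch, with made-up sample data:
\begin{verbatim}
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

n = (1, -2, 2)   # normal vector (a, b, c)
x0 = (3, 1, 1)   # known point on the plane
d = dot(n, x0)   # expanding n . (x - x0) = 0 gives ax + by + cz = d
print(f"{n[0]}x + {n[1]}y + {n[2]}z = {d}")  # 1x + -2y + 2z = 3
\end{verbatim}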
Remark 13.3.12.
A line in space does not have a point-normal form, because it does not have one unique normal “direction” like a line in the plane or a plane in space does. To describe a line in space in a similar fashion you would need two normal vectors. We will see several more convenient ways to describe a line in space in the next chapter.
Subsection 13.3.6 The cross product
Since the algebraic equation for a plane in \(\R^3\) is connected to a normal vector for the plane, a basic problem is how to quickly obtain a normal vector. If we know two vectors that are parallel to the plane in question, the problem reduces to the following.
Question 13.3.13.
Given two nonzero, nonparallel vectors in \(\R^3\text{,}\) determine a third vector that is orthogonal to each of the first two.
Diagram of the setup for the cross product problem (Question 13.3.13 above).
So if \(\uvec{u} = (u_1,u_2,u_3)\) and \(\uvec{v} = (v_1,v_2,v_3)\) are our starting vectors, we would like to simultaneously solve the equations
\begin{align*}
\udotprod{u}{x} \amp= 0, \amp \udotprod{v}{x} \amp= 0,
\end{align*}
for the unknown vector \(\uvec{x} = (x,y,z)\text{.}\) Expanding out the dot products, we get (surprise!) a system of linear equations:
\begin{equation*}
\left\{
\begin{array}{lclclcr}
u_1 x \amp + \amp u_2 y \amp + \amp u_3 z \amp = \amp 0, \\
v_1 x \amp + \amp v_2 y \amp + \amp v_3 z \amp = \amp 0.
\end{array}
\right.
\end{equation*}
Specifically, we get a homogeneous system of two equations in the three unknown coordinates \(x,y,z\text{.}\) Now, since this system is homogeneous, it is consistent. But its general solution will also require at least one parameter, since its rank is at most \(2\text{,}\) while we have three variables. In the diagram above, we can see what the “freedom” of a parameter corresponds to — we can make \(\uvec{x}\) longer or shorter, or turn it around to be opposite of the way it is pictured, and it will remain orthogonal to \(\uvec{u}\) and \(\uvec{v}\text{.}\) Our end goal is a calculation formula and procedure that will compute one particular solution to this problem, so let’s introduce a somewhat arbitrary additional equation to eliminate the need for a parameter in the solution.
\begin{equation*}
\left\{
\begin{array}{rcrcrcr}
x \amp + \amp y \amp + \amp z \amp = \amp 1, \\
u_1 x \amp + \amp u_2 y \amp + \amp u_3 z \amp = \amp 0, \\
v_1 x \amp + \amp v_2 y \amp + \amp v_3 z \amp = \amp 0.
\end{array}
\right.
\end{equation*}
In matrix form, this system can be expressed as \(A \uvec{x} = \uvec{b}\text{,}\) with
\begin{align}
A \amp= \begin{bmatrix}
1 \amp 1 \amp 1 \\
u_1 \amp u_2 \amp u_3 \\
v_1 \amp v_2 \amp v_3
\end{bmatrix},
\amp
\uvec{b} \amp= \begin{bmatrix} 1\\ 0\\ 0\end{bmatrix}.\tag{††}
\end{align}
Assuming that \(\det A \neq 0\text{,}\) Cramer’s rule tells us the solution to this system.
\begin{align*}
x \amp= \frac{1}{\det A}
\begin{vmatrix}
1 \amp 1 \amp 1 \\
0 \amp u_2 \amp u_3 \\
0 \amp v_2 \amp v_3
\end{vmatrix}
\amp
y \amp= \frac{1}{\det A}
\begin{vmatrix}
1 \amp 1 \amp 1 \\
u_1 \amp 0 \amp u_3 \\
v_1 \amp 0 \amp v_3
\end{vmatrix}
\amp
z \amp= \frac{1}{\det A}
\begin{vmatrix}
1 \amp 1 \amp 1 \\
u_1 \amp u_2 \amp 0 \\
v_1 \amp v_2 \amp 0
\end{vmatrix}\\
\amp= \frac{1}{\det A}
\begin{vmatrix}
u_2 \amp u_3 \\
v_2 \amp v_3
\end{vmatrix}
\amp
\amp= \frac{-1}{\det A}
\begin{vmatrix}
u_1 \amp u_3 \\
v_1 \amp v_3
\end{vmatrix}
\amp
\amp= \frac{1}{\det A}
\begin{vmatrix}
u_1 \amp u_2 \\
v_1 \amp v_2
\end{vmatrix}
\end{align*}
Now, each of \(x,y,z\) has a common factor of \(1/\det A\text{,}\) and all this common factor does is scale the length of our solution vector \(\uvec{x}\) without affecting orthogonality with \(\uvec{u}\) and \(\uvec{v}\text{.}\) Even worse, \(\det A\) depends on that extra equation we threw in, and we would like our solution to depend only on \(\uvec{u}\) and \(\uvec{v}\text{.}\) So let’s remove it and use the solution
\begin{equation*}
\uvec{x} =
\left(
\begin{vmatrix}
u_2 \amp u_3 \\
v_2 \amp v_3
\end{vmatrix},
\;
-\begin{vmatrix}
u_1 \amp u_3 \\
v_1 \amp v_3
\end{vmatrix},
\;
\begin{vmatrix}
u_1 \amp u_2 \\
v_1 \amp v_2
\end{vmatrix}
\right).
\end{equation*}
We call this the cross product of \(\uvec{u}\) and \(\uvec{v}\), and write \(\ucrossprod{u}{v}\) instead of \(\uvec{x}\text{.}\) There is a trick to remembering how to compute the cross product: if we replace the top row of \(A\) by the standard basis vectors \(\ivec{},\jvec{},\kvec{}\) in \(\R^3\text{,}\) then the cross product will be equal to its determinant expanded by cofactors along the first row. That is, setting
\begin{align}
\ucrossprod{u}{v} \;=\;
\begin{vmatrix}
\ivec{} \amp \jvec{} \amp \kvec{} \\
u_1 \amp u_2 \amp u_3 \\
v_1 \amp v_2 \amp v_3
\end{vmatrix}\tag{†††}
\end{align}
and expanding the determinant along the first row yields
\begin{equation*}
\ucrossprod{u}{v} \;=\;
\begin{vmatrix}
u_2 \amp u_3 \\
v_2 \amp v_3
\end{vmatrix}
\ivec{}
\; - \;
\begin{vmatrix}
u_1 \amp u_3 \\
v_1 \amp v_3
\end{vmatrix}
\jvec{}
\; + \;
\begin{vmatrix}
u_1 \amp u_2 \\
v_1 \amp v_2
\end{vmatrix}
\kvec{},
\end{equation*}
as desired. See Example 13.4.4 in Subsection 13.4.3 for an example of using formula (†††) to compute cross products.
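In code, formula (†††) amounts to computing three \(2 \times 2\) determinants. The following Python sketch (with our own helper names) computes \(\ucrossprod{u}{v}\) and verifies orthogonality and anticommutativity numerically:
\begin{verbatim}
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    # Cross product from the three 2x2 minors in formula (†††).
    u1, u2, u3 = u
    v1, v2, v3 = v
    return (u2 * v3 - u3 * v2,      # + |u2 u3; v2 v3|
            -(u1 * v3 - u3 * v1),   # - |u1 u3; v1 v3|
            u1 * v2 - u2 * v1)      # + |u1 u2; v1 v2|

u = (1, 2, 0)
v = (0, 1, 3)
x = cross(u, v)
print(x)                     # (6, -3, 1)
print(dot(u, x), dot(v, x))  # 0 0 -- orthogonal to both
print(cross(v, u))           # (-6, 3, -1), the negative of u x v
\end{verbatim}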
The cross product follows the right-hand rule — if you orient your right hand so that your fingers point in the direction of \(\uvec{u}\) and curl towards \(\uvec{v}\text{,}\) then your thumb will point in the direction of \(\ucrossprod{u}{v}\text{.}\)
Computing \(\ucrossprod{v}{u}\) instead of \(\ucrossprod{u}{v}\) should still produce a vector that is orthogonal to both \(\uvec{u}\) and \(\uvec{v}\text{,}\) but the right-hand rule tells us that the two results should be opposite to each other. From equation (†††) we can be even more specific: computing \(\ucrossprod{v}{u}\) swaps the second and third rows of the special matrix in equation (†††), and swapping two rows of a matrix negates its determinant, so
\begin{equation*}
\ucrossprod{v}{u} = -\ucrossprod{u}{v} \text{.}
\end{equation*}
See Proposition 13.5.5 in Subsection 13.5.3 for more properties of the cross product.
Remark 13.3.14.
There is one more thing to say about our development of the cross product — Cramer’s rule can only be applied if \(\det A\) is not zero, where \(A\) is the matrix in (††). However, the coefficients in the extra equation we introduced did not figure into our final solution. So if \(\det A\) ended up being zero for some particular vectors \(\uvec{u}\) and \(\uvec{v}\text{,}\) we could just change the variable coefficients in that extra equation (but keep the \(1\) in the equals column) so that \(\det A\) is not zero, and we would still come to the same formula for \(\ucrossprod{u}{v}\text{.}\) And it follows from concepts we will learn in Chapter 20 that it is always possible to fill in the top row of this matrix \(A\) so that its determinant is nonzero, as long as we start with nonparallel vectors \(\uvec{u}\) and \(\uvec{v}\text{.}\)