Section 41.3 Concepts

Subsection 41.3.1 Quadratic forms

A level set of a quadratic form in two or three variables without cross terms (i.e. consisting entirely of terms of the form \(a_i x_i^2\)) is an easily recognizable curve or surface: a circle, ellipse, or hyperbola in \(\R^2\text{,}\) or a sphere, ellipsoid, hyperboloid, etc. in \(\R^3\text{.}\) When cross terms are present, the same geometric patterns occur but are less evident, because the curve or surface is aligned along some set of axes other than the standard \(xy\)-axes or \(xyz\)-axes.

As we saw in Discovery 41.1 and Discovery 41.2, a quadratic form \(q(x_1,x_2,\dotsc,x_n)\) can be represented by a matrix \(A\) so that

\begin{equation*} q(\uvec{x}) = \utrans{\uvec{x}} A \uvec{x} \text{,} \end{equation*}

where the input point \((x_1,x_2,\dotsc,x_n)\) is instead represented by an \(n\)-dimensional column vector \(\uvec{x}\text{.}\) The entries in the matrix \(A\) correspond to the coefficients in the quadratic polynomial: expanding the product, the \((i,j)\) entry \(a_{ij}\) contributes the term \(a_{ij} x_i x_j\text{.}\) However, since \(x_j x_i = x_i x_j\text{,}\) when we collect like terms we need to add the \((i,j)\) and \((j,i)\) entries in \(A\) to get the coefficients for the cross terms:

\begin{equation*} q(\uvec{x}) = \sum_{1 \le i \le n} a_{ii} x_i^2 + \sum_{1 \le i \lt j \le n} (a_{ij} + a_{ji}) x_i x_j \text{.} \end{equation*}

Even better is to use a symmetric matrix to represent the quadratic form, so that \(a_{ji} = a_{ij}\) and

\begin{equation*} q(\uvec{x}) = \sum_{1 \le i \le n} a_{ii} x_i^2 + \sum_{1 \le i \lt j \le n} 2 a_{ij} x_i x_j \text{.} \end{equation*}

A quadratic form has the desirable property of lacking cross terms if and only if it can be represented by a diagonal matrix. But as we will see, being represented by an orthogonally diagonalizable matrix is just as good.
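To make the passage from an arbitrary representing matrix to a symmetric one concrete, here is a minimal numerical sketch using NumPy. The helper names `quadratic_form` and `symmetrize`, and the sample form \(q(x,y) = 5x^2 - 4xy + 5y^2\text{,}\) are illustrative assumptions and are not taken from the discovery activities. The sketch replaces \(A\) by \(\tfrac{1}{2}(A + \utrans{A})\text{,}\) which keeps each sum \(a_{ij} + a_{ji}\) unchanged and so should represent the same form.

```python
import numpy as np

def quadratic_form(A, x):
    """Evaluate q(x) = x^T A x for a square matrix A and a vector x."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(x @ A @ x)

def symmetrize(A):
    """Return (A + A^T)/2, a symmetric matrix with the same sums a_ij + a_ji as A."""
    A = np.asarray(A, dtype=float)
    return (A + A.T) / 2

# Hypothetical example: q(x, y) = 5x^2 - 4xy + 5y^2, written with a
# non-symmetric matrix whose off-diagonal entries add up to -4.
A = np.array([[5.0, -1.0],
              [-3.0, 5.0]])
S = symmetrize(A)            # off-diagonal entries both become -2

x = np.array([2.0, 1.0])
q_A = quadratic_form(A, x)   # value computed from the non-symmetric matrix
q_S = quadratic_form(S, x)   # value computed from the symmetric matrix

print(S)
print(q_A, q_S)
```

Both evaluations should agree with the polynomial itself at \((2,1)\text{,}\) namely \(5(4) - 4(2)(1) + 5(1) = 17\text{.}\)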

Subsection 41.3.2 Diagonalizing quadratic forms

As explored in Discovery 41.4, we can eliminate cross terms in a quadratic form

\begin{equation*} q_A(\uvec{x}) = \utrans{\uvec{x}} A \uvec{x} \end{equation*}

by using a change of variables

\begin{equation*} \uvec{x} = P \uvec{w} \end{equation*}

for an orthogonal transition matrix \(P\) that diagonalizes \(A\text{.}\) It should be obvious from the patterns of Discovery 41.1 and Discovery 41.2 that every quadratic form can be represented by a symmetric matrix \(A\text{,}\) and every symmetric matrix is orthogonally diagonalizable (Theorem 40.6.13). So every quadratic form can be “diagonalized” to eliminate cross terms.

In the new variables, the quadratic form becomes

\begin{equation*} q_A(\uvec{x}) = \utrans{\uvec{x}} A \uvec{x} = \utrans{\uvec{w}} \utrans{P} A P \uvec{w} = q_{\utrans{P} A P}(\uvec{w})\text{.} \end{equation*}

If \(\lambda_1,\lambda_2,\dotsc,\lambda_n\) is a complete list of the eigenvalues of the symmetric matrix \(A\text{,}\) with eigenvalues of multiplicity greater than \(1\) listed multiple times, and the columns of \(P\) are arranged in an order corresponding to this order of eigenvalues, then the diagonal entries of the diagonal matrix \(\utrans{P} A P\) are those eigenvalues in order. As a diagonal matrix represents a quadratic form without cross terms, we have

\begin{equation*} q_A(\uvec{x}) = q_{\utrans{P} A P}(\uvec{w}) = \lambda_1 w_1^2 + \lambda_2 w_2^2 + \dotsb + \lambda_n w_n^2\text{.} \end{equation*}
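The diagonalization above can be checked numerically. The following is a sketch only, assuming NumPy and a hypothetical sample matrix \(A\) for the form \(5x^2 - 4xy + 5y^2\text{.}\) It uses `numpy.linalg.eigh`, which is intended for symmetric matrices and returns the eigenvalues in ascending order together with a matrix whose columns are corresponding orthonormal eigenvectors, so that matrix can play the role of \(P\text{.}\)

```python
import numpy as np

# Hypothetical symmetric matrix for q(x, y) = 5x^2 - 4xy + 5y^2.
A = np.array([[5.0, -2.0],
              [-2.0, 5.0]])

# Eigenvalues in ascending order; columns of P are matching orthonormal eigenvectors.
eigenvalues, P = np.linalg.eigh(A)

# P^T A P should be diagonal, with the eigenvalues on the diagonal in the same order.
D = P.T @ A @ P

# Change of variables x = P w, so w = P^{-1} x = P^T x because P is orthogonal.
x = np.array([2.0, 1.0])
w = P.T @ x

q_x = float(x @ A @ x)                    # form evaluated in x-coordinates
q_w = float(np.sum(eigenvalues * w**2))   # lambda_1 w_1^2 + ... + lambda_n w_n^2

print(eigenvalues)
print(np.round(D, 10))
print(q_x, q_w)
```

The signs of the eigenvector columns returned by a numerical routine are not determined, but this does not affect the comparison, since only the squares \(w_i^2\) appear in the diagonalized form.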

Subsection 41.3.3 Principal axes for a quadratic form

As above, if a quadratic form \(q_A(\uvec{x})\) is represented by a symmetric matrix \(A\text{,}\) then orthogonally diagonalizing \(A\) will provide a way to change variables in the form to eliminate cross terms. The fact that the transition matrix \(P\) used to achieve this is orthogonal has geometric implications, as we know that orthogonal matrices preserve geometry as they transform \(\R^n\) (Proposition 39.5.4). So under an orthogonal change of variables \(\uvec{x} = P \uvec{w}\text{,}\) whatever geometric shape a level set \(q_A(\uvec{x}) = k\) has in \(\uvec{x}\)-coordinates, it has the same shape in \(\uvec{w}\)-coordinates, and vice versa. The shape may be rotated or reflected, but it will not be stretched, compressed, or sheared. And as the level set shapes for the “diagonalized” version in terms of \(\uvec{w}\)-variables (without cross terms) are easier to analyze, we can determine the shape in terms of \(\uvec{w}\)-variables and then use the transformation

\begin{equation*} \uvec{x} = P \uvec{w} \end{equation*}

to obtain the shape in \(\uvec{x}\)-variables.
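As a small illustration of why orthogonality matters here, the following sketch compares an orthogonal matrix with a non-orthogonal one. The rotation angle, the shear matrix, and the test vectors are arbitrary choices made for this example, assuming NumPy. The rotation should preserve dot products and lengths, while the shear, though invertible, should not.

```python
import numpy as np

theta = np.pi / 6
# Rotation by theta: an orthogonal matrix (P^T P = I).
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Horizontal shear: invertible but not orthogonal.
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])

u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])

# Dot products after transforming (u and v start out perpendicular).
dot_rotated = float((P @ u) @ (P @ v))
dot_sheared = float((S @ u) @ (S @ v))

# Lengths after transforming (v starts out with length 1).
norm_rotated = float(np.linalg.norm(P @ v))
norm_sheared = float(np.linalg.norm(S @ v))

print(dot_rotated, dot_sheared)
print(norm_rotated, norm_sheared)
```

A change of variables by a matrix like the shear could distort a level set, which is why the diagonalizing matrix is required to be orthogonal.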

Just as discussed in Section 40.1, the orthonormal basis of \(\R^n\) formed by the columns of the orthogonal transition matrix \(P\) creates a new set of axes, called a set of principal axes for the quadratic form.

Diagram of an orthogonal change of variables via transition matrix.
Figure 41.3.1. An orthogonal transition matrix as change-of-variables, creating a new set of principal axes in \(\R^2\text{.}\) For each vector \(\uvec{x}\) on the \(xy\)-axes, the corresponding vector \(\widetilde{\uvec{x}}\) on the \(wz\)-axes indicates the converted vector \(\inv{P} \uvec{x}\text{.}\)

See Subsection 41.4.2 for examples of visualizing a level set of a two-variable quadratic form first on a transformed set of axes, and then on the principal axes the transformed axes create, superimposed on the usual \(xy\)-axes.
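The procedure of this subsection can also be sketched numerically. The example below is an illustration under stated assumptions, not a replacement for the examples referenced above: it assumes NumPy, a hypothetical form \(q_A(x,y) = 5x^2 - 4xy + 5y^2\text{,}\) and the level \(k = 21\text{.}\) Since both eigenvalues of this \(A\) are positive, the level set in \(\uvec{w}\)-coordinates is an ellipse, which we parametrize and then carry over to \(\uvec{x}\)-coordinates via \(\uvec{x} = P \uvec{w}\text{.}\)

```python
import numpy as np

# Hypothetical symmetric matrix and level for the level set q_A(x) = k.
A = np.array([[5.0, -2.0],
              [-2.0, 5.0]])
k = 21.0

# Columns of P form an orthonormal set of principal axes for the form.
eigenvalues, P = np.linalg.eigh(A)

# Parametrize lambda_1 w_1^2 + lambda_2 w_2^2 = k in w-coordinates.
# This parametrization assumes both eigenvalues are positive (an ellipse).
t = np.linspace(0.0, 2.0 * np.pi, 200)
W = np.vstack([np.sqrt(k / eigenvalues[0]) * np.cos(t),
               np.sqrt(k / eigenvalues[1]) * np.sin(t)])   # each column is a point w

# Convert every point to x-coordinates with x = P w.
X = P @ W

# Evaluate q_A at each converted point: column-wise x^T A x.
q_values = np.sum(X * (A @ X), axis=0)

# Semi-axis lengths of the ellipse, measured along the principal axes.
semi_axes = np.sqrt(k / eigenvalues)

print(semi_axes)
print(q_values.min(), q_values.max())
```

Every converted point should satisfy \(q_A(\uvec{x}) = 21\) up to rounding, and each point should be the same distance from the origin in both coordinate systems, consistent with the shape being unchanged by the orthogonal change of variables.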