Section 21.4 Concepts
Subsection 21.4.1 Determining eigenvalues
To determine eigenvectors and their corresponding eigenvalues for a specific matrix \(A\text{,}\) we need to solve the matrix equation \(A\uvec{x} = \lambda\uvec{x}\) for both the unknown eigenvector \(\uvec{x}\) and the unknown eigenvalue \(\lambda\text{.}\) This is not like any matrix equation we’ve tried to solve before, since the right-hand side involves an unknown scalar times an unknown vector, making the equation nonlinear. However, as in Discovery 21.2, we can use some matrix algebra, along with the identity \(I\uvec{x} = \uvec{x}\text{,}\) to turn this equation into something more familiar:
\begin{align*}
A\uvec{x} \amp= \lambda\uvec{x} \\
\zerovec \amp= \lambda I\uvec{x} - A\uvec{x} \\
\zerovec \amp= (\lambda I - A)\uvec{x}.
\end{align*}
A particular scalar \(\lambda\) will be an eigenvalue of \(A\) if and only if the above homogeneous system has nontrivial solutions.
A homogeneous system with square coefficient matrix has nontrivial solutions precisely when that coefficient matrix is not invertible, which is the case precisely when the determinant of that coefficient matrix is equal to zero (Theorem 10.5.3). So there will exist eigenvectors of \(A\) corresponding to a particular scalar \(\lambda\) precisely when \(\lambda\) is a root of the characteristic equation \(\det(\lambda I - A) = 0\).
Procedure 21.4.1. To determine all eigenvalues of a square matrix \(A\).
Determine the roots of the characteristic equation \(\det(\lambda I - A) = 0.\)
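For example, consider the \(2 \times 2\) matrix
\begin{equation*}
A = \begin{bmatrix} 2 \amp 1 \\ 1 \amp 2 \end{bmatrix}.
\end{equation*}
Then
\begin{equation*}
\det (\lambda I - A)
= \det \begin{bmatrix} \lambda - 2 \amp -1 \\ -1 \amp \lambda - 2 \end{bmatrix}
= (\lambda - 2)^2 - 1
= (\lambda - 1)(\lambda - 3),
\end{equation*}
so the eigenvalues of \(A\) are \(\lambda = 1\) and \(\lambda = 3\text{.}\)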
Remark 21.4.2.
Because calculating \(\det(\lambda I - A)\) only involves multiplication, addition, and subtraction, its result is always a polynomial in the variable \(\lambda\text{.}\) In fact, this polynomial will always be a monic polynomial of degree \(n\) (where \(A\) is \(n \times n\)).
This is the reason we moved \(A\uvec{x}\) to the right-hand side to obtain \((\lambda I - A)\uvec{x} = \zerovec\) in our algebraic manipulations above, instead of moving \(\lambda\uvec{x}\) to the left-hand side to obtain \((A - \lambda I)\uvec{x} = \zerovec\text{.}\) Had we chosen that second option, the characteristic polynomial would have leading coefficient \((-1)^n\text{,}\) that is, \(+1\) when \(n\) is even and \(-1\) when \(n\) is odd.
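For instance, for a general \(2 \times 2\) matrix
\begin{equation*}
A = \begin{bmatrix} a \amp b \\ c \amp d \end{bmatrix}
\end{equation*}
we have
\begin{equation*}
\det (\lambda I - A) = (\lambda - a)(\lambda - d) - bc = \lambda^2 - (a + d)\lambda + (ad - bc),
\end{equation*}
a monic polynomial of degree \(2\text{.}\) (For a \(3 \times 3\) matrix, the alternative \(\det (A - \lambda I)\) would instead have leading term \(-\lambda^3\text{.}\))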
Subsection 21.4.2 Eigenvalues for special forms of matrices
In Discovery 21.4, we considered the eigenvalue procedure for diagonal and triangular matrices. Suppose \(A\) is such a matrix, with values \(d_1,d_2,\dotsc,d_n\) down its main diagonal. Then \(\lambda I - A\) is of the same special form as \(A\) (diagonal or triangular), with entries \(\lambda-d_1,\lambda-d_2,\dotsc,\lambda-d_n\) down its main diagonal. Since we know that the determinant of a diagonal or triangular matrix is equal to the product of its diagonal entries (Statement 1 of Proposition 8.5.2), the characteristic polynomial for \(A\) will be
\begin{equation*}
\det (\lambda I - A) = (\lambda-d_1)(\lambda-d_2)\dotsm(\lambda-d_n),
\end{equation*}
and so the eigenvalues of \(A\) will be precisely its diagonal entries.
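For example, the upper triangular matrix
\begin{equation*}
A = \begin{bmatrix} 3 \amp 5 \\ 0 \amp -2 \end{bmatrix}
\end{equation*}
has characteristic polynomial \(\det (\lambda I - A) = (\lambda - 3)(\lambda + 2)\text{,}\) so its eigenvalues are \(\lambda = 3\) and \(\lambda = -2\text{,}\) exactly its diagonal entries.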
Subsection 21.4.3 Determining eigenvectors
Once we know all possible eigenvalues of a square matrix \(A\text{,}\) we can substitute those values into the matrix equation \(A\uvec{x}=\lambda\uvec{x}\) one at a time. With a value for \(\lambda\) substituted in, this matrix equation is no longer nonlinear and can be solved for all corresponding eigenvectors \(\uvec{x}\text{.}\) But the homogeneous version \((\lambda I - A)\uvec{x} = \zerovec\) is more convenient to work with, since to solve this system we just need to row reduce the coefficient matrix \(\lambda I - A\text{.}\)
Procedure 21.4.3. To determine all eigenvectors of a square matrix \(A\) that correspond to a specific eigenvalue \(\lambda\).
Compute the matrix \(C = \lambda I - A\text{.}\) Then the eigenvectors corresponding to \(\lambda\) are precisely the nontrivial solutions of the homogeneous system \(C\uvec{x} = \zerovec\text{,}\) which can be solved by row reducing as usual.
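For example, for the matrix
\begin{equation*}
A = \begin{bmatrix} 2 \amp 1 \\ 1 \amp 2 \end{bmatrix}
\end{equation*}
considered earlier, with eigenvalue \(\lambda = 3\) we have
\begin{equation*}
C = 3 I - A = \begin{bmatrix} 1 \amp -1 \\ -1 \amp 1 \end{bmatrix},
\end{equation*}
which row reduces to
\begin{equation*}
\begin{bmatrix} 1 \amp -1 \\ 0 \amp 0 \end{bmatrix}.
\end{equation*}
Assigning the parameter \(x_2 = t\) gives solutions \(x_1 = t\text{,}\) \(x_2 = t\text{,}\) so the eigenvectors of \(A\) corresponding to \(\lambda = 3\) are precisely the nonzero scalar multiples of \((1,1)\text{.}\)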
Subsection 21.4.4 Eigenspaces
Determining eigenvectors is the same as solving the homogeneous system \((\lambda I - A)\uvec{x} = \zerovec\text{,}\) so the eigenvectors of \(A\) corresponding to a specific eigenvalue \(\lambda\) are precisely the nonzero vectors in the null space of \(\lambda I - A\text{.}\) In particular, since a null space is a subspace of \(\R^n\text{,}\) we see that the collection of all eigenvectors of \(A\) that correspond to a specific eigenvalue \(\lambda\) creates a subspace of \(\R^n\text{,}\) once we also include the zero vector in the collection. This subspace is called the eigenspace of \(A\) for eigenvalue \(\lambda\text{,}\) and we write \(E_\lambda(A)\) for it.
Remark 21.4.4.
Since determining eigenvectors is the same as determining a null space, the typical result of carrying out Procedure 21.4.3 for a particular eigenvalue of a matrix will be to obtain a basis for the corresponding eigenspace, by row reducing, assigning parameters, and then extracting basis vectors from the general parametric solution as usual.
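For example, for the matrix \(A\) and eigenvalue \(\lambda = 3\) considered in the previous subsection, the general parametric solution \(\uvec{x} = t(1,1)\) yields the single basis vector \((1,1)\text{,}\) so the eigenspace \(E_3(A)\) consists of all scalar multiples of \((1,1)\text{,}\) and \(\{(1,1)\}\) is a basis for this eigenspace.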
Subsection 21.4.5 Connection to invertibility
Recall that we do not call the zero vector an eigenvector of a square matrix \(A\text{,}\) because it would not correspond to one specific eigenvalue — the equality \(A\zerovec = \lambda\zerovec\) is true for all scalars \(\lambda\text{.}\) However, the scalar \(\lambda=0\) can (possibly) be an eigenvalue for a matrix \(A\text{,}\) and we explored this possibility in Discovery 21.7.
In the case of \(\lambda=0\text{,}\) the matrix equation \(A\uvec{x} = \lambda\uvec{x}\) turns into the homogeneous system \(A\uvec{x} = \zerovec\text{.}\) And for \(\lambda=0\) to actually be an eigenvalue of \(A\text{,}\) this system needs to have nontrivial solutions, which we know will occur precisely when \(A\) is not invertible (Theorem 6.5.2).
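For example, the matrix
\begin{equation*}
A = \begin{bmatrix} 1 \amp 2 \\ 2 \amp 4 \end{bmatrix}
\end{equation*}
is not invertible, since its determinant is \(1 \cdot 4 - 2 \cdot 2 = 0\text{.}\) Its characteristic polynomial is
\begin{equation*}
\det (\lambda I - A) = (\lambda - 1)(\lambda - 4) - 4 = \lambda^2 - 5\lambda = \lambda(\lambda - 5),
\end{equation*}
so \(\lambda = 0\) is indeed an eigenvalue, and the nontrivial solutions of \(A\uvec{x} = \zerovec\) (such as \((2,-1)\)) are the corresponding eigenvectors.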
Subsection 21.4.6 The geometry of eigenvectors
Multiplication of column vectors by a particular matrix can be thought of as a sort of function, i.e. an input-output process. But unlike the types of functions you are probably used to encountering, where the input is a number \(x\) and the output is a number \(y\text{,}\) this matrix-multiplication sort of function has a column vector \(\uvec{x}\) as input and a column vector \(\uvec{y}\) as output.
When the particular matrix used to form such a function is square, then the input and output vectors live in the same space (i.e. \(\R^n\text{,}\) where \(n\) is the size of the matrix), so we can think of the matrix as geometrically transforming an input vector into its corresponding output vector. See Figure 21.4.5 for an example of this geometric transformation point of view.
When the input vector \(\uvec{x}\) is an eigenvector of the transformation matrix \(A\text{,}\) then the output vector \(A\uvec{x}\) is a scalar multiple of \(\uvec{x}\) (where the scale factor is the corresponding eigenvalue). See Figure 21.4.6 for a geometric example of this view of eigenvectors.
Geometrically, one vector is a scalar multiple of another if and only if the two vectors are parallel. So we can say that a vector is an eigenvector of a matrix precisely when it is transformed to a parallel vector when multiplied by the matrix.
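For example, the matrix
\begin{equation*}
A = \begin{bmatrix} 2 \amp 1 \\ 1 \amp 2 \end{bmatrix}
\end{equation*}
from the earlier examples transforms its eigenvector \(\uvec{x} = (1,1)\) into \(A\uvec{x} = (3,3) = 3\uvec{x}\text{,}\) a parallel vector pointing in the same direction but three times as long. By contrast, the non-eigenvector \((1,0)\) is transformed into \((2,1)\text{,}\) which is not parallel to \((1,0)\text{.}\)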