
Section 6.4 The Spectral Theorem

Subsection 6.4.1 Orthogonal diagonalizability

Diagonalizable matrices are great. If we can write \(A = PDP^{-1}\text{,}\) with \(D\) a diagonal matrix, then we can learn a lot about \(A\) by studying the diagonal matrix \(D\text{,}\) which is easier. It would be even better if \(P\) could be chosen to be an orthogonal matrix, because then \(P^{-1}\) would be very easy to calculate (because of Theorem 6.3.5).
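For instance, here is a quick numerical illustration of that last point (a sketch in Python using numpy; the particular rotation matrix is our own choice, not anything special):

```python
import numpy as np

# A rotation matrix is orthogonal: its columns are orthonormal.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# For an orthogonal matrix, inverting is just transposing.
print(np.allclose(np.linalg.inv(Q), Q.T))  # True
```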

We emphasize that, except where we explicitly state otherwise, the matrices in this section have real numbers as their entries. The ideas of this section can be developed for matrices with complex entries, but the results are not exactly the same, and we will not do that here.

Definition 6.4.1.

An \(n \times n\) matrix \(A\) is called orthogonally diagonalizable if there is an orthogonal matrix \(Q\) and a diagonal matrix \(D\) such that \(A = QDQ^{t}\text{.}\)

Most matrices, even most diagonalizable matrices, are not orthogonally diagonalizable. In fact, for a matrix to have a chance of being orthogonally diagonalizable, it must be symmetric.

Theorem 6.4.2.

If \(A\) is orthogonally diagonalizable, then \(A\) is symmetric.

Proof.

Suppose that \(A = QDQ^t\text{,}\) where \(Q\) is an orthogonal matrix and \(D\) is a diagonal matrix. Then:

\begin{align*} A^t \amp = (QDQ^t)^t \\ \amp = (Q^t)^tD^tQ^t \\ \amp = QDQ^t\\ \amp = A \end{align*}

Thus \(A\) is symmetric.
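As a sanity check, here is the computation above carried out numerically (a sketch; the orthogonal matrix \(Q\) and the diagonal entries are chosen arbitrarily):

```python
import numpy as np

theta = 1.2
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # orthogonal
D = np.diag([2.0, -1.0])

A = Q @ D @ Q.T
print(np.allclose(A, A.T))  # True: A is symmetric
```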

Example 6.4.3.

Let \(A = \begin{bmatrix}1 \amp 3 \\ 2 \amp 3\end{bmatrix}\text{.}\) We can immediately see that \(A\) is not symmetric, and therefore it cannot be orthogonally diagonalizable.

This matrix \(A\) is diagonalizable, but there is no way to just "see" that fact; we really need to go through the process of Section 5.2. We won't do that here, but you are encouraged to do so as a bit of practice.

In Section 5.2 we saw that determining whether or not a matrix is diagonalizable is a non-trivial task: We had to find the eigenvalues and a basis for each eigenspace. Theorem 6.4.2 gives us a very fast way of showing that some matrices are not orthogonally diagonalizable. Very surprisingly, the converse of Theorem 6.4.2 is actually true, and so determining whether or not a matrix is orthogonally diagonalizable is actually very easy! The price is that the proof of the next theorem is very much not easy, and we will not provide it here.

Theorem 6.4.4. The Spectral Theorem.

An \(n \times n\) matrix \(A\) is orthogonally diagonalizable if and only if it is symmetric.

In order to actually orthogonally diagonalize a symmetric matrix, the following theorem (which we also will not prove) is very helpful.

Theorem 6.4.5.

If \(A\) is a symmetric matrix, then eigenvectors of \(A\) corresponding to distinct eigenvalues are orthogonal.

Example 6.4.6.

The Spectral Theorem tells us at a glance that \(A = \begin{bmatrix}1 \amp -1 \amp 1 \\ -1 \amp 1 \amp -1 \\ 1 \amp -1 \amp 1\end{bmatrix}\) is orthogonally diagonalizable. However, if we want to actually find an orthogonal matrix \(Q\) and a diagonal matrix \(D\) such that \(A = QDQ^t\) then we have to do quite a lot of work! We'll go through the process of finding \(Q\) and \(D\) for this matrix, just to illustrate how much more is involved if we need \(Q\) and \(D\) explicitly.

To start, we find the eigenvalues of \(A\text{.}\) The characteristic polynomial is \(\det(A-xI_3) = 3x^2-x^3 = x^2(3-x)\text{,}\) so the eigenvalues are \(0\text{,}\) with algebraic multiplicity \(2\text{,}\) and \(3\text{,}\) with algebraic multiplicity \(1\text{.}\) We let \(D = \begin{bmatrix}0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 3\end{bmatrix}\text{.}\)
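If you would like to double-check these eigenvalues by machine, here is one way (a sketch; note that np.poly returns the coefficients of \(\det(xI - A)\text{,}\) which differs from \(\det(A - xI)\) by a sign since \(n = 3\) is odd):

```python
import numpy as np

A = np.array([[ 1, -1,  1],
              [-1,  1, -1],
              [ 1, -1,  1]], dtype=float)

# Coefficients of det(xI - A) = x^3 - 3x^2:
print(np.poly(A))             # approx [ 1., -3., 0., 0.]

# eigvalsh is numpy's eigenvalue routine for symmetric matrices:
print(np.linalg.eigvalsh(A))  # approx [ 0., 0., 3.]
```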

Next we turn to finding eigenvectors. For the eigenvalue \(0\) we have

\begin{equation*} A-0I = A \to \begin{bmatrix}1 \amp -1 \amp 1 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0\end{bmatrix}\text{.} \end{equation*}

We then write \(\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}y-z \\ y \\ z\end{bmatrix} = y\begin{bmatrix}1\\1\\0\end{bmatrix} + z\begin{bmatrix}-1\\0\\1\end{bmatrix}\text{,}\) so for a basis of \(E_A(0)\) we can take the two vectors \(\begin{bmatrix}1\\1\\0\end{bmatrix}\) and \(\begin{bmatrix}-1\\0\\1\end{bmatrix}\text{.}\) If we were just setting out to diagonalize \(A\) we would make these two vectors the first two columns of the matrix \(P\text{;}\) however, we want \(Q\) to be an orthogonal matrix, which means that we need not just a basis for \(E_A(0)\text{,}\) but actually an orthonormal basis. We therefore need to run the Gram-Schmidt algorithm on these vectors. Doing so produces the orthogonal basis vectors \(\begin{bmatrix}1\\1\\0\end{bmatrix}, \begin{bmatrix}-1/2 \\ 1/2 \\ 1\end{bmatrix}\text{.}\) We then divide each vector by its length to get the orthonormal basis \(\left\{\begin{bmatrix}1/\sqrt{2} \\ 1/\sqrt{2} \\ 0\end{bmatrix}, \begin{bmatrix}-1/\sqrt{6} \\ 1/\sqrt{6} \\ 2/\sqrt{6}\end{bmatrix}\right\}\text{.}\)
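The Gram-Schmidt computation on this basis can also be checked in code (a minimal sketch, applying the projection formula directly):

```python
import numpy as np

v1 = np.array([ 1.0, 1.0, 0.0])
v2 = np.array([-1.0, 0.0, 1.0])

# Gram-Schmidt: subtract from v2 its projection onto v1.
u1 = v1
u2 = v2 - (v2 @ u1) / (u1 @ u1) * u1
print(u2)                      # [-0.5  0.5  1. ]

# Normalize to get the orthonormal basis of E_A(0).
q1 = u1 / np.linalg.norm(u1)   # [1/sqrt(2), 1/sqrt(2), 0]
q2 = u2 / np.linalg.norm(u2)   # [-1/sqrt(6), 1/sqrt(6), 2/sqrt(6)]
```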

Now we go to the eigenvalue \(3\text{.}\) Following the same kind of process as above, we find that a basis for \(E_A(3)\) is the single vector \(\begin{bmatrix}1\\-1\\1\end{bmatrix}\text{,}\) which we normalize to get the orthonormal basis \(\left\{\begin{bmatrix}1/\sqrt{3} \\ -1/\sqrt{3} \\ 1/\sqrt{3}\end{bmatrix}\right\}\text{.}\)

Finally, we put the pieces together. By Theorem 6.4.5 we don't need to worry about the different eigenspaces interacting, so we can just put the orthonormal bases for the eigenspaces side-by-side to form the orthogonal matrix \(Q = \begin{bmatrix}1/\sqrt{2} \amp -1/\sqrt{6} \amp 1/\sqrt{3} \\ 1/\sqrt{2} \amp 1/\sqrt{6} \amp -1/\sqrt{3} \\ 0 \amp 2/\sqrt{6} \amp 1/\sqrt{3}\end{bmatrix}\text{.}\) With this matrix \(Q\) and the diagonal \(D\) found above, we have \(A = QDQ^t\) (as you could verify directly, if you wish).
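If you do want to verify the factorization by machine, here is a sketch that checks both that \(Q\) is orthogonal and that \(A = QDQ^t\text{:}\)

```python
import numpy as np

s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
Q = np.array([[1/s2, -1/s6,  1/s3],
              [1/s2,  1/s6, -1/s3],
              [0,     2/s6,  1/s3]])
D = np.diag([0.0, 0.0, 3.0])
A = np.array([[ 1, -1,  1],
              [-1,  1, -1],
              [ 1, -1,  1]], dtype=float)

print(np.allclose(Q.T @ Q, np.eye(3)))  # True: Q is orthogonal
print(np.allclose(Q @ D @ Q.T, A))      # True: A = Q D Q^t
```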

Subsection 6.4.2 An application

You might recall that we proved that if \(A\) is any matrix of real numbers then \(A^tA\) is a symmetric matrix (Theorem 4.3.24). At that time we promised to prove a partial converse, namely that every symmetric matrix of real numbers has the form \(A^tA\) for some matrix \(A\text{,}\) where \(A\) might have complex entries. With the Spectral Theorem in hand we can now give the proof.

Corollary 6.4.7.

If \(B\) is a symmetric matrix with real entries, then there is a matrix \(A\text{,}\) possibly with complex entries, such that \(B = A^tA\text{.}\)

Proof.

By the Spectral Theorem we can orthogonally diagonalize \(B\text{,}\) say \(B = Q^tDQ\) where \(Q\) is an orthogonal matrix and \(D\) is a diagonal matrix. Let \(\sqrt{D}\) be the diagonal matrix whose diagonal entries are the square roots of the diagonal entries of \(D\text{;}\) notice that even though the entries of \(D\) are real numbers they could be negative, and hence the entries of \(\sqrt{D}\) may be complex. Also notice that \(\sqrt{D}^t = \sqrt{D}\) and \((\sqrt{D})^2 = D\text{.}\)

Let \(A = \sqrt{D}Q\text{.}\) Again, the entries of \(Q\) are real numbers, but the entries of \(\sqrt{D}\) may not be, so \(A\) may have complex entries. Then:

\begin{equation*} A^tA = (\sqrt{D}Q)^t(\sqrt{D}Q) = Q^t\sqrt{D}^t\sqrt{D}Q = Q^t(\sqrt{D})^2Q = Q^tDQ = B\text{.} \end{equation*}
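Here is the construction carried out numerically (a sketch; the symmetric matrix \(B\) below is our own example, chosen to have a negative eigenvalue so that \(\sqrt{D}\) is genuinely complex; numpy's eigh returns \(B = QDQ^t\) with the eigenvectors as columns of \(Q\text{,}\) so we take \(A = \sqrt{D}Q^t\) to match):

```python
import numpy as np

B = np.array([[0.0, 2.0],
              [2.0, 0.0]])    # symmetric, eigenvalues 2 and -2

w, Q = np.linalg.eigh(B)      # B = Q @ np.diag(w) @ Q.T, Q orthogonal
sqrtD = np.diag(np.sqrt(w.astype(complex)))  # complex where w < 0

A = sqrtD @ Q.T               # A has complex entries
print(np.allclose(A.T @ A, B))  # True (plain transpose, not conjugate)
```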

The same kind of idea used in the proof of Corollary 6.4.7 allows us to apply any function \(f : \mathbb{R} \to \mathbb{R}\) to a symmetric matrix: Write \(A = QDQ^t\text{,}\) define \(f(D)\) to be the diagonal matrix where we apply \(f\) to each diagonal entry of \(D\text{,}\) and then define \(f(A) = Qf(D)Q^t\text{.}\) This idea is the beginning of an area known as functional calculus, which plays an important role in applications of linear algebra, especially to quantum physics.
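As a concrete sketch of this idea (the helper function below is our own, built on numpy's eigh):

```python
import numpy as np

def apply_to_symmetric(f, A):
    """Return f(A) = Q f(D) Q^t for a symmetric matrix A."""
    w, Q = np.linalg.eigh(A)          # A = Q @ np.diag(w) @ Q.T
    return Q @ np.diag(f(w)) @ Q.T

A = np.array([[ 1, -1,  1],
              [-1,  1, -1],
              [ 1, -1,  1]], dtype=float)

# The "matrix exponential" of A exponentiates the eigenvalues 0, 0, 3,
# and it satisfies e^A e^A = e^{2A}:
expA = apply_to_symmetric(np.exp, A)
print(np.allclose(expA @ expA,
                  apply_to_symmetric(lambda x: np.exp(2 * x), A)))  # True
```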

Exercises 6.4.3 Exercises

1.

Which of the following matrices are orthogonally diagonalizable?
  (a) \(\begin{bmatrix}1 \amp 3 \amp 2 \\ 3 \amp -1 \amp 0 \\ 0 \amp 1 \amp 0\end{bmatrix}\text{.}\)

  (b) \(\begin{bmatrix}3 \amp -2 \amp 5 \\ -2 \amp \pi \amp 0 \\ 5 \amp 0 \amp -3\end{bmatrix}\text{.}\)

  (c) \(\begin{bmatrix}1 \amp 1 \amp 1 \amp 1 \amp 1 \\ 1 \amp 2 \amp 3 \amp -4 \amp 1 \\ 1 \amp 3 \amp 12 \amp 0 \amp 7 \\ 1 \amp -4 \amp 0 \amp 1 \amp 0 \\ 1 \amp 1 \amp 7 \amp 1 \amp 1\end{bmatrix}\text{.}\)

Hint.

Theorem 6.4.4 makes this something you can check without any calculations.

Answer.

Only (b).

Solution.

By Theorem 6.4.4 a matrix is orthogonally diagonalizable if and only if it is symmetric. The first and third matrices are not symmetric, but the second one is, so only the second one is orthogonally diagonalizable.

2.

Find an orthogonal matrix \(Q\) and a diagonal matrix \(D\) such that \(\begin{bmatrix}0 \amp -1 \amp 1 \\ -1 \amp 0 \amp 1 \\ 1 \amp 1 \amp 0\end{bmatrix} = QDQ^t\text{.}\)
Hint.

Start by diagonalizing as usual, but then use Gram-Schmidt to convert the basis for each eigenspace into an orthonormal basis.

3.

Suppose that \(A\) and \(B\) are orthogonally diagonalizable, and that \(AB=BA\text{.}\) Prove that \(AB\) is orthogonally diagonalizable.
Hint.

Orthogonal diagonalizability is a hard property to work with, but Theorem 6.4.4 can help.

Solution.

By Theorem 6.4.4 both \(A\) and \(B\) are symmetric. Now we calculate:

\begin{equation*} (AB)^t = B^tA^t = BA = AB\text{.} \end{equation*}

Thus \(AB\) is symmetric, so by Theorem 6.4.4 \(AB\) is orthogonally diagonalizable.

4.

Let \(A = \begin{bmatrix}1 \amp 2 \\ 2 \amp 1\end{bmatrix}\) and \(B = \begin{bmatrix}1 \amp -1 \\ -1 \amp 0\end{bmatrix}\text{.}\) Show that \(A\) and \(B\) are orthogonally diagonalizable, but \(AB\) is not.
Hint.

Once again, Theorem 6.4.4 lets us convert this problem into a much easier one.

Solution.

Both \(A\) and \(B\) are symmetric, so by Theorem 6.4.4 they are orthogonally diagonalizable. We calculate

\begin{equation*} AB = \begin{bmatrix}-1 \amp -1 \\ 1 \amp -2\end{bmatrix}\text{.} \end{equation*}

We see that \(AB\) is not symmetric, so it is not orthogonally diagonalizable (by Theorem 6.4.4).