
Section 5.2 Diagonalization

Subsection 5.2.1 Similarity

Definition 5.2.1.

Let \(A\) and \(B\) be \(n \times n\) matrices. We say that \(A\) is similar to \(B\) if there is an invertible \(n \times n\) matrix \(P\) such that \(A = PBP^{-1}\text{.}\) In this case we write \(A \sim B\) to indicate that \(A\) is similar to \(B\text{.}\)

Note 5.2.2.

Caution! The word "similar" has an ordinary English meaning, and now it also has a technical meaning. When we say \(A\) is similar to \(B\) we always mean the precise statement defined above, not the ordinary English meaning.

Here are some basic facts about the relation \(\sim\text{.}\)

Theorem 5.2.3.

Let \(A\text{,}\) \(B\text{,}\) and \(C\) be \(n \times n\) matrices.

  1. \(A \sim A\text{.}\)

  2. If \(A \sim B\text{,}\) then \(B \sim A\text{.}\)

  3. If \(A \sim B\) and \(B \sim C\text{,}\) then \(A \sim C\text{.}\)

We will only prove the second statement, leaving the others as exercises. Suppose that \(A \sim B\text{.}\) Then there is an invertible matrix \(P\) such that \(A = PBP^{-1}\text{.}\) Multiplying on the left by \(P^{-1}\) and on the right by \(P\) we get \(P^{-1}AP = B\text{.}\) Now let \(Q = P^{-1}\text{.}\) Then \(Q\) is an invertible matrix and \(B = QAQ^{-1}\text{,}\) so \(B \sim A\text{.}\)

In light of point (2) above, we sometimes say that \(A\) and \(B\) are similar rather than saying \(A\) is similar to \(B\text{,}\) since the order doesn't actually matter. As you would expect from the name, similar matrices share many properties. Here are a few examples of those properties.

Theorem 5.2.4.

If \(A \sim B\) then \(\chi_A(x) = \chi_B(x)\text{.}\) In particular, similar matrices have the same eigenvalues, with the same algebraic multiplicities.

Suppose that \(A = PBP^{-1}\text{.}\) Notice that for any scalar \(x\) we have

\begin{equation*} P(xI_n)P^{-1} = x(PI_nP^{-1}) = xPP^{-1} = xI_n\text{.} \end{equation*}

This, together with properties of the determinant that we saw in Section 4.5, allows us to calculate as follows:

\begin{align*} \chi_A(x) \amp = \det(A - xI_n)\\ \amp = \det(PBP^{-1} - xI_n)\\ \amp = \det(PBP^{-1} - P(xI_n)P^{-1})\\ \amp = \det(P(B - xI_n)P^{-1})\\ \amp = \det(P)\det(B-xI_n)\det(P^{-1})\\ \amp = \det(P)\det(B-xI_n)\frac{1}{\det(P)}\\ \amp = \det(B-xI_n) \\ \amp = \chi_B(x) \end{align*}

Corollary 5.2.5.

If \(A \sim B\text{,}\) then \(\det(A) = \det(B)\text{.}\)

Using Theorem 5.2.4 applied when \(x=0\) we get:

\begin{equation*} \det(A) = \det(A - 0I_n) = \chi_A(0) = \chi_B(0) = \det(B - 0I_n) = \det(B)\text{.} \end{equation*}
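If you like, you can also see Theorem 5.2.4 (and the determinant corollary above) in action on a computer. Here is a short sketch in Python with NumPy; the particular matrices \(B\) and \(P\) are our own arbitrary choices for the illustration, not part of the text.

    import numpy as np

    B = np.array([[2.0, 1.0], [0.0, -3.0]])
    P = np.array([[1.0, 2.0], [1.0, 3.0]])   # invertible: det(P) = 1
    A = P @ B @ np.linalg.inv(P)             # A ~ B by construction

    # np.poly(M) returns the coefficients of det(xI - M), which agrees with
    # our chi_M(x) = det(M - xI) up to an overall sign of (-1)^n.
    print(np.poly(A))  # ~ [ 1.  1. -6.], i.e. x^2 + x - 6 (up to rounding)
    print(np.poly(B))  # the same coefficients, as Theorem 5.2.4 predicts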

We list some other properties shared by similar matrices, without proof.

Theorem 5.2.6.

Suppose that \(A \sim B\text{.}\) Then \(\rank(A) = \rank(B)\text{,}\) and for every eigenvalue \(\lambda\) we have \(\dim(E_A(\lambda)) = \dim(E_B(\lambda))\text{.}\)

Note 5.2.7.

Warning! When \(A\) and \(B\) are similar we have \(\rank(A) = \rank(B)\text{,}\) which means that \(\RREF(A)\) and \(\RREF(B)\) have the same number of non-zero rows. It does not mean that \(\RREF(A) = \RREF(B)\text{,}\) and indeed often that isn't the case. Similarly (pardon the pun), although for each eigenvalue \(\lambda\) we have \(\dim(E_A(\lambda)) = \dim(E_B(\lambda))\text{,}\) it is often not true that \(E_A(\lambda) = E_B(\lambda)\text{.}\)

Subsection 5.2.2 Diagonalization

As we have seen, if \(A \sim B\) then \(A\) and \(B\) have many properties in common. The easiest matrices to study are the diagonal matrices, so given a matrix \(A\) we would like to know if there is a diagonal matrix \(D\) such that \(A \sim D\text{.}\) In this section we will find out when that happens, and see some examples of how it can help us.

Definition 5.2.8.

Let \(A\) be a square matrix. We say that \(A\) is diagonalizable if there is a diagonal matrix \(D\) such that \(A \sim D\text{.}\)

Example 5.2.9.

Let \(A = \begin{bmatrix}-3 \amp 1 \\ 0 \amp -3\end{bmatrix}\text{.}\) We will show that \(A\) is not diagonalizable. We could try to do this by hand: Let \(P = \begin{bmatrix}a \amp b \\ c \amp d\end{bmatrix}\) and \(D = \begin{bmatrix}x \amp 0 \\ 0 \amp y\end{bmatrix}\text{,}\) and then show directly that the equation \(A = PDP^{-1}\) cannot be solved. While this is possible, it is rather unpleasant.

A much easier way is to calculate that the only eigenvalue of \(A\) is \(-3\text{,}\) which occurs with algebraic multiplicity \(2\text{.}\) By Theorem 5.2.4 if \(A \sim D\) then \(D\) must also have only \(-3\) as an eigenvalue, and with algebraic multiplicity \(2\text{.}\) The only such diagonal matrix is \(D = \begin{bmatrix}-3 \amp 0 \\ 0 \amp -3\end{bmatrix}\text{,}\) so if \(A\) is diagonalizable then we must have \(A = P\begin{bmatrix}-3 \amp 0 \\ 0 \amp -3\end{bmatrix}P^{-1}\text{.}\) Then a little calculation gives:

\begin{equation*} A = P\begin{bmatrix}-3 \amp 0 \\ 0 \amp -3\end{bmatrix}P^{-1} = P(-3I)P^{-1} = (-3)PIP^{-1} = (-3)PP^{-1} = -3I = \begin{bmatrix}-3 \amp 0 \\ 0 \amp -3\end{bmatrix}\text{,} \end{equation*}

but in fact that is not the matrix \(A\) we started with, so this shows that \(A\) is not diagonalizable.
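As a quick computational cross-check (a sketch in NumPy, not part of the argument above), we can confirm that the eigenspace \(E_A(-3)\) is only one-dimensional, so there is no hope of finding two linearly independent eigenvectors of \(A\text{:}\)

    import numpy as np

    A = np.array([[-3.0, 1.0], [0.0, -3.0]])
    # geo_A(-3) is the nullity of A - (-3)I = A + 3I.
    print(2 - np.linalg.matrix_rank(A + 3 * np.identity(2)))  # prints 1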

Example 5.2.10.

Let \(A = \begin{bmatrix}2 \amp 0 \\ -5 \amp -3\end{bmatrix}\text{,}\) \(P = \begin{bmatrix}1 \amp 0 \\ -1 \amp 1\end{bmatrix}\text{,}\) and \(D = \begin{bmatrix}2 \amp 0 \\ 0 \amp -3\end{bmatrix}\text{.}\) Then \(A = PDP^{-1}\text{,}\) so \(A\) is diagonalizable. By the end of this section we will see how to find the matrices \(P\) and \(D\) starting from \(A\text{.}\)
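You can verify the equation \(A = PDP^{-1}\) by hand, or with a short numerical check like the following sketch:

    import numpy as np

    A = np.array([[2.0, 0.0], [-5.0, -3.0]])
    P = np.array([[1.0, 0.0], [-1.0, 1.0]])
    D = np.diag([2.0, -3.0])

    # Compare P D P^{-1} to A entrywise, up to floating-point error.
    print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True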

The method that we used in Example 5.2.9 can be adapted to find the diagonal matrices that could be similar to a given matrix.

Theorem 5.2.11.

Let \(A\) be an \(n \times n\) matrix with eigenvalues \(\lambda_1, \ldots, \lambda_n\text{,}\) listed according to their algebraic multiplicities. If \(A \sim D\) for some diagonal matrix \(D\text{,}\) then the diagonal entries of \(D\) are \(\lambda_1, \ldots, \lambda_n\text{,}\) in some order.

If \(D\) has diagonal entries \(d_1, \ldots, d_n\) then by Theorem 5.1.9 the eigenvalues of \(D\text{,}\) listed according to their algebraic multiplicities, are \(d_1, \ldots, d_n\text{.}\) If \(A\sim D\) then by Theorem 5.2.4 the eigenvalues of \(D\text{,}\) listed according to their algebraic multiplicities, are \(\lambda_1, \ldots, \lambda_n\text{.}\) So the lists \(d_1, \ldots, d_n\) and \(\lambda_1, \ldots, \lambda_n\) must be the same except for possibly the order in which they are written.

The theorem above tells us which diagonal matrices \(A\) might be similar to, but (as seen in Example 5.2.9) it does not guarantee that \(A\) is actually similar to a diagonal matrix. The missing ingredient is the matrix \(P\) from the definition of similarity. An example will help to explain where \(P\) comes from, and then we will give the general theorem.

Example 5.2.12.

Consider again the matrices \(A = \begin{bmatrix}2 \amp 0 \\ -5 \amp -3\end{bmatrix}\text{,}\) \(P = \begin{bmatrix}1 \amp 0 \\ -1 \amp 1\end{bmatrix}\text{,}\) and \(D = \begin{bmatrix}2 \amp 0 \\ 0 \amp -3\end{bmatrix}\) from Example 5.2.10. In that example we had \(A = PDP^{-1}\text{,}\) which is equivalent to \(AP = PD\text{.}\)

Using the definition of matrix multiplication, the first column of \(AP\) is

\begin{equation*} \begin{bmatrix}2 \amp 0 \\ -5 \amp -3\end{bmatrix}\begin{bmatrix}1 \\ -1\end{bmatrix} = \begin{bmatrix}2 \\ -2\end{bmatrix} = 2\begin{bmatrix}1\\-1\end{bmatrix} \end{equation*}

and the second column is

\begin{equation*} \begin{bmatrix}2 \amp 0 \\ -5 \amp -3 \end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}0\\-3\end{bmatrix} = -3\begin{bmatrix}0\\1\end{bmatrix}\text{.} \end{equation*}

We see that the columns of \(P\) are eigenvectors of \(A\text{,}\) and even more, that the eigenvalue for the first column of \(P\) is the first diagonal entry of \(D\) and the eigenvalue for the second column of \(P\) is the second diagonal entry of \(D\text{.}\)

Theorem 5.2.13.

Suppose that \(A = PDP^{-1}\text{,}\) where \(D\) is diagonal. Then for each \(j\) the \(j\)th column of \(P\) is an eigenvector of \(A\text{,}\) whose eigenvalue is the \(j\)th diagonal entry of \(D\text{.}\)

Let \(P\) have columns \(\vec{P_1}, \ldots, \vec{P_n}\text{,}\) and suppose that \(D = \begin{bmatrix}\lambda_1 \amp 0 \amp \cdots \amp 0 \\ 0 \amp \lambda_2 \amp \cdots \amp 0 \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \lambda_n\end{bmatrix}\text{.}\) Rearrange the equation \(A = PDP^{-1}\) to \(AP = PD\text{.}\) Now using the definition of matrix multiplication, for any \(j\) the \(j\)th column on the left is \(A\vec{P_j}\) and the \(j\)th column on the right is \(P\begin{bmatrix}0 \\ \vdots \\ 0 \\ \lambda_j \\ 0 \\ \vdots \\ 0\end{bmatrix} = \lambda_j\vec{P_j}\text{.}\) We therefore have \(A\vec{P_j} = \lambda_j\vec{P_j}\text{,}\) as required.

Now we have all the information we need to find \(P\) and \(D\text{,}\) assuming that our matrix \(A\) is diagonalizable. But we know that not every matrix is diagonalizable (Example 5.2.9), so what went wrong?

As long as we allow our eigenvalues to come from \(\mathbb{C}\text{,}\) there is no problem constructing \(D\text{,}\) because Fact 5.1.7 guarantees that an \(n \times n\) matrix has exactly \(n\) eigenvalues (counted according to algebraic multiplicity). So the failure must be because of something going wrong in building \(P\text{.}\)

If we are to have a chance of writing \(A = PDP^{-1}\) we need the columns of \(P\) to be eigenvectors of \(A\text{,}\) but we also need \(P\) to be invertible. Using the Fundamental Theorem we can restate "\(P\) is invertible" as "\(P\) has linearly independent columns". So the problem we could encounter is that perhaps there are not enough linearly independent eigenvectors to form an invertible matrix \(P\text{.}\) In fact, that is the only possible obstacle; while we haven't formally proved it (and nor will we), hopefully you now find the following theorem believable.

Theorem 5.2.14.

Let \(A\) be an \(n \times n\) matrix. Then \(A\) is diagonalizable if and only if there is a linearly independent collection of \(n\) eigenvectors of \(A\text{.}\)

We already know from Theorem 5.1.19 that eigenvectors from different eigenvalues are linearly independent. With some more work (which we won't do here) one can prove that this implies that our search for \(n\) independent eigenvectors is really a search for \(\alg_A(\lambda)\) many independent eigenvectors for each eigenvalue \(\lambda\text{.}\) We know that the maximum number of linearly independent \(\lambda\)-eigenvectors of \(A\) is \(\geo_A(\lambda) = \dim(E_A(\lambda))\text{,}\) so we finally come to the main result about diagonalizability.

Theorem 5.2.15.

Let \(A\) be an \(n \times n\) matrix. Then \(A\) is diagonalizable if and only if \(\geo_A(\lambda) = \alg_A(\lambda)\) for every eigenvalue \(\lambda\) of \(A\text{.}\)

Example 5.2.16.

Let \(A = \begin{bmatrix}0 \amp 1 \amp 3 \\ 2 \amp -1 \amp 3 \\ -2 \amp 5 \amp 1\end{bmatrix}\text{.}\) Determine whether or not \(A\) is diagonalizable, and if it is, write it in the form \(A = PDP^{-1}\) with \(D\) a diagonal matrix.

Solution.

We start by finding the eigenvalues of \(A\text{.}\)

\begin{equation*} \chi_A(x) = \det(A - xI_3) = \det\begin{bmatrix}-x \amp 1 \amp 3 \\ 2 \amp -1-x \amp 3 \\ -2 \amp 5 \amp 1-x\end{bmatrix} = -x^3 +12x +16 = -(x-4)(x+2)^2\text{.} \end{equation*}

Now we know that the eigenvalues are \(4\text{,}\) with algebraic multiplicity \(1\text{,}\) and \(-2\text{,}\) with algebraic multiplicity \(2\text{.}\)

Next we need to find out the geometric multiplicities of the eigenvalues, to see if we have enough linearly independent eigenvectors. Let's start with the eigenvalue \(-2\text{.}\) To find a basis for the eigenspace, we row-reduce:

\begin{equation*} A-(-2)I_3 = \begin{bmatrix}2 \amp 1 \amp 3 \\ 2 \amp 1 \amp 3 \\ -2 \amp 5 \amp 3\end{bmatrix} \to \begin{bmatrix}1 \amp 0 \amp 1 \\ 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 0\end{bmatrix}\text{.} \end{equation*}

From here we see that vectors in \(E_A(-2)\) have the form \(\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}-z\\-z\\z\end{bmatrix} = z\begin{bmatrix}-1\\-1\\1\end{bmatrix}\text{.}\) Therefore \(\left\{\begin{bmatrix}-1\\-1\\1\end{bmatrix}\right\}\) is a basis for \(E_A(-2)\text{,}\) and so \(\geo_A(-2) = 1\text{.}\) We thus have \(\geo_A(-2) = 1 \lt 2 = \alg_A(-2)\text{,}\) so by Theorem 5.2.15 the matrix \(A\) is not diagonalizable.

Notice that since we already found one eigenvalue with \(\alg_A(\lambda) \neq \geo_A(\lambda)\) there is no reason for us to consider any other eigenvalues.
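If you want to double-check the geometric multiplicity computationally, here is a brief NumPy sketch, using the fact that \(\geo_A(-2)\) is the nullity of \(A + 2I_3\text{:}\)

    import numpy as np

    A = np.array([[0.0, 1.0, 3.0],
                  [2.0, -1.0, 3.0],
                  [-2.0, 5.0, 1.0]])
    # geo_A(-2) = 3 - rank(A + 2I); here it is 1, while alg_A(-2) = 2.
    print(3 - np.linalg.matrix_rank(A + 2 * np.identity(3)))  # prints 1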

Example 5.2.17.

Let \(A = \begin{bmatrix}1 \amp 2 \amp 0 \\ -1 \amp 4 \amp 0 \\ 1 \amp -1 \amp 3\end{bmatrix}\text{.}\) Determine whether or not \(A\) is diagonalizable, and if it is, write it in the form \(A = PDP^{-1}\) with \(D\) a diagonal matrix.

Solution.

The first step is again to find the eigenvalues of \(A\text{.}\)

\begin{equation*} \chi_A(x) = \det(A-xI_3) = \det\begin{bmatrix}1-x \amp 2 \amp 0 \\ -1 \amp 4-x \amp 0 \\ 1 \amp -1 \amp 3-x\end{bmatrix} = -x^3+8x^2-21x+18 = -(x-2)(x-3)^2\text{.} \end{equation*}

The eigenvalues are \(2\text{,}\) with algebraic multiplicity \(1\text{,}\) and \(3\text{,}\) with algebraic multiplicity \(2\text{.}\)

We next find a basis for \(E_A(3)\text{,}\) by row-reducing:

\begin{equation*} A-3I_3 = \begin{bmatrix}-2 \amp 2 \amp 0 \\-1 \amp 1 \amp 0 \\ 1 \amp -1 \amp 0\end{bmatrix} \to \begin{bmatrix}1 \amp -1 \amp 0 \\ 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0\end{bmatrix}\text{.} \end{equation*}

We see that vectors in \(E_A(3)\) look like \(\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}y\\y\\z\end{bmatrix} = y\begin{bmatrix}1\\1\\0\end{bmatrix} + z\begin{bmatrix}0\\0\\1\end{bmatrix}\text{.}\) Therefore \(\left\{\begin{bmatrix}1\\1\\0\end{bmatrix}, \begin{bmatrix}0\\0\\1\end{bmatrix}\right\}\) is a basis for \(E_A(3)\text{.}\) In particular, \(\geo_A(3) = 2\text{.}\) This matches with \(\alg_A(3) = 2\text{,}\) so we move on to the next eigenvalue.

We look for a basis for \(E_A(2)\text{:}\)

\begin{equation*} A-2I_3 = \begin{bmatrix}-1 \amp 2 \amp 0 \\ -1 \amp 2 \amp 0 \\ 1 \amp -1 \amp 1\end{bmatrix} \to \begin{bmatrix}1 \amp 0 \amp 2 \\ 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 0\end{bmatrix}\text{.} \end{equation*}

Therefore vectors in \(E_A(2)\) have the form \(\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}-2z \\ -z \\ z\end{bmatrix} = z\begin{bmatrix}-2\\-1\\1\end{bmatrix}\text{.}\) This shows that a basis for \(E_A(2)\) is \(\left\{\begin{bmatrix}-2\\-1\\1\end{bmatrix}\right\}\text{.}\) It also shows that \(\geo_A(2) = 1\text{,}\) which matches with \(\alg_A(2) = 1\text{.}\)

Since \(\geo_A(\lambda) = \alg_A(\lambda)\) for every eigenvalue \(\lambda\text{,}\) it follows from Theorem 5.2.15 that \(A\) is diagonalizable.

To find the matrices \(P\) and \(D\text{,}\) we start by choosing an order in which to write the eigenvalues on the diagonal of \(D\text{.}\) Any order is fine; we choose \(D = \begin{bmatrix}3 \amp 0 \amp 0 \\ 0 \amp 2 \amp 0 \\ 0 \amp 0 \amp 3\end{bmatrix}\text{.}\) Next we take the columns of \(P\) to be eigenvectors, listed in the order corresponding to the diagonal entries of \(D\text{.}\) Using the bases found above, we choose \(P = \begin{bmatrix}1 \amp -2 \amp 0\\ 1 \amp -1 \amp 0\\ 0 \amp 1 \amp 1\end{bmatrix}\text{.}\) With these choices we have \(A = PDP^{-1}\text{;}\) that this equation is correct follows from Theorem 5.2.15, but you could check it by hand if you wish.
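Instead of checking by hand, here is a short numerical verification (an illustrative sketch):

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [-1.0, 4.0, 0.0],
                  [1.0, -1.0, 3.0]])
    P = np.array([[1.0, -2.0, 0.0],
                  [1.0, -1.0, 0.0],
                  [0.0, 1.0, 1.0]])
    D = np.diag([3.0, 2.0, 3.0])

    print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True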

As the examples show, most of the time we need to do quite a bit of work to show that a matrix is diagonalizable, even with all the tools we now have at hand. There is one situation where it is easy to detect diagonalizability.

Theorem 5.2.18.

Let \(A\) be an \(n \times n\) matrix. If \(A\) has \(n\) distinct eigenvalues, then \(A\) is diagonalizable.

By Fact 5.1.7 the algebraic multiplicities of the eigenvalues add up to \(n\text{,}\) so if there are \(n\) distinct eigenvalues then they must all have algebraic multiplicity \(1\text{.}\) For each eigenvalue \(\lambda\) we therefore have, by Theorem 5.1.16, \(1 \leq \geo_A(\lambda) \leq \alg_A(\lambda) = 1\text{,}\) which implies that \(\geo_A(\lambda) = \alg_A(\lambda)\text{.}\) Therefore by Theorem 5.2.15 \(A\) is diagonalizable.

Example 5.2.17 shows that an \(n \times n\) matrix can be diagonalizable without having \(n\) distinct eigenvalues, so the converse of the above theorem is not true.

Subsection 5.2.3 An application

To conclude our discussion of diagonalization we give just one example of the usefulness of diagonalizing a matrix (there are many others!). The example we give is a very small example of a Markov chain; Markov chains appear in modelling whenever the state of a system changes randomly in such a way that the probabilities involved depend only on the current state and no earlier ones.

In the example we will use two key observations (both are illustrated in the short computational sketch after this list):

  1. If \(A = PDP^{-1}\) then

    \begin{equation*} A^2 = PDP^{-1}PDP^{-1} = PDIDP^{-1} = PD^2P^{-1}\text{,} \end{equation*}
    and more generally, for any \(k \geq 1\text{,}\) \(A^k = PD^kP^{-1}\text{.}\)

  2. If \(D\) is a diagonal matrix then so is \(D^k\text{,}\) and the diagonal entries of \(D^k\) are the \(k\)th powers of the diagonal entries of \(D\text{.}\) Specifically, for any \(k \geq 1\text{,}\) \(\begin{bmatrix}a \amp 0 \\ 0 \amp b\end{bmatrix}^k = \begin{bmatrix}a^k \amp 0 \\ 0 \amp b^k\end{bmatrix}\text{.}\)
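Here is a small computational sketch of both observations; the helper name power_via_diagonalization is our own, for illustration only.

    import numpy as np

    def power_via_diagonalization(P, diag_entries, k):
        """Return P D^k P^{-1}, where D = diag(diag_entries)."""
        # Observation (2): the power of a diagonal matrix is taken entrywise.
        Dk = np.diag(np.asarray(diag_entries, dtype=float) ** k)
        # Observation (1): (P D P^{-1})^k = P D^k P^{-1}.
        return P @ Dk @ np.linalg.inv(P)

    # Check against direct repeated multiplication, using A = PDP^{-1}
    # from Example 5.2.10.
    P = np.array([[1.0, 0.0], [-1.0, 1.0]])
    A = np.array([[2.0, 0.0], [-5.0, -3.0]])
    A5 = power_via_diagonalization(P, [2, -3], 5)
    print(np.allclose(A5, np.linalg.matrix_power(A, 5)))  # True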

With these tools in mind, here is our example.

Consider a (fictional) student in a mathematics course. At any given minute the student is either studying the course material or not studying the course material. After meticulous observation it has been discovered that a student who is studying at one minute has a \(60\%\) chance of studying the next minute, while a student who is not studying at one minute has only a \(10\%\) chance of studying the next minute. Suppose that you see a student studying. What is the probability that they will still be studying one hour later?

Solution.

We model the situation using the transition matrix \(A = \begin{bmatrix}0.6 \amp 0.1 \\ 0.4 \amp 0.9\end{bmatrix}\text{.}\) We think of the first coordinate of a vector as the probability of studying, and the second coordinate as the probability of not studying. The key fact we need from probability is that if a given student has probability \(x\) of studying at one minute, then at the next minute their probability of studying is the top entry of \(A\begin{bmatrix}x\\1-x\end{bmatrix}\text{.}\) Therefore the probability that the student is studying \(2\) minutes later is the top entry of \(A\left(A\begin{bmatrix}x\\1-x\end{bmatrix}\right) = A^2\begin{bmatrix}x\\1-x\end{bmatrix}\text{,}\) and in general the probability that the student is studying \(k\) minutes later is the top entry of \(A^k\begin{bmatrix}x\\1-x\end{bmatrix}\text{.}\)

Our student is observed to be studying, so our starting state has \(x = 1\text{.}\) We are thus interested in the top entry of \(A^{60}\begin{bmatrix}1\\0\end{bmatrix}\text{.}\) We could multiply \(A\) by itself \(60\) times, but doing that would take quite a lot of calculation. Instead, we diagonalize \(A\text{.}\) Using the method of Example 5.2.17 we find that \(A = PDP^{-1}\) where \(P=\begin{bmatrix}-1 \amp 1\\ 1 \amp 4\end{bmatrix}\) and \(D = \begin{bmatrix}0.5 \amp 0 \\ 0 \amp 1\end{bmatrix}\text{.}\) We also calculate \(P^{-1} = \begin{bmatrix}-0.8 \amp 0.2 \\ 0.2 \amp 0.2\end{bmatrix}\) using the techniques for finding matrix inverses from Section 4.4. Thus

\begin{align*} A^{60} \amp = PD^{60}P^{-1}\\ \amp = \begin{bmatrix}-1 \amp 1 \\ 1 \amp 4\end{bmatrix}\begin{bmatrix}0.5^{60} \amp 0 \\ 0 \amp 1^{60}\end{bmatrix}\begin{bmatrix}-0.8 \amp 0.2 \\ 0.2 \amp 0.2\end{bmatrix} \\ \amp = \begin{bmatrix}-1 \amp 1 \\ 1 \amp 4\end{bmatrix}\begin{bmatrix}0.5^{60} \amp 0 \\ 0 \amp 1 \end{bmatrix}\begin{bmatrix}-0.8 \amp 0.2 \\ 0.2 \amp 0.2\end{bmatrix}\\ \amp = \begin{bmatrix}0.8/2^{60}+0.2 \amp 0.2-0.2/2^{60} \\ 0.8-0.8/2^{60} \amp 0.2/2^{60} +0.8\end{bmatrix}\text{.} \end{align*}

Now the probability we are interested in is the top entry of \(A^{60}\begin{bmatrix}1\\0\end{bmatrix}\text{,}\) which is \(0.8/2^{60}+0.2 \approx 0.200000000000000000694\text{.}\) Thus there is approximately a \(20\%\) chance that the student will be studying one hour from now.
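As a cross-check, here is a direct numerical computation (a sketch; note that the \(0.8/2^{60}\) correction is far below double precision, so the computer reports essentially \(0.2\)):

    import numpy as np

    A = np.array([[0.6, 0.1], [0.4, 0.9]])
    state = np.linalg.matrix_power(A, 60) @ np.array([1.0, 0.0])
    print(state[0])  # approximately 0.2, matching the calculation above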

Exercises 5.2.4 Exercises

1.

Find the eigenvalues and eigenvectors of the matrix \(\begin{bmatrix}-13 \amp -28 \amp 28 \\ 4 \amp 9 \amp -8 \\ -4 \amp -8 \amp 9\end{bmatrix}\text{.}\) Diagonalize if possible.

2.

Find the eigenvalues and eigenvectors of the matrix \(\begin{bmatrix}5 \amp -18 \amp -32 \\ 0 \amp 5 \amp 4 \\ 2 \amp -5 \amp -11\end{bmatrix}\text{.}\) Diagonalize if possible.

3.

Find the eigenvalues and eigenvectors of the matrix \(\begin{bmatrix}8 \amp 0 \amp 10 \\ -6 \amp -3 \amp -6 \\ -5 \amp 0 \amp -7\end{bmatrix}\text{.}\) Diagonalize if possible.

4.

Let \(A = \begin{bmatrix}1 \amp -2 \amp -1 \\ 2 \amp -1 \amp 1 \\ -2 \amp 3 \amp 1\end{bmatrix}\text{.}\) Find \(A^{100}\) by diagonalization.

5.

Give an example of two diagonalizable matrices \(A\) and \(B\) whose sum \(A+B\) is not diagonalizable.

6.

If \(A\) is diagonalizable and \(1\) and \(-1\) are the only eigenvalues of \(A\text{,}\) show that \(A^{-1} = A\text{.}\)
Solution.

Write \(A = PDP^{-1}\text{.}\) Then

\begin{equation*} A^{-1} = (PDP^{-1})^{-1} = (P^{-1})^{-1}D^{-1}P^{-1} = PD^{-1}P^{-1}\text{.} \end{equation*}

Since \(D\) is diagonal, \(D^{-1}\) is also diagonal, and for each \(j\) if the \((j,j)\) entry of \(D\) is \(d_{j,j}\) then the \((j,j)\) entry of \(D^{-1}\) is \(1/d_{j,j}\text{.}\) We know that the diagonal entries of \(D\) are the eigenvalues of \(A\text{,}\) so by hypothesis they are all \(\pm 1\text{,}\) and hence \(1/d_{j,j} = d_{j,j}\) for all \(j\text{.}\) Thus \(D = D^{-1}\text{,}\) so

\begin{equation*} A^{-1} = PD^{-1}P^{-1} = PDP^{-1} = A\text{.} \end{equation*}
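A numerical illustration of this solution (the particular \(P\) below is an arbitrary invertible choice of ours): since \(A^{-1} = A\) is equivalent to \(A^2 = I\text{,}\) we can build such an \(A\) and check that squaring it gives the identity.

    import numpy as np

    P = np.array([[2.0, 1.0], [1.0, 1.0]])           # invertible: det(P) = 1
    A = P @ np.diag([1.0, -1.0]) @ np.linalg.inv(P)  # eigenvalues 1 and -1
    print(np.allclose(A @ A, np.identity(2)))        # True, so A^{-1} = A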