Linear transformations

Section 4.1 Linear transformations

Subsection 4.1.1 Definition and first properties

So far we have considered vectors in \(\mathbb{R}^n\) as fixed objects and carried out basic operations on them. It is now time to consider functions that move vectors around. You have studied functions in previous mathematics courses, but it is likely that the functions you have spent most of your time with have taken inputs that are real numbers and given outputs that are also real numbers. Here we will be interested in functions that take vectors as inputs and give back vectors as outputs.

Definition 4.1.1.

A function \(f\) from \(\mathbb{R}^n\) to \(\mathbb{R}^m\) is a machine that accepts as an input a vector \(\vec{v}\) from \(\mathbb{R}^n\) and returns as output a vector \(f(\vec{v})\) in \(\mathbb{R}^m\text{.}\)

We often write \(f : \mathbb{R}^n \to \mathbb{R}^m\) to emphasize that \(f\) is a function from \(\mathbb{R}^n\) to \(\mathbb{R}^m\text{.}\)

You may have seen in calculus that there is very little that you can say about an arbitrary function \(f : \mathbb{R} \to \mathbb{R}\text{;}\) in order to do calculus you need to assume more about the function (such as continuity, differentiability, or other such properties). Similarly, in linear algebra there isn't a lot that we can say about an arbitrary function \(f : \mathbb{R}^n \to \mathbb{R}^m\text{.}\) In order to be able to say something meaningful we need to assume that the function \(f\) has some nice properties. In particular, when we study vectors the most basic operations we have are addition and scalar multiplication, and we would like our functions to respect those operations. The idea of "respecting" the operations is made precise in the following definition.

Definition 4.1.2.

Let \(f : \mathbb{R}^n \to \mathbb{R}^m\) be a function. We say that \(f\) is a linear transformation if it satisfies both of the following properties:

For all vectors \(\vec{v}\) and \(\vec{w}\) in \(\mathbb{R}^n\text{,}\)
\begin{equation*} f(\vec{v}+\vec{w}) = f(\vec{v}) + f(\vec{w})\text{.} \end{equation*}
For every vector \(\vec{v}\) in \(\mathbb{R}^n\) and every scalar \(k\text{,}\)
\begin{equation*} f(k\vec{v}) = kf(\vec{v})\text{.} \end{equation*}

Sometimes instead of saying that a function "is a linear transformation" we will just say that it "is linear". It is conventional to use the letters \(T, S, R, \ldots \) instead of \(f, g, h, \ldots \) for linear transformations, although this is only a stylistic choice. You should also be careful - knowing that a function has been named \(T\) does not mean it must be linear!

Example 4.1.3.

Consider the function \(T : \mathbb{R}^3 \to \mathbb{R}^2\) defined by \(T\left(\matr{c}{x\\y\\z}\right) = \matr{c}{x+y\\2x-z}\text{.}\) We will show that this function \(T\) is a linear transformation. To do so we need to check both of the properties from Definition 4.1.2.

For the first property, suppose that we have two vectors \(\vec{v} = \begin{bmatrix}a\\b\\c\end{bmatrix}\) and \(\vec{w} = \begin{bmatrix}d\\e\\f\end{bmatrix}\text{.}\) Then we calculate:

\begin{align*} T(\vec{v}+\vec{w}) \amp = T\left(\begin{bmatrix}a\\b\\c\end{bmatrix} + \begin{bmatrix}d\\e\\f\end{bmatrix}\right)\\ \amp = T\left(\begin{bmatrix}a+d\\b+e\\c+f\end{bmatrix}\right) \\ \amp = \begin{bmatrix}(a+d)+(b+e)\\2(a+d)-(c+f)\end{bmatrix} \\ \amp = \begin{bmatrix}a+b\\2a-c\end{bmatrix} + \begin{bmatrix}d+e\\2d-f\end{bmatrix}\\ \amp = T\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right) + T\left(\begin{bmatrix}d\\e\\f\end{bmatrix}\right)\\ \amp = T(\vec{v}) + T(\vec{w}) \end{align*}

The calculation above verifies the first property in the definition of being a linear transformation. For the second property, suppose that \(\vec{v} = \begin{bmatrix}a\\b\\c\end{bmatrix}\) and that \(k\) is a scalar. Then:

\begin{align*} T(k\vec{v}) \amp = T\left(k\begin{bmatrix}a\\b\\c\end{bmatrix}\right)\\ \amp = T\left(\begin{bmatrix}ka\\kb\\kc\end{bmatrix}\right)\\ \amp = \begin{bmatrix}ka+kb\\2ka-kc\end{bmatrix}\\ \amp = \begin{bmatrix}k(a+b)\\k(2a-c)\end{bmatrix}\\ \amp = k\begin{bmatrix}a+b\\2a-c\end{bmatrix}\\ \amp = kT\left(\begin{bmatrix}a\\b\\c\end{bmatrix}\right) \\ \amp = kT(\vec{v}) \end{align*}

Now we have verified both properties from the definition of being a linear transformation, so we conclude that our function \(T\) is a linear transformation.

Example 4.1.4.

Consider the function \(T : \mathbb{R}^2 \to \mathbb{R}^3\) defined by \(T\left(\begin{bmatrix}x\\y\end{bmatrix}\right) = \begin{bmatrix}0\\xy\\0\end{bmatrix}\text{.}\) We will show that \(T\) is not a linear transformation. To do this we need to show that at least one of the two parts of the definition fails at least one time. For instance, we have the following calculations:

\begin{gather*} T\left(2\begin{bmatrix}1\\1\end{bmatrix}\right) = T\left(\begin{bmatrix}2\\2\end{bmatrix}\right) = \begin{bmatrix}0\\4\\0\end{bmatrix}\\ 2T\left(\begin{bmatrix}1\\1\end{bmatrix}\right) = 2\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}0\\2\\0\end{bmatrix} \end{gather*}

We therefore see that \(T\left(2\begin{bmatrix}1\\1\end{bmatrix}\right) \neq 2T\left(\begin{bmatrix}1\\1\end{bmatrix}\right)\text{.}\) Therefore the second point of Definition 4.1.2 is not satisfied, so \(T\) is not a linear transformation.

Notice that in Example 4.1.3, where we showed that a certain function is a linear transformation, we needed to verify both parts of Definition 4.1.2 in general - using a specific numerical example would not be enough to be sure that the required properties hold for all vectors and all scalars. On the other hand, in Example 4.1.4, where we showed that a certain function is not a linear transformation, it was enough to give one specific numerical example where one of the properties from Definition 4.1.2 does not hold.

Example 4.1.5.

Fix an angle \(\theta\text{,}\) and consider the function \(T_\theta : \mathbb{R}^2 \to \mathbb{R}^2 \) that rotates vectors in the plane counterclockwise by \(\theta\) radians. Take a moment to convince yourself that \(T_\theta\) is a linear transformation.

We defined linear transformations to be functions that respect our fundamental operations of vector addition and scalar multiplication. It turns out that our definition is equivalent to requiring that the function respects all linear combinations.

Theorem 4.1.6.

Let \(T : \mathbb{R}^n \to \mathbb{R}^m \) be a function. The following are equivalent:

\(T\) is a linear transformation.
For all vectors \(\vec{v_1}, \ldots, \vec{v_k}\) in \(\mathbb{R}^n\) and all scalars \(c_1, \ldots, c_k\text{,}\)
\begin{equation*} T(c_1\vec{v_1} + \cdots + c_k\vec{v_k}) = c_1T(\vec{v_1}) + \cdots + c_kT(\vec{v_k})\text{.} \end{equation*}

Proof.

A fully rigorous proof of (1) implies (2) would use the technique of mathematical induction; instead, we will only give the proof in the case \(k=3\) (we stress that this is not a complete proof, just an illustration to help you believe the statement). Assume that \(T\) is a linear transformation. Using the two properties from Definition 4.1.2 we calculate:

\begin{align*} T(c_1\vec{v_1}+c_2\vec{v_2}+c_3\vec{v_3}) \amp = T(c_1\vec{v_1} + (c_2\vec{v_2} + c_3\vec{v_3}))\\ \amp = T(c_1\vec{v_1}) + T(c_2\vec{v_2}+c_3\vec{v_3})\\ \amp = T(c_1\vec{v_1}) + T(c_2\vec{v_2}) + T(c_3\vec{v_3})\\ \amp = c_1T(\vec{v_1}) + c_2T(\vec{v_2}) + c_3T(\vec{v_3}) \end{align*}

Now we prove that (2) implies (1), so assume that (2) is true. Then, in particular, for any vectors \(\vec{v}, \vec{w}\) we can use the \(k=2\) case of (2) to calculate:

\begin{equation*} T(\vec{v}+\vec{w}) = T(1\vec{v}+1\vec{w}) = 1T(\vec{v}) + 1T(\vec{w}) = T(\vec{v})+T(\vec{w})\text{,} \end{equation*}

which confirms the first point of the definition of linear transformations. The second point of the definition is exactly the \(k=1\) case of (2), so there is nothing to prove there.

Subsection 4.1.2 Matrix representation

The goal of this section is to show how we can encode a linear transformation as a matrix. Doing this will allow us to translate almost all questions about linear transformations into questions about matrices, which are often easier to solve. Our next theorem is the key to making this possible.

Theorem 4.1.7.

Fix vectors \(\vec{v_1}, \ldots, \vec{v_n}\) such that \(\SpanS(\vec{v_1}, \ldots, \vec{v_n}) = \mathbb{R}^n\text{.}\) Suppose that \(T : \mathbb{R}^n \to \mathbb{R}^m\) and \(S : \mathbb{R}^n \to \mathbb{R}^m\) are two linear transformations, and suppose that for every \(k\) with \(1 \leq k \leq n\) we have \(T(\vec{v_k}) = S(\vec{v_k})\text{.}\) Then \(T=S\) (that is, for every vector \(\vec{w}\) in \(\mathbb{R}^n\text{,}\) \(T(\vec{w}) = S(\vec{w})\)).

Proof.

By Definition 3.1.1 there are scalars \(c_1, \ldots, c_n\) such that \(\vec{w} = c_1\vec{v_1}+\cdots+c_n\vec{v_n}\text{.}\) We now use Theorem 4.1.6 to calculate:

\begin{align*} T(\vec{w}) \amp = T(c_1\vec{v_1} + \cdots + c_n\vec{v_n})\\ \amp = c_1T(\vec{v_1}) + \cdots + c_nT(\vec{v_n})\\ \amp = c_1S(\vec{v_1}) + \cdots + c_nS(\vec{v_n})\\ \amp = S(c_1\vec{v_1}+\cdots+c_n\vec{v_n})\\ \amp = S(\vec{w}). \end{align*}

While Theorem 4.1.7 can be applied using any set of vectors that spans \(\mathbb{R}^n\text{,}\) there is a very natural choice of vectors to work with, namely the standard basis vectors \(\vec{e_1}, \ldots, \vec{e_n}\text{.}\) Since any two linear transformations that act in the same way on \(\vec{e_1}, \ldots, \vec{e_n}\) must be the same transformation, it makes sense to keep track of a transformation \(T\) by recording the outputs \(T(\vec{e_1}), \ldots, T(\vec{e_n})\text{.}\)

Definition 4.1.8.

Let \(T : \mathbb{R}^n \to \mathbb{R}^m\) be a linear transformation. The standard matrix of \(T\) is the matrix whose columns are \(T(\vec{e_1}), \ldots, T(\vec{e_n})\text{.}\) We denote the standard matrix of \(T\) by \([T]\text{.}\)

Example 4.1.9.

Let \(T:\mathbb{R}^2 \to \mathbb{R}^4\) be defined by \(T\left(\begin{bmatrix}x\\y\end{bmatrix}\right) = \begin{bmatrix}2x-y\\4y\\0\\x+y\end{bmatrix}\text{.}\) Then \(T\) is a linear transformation (as you should verify!). To find the standard matrix of \(T\) we do some calculations.

\begin{gather*} T(\vec{e_1}) = T\left(\begin{bmatrix}1\\0\end{bmatrix}\right) = \begin{bmatrix}2\\0\\0\\1\end{bmatrix}\\ T(\vec{e_2}) = T\left(\begin{bmatrix}0\\1\end{bmatrix}\right) = \begin{bmatrix}-1\\4\\0\\1\end{bmatrix} \end{gather*}

Therefore, by definition,

\begin{equation*} [T] = \matr{cc}{2 \amp -1 \\ 0 \amp 4 \\ 0 \amp 0 \\ 1 \amp 1}\text{.} \end{equation*}

Definition 4.1.10.

A matrix with \(m\) rows and \(n\) columns is said to be an \(m \times n\) matrix, and \(m \times n\) is referred to as the size of the matrix. If \(1 \leq i \leq m\) and \(1 \leq j \leq n\) then the entry in row \(i\) and column \(j\) of a matrix is called the \((i, j)\) entry. We sometimes write \(A = [a_{i,j}]\) to mean that for each \(i,j\) the \((i,j)\) entry of \(A\) is \(a_{i,j}\text{.}\)

If \(A\) and \(B\) are matrices then we say \(A = B\) if \(A\) and \(B\) have the same size, and for all \(i, j\) the \((i, j)\) entry of \(A\) is the same as the \((i, j)\) entry of \(B\text{.}\)

Observation 4.1.11.

Notice that if \(T : \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation then the outputs from \(T\) will be vectors in \(\mathbb{R}^m\text{,}\) so the matrix \([T]\) will have \(m\) rows. The columns will be \(T(\vec{e_1}), \ldots, T(\vec{e_n})\text{,}\) so there will be \(n\) columns. Thus if \(T : \mathbb{R}^n \to \mathbb{R}^m\) is a linear transformation then \([T]\) is an \(m \times n\) matrix.

In the terminology we have established, Theorem 4.1.7 says that if \(T\) and \(S\) are linear transformations then \(T\) and \(S\) are the same transformation if and only if \([T] = [S]\text{.}\)

Subsection 4.1.3 Using the matrix representation

Given a linear transformation \(T : \mathbb{R}^n \to \mathbb{R}^m\text{,}\) we now have a matrix \([T]\) that encodes all the information about \(T\text{.}\) The main thing one does with functions is apply them to vectors and look at the output, so we would like to be able to use the matrix \([T]\) to calculate outputs of \(T\text{.}\) That is, we would like to be able to start with a vector \(\vec{v}\) in \(\mathbb{R}^n\) and the matrix \([T]\) and use that data to calculate the vector \(T(\vec{v})\text{.}\)

Suppose that \(\vec{v} = \begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix}\text{.}\) Then we can write \(\vec{v} = x_1\vec{e_1} + \cdots + x_n\vec{e_n}\text{,}\) so since \(T\) is linear we have

\begin{equation*} T(\vec{v}) = x_1T(\vec{e_1}) + \cdots x_nT(\vec{e_n})\text{.} \end{equation*}

Now the vectors \(T(\vec{e_1}), \ldots, T(\vec{e_n})\) are precisely the columns of the matrix \([T]\text{,}\) so we see that \(T(\vec{v})\) is the linear combination of the columns of \([T]\) where the coefficients of the linear combination are the entries of the vector \(\vec{v}\text{.}\) This calculation is so important that it gets its own definition.

Definition 4.1.12.

Let \(A\) be an \(m \times n\) matrix with columns \(\vec{A_1}, \ldots, \vec{A_n}\text{,}\) and let \(\vec{v} = \begin{bmatrix}x_1\\ \vdots \\ x_n\end{bmatrix}\) be a vector in \(\mathbb{R}^n\text{.}\) We define the product of \(A\) and \(\vec{v}\) to be:

\begin{equation*} A\vec{v} = x_1\vec{A_1} + \cdots + x_n\vec{A_n}\text{.} \end{equation*}

The discussion before the definition is actually a proof of the following very important theorem.

Theorem 4.1.13.

Let \(T : \mathbb{R}^n \to \mathbb{R}^m\) be a linear transformation, and let \(\vec{v}\) be a vector in \(\mathbb{R}^n\text{.}\) Then

\begin{equation*} T(\vec{v}) = [T]\vec{v}\text{.} \end{equation*}

Example 4.1.14.

In Example 4.1.9 we considered the linear transformation \(T:\mathbb{R}^2 \to \mathbb{R}^4\) defined by \(T\left(\begin{bmatrix}x\\y\end{bmatrix}\right) = \begin{bmatrix}2x-y\\4y\\0\\x+y\end{bmatrix}\text{,}\) and we found that \([T] = \matr{cc}{2 \amp -1 \\ 0 \amp 4 \\ 0 \amp 0 \\ 1 \amp 1}\text{.}\) Now consider the vector \(\vec{v} = \begin{bmatrix}5\\-2\end{bmatrix}\text{.}\) On the one hand, plugging this in to the formula for \(T\) gives us

\begin{equation*} T(\vec{v}) = \begin{bmatrix} 2(5)-(-2) \\ 4(-2) \\ 0 \\ 5+(-2)\end{bmatrix} = \begin{bmatrix}12 \\ -8 \\ 0 \\ 3\end{bmatrix}\text{.} \end{equation*}

On the other hand, if we use the product of \([T]\) with \(\vec{v}\) we get

\begin{equation*} [T]\vec{v} = \matr{cc}{2 \amp -1 \\ 0 \amp 4 \\ 0 \amp 0 \\ 1 \amp 1}\begin{bmatrix}5\\-2\end{bmatrix} = 5\begin{bmatrix}2\\0\\0\\1\end{bmatrix} + (-2)\begin{bmatrix}-1\\4\\0\\1\end{bmatrix} = \begin{bmatrix}12 \\ -8 \\ 0 \\3\end{bmatrix}\text{.} \end{equation*}

Thus, as predicted by Theorem 4.1.13, \(T(\vec{v}) = [T]\vec{v}\text{.}\)

Example 4.1.15.

Fix an angle \(\theta\text{,}\) and consider the function \(T_\theta : \mathbb{R}^2 \to \mathbb{R}^2\) which rotates counterclockwise by \(\theta\) radians. In Example 4.1.5 you were asked to think about why \(T_\theta\) is a linear transformation. Taking for granted that it is linear, we can use the matrix \([T_\theta]\) to find a formula for this rotation.

First, we need to calculate \([T_\theta]\text{.}\) To do that we need to find \(T_\theta\left(\begin{bmatrix}1\\0\end{bmatrix}\right)\) and \(T_\theta\left(\begin{bmatrix}0\\1\end{bmatrix}\right)\text{.}\) That is, need to know where we will end up if we start at \(\begin{bmatrix}1\\0\end{bmatrix}\) and rotate counterclockwise by \(\theta\) radians. As you know from highschool trigonometry, the result is \(\begin{bmatrix}\cos(\theta) \\ \sin(\theta)\end{bmatrix}\text{.}\)

Figure 4.1.16. Rotating \(\vec{e_1} = \begin{bmatrix}1\\0\end{bmatrix}\) by \(\theta\text{.}\)

Working out what happens when we rotate \(\begin{bmatrix}0\\1\end{bmatrix}\) requires a bit more trigonometry, but you can check that the result is \(\begin{bmatrix}-\sin(\theta) \\ \cos(\theta)\end{bmatrix}\text{.}\)

Figure 4.1.17. Rotating \(\vec{e_2} = \begin{bmatrix}0\\1\end{bmatrix}\) by \(\theta\text{.}\)

Thus we have \([T_\theta] = \matr{cc}{\cos(\theta) \amp -\sin(\theta) \\ \sin(\theta) \amp \cos(\theta)}\text{.}\) Now we can calculate, for any vector,

\begin{align*} T_\theta\left(\begin{bmatrix}x\\y\end{bmatrix}\right) \amp = [T_\theta]\begin{bmatrix}x\\y\end{bmatrix} \\ \amp = \matr{cc}{\cos(\theta) \amp -\sin(\theta) \\ \sin(\theta) \amp \cos(\theta)}\begin{bmatrix}x\\y\end{bmatrix} \\ \amp = x\begin{bmatrix}\cos(\theta) \\ \sin(\theta)\end{bmatrix} + y\begin{bmatrix}-\sin(\theta) \\ \cos(\theta)\end{bmatrix} \\ \amp = \begin{bmatrix}x\cos(\theta) - y\sin(\theta) \\ x\sin(\theta) + y\cos(\theta)\end{bmatrix}. \end{align*}

We have thus found a formula for the rotation of any vector in \(\mathbb{R}^2\) by any angle \(\theta\text{!}\)

We have seen that every linear transformation gives rise to a matrix, and that applying the transformation to a vector is the same thing as multiplying the matrix by the vector. Now we go the other way, and start with the matrix. The next theorem completes the picture that matrices and linear transformations are fundamentally the same thing.

Theorem 4.1.18.

Let \(A\) be an \(m \times n\) matrix. Define a function \(T : \mathbb{R}^n \to \mathbb{R}^m\) by \(T(\vec{v}) = A\vec{v}\text{.}\) Then \(T\) is a linear transformation, and \([T] = A\text{.}\)

Proof.

Let the columns of \(A\) be \(\vec{A_1}, \ldots, \vec{A_n}\text{.}\) Suppose that \(\vec{v} = \begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix}\) and \(\vec{w} = \begin{bmatrix}y_1 \\ \vdots \\ y_n\end{bmatrix}\text{,}\) and let \(c\) be a scalar. Then we can calculate:

\begin{align*} T_A(\vec{v}+\vec{w}) \amp = A(\vec{v}+\vec{w}) \\ \amp = A\begin{bmatrix}x_1+y_1 \\ \vdots \\ x_n+y_n\end{bmatrix} \\ \amp = (x_1+y_1)\vec{A_1} + \cdots + (x_n+y_n)\vec{A_n} \\ \amp = x_1\vec{A_1}+y_1\vec{A_1} + \cdots + x_n\vec{A_n} + y_n\vec{A_n} \\ \amp = (x_1\vec{A_1} + \cdots + x_n\vec{A_n}) + (y_1\vec{A_1} + \cdots + y_n\vec{A_n}) \\ \amp = A\begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix} + A\begin{bmatrix}y_1 \\ \vdots \\ y_n\end{bmatrix} \\ \amp = A\vec{v} + A\vec{w} \\ \amp = T_A(\vec{v}) + T_A(\vec{w})\text{,} \end{align*}

and

\begin{align*} T_A(c\vec{v}) \amp = A\begin{bmatrix}cx_1 \\ \vdots \\ cx_n\end{bmatrix} \\ \amp = (cx_1)\vec{A_1} + \cdots + (cx_n)\vec{A_n} \\ \amp = c(x_1\vec{A_1} + \cdots + x_n\vec{A_n}) \\ \amp = c(A\vec{v}) \\ \amp = cT_A(\vec{v}) \text{.} \end{align*}

These two calculations show that \(T_A\) is a linear transformation.

Now we need to show that \([T_A] = A\text{.}\) To do that, let us start by calculating the first column of \([T_A]\text{,}\) which by definition is \(T_A(\vec{e_1})\text{.}\)

\begin{align*} T_A(\vec{e_1}) \amp = A\vec{e_1} \\ \amp = A\begin{bmatrix}1 \\ 0 \\ \vdots \\ 0\end{bmatrix} \\ \amp = 1\vec{A_1} + 0\vec{A_2} + \cdots + 0\vec{A_n} \\ \amp = \vec{A_1} \text{.} \end{align*}

This shows that the first column of \([T_A]\) is the same as the first column of \(A\text{.}\) Repeating this calculation with \(\vec{e_2}, \cdots, \vec{e_n}\) shows that for every \(k\) the \(k\)th column of \([T_A]\) is the same as the \(k\)th column of \(A\text{,}\) and thus \([T_A] = A\text{.}\)

We will only occasionally have a reason to explicitly talk about the transformation \(T_A\) associated to a matrix \(A\text{,}\) but we will use the result of Theorem 4.1.18 very frequently, in the sense that we will use these two equations all the time:

\(\displaystyle A(\vec{v}+\vec{w}) = A\vec{v} + A\vec{w}\)
\(A(c\vec{v}) = cA\vec{v}\text{.}\)

Subsection 4.1.4 A connection to systems of linear equations

Although we introduced multiplication of a matrix and a vector for reasons connected to linear transformations, the same operation has very close ties to systems of linear equations.

Theorem 4.1.19.

Let \(A\) be an \(m \times n\) matrix, let \(\vec{b} = \begin{bmatrix}b_1 \\ \vdots \\ b_m\end{bmatrix}\text{,}\) and let \(\vec{x} = \begin{bmatrix}x_1 \\ \vdots \\ x_n\end{bmatrix}\text{.}\) Then \(A\vec{x} = \vec{b}\) if and only if \(\vec{x}\) is a solution to the system of linear equations with augmented matrix \([A | \vec{b}]\text{.}\)

Proof.

Let \(A = \begin{bmatrix}a_{1,1} \amp a_{1, 2} \amp \cdots \amp a_{1,n} \\ a_{2, 1} \amp a_{2,2} \amp \cdots \amp a_{2, n} \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m,1} \amp a_{m,2} \amp \cdots \amp a_{m,n}\end{bmatrix}\text{.}\) Then carrying out the matrix multiplication gives:

\begin{align*} A\vec{x} \amp = \begin{bmatrix}a_{1,1} \amp a_{1, 2} \amp \cdots \amp a_{1,n} \\ a_{2, 1} \amp a_{2,2} \amp \cdots \amp a_{2, n} \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m,1} \amp a_{m,2} \amp \cdots \amp a_{m,n}\end{bmatrix}\begin{bmatrix}x_1\\ \vdots \\ x_n\end{bmatrix} \\ \amp = x_1\begin{bmatrix}a_{1,1}\\a_{2,1}\\ \vdots \\ a_{m,1}\end{bmatrix} + x_2\begin{bmatrix}a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{m,2}\end{bmatrix} + \cdots + x_n\begin{bmatrix}a_{1,n}\\a_{2,n}\\ \vdots \\ a_{m,n}\end{bmatrix} \\ \amp = \begin{bmatrix}a_{1,1}x_1 + a_{1, 2}x_2 + \cdots + a_{1,n}x_n \\ a_{2,1}x_1 + a_{2,2}x_2 + \cdots + a_{2,n}x_n \\ \vdots \\ a_{m,1}x_1 + a_{m,2}x_2 + \cdots + a_{m,n}x_n\end{bmatrix}\text{.} \end{align*}

We therefore see that the equation \(A\vec{x}=\vec{b}\) is equivalent to the system of equations

\begin{gather*} a_{1,1}x_1 + \cdots + a_{1,n}x_n = b_1\\ a_{2,1}x_1 + \cdots + a_{2,n}x_n = b_n\\ \vdots \\ a_{m,1}x_1 + \cdots + a_{m,n}x_n = b_m \end{gather*}

The coefficient matrix of this system is exactly \(A\text{,}\) and so \(A\vec{x}=\vec{b}\) if and only if \(\vec{x}\) is a solution to the system \([A|\vec{b}]\text{.}\)

For the moment, this result just gives us a way of re-writing a system of linear equations as a single equation involving matrices and vectors. After we develop more of the algebraic properties of equations involving matrices in Section 4.2 and Section 4.3 we will return in Section 4.4 to see how thinking of a system of linear equations as a matrix equation can sometimes be helpful.

Exercises 4.1.5 Exercises

1.

Which of the following vector functions are linear transformations? Explain.

\(T_1\left( \begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 2x + y \\ x-2y \\ -x-y \end{bmatrix}\) Solution.
Note first that \(T_{1} \colon \mathbb{R}^{2} \to \mathbb{R}^{3}\text{.}\) We claim that \(T_{1}\) is indeed a linear transformation, so we check the properties in Definition 4.1.2:
1. \(T_{1}\) preserves addition: let \(\mathbf{v} = \begin{bmatrix} v_{1} \\ v_{2} \end{bmatrix}, \mathbf{u} = \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}\in\mathbb{R}^2\) be arbitrary. We compute:
  \begin{equation*} T_{1}(\mathbf{v}+ \mathbf{u}) = T_{1}\left(\begin{bmatrix} v_{1} + u_{1} \\ v_{2} + u_{2} \end{bmatrix}\right) = \begin{bmatrix} 2(v_{1}+u_{1}) + v_{2} + u_{2} \\ v_{1}+u_{1}-2(v_{2} + u_{2} ) \\ -(v_{1}+u_{1})-(v_{2} + u_{2} ) \end{bmatrix} . \end{equation*}
  On the other hand:
  \begin{align*} T_{1}(\mathbf{v})+T(\mathbf{u}) =\amp \begin{bmatrix} 2v_{1} + v_{2} \\ v_{1}-2v_{2} \\ -v_{1}-v_{2}\end{bmatrix} + \begin{bmatrix} 2u_{1} + u_{2} \\ u_{1}-2u_{2} \\ -u_{1}-u_{2} \end{bmatrix}\\ =\amp \begin{bmatrix} 2(v_{1}+u_{1}) + v_{2} + u_{2} \\ v_{1}+u_{1}-2(v_{2} + u_{2} ) \\ -(v_{1}+u_{1})-(v_{2} + u_{2} ) \end{bmatrix} = T_{1}(\mathbf{v}+ \mathbf{u}). \end{align*}
  Since the two are equal, we conclude that \(T_{1}\) preserves addition.
2. \(T_{1}\) preserves scalar multiplication: let \(\mathbf{u} = \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}\in\mathbb{R}^2\) and \(k\in\mathbb{R}\) be arbitrary. We compute:
  \begin{equation*} T_{1}(k\mathbf{u}) = T_{1}\left( \begin{bmatrix} k u_{1} \\ k u_{2} \end{bmatrix}\right) = \begin{bmatrix} 2 (k u_{1}) + k u_{2} \\ k u_{1} -2(k u_{2}) \\ -k u_{1}-k u_{2} \end{bmatrix}. \end{equation*}
  On the other hand:
  \begin{align*} kT_{1}(\mathbf{u}) =\amp k \begin{bmatrix} 2 u_{1} + u_{2} \\ u_{1} -2 u_{2} \\ - u_{1}- u_{2} \end{bmatrix}\\ =\amp \begin{bmatrix} 2 (k u_{1}) + k u_{2} \\ k u_{1} -2(k u_{2}) \\ -k u_{1}-k u_{2} \end{bmatrix} = T_{1}(k\mathbf{u}). \end{align*}
  Since the two are equal, we conclude that \(T_{1}\) preserves scalar multiplication.
Answer.

Yes, it is a linear transformation.
\(T_2 \left(\begin{bmatrix} x \\ y \\ z \end{bmatrix} \right) = \begin{bmatrix} x + y^2 \\ (x+y)z \\ 0 \end{bmatrix}\) Solution.

Note first that \(T_{2} \colon \mathbb{R}^{3} \to \mathbb{R}^{3}\text{.}\) We claim that \(T_{2}\) is not linear. For example,
\begin{equation*} T_{2} \left(3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) = T_{2} \left(\begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 0 + 3^2 \\ (0+3)0 \\ 0 \end{bmatrix} = \begin{bmatrix} 9 \\ 0 \\ 0 \end{bmatrix}, \end{equation*}
while
\begin{equation*} 3 T_{2} \left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) = 3\begin{bmatrix} 0 + 1^2 \\ (0+1)0 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}. \end{equation*}
This shows that
\begin{equation*} T_{2} \left(3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) \neq 3 T_{2} \left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right), \end{equation*}
so \(T_{2}\) does not preserve scalar multiplication.
Answer.

No, it is not a linear transformation.
\(T_3 \left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\) Solution.
Note first that \(T_{3} \colon \mathbb{R}^{3} \to \mathbb{R}^{2}\text{.}\) We claim that \(T_{3}\) is indeed a linear transformation, so we check the properties in Definition 4.1.2:
1. \(T_{3}\) preserves addition: let \(\mathbf{v} = \begin{bmatrix} v_{1} \\ v_{2} \\ v_{3} \end{bmatrix}, \mathbf{u} = \begin{bmatrix} u_{1} \\ u_{2} \\ u_{3} \end{bmatrix}\in\mathbb{R}^3\) be arbitrary. We compute:
  \begin{equation*} T_{3}(\mathbf{v}+ \mathbf{u}) = T_{3}\left(\begin{bmatrix} v_{1} + u_{1} \\ v_{2} + u_{2} \\ v_{3} + u_{3} \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} . \end{equation*}
  On the other hand:
  \begin{align*} T_{3}(\mathbf{v})+T(\mathbf{u}) =\amp \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = T_{3}(\mathbf{v}+ \mathbf{u}). \end{align*}
  Since the two are equal, we conclude that \(T_{3}\) preserves addition.
2. \(T_{3}\) preserves scalar multiplication: let \(\mathbf{u} = \begin{bmatrix} u_{1} \\ u_{2} \\ u_{3} \end{bmatrix}\in\mathbb{R}^3\) and \(k\in\mathbb{R}\) be arbitrary. We compute:
  \begin{equation*} T_{3}(k\mathbf{u}) = T_{3}\left( \begin{bmatrix} k u_{1} \\ k u_{2} \\ k u_{3}\end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \end{equation*}
  On the other hand:
  \begin{align*} k T_{3}(\mathbf{u}) =\amp k \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = T_{3}(k\mathbf{u}). \end{align*}
  Since the two are equal, we conclude that \(T_{1}\) preserves scalar multiplication.
Answer.

Yes, it is a linear transformation.

Hint.

Recall the definition of a linear transformation.

2.

Consider the following functions \(T\colon \mathbb{R}^3 \to \mathbb{R}^2. \) Explain why each of these functions \(T \) is not linear.

\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z + 1 \\ 2y - 3x + z \end{bmatrix} \end{equation*}
Solution.

The problem is the \(+1\) in the first component: For a vector function \(S\) to be linear, we need that \(S(k\mathbf{u}) = k S(\mathbf{u})\) for all scalars \(k\) and vectors \(\mathbf{u}\text{.}\) In particular, \(S(\mathbf{0}) = S(0\mathbf{0})= 0 S(\mathbf{0}) = \mathbf{0}.\) However, \(T(\mathbf{0}) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\neq \mathbf{0}\text{,}\) so \(T\) cannot be linear.
Answer.

The problem is in the \(+1\text{.}\)
\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y^2 + 3z \\ 2y + 3x + z \end{bmatrix} \end{equation*}
Solution.

The problem is in the \(y^2\text{:}\) Consider
\begin{equation*} T \left(4 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) = T \left(\begin{bmatrix} 0 \\ 4 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 0 + 2\cdot 4^2 + 2\cdot 0 \\ 2\cdot 4 + 3\cdot 0 + 0 \end{bmatrix} = \begin{bmatrix} 32 \\ 8 \\ 0 \end{bmatrix}, \end{equation*}
while
\begin{equation*} 4 T \left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) = 4 \begin{bmatrix} 0 + 2\cdot 1^2 + 3\cdot 0 \\ 2\cdot 1 + 3\cdot 0 + 0 \end{bmatrix} = 4 \begin{bmatrix} 2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 8 \\ 8 \\ 0 \end{bmatrix} . \end{equation*}
This shows that
\begin{equation*} T \left(4 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right) \neq 4 T \left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right), \end{equation*}
so \(T\) does not preserve scalar multiplication.
Answer.

The problem is in the \(y^2\text{.}\)
\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} \sin(x) + 2y + 3z \\ 2y - 3x + z \end{bmatrix} \end{equation*}
Solution.

The problem is in the \(\sin (x)\text{:}\) Consider
\begin{equation*} T \left(2 \begin{bmatrix} \frac{\pi}{2} \\ 0 \\ 0 \end{bmatrix} \right) = T \left(\begin{bmatrix} \pi \\ 0 \\ 0 \end{bmatrix} \right) = \begin{bmatrix}\sin(\pi) + 2\cdot 0 + 3\cdot 0 \\ 2\cdot 0 - 3\pi + 0\end{bmatrix} = \begin{bmatrix} 0 \\ - 3\pi\end{bmatrix}, \end{equation*}
while
\begin{equation*} 2 T \left(\begin{bmatrix} \frac{\pi}{2} \\ 0 \\ 0 \end{bmatrix} \right) = 2 \begin{bmatrix} \sin\left(\frac{\pi}{2}\right) + 2\cdot 0 + 3\cdot 0 \\ 2\cdot 0 - 3\cdot \frac{\pi}{2} + 0\end{bmatrix} = 2 \begin{bmatrix} 1 \\ - 3\cdot \frac{\pi}{2} \end{bmatrix} = \begin{bmatrix} 2 \\ - 3 \pi \end{bmatrix} . \end{equation*}
This shows that
\begin{equation*} T \left(2 \begin{bmatrix} \frac{\pi}{2} \\ 0 \\ 0 \end{bmatrix}\right) \neq 2 T \left(\begin{bmatrix} \frac{\pi}{2} \\ 0 \\ 0 \end{bmatrix} \right), \end{equation*}
so \(T\) does not preserve scalar multiplication.
Answer.

The problem is in the \(\sin(x)\text{.}\)

Hint.

Recall the definition of a linear transformation.

3.

Consider the following linear transformations \(T: \mathbb{R}^3 \to \mathbb{R}^2. \) For each, determine the matrix \(A \) such that \(T(x) = Ax. \)

\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2y - 3x + z \end{bmatrix} \end{equation*}
Solution.

We apply \(T\) to the standard basis \(\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}\) of \(\mathbb{R}^{3}\text{:}\)
\begin{align*} T\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 1 + 2\cdot 0 + 3\cdot 0 \\ 2\cdot 0 - 3\cdot 1 + 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -3 \end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 0 + 2\cdot 1 + 3\cdot 0 \\ 2\cdot 1 - 3\cdot 0 + 0\end{bmatrix} = \begin{bmatrix} 2 \\ 2\end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) =\amp \begin{bmatrix} 0 + 2\cdot 0 + 3\cdot 1 \\ 2\cdot 0 - 3\cdot 0 + 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}. \end{align*}
We conclude that \(T\) corresponds to the matrix
\begin{equation*} A = \matr{ccc} { 1 \amp 2 \amp 3 \\ -3 \amp 2 \amp 1 }. \end{equation*}

Answer.

\begin{equation*} A = \matr{ccc} { 1 \amp 2 \amp 3 \\ -3 \amp 2 \amp 1 } \end{equation*}
\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 7x + 2y + z\\ 3x - 11y + 2z \end{bmatrix} \end{equation*}
Solution.

We apply \(T\) to the standard basis \(\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}\) of \(\mathbb{R}^{3}\text{:}\)
\begin{align*} T\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 7\cdot 1 + 2\cdot 0 + 0\\ 3\cdot 1 - 11\cdot 0 + 2\cdot 0\end{bmatrix} = \begin{bmatrix} 7 \\ 3 \end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 7\cdot 0 + 2\cdot 1 + 0\\ 3\cdot 0 - 11\cdot 1 + 2\cdot 0\end{bmatrix} = \begin{bmatrix} 2 \\ -11 \end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) =\amp \begin{bmatrix} 7\cdot 0 + 2\cdot 0 + 1\\ 3\cdot 0 - 11\cdot 0 + 2\cdot 1\end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}. \end{align*}
We conclude that \(T\) corresponds to the matrix
\begin{equation*} A = \matr{ccc} { 7 \amp 2 \amp 1 \\ 3 \amp -11 \amp 2 }. \end{equation*}

Answer.

\begin{equation*} A = \matr{ccc} { 7 \amp 2 \amp 1 \\ 3 \amp -11 \amp 2 } \end{equation*}
\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3x + 2y + z \\ x + 2y + 6z \end{bmatrix} \end{equation*}
Solution.

We apply \(T\) to the standard basis \(\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}\) of \(\mathbb{R}^{3}\text{:}\)
\begin{align*} T\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 3\cdot 1 + 2\cdot 0 + 0 \\ 1 + 2\cdot 0 + 6\cdot 0\end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 3\cdot 0 + 2\cdot 1 + 0 \\ 0 + 2\cdot 1 + 6\cdot 0\end{bmatrix} = \begin{bmatrix} 2 \\ 2\end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) =\amp \begin{bmatrix} 3\cdot 0 + 2\cdot 0 + 1 \\ 0 + 2\cdot 0 + 6\cdot 1\end{bmatrix} = \begin{bmatrix} 1 \\ 6 \end{bmatrix}. \end{align*}
We conclude that \(T\) corresponds to the matrix
\begin{equation*} A = \matr{ccc} { 3 \amp 2 \amp 1 \\ 1 \amp 2 \amp 6 }. \end{equation*}

Answer.

\begin{equation*} A = \matr{ccc} { 3 \amp 2 \amp 1 \\ 1 \amp 2 \amp 6 } \end{equation*}
\begin{equation*} T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 2y - 5x + z \\ x + y + z \end{bmatrix} \end{equation*}
Solution.

We apply \(T\) to the standard basis \(\left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\}\) of \(\mathbb{R}^{3}\text{:}\)
\begin{align*} T\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix} 2\cdot 0 - 5\cdot 1 + 0 \\ 1 + 0 + 0 \end{bmatrix} = \begin{bmatrix} -5 \\ 1 \end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right) =\amp \begin{bmatrix}2\cdot 1 - 5\cdot 0 + 0 \\ 0 + 1 + 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1\end{bmatrix},\\ T\left(\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) =\amp \begin{bmatrix}2\cdot 0 - 5\cdot 0 + 1 \\ 0 + 0 + 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \end{align*}
We conclude that \(T\) corresponds to the matrix
\begin{equation*} A = \matr{ccc} { -5 \amp 2 \amp 1 \\ 1 \amp 1 \amp 1 }. \end{equation*}

Answer.

\begin{equation*} A = \matr{ccc} { -5 \amp 2 \amp 1 \\ 1 \amp 1 \amp 1 } \end{equation*}

Hint.

The \(m\times n\) matrix \(A\) that corresponds to the linear transformation \(T\colon \mathbb{R}^n \to \mathbb{R}^m\) has the vector \(T\mathbf{e}_{i}\) as its \(i^{\text{th}}\) column, where \(\mathbf{e}_{i}\) is the \(i^{\text{th}}\) standard basis vector of \(\mathbb{R}^n\text{.}\)

4.

In each case, assume that \(T \) is a linear transformation.

If \(T : V \to \mathbb{R} \) and \(T(\mathbf{v_1}) =1, T(\mathbf{v_2}) = -1 \text{,}\) find \(T(3 \mathbf{v_1} - 5 \mathbf{v_2}).\) Solution.

Since \(T\) is linear, we know from Theorem 4.1.6 that
\begin{equation*} T(3 \mathbf{v_1} - 5 \mathbf{v_2}) = 3 T(\mathbf{v_1}) + (- 5) T(\mathbf{v_2}) = 3 \cdot 1 + (-5) \cdot (-1) = 8 . \end{equation*}

Answer.

\begin{equation*} T(3 \mathbf{v_1} - 5 \mathbf{v_2}) = 8 \end{equation*}
If \(T : V \to \mathbb{R} \) and \(T(\mathbf{v_1}) =2, T(\mathbf{v_2}) = -3 \text{,}\) find \(T(3 \mathbf{v_1} + 2 \mathbf{v_2}).\) Solution.

Since \(T\) is linear, we know (see Theorem 4.1.6) that
\begin{equation*} T(3 \mathbf{v_1} +2 \mathbf{v_2}) = 3 T(\mathbf{v_1}) + 2 T(\mathbf{v_2}) = 3 \cdot 2 + 2 \cdot (-3) = 0 . \end{equation*}

Answer.

\begin{equation*} T(3 \mathbf{v_1} +2 \mathbf{v_2}) = 0 \end{equation*}
If \(T : \mathbb{R}^2 \to \mathbb{R}^2 \) and \(T\left(\begin{bmatrix}1 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \ T\left(\begin{bmatrix}1 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\) find \(T\left(\begin{bmatrix} -1 \\ 3 \end{bmatrix}\right). \) Solution.

First, we have to write the vector at which we want to evaluate \(T\) in terms of the known vectors. In other words, we need to solve the following system of linear equations:
\begin{align*} \matr{cc|c} { 1 \amp 1 \amp -1 \\ 3 \amp 1 \amp 3 } \overset{-3R_{1}+R_{2}}{\longrightarrow} \amp \matr{cc|c} { 1 \amp 1 \amp -1 \\ 0 \amp -2 \amp 6 }\\ \overset{\frac{1}{-2}R_{2}}{\underset{-R_{2}' + R_{1}}{\longrightarrow}} \amp \matr{cc|c} { 1 \amp 0 \amp 2 \\ 0 \amp 1 \amp -3 }. \end{align*}
We conclude that
\begin{equation*} 2 \begin{bmatrix}1 \\ 3 \end{bmatrix} -3 \begin{bmatrix}1 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \end{bmatrix}. \end{equation*}
Therefore,
\begin{align*} T\left(\begin{bmatrix} -1 \\ 3 \end{bmatrix}\right) =\amp T \left( 2 \begin{bmatrix}1 \\ 3 \end{bmatrix} -3 \begin{bmatrix}1 \\ 1 \end{bmatrix} \right)\\ =\amp 2T\left(\begin{bmatrix}1 \\ 3 \end{bmatrix}\right) -3 T\left(\begin{bmatrix}1 \\ 1 \end{bmatrix}\right) = 2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} - 3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}. \end{align*}

Answer.

\begin{equation*} T\left(\begin{bmatrix} -1 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ 2 \end{bmatrix} \end{equation*}
If \(T : \mathbb{R}^2 \to \mathbb{R}^2 \) and \(T\left(\begin{bmatrix}1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right), \ T\left(\begin{bmatrix}1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \) find \(T\left(\begin{bmatrix} 1 \\ -7 \end{bmatrix}\right). \) Solution.

First, we have to write the vector at which we want to evaluate \(T\) in terms of the known vectors. In other words, we need to solve the following system of linear equations:
\begin{align*} \matr{cc|c} { 1 \amp 1 \amp 1 \\ -1 \amp 1 \amp -7 } \overset{R_{1}+R_{2}}{\longrightarrow} \amp \matr{cc|c} { 1 \amp 1 \amp 1 \\ 0 \amp 2 \amp -6 }\\ \overset{\frac{1}{2}R_{2}}{\underset{-R_{2}' + R_{1}}{\longrightarrow}} \amp \matr{cc|c} { 1 \amp 0 \amp 4 \\ 0 \amp 1 \amp -3 }. \end{align*}
We conclude that
\begin{equation*} 4 \begin{bmatrix}1 \\ -1 \end{bmatrix} -3 \begin{bmatrix}1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -7 \end{bmatrix}. \end{equation*}
Therefore,
\begin{align*} T\left(\begin{bmatrix} 1 \\ -7 \end{bmatrix}\right) =\amp T \left( 4 \begin{bmatrix}1 \\ -1 \end{bmatrix} -3 \begin{bmatrix}1 \\ 1 \end{bmatrix} \right)\\ =\amp 4T \left(\begin{bmatrix}1 \\ -1 \end{bmatrix}\right) -3 T\left(\begin{bmatrix}1 \\ 1 \end{bmatrix}\right) = 4 \begin{bmatrix} 0 \\ 1 \end{bmatrix} -3 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -3 \\ 4 \end{bmatrix}. \end{align*}

Answer.

\begin{equation*} T\left(\begin{bmatrix} 1 \\ -7 \end{bmatrix}\right) = \begin{bmatrix} -3 \\ 4 \end{bmatrix} \end{equation*}

Hint.

The first two can be solved right away, using only linearity of \(T\text{.}\) For the second two, you first need to write the vector at which you need to evaluate \(T\) as a linear combination of the vectors at which you know the values of \(T\text{.}\)

5.

Find the matrix of the linear transformation that rotates every vector of \(\mathbb{R}^2 \) by an angle of \(\frac{\pi}{3} \) (clockwise).

Hint.

In Example 4.1.15 we saw the matrix for a counterclockwise rotation. Try to represent this clockwise rotation by \(\frac{\pi}{3}\) as a counterclockwise rotation.

6.

Find the matrix of the linear transformation that reflects every vector of \(\mathbb{R}^2 \) about the \(x\)-axis.

7.

Find the matrix of the linear transformation that reflects every vector of \(\mathbb{R}^2 \) about the line \(y = -x\text{.}\)

8.

Find the matrix of the linear transformation that stretches \(\mathbb{R}^2 \) by a factor of 3 in the vertical direction.

9.

Find the matrix for \(T(\mathbf{w}) = proj_{\mathbf{v}}(\mathbf{w}) \text{,}\) where \(v = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix}. \)

10.

Let \(T \) be a linear transformation defined as multiplication by the matrix \(A = \begin{bmatrix} 3 \amp 1 \\ -1 \amp 2 \end{bmatrix} \) and \(S \) a linear transformation defined as multiplication by \(B = \begin{bmatrix} 0 \amp -2 \\ 4 \amp 2 \end{bmatrix}. \) Find the matrix of \(S \circ T \) and find \((S \circ T )(\mathbf{x}) \) for \(\mathbf{x} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}. \)

Hint.

It might help to look back into the solution of Exercise 4.1.5.3 to see how to find the matrix corresponding to a linear transformation.

Answer.

\begin{equation*} (S \circ T )\left(\begin{bmatrix} 2 \\ -1 \end{bmatrix}\right) = \begin{bmatrix} 8 \\ 12 \end{bmatrix} \end{equation*}

and the matrix corresponding to the linear transformation \(S\circ T\) is given by

\begin{equation*} \matr{cc} { 2 \amp -4 \\ 10 \amp 8 } . \end{equation*}

Solution.

Later, in Section 4.3 we will have a better way to solve this problem, but for now we use the definition of composition. By definition of \(T\) and \(S\text{,}\) we have

\begin{equation*} T \left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = A\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3x + y \\ -x + 2y \end{bmatrix} , \text{ and } \end{equation*}

\begin{equation*} S \left(\begin{bmatrix} v \\ w \end{bmatrix}\right) = B\begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} -2w \\ 4v + 2w \end{bmatrix} . \end{equation*}

Thus,

\begin{align*} (S\circ T)\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) =\amp S \left( T \left(\begin{bmatrix} x \\ y \end{bmatrix}\right) \right) = S \left( \begin{bmatrix} 3x + y \\ -x + 2y \end{bmatrix} \right)\\ =\amp \begin{bmatrix} -2(-x + 2y) \\ 4(3x + y ) + 2(-x + 2y) \end{bmatrix} = \begin{bmatrix} 2x - 4y \\ 10x + 8y \end{bmatrix} . \end{align*}

We see from this that

\begin{equation*} (S \circ T )\left(\begin{bmatrix} 2 \\ -1 \end{bmatrix}\right) = \begin{bmatrix} 2\cdot 2 - 4\cdot (-1) \\ 10\cdot 2 + 8\cdot (-1) \end{bmatrix} = \begin{bmatrix} 8 \\ 12 \end{bmatrix} . \end{equation*}

Moreover, using the same techniques as in Exercise 4.1.5.3, we conclude that the matrix corresponding to the linear transformation \(S\circ T\) is given by

\begin{equation*} \matr{cc} { 2 \amp -4 \\ 10 \amp 8 } . \end{equation*}

11.

Write the following system of linear equations as a matrix equation of the form \(A\mathbf{x} = \mathbf{b}\text{.}\)

\begin{align*} x - 3z - 2y \amp = 5\\ 6 -x \amp = 4+y - z\\ 2x + 3 \amp = x+3y \end{align*}

Hint.

First simplify and rearrange the equations so that ''like'' variables are lined up in columns. Write the variables in the order \(x,y,z\text{.}\)

Answer.

\begin{align*} \matr{ccc|c}{ 1 \amp -2 \amp - 3 \amp 5\\ -1 \amp -1 \amp 1 \amp -2\\ 1 \amp -3 \amp 0 \amp -3 } \end{align*}

Solution.

We sort by ''like'' variables:

\begin{align*} x - 2y - 3z \amp = 5\\ -x - y + z \amp = -2\\ x -3y \amp = -3 \end{align*}

Now we can easily write it as augmented matrix:

\begin{align*} \matr{ccc|c}{ 1 \amp -2 \amp - 3 \amp 5\\ -1 \amp -1 \amp 1 \amp -2\\ 1 \amp -3 \amp 0 \amp -3 }. \end{align*}

12.

For each of the following matrices, describe the linear transformation \(T\) that has \([T]\) equal to the given matrix. Draw a before-and-after picture for each.

\(\displaystyle A = \begin{bmatrix} 1 \amp -1 \\ 1 \amp 1 \end{bmatrix}.\)
\(\displaystyle B = \begin{bmatrix} 0 \amp 2 \\ 1 \amp 0 \end{bmatrix}.\)
\(\displaystyle C = \begin{bmatrix} 1 \amp 0 \\ -1 \amp 1 \end{bmatrix}.\)
\(\displaystyle D = \begin{bmatrix} 1 \amp 0\\ 0 \amp 0 \end{bmatrix}.\)