Week 6 Lecture Notes

This week we’ll continue to look at least squares regression using linear algebra.

Monday, February 17

Last week, we introduced the four fundamental subspaces of a matrix. We ended with the following exercise that we didn’t have time to finish:

  1. (Exercise) Find the range and null space for the 3-by-3 matrix $e e^T = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.

  2. (Exercise) Why is every column of $A$ in $\operatorname{Range}(A)$?

  3. (Exercise) Prove that $\operatorname{Null}(A) \perp \operatorname{Range}(A^T)$.
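A quick numerical sketch of Exercise 1 (assuming numpy is available): we can check the rank of $e e^T$ and recover an orthonormal basis for its null space from the SVD.

```python
import numpy as np

# The 3-by-3 all-ones matrix e e^T from Exercise 1.
e = np.ones((3, 1))
A = e @ e.T

# Every column of A is a copy of e, so Range(A) = span{e} and rank(A) = 1.
print(np.linalg.matrix_rank(A))  # 1

# The rows of Vt whose singular values vanish form an orthonormal basis
# for Null(A); here that is the plane { v : v_1 + v_2 + v_3 = 0 }.
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[np.isclose(s, 0)]
print(null_basis.shape[0])               # 2  (= n - rank)
print(np.allclose(A @ null_basis.T, 0))  # True
```

This matches the pencil-and-paper answer: the range is the line spanned by $e$ and the null space is the 2-dimensional plane of vectors whose entries sum to zero.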

Definition (Complementary Subspaces)

Two subspaces $U, V \subseteq \mathbb{R}^n$ are called complementary if $U \cap V = \{ 0 \}$ and $U \cup V$ spans $\mathbb{R}^n$. An equivalent condition is that $\dim U + \dim V = n$ while $\dim (U \cap V) = 0$.


Definition (Orthogonal Complement)

The orthogonal complement of a set $V \subseteq \mathbb{R}^n$ is the subspace $V^{\perp} = \{ w \in \mathbb{R}^n : w^T v = 0 \text{ for all } v \in V \}$.

Note that two subspaces $U, V \subseteq \mathbb{R}^n$ are orthogonal complements if and only if they are orthogonal and complementary. In particular, if $U = V^{\perp}$, then $V = U^{\perp}$ and $(V^{\perp})^{\perp} = V$ as long as $V$ is a subspace.

Theorem (The Fundamental Theorem of Linear Algebra)

For any matrix $A \in \mathbb{R}^{m \times n}$,

  1. $\operatorname{Null}(A) = \operatorname{Range}(A^T)^\perp$.
  2. $\operatorname{Null}(A^T) = \operatorname{Range}(A)^\perp$.

In other words, the row space and null space of $A$ are orthogonal complements in the domain of $A$, which is $\mathbb{R}^n$, and the column space and left null space of $A$ are orthogonal complements in the codomain of $A$, which is $\mathbb{R}^m$. Furthermore, $\operatorname{Range}(A)$ and $\operatorname{Range}(A^T)$ have the same dimension, namely the rank of $A$.
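The theorem can be checked numerically. The sketch below (a hypothetical example, assuming numpy) builds a rank-deficient matrix and verifies that the row space and null space are orthogonal and that their dimensions add up to $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5x4 matrix of rank 2, built as a product of thin factors.
B = rng.standard_normal((5, 2))
C = rng.standard_normal((2, 4))
A = B @ C

# The SVD splits the rows of Vt into a basis for Range(A^T)
# (rows paired with nonzero singular values) and a basis for Null(A).
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
row_basis = Vt[:rank]
null_basis = Vt[rank:]

print(np.allclose(A @ null_basis.T, 0))          # True: A kills Null(A)
print(np.allclose(row_basis @ null_basis.T, 0))  # True: Null(A) ⟂ Range(A^T)
print(rank + null_basis.shape[0])                # 4 = n, as the theorem predicts
```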

Applications

Application 1. In linear regression, we needed to find the vector $\hat{y} \in \operatorname{Range}(X)$ that is closest to $y$. Geometrically, this is the same as requiring $\hat{y} - y$ to be orthogonal to $\operatorname{Range}(X)$. What is the set of vectors orthogonal to $\operatorname{Range}(X)$?

  1. Use the Fundamental Theorem of Linear Algebra to show that $\hat{y} - y$ is orthogonal to $\operatorname{Range}(X)$ if and only if $X^T (\hat{y} - y) = 0$.

If $\hat{y} = Xb$, then $X^T(\hat{y} - y) = 0$ can be rewritten as $X^T X b = X^T y$, which is the normal equation for least squares regression.
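A minimal sketch of the normal equations in action, using made-up data (assuming numpy): we solve $X^T X b = X^T y$ directly, confirm the residual is orthogonal to $\operatorname{Range}(X)$, and cross-check against numpy's least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a line with intercept 2 and slope 3, plus noise.
t = rng.uniform(0, 1, size=20)
X = np.column_stack([np.ones_like(t), t])   # design matrix with intercept
y = 2.0 + 3.0 * t + 0.1 * rng.standard_normal(20)

# Solve the normal equations X^T X b = X^T y for b.
b = np.linalg.solve(X.T @ X, X.T @ y)

# The residual Xb - y is orthogonal to Range(X): X^T (Xb - y) = 0.
print(np.allclose(X.T @ (X @ b - y), 0))  # True

# Cross-check against numpy's built-in least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```

In practice one solves the linear system (or uses a QR/SVD-based routine like `lstsq`) rather than forming $(X^T X)^{-1}$ explicitly, which is slower and less numerically stable.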

Application 2. To solve the normal equations for $b$, it helps if $X^T X$ is an invertible matrix. It turns out that $X^T X$ is invertible if and only if the columns of $X$ are linearly independent.

  1. For any $X \in \mathbb{R}^{m \times n}$, show that $\operatorname{Null}(X) = \operatorname{Null}(X^T X)$. Hint: To prove that $\operatorname{Null}(X^T X) \subseteq \operatorname{Null}(X)$, it helps to prove that if $b \in \operatorname{Null}(X^T X)$, then $\|Xb\| = 0$, which means that $Xb = 0$.
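A numerical illustration of this exercise (a constructed example, assuming numpy): when the columns of $X$ are dependent, the same vector lies in both $\operatorname{Null}(X)$ and $\operatorname{Null}(X^T X)$, and the two matrices have the same rank, hence the same nullity.

```python
import numpy as np

rng = np.random.default_rng(2)

# A rank-deficient X: the third column is the sum of the first two,
# so Null(X) is nontrivial.
X = rng.standard_normal((6, 2))
X = np.column_stack([X, X[:, 0] + X[:, 1]])

# b = (1, 1, -1) is in Null(X) by construction...
b = np.array([1.0, 1.0, -1.0])
print(np.allclose(X @ b, 0))        # True
# ...and also in Null(X^T X), as the exercise claims.
print(np.allclose(X.T @ X @ b, 0))  # True

# X and X^T X have the same rank, so the same nullity (here both 2).
print(np.linalg.matrix_rank(X) == np.linalg.matrix_rank(X.T @ X))  # True
```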

We ran out of time before we could answer these last two questions, but they are good linear algebra review questions:

  1. Why are the columns of $X$ linearly independent if and only if $\operatorname{Null}(X) = \{ 0 \}$?

  2. Why is $X^T X$ invertible if and only if $\operatorname{Null}(X^T X) = \{ 0 \}$? Hint: When is $X^T X$ onto? When is it 1-to-1?
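As a small sanity check on the invertibility claim (a constructed example, assuming numpy): with independent columns the Gram matrix $X^T X$ has full rank, while duplicating a column dependency makes it singular.

```python
import numpy as np

rng = np.random.default_rng(3)

# Independent columns: the 3x3 Gram matrix X^T X has full rank,
# so it is invertible.
X_good = rng.standard_normal((5, 3))
print(np.linalg.matrix_rank(X_good.T @ X_good))  # 3

# Dependent columns (third = first + second): X^T X is singular,
# and the normal equations no longer have a unique solution.
X_bad = X_good.copy()
X_bad[:, 2] = X_bad[:, 0] + X_bad[:, 1]
G = X_bad.T @ X_bad
print(np.linalg.matrix_rank(G))  # 2
```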