This week we’ll continue our review of linear algebra and apply it to
least squares regression.
Monday, February 10
Exercise 1: Let
.
Express
as a matrix times the vector
.
Definition (Orthogonality)
Two vectors $u, v \in \mathbb{R}^n$ are orthogonal if $u \cdot v = 0$. This can be denoted $u \perp v$. Two sets of vectors $U$ and $V$ are orthogonal if $u \cdot v = 0$ for all $u \in U$ and $v \in V$.
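For example (a quick Python check with vectors of my own choosing, not ones from class):

```python
def dot(u, v):
    """Dot product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

# u and v are orthogonal because their dot product is 0:
u = [1, 2]
v = [-2, 1]
print(dot(u, v))   # 0, so u is perpendicular to v

# w is not orthogonal to u:
w = [3, 1]
print(dot(u, w))   # 5
```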
Exercise 2: Show that for any
,
is orthogonal to the vector
.
Theorem (Properties of Transposes)
Let $A$ and $B$ be two matrices. Then
1. $(A + B)^T = A^T + B^T$,
2. $(AB)^T = B^T A^T$.
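To see the product rule in action, here is a small Python check (with example matrices of my own choosing):

```python
def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

# Check (AB)^T = B^T A^T on a 3x2 times 2x3 product:
A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8, 9], [0, 1, 2]]
lhs = transpose(matmul(A, B))
rhs = matmul(transpose(B), transpose(A))
print(lhs == rhs)   # True
```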
Show that for matrices
and
,
.
From Exercise 1, we know that
where
We can use this to give a different solution to Exercise 2. We want to
show that
is orthogonal to
.
By substitution, that is the same as showing that
.
This is the same as:
For
,
re-write
using the matrix
.
Simplify as much as you can.
We finished class today by asking the question:
What if
for
?
What matrix would you multiply
by to get
?
We were able to figure out the pattern pretty easily… but I suggested
a totally different approach to solve the problem using matrix algebra.
First, let $\mathbf{1}$ denote the vector with all 1 entries, and let $I$ denote the identity matrix in $\mathbb{R}^{n \times n}$.
So we get a nice general formula for the matrix
in any dimension:
.
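If the matrix in question is the mean-centering matrix $I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ (an assumption on my part), a quick Python sketch shows what multiplying by it does: it subtracts the mean from every entry of a vector.

```python
from fractions import Fraction

def centering_matrix(n):
    # I - (1/n) * ones * ones^T  (assumed formula), with exact Fraction arithmetic
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

x = [Fraction(v) for v in (2, 4, 9)]          # made-up data with mean 5
mean = sum(x) / len(x)
centered = matvec(centering_matrix(3), x)
print(centered == [xi - mean for xi in x])    # True: multiplying subtracts the mean
```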
Wednesday, February 12
Today we introduced least squares regression with a (relatively)
simple in-class example. Here is the link:
In the example, we found the slope and y-intercept of a linear
regression line by minimizing the sum of squared residuals. We did this
two ways: first we used calculus, and then we used linear algebra. The
linear algebra approach is much easier computationally!
To find the y-intercept $b_0$ and slope $b_1$ of the least squares regression line for the points $(x_1, y_1)$, $(x_2, y_2)$, \ldots, $(x_n, y_n)$, you solve the normal equation:
$$X^T X \mathbf{b} = X^T \mathbf{y}$$
where
$$X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$
As long as the columns of $X$ are linearly independent, then $X^T X$ is an invertible matrix and the solution of the normal equation is:
$$\mathbf{b} = (X^T X)^{-1} X^T \mathbf{y}.$$
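For a small made-up data set, the 2-by-2 normal equations can be solved directly. Here is a short Python sketch of mine (not code from class), using Cramer's rule on $X^T X \mathbf{b} = X^T \mathbf{y}$:

```python
def lstsq_line(points):
    """Fit y = b0 + b1*x by solving the 2x2 normal equations X^T X b = X^T y."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sxx = sum(x * x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]; solve by Cramer's rule.
    det = n * sxx - sx * sx
    b0 = (sy * sxx - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Made-up points that happen to lie exactly on the line y = 1 + 2x:
print(lstsq_line([(0, 1), (1, 3), (2, 5)]))   # (1.0, 2.0)
```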
Friday, February 14
Today we dug a little deeper into the linear algebra behind least
squares regression. We started with the following (hard) problem.
Problem: Find coefficients $b_0$, $b_1$, and $b_2$ that give the best fit parabola $y = b_0 + b_1 x + b_2 x^2$ for the four points $(-2, 3)$, $(-1, 0)$, $(1, -1)$, and $(3, 2)$.
Solution: You can use the matrix approach to least squares regression. Let
$$X = \begin{pmatrix} 1 & -2 & 4 \\ 1 & -1 & 1 \\ 1 & 1 & 1 \\ 1 & 3 & 9 \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} 3 \\ 0 \\ -1 \\ 2 \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix}.$$
Then we just need to solve $X\mathbf{b} = \mathbf{y}$ to find the coefficients $b_0$, $b_1$, and $b_2$.
Unfortunately, this system of equations has no solution. So we look for a least squares solution, that is, we try to find a vector $\hat{\mathbf{b}}$ such that $X\hat{\mathbf{b}} = \hat{\mathbf{y}}$, where $\hat{\mathbf{y}}$ is the closest vector in the column space of $X$ to $\mathbf{y}$. To do this, $\mathbf{y} - \hat{\mathbf{y}}$ must be orthogonal to the column space of $X$, and that leads to the normal equations:
Definition (The Normal Equation)
If $A$ is an $m \times n$ matrix and $\mathbf{b} \in \mathbb{R}^m$, then the least squares solution of $A\mathbf{x} = \mathbf{b}$ can be found by solving
$$A^T A \hat{\mathbf{x}} = A^T \mathbf{b}.$$
Furthermore, if the columns of $A$ are linearly independent, then $A^T A$ is invertible and
$$\hat{\mathbf{x}} = (A^T A)^{-1} A^T \mathbf{b}.$$
Notice that we are using a vector $\mathbf{x}$ instead of the vector $\mathbf{b}$ that we used last time. This notation is more convenient, especially for fitting equations with lots of coefficients.
We used Octave/Matlab to solve the least squares problem above by
using the following Octave/Matlab commands:

```matlab
X = [1 -2 4; 1 -1 1; 1 1 1; 1 3 9]
y = [3; 0; -1; 2]
b = inv(X'*X)*X'*y
haty = X*b
```
Notice that X' is $X^T$ in Octave/Matlab and inv(X'*X) is $(X^T X)^{-1}$.
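The same normal-equation computation can be replicated in plain Python (a sketch of mine, using exact fractions instead of Octave's floating point):

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(M, v):
    """Solve M x = v by Gauss-Jordan elimination with exact Fraction arithmetic."""
    n = len(M)
    A = [[Fraction(x) for x in row] + [Fraction(vi)] for row, vi in zip(M, v)]
    for i in range(n):
        p = next(r for r in range(i, n) if A[r][i] != 0)  # find a usable pivot
        A[i], A[p] = A[p], A[i]
        for r in range(n):
            if r != i and A[r][i] != 0:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Same data as the Octave/Matlab session above:
X = [[1, -2, 4], [1, -1, 1], [1, 1, 1], [1, 3, 9]]
y = [3, 0, -1, 2]

XtX = matmul(transpose(X), X)                                        # X'X
Xty = [sum(a * b for a, b in zip(row, y)) for row in transpose(X)]   # X'y
coeffs = solve(XtX, Xty)                  # solves the normal equations exactly
print([float(b) for b in coeffs])         # approximately [-1.045, -0.751, 0.595]
```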
(Exercise) How would you adjust the matrix $X$ to find a best fit cubic polynomial of the form $y = b_0 + b_1 x + b_2 x^2 + b_3 x^3$?
Try using Octave/Matlab to find the coefficients.
(Exercise) Graph the solution and notice that it
hits every point perfectly. What changed about the matrix $X$
when you added another column that made this possible?
To understand this a little better, it helps to know about the four
fundamental subspaces of a matrix and about the Fundamental Theorem of
Linear Algebra.
Definition (The Four Fundamental Subspaces of a Matrix)
Every $m \times n$ matrix $A$ has four fundamental subspaces:
1. Column Space (aka Range): $\operatorname{Col}(A) = \{A\mathbf{x} : \mathbf{x} \in \mathbb{R}^n\}$
2. Row Space: $\operatorname{Row}(A) = \operatorname{Col}(A^T)$
3. Null Space: $\operatorname{Null}(A) = \{\mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0}\}$
4. Left Null Space: $\operatorname{Null}(A^T) = \{\mathbf{w} \in \mathbb{R}^m : A^T\mathbf{w} = \mathbf{0}\}$
The row space and null space are subspaces of the domain $\mathbb{R}^n$,
and the range and left null space are subspaces of the codomain $\mathbb{R}^m$.
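As a concrete illustration (an example of my own, not one from class), here are the four subspaces of a small rank-one matrix:

```latex
% Four fundamental subspaces of a 2x2 matrix of rank 1:
A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \qquad
\operatorname{Col}(A) = \operatorname{span}\left\{\begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}, \quad
\operatorname{Row}(A) = \operatorname{span}\left\{\begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}, \quad
\operatorname{Null}(A) = \operatorname{span}\left\{\begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}, \quad
\operatorname{Null}(A^T) = \operatorname{span}\left\{\begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}.
```

Notice that the dimensions of the row space and null space add up to 2, and likewise for the column space and left null space.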
Important Concept: A matrix $A$
should be understood two ways:
1. $A$ is a rectangular array of numbers with $m$ rows and $n$ columns. (You already know this!)
2. $A$ is a function (a linear transformation) that maps each vector $\mathbf{x} \in \mathbb{R}^n$ to the vector $A\mathbf{x} \in \mathbb{R}^m$.
That’s why it makes sense to call the column space the range of $A$.
(Exercise) Find the range and null space for the
3-by-3 identity matrix. What about the 3-by-3 matrix
We ran out of time after this example, so next week we’ll see how the
four fundamental subspaces are related by the Fundamental Theorem of Linear Algebra.