
Multiple Linear Regression
Now that we've completed simple linear regression, we can move on to multiple linear regression, or in other words, linear regression with multiple independent variables.
Definition
Suppose we are given a matrix $X$ that holds the values of our independent variables. $X$ is of size $n \times d$, where $n$ is the number of data points we have, and $d$ is the number of independent variables we have:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nd} \end{bmatrix}$$

We are also given a vector of targets $y$:

$$y = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}^T$$

Each row in $X$ corresponds to a single element in $y$. Thus, we can hypothesize the following relationship between a single row $x_i$ in $X$ and a single element $y_i$ in $y$:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_d x_{id} + \epsilon_i$$

where:

- $\beta_0$ is the intercept, or the bias
- $\beta_1, \beta_2, \ldots, \beta_d$ is the set of $d$ regression coefficients, or the weights of $y$ with respect to $X$
- $\epsilon_i$ is the error in our prediction

For convenience, we can define a matrix $X'$ that is constructed by padding a column of ones to the left-hand side of $X$:

$$X' = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1d} \\ 1 & x_{21} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nd} \end{bmatrix}$$

Note that $X'$ is of size $n \times (d + 1)$.
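As a quick illustration of the padding step, here is a minimal NumPy sketch (the array values are arbitrary examples):

```python
import numpy as np

X = np.array([[2, 4],
              [3, 7],
              [5, 2]])  # n = 3 data points, d = 2 independent variables

# Pad a column of ones on the left so the first weight can act as the intercept
X_padded = np.hstack([np.ones((X.shape[0], 1)), X])
print(X_padded.shape)  # (3, 3), i.e. n x (d + 1)
```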
We also define vector $w$ as the series of weights:

$$w = \begin{bmatrix} \beta_0 & \beta_1 & \cdots & \beta_d \end{bmatrix}^T$$

Note that $w$ is a vector of length $d + 1$.

Now, we can simplify our hypothesis in matrix form:

$$y = X'w + \epsilon$$

where $\epsilon$ is a vector of errors of length $n$:

$$\epsilon = \begin{bmatrix} \epsilon_1 & \epsilon_2 & \cdots & \epsilon_n \end{bmatrix}^T$$

The goal of regression is finding the correct values for $w$.
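To see the matrix form in action, a single matrix-vector product computes all $n$ predictions at once (a small sketch with illustrative numbers):

```python
import numpy as np

# Illustrative values: 3 data points, 1 independent variable
X_padded = np.array([[1., 1.],
                     [1., 4.],
                     [1., 3.]])
w = np.array([-0.429, 2.286])  # [beta_0, beta_1]

# X'w produces every prediction in one matrix-vector product
print(X_padded @ w)  # [1.857, 8.715, 6.429]
```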
Loss Function
Similar to the case of simple linear regression, we need a loss function to evaluate how good/bad our solution for $w$ is. And once again, we will use the squared loss:

$$L(w) = \|y - X'w\|^2$$

where $\|\cdot\|$ is the L2-norm:

$$\|v\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$
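To make this concrete, the loss translates directly into code; a minimal sketch assuming NumPy, with `X_padded` as the ones-padded matrix from above and `w` an arbitrary candidate weight vector:

```python
import numpy as np

def squared_loss(X_padded, y, w):
    """L(w) = ||y - X'w||^2, the sum of squared residuals."""
    residuals = y - X_padded @ w
    return residuals @ residuals  # same as np.sum(residuals ** 2)
```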
Minimizing the Loss Function
Now that we have our loss function, we can work on optimizing $w$ to minimize the loss:

$$L(w) = (y - X'w)^T (y - X'w) = y^T y - y^T X' w - w^T X'^T y + w^T X'^T X' w$$

Note that both $y^T X' w$ and $w^T X'^T y$ are scalar and equal. Thus we can combine them:

$$L(w) = y^T y - 2 w^T X'^T y + w^T X'^T X' w$$

Now we can take the partial derivative of the loss function with respect to $w$:

$$\frac{\partial L}{\partial w} = -2 X'^T y + 2 X'^T X' w$$

Setting the partial derivative to zero, we get:

$$X'^T X' w = X'^T y \quad \Longrightarrow \quad w = (X'^T X')^{-1} X'^T y$$

And that's it! The optimal value for $w$ is the one that satisfies this equation.
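In code, this closed-form solution is a single linear solve; a minimal sketch assuming NumPy (solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse, which is the numerically preferred route):

```python
import numpy as np

def solve_weights(X_padded, y):
    """Solve the normal equations X'^T X' w = X'^T y for w."""
    return np.linalg.solve(X_padded.T @ X_padded, X_padded.T @ y)
```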
Performance Metric
Now that we have a closed-form solution for linear regression, how do we evaluate it? Once again, similar to what we did for simple linear regression, we can compute the $R^2$ value, also known as the coefficient of determination:

$$R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}$$

where:

- $\bar{y}$ is the mean of $y$
- $SS_{\text{res}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ is the residual sum of squares, or the amount of unexplained variation, where $\hat{y}_i$ denotes our prediction for $y_i$
- $SS_{\text{tot}} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ is the total sum of squares, or the total amount of variation
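A direct translation into NumPy might look like this (a sketch; `y_pred` denotes the model's predictions $\hat{y}$):

```python
import numpy as np

def r_squared(y, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)      # unexplained variation
    ss_tot = np.sum((y - np.mean(y)) ** 2)  # total variation
    return 1 - ss_res / ss_tot
```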
Exercise
Write a program that performs multiple linear regression. Complete the `fit()`, `predict()`, and `score()` methods (a sketch of one possible solution follows the list below):

- `fit()` takes in arrays `X` and `y`, and computes and stores the weights `w`
- `predict()` takes in array `X` and returns a predicted array `y_pred` using the stored weights `w`
- `score()` takes in arrays `X` and `y`, predicts the `y_pred` values using `X` and the stored weights `w`, and then uses the `y_pred` and `y` arrays to compute and return the $R^2$ performance metric
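For reference, here is one possible implementation sketch using NumPy; the class name and input handling are assumptions, since the exercise's actual scaffolding isn't shown:

```python
import numpy as np

class MultipleLinearRegression:
    def __init__(self):
        self.w = None

    def _pad(self, X):
        # Prepend a column of ones so the first weight acts as the intercept
        X = np.asarray(X, dtype=float)
        return np.hstack([np.ones((X.shape[0], 1)), X])

    def fit(self, X, y):
        # Closed-form solution of the normal equations: X'^T X' w = X'^T y
        Xp = self._pad(X)
        self.w = np.linalg.solve(Xp.T @ Xp, Xp.T @ np.asarray(y, dtype=float))

    def predict(self, X):
        # y_pred = X'w
        return self._pad(X) @ self.w

    def score(self, X, y):
        # R^2 = 1 - SS_res / SS_tot
        y = np.asarray(y, dtype=float)
        y_pred = self.predict(X)
        ss_res = np.sum((y - y_pred) ** 2)
        ss_tot = np.sum((y - np.mean(y)) ** 2)
        return 1 - ss_res / ss_tot
```

Each sample test case below bundles four arrays in the order `[X_fit, y_fit, X_score, y_score]`: fitting on the first pair and predicting/scoring on the second reproduces the expected `w`, `y_pred`, and `r_squared` values.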
Sample Test Cases
Test Case 1
Input:
[
[[ 1],
[ 4],
[ 3]],
[ 2, 9, 6],
[[ 2],
[ 6],
[-2]],
[ 4,16,-5]
]
Output:
w = [-0.429, 2.286]
y_pred = [4.143, 13.286, -5]
r_squared = 0.9667
Test Case 2
Input:
[
[[ 2, 4],
[ 3, 7],
[ 5, 2]],
[ 1, 5, 3],
[[ 0,-1],
[ 4, 3]],
[-4, 2]
]
Output:
w = [-5.182, 1.273, 0.909]
y_pred = [-6.091, 2.636]
r_squared = 0.735