Regularized Linear Regression
The cost function for regularized linear regression is:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$
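As a concrete illustration, here is a minimal NumPy sketch of this cost function. The name `compute_cost` and its arguments are hypothetical; it assumes the design matrix `X` already includes the leading column of ones, and that the bias parameter $\theta_0$ is excluded from the regularization sum, as the $j = 1$ lower limit indicates:

```python
import numpy as np

def compute_cost(X, y, theta, lam):
    """Regularized linear regression cost J(theta).

    X     : (m, n+1) design matrix whose first column is all ones
    y     : (m,) vector of targets
    theta : (n+1,) parameter vector
    lam   : regularization strength (lambda)
    """
    m = len(y)
    errors = X @ theta - y                             # h_theta(x^(i)) - y^(i) for all i
    squared_term = (errors @ errors) / (2 * m)         # (1/2m) * sum of squared errors
    reg_term = lam * np.sum(theta[1:] ** 2) / (2 * m)  # theta_0 is not regularized
    return squared_term + reg_term
```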
The gradient descent algorithm for regularized linear regression is:

$$\begin{aligned}
&\text{Repeat}\ \{ \\
&\quad \theta_0 := \theta_0 - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)} \\
&\quad \theta_j := \theta_j - \alpha\left[\left(\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}\right) + \frac{\lambda}{m}\theta_j\right] \qquad j \in \{1, 2, \dots, n\} \\
&\}
\end{aligned}$$

$\theta_0$ is updated separately because we do not want to penalize the bias parameter.
After some manipulation, this can be written as:

$$\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
In this update rule, the factor $1 - \alpha\frac{\lambda}{m}$ in the first term will always be less than 1, since $\alpha$, $\lambda$, and $m$ are all positive. Intuitively, we can see it as shrinking the value of $\theta_j$ by some amount on every update. Notice that the second term is now exactly the same as it was before regularization.
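Here is a minimal sketch of a single update under this rule, using the same hypothetical names as the cost sketch above. Note how `theta[0]` is updated without the shrinkage factor, while every other parameter is first multiplied by $1 - \alpha\frac{\lambda}{m}$:

```python
import numpy as np

def gradient_descent_step(X, y, theta, alpha, lam):
    """One regularized gradient descent update; theta_0 is left unregularized."""
    m = len(y)
    errors = X @ theta - y
    grad = (X.T @ errors) / m                                  # (1/m) * sum over i of errors * x_j
    new_theta = theta * (1 - alpha * lam / m) - alpha * grad   # shrink, then step
    new_theta[0] = theta[0] - alpha * grad[0]                  # bias term: ordinary, unshrunk update
    return new_theta
```

In practice this step would be called in a loop until $J(\theta)$ converges.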
Now let's apply regularization to the non-iterative normal equation method.
To add in regularization, the equation is the same as our original, except that we add another term inside the parentheses:

$$\theta = \left(X^T X + \lambda \cdot L\right)^{-1} X^T y
\qquad \text{where}\quad L =
\begin{bmatrix}
0 & & & & \\
& 1 & & & \\
& & 1 & & \\
& & & \ddots & \\
& & & & 1
\end{bmatrix}$$
$L$ is an $(n+1)\times(n+1)$ matrix with a 0 in the top-left entry, 1's down the rest of the diagonal, and 0's everywhere else. Intuitively, this is the identity matrix (though we are not including $x_0$), multiplied with a single real number $\lambda$.
Recall that in some cases (e.g., $m \le n$), $X^T X$ could be non-invertible. However, it can be shown that when we add the term $\lambda \cdot L$ (with $\lambda > 0$), then $X^T X + \lambda \cdot L$ is always invertible.
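As a sketch of the regularized normal equation under the same assumptions (`normal_equation_reg` is a hypothetical name; `X` again carries a leading column of ones):

```python
import numpy as np

def normal_equation_reg(X, y, lam):
    """Closed-form solution: theta = (X^T X + lambda * L)^(-1) X^T y."""
    L = np.eye(X.shape[1])   # (n+1) x (n+1) identity ...
    L[0, 0] = 0              # ... with a 0 in the top-left, so theta_0 is not penalized
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Solving the linear system with `np.linalg.solve`, rather than forming the matrix inverse explicitly, is the usual numerically stabler choice.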