Linear Regression
NOT TO BE CONFUSED WITH LOGISTIC REGRESSION
What is regression? It is the process of predicting a continuous value.
## 01 Linear Regression
The use of one or more independent variables to predict a dependent variable. A simple linear regression can be represented with the following equation:

$$y = \theta_0 + \theta_1 x_1$$

A multiple linear regression can be represented by:

$$y = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$

A polynomial regression can be represented with the following equation:

$$y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_n x^n$$
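As a concrete illustration (a minimal sketch assuming NumPy, with arbitrary example values), the multiple linear regression model is just a dot product between a parameter vector $\theta$ and a feature vector with a leading $1$ for the intercept $\theta_0$:

```python
import numpy as np

# Arbitrary illustrative values for y = theta0 + theta1*x1 + theta2*x2
theta = np.array([2.0, 0.5, -1.0])  # [theta0, theta1, theta2]
x = np.array([3.0, 4.0])            # [x1, x2]

x_b = np.concatenate(([1.0], x))    # prepend a 1 so theta0 acts as the intercept
y_hat = x_b @ theta                 # 2.0 + 0.5*3.0 - 1.0*4.0 = -0.5
print(y_hat)
```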
## 02 Breaking Down Linear Regression
Below, we take a look at how the linear regression machine learning algorithm works and how it updates the parameters with each training step.

$$
y = \begin{cases}
\theta_0 + \theta_{1}x_1 & \text{when } n = 1 \text{ (Simple Linear Regression)} \\
\theta_{0} + \theta_{1}x_{1} + \cdots + \theta_{n}x_{n} & \text{when } n > 1 \text{ (Multiple Linear Regression)}
\end{cases}
$$

The loss function, vectorized over the $m$ training examples, is the mean squared error

$$J(\theta) = \frac{1}{m}\sum\limits^{m}_{i=1}\left(y^{(i)} - \hat{y}^{(i)}\right)^{2}$$

where, for the $i$-th example, we define the prediction

$$\hat{y}^{(i)} = \theta_0 + \theta_{1}X_{1}^{(i)} + \cdots + \theta_{n}X_{n}^{(i)} = X_b^{(i)}\theta$$

with $X_b$ the design matrix whose rows are the examples prefixed with a constant $1$ (so that $\theta_0$ acts as the intercept) and $X_b^{(i)}$ its $i$-th row.
Which gives us

$$J(\theta) = \frac{1}{m}\sum\limits^{m}_{i=1}\left(y^{(i)} - X_b^{(i)}\theta\right)^{2}$$

Sometimes $\frac{1}{2m}$ is used instead of $\frac{1}{m}$, so that the factor of $2$ cancels when differentiating.
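To make these pieces concrete, here is a minimal sketch in NumPy (the function names are just for illustration) of this loss, together with the gradient and the closed-form normal-equation solution that we derive step by step below:

```python
import numpy as np

def add_bias_column(X):
    """Build X_b by prepending a column of ones so theta[0] is the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def mse_loss(theta, X_b, y):
    """J(theta) = (1/m) * sum_i (y_i - X_b[i] . theta)^2"""
    residuals = y - X_b @ theta
    return np.mean(residuals ** 2)

def gradient(theta, X_b, y):
    """grad J(theta) = (2/m) * X_b^T (X_b theta - y), derived step by step below."""
    m = len(y)
    return (2.0 / m) * X_b.T @ (X_b @ theta - y)

def fit_normal_equation(X_b, y):
    """Closed-form solution theta = (X_b^T X_b)^{-1} X_b^T y."""
    return np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# Tiny illustrative dataset generated from y = 4 + 3*x1 with no noise,
# so the recovered parameters should be close to [4, 3].
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([4.0, 7.0, 10.0, 13.0])
X_b = add_bias_column(X)

theta_hat = fit_normal_equation(X_b, y)
print(theta_hat)                    # ~[4. 3.]
print(mse_loss(theta_hat, X_b, y))  # ~0
print(gradient(theta_hat, X_b, y))  # ~[0. 0.] at the minimum

# Iterative alternative: repeat the gradient-descent update theta <- theta - lr * grad.
theta_gd = np.zeros(2)
for _ in range(1000):
    theta_gd = theta_gd - 0.1 * gradient(theta_gd, X_b, y)
print(theta_gd)                     # also approaches [4, 3]
```

On this tiny noiseless dataset, both the normal equation and the gradient-descent loop recover roughly the same parameters; with the $\frac{1}{2m}$ convention the factor of $2$ simply disappears from the gradient.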
Then we take the derivative of $J(\theta)$ with respect to each parameter, i.e. the gradient $\nabla J(\theta)$:

$$\nabla J(\theta) = \frac{1}{m}\begin{bmatrix} \sum\limits^{m}_{i=1}2\left(y^{(i)} - X_b^{(i)}\theta\right)(-1) \\ \sum\limits^{m}_{i=1}2\left(y^{(i)} - X_b^{(i)}\theta\right)\left(-X_{1}^{(i)}\right) \\ \sum\limits^{m}_{i=1}2\left(y^{(i)} - X_b^{(i)}\theta\right)\left(-X_{2}^{(i)}\right) \\ \vdots \\ \sum\limits^{m}_{i=1}2\left(y^{(i)} - X_b^{(i)}\theta\right)\left(-X_{n}^{(i)}\right) \end{bmatrix} = \frac{2}{m}\begin{bmatrix} \sum\limits^{m}_{i=1}\left(X_b^{(i)}\theta - y^{(i)}\right) \\ \sum\limits^{m}_{i=1}\left(X_b^{(i)}\theta - y^{(i)}\right)X_{1}^{(i)} \\ \sum\limits^{m}_{i=1}\left(X_b^{(i)}\theta - y^{(i)}\right)X_{2}^{(i)} \\ \vdots \\ \sum\limits^{m}_{i=1}\left(X_b^{(i)}\theta - y^{(i)}\right)X_{n}^{(i)} \end{bmatrix}$$

Each entry is the dot product of the residual vector $X_{b}\theta - y$ with one column of the design matrix

$$X_b = \begin{bmatrix} 1 & x_{1}^{(1)} & x_{2}^{(1)} & \cdots & x_{n}^{(1)} \\ 1 & x_{1}^{(2)} & x_{2}^{(2)} & \cdots & x_{n}^{(2)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1}^{(m)} & x_{2}^{(m)} & \cdots & x_{n}^{(m)} \end{bmatrix}$$

so the gradient can be written compactly as

$$\nabla J(\theta) = \frac{2}{m}\left(X_{b}\theta - y\right)^{T}X_b$$

Get the minimum by setting the gradient to zero:

$$\nabla J(\theta) = 0 \quad\Longrightarrow\quad \frac{2}{m}\left(X_{b}\theta - y\right)^{T}X_{b} = 0$$

Multiply by $m/2$ on both sides, take the transpose, and expand:

$$X_b^{T}X_b\theta = X_b^{T}y$$

Multiply both sides by the inverse $(X_b^{T}X_b)^{-1}$ (assuming it exists):

$$\theta = \left(X_b^{T}X_b\right)^{-1}X_b^{T}y$$

This is the closed-form (normal equation) solution. When training iteratively with gradient descent instead, at each step we update $\theta$ using this gradient: $\theta \leftarrow \theta - \eta\,\nabla J(\theta)$, where $\eta$ is the learning rate.

**Remark:** It is important to note that linear regression uses a simple linear model; there is no activation function involved. Another way to represent it is $f_{\vec{w},b}(\vec{x})=\vec{w}\cdot\vec{x}+b$.

***How can we understand simple linear regression from a statistical point of view?***

## 03 Linear Basis Function Regression

## 04 Bayesian Linear Regression

## 05 Logistic Regression