Ten days of statistics (10) - Multiple Regression


Multiple regression

If $Y$ depends on a single variable $X$, we have an ordinary 2D regression line. But if $Y$ depends on $m$ variables $X_1, X_2, \dots, X_m$, then we need to find $m$ values of $b$, one to accompany each $X_i$. Formally:

$$Y = a + b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_mX_m$$

Matrix form of the equation

We define 2 matrices

$$
\begin{aligned}
X &= \begin{bmatrix} 1 & x_1 & x_2 & \dots & x_m \end{bmatrix}\\
B &= \begin{bmatrix} a\\ b_1\\ b_2\\ \vdots\\ b_m \end{bmatrix}
\end{aligned}
$$

Then we can rewrite $Y$ in terms of $X$ and $B$ as:

$$Y = X \cdot B$$
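
As a quick check of the single-observation form, here is a minimal Python sketch. The function name `predict` and the numbers are made up for illustration; the only idea taken from the text is that a leading 1 in $X$ pairs with the intercept $a$ in $B$.

```python
def predict(features, coefficients):
    """Evaluate Y = X . B for one observation.

    features:     [x_1, ..., x_m]
    coefficients: [a, b_1, ..., b_m]
    """
    row = [1.0] + list(features)  # the leading 1 multiplies the intercept a
    if len(row) != len(coefficients):
        raise ValueError("expected one coefficient per feature plus the intercept")
    return sum(x * b for x, b in zip(row, coefficients))


# Made-up values: a = 1, b_1 = 2, b_2 = 3, evaluated at x_1 = 4, x_2 = 5
y = predict([4.0, 5.0], [1.0, 2.0, 3.0])
print(y)  # 1 + 2*4 + 3*5 = 24.0
```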

Generalized matrix form

Now we want to generalize the experiment: instead of 1 observation, we make $n$ observations, which gives us $n$ values $y_1, y_2, y_3, \dots, y_n$. First, we have the equation form:

$$
\begin{aligned}
y_1 &= a + b_1x_{1,1} + b_2x_{2,1} + b_3x_{3,1} + \dots + b_mx_{m,1}\\
y_2 &= a + b_1x_{1,2} + b_2x_{2,2} + b_3x_{3,2} + \dots + b_mx_{m,2}\\
y_3 &= a + b_1x_{1,3} + b_2x_{2,3} + b_3x_{3,3} + \dots + b_mx_{m,3}\\
&\;\;\vdots \\
y_n &= a + b_1x_{1,n} + b_2x_{2,n} + b_3x_{3,n} + \dots + b_mx_{m,n}
\end{aligned}
$$

Then, the matrix form

$$
\begin{aligned}
X &= \begin{bmatrix}
1 & x_{1,1} & x_{2,1} & \dots & x_{m,1} \\
1 & x_{1,2} & x_{2,2} & \dots & x_{m,2} \\
1 & x_{1,3} & x_{2,3} & \dots & x_{m,3} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1,n} & x_{2,n} & \dots & x_{m,n}
\end{bmatrix} \\
Y &= \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix} \\
Y &= X \cdot B
\end{aligned}
$$
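
To make the shapes concrete, the following sketch builds the $n \times (m+1)$ matrix $X$ from a list of observations and multiplies it by $B$. The helper names `design_matrix` and `mat_vec`, and the data, are hypothetical and chosen only for illustration.

```python
def design_matrix(observations):
    """Prepend a 1 to every observation so the intercept a becomes part of B."""
    return [[1.0] + list(obs) for obs in observations]


def mat_vec(X, B):
    """Compute Y = X . B as one dot product per row of X."""
    return [sum(x * b for x, b in zip(row, B)) for row in X]


# Hypothetical data: n = 3 observations of m = 2 variables
observations = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
B = [1.0, 2.0, 3.0]  # [a, b_1, b_2], chosen arbitrarily

X = design_matrix(observations)
Y = mat_vec(X, B)
print(X)  # [[1.0, 1.0, 2.0], [1.0, 3.0, 4.0], [1.0, 5.0, 6.0]]
print(Y)  # [9.0, 19.0, 29.0]
```

Each row of `X` is one observation and each entry of `Y` is the matching $y_i$, so `X` has $n$ rows and $m+1$ columns.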

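Find the matrix $B$

With more observations than coefficients, the system usually has no exact solution, and $X$ is not square, so it cannot be inverted directly. Multiplying both sides by $X^T$ gives a square system whose solution is the least-squares estimate of $B$, provided $X^T \cdot X$ is invertible:

$$
\begin{aligned}
&\qquad Y = X \cdot B \\
&\Rightarrow X \cdot B = Y \\
&\Rightarrow X^T \cdot X \cdot B = X^T \cdot Y \\
&\Rightarrow B = (X^T \cdot X)^{-1} \cdot X^T \cdot Y
\end{aligned}
$$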

Where

  • $M^T$ is the transpose matrix of $M$
  • $M^{-1}$ is the inverse matrix of $M$ ($M^{-1} \cdot M = I$)
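
The formula for $B$ can be tried out with a small pure-Python sketch. This is one possible implementation, not a reference one: the function names (`transpose`, `matmul`, `solve`, `fit`) and the data are made up, and instead of forming $(X^T \cdot X)^{-1}$ explicitly it solves $X^T \cdot X \cdot B = X^T \cdot Y$ by Gauss-Jordan elimination, which gives the same $B$ whenever $X^T \cdot X$ is invertible.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, rhs):
    """Solve A . z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + [rhs[i]] for i in range(n)]  # augmented matrix [A | rhs]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular, so B is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:  # clear this column in every other row
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit(observations, y):
    """Return [a, b_1, ..., b_m] from the equation X^T X B = X^T Y."""
    X = [[1.0] + list(obs) for obs in observations]  # column of ones for a
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(x * v for x, v in zip(row, y)) for row in Xt]
    return solve(XtX, XtY)


# Made-up data generated from y = 1 + 2*x_1 + 3*x_2 with no noise
observations = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 3.0, 4.0, 6.0]

B = fit(observations, y)
print(B)  # approximately [1.0, 2.0, 3.0], i.e. a = 1, b_1 = 2, b_2 = 3
```

Solving the linear system is usually preferred over computing the inverse because it does less work and tends to be more stable numerically. In practice a linear algebra library would replace the hand-written helpers.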

Practice

Hackerrank has an exercise for you to test your knowledge:

Congratulations

You have finished the 10 Days of Statistics challenge. I have learned a lot, and I hope you have too, and that it benefits you as much as it has benefited me. Thanks to Hackerrank for the challenges and the inspiration.