@@ -123,7 +123,7 @@ only the livable square footage of the home.
 The linear regression model for this situation is
 
 $$
-\log(\text{price}) = \beta_0 + \beta_1 \text{sqft_living } + \epsilon
+\log(\text{price}) = \beta_0 + \beta_1 \text{sqft\_living } + \epsilon
 $$
 
 $\beta_0$ and $\beta_1$ are called parameters (also coefficients or
@@ -132,14 +132,14 @@ that best fit the data.
 
 $\epsilon$ is the error term. It would be unusual for the observed
 $\log(\text{price})$ to be an exact linear function of
-$\text{sqft_living }$. The error term captures the deviation of
-$\log(\text{price})$ from a linear function of $\text{sqft_living }$.
+$\text{sqft\_living }$. The error term captures the deviation of
+$\log(\text{price})$ from a linear function of $\text{sqft\_living }$.
 
 The linear regression algorithm will choose the parameters that minimize the
 *mean squared error* (MSE) function, which for our example is written.
 
 $$
-\frac{1}{N} \sum_{i=1}^N \left(\log(\text{price}_i) - (\beta_0 + \beta_1 \text{sqft_living }_i) \right)^2
+\frac{1}{N} \sum_{i=1}^N \left(\log(\text{price}_i) - (\beta_0 + \beta_1 \text{sqft\_living }_i) \right)^2
 $$
 
 The output of this algorithm is the straight line (hence linear) that passes as
@@ -218,7 +218,7 @@ Suppose that in addition to `sqft_living`, we also wanted to use the `bathrooms`
 In this case, the linear regression model is
 
 $$
-\log(\text{price}) = \beta_0 + \beta_1 \text{sqft_living } +
+\log(\text{price}) = \beta_0 + \beta_1 \text{sqft\_living } +
 \beta_2 \text{bathrooms} + \epsilon
 $$
224224
@@ -227,7 +227,7 @@ We could keep adding one variable at a time, along with a new $\beta_{j}$ coeffi
 Let's write this equation in vector/matrix form as
 
 $$
-\underbrace{\begin{bmatrix} \log(\text{price}_1) \\ \log(\text{price}_2) \\ \vdots \\ \log(\text{price}_N)\end{bmatrix}}_Y = \underbrace{\begin{bmatrix} 1 & \text{sqft_living }_1 & \text{bathrooms}_1 \\ 1 & \text{sqft_living }_2 & \text{bathrooms}_2 \\ \vdots & \vdots & \vdots \\ 1 & \text{sqft_living }_N & \text{bathrooms}_N \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}}_{\beta} + \epsilon
+\underbrace{\begin{bmatrix} \log(\text{price}_1) \\ \log(\text{price}_2) \\ \vdots \\ \log(\text{price}_N)\end{bmatrix}}_Y = \underbrace{\begin{bmatrix} 1 & \text{sqft\_living }_1 & \text{bathrooms}_1 \\ 1 & \text{sqft\_living }_2 & \text{bathrooms}_2 \\ \vdots & \vdots & \vdots \\ 1 & \text{sqft\_living }_N & \text{bathrooms}_N \end{bmatrix}}_{X} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}}_{\beta} + \epsilon
 $$
 
 Notice that we can add as many columns to $X$ as we'd like and the linear
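For readers of this diff who want to see the matrix form $Y = X\beta + \epsilon$ and the MSE objective in action, here is a minimal sketch in Python with NumPy. It is not part of the commit. The data are synthetic, and the coefficient values, the sample size, and the helper name `mse` are all assumptions made up for illustration; the lecture itself may fit the model with a different library.

```python
import numpy as np

# Synthetic stand-in data (hypothetical values, not the housing data in the lecture).
rng = np.random.default_rng(0)
N = 200
sqft_living = rng.uniform(500, 4000, size=N)
bathrooms = rng.integers(1, 5, size=N).astype(float)
true_beta = np.array([12.0, 4e-4, 0.05])  # made-up beta_0, beta_1, beta_2

# Design matrix X: a column of ones for the intercept, then one column per feature.
X = np.column_stack([np.ones(N), sqft_living, bathrooms])

# Y = X beta + epsilon, with epsilon drawn as Gaussian noise (an assumption).
log_price = X @ true_beta + rng.normal(0.0, 0.1, size=N)


def mse(beta, X, y):
    """Mean squared error of the linear prediction X @ beta against y."""
    residuals = y - X @ beta
    return np.mean(residuals ** 2)


# Least-squares fit: the beta that minimizes the MSE over all coefficient vectors.
beta_hat, *_ = np.linalg.lstsq(X, log_price, rcond=None)

print("estimated beta:", beta_hat)
print("MSE at estimate:", mse(beta_hat, X, log_price))
```

Adding another regressor only requires appending a column to `X`; the call to `np.linalg.lstsq` stays the same.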