Linear model: the hypothesis $h(\boldsymbol x)$ is a linear combination of the input attributes, used to predict the output value $\hat y$:
$$ \begin{align*} \hat y = h(\boldsymbol x) &= w_1x_1 + w_2x_2 + \cdots + w_Nx_N + b = \boldsymbol w^{\rm T} \boldsymbol x + b \\ {\rm or} \ \ \hat y = h(\boldsymbol x) &= w_0 + w_1x_1 + w_2x_2 + \cdots + w_Nx_N = \hat {\boldsymbol w}^{\rm T} \boldsymbol x_{+} \\ \end{align*} $$
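As a quick sketch, the prediction $\hat y = \boldsymbol w^{\rm T}\boldsymbol x + b$ can be computed with NumPy (the names `predict`, `w`, `b`, `x` below are illustrative, not from the text):

```python
import numpy as np

def predict(w, b, x):
    """Compute y_hat = w^T x + b for one feature vector x."""
    return np.dot(w, x) + b

w = np.array([2.0, -1.0])  # weights for a 2-dimensional input
b = 0.5                    # bias term
x = np.array([1.0, 3.0])
y_hat = predict(w, b, x)   # 2*1 + (-1)*3 + 0.5 = -0.5
```

The augmented form $\hat{\boldsymbol w}^{\rm T}\boldsymbol x_+$ corresponds to prepending a constant 1 to `x` and folding `b` into the weight vector.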
<aside> 💡 Linear regression, stated as a problem
Given a set of pairs mapping points of a high-dimensional space to a one-dimensional space, find the linear map ($\hat y= {\boldsymbol w} ^{\rm T}\boldsymbol x + b$) from that high-dimensional space to the one-dimensional space that minimizes the least-squares error of the mapped pairs in the one-dimensional space.
</aside>
Considering only a one-dimensional input attribute, the linear model's hypothesis $h(x)$, its objective (the residual-sum-of-squares cost ${\mathcal C}(w, b)$), and the optimum $(w^*, b^*)$ are:
$$ h(x) = wx + b \\ {\mathcal C}(w, b) = \sum_{i=1}^{M}\left( h(x^{(i)}) - y^{(i)}\right)^2 = \sum_{i=1}^{M}\left( wx^{(i)} + b - y^{(i)}\right)^2 \\ (w^*, b^*) = \argmin_{(w, b)} {\mathcal C}(w, b) $$
Taking the first-order partial derivatives of the cost ${\mathcal C}(w, b)$:
$$ \begin{align*}
\frac{\partial {\mathcal C} (w, b)}{\partial w}
&= \sum_{i=1}^{M} 2 \left( wx^{(i)} + b - y^{(i)}\right) x^{(i)} \\
&= 2 w \sum_{i=1}^{M} x^{(i)2} + 2b \sum_{i=1}^{M} x^{(i)} -2 \sum_{i=1}^{M} x^{(i)} y^{(i)} \\
\frac{\partial {\mathcal C} (w, b)}{\partial b}
&= \sum_{i=1}^{M} 2 \left( wx^{(i)} + b - y^{(i)}\right) \\
&= 2 w \sum_{i=1}^{M} x^{(i)} + 2bM -2 \sum_{i=1}^{M} y^{(i)} \\
\end{align*} $$
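The two partial derivatives above can be sanity-checked against central finite differences (a minimal sketch; the function names and sample data are illustrative):

```python
import numpy as np

def cost(w, b, x, y):
    """Residual-sum-of-squares cost C(w, b)."""
    return np.sum((w * x + b - y) ** 2)

def grad(w, b, x, y):
    """Analytic gradient from the derivation: (dC/dw, dC/db)."""
    dw = 2 * np.sum((w * x + b - y) * x)
    db = 2 * np.sum(w * x + b - y)
    return dw, db

x = np.array([0.5, 1.0, 2.0])
y = np.array([1.0, 2.0, 3.5])
w, b, eps = 0.7, -0.2, 1e-6
dw, db = grad(w, b, x, y)
# Central differences: (C(w+eps) - C(w-eps)) / (2*eps), and likewise for b.
dw_num = (cost(w + eps, b, x, y) - cost(w - eps, b, x, y)) / (2 * eps)
db_num = (cost(w, b + eps, x, y) - cost(w, b - eps, x, y)) / (2 * eps)
# dw ≈ dw_num and db ≈ db_num up to floating-point error
```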
Taking second-order partial derivatives shows that ${\mathcal C}(w, b)$ is strictly convex (its Hessian is positive definite whenever the $x^{(i)}$ are not all equal), so a stationary point $(w_0, b_0)$, if one exists, must be the global minimum $(w^*, b^*)$:
$$ \frac{\partial^2 {\mathcal C} (w, b)}{\partial w^2} = 2 \sum_{i=1}^{M} x^{(i)2} > 0, \ \ \ \frac{\partial^2 {\mathcal C} (w, b)}{\partial b^2} = 2M > 0 $$
Hence the solution $(w^*, b^*)$ satisfies:
$$ \left.\frac{\partial {\mathcal C} (w, b)}{\partial w} \right|_{(w^*, b^*)} = 0, \ \ \ \left.\frac{\partial {\mathcal C} (w, b)}{\partial b} \right|_{(w^*, b^*)} = 0 $$
Solving yields the closed-form solution:
$$ w^* = \frac{ M \sum_{i=1}^{M} x^{(i)} y^{(i)} - \sum_{i=1}^{M} x^{(i)} \sum_{i=1}^{M} y^{(i)} }{ M \sum_{i=1}^{M} x^{(i)2} - \left( \sum_{i=1}^{M} x^{(i)} \right)^2 }, \\
b ^* = \overline y - w^* \overline x $$
where $\overline x = \frac{1}{M}\sum_{i=1}^{M} x^{(i)}$ and $\overline y = \frac{1}{M}\sum_{i=1}^{M} y^{(i)}$ are the sample means.
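The closed-form solution can be sketched directly from the formulas above (the function name `fit_1d` and the sample data are illustrative); data that lie exactly on a line should be recovered:

```python
import numpy as np

def fit_1d(x, y):
    """Return (w, b) minimizing sum_i (w*x_i + b - y_i)^2 via the closed form."""
    M = len(x)
    w = (M * np.sum(x * y) - np.sum(x) * np.sum(y)) / (M * np.sum(x**2) - np.sum(x)**2)
    b = np.mean(y) - w * np.mean(x)
    return w, b

# Points generated exactly by y = 3x + 1 should give back w = 3, b = 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 3.0 * x + 1.0
w, b = fit_1d(x, y)   # w ≈ 3.0, b ≈ 1.0
```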
A closed-form solution (also called an analytic solution) is one that can be written as an explicit formula.