The regression models we have seen so far require us to determine and select the features by hand, e.g. a feature map $\phi(x)$.
Canonical example: XOR function
(Figure from Raquel Urtasun & Rich Zemel)
Possible choice: $\phi(x_1, x_2) = \cos(\pi(x_1+x_2))$
so that $\phi(0,0)=\phi(1,1)=1$ and $\phi(0,1)=\phi(1,0)=-1$
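A quick numerical check (a small illustrative snippet, not part of the original code): with $\phi$ as the only feature, XOR becomes linear, e.g. $\mathrm{XOR}(x_1, x_2) = (1 - \phi(x_1, x_2))/2$.
import numpy as np

def phi(x1, x2):
    return np.cos(np.pi * (x1 + x2))

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, phi(x1, x2), (1 - phi(x1, x2)) / 2)  # last column equals XOR(x1, x2)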
(E. Greenwood, et al. JAHS 60, 022007 (2015))
This is why people are interested in neural networks, which can learn suitable features directly from raw data.
import numpy as np
import matplotlib.pyplot as plt
xx = np.linspace(-5, 5, 100)
_ = plt.plot(xx, 1 / (1 + np.exp(-xx)), '-g')  # sigmoid: 1 / (1 + exp(-x))
xx = np.linspace(-5, 5, 100)
_ = plt.plot(xx, (np.exp(xx) - np.exp(-xx)) / (np.exp(xx) + np.exp(-xx)), '-g')  # tanh
ReLU: easier to optimize, since its gradient does not saturate for positive inputs.
xx = np.linspace(-5, 5, 101)
_ = plt.plot(xx, np.maximum(xx, 0), '-g')  # ReLU: max(0, x)
For example, periodic (sine) activation functions have been used for implicit neural representations [1].
[1] V. Sitzmann et al., Implicit neural representations with periodic activation functions, arXiv:2006.09661.
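A small illustrative sketch (not taken from [1]): a sine activation can be plotted in the same way as the activations above.
xx = np.linspace(-5, 5, 100)
_ = plt.plot(xx, np.sin(xx), '-g')  # periodic (sine) activation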
A 2-layer MLP with a sufficiently large (in the limit, infinite) number of hidden units can approximate any continuous function [1].
[1] A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numerica 8 (1999), 143-195.
[2] ReLU deep neural networks and linear finite elements, arXiv:1807.03973.
[3] Why deep neural networks for function approximation?, arXiv:1610.04161.
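A minimal numerical sketch of this statement (the target function $\sin(x)$, the random hidden weights, and the least-squares fit of the output layer are illustrative assumptions, not a prescribed construction): the fit on a sample grid improves as the hidden layer grows.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)
target = np.sin(x)                       # assumed 1-D target function

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for n_hidden in [5, 50, 500]:
    # Random hidden layer; only the output weights are fitted (least squares)
    k = rng.normal(size=n_hidden)
    b = rng.uniform(-np.pi, np.pi, size=n_hidden)
    H = sigmoid(np.outer(x, k) + b)      # hidden activations, shape (200, n_hidden)
    w, *_ = np.linalg.lstsq(H, target, rcond=None)
    err = np.max(np.abs(H @ w - target)) # error on the sample grid
    print(f"{n_hidden:4d} hidden units: max |error| = {err:.4f}")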
Another (easier) exercise: solve the XOR problem using sigmoid activation functions.
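One hand-constructed sketch of such a solution (the weights below are just one possible choice): with large weights, the two hidden sigmoid units approximate OR and AND, and the output unit computes roughly OR-and-not-AND, i.e. XOR.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def xor_net(x1, x2):
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # ~ OR(x1, x2)
    h2 = sigmoid(20 * x1 + 20 * x2 - 30)    # ~ AND(x1, x2)
    return sigmoid(20 * h1 - 20 * h2 - 10)  # ~ OR and not AND, i.e. XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(xor_net(a, b), 3))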
# xx, yy: sample points and values of the target function (defined in an earlier cell)
plt.plot(xx, yy, 'rs-', label='Target Function')
_ = plt.legend()
A single ReLU unit $h(x; k, b) = \mathrm{ReLU}(kx + b)$ is the building block of the approximation.
# y2: approximation produced by the first layer of ReLU units (computed in an earlier cell)
plt.plot(xx, yy, 'rs-', label='Target Function')
plt.plot(xx, y2, 'bo-', label='Layer 1')
plt.plot(xx, yy - y2, 'g^--', label='Residual')
_ = plt.legend()
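Since xx, yy and y2 come from cells not shown in this excerpt, here is a self-contained sketch of the same idea under assumed choices (target function, breakpoints, and least-squares fitting are all illustrative): a small sum of units $h(x; k, b) = \mathrm{ReLU}(kx + b)$ is fitted to the target and the residual is plotted.
import numpy as np
import matplotlib.pyplot as plt

xx = np.linspace(0, 1, 21)
yy = np.sin(2 * np.pi * xx)                  # assumed target function

# Fixed "kinks": h_j(x) = ReLU(x - b_j); output weights fitted by least squares
knots = np.linspace(0, 1, 6)[:-1]            # breakpoints of the piecewise-linear fit
H = np.maximum(xx[:, None] - knots[None, :], 0.0)
H = np.hstack([np.ones((len(xx), 1)), H])    # constant term
w, *_ = np.linalg.lstsq(H, yy, rcond=None)
y2 = H @ w                                   # first-layer approximation

plt.plot(xx, yy, 'rs-', label='Target Function')
plt.plot(xx, y2, 'bo-', label='Layer 1')
plt.plot(xx, yy - y2, 'g^--', label='Residual')
_ = plt.legend()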