We say that a function $f$ is convex if, for any pair of points $x, y$ and any $\theta \in [0, 1]$, we have $$f\left(\theta x + (1-\theta) y\right) \leq \theta f(x) + (1-\theta) f(y)$$ Taking $\theta = 1/2$ gives the midpoint form $f\left(\frac{x + y}{2}\right) \leq \frac{f(x)}{2} + \frac{f(y)}{2}$, which is equivalent to the full definition when $f$ is continuous.
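As a quick sanity check (not part of the notes), the inequality can be tested numerically on random points for a convex function such as $f(x) = x^2 + 1$:

```python
import numpy as np

# Hypothetical check: sample random pairs (x, y) and weights theta,
# then verify the convexity inequality for f(x) = x**2 + 1.
def f(x):
    return x**2 + 1

rng = np.random.default_rng(0)
x = rng.uniform(-5, 5, size=1000)
y = rng.uniform(-5, 5, size=1000)
theta = rng.uniform(0, 1, size=1000)

lhs = f(theta * x + (1 - theta) * y)   # f at the convex combination
rhs = theta * f(x) + (1 - theta) * f(y)  # convex combination of f values
print(bool(np.all(lhs <= rhs + 1e-12)))  # True for a convex f
```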
Illustration
Assume $f$ is some function, and $C \subset \mathbb{R}^n$ is some set. The following is an optimization problem: $$ \begin{array}{ll} \mbox{minimize} & f(x) \\ \mbox{subject to} & x \in C \end{array} $$
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 51)
X, Y = np.meshgrid(x, x)
F = X**2 + Y**2 + 1

f = plt.figure()
c = plt.contour(X, Y, F, 10)
plt.colorbar(c)
plt.plot(0, 0, 'bo')  # unconstrained minimizer
_ = plt.text(0.1, 0, 'x*', fontsize=12)
Here the constraint set is $C := \{ x : g_i(x) \leq 0,\ i = 1, \ldots, m \}$.
Two cases: equality constraints and inequality constraints.
Without loss of generality, consider one constraint: $$ \begin{array}{ll} \mbox{minimize} & f(x) \\ \mbox{subject to} & g(x) = 0 \end{array} $$
Note that at the minimizer the gradients of $f$ and $g$ are parallel: $$ \nabla f \parallel \nabla g \quad\Longrightarrow\quad \nabla f + \lambda\nabla g = 0,\quad \lambda\neq 0 $$
Convert to an unconstrained problem using the Lagrangian: $$ \begin{array}{ll} \mbox{minimize} & L(x,\lambda) = f(x) + \lambda g(x) \end{array} $$
The solution is found by solving $$ \begin{array}{ll} \nabla_x L(x,\lambda) &= 0 \\ \nabla_\lambda L(x,\lambda) &= 0 \end{array} $$
Equality-constrained example: minimize $x_1^2+x_2^2+1$ subject to $x_1+x_2=1/2$. Lagrangian: $L(x,\lambda)=(x_1^2+x_2^2+1)+\lambda(-1/2+x_1+x_2)$
Minimizer: $x^*=[1/4,1/4]$.
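The stationarity conditions for this example are linear, so they can be solved directly (a quick check, not part of the notes): $2x_1 + \lambda = 0$, $2x_2 + \lambda = 0$, $x_1 + x_2 = 1/2$.

```python
import numpy as np

# Stationarity system for L(x, lam) = (x1^2 + x2^2 + 1) + lam*(-1/2 + x1 + x2):
#   dL/dx1  = 2*x1 + lam     = 0
#   dL/dx2  = 2*x2 + lam     = 0
#   dL/dlam = x1 + x2 - 1/2  = 0
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 0.5])
x1, x2, lam = np.linalg.solve(A, b)
print(x1, x2, lam)  # x* = [1/4, 1/4], lambda = -1/2
```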
g = 0.5-x
f = plt.figure()
c = plt.contour(X, Y, F, 10)
plt.colorbar(c)
plt.plot(x, g, 'r--')
plt.plot(0.25,0.25,'bo')
plt.text(0.15,0.15,'x*',fontsize=12)
_=plt.ylim([-1,1])
For the inequality constraint $x_1+x_2 \leq 1/2$ the Lagrangian is the same, $L(x,\lambda)=(x_1^2+x_2^2+1)+\lambda(-1/2+x_1+x_2)$, but now the unconstrained minimizer is feasible, so the constraint is inactive and $\lambda = 0$.
Minimizer: $x^*=[0,0]$.
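For completeness (the standard Karush–Kuhn–Tucker conditions, not stated above), with a single constraint $g(x) \leq 0$ a minimizer satisfies $$ \begin{array}{ll} \nabla f(x) + \lambda \nabla g(x) = 0 & \mbox{(stationarity)} \\ g(x) \leq 0 & \mbox{(primal feasibility)} \\ \lambda \geq 0 & \mbox{(dual feasibility)} \\ \lambda\, g(x) = 0 & \mbox{(complementary slackness)} \end{array} $$ Complementary slackness explains the two cases: either the constraint is active ($g(x) = 0$) or it is inactive and $\lambda = 0$.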
h = -1*np.ones_like(x)
f = plt.figure()
c = plt.contour(X, Y, F, 10)
plt.colorbar(c)
plt.fill_between(x, h, g, color='g', alpha=0.3)
plt.plot(0.0,0.0,'bo')
plt.text(0.1,0.0,'x*',fontsize=12)
_=plt.ylim([-1,1])
Using the vanilla SciPy optimizer, scipy.optimize.minimize
from scipy.optimize import minimize

# Function to be minimized
def fun(x):
    return x[0]**2 + x[1]**2 + 1

# Constraint function
# BE CAREFUL: for 'ineq' constraints, SciPy assumes cnst(x) >= 0
def cnst(x):
    return 0.5 - x[0] - x[1]

# Initial guess
x0 = [2.0, 2.0]

# Unconstrained problem
# Note the number of evaluations of the function and its Jacobian
sol0 = minimize(fun, x0)
print(sol0)
      fun: 1.000000000000016
 hess_inv: array([[ 0.75000001, -0.24999999],
                  [-0.24999999,  0.75000001]])
      jac: array([-1.63912773e-07, -1.63912773e-07])
  message: 'Optimization terminated successfully.'
     nfev: 9
      nit: 2
     njev: 3
   status: 0
  success: True
        x: array([-8.93115557e-08, -8.93115557e-08])
# Equality constrained problem
c1 = {'type': 'eq', 'fun': cnst}
sol1 = minimize(fun, x0, constraints=(c1,))
print(sol1)
     fun: 1.125
     jac: array([0.5, 0.5])
 message: 'Optimization terminated successfully'
    nfev: 6
     nit: 2
    njev: 2
  status: 0
 success: True
       x: array([0.25, 0.25])
# Inequality constrained problem
c2 = {'type': 'ineq', 'fun': cnst}
sol2 = minimize(fun, x0, constraints=(c2,))
print(sol2)
     fun: 1.0
     jac: array([1.49011612e-08, 1.49011612e-08])
 message: 'Optimization terminated successfully'
    nfev: 7
     nit: 2
    njev: 2
  status: 0
 success: True
       x: array([0., 0.])
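Since the solver reports nfev and njev, it is worth noting that supplying an analytic gradient via the jac argument typically reduces the number of function evaluations (a sketch, reusing the same fun, cnst, and x0 as above):

```python
import numpy as np
from scipy.optimize import minimize

def fun(x):
    return x[0]**2 + x[1]**2 + 1

def jac(x):
    # Analytic gradient of fun
    return np.array([2*x[0], 2*x[1]])

def cnst(x):
    return 0.5 - x[0] - x[1]

x0 = [2.0, 2.0]
c2 = {'type': 'ineq', 'fun': cnst}
sol = minimize(fun, x0, jac=jac, constraints=(c2,))
print(sol.x, sol.nfev)  # converges to [0, 0]; no finite-difference gradient calls needed
```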
More generally, the problem with $m$ inequality constraints reads $$ \begin{array}{ll} \mbox{minimize} & f(x) \\ \mbox{subject to} & g_i(x) \leq 0, \quad i = 1, \ldots, m \end{array} $$