Support Vector Machine
The goal of support vector machines is to find the line (hyperplane) that maximizes the minimum distance from the training points to that line, i.e. the margin.
The optimal margin classifier $h$ is such that:
\[ h(x) = \textrm{sign}(w^{T}x - b) \]
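The classifier above can be sketched directly in NumPy; the parameter values `w` and `b` below are hypothetical stand-ins for a trained model:

```python
import numpy as np

def h(x, w, b):
    """Optimal margin classifier: h(x) = sign(w^T x - b)."""
    return np.sign(w @ x - b)

# Hypothetical trained parameters (for illustration only).
w = np.array([2.0, -1.0])
b = 0.5

print(h(np.array([1.0, 0.0]), w, b))  # w^T x - b = 1.5 > 0, so prints 1.0
```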
Objective function
\[ \mathop{\min}_{\theta}\ C \sum^{m}_{i = 1} \left[ y^{(i)} cost_{1}(\theta^{T}f^{(i)}) + (1 - y^{(i)})\,cost_{0}(\theta^{T}f^{(i)}) \right] + \frac{1}{2} \sum^{n}_{j=1}\theta_{j}^{2} \]
\[ C = \frac{1}{\lambda} \]
- Large C: Lower bias, higher variance.
- Smaller C: Higher bias, lower variance.
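The objective can be evaluated with the hinge-style surrogates $cost_1(z) = \max(0, 1 - z)$ and $cost_0(z) = \max(0, 1 + z)$; a minimal sketch, assuming those cost definitions and a feature matrix `F` whose rows are the $f^{(i)}$:

```python
import numpy as np

def svm_objective(theta, F, y, C):
    """Compute C * sum_i [y_i cost_1(theta^T f_i) + (1 - y_i) cost_0(theta^T f_i)]
    + 0.5 * sum_j theta_j^2 (regularizing all components, as the formula is written)."""
    z = F @ theta                      # theta^T f^(i) for every example
    cost1 = np.maximum(0, 1 - z)       # applied when y^(i) = 1
    cost0 = np.maximum(0, 1 + z)       # applied when y^(i) = 0
    data_term = C * np.sum(y * cost1 + (1 - y) * cost0)
    reg_term = 0.5 * np.sum(theta ** 2)
    return data_term + reg_term

# Tiny illustrative call with made-up data.
theta = np.array([1.0, -1.0])
F = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 0.0])
print(svm_objective(theta, F, y, C=1.0))  # prints 1.0 (data term is 0, reg term is 1)
```

A larger `C` weights the data term more heavily relative to the regularizer, which is why it trades bias for variance as the bullets above state.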
Kernel
In practice, the kernel $K$ defined by $K(x, z) = \exp\left(- \frac {\|x - z\|^2}{2 \sigma ^2}\right)$ is called the Gaussian kernel and is commonly used.
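The Gaussian kernel is a one-liner in NumPy; the bandwidth `sigma` here is a free parameter, not a value from the text:

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

x = np.array([1.0, 2.0])
print(gaussian_kernel(x, x))  # identical points give K = 1.0, the maximum
```

Since $K$ depends only on $\|x - z\|$, it measures similarity: it equals 1 when $x = z$ and decays toward 0 as the points move apart, at a rate controlled by $\sigma$.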