Support Vector Machine
The goal of support vector machines is to find the line (hyperplane) that maximizes the minimum distance from the training points to that line, i.e. the margin.
The optimal margin classifier $h$ is such that:
\[ h(x) = \textrm{sign}(w^{T}x - b) \]
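The classifier above can be sketched directly in NumPy; the parameter values `w` and `b` below are hypothetical stand-ins for a trained model:

```python
import numpy as np

def h(x, w, b):
    """Optimal margin classifier: h(x) = sign(w^T x - b)."""
    return np.sign(w @ x - b)

# Hypothetical trained parameters (for illustration only).
w = np.array([2.0, -1.0])
b = 0.5

print(h(np.array([1.0, 0.0]), w, b))  # w^T x - b = 1.5 > 0, so prints 1.0
```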
Objective function
\[ \mathop{\min}_{\theta}\ C \sum^{m}_{i = 1} \left[ y^{(i)} cost_{1}(\theta^{T}f^{(i)}) + (1 - y^{(i)})\,cost_{0}(\theta^{T}f^{(i)}) \right] + \frac{1}{2} \sum^{n}_{j=1}\theta_{j}^{2} \]
\[ C = \frac{1}{\lambda} \]
- Large C: Lower bias, higher variance.
- Smaller C: Higher bias, lower variance.
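The objective can be evaluated with the hinge-style surrogates $cost_1(z) = \max(0, 1 - z)$ and $cost_0(z) = \max(0, 1 + z)$; a minimal sketch, assuming those cost definitions and a feature matrix `F` whose rows are the $f^{(i)}$:

```python
import numpy as np

def svm_objective(theta, F, y, C):
    """Compute C * sum_i [y_i cost_1(theta^T f_i) + (1 - y_i) cost_0(theta^T f_i)]
    + 0.5 * sum_j theta_j^2 (regularizing all components, as the formula is written)."""
    z = F @ theta                      # theta^T f^(i) for every example
    cost1 = np.maximum(0, 1 - z)       # applied when y^(i) = 1
    cost0 = np.maximum(0, 1 + z)       # applied when y^(i) = 0
    data_term = C * np.sum(y * cost1 + (1 - y) * cost0)
    reg_term = 0.5 * np.sum(theta ** 2)
    return data_term + reg_term

# Tiny illustrative call with made-up data.
theta = np.array([1.0, -1.0])
F = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 0.0])
print(svm_objective(theta, F, y, C=1.0))  # prints 1.0 (data term is 0, reg term is 1)
```

A larger `C` weights the data term more heavily relative to the regularizer, which is why it trades bias for variance as the bullets above state.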
Kernel
In practice, the kernel $K$ defined by $K(x, z) = \exp\left(- \frac {\|x - z\|^2}{2 \sigma ^2}\right)$ is called the Gaussian kernel and is commonly used.
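The Gaussian kernel is a one-liner in NumPy; the bandwidth `sigma` here is a free parameter, not a value from the text:

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

x = np.array([1.0, 2.0])
print(gaussian_kernel(x, x))  # identical points give K = 1.0, the maximum
```

Since $K$ depends only on $\|x - z\|$, it measures similarity: it equals 1 when $x = z$ and decays toward 0 as the points move apart, at a rate controlled by $\sigma$.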