SVM: Kernel Methods | Haobin Tan

SVM: Kernel Methods

Kernel function

Given a mapping function $\phi: \mathcal{X} \rightarrow \mathcal{V}$ from the input space $\mathcal{X}$ into an inner-product space $\mathcal{V}$, the function

$$
\mathcal{K}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}, \quad \mathcal{K}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)=\left\langle\phi(\mathbf{x}), \phi\left(\mathbf{x}^{\prime}\right)\right\rangle_{\mathcal{V}}
$$

is called a kernel function.

“A kernel is a function that returns the result of a dot product performed in another space.”

Applying the kernel trick simply means replacing the dot product of two examples by a kernel function.
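
To make the kernel trick concrete, here is a minimal sketch for the degree-2 polynomial kernel on 2-D inputs. The explicit feature map `phi` and the helper name `poly2_kernel` are illustrative choices, not part of the original notes. The point is only that both routes yield the same number, while the kernel never builds the feature vector.

```python
import numpy as np

def phi(x):
    """Explicit feature map for <x, y>^2 on 2-D inputs (illustrative choice)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly2_kernel(x, y):
    """Same inner product, evaluated directly in the input space."""
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(y))  # inner product in the 3-D feature space
implicit = poly2_kernel(x, y)      # kernel evaluation in the 2-D input space

print(explicit, implicit)  # the two values agree up to floating-point error
```

For higher degrees or higher-dimensional inputs the explicit feature vector grows quickly, whereas the kernel evaluation still costs a single dot product.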

| Kernel type | Definition |
| --- | --- |
| Linear kernel | $k\left(\boldsymbol{x}, \boldsymbol{x}^{\prime}\right)=\left\langle\boldsymbol{x}, \boldsymbol{x}^{\prime}\right\rangle$ |
| Polynomial kernel | $k\left(\boldsymbol{x}, \boldsymbol{x}^{\prime}\right)=\left\langle\boldsymbol{x}, \boldsymbol{x}^{\prime}\right\rangle^{d}$ |
| Gaussian / Radial Basis Function (RBF) kernel | $k\left(\boldsymbol{x}, \boldsymbol{x}^{\prime}\right)=\exp \left(-\frac{\lVert\boldsymbol{x}-\boldsymbol{x}^{\prime}\rVert^{2}}{2 \sigma^{2}}\right)$ |
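
The three kernels in the table can be sketched in a few lines of NumPy. The function names, the default values `d=2` and `sigma=1.0`, and the `gram_matrix` helper are assumptions made for illustration. The Gram matrix collects all pairwise kernel values and is the object that kernelized algorithms typically work with.

```python
import numpy as np

def linear_kernel(x, y):
    """k(x, x') = <x, x'>"""
    return float(np.dot(x, y))

def polynomial_kernel(x, y, d=2):
    """k(x, x') = <x, x'>^d (d=2 is an arbitrary default)."""
    return float(np.dot(x, y) ** d)

def rbf_kernel(x, y, sigma=1.0):
    """k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))"""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def gram_matrix(X, kernel):
    """Matrix of all pairwise kernel values K[i, j] = k(X[i], X[j])."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = gram_matrix(X, rbf_kernel)
print(K)  # symmetric, with ones on the diagonal
```

The double loop favors readability over speed. For larger datasets a vectorized computation of the pairwise distances would be the usual choice.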

Kernels can be used with any feature-based algorithm that can be rewritten so that the feature vectors appear only inside inner products.
- This is true for almost all feature-based algorithms (linear regression, SVMs, …)
Kernels can be used to map the data $\mathbf{x}$ into an infinite-dimensional feature space (i.e., a function space).
- The feature vector never has to be represented explicitly, as long as we can evaluate the inner product of two feature vectors
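
As one illustration of rewriting an algorithm in terms of inner products, the sketch below uses kernel ridge regression, a kernelized form of regularized linear regression that is not covered in these notes. In its dual form the model is fitted by solving $(K + \lambda I)\,\boldsymbol{\alpha} = \mathbf{y}$ and predicts with $f(\mathbf{x}) = \sum_i \alpha_i\, k(\mathbf{x}_i, \mathbf{x})$, so only kernel values are ever needed. The function names, the regularization strength `lam`, and the toy sine data are assumptions chosen for the example.

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    """Pairwise RBF kernel values between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def fit_kernel_ridge(X, y, sigma=1.0, lam=1e-3):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    K = rbf_gram(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict_kernel_ridge(X_train, alpha, X_new, sigma=1.0):
    """f(x) = sum_i alpha_i * k(x_i, x); no feature vector is ever formed."""
    return rbf_gram(X_new, X_train, sigma) @ alpha

# Toy data: a nonlinear target that a plain linear model cannot fit.
X_train = np.linspace(-3.0, 3.0, 40).reshape(-1, 1)
y_train = np.sin(X_train[:, 0])

alpha = fit_kernel_ridge(X_train, y_train, sigma=1.0, lam=1e-3)
X_test = np.array([[0.5], [1.5]])
y_pred = predict_kernel_ridge(X_train, alpha, X_test, sigma=1.0)
print(y_pred, np.sin(X_test[:, 0]))
```

The RBF kernel corresponds to an infinite-dimensional feature space, yet the fit only requires a 40 × 40 Gram matrix: the cost scales with the number of training points rather than with the feature dimension.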

➡️ We can obtain a more powerful representation than standard linear feature models.

Last updated on 2024-09-05