Classification

Linear Discriminant Analysis (LDA)

Linear Discriminant Analysis (LDA), also called Fisher’s Linear Discriminant, reduces dimensionality (like PCA) but focuses on maximizing separability among known categories. 💡 Idea: create a new axis and project the data onto it so that the separation between the two categories is maximized. How does it work?
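The idea above can be sketched in a few lines. This is a minimal, illustrative two-class Fisher discriminant on synthetic data (not code from the post): the new axis is $\mathbf{w} \propto S_W^{-1}(\mathbf{m}_2 - \mathbf{m}_1)$, and projecting onto it separates the class means.

```python
import numpy as np

# Minimal sketch of Fisher's linear discriminant for two classes,
# on small synthetic data; all names and values are illustrative.
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))   # class 1 samples
X2 = rng.normal(loc=[3, 3], scale=0.5, size=(50, 2))   # class 2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)              # class means
# Within-class scatter matrix S_W = sum of the per-class scatters
Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
w = np.linalg.solve(Sw, m2 - m1)                       # Fisher direction
w /= np.linalg.norm(w)                                 # unit-length axis

# Project both classes onto the new axis; the projected means are far apart
p1, p2 = X1 @ w, X2 @ w
```

Unlike PCA, the direction is chosen from the class labels, not from overall variance.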

2020-11-07

Linear Discriminant Functions

No assumption about the underlying distributions -> non-parametric. Linear decision surfaces. Begin with supervised training (the class of each training sample is given). Linear Discriminant Functions and Decision Surfaces A discriminant function that is a linear combination of the components of $\mathbf{x}$ can be written as $$ g(\mathbf{x})=\mathbf{w}^{T} \mathbf{x}+w\_{0} $$ where $\mathbf{x}$ is the feature vector, $\mathbf{w}$ the weight vector, and $w\_0$ the bias or threshold weight. The two-category case Decision rule:
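A minimal sketch of the two-category decision rule, assuming an already-trained weight vector and bias (the values here are illustrative, not learned):

```python
import numpy as np

# Illustrative, fixed parameters; in practice they come from training.
w = np.array([1.0, -2.0])   # weight vector
w0 = 0.5                    # bias / threshold weight

def g(x):
    """Linear discriminant g(x) = w^T x + w0."""
    return w @ x + w0

def decide(x):
    """Decide class 1 if g(x) > 0, else class 2."""
    return 1 if g(x) > 0 else 2
```

The decision surface $g(\mathbf{x}) = 0$ is a hyperplane whose orientation is set by $\mathbf{w}$ and whose offset is set by $w_0$.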

2020-11-07

Classification And Regression Tree (CART)

Tree-based Methods CART: Classification And Regression Tree. Grow a binary tree. At each node, “split” the data into two “daughter” nodes; splits are chosen using a splitting criterion. Bottom nodes are “terminal” nodes.
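One such split can be sketched as follows. This is a minimal, illustrative example (not from the post) of choosing a single binary split on a 1-D feature by minimizing the weighted Gini impurity of the two daughter nodes:

```python
# Minimal sketch of one CART split using the Gini splitting criterion;
# the data and helper names are illustrative.
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(xs, ys):
    """Pick the threshold minimizing the weighted Gini of the daughters."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
```

Growing the full tree just applies this search recursively to each daughter node until a stopping rule declares the node terminal.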

2020-10-27

Classification

Assign a class label to an input sample.

2020-09-07

SVM: Kernelized SVM

SVM (with features) Maximum margin principle; slack variables allow for margin violations: $$ \begin{array}{ll} \underset{\mathbf{w}}{\operatorname{argmin}} \quad &\|\mathbf{w}\|^{2} + C \sum_i^N \xi_i \\\\ \text { s.t. } \quad & y_{i}\left(\mathbf{w}^{T} \color{red}{\phi(\mathbf{x}_{i})} + b\right) \geq 1 -\xi_i, \quad \xi_i \geq 0\end{array} $$ Math basics Solve the constrained optimization problem with the method of Lagrange multipliers.
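The slack variables are easy to compute for a fixed hyperplane. As a minimal sketch (not from the post), with the identity feature map and illustrative values, $\xi_i = \max(0,\, 1 - y_i(\mathbf{w}^T\mathbf{x}_i + b))$ measures each point's margin violation:

```python
import numpy as np

# Illustrative data and a fixed (w, b); phi is the identity map here.
X = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]])
y = np.array([1, 1, -1])
w, b = np.array([1.0, 0.0]), 0.0

margins = y * (X @ w + b)             # y_i (w^T x_i + b)
xi = np.maximum(0.0, 1.0 - margins)   # slack: zero when margin >= 1
objective = w @ w + 1.0 * xi.sum()    # ||w||^2 + C * sum(xi), with C = 1
```

Only the second point violates the margin (its slack is 0.5); the constant C trades margin width against total violation.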

2020-07-13

SVM: Kernel Methods

Kernel function Given a mapping function $\phi: \mathcal{X} \rightarrow \mathcal{V}$, the function $$ \mathcal{K}: \mathcal{X} \times \mathcal{X} \rightarrow \mathbb{R}, \quad \mathcal{K}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)=\left\langle\phi(\mathbf{x}), \phi\left(\mathbf{x}^{\prime}\right)\right\rangle_{\mathcal{V}} $$ is called a kernel function. “A kernel is a function that returns the result of a dot product performed in another space.”
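This "dot product in another space" can be verified directly. For the degree-2 polynomial kernel $\mathcal{K}(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^T\mathbf{x}')^2$ in 2-D, the explicit map is $\phi(\mathbf{x}) = (x_1^2,\, \sqrt{2}\,x_1 x_2,\, x_2^2)$, and the kernel value equals the dot product of the mapped vectors (a standard identity, shown here as an illustrative check):

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

def K(x, xp):
    """Degree-2 polynomial kernel: K(x, x') = (x . x')^2."""
    return (x @ xp) ** 2

x, xp = np.array([1.0, 2.0]), np.array([3.0, 1.0])
# K(x, xp) and phi(x) . phi(xp) agree (both are 25 here), but K never
# builds the higher-dimensional vectors -- that is the kernel trick.
```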

2020-07-13

SVM: Basics

🎯 Goal of SVM To find the optimal separating hyperplane which maximizes the margin of the training data: it correctly classifies the training data, and it is the one that will generalize best to unseen data (being as far as possible from the data points of each category). SVM math formulation Assuming the data is linearly separable
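For a hyperplane $\mathbf{w}^T\mathbf{x} + b = 0$, the margin between the two supporting hyperplanes $\mathbf{w}^T\mathbf{x} + b = \pm 1$ is $2/\|\mathbf{w}\|$. A minimal sketch with an illustrative (not trained) hyperplane:

```python
import numpy as np

# Illustrative hyperplane w^T x + b = 0; not the result of training.
w, b = np.array([2.0, 0.0]), 0.0
margin = 2.0 / np.linalg.norm(w)   # distance between w^T x + b = +1 and -1

def classify(x):
    """Classify by which side of the hyperplane x falls on."""
    return 1 if w @ x + b >= 0 else -1
```

Maximizing $2/\|\mathbf{w}\|$ is what turns into the familiar $\operatorname{argmin} \|\mathbf{w}\|^2$ formulation.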

2020-07-13

Logistic Regression: Probabilistic view

Class label: $$ y_i \in \\{0, 1\\} $$ Conditional probability distribution of the class label is $$ \begin{aligned} p(y=1|\boldsymbol{x}) &= \sigma(\boldsymbol{w}^T\boldsymbol{x}+b) \\\\ p(y=0|\boldsymbol{x}) &= 1 - \sigma(\boldsymbol{w}^T\boldsymbol{x}+b) \end{aligned} $$ with
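The two conditional probabilities above sum to one by construction. A minimal sketch (illustrative helper names, fixed rather than learned weights):

```python
import math

def sigma(z):
    """Logistic sigmoid sigma(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))

def p_y_given_x(x, w, b):
    """Return (p(y=1|x), p(y=0|x)) for feature list x and weights w."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p1 = sigma(z)
    return p1, 1.0 - p1
```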

2020-07-13

Logistic Regression: Basics

💡 Use a regression algorithm for classification. Logistic regression estimates the probability that an instance belongs to a particular class. If the estimated probability is greater than 50%, the model predicts that the instance belongs to that class (the positive class, labeled “1”); otherwise it predicts that it does not (i.e., the instance belongs to the negative class, labeled “0”).
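The 50% decision rule can be sketched in a couple of lines, assuming a 1-D feature and fixed illustrative weights:

```python
import math

def predict(x, w, b, threshold=0.5):
    """Predict 1 (positive class) if the estimated probability > threshold."""
    p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # estimated probability
    return 1 if p > threshold else 0
```

The boundary $p = 0.5$ corresponds to $w x + b = 0$, so the decision boundary is still linear in the features.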

2020-07-13

K Nearest Neighbors

Classification models.

2020-07-13