With two classes, the gold labels are the human-defined labels for each document that we are trying to match. Confusion matrix: to evaluate any system for detecting things, we start by building a contingency table (a confusion matrix):
2020-08-03
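The contingency table above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the label names (`"pos"`/`"neg"`), the `confusion_matrix` helper, and the example labels are all hypothetical.

```python
from collections import Counter

def confusion_matrix(gold, predicted, positive="pos"):
    """Build a 2x2 contingency table comparing gold labels to system output."""
    cells = Counter()
    for g, p in zip(gold, predicted):
        row = "true_" + ("pos" if g == positive else "neg")
        col = "pred_" + ("pos" if p == positive else "neg")
        cells[(row, col)] += 1
    return cells

# Hypothetical gold and system labels for six documents
gold = ["pos", "pos", "neg", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
cm = confusion_matrix(gold, pred)
# cm[("true_pos", "pred_pos")] -> 2 true positives
```

From the four cells one can then read off true/false positives and negatives, the quantities that precision and recall are defined over.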
While standard naive Bayes text classification can work well for sentiment analysis, some small changes are generally employed that improve performance. 💪 Binary multinomial naive Bayes (binary NB): first, for sentiment classification and a number of other text classification tasks, whether a word occurs at all seems to matter more than how often it occurs. Binary NB therefore clips each word's count at 1 per document before the counts are accumulated.
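The per-document clipping that binary NB performs can be sketched as follows; the `binarize` helper name and the example document are illustrative assumptions.

```python
def binarize(doc_tokens):
    """Binary NB preprocessing: clip each word's count to 1 within a
    document, keeping only presence/absence of each word."""
    return sorted(set(doc_tokens))

# "great" occurs three times, but binary NB counts it once for this document
doc = ["great", "great", "plot", "great", "acting"]
binarize(doc)  # -> ["acting", "great", "plot"]
```

Note the clipping is per document: a word still contributes once from every training document it appears in, so document frequency is preserved even though within-document frequency is discarded.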
Maximum Likelihood Estimate (MLE) In the Naive Bayes calculation we have to learn the probabilities $P(c)$ and $P(w_i|c)$. We use the Maximum Likelihood Estimate (MLE) to estimate them: we simply use the relative frequencies in the data.
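A minimal sketch of the MLE step: $P(c)$ as the fraction of training documents with class $c$, and $P(w|c)$ as the fraction of word tokens in class $c$ that are $w$. The function name and the toy training set are assumptions; real implementations would also smooth these estimates.

```python
from collections import Counter

def mle_estimates(training):
    """Estimate P(c) and P(w|c) by relative frequency (unsmoothed MLE).

    training: list of (tokens, class) pairs.
    """
    class_counts = Counter()
    word_counts = {}  # class -> Counter over word tokens
    for tokens, c in training:
        class_counts[c] += 1
        word_counts.setdefault(c, Counter()).update(tokens)
    n_docs = sum(class_counts.values())
    prior = {c: n / n_docs for c, n in class_counts.items()}
    likelihood = {
        c: {w: n / sum(wc.values()) for w, n in wc.items()}
        for c, wc in word_counts.items()
    }
    return prior, likelihood

training = [(["fun", "fun", "great"], "pos"), (["dull", "boring"], "neg")]
prior, likelihood = mle_estimates(training)
# prior["pos"] -> 0.5; likelihood["pos"]["fun"] -> 2/3
```

Because unsmoothed MLE assigns zero probability to any word unseen in a class, these raw frequencies are usually combined with smoothing (e.g. add-one) in practice.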
Notation Classifier for text classification:
Input: $d$ (“document”)
Output: $c$ (“class”)
Training set: $N$ documents that have each been hand-labeled with a class $(d_1, c_1), \dots, (d_N, c_N)$
🎯 Goal: to learn a classifier that is capable of mapping from a new document $d$ to its correct class $c\in C$
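The mapping from a new document $d$ to a class $c \in C$ can be sketched as the usual argmax over $\log P(c) + \sum_i \log P(w_i|c)$, computed in log space. This is a hypothetical sketch: the `classify` function, the handling of unseen words (skipped here rather than smoothed), and the toy probability tables are all assumptions for illustration.

```python
import math

def classify(tokens, prior, likelihood):
    """Map a document (token list) to argmax_c P(c) * prod_i P(w_i | c).

    Computed in log space to avoid underflow. Words unseen for a class are
    skipped here -- a simplification; in practice one smooths the estimates.
    """
    best_class, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c])
        for w in tokens:
            if w in likelihood[c]:
                score += math.log(likelihood[c][w])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy hand-set parameters for a two-class sentiment task
prior = {"pos": 0.5, "neg": 0.5}
likelihood = {"pos": {"fun": 0.6, "dull": 0.1},
              "neg": {"fun": 0.1, "dull": 0.6}}
classify(["fun", "fun", "dull"], prior, likelihood)  # -> "pos"
```

With two occurrences of “fun” against one of “dull”, the positive class scores higher, matching the hand-labeled intuition.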