With two classes, the gold labels are the human-defined labels for each document that we are trying to match. Confusion matrix: to evaluate any system for detecting things, we start by building a contingency table (a confusion matrix):
2020-08-03
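The contingency table above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the label names (`"pos"`/`"neg"`), the `confusion_matrix` helper, and the example labels are all hypothetical.

```python
from collections import Counter

def confusion_matrix(gold, predicted, positive="pos"):
    """Build a 2x2 contingency table comparing gold labels to system output."""
    cells = Counter()
    for g, p in zip(gold, predicted):
        row = "true_" + ("pos" if g == positive else "neg")
        col = "pred_" + ("pos" if p == positive else "neg")
        cells[(row, col)] += 1
    return cells

# Hypothetical gold and system labels for six documents
gold = ["pos", "pos", "neg", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
cm = confusion_matrix(gold, pred)
# cm[("true_pos", "pred_pos")] -> 2 true positives
```

From the four cells one can then read off true/false positives and negatives, the quantities that precision and recall are defined over.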
While standard naive Bayes text classification can work well for sentiment analysis, some small changes are generally employed that improve performance. 💪 Binary multinomial naive Bayes (binary NB): first, for sentiment classification and a number of other text classification tasks, whether a word occurs at all seems to matter more than how often it occurs. Binary NB therefore clips each word's count at 1 per document before the counts are accumulated.
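The per-document clipping that binary NB performs can be sketched as follows; the `binarize` helper name and the example document are illustrative assumptions.

```python
def binarize(doc_tokens):
    """Binary NB preprocessing: clip each word's count to 1 within a
    document, keeping only presence/absence of each word."""
    return sorted(set(doc_tokens))

# "great" occurs three times, but binary NB counts it once for this document
doc = ["great", "great", "plot", "great", "acting"]
binarize(doc)  # -> ["acting", "great", "plot"]
```

Note the clipping is per document: a word still contributes once from every training document it appears in, so document frequency is preserved even though within-document frequency is discarded.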
Maximum Likelihood Estimate (MLE) In the Naive Bayes calculation we have to learn the probabilities $P(c)$ and $P(w_i|c)$. We use the Maximum Likelihood Estimate (MLE) to estimate them: we simply use the relative frequencies in the data.
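A minimal sketch of the MLE step: $P(c)$ as the fraction of training documents with class $c$, and $P(w|c)$ as the fraction of word tokens in class $c$ that are $w$. The function name and the toy training set are assumptions; real implementations would also smooth these estimates.

```python
from collections import Counter

def mle_estimates(training):
    """Estimate P(c) and P(w|c) by relative frequency (unsmoothed MLE).

    training: list of (tokens, class) pairs.
    """
    class_counts = Counter()
    word_counts = {}  # class -> Counter over word tokens
    for tokens, c in training:
        class_counts[c] += 1
        word_counts.setdefault(c, Counter()).update(tokens)
    n_docs = sum(class_counts.values())
    prior = {c: n / n_docs for c, n in class_counts.items()}
    likelihood = {
        c: {w: n / sum(wc.values()) for w, n in wc.items()}
        for c, wc in word_counts.items()
    }
    return prior, likelihood

training = [(["fun", "fun", "great"], "pos"), (["dull", "boring"], "neg")]
prior, likelihood = mle_estimates(training)
# prior["pos"] -> 0.5; likelihood["pos"]["fun"] -> 2/3
```

Because unsmoothed MLE assigns zero probability to any word unseen in a class, these raw frequencies are usually combined with smoothing (e.g. add-one) in practice.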
Notation Classifier for text classification:
Input: $d$ (“document”)
Output: $c$ (“class”)
Training set: $N$ documents that have each been hand-labeled with a class $(d_1, c_1), \dots, (d_N, c_N)$
🎯 Goal: to learn a classifier that is capable of mapping from a new document $d$ to its correct class $c\in C$
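The mapping from a new document $d$ to a class $c \in C$ can be sketched as the usual argmax over $\log P(c) + \sum_i \log P(w_i|c)$, computed in log space. This is a hypothetical sketch: the `classify` function, the handling of unseen words (skipped here rather than smoothed), and the toy probability tables are all assumptions for illustration.

```python
import math

def classify(tokens, prior, likelihood):
    """Map a document (token list) to argmax_c P(c) * prod_i P(w_i | c).

    Computed in log space to avoid underflow. Words unseen for a class are
    skipped here -- a simplification; in practice one smooths the estimates.
    """
    best_class, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c])
        for w in tokens:
            if w in likelihood[c]:
                score += math.log(likelihood[c][w])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Toy hand-set parameters for a two-class sentiment task
prior = {"pos": 0.5, "neg": 0.5}
likelihood = {"pos": {"fun": 0.6, "dull": 0.1},
              "neg": {"fun": 0.1, "dull": 0.6}}
classify(["fun", "fun", "dull"], prior, likelihood)  # -> "pos"
```

With two occurrences of “fun” against one of “dull”, the positive class scores higher, matching the hand-labeled intuition.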