AdaBoost

Adaptive Boosting:

Each new predictor corrects its predecessor by paying a bit more attention to the training instances that the predecessor underfitted. This results in new predictors focusing more and more on the hard cases.

Pseudocode

  1. Assign observation $i$ the initial weight $d\_{1,i}=\frac{1}{n}$ (equal weights)

  2. For $t=1:T$

    1. Train the weak learning algorithm using data weighted by $d\_{t,i}$. This produces the weak classifier $h\_t$

    2. Choose the coefficient $\alpha\_t$ (tells us how good the classifier is at that round)

    $$ \begin{aligned} \operatorname{Error}\_{t} &= \displaystyle\sum\_{i:\, h\_{t}\left(x\_{i}\right) \neq y\_{i}} d\_{t,i} \quad \text{(sum of weights of misclassified points)} \\\\ \alpha\_t &= \frac{1}{2} \ln\left(\frac{1 - \operatorname{Error}\_{t}}{\operatorname{Error}\_{t}}\right) \end{aligned} $$
    3. Update weights

      $$ d\_{t+1, i}=\frac{d\_{t, i} \cdot \exp (-\alpha\_{t} y\_{i} h\_{t}\left(x\_{i}\right))}{Z\_{t}} $$
      • $Z\_t = \displaystyle \sum\_{i=1}^{n} d\_{t,i} \exp (-\alpha\_{t} y\_{i} h\_{t}\left(x\_{i}\right)) $: normalization factor so that the updated weights sum to 1

        • If prediction $i$ is correct $\rightarrow y\_i h\_t(x\_i) = 1 \rightarrow$ the weight of observation $i$ is scaled by $\exp(-\alpha\_t) < 1$ (decreased)
        • If prediction $i$ is incorrect $\rightarrow y\_i h\_t(x\_i) = -1 \rightarrow$ the weight of observation $i$ is scaled by $\exp(\alpha\_t) > 1$ (increased)
  3. Output the final classifier

    $ H(x)=\operatorname{sign}\left(\sum\_{t=1}^{T} \alpha\_{t} h\_{t}\left(x\right)\right) $
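
The pseudocode above translates almost line-for-line into a short from-scratch sketch. Decision stumps (depth-1 trees) stand in for the weak learner, labels are assumed to be in $\{-1, +1\}$, and the names `fit_adaboost` / `predict_adaboost` are purely illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # depth-1 trees as weak learners


def fit_adaboost(X, y, T=50):
    """AdaBoost for labels y in {-1, +1}; returns the coefficients alpha_t and weak classifiers h_t."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    d = np.full(n, 1.0 / n)                 # step 1: equal initial weights d_{1,i} = 1/n
    alphas, stumps = [], []
    for t in range(T):
        # step 2.1: train the weak classifier h_t on data weighted by d_{t,i}
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=d)
        pred = h.predict(X)
        # step 2.2: weighted error and coefficient alpha_t
        error = d[pred != y].sum()
        error = np.clip(error, 1e-10, 1 - 1e-10)   # guard against division by zero / log(0)
        alpha = 0.5 * np.log((1 - error) / error)
        # step 2.3: update weights and renormalize (Z_t makes them sum to 1)
        d = d * np.exp(-alpha * y * pred)
        d /= d.sum()
        alphas.append(alpha)
        stumps.append(h)
    return alphas, stumps


def predict_adaboost(alphas, stumps, X):
    """Final classifier H(x) = sign(sum_t alpha_t h_t(x))."""
    scores = sum(a * h.predict(X) for a, h in zip(alphas, stumps))
    return np.sign(scores)
```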

Example

(Figures AdaBoost_Eg-00 through AdaBoost_Eg-07: step-by-step worked example.)

Tutorial
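
For comparison with the manual sketch above, here is a minimal example of running AdaBoost through scikit-learn's `AdaBoostClassifier`. The synthetic dataset and the hyperparameter values are placeholders, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Placeholder synthetic dataset
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# T weak learners, each weighted by its coefficient alpha_t internally
clf = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```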