<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ensemble Learning | Haobin Tan</title><link>https://haobin-tan.netlify.app/tags/ensemble-learning/</link><atom:link href="https://haobin-tan.netlify.app/tags/ensemble-learning/index.xml" rel="self" type="application/rss+xml"/><description>Ensemble Learning</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sat, 07 Nov 2020 00:00:00 +0000</lastBuildDate><image><url>https://haobin-tan.netlify.app/media/icon_hu7d15bc7db65c8eaf7a4f66f5447d0b42_15095_512x512_fill_lanczos_center_3.png</url><title>Ensemble Learning</title><link>https://haobin-tan.netlify.app/tags/ensemble-learning/</link></image><item><title>Ensemble Learning</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/</link><pubDate>Mon, 07 Sep 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/</guid><description/></item><item><title>Why ensemble learning?</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/why-ensemble-learning/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/why-ensemble-learning/</guid><description>&lt;p>&lt;strong>wisdom of the crowd&lt;/strong> : In many cases you will find that this aggregated answer is better than an expert’s answer.&lt;/p>
&lt;p>Similarly, if you aggregate the predictions of a group of predictors (such as classifiers or regressors), you will often get &lt;strong>better&lt;/strong> predictions than with the best individual predictor.&lt;/p>
&lt;p>A group of predictors is called an &lt;strong>ensemble&lt;/strong>; thus, this technique is called &lt;strong>Ensemble Learning&lt;/strong>, and an Ensemble Learning algorithm is called an &lt;strong>Ensemble method&lt;/strong>.&lt;/p>
&lt;p>Popular Ensemble methods:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/bagging-and-pasting/">Bagging and Pasting&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/boosting/">Boosting&lt;/a>&lt;/li>
&lt;li>stacking&lt;/li>
&lt;li>&lt;a href="https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/voting-classifier/">Voting Classifier&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Voting Classifier</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/voting-classifier/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/voting-classifier/</guid><description>&lt;p>Suppose we have trained a few classifiers, each one achieving about 80% accuracy.&lt;/p>
&lt;p>A very simple way to create an even better classifier is to aggregate the predictions of each classifier and predict the class that gets the &lt;strong>most&lt;/strong> votes.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Voting_Classifier.png" alt="Voting_Classifier" style="zoom:67%;" />
&lt;p>This majority-vote classifier is called a &lt;strong>hard voting classifier&lt;/strong>&lt;/p>
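&lt;p>A minimal hard-voting sketch with scikit-learn (the &lt;code>make_moons&lt;/code> dataset and the hyperparameters are illustrative assumptions, not from the original notes):&lt;/p>

```python
# Hard voting: aggregate the class predictions of a few diverse
# classifiers and predict the majority class.
from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("svc", SVC(random_state=42)),
        ("tree", DecisionTreeClassifier(random_state=42)),
    ],
    voting="hard",  # predict the class that gets the most votes
)
voting_clf.fit(X, y)
print(voting_clf.predict(X[:3]))
```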
&lt;blockquote>
&lt;p>Surprisingly, this voting classifier often achieves a higher accuracy than the best classifier in the ensemble. In fact, even if each classifier is a weak learner (meaning it does only slightly better than random guessing), the ensemble can still be a strong learner (achieving high accuracy), provided there are a sufficient number of weak learners and they are sufficiently diverse. (Reason behind: the law of large numbers)&lt;/p>
&lt;/blockquote>
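&lt;p>The law-of-large-numbers argument can be checked with a small simulation (illustrative, stdlib only): 1,000 independent weak "classifiers", each correct with probability 0.51, combined by majority vote.&lt;/p>

```python
import random

random.seed(0)

def majority_vote_correct(n_learners=1000, p_correct=0.51):
    """True if the majority of independent weak learners is correct."""
    votes = sum(random.random() < p_correct for _ in range(n_learners))
    return votes > n_learners / 2

# Fraction of trials in which the majority vote is right -- well above
# the 51% accuracy of any single weak learner.
trials = 1000
accuracy = sum(majority_vote_correct() for _ in range(trials)) / trials
print(accuracy)
```

Note this only holds because the simulated learners are fully independent; correlated errors would shrink the gain, which is why the diversity points below matter.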
&lt;p>&lt;strong>Ensemble methods work best when the predictors are as independent from one another as possible.&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>One way to get diverse classifiers is to &lt;strong>train them using very different algorithms.&lt;/strong> This increases the chance that they will make very different types of errors, improving the ensemble’s accuracy.&lt;/li>
&lt;li>Another approach is to use the &lt;strong>same&lt;/strong> training algorithm for every predictor, but to train them on different random subsets of the training set. (See &lt;a href="https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/bagging-and-pasting/">Bagging and Pasting&lt;/a>)&lt;/li>
&lt;/ul></description></item><item><title>Random Forest</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/random-forest/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/random-forest/</guid><description>&lt;img src="https://i.stack.imgur.com/iY55n.jpg" style="zoom:80%; background-color:white">
&lt;p>Train a group of Decision Tree classifiers (generally via the bagging method (or sometimes pasting)), each on a different random subset of the training set&lt;/p>
&lt;p>To make predictions, just obtain the predictions of all individual trees, then predict the class that gets the &lt;strong>most&lt;/strong> votes.&lt;/p>
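&lt;p>A minimal Random Forest sketch with scikit-learn (dataset and parameters are illustrative assumptions):&lt;/p>

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

rnd_clf = RandomForestClassifier(
    n_estimators=500,    # number of trees in the forest
    max_leaf_nodes=16,   # regularize each individual tree
    n_jobs=-1,           # grow trees in parallel on all cores
    random_state=42,
)
rnd_clf.fit(X, y)
print(rnd_clf.predict(X[:3]))
```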
&lt;h2 id="why-is-random-forest-good">Why is Random Forest good?&lt;/h2>
&lt;p>The Random Forest algorithm &lt;strong>introduces extra randomness&lt;/strong> when growing trees; instead of searching for the very best feature when splitting a node, it searches for the best feature among a random subset of features. &lt;strong>This results in a greater tree diversity, which (once again) trades a higher bias for a lower variance, generally yielding an overall better model.&lt;/strong> 👏&lt;/p></description></item><item><title>Ensemble Learners</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/ensemble-learners/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/ensemble-learners/</guid><description>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/Un9zObFjBH0?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div>
&lt;h2 id="why-emsemble-learners">Why emsemble learners?&lt;/h2>
&lt;p>Lower error&lt;/p>
&lt;ul>
&lt;li>Each learner (model) has its own bias. If we put them together, the biases tend to be reduced (they counteract each other to some extent)&lt;/li>
&lt;li>Less overfitting&lt;/li>
&lt;li>Tastes great&lt;/li>
&lt;/ul></description></item><item><title>Boosting</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/boosting/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/boosting/</guid><description>&lt;h1 id="boosting">Boosting&lt;/h1>
&lt;p>Refers to any Ensemble method that can &lt;strong>combine several weak learners into a strong learner&lt;/strong>&lt;/p>
&lt;p>💡 &lt;strong>General idea: train predictors sequentially, each trying to correct its predecessor.&lt;/strong>&lt;/p>
&lt;p>Popular boosting methods:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/adaboost/">AdaBoost&lt;/a>&lt;/li>
&lt;li>Gradient Boosting&lt;/li>
&lt;/ul></description></item><item><title>Bagging and Pasting</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/bagging-and-pasting/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/bagging-and-pasting/</guid><description>&lt;h2 id="tldr">TL;DR&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Bootstrap Aggregating (Bagging): Sampling &lt;strong>with&lt;/strong> replacement&lt;/p>
&lt;p>
&lt;figure >
&lt;div class="flex justify-center ">
&lt;div class="w-100" >&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/Boostrap_Aggregating.png" alt="Boostrap_Aggregating" loading="lazy" data-zoomable />&lt;/div>
&lt;/div>&lt;/figure>
&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Pasting: Sampling &lt;strong>without&lt;/strong> replacement&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="explaination">Explaination&lt;/h2>
&lt;p>Ensemble methods work best when the predictors are as independent from one another as possible.&lt;/p>
&lt;p>One way to get a diverse set of classifiers: &lt;strong>use the same training algorithm for every predictor, but to train them on different random subsets of the training set&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Sampling &lt;strong>with&lt;/strong> replacement: &lt;strong>bootstrap aggregating (Bagging)&lt;/strong>&lt;/li>
&lt;li>Sampling &lt;strong>without&lt;/strong> replacement: &lt;strong>pasting&lt;/strong>&lt;/li>
&lt;/ul>
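&lt;p>In scikit-learn, the &lt;code>bootstrap&lt;/code> flag is what separates bagging from pasting; a minimal sketch (dataset and parameters are illustrative assumptions):&lt;/p>

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    max_samples=100,
    bootstrap=True,    # sampling WITH replacement -> bagging
    n_jobs=-1,         # predictors train in parallel
    random_state=42,
)
paste_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    max_samples=100,
    bootstrap=False,   # sampling WITHOUT replacement -> pasting
    n_jobs=-1,
    random_state=42,
)
bag_clf.fit(X, y)
paste_clf.fit(X, y)
```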
&lt;p>Once all predictors are trained, the ensemble can make a prediction for a new instance by simply aggregating the predictions of all predictors. The aggregation function is typically the &lt;strong>statistical mode&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>classification: the most frequent prediction (just like a hard voting classifier)&lt;/li>
&lt;li>regression: average&lt;/li>
&lt;/ul>
&lt;p>Each individual predictor has a higher bias than if it were trained on the original training set, but aggregation reduces both bias and variance. 👏&lt;/p>
&lt;p>Generally, the net result is that the ensemble has a &lt;strong>similar bias but a lower variance&lt;/strong> than a single predictor trained on the original training set.&lt;/p>
&lt;h2 id="advantages-of-bagging-and-pasting">Advantages of Bagging and Pasting&lt;/h2>
&lt;ul>
&lt;li>Predictors can all be trained in parallel, via different CPU cores or even different servers.&lt;/li>
&lt;li>Predictions can be made in parallel.&lt;/li>
&lt;/ul>
&lt;p>-&amp;gt; They scale very well 👍&lt;/p>
&lt;h2 id="bagging-vs-pasting">Bagging vs. Pasting&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Bootstrapping introduces a bit more diversity in the subsets that each predictor is trained on, so bagging ends up with a &lt;strong>slightly&lt;/strong> &lt;strong>higher bias&lt;/strong> than pasting, but this also means that predictors end up being &lt;strong>less correlated&lt;/strong> so the ensemble’s variance is reduced.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Overall, bagging often results in better models&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>However, if you have spare time and CPU power, you can use cross-validation to evaluate both bagging and pasting and select the one that works best.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="out-of-bag-evaluation">Out-of-Bag Evaluation&lt;/h2>
&lt;p>With bagging, some instances may be sampled several times for any given predictor, while others may not be sampled at all. This means that only about 63% of the training instances are sampled on average for each predictor.&lt;/p>
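&lt;p>The 63% figure follows from a short calculation (illustrative sketch, stdlib only): the probability that a given instance is sampled at least once in $n$ draws with replacement is $1 - (1 - 1/n)^n$, which approaches $1 - 1/e \approx 0.632$ as $n$ grows.&lt;/p>

```python
import math

# Probability that a given instance appears in a bootstrap sample of
# size n drawn from n instances.
for n in (10, 100, 10_000):
    p_sampled = 1 - (1 - 1 / n) ** n
    print(n, round(p_sampled, 4))

print(1 - 1 / math.e)  # the limit, roughly 0.632
```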
&lt;p>The remaining 37% of the training instances that are not sampled are called &lt;strong>out-of-bag (oob) instances.&lt;/strong> Note that they are &lt;strong>not the same 37%&lt;/strong> for all predictors.&lt;/p>
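&lt;p>Out-of-bag evaluation is built into scikit-learn's &lt;code>BaggingClassifier&lt;/code>; a minimal sketch (dataset and parameters are illustrative assumptions):&lt;/p>

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=500,
    bootstrap=True,
    oob_score=True,   # evaluate each predictor on its oob instances
    n_jobs=-1,
    random_state=42,
)
bag_clf.fit(X, y)
print(bag_clf.oob_score_)  # an estimate of test-set accuracy
```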
&lt;p>Since a predictor never sees the oob instances during training, it can be evaluated on these instances, without the need for a separate validation set. You can evaluate the ensemble itself by averaging out the oob evaluations of each predictor.&lt;/p></description></item><item><title>AdaBoost</title><link>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/adaboost/</link><pubDate>Sat, 07 Nov 2020 00:00:00 +0000</pubDate><guid>https://haobin-tan.netlify.app/docs/ai/machine-learning/ensemble-learning/adaboost/</guid><description>&lt;p>&lt;strong>Ada&lt;/strong>ptive &lt;strong>Boost&lt;/strong>ing:&lt;/p>
&lt;p>Correct its predecessor by paying a bit more attention to the training instances that the predecessor underfitted. This results in new predictors focusing more and more on the hard cases.&lt;/p>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost.png" alt="AdaBoost" style="zoom:80%;" />
&lt;h2 id="pseudocode">Pseudocode&lt;/h2>
&lt;ol>
&lt;li>
&lt;p>Assign each observation $i$ the initial weight $d\_{1,i}=\frac{1}{n}$ (equal weights)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>For $t=1:T$&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Train the weak learning algorithm using data weighted by $d\_{t,i}$. This produces weak classifier $h\_t$&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Choose coefficient $\alpha\_t$ (tells us how good the classifier is at that round)&lt;/p>
&lt;/li>
&lt;/ol>
$$
\begin{aligned}
\operatorname{Error}\_{t} &amp;= \displaystyle\sum\_{i: h\_{t}\left(x\_{i}\right) \neq y\_{i}} d\_{t,i} \quad \text{(sum of weights of misclassified points)} \\\\
\alpha\_t &amp;= \frac{1}{2} \ln \left(\frac{1 - \operatorname{Error}\_{t}}{\operatorname{Error}\_{t}}\right)
\end{aligned}
$$
&lt;ol start="3">
&lt;li>
&lt;p>Update weights
&lt;/p>
$$
d\_{t+1, i}=\frac{d\_{t, i} \cdot \exp (-\alpha\_{t} y\_{i} h\_{t}\left(x\_{i}\right))}{Z\_{t}}
$$
&lt;ul>
&lt;li>
&lt;p>$Z\_t = \displaystyle \sum\_{i=1}^{n} d\_{t,i} \cdot \exp (-\alpha\_{t} y\_{i} h\_{t}\left(x\_{i}\right))$: &lt;strong>normalization factor&lt;/strong> (so that the updated weights sum to 1)&lt;/p>
&lt;blockquote>
&lt;ul>
&lt;li>If prediction $i$ is correct $\rightarrow y\_i h\_t(x\_i) = 1 \rightarrow$ the weight of observation $i$ is multiplied by $\exp(-\alpha\_t)$, i.e. decreased&lt;/li>
&lt;li>If prediction $i$ is incorrect $\rightarrow y\_i h\_t(x\_i) = -1 \rightarrow$ the weight of observation $i$ is multiplied by $\exp(\alpha\_t)$, i.e. increased&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/li>
&lt;li>
&lt;p>Output the final classifier&lt;/p>
&lt;p>$
H(x)=\operatorname{sign}\left(\sum\_{t=1}^{T} \alpha\_{t} h\_{t}\left(x\right)\right)
$&lt;/p>
&lt;/li>
&lt;/ol>
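&lt;p>The pseudocode above can be sketched from scratch in Python. This is an illustrative implementation only: the dataset, the 50 rounds, and the depth-1 scikit-learn trees used as weak learners are all assumptions, and labels are mapped to $\{-1, +1\}$ as the formulas require.&lt;/p>

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_moons(n_samples=500, noise=0.30, random_state=42)
y = np.where(y01 == 1, 1, -1)          # labels in {-1, +1}

n = len(X)
d = np.full(n, 1 / n)                  # step 1: equal weights d_{1,i} = 1/n
learners, alphas = [], []

for t in range(50):                    # step 2: T rounds
    stump = DecisionTreeClassifier(max_depth=1, random_state=42)
    stump.fit(X, y, sample_weight=d)   # 2.1: train on weighted data
    pred = stump.predict(X)
    error = d[pred != y].sum()         # sum of weights of misclassified points
    alpha = 0.5 * np.log((1 - error) / error)  # 2.2: coefficient alpha_t
    d = d * np.exp(-alpha * y * pred)  # 2.3: up-weight the mistakes...
    d = d / d.sum()                    # ...and normalize (divide by Z_t)
    learners.append(stump)
    alphas.append(alpha)

# Step 3: final classifier H(x) = sign(sum_t alpha_t * h_t(x))
H = np.sign(sum(a * h.predict(X) for a, h in zip(alphas, learners)))
print((H == y).mean())                 # training accuracy of the ensemble
```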
&lt;h2 id="example">Example&lt;/h2>
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-00.png" alt="AdaBoost_Eg-00" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-01.png" alt="AdaBoost_Eg-01" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-02.png" alt="AdaBoost_Eg-02" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-03.png" alt="AdaBoost_Eg-03" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-04.png" alt="AdaBoost_Eg-04" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-05.png" alt="AdaBoost_Eg-05" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-06.png" alt="AdaBoost_Eg-06" style="zoom:50%;" />
&lt;img src="https://raw.githubusercontent.com/EckoTan0804/upic-repo/master/uPic/AdaBoost_Eg-07.png" alt="AdaBoost_Eg-07" style="zoom:50%;" />
&lt;h2 id="tutorial">Tutorial&lt;/h2>
&lt;div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
&lt;iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen="allowfullscreen" loading="eager" referrerpolicy="strict-origin-when-cross-origin" src="https://www.youtube.com/embed/-DUxtdeCiB4?autoplay=0&amp;controls=1&amp;end=0&amp;loop=0&amp;mute=0&amp;start=0" style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;" title="YouTube video"
>&lt;/iframe>
&lt;/div></description></item></channel></rss>