Before comparing Bagging and Boosting, it helps to understand ensemble learning itself.
Sometimes it is not sufficient to depend on the results of just one machine learning model.
Ensemble models combine the predictive power of multiple weak learners into a single model that gives the aggregated output of the group. Ensembling also helps to reduce noise, bias and variance.
The main principle behind an ensemble model is that a group of weak learners (often decision trees) come together to form a strong learner, giving better performance than any individual member.
There are two main techniques of Ensemble Learning: Bagging and Boosting.
In Bagging, the idea is to create several subsets of the training data by sampling randomly with replacement; for each subset, the features used are a random subset (m) of all the independent columns (M).
Each subset is then used to train its own decision tree. As a result, we end up with an ensemble of different models.
The final prediction is made by voting (for classification) or averaging (for regression) across all the models, which is more reliable and stable than a single decision tree.
The Random Forest algorithm implements the Bagging technique.
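The steps above can be sketched in a few lines. This is a minimal illustration, assuming scikit-learn is installed and using a synthetic dataset from `make_classification` (both are assumptions, not part of the original text):

```python
# Bagging sketch: each tree trains on a bootstrap sample (rows drawn with
# replacement); the ensemble's prediction is a majority vote across trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy data, just for illustration
X, y = make_classification(n_samples=500, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# BaggingClassifier uses a decision tree as its default base learner
bag = BaggingClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

# Random Forest additionally samples a random subset of features (m of M)
# at every split, which further decorrelates the trees
forest = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

bag_acc = bag.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
```

Both estimators aggregate many high-variance trees; the voting step is what makes the combined prediction more stable than any single tree.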
In Boosting, learners are trained sequentially: early learners fit simple models to the data, and the data is then analysed for errors.
By adding models sequentially, the errors of the previous model are corrected by the next predictor, until the training data is predicted accurately or a limit on the number of models is reached.
In other words, the goal is to reduce the error generated by the previous tree.
The most common types of boosting algorithms are AdaBoost, Gradient Boosting (GBM), and XGBoost.
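The sequential error-correction idea can be seen directly with AdaBoost. A minimal sketch, again assuming scikit-learn and a synthetic dataset (both assumptions):

```python
# Boosting sketch with AdaBoost: each new weak learner (a depth-1 "stump"
# by default) upweights the examples the previous learners got wrong, so
# training accuracy tends to improve stage by stage.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy data, just for illustration
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

boost = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# staged_score yields the training accuracy after each added weak learner,
# making the stage-by-stage error reduction visible
staged = list(boost.staged_score(X_train, y_train))
boost_acc = boost.score(X_test, y_test)
```

Printing `staged` shows the accuracy climbing as stumps are added, which is the "correct the previous model's errors" behaviour described above.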
Which is best: Bagging or Boosting?
There is no single winner; it all depends on the situation and the data.
If the problem is that a single model gets very low performance, Boosting may help.
If the problem is that a single model is over-fitting, then Bagging is the better option.
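This rule of thumb can be checked empirically. The sketch below, assuming scikit-learn and a synthetic dataset (both assumptions), compares cross-validated accuracy for an over-fitting deep tree versus its bagged version, and for a weak stump versus its boosted version:

```python
# Bagging vs Boosting rule of thumb:
#  - bagging reduces the variance of an over-fitting deep tree
#  - boosting reduces the bias of a weak, underfitting stump
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, just for illustration
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=1)

deep_tree = DecisionTreeClassifier(random_state=1)           # prone to over-fitting
stump = DecisionTreeClassifier(max_depth=1, random_state=1)  # weak, high-bias

scores = {
    "single deep tree": cross_val_score(deep_tree, X, y).mean(),
    "bagged deep trees": cross_val_score(
        BaggingClassifier(n_estimators=50, random_state=1), X, y).mean(),
    "single stump": cross_val_score(stump, X, y).mean(),
    "boosted stumps": cross_val_score(
        AdaBoostClassifier(n_estimators=50, random_state=1), X, y).mean(),
}
```

On most datasets of this shape, bagging improves on the single deep tree and boosting improves markedly on the single stump, though the exact numbers depend on the data.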