Nowadays most of the people use either XGBoost or LightGBM or CatBoost to win the competitions at Kaggle or different Hackathons.
XGboost,famous algorithm among kagglers,efficiency not satisfactory when the feature dimension is high and data size is large. LightGBM is powerful algorithm when big data come into the picture.

Why LightGBM?

When the feature dimension are high and data size is large, the efficiency and scalability of other algorithms are still unsatisfactory. Other algorithms need to scan all the data instances to estimate the information gain of all possible split points, which is very time consuming.

There are two novel techniques:Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB)in LightGBM to tackle the problem.

LightGBM speed is more than twenty times in comparison of other algorithms while achieving same accuracy.

What is the significance of GOSS and EFB in LightGBM?


  • It stands for Gradient based One-Side sampling.
  • It is a novel sampling method for GBDT that can achieve a good balance between reducing the number of data instances and keeping the accuracy for learned decision trees.

We can notice that if an instance is associated with a small gradient, the training error for this instance is small and it is already well-trained.
A straightforward idea is to discard those data instances with small gradients. But this may impact the accuracy of model as losing the important information.

GOSS is the solution for it that keeps all the instances of large gradient and performs random sampling on the instances with small gradient.GOSS introduces a constant multiplier for the data instances with small gradients while computing the information gain.

So lightGBM increases the weight of the samples with small gradients when computing their contribution to the change in loss.


  • EFB stands for Exclusive Feature Bundling.
  • It is a method to effectively reduce the number of features.
  • It takes advantage of the sparsity of large datasets.The essential observation behind this method is that the sparsity of features means that some features are never non-zero together.
    For instance, the words “Python” and “Panaroma” might never appear in the same document in the data. This means that these features can be “bundled” into a single feature without losing any information.
  • We can safely bundle exclusive features into a single feature.

How LightGBM works?

LightGBM grows tree vertically meaning that it grows tree leaf-wise.It will choose the leaf with max delta loss to grow. When growing the same leaf, Leaf-wise algorithm can reduce more loss than a level-wise algorithm. Below diagram explain the implementation of LightGBM.

Leaf-wise tree growth

LightGBM contains two novel techniques:GOSS and EFB.

GOSS,Gradient bases One-Side Sampling,keeps all the instances of large gradient and performs random sampling on the instances with small gradient.

EFB,Exclusive Feature Bundling,helps to bundle multiple features into a single feature without losing any information.

Features of LightGBM –

Features of LightGBM


LightGBM Implementation

Please find full implementation here.

Insert math as
Additional settings
Formula color
Text color
Type math using LaTeX
Nothing to preview