Descriptive Statistics Part-2

Earlier we discussed how the mean is calculated and why it is useful in statistics and data science. Here we will talk about the median, so let's start. Median: The median is another important measure of central tendency. It also helps to represent our dataset with a single number. Median represents that Read more…

Performance Measures - Part 3

Here we discuss the ROC-AUC curve (Receiver Operating Characteristic, Area Under the Curve) and how it is used to judge the efficiency of a model. ROC-AUC: The ROC curve is a graphical plot that illustrates the diagnostic ability of a model. ROC is a probability curve, created by plotting the TPR (True Positive Rate) and FPR (False Positive Read more…
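To make the TPR/FPR idea concrete, here is a minimal sketch in plain Python. The function names `roc_points` and `auc`, the toy labels and the scores are all made up for illustration; they are not taken from the post. It assumes binary labels (1 = positive, 0 = negative) with at least one example of each class.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold.

    Hypothetical helper: assumes labels are 0/1 and both classes occur.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC points, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data, invented for this example.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
print(points)
print(auc(points))
```

Each point is one threshold on the model's score. The curve always starts at (0, 0) and ends at (1, 1), and the area under it is larger the more often positives are scored above negatives.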

Performance Measures - Part 2

There are performance measures other than accuracy, which we went through in an earlier blog post. These are Precision, Recall, F score and ROC. We are going to discuss them one by one, with an example and the use of each. Precision: Of the transactions classified as positive (fraudulent), how Read more…

Performance Measures - Part 1

When we say that the “model is ready”, there should be some techniques or methods to evaluate its readiness before it is deployed to production. We are going to learn the techniques below for evaluating a machine learning model: Confusion Matrix, Accuracy, Precision, Recall, F1 Score Read more…


CatBoost is an algorithm for gradient boosting on decision trees, developed by Yandex researchers and engineers. It is the first Russian machine learning technology that is open source. It is widely used within the company for ranking tasks, forecasting and making recommendations. It is universal and can be applied across a wide range of areas Read more…

LightGBM

Nowadays most people use XGBoost, LightGBM or CatBoost to win competitions on Kaggle or at hackathons. XGBoost, a famous algorithm among Kagglers, is not satisfactorily efficient when the feature dimension is high and the data size is large. LightGBM is a powerful algorithm when big data comes into the picture. Why LightGBM? Read more…


Nowadays most people use XGBoost, LightGBM or CatBoost to win competitions on Kaggle or at hackathons. AdaBoost is the first step into the world of boosting. AdaBoost: AdaBoost, short for Adaptive Boosting, was formulated by Yoav Freund and Robert Schapire, who won the 2003 Read more…

Difference between Boosting and Bagging

Ensemble learning needs to be discussed before looking at the difference between Bagging and Boosting. Sometimes it is not sufficient to depend on the results of just one machine learning model. Ensemble models combine the predictive power of multiple weak learners. The result is a single model which gives the aggregated Read more…


Entropy: It defines the randomness in the data. It helps to find the root node, intermediate nodes and leaf nodes when building a decision tree. It is just a metric that measures impurity. It reaches its minimum (zero) when all cases in the node fall into a single target Read more…
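As a quick illustration of the "minimum (zero)" point, here is a small sketch of Shannon entropy for the labels in one node. It assumes base-2 logarithms, which is a common choice but not something stated in the excerpt. The function name `entropy` and the toy labels are invented for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of the class labels in one node."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of -p * log2(p) over the classes that actually occur in the node.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Toy nodes, invented for illustration.
pure_node = ["fraud"] * 6                  # every case has the same target
mixed_node = ["fraud"] * 3 + ["ok"] * 3    # evenly split between two targets

print(entropy(pure_node))
print(entropy(mixed_node))
```

The pure node has entropy 0, while the evenly split two-class node has entropy 1 bit, the maximum for two classes. A decision tree prefers splits that move child nodes towards the pure case.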
