Entropy:

  • It measures the randomness in the data.
  • It helps to identify the root node, intermediate nodes, and leaf nodes while building the decision tree.
  • It is simply a metric that measures impurity.
  • It reaches its minimum (zero) when all cases in a node belong to a single target class, and its maximum when the cases are split equally between the classes.

Entropy can be calculated using the formula below, where pᵢ is the probability of class i in the node:

E(S) = -Σᵢ pᵢ * log₂(pᵢ)

Fig-1 Entropy

In the graph above, H(X) is the entropy of a binary variable; it is maximum when the probability is 0.5 and minimum (zero) when the probability is either 0 or 1.

Que: How do we calculate the entropy if there are ten records in the dataset?

Fig-2 Calculate the Entropy
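
As a minimal sketch of this calculation in Python, the helper below computes the entropy of a node from its class counts. The 6 Yes / 4 No split is a hypothetical example for ten records; the actual split in Fig-2 may differ:

```python
from math import log2

def entropy(class_counts):
    """Entropy of a node given the count of records in each class."""
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]  # skip empty classes
    return -sum(p * log2(p) for p in probs)

# Hypothetical ten-record dataset: 6 "Yes" and 4 "No" records.
print(entropy([6, 4]))   # ~0.971
print(entropy([5, 5]))   # 1.0 -> maximum impurity (even split)
print(entropy([10, 0]))  # 0.0 -> pure node
```

The last two calls confirm the bullet points above: an even split gives maximum entropy, and a pure node gives zero.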

Information Gain:

  • It measures the reduction in Entropy.
  • The greater the reduction in entropy, the higher the information gain.
  • It decides which feature is selected as the root or an intermediate node of the decision tree.

Information Gain can be calculated using the formula below:
Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]
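
As a sketch, the same formula in Python; the entropy helper is repeated for self-containment, and the parent/child splits passed at the bottom are illustrative placeholders rather than values from the dataset:

```python
from math import log2

def entropy(class_counts):
    """Entropy of a node from its per-class record counts."""
    total = sum(class_counts)
    return -sum(c / total * log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, children_counts):
    """Entropy(S) minus the weighted average entropy of the child nodes."""
    n = sum(parent_counts)
    weighted_avg = sum(sum(child) / n * entropy(child) for child in children_counts)
    return entropy(parent_counts) - weighted_avg

# Illustrative split: a parent with 8 Yes / 4 No divided into two children.
print(information_gain([8, 4], [[6, 1], [2, 3]]))  # ~0.17
```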

We are going to discuss the following terms and then prepare the decision tree:

Entropy – We have already discussed the use of entropy and the formula to calculate it. Decision tree nodes are split until we reach minimum or zero entropy, so entropy is calculated at each iteration, and a node is treated as a leaf when its entropy is lowest or zero.
Information – (Weighted average) * (Entropy of each category of the feature).
Information Gain – The reduction in entropy. It decides which feature is treated as the root node or an intermediate node.

Let us calculate the entropy and information gain for the dataset below:

Fig-3 Dataset

There are three independent columns (Outlook, Humidity, Wind) and one target column (Play) in the dataset.

Que: How do we find which feature becomes the root or an intermediate node, and prepare a decision tree that decides whether the child will play outside or not?

To decide the root node:
Step 1: Find the entropy of the whole dataset.

We have fourteen rows in the dataset, of which nine are “Yes” records and five are “No” records. The probability of “Yes” is therefore 9/14, and similarly the probability of “No” is 5/14.


Putting the probabilities of Yes and No into the entropy formula:

E(S) = -P(Yes)*log₂(P(Yes)) - P(No)*log₂(P(No))
E(S) = -(9/14)*log₂(9/14) - (5/14)*log₂(5/14)
E(S) ≈ 0.94
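
A quick sketch to verify this number in Python:

```python
from math import log2

p_yes, p_no = 9 / 14, 5 / 14
e_s = -p_yes * log2(p_yes) - p_no * log2(p_no)
print(round(e_s, 2))  # 0.94
```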

Now we are going to calculate the entropy (impurity) of each individual feature (Wind, Humidity, Outlook).

Outlook feature: There are three categories in this feature, Sunny, Overcast, and Rain; their counts in the dataset are below.
Sunny: 5
Overcast: 4
Rain: 5
Each category is further broken down with respect to the target variable, and the entropy and information are calculated as below (a code sketch follows the figure):

Fig-4 Calculate Entropy for Outlook
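
As a sketch, the information of Outlook is the weighted average of the per-category entropies. The per-category Yes/No splits below (Sunny 2/3, Overcast 4/0, Rain 3/2) are assumed from the classic play-tennis dataset; the same function applies unchanged to Wind and Humidity:

```python
from math import log2

def entropy(class_counts):
    total = sum(class_counts)
    return -sum(c / total * log2(c / total) for c in class_counts if c > 0)

def feature_information(category_counts):
    """Weighted average entropy over a feature's categories."""
    n = sum(sum(cat) for cat in category_counts)
    return sum(sum(cat) / n * entropy(cat) for cat in category_counts)

# Assumed [Yes, No] splits for Sunny, Overcast, and Rain respectively.
outlook = [[2, 3], [4, 0], [3, 2]]
print(round(feature_information(outlook), 3))  # ~0.694
```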

Wind feature: There are two categories in this feature, Strong and Weak; their counts are below.
Strong: 6
Weak: 8

Fig-5 Calculate Entropy for Wind

Humidity feature: There are two categories in this feature, High and Normal; their counts are below.
High: 7
Normal: 7

Fig-6 Entropy for Humidity
  • The entropy of the whole dataset is already known: 0.94.
  • The information of each feature is also known: 0.699 (Outlook), 0.788 (Humidity), and 0.892 (Wind).
  • Information gain is the difference between the dataset entropy and the information of each feature (see the sketch after Fig-7).
Fig-7 Information and Information Gain for each feature
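
A minimal sketch of this final comparison, assuming the three information values map to Outlook, Humidity, and Wind respectively:

```python
dataset_entropy = 0.94
information = {"Outlook": 0.699, "Humidity": 0.788, "Wind": 0.892}

# Information gain = dataset entropy - information of the feature.
gains = {f: round(dataset_entropy - i, 3) for f, i in information.items()}
print(gains)                      # {'Outlook': 0.241, 'Humidity': 0.152, 'Wind': 0.048}
print(max(gains, key=gains.get))  # Outlook -> chosen as the root node
```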

The feature with the highest information gain is chosen as the root node of the decision tree, so here Outlook must be the root node. We then repeat the same approach on each branch to find the intermediate and leaf nodes. The final tree looks like this (a scikit-learn cross-check follows the figure):

Fig-8 Decision Tree
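
To cross-check the hand calculation end to end, here is a sketch using scikit-learn's DecisionTreeClassifier with criterion="entropy". The fourteen rows are reconstructed on the assumption that Fig-3 is the classic play-tennis dataset, so treat them as illustrative:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumed reconstruction of the 14-row dataset from Fig-3.
data = pd.DataFrame({
    "Outlook":  ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                 "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Humidity": ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                 "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Wind":     ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                 "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Play":     ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                 "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})

# One-hot encode the categorical features so sklearn can split on them.
X = pd.get_dummies(data[["Outlook", "Humidity", "Wind"]])
y = data["Play"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```

With these rows, the first split lands on an Outlook dummy, consistent with Outlook having the highest information gain in the hand calculation.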
