In many practical scenarios, observations are available on two or more variables, such as the heights and weights of people, expenditure on advertisement and sales revenue, or tweet likes and a person's popularity index.
A natural question is whether there is any connection between, say, a person's popularity and the number of likes on their tweets.
When two variables are related in such a way that a change in one is accompanied by a change in the other, the variables are said to be correlated. For example, expenditure on advertisement can increase sales revenue, so there is a correlation between these two variables.
There are two types of correlation –
Positive Correlation – The correlation between two variables is said to be positive if they move in the same direction: when one variable increases the other also increases, and when one decreases the other also decreases.
Amount of rainfall and yield of crops - If rainfall increases, crop yield tends to increase as well.
Expenditure on advertising and sales revenue - Sales revenue may increase if expenditure on advertising increases.
Negative Correlation – The correlation between two variables is said to be negative if they move in opposite directions: when one variable increases the other decreases, and vice versa.
Example: Price and demand of goods, literacy and poverty in a country.
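The direction of a relationship can be checked numerically by summing the products of each pair's deviations from the two means: a positive sum suggests positive correlation, a negative sum suggests negative correlation. The following is a minimal sketch; the function name `direction` is hypothetical and the rainfall, yield, price, and demand figures are made-up illustrative numbers, not real data.

```python
def direction(xs, ys):
    """Classify the direction of association from the sign of the
    summed products of deviations from the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "none"


# Hypothetical illustrative data (not real measurements).
rainfall = [50, 60, 70, 80, 90]
crop_yield = [20, 24, 27, 33, 36]
price = [10, 12, 14, 16, 18]
demand = [95, 80, 72, 60, 51]

print(direction(rainfall, crop_yield))  # positive
print(direction(price, demand))         # negative
```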
Scatter Plot –
A scatter plot is a statistical diagram that helps reveal whether two variables are positively or negatively correlated.
A scatter diagram gives an idea about the existence of correlation between two variables.
If the values of the dependent variable Y are plotted against the corresponding values of the independent variable X in the XY plane, the resulting diagram of dots is called a scatter diagram or dot diagram.
Note that a scatter diagram is not well suited to a large number of observations, because the dots overlap and the pattern becomes hard to read.
If the dots of a scatter diagram don’t show any trend, there is no correlation between the variables.
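To make the idea concrete, here is a rough text-only sketch of a scatter diagram that uses no plotting library. The function `ascii_scatter`, its `width` and `height` parameters, and the advertising and sales numbers are all assumptions made for illustration; the sketch also assumes the x values are not all equal and the y values are not all equal.

```python
def ascii_scatter(xs, ys, width=20, height=8):
    """Return a text scatter plot with one '*' per (x, y) point.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top line
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: advertising spend (X) and sales revenue (Y).
ad_spend = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [3, 4, 6, 5, 8, 9, 9, 12]

print(ascii_scatter(ad_spend, sales))
```

Dots rising from the lower left to the upper right suggest positive correlation; dots falling from the upper left to the lower right suggest negative correlation.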
Coefficient of Correlation
The coefficient of correlation measures the intensity or degree of the linear relationship between two variables. It was introduced by Karl Pearson and is denoted by r.
The correlation coefficient lies between -1 and +1.
If X and Y are two independent variables, then the correlation coefficient (r) between X and Y is zero: corr(X, Y) = 0. The converse does not hold in general, since r = 0 only rules out a linear relationship.
|Value of r||Interpretation|
|+1||Perfect Positive Correlation|
|-1||Perfect Negative Correlation|
|0||No Correlation|
|0 to .25||Weak Positive Correlation|
|.75 to 1||Strong Positive Correlation|
|-.25 to 0||Weak Negative Correlation|
|-.75 to -1||Strong Negative Correlation|
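Pearson's r can be computed from paired observations as the sum of the products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below assumes both variables have non-zero variance; the function name `pearson_r` and the sample numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: advertising spend and sales revenue.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

print(round(pearson_r(ad_spend, sales), 3))  # 0.775
print(pearson_r([1, 2, 3], [2, 4, 6]))       # 1.0 (perfect positive)
print(pearson_r([1, 2, 3], [6, 4, 2]))       # -1.0 (perfect negative)
```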
Coefficient of Determination-
The coefficient of determination is a measure used in statistical model analysis to assess how well a model explains and predicts outcomes.
It measures the proportion of the variation in the dependent variable that is explained by the regression function.
It is the square of the coefficient of correlation and is denoted by r².
The coefficient of determination ranges from 0 to 1.
A value of r² = 0 indicates that the dependent variable cannot be predicted from the independent variable by a linear model, whereas r² = 1 indicates that the dependent variable can be predicted from the independent variable without error.
If r² is .75, it implies that 75% of the variation in the dependent variable (Y) is explained by the regression function, that is, by the independent variable.
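As a small check of the "explained variation" reading, the sketch below fits a straight line by least squares and compares two quantities: the squared correlation coefficient, and one minus the ratio of residual to total sum of squares. For a simple linear regression with an intercept the two agree. The function names and the data are hypothetical, chosen only for illustration.

```python
def squared_correlation(xs, ys):
    """Square of Pearson's r, computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def explained_share(xs, ys):
    """Fit y = a + b*x by least squares; return 1 - SS_res / SS_tot."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx             # slope
    a = mean_y - b * mean_x   # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: advertising spend and sales revenue.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

print(round(squared_correlation(ad_spend, sales), 3))  # 0.6
print(round(explained_share(ad_spend, sales), 3))      # 0.6
```

Here 60% of the variation in the made-up sales figures is explained by the fitted line.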
If the variables are not linearly related and instead show a curvilinear relationship, the correlation coefficient is not a suitable measure of the strength of the relationship. The correlation ratio is used in such cases.
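A quick illustration: for Y = X² on values of X that are symmetric about zero, Y is completely determined by X, yet r is zero. The correlation ratio (eta), computed here as the square root of the between-group sum of squares over the total sum of squares with observations grouped by their X value, picks up the relationship. This is a sketch under assumptions: grouping by exact X value is a simplification (continuous data would usually be binned first), and the function names are hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_ratio(xs, ys):
    """Correlation ratio (eta) of y on x, grouping y by exact x value."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    mean_y = sum(ys) / len(ys)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - mean_y) ** 2 for g in groups.values()
    )
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(ss_between / ss_total)


xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # a purely curvilinear relationship

print(pearson_r(xs, ys))          # 0.0
print(correlation_ratio(xs, ys))  # 1.0
```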
Well, we look forward to meeting you in the next article.