In many practical scenarios, observations are available on two or more variables, such as the heights and weights of people, advertising expenditure and sales revenue, or the number of likes on a person's tweets and that person's popularity index.

A natural question then arises: is there any connection between a person's popularity and the number of likes on their tweets?

When two variables are related in such a way that a change in the value of one is accompanied by a change in the value of the other, the variables are said to be correlated. For example, higher advertising expenditure tends to go with higher sales revenue, so the two are correlated.

There are two types of correlation:

Positive Correlation

Correlation between two variables is said to be positive if an increase (or decrease) in the value of one variable is accompanied by an increase (or decrease) in the value of the other.

Example: Rainfall and crop yield. As rainfall increases, crop yield also increases.

Negative Correlation

Correlation between two variables is said to be negative if an increase in the value of one variable is accompanied by a decrease in the value of the other, and vice versa.

Example: Literacy and poverty in a country. As the literacy rate increases, the poverty rate falls.
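The direction of a relationship can be checked numerically from the sign of the summed products of deviations from the means. The sketch below is only an illustration: the function name `direction` is hypothetical, and the rainfall, yield, literacy and poverty figures are invented, not real data.

```python
def direction(xs, ys):
    """Classify the direction of co-movement from the sign of the
    summed products of deviations from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "none"

# Hypothetical figures, invented for illustration only.
rainfall = [60, 75, 90, 110, 130]        # mm
crop_yield = [1.8, 2.1, 2.6, 3.0, 3.3]   # tonnes per hectare
literacy = [55, 62, 70, 81, 90]          # percent
poverty = [42, 37, 30, 22, 15]           # percent

print(direction(rainfall, crop_yield))   # positive
print(direction(literacy, poverty))      # negative
```

The sign only gives the direction; how strong the relationship is requires the correlation coefficient.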

Scatter Plot

A scatter plot (or scatter diagram) is a statistical diagram that gives an idea of whether a correlation exists between two variables and whether it is positive or negative.

If the values of the dependent variable Y are plotted against the corresponding values of the independent variable X in the XY plane, the resulting diagram of dots is called a scatter diagram or dot diagram.

Note that a scatter diagram is not well suited to a very large number of observations, because the dots overlap and obscure the pattern.

Below are scatter plots showing different types of correlation.

• Positive Correlation
• Negative Correlation

If the dots of a scatter diagram show no trend, there is no correlation between the variables, as shown below.
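As a rough stand-in for a plotted figure, the sketch below renders points as a text grid. The helper `ascii_scatter` is a hypothetical name and the data are made up; in practice a plotting library would normally be used instead.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Render (x, y) pairs as a text grid, with y increasing upwards."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate onto the grid; "or 1" guards a zero range.
        col = int((x - x_min) / ((x_max - x_min) or 1) * (width - 1))
        row = int((y - y_min) / ((y_max - y_min) or 1) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line) for line in grid)

# Made-up data for illustration.
xs = list(range(10))
rising = [2 * x + 1 for x in xs]     # positive correlation
falling = [20 - 2 * x for x in xs]   # negative correlation

print(ascii_scatter(xs, rising))
print()
print(ascii_scatter(xs, falling))
```

In the first grid the dots climb from the lower left to the upper right (positive correlation); in the second they fall from the upper left to the lower right (negative correlation).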

Coefficient of Correlation

The coefficient of correlation measures the intensity, or degree, of the linear relationship between two variables. It was introduced by Karl Pearson and is denoted by r.

The correlation between variables x and y is given by

r = Corr(x,y) = \frac{Cov(x,y)}{\sqrt{V(x)}\sqrt{V(y)}} \qquad (1)

where Cov(x,y) is the covariance between x and y, given by

Cov(x,y) = \frac{\sum(x-\bar{x})(y-\bar{y})}{n}

V(x) is the variance of x, given by

V(x) = \frac{\sum(x-\bar{x})^2}{n}

and V(y) is the variance of y, given by

V(y) = \frac{\sum(y-\bar{y})^2}{n}

Here \bar{x} and \bar{y} are the means of x and y, and n is the number of observations.

Substituting these values into equation (1) gives the correlation formula

r = \frac{\sum(x-\bar{x})(y-\bar{y})/n}{\sqrt{\sum(x-\bar{x})^2/n}\,\sqrt{\sum(y-\bar{y})^2/n}} = \frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^2}\,\sqrt{\sum(y-\bar{y})^2}}
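The formulas above translate almost directly into code. The following is a minimal sketch using only the Python standard library; the function names and the sample data are illustrative assumptions, and the divisor is n, as in the definitions above. Note that r is undefined (division by zero) if either variable is constant.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Cov(x, y): mean product of deviations from the means (divides by n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    """V(x): mean squared deviation from the mean, i.e. Cov(x, x)."""
    return covariance(xs, xs)

def pearson_r(xs, ys):
    """r = Cov(x, y) / (sqrt(V(x)) * sqrt(V(y)))."""
    return covariance(xs, ys) / (math.sqrt(variance(xs)) * math.sqrt(variance(ys)))

# Made-up sample data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(round(pearson_r(x, y), 4))   # 0.7746
```

For this data the deviations give \sum(x-\bar{x})(y-\bar{y}) = 6, \sum(x-\bar{x})^2 = 10 and \sum(y-\bar{y})^2 = 6, so r = 6/\sqrt{60} \approx 0.7746.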

The correlation coefficient lies between -1 and +1.

If X and Y are independent variables, the correlation coefficient (r) between X and Y is zero:

Corr(x,y)=0
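The converse does not hold: a correlation of zero does not by itself mean the variables are independent, because r only captures linear relationships. The sketch below illustrates this with made-up data in which y is completely determined by x, yet r is zero; `pearson_r` is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations (the factors of n cancel)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]   # y depends entirely on x, but not linearly

print(pearson_r(x, y))    # 0.0
```

The positive and negative products of deviations cancel exactly because the x values are symmetric about zero.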

The table below gives the meaning of different values of the correlation coefficient.

| r | Correlation is |
|---|---|
| +1 | Perfect Positive Correlation |
| -1 | Perfect Negative Correlation |
| 0 | No Correlation |
| 0 to .25 | Weak Positive Correlation |
| .75 to 1 | Strong Positive Correlation |
| -.25 to 0 | Weak Negative Correlation |
| -.75 to -1 | Strong Negative Correlation |
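The table above can be expressed as a small lookup function. This is a sketch under stated assumptions: the name `describe_r` is hypothetical, the band edges .25 and .75 are treated as inclusive, and values with magnitude strictly between .25 and .75 are reported as unclassified because the table gives no label for them.

```python
def describe_r(r):
    """Map a correlation coefficient to the label used in the table."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 1:
        return "Perfect Positive Correlation"
    if r == -1:
        return "Perfect Negative Correlation"
    if r == 0:
        return "No Correlation"
    sign = "Positive" if r > 0 else "Negative"
    size = abs(r)
    if size <= 0.25:          # assumption: .25 itself counts as weak
        return f"Weak {sign} Correlation"
    if size >= 0.75:          # assumption: .75 itself counts as strong
        return f"Strong {sign} Correlation"
    # The table lists no band for .25 < |r| < .75.
    return "Not classified by the table"

for value in (1, 0.9, 0.1, 0, -0.2, -0.8, 0.5):
    print(value, "->", describe_r(value))
```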

Coefficient of Determination

The coefficient of determination is a measure used in statistical model analysis to assess how well a model explains and predicts future outcomes.

It measures the proportion of the variation in the dependent variable that is explained by the regression function. It is the square of the correlation coefficient and is denoted by r^2.

If the coefficient of determination is 0, the dependent variable cannot be predicted from the independent variable, whereas r^2 = 1 indicates that the dependent variable can be predicted from the independent variable without error.

If r^2 is .75, then 75% of the variation in the dependent variable (Y) can be explained by the independent variable.
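To make the "explained variation" reading concrete, the sketch below assumes a simple least-squares straight line as the regression function and checks that the squared correlation equals one minus the ratio of residual variation to total variation. The data and variable names are made up for illustration.

```python
import math

# Made-up sample data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)   # total variation in y

r = sxy / math.sqrt(sxx * syy)

# Least-squares line: y = intercept + slope * x
slope = sxy / sxx
intercept = my - slope * mx

# Variation in y left unexplained by the fitted line
ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained = 1 - ss_res / syy

print(round(r ** 2, 4), round(explained, 4))   # 0.6 0.6
```

Here r^2 = 0.6, so under this linear fit 60% of the variation in y is explained by x.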

Correlation Ratio

If the variables are not linearly related but show a curvilinear relationship, the correlation coefficient is not a suitable measure of the extent of the relationship. The correlation ratio is used in such cases.
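The text does not give a formula for the correlation ratio, so the sketch below assumes the usual definition: observations are grouped by their x value, and the ratio (eta) is the square root of the share of the total variation in y that lies between the group means. The function names and the data are illustrative, and the data are invented to follow a roughly quadratic pattern.

```python
import math
from collections import defaultdict

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_ratio(xs, ys):
    """Eta of y on x, assuming the between-group / total variation definition."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    overall = sum(ys) / len(ys)
    # Variation of the group means around the overall mean, weighted by group size
    between = sum(len(g) * (sum(g) / len(g) - overall) ** 2 for g in groups.values())
    total = sum((y - overall) ** 2 for y in ys)
    return math.sqrt(between / total)

# Invented data: two observations per x value, roughly y = x^2.
x = [-2, -2, -1, -1, 0, 0, 1, 1, 2, 2]
y = [4.1, 3.9, 1.2, 0.8, 0.1, -0.1, 0.9, 1.1, 3.8, 4.2]

print(round(pearson_r(x, y), 4))
print(round(correlation_ratio(x, y), 4))
```

For this data the correlation coefficient is essentially zero while the correlation ratio is close to 1, which reflects a strong but non-linear relationship. With only one observation per x value the ratio is always 1, so repeated observations are needed for it to be informative.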

