Introduction to Feature Scaling

• Feature scaling is a technique used to normalise the range of the independent variables (features) of a dataset.
• It adjusts features measured on different scales so that no feature biases the model simply because of its magnitude.
• It is generally performed during the data preprocessing step.
• It is also known as data normalization.

Let’s say we want to see how an athlete’s running speed depends on their weight and height.

Height ranges from, say, 4 feet to 6.5 feet, while weight ranges from, say, 40 kg to 130 kg.
If we feed these features to the model as they are, a model that compares raw magnitudes (for example, a distance-based one) will give more importance to weight than to height, simply because the weight values are numerically larger. This may not give us the desired results.

To overcome this, we rescale or standardise the two features so that their ranges are comparable, and feed the rescaled values to the model.
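The effect can be sketched with a tiny example. The three athletes below are made-up values chosen only for illustration; the ranges are the ones quoted above. On the raw data, the athlete nearest to A (by Euclidean distance) is decided almost entirely by weight, while after min-max rescaling height gets a comparable say.

```python
import math

# Ranges taken from the text above.
HEIGHT_RANGE_FT = (4.0, 6.5)
WEIGHT_RANGE_KG = (40.0, 130.0)

# Hypothetical athletes: (height in feet, weight in kg).
athletes = {"A": (5.0, 60.0), "B": (6.5, 62.0), "C": (5.1, 70.0)}


def rescale(athlete):
    """Map height and weight to [0, 1] using the ranges above."""
    height, weight = athlete
    h_lo, h_hi = HEIGHT_RANGE_FT
    w_lo, w_hi = WEIGHT_RANGE_KG
    return ((height - h_lo) / (h_hi - h_lo), (weight - w_lo) / (w_hi - w_lo))


def nearest_to_a(points):
    """Return which of B or C is closest to A in Euclidean distance."""
    return min(("B", "C"), key=lambda name: math.dist(points["A"], points[name]))


raw_nearest = nearest_to_a(athletes)
scaled_nearest = nearest_to_a({name: rescale(p) for name, p in athletes.items()})

print(raw_nearest)     # B: similar weight, but 1.5 ft taller
print(scaled_nearest)  # C: similar height, moderately heavier
```

On the raw values, the 10 kg gap to C outweighs the 1.5 ft gap to B, even though 1.5 ft covers more than half of the height range.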

Methods used in Feature Scaling:

1. Min-Max Normalization
2. Mean Normalization
3. Standardization (Z-score Normalization)
4. Unit Vector
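Mean normalization and unit-vector scaling are only named in this list, so here is a minimal sketch of both under their commonly used definitions: mean normalization computes (x − mean) / (max − min) for a feature, and unit-vector scaling divides a vector by its Euclidean length. The function names and sample values are illustrative assumptions.

```python
import math
from statistics import mean


def mean_normalize(values):
    """Centre a feature on its mean and divide by its range (max - min)."""
    mu, lo, hi = mean(values), min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return [(v - mu) / (hi - lo) for v in values]


def to_unit_vector(vector):
    """Divide a vector by its Euclidean (L2) norm so its length becomes 1."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [v / norm for v in vector]


# Hypothetical weights in kg (one feature across four athletes).
centred = mean_normalize([40.0, 70.0, 100.0, 130.0])

# Hypothetical single sample: (height in feet, weight in kg).
unit = to_unit_vector([3.0, 4.0])
```

Mean normalization gives values centred on 0 that span a total range of 1. Unit-vector scaling is usually applied to one sample's feature vector rather than to a column.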

Min-Max Normalisation:

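It is the simplest method and consists of rescaling the range of each feature to [0, 1] or [−1, 1]. The general formula for min-max scaling to [0, 1] is given as:

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$$

where $x$ is an original value and $x'$ is the scaled value.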
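As a minimal sketch in plain Python: subtract the feature's minimum and divide by its range, then optionally stretch the result to another target interval such as [−1, 1]. The function name, the `new_min`/`new_max` parameters and the sample heights are assumptions made for illustration.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]


# Hypothetical heights in feet.
heights_ft = [4.0, 5.0, 5.5, 6.5]

scaled_01 = min_max_scale(heights_ft)             # target range [0, 1]
scaled_pm1 = min_max_scale(heights_ft, -1.0, 1.0)  # target range [-1, 1]
```

Because the minimum and maximum are taken from the data itself, a single outlier can compress all the other values into a narrow band.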

Standardization:

Standardization (also called z-score normalization) transforms your data such that the resulting distribution has a mean of 0 and a standard deviation of 1.
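In formula form this is usually written z = (x − μ) / σ, where μ is the feature's mean and σ its standard deviation. The sketch below assumes the population standard deviation (dividing by n); some tools use the sample version (dividing by n − 1) instead. The function name and sample weights are illustrative.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: subtract the mean, divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


# Hypothetical weights in kg.
weights_kg = [40.0, 70.0, 100.0, 130.0]
z_scores = standardize(weights_kg)

print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))
```

Unlike min-max scaling, the result is not confined to a fixed interval; values far from the mean simply get large positive or negative z-scores.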

Examples of Algorithms where Feature Scaling matters:

1. K-Means: uses the Euclidean distance measure, so feature scaling matters.
2. K-Nearest-Neighbours: also requires feature scaling, since it too relies on distances.
3. Principal Component Analysis (PCA): feature scaling is required here too, because features with larger variance would otherwise dominate the components.
4. Gradient Descent: convergence speeds up after feature scaling.
5. Neural Networks: Feature scaling is required here.
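The gradient descent point can be illustrated with a toy linear regression. Everything below is an assumption made for the sketch: the data are made-up, already centred and uncorrelated; the target coefficients are invented; and the step size is set to 1 / (largest mean squared feature value), which is a safe choice only because the toy features are uncorrelated. With unscaled features that step size is dictated by the large-valued feature, so the small-valued one converges very slowly.

```python
from statistics import mean, pstdev


def standardize(column):
    """Z-score a feature column using the population standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


def steps_to_converge(columns, targets, tol=1e-6, max_steps=200_000):
    """Count gradient descent steps until every gradient component is below tol."""
    n = len(targets)
    # Assumed step size: inverse of the largest mean squared feature value.
    learning_rate = 1.0 / max(mean([v * v for v in col]) for col in columns)
    weights = [0.0] * len(columns)
    for step in range(1, max_steps + 1):
        # Residuals of the current linear model (no intercept, data are centred).
        residuals = [
            sum(w * col[i] for w, col in zip(weights, columns)) - targets[i]
            for i in range(n)
        ]
        # Gradient of 0.5 * mean(residual ** 2) with respect to each weight.
        gradient = [sum(r * x for r, x in zip(residuals, col)) / n for col in columns]
        if max(abs(g) for g in gradient) < tol:
            return step
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return max_steps


# Hypothetical deviations from the average height (ft) and weight (kg).
height_dev_ft = [-0.5, 0.5, -0.5, 0.5]
weight_dev_kg = [-30.0, -30.0, 30.0, 30.0]
# Invented target: speed as an exact linear function of the two features.
speed = [2.0 * h + 0.1 * w for h, w in zip(height_dev_ft, weight_dev_kg)]

raw_steps = steps_to_converge([height_dev_ft, weight_dev_kg], speed)
scaled_steps = steps_to_converge(
    [standardize(height_dev_ft), standardize(weight_dev_kg)], speed
)
print(raw_steps, scaled_steps)
```

On this toy problem the standardized run needs only a handful of steps, while the raw run needs tens of thousands. The size of the gap depends heavily on the data and on how the step size is chosen.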

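Note: Naive Bayes, Linear Discriminant Analysis, and tree-based models are not affected by feature scaling.
In short, algorithms that do not depend on distances between data points are largely unaffected by feature scaling; gradient-based methods such as gradient descent and neural networks are the main exception, since scaling still helps them converge.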
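For tree-based models, the reason is that a split only asks whether a value lies above or below a threshold, and rescaling preserves that ordering. The sketch below uses a one-level decision stump on made-up data (the labels and helper names are illustrative assumptions) and checks that the chosen split groups the samples identically before and after min-max scaling.

```python
def misclassified(labels):
    """Errors made when predicting the majority class for a group of 0/1 labels."""
    ones = sum(labels)
    return min(ones, len(labels) - ones)


def best_stump_partition(values, labels):
    """Return the sample indices sent left by the best single-threshold split."""
    best = None
    for threshold in sorted(set(values)):
        left = [i for i, v in enumerate(values) if v <= threshold]
        right = [i for i, v in enumerate(values) if v > threshold]
        errors = misclassified([labels[i] for i in left]) + misclassified(
            [labels[i] for i in right]
        )
        # Keep the first threshold that achieves the lowest error count.
        if best is None or errors < best[0]:
            best = (errors, left)
    return best[1]


# Hypothetical weights (kg) and made-up labels: 1 = fast runner, 0 = not.
weights_kg = [40.0, 55.0, 70.0, 90.0, 110.0, 130.0]
is_fast = [1, 1, 1, 0, 0, 0]

lo, hi = min(weights_kg), max(weights_kg)
weights_scaled = [(w - lo) / (hi - lo) for w in weights_kg]

raw_partition = best_stump_partition(weights_kg, is_fast)
scaled_partition = best_stump_partition(weights_scaled, is_fast)
print(raw_partition == scaled_partition)  # True
```

The threshold value itself changes with the scaling, but the samples that fall on each side of it do not.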
