Rajandran R Creator of OpenAlgo - OpenSource Algo Trading framework for Indian Traders. Telecom Engineer turned Full-time Derivative Trader. Mostly Trading Nifty, Banknifty, High Liquid Stock Derivatives. Trading the Markets Since 2006 onwards. Using Market Profile and Orderflow for more than a decade. Designed and published 100+ open source trading systems on various trading tools. Strongly believe that market understanding and robust trading frameworks are the key to the trading success. Building Algo Platforms, Writing about Markets, Trading System Design, Market Sentiment, Trading Softwares & Trading Nuances since 2007 onwards. Author of Marketcalls.in

Feature Scaling – Normalization Vs Standardization Explained in Simple Terms – Machine Learning Basics


Feature scaling is a preprocessing technique used in machine learning to standardize or normalize the range of independent variables (features) in a dataset. The primary goal of feature scaling is to ensure that no particular feature dominates the others due to differences in the units or scales. By transforming the features to a common scale, it helps improve the performance, stability, and convergence speed of machine learning algorithms.

Some machine learning algorithms are sensitive to the scale of input features, especially those that rely on distances or similarity measures between data points (e.g., k-Nearest Neighbors, Support Vector Machines) or that are trained with gradient descent (e.g., Neural Networks). If features have different scales, such an algorithm may give more weight to features with larger numeric ranges, leading to suboptimal performance.
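
To make the scale problem concrete, here is a minimal sketch with made-up numbers. The four samples and their age and income columns are hypothetical and chosen only for illustration. It compares Euclidean distances before and after rescaling each column to [0, 1]:

```python
import numpy as np

# Hypothetical samples (illustrative only): columns are [age in years, annual income]
X = np.array([
    [25.0, 50_000.0],  # a
    [60.0, 51_000.0],  # b: very different age, similar income
    [26.0, 58_000.0],  # c: similar age, somewhat different income
    [40.0, 90_000.0],  # d: widens the income range
])

def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(p - q))

# Distances from sample a on the raw features: the income column dominates
raw_ab = euclidean(X[0], X[1])
raw_ac = euclidean(X[0], X[2])

# Rescale each column to [0, 1] so both features contribute comparably
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
scaled_ab = euclidean(X_scaled[0], X_scaled[1])
scaled_ac = euclidean(X_scaled[0], X_scaled[2])

print(f"raw:    a-b = {raw_ab:.1f}, a-c = {raw_ac:.1f}")
print(f"scaled: a-b = {scaled_ab:.3f}, a-c = {scaled_ac:.3f}")
```

On the raw features, b appears closer to a than c does, even though b differs from a by 35 years of age, because the income differences are numerically much larger. After scaling, the ordering reverses and c becomes the nearer neighbour.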

Normalization and Standardization are two common techniques used in data preprocessing to scale and transform numerical features in a dataset. They help in handling different feature scales and improving the performance of machine learning algorithms. Here’s a brief explanation of each technique, followed by a Python example:

Normalization (Min-Max Scaling):

Normalization, also known as Min-Max Scaling, rescales each feature to a specific range, usually [0, 1], while preserving the relative spacing between its values. It is calculated using the following formula:

normalized_value = (value - min) / (max - min)

By rescaling the features to a common range, the Min-Max Scaler helps improve the performance of machine learning algorithms that are sensitive to the scale of input features, such as k-Nearest Neighbors, Neural Networks, and Gradient Descent-based algorithms.
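
As a quick numeric check of the formula, the sketch below applies it by hand to a small made-up column (the four values are an assumption for illustration) and compares the result with sklearn's MinMaxScaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative single-feature column (values are made up)
values = np.array([[10.0], [20.0], [25.0], [50.0]])

# Apply (value - min) / (max - min) column-wise
col_min = values.min(axis=0)
col_max = values.max(axis=0)
manual = (values - col_min) / (col_max - col_min)

# The same transformation via scikit-learn
sk_scaled = MinMaxScaler().fit_transform(values)

print(manual.ravel())     # values 0, 0.25, 0.375 and 1
print(sk_scaled.ravel())  # same values as the manual calculation
```

The smallest value maps to 0, the largest to 1, and everything else lands proportionally in between.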

Python example:

Here’s a Python code example using matplotlib and sklearn to plot data before and after normalization. In this example, we generate random data points and then normalize them using Min-Max scaling.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler

# Generate random data
np.random.seed(42)
data = np.random.randint(0, 100, (50, 2))

# Normalize data
scaler = MinMaxScaler()
normalized_data = scaler.fit_transform(data)

# Plot before normalization
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.scatter(data[:, 0], data[:, 1], color='blue', label='Original Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('Before Normalization')

# Plot after normalization
plt.subplot(1, 2, 2)
plt.scatter(normalized_data[:, 0], normalized_data[:, 1], color='green', label='Normalized Data')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.title('After Normalization')

plt.show()

[Figure: scatter plots of the data before and after normalization]

Standardization (Z-score normalization):

Standardization, also known as Z-score normalization, is a feature scaling technique that transforms each numerical feature to have zero mean and unit variance. This helps improve the performance of machine learning algorithms that are sensitive to the scale of input features. It is calculated using the following formula:

standardized_value = (value - mean) / standard_deviation
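
As a quick numeric check of the formula, the sketch below applies it by hand to a small made-up column (the eight values are an assumption, picked so the mean is 5 and the standard deviation is 2) and compares the result with sklearn's StandardScaler:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative single-feature column (values are made up): mean 5, std 2
values = np.array([[2.0], [4.0], [4.0], [4.0], [5.0], [5.0], [7.0], [9.0]])

mean = values.mean(axis=0)
std = values.std(axis=0)  # population standard deviation (ddof=0), as StandardScaler uses

# Apply (value - mean) / standard_deviation column-wise
manual = (values - mean) / std

# The same transformation via scikit-learn
sk_scaled = StandardScaler().fit_transform(values)

print(manual.ravel())     # values -1.5, -0.5, -0.5, -0.5, 0, 0, 1 and 2
print(sk_scaled.ravel())  # same values as the manual calculation
```

Note that the result is not confined to a fixed range: each value is expressed as its distance from the mean, measured in standard deviations.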

Python example:

Here’s an example using the matplotlib library to visualize the dataset before and after standardization. This example uses a synthetic dataset with two numerical features.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# Create a synthetic dataset
np.random.seed(42)
feature1 = np.random.normal(20, 5, 100)
feature2 = np.random.normal(100, 20, 100)
data = np.column_stack((feature1, feature2))

# Standardize the data
scaler = StandardScaler()
standardized_data = scaler.fit_transform(data)

# Create a plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Plot the original data
ax1.scatter(data[:, 0], data[:, 1], label='Original Data')
ax1.set_title('Before Standardization')
ax1.set_xlabel('Feature 1')
ax1.set_ylabel('Feature 2')
ax1.legend()

# Plot the standardized data
ax2.scatter(standardized_data[:, 0], standardized_data[:, 1], label='Standardized Data', c='r')
ax2.set_title('After Standardization')
ax2.set_xlabel('Feature 1')
ax2.set_ylabel('Feature 2')
ax2.legend()

# Show the plot
plt.show()

[Figure: scatter plots of the data before and after standardization]

When to Use Normalization and Standardization during Preprocessing in Machine Learning?

Choosing when to use Normalization or Standardization during preprocessing in Machine Learning depends on the characteristics of the dataset and the requirements of the algorithm being used. Here are some guidelines to help you make the right decision:

  1. Normalization (Min-Max Scaling):
  • Use when the data does not follow a Gaussian distribution, or when the minimum and maximum values are known and meaningful.
  • Useful when the algorithm is sensitive to the scale of input features, such as k-Nearest Neighbors, Neural Networks, and Gradient Descent-based algorithms.
  • Recommended when the algorithm relies on similarity or distance measures between data points, since every feature then contributes on the same bounded range.
  • May not be suitable if there are outliers in the data: a single extreme value sets the minimum or maximum and compresses the remaining values into a narrow part of the range.
  2. Standardization (Z-score normalization):
  • Use when the data roughly follows a Gaussian (normal) distribution or when the distribution is unknown.
  • Less affected by outliers than Min-Max Scaling, because the mean and standard deviation depend on all data points rather than only the two extremes. Strong outliers still distort them, however.
  • Preferred for algorithms that work best with zero-mean, unit-variance features, such as Support Vector Machines (SVM), Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA).
  • Can be used with most machine learning algorithms, as it preserves the shape of the original distribution while shifting it to zero mean and rescaling it to unit variance.
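
One way to probe the outlier guidance above is to measure how much each method's scale parameter changes when a single extreme value is added. The sketch below is illustrative only: the data (the integers 1 to 100 plus one outlier of 1000) is an assumption, not taken from a real dataset.

```python
import numpy as np

# Illustrative data: the integers 1..100, then the same data plus one outlier
clean = np.arange(1, 101, dtype=float)
with_outlier = np.append(clean, 1000.0)

# Min-Max scaling divides by the range (max - min)
range_clean = clean.max() - clean.min()
range_outlier = with_outlier.max() - with_outlier.min()

# Standardization divides by the standard deviation
std_clean = clean.std()
std_outlier = with_outlier.std()

# How strongly does one outlier inflate each scale parameter?
range_ratio = range_outlier / range_clean
std_ratio = std_outlier / std_clean

print(f"range grows by a factor of {range_ratio:.1f}")
print(f"std grows by a factor of   {std_ratio:.1f}")
```

In this example the range is inflated far more than the standard deviation, so the non-outlier values are squeezed harder by Min-Max scaling. The standard deviation is still affected, which is why heavy outliers may need separate handling with either method.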

In practice, you can experiment with both techniques and choose the one that yields better performance for your specific problem. It’s also possible to use different scaling methods for different features if needed. Remember that not all machine learning algorithms require feature scaling: decision tree-based algorithms (e.g., Decision Trees, Random Forests) and Naive Bayes are largely insensitive to it.
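
A minimal sketch of such an experiment is shown below. The choices here are assumptions made for illustration and are not part of the original article: sklearn's built-in wine dataset, a k-Nearest Neighbors classifier with default settings, and 5-fold cross-validation.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Example dataset (an assumption): 13 numeric features on very different scales
X, y = load_wine(return_X_y=True)

# Same classifier, three preprocessing choices
candidates = {
    "none": make_pipeline(KNeighborsClassifier()),
    "min-max": make_pipeline(MinMaxScaler(), KNeighborsClassifier()),
    "standard": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}

# Mean cross-validated accuracy for each choice
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    results[name] = scores.mean()
    print(f"{name:>8}: mean accuracy = {results[name]:.3f}")
```

Placing the scaler inside the pipeline means it is fitted only on the training folds during cross-validation, so the held-out fold does not leak into the scaling parameters.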
