Feature Scaling in Python

When we pre-process a dataset for machine learning, we often need to scale the features so that they are all on the same scale. Let's take a small dataset as an example:
| S.N | Country | Hours | Salary  | House |
|-----|---------|-------|---------|-------|
| 0   | France  | 34.0  | 12000.0 | No    |
| 1   | Spain   | 37.0  | 49000.0 | Yes   |
| 2   | Germany | 20.0  | 34000.0 | No    |
| 3   | Spain   | 58.0  | 41000.0 | No    |
| 4   | Germany | 40.0  | 43333.3 | Yes   |
| 5   | France  | 45.0  | 28000.0 | Yes   |
| 6   | Spain   | 39.8  | 51000.0 | No    |
| 7   | France  | 28.0  | 89000.0 | Yes   |
| 8   | Germany | 50.0  | 53000.0 | No    |
| 9   | France  | 47.0  | 33000.0 | Yes   |
Here, the Hours (hours worked by an employee) and Salary columns are on very different scales: one is below 100 while the other is in the tens of thousands. This difference makes the calculations in machine learning harder, because the parameters tied to the small-scale feature may converge quickly while the parameters tied to the large-scale feature converge slowly.
There are two main reasons why we need feature scaling:
- Most machine learning algorithms are based on Euclidean distance, and if we don't perform feature scaling, one feature may dominate another. In the dataset above, the Hours column ranges from 20 to 58 and the Salary column ranges from 12000 to 89000. They are not on the same scale, so if we calculate the Euclidean distance between rows 0 and 3, (58 − 34)² is a very small number compared to (41000 − 12000)², and the Salary column ends up dominating the distance (see the short sketch after this list).
- Even for algorithms that are not based on Euclidean distance, feature scaling helps them converge faster, which is a huge performance benefit on large datasets.
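
To make the first point concrete, here is a tiny NumPy sketch using the values from rows 0 and 3 of the table above:

```python
import numpy as np

# Hours and Salary for rows 0 and 3 of the table above
row_0 = np.array([34.0, 12000.0])
row_3 = np.array([58.0, 41000.0])

# Squared differences on the raw scale: the Salary term dwarfs the Hours term
print((row_3 - row_0) ** 2)                    # ~[5.76e+02  8.41e+08]

# The Euclidean distance is therefore driven almost entirely by Salary
print(np.sqrt(((row_3 - row_0) ** 2).sum()))   # ~29000.01
```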
Feature Scaling Techniques
1. Standardization
In this technique, we replace each value with its z-score.
z=(x−μ)/σ
After standardization, every feature is rescaled to have a mean of μ = 0 and a standard deviation of σ = 1 (the shape of the distribution itself does not change).
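
As an illustration, standardization can be done by hand with NumPy or with scikit-learn's StandardScaler. This is just a sketch using the Hours and Salary columns from the table; the variable names are only for this example:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hours and Salary columns from the table above
X = np.array([[34.0, 12000.0], [37.0, 49000.0], [20.0, 34000.0],
              [58.0, 41000.0], [40.0, 43333.3], [45.0, 28000.0],
              [39.8, 51000.0], [28.0, 89000.0], [50.0, 53000.0],
              [47.0, 33000.0]])

# z = (x - mean) / std, computed column-wise
X_std_manual = (X - X.mean(axis=0)) / X.std(axis=0)

# The same transformation with scikit-learn
X_std = StandardScaler().fit_transform(X)

print(X_std.mean(axis=0))  # ~[0, 0]
print(X_std.std(axis=0))   # [1, 1]
```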
2. Mean Normalization
x' = (x − mean(x)) / (max(x) − min(x))
This normalization rescales each feature to lie between −1 and 1 with a mean of μ = 0.
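
scikit-learn has no dedicated transformer for this exact formula, but it is easy to write with NumPy. A sketch, using the same Hours/Salary matrix as in the previous snippet:

```python
import numpy as np

# Same Hours and Salary matrix as in the previous snippet
X = np.array([[34.0, 12000.0], [37.0, 49000.0], [20.0, 34000.0],
              [58.0, 41000.0], [40.0, 43333.3], [45.0, 28000.0],
              [39.8, 51000.0], [28.0, 89000.0], [50.0, 53000.0],
              [47.0, 33000.0]])

# Mean normalization: (x - mean(x)) / (max(x) - min(x)), column-wise
X_norm = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))

print(X_norm.mean(axis=0))                     # ~[0, 0]
print(X_norm.min(axis=0), X_norm.max(axis=0))  # every value lies within [-1, 1]
```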
3. Min-Max Scaling
x' = (x − min(x)) / (max(x) − min(x))
This scaling brings every value between 0 and 1. It is used when features need to fit within a specific range, e.g. in image processing, where pixel intensities lie between 0 and 255 and are often rescaled to [0, 1].
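
With scikit-learn this corresponds to MinMaxScaler, whose feature_range defaults to (0, 1). A sketch with the same Hours/Salary matrix:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same Hours and Salary matrix as above
X = np.array([[34.0, 12000.0], [37.0, 49000.0], [20.0, 34000.0],
              [58.0, 41000.0], [40.0, 43333.3], [45.0, 28000.0],
              [39.8, 51000.0], [28.0, 89000.0], [50.0, 53000.0],
              [47.0, 33000.0]])

# (x - min(x)) / (max(x) - min(x)), column-wise; every value ends up in [0, 1]
X_minmax = MinMaxScaler().fit_transform(X)

print(X_minmax.min(axis=0))  # [0. 0.]
print(X_minmax.max(axis=0))  # [1. 1.]
```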
When to Scale
- When preparing a dataset for Euclidean distance-based algorithms like k-nearest neighbors.
- When performing PCA (Principal Component Analysis) for dimensionality reduction.
- To speed up gradient descent.
- Tree-based models can handle features with different ranges of values, so we don't need scaling for them.
Now let's see how we can do this in Python code.
We assume that X_train and X_test have already been extracted from the dataset above, for example via a train/test split of the Hours and Salary columns.
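
A minimal sketch using scikit-learn's StandardScaler (one reasonable choice; MinMaxScaler can be swapped in the same way). The scaler is fit on the training set only, and the same transformation is then applied to the test set so that no test-set statistics leak into training:

```python
from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()

# Fit on the training data, then apply the same transformation to the test data
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)
```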
If you like the post, don't forget to follow me on Medium and clap for it.