Two hyperparameters every ML engineer should care about

Rabin Poudyal
2 min read · Nov 11, 2020

Building a successful ML model involves tuning and tweaking hyperparameters until you find the combination that works best for the dataset you are working with. But among them, a few appear in almost every machine learning or deep learning algorithm and play an outsized role. Without a good idea of how to tune them, it is often impossible to build a model with good accuracy.

1. Learning Rate (alpha)

An artificial neural network is trained with an optimization algorithm such as gradient descent, stochastic gradient descent (SGD), or Adam. The objective of these algorithms is to find the global minimum of a convex loss function. To reach the global minimum, we move downhill along the negative gradient, and we can choose how aggressively to step on each move. That step size is the learning rate (alpha). The more aggressively we step, the faster we approach the minimum; in other words, we can train the model faster, but the downside is that we can overshoot the minimum.

Animation source: https://gfycat.com/angryinconsequentialdiplodocus
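
To make the trade-off concrete, here is a minimal sketch of gradient descent on a toy convex function, f(w) = (w - 3)², with the learning rate as the only knob. The function, step counts, and alpha values are illustrative assumptions, not from the original story:

```python
# A minimal sketch: gradient descent on the toy convex function
# f(w) = (w - 3)**2, whose global minimum is at w = 3.
def gradient_descent(alpha, steps=25, w=0.0):
    for _ in range(steps):
        grad = 2 * (w - 3)   # derivative f'(w)
        w -= alpha * grad    # step downhill, scaled by the learning rate
    return w

# A small alpha converges slowly, a moderate alpha converges quickly,
# and too large an alpha overshoots the minimum and diverges.
for alpha in (0.01, 0.1, 1.1):
    print(f"alpha={alpha}: w after 25 steps = {gradient_descent(alpha):.4f}")
```

Running this, the smallest rate is still far from 3 after 25 steps, the middle rate lands essentially on the minimum, and the largest rate bounces past it further on every step, which is exactly the overshooting behavior described above.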

2. Batch Size

The batch size is also an important hyperparameter that determines the number of samples that…
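
The story is cut off here, but under the standard definition (the batch size is the number of training samples used to compute each gradient update), a minimal sketch of one epoch of mini-batch iteration might look like this. The data shapes and batch size are illustrative assumptions:

```python
import numpy as np

# Toy data: 1,000 samples with 10 features each (illustrative only).
rng = np.random.default_rng(seed=42)
X = rng.normal(size=(1000, 10))
y = rng.normal(size=1000)

batch_size = 32  # number of samples per gradient update

# One epoch of mini-batch iteration: the model's weights would be updated
# once per batch, rather than once per sample (batch_size=1, pure SGD) or
# once per epoch (batch_size=len(X), full-batch gradient descent).
for start in range(0, len(X), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    # ... compute the gradient on (X_batch, y_batch) and update weights here
```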



Written by Rabin Poudyal

Software Engineer, Data Science Practitioner. Say "Hi!" via email: rabinpoudyal1995@gmail.com or visit my website https://rabinpoudyal.com.np
