"One of my PhD students from Stanford, many years after he'd already graduated from Stanford, once said to me that while he was studying at Stanford, he learned about bias and variance and felt like he got it, he understood it. But that subsequently, after many years of work experience in a few different companies, he realized that bias and variance is one of those concepts that takes a short time to learn, but takes a lifetime to master. Those were his exact words. Bias and variance is one of those very powerful ideas" - Andrew N.G
Let me talk about an interesting concept in ML/AI today - the trade-off between Bias and Variance.
Essentially, all these Machine Learning / AI models go through three steps - create a model, train it, and test/validate it before the actual release. Out of the entire data set, a portion is always reserved for testing / validation - generally referred to as the dev set. That is the only background knowledge you will need to go through this post.
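For those who like to see it concretely, here is a minimal sketch of such a train / dev split - using scikit-learn, with dummy placeholder data rather than anything from a real project:

```python
# Minimal sketch of a train / dev split (dummy data, placeholder sizes).
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 5)   # 1000 samples, 5 input features (dummy)
y = np.random.rand(1000)      # dummy targets

# Hold back 20% of the data as the dev set; the model never trains on it.
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.2, random_state=42)
```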
During training we may face the issue of "under-fitting", or Bias, which means the model is not doing well even on the training data. It could be due to overly simplistic assumptions OR to leaving out key features / input parameters. Understandably, this comes to light very quickly when we train the model, pointing to the need to refine it at the training stage itself.
Very often, a model that is quite successful during training does not produce equally good results during validation / testing. This is the other side of the coin - the Variance problem, referred to as "over-fitting". It may be an isolated issue, or a consequence of too much bias-correction during the training stage.
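A rough way to see which of the two problems you are facing is to compare the training error with the dev error. The little sketch below is only illustrative - the error numbers and the `target_error` threshold are made-up placeholders, not anything from a real model:

```python
# Illustrative only: compare training and dev errors to see which problem dominates.
def diagnose(train_error, dev_error, target_error=0.05):
    if train_error > target_error:
        print("High bias: the model is underfitting even the training data.")
    if dev_error - train_error > target_error:
        print("High variance: the model does much worse on unseen (dev) data.")
    if train_error <= target_error and dev_error - train_error <= target_error:
        print("Looks reasonable on both fronts.")

diagnose(train_error=0.15, dev_error=0.17)   # mostly a bias problem
diagnose(train_error=0.02, dev_error=0.20)   # mostly a variance problem
```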
To give an analogy, a child pampered too much at home (the model's treatment for the "bias" problem) may end up not so well-behaved in public, isn't it? We may consider the data set reserved for testing / validation to be the equivalent of going public (or future performance), and that performance goes for a toss if we try to fix the bias problem too aggressively.
Historically, in the pre-neural-network days, this was always referred to as the "Bias-Variance" trade-off when we were using linear regression, simple decision trees and K-Means algorithms. For example, in the case of linear regression, if we add a lot of input features, we can reduce bias substantially, but we will find it reflecting adversely when we test the model with a new set of data, since the model gets too "attached" to the training data set. Similarly, when we use a higher-order polynomial in regression, the bias problem can be reduced effectively, but it shows up very badly during testing.
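To make that concrete, here is a small toy experiment (my own illustrative example, not from any real data set) where we fit polynomials of increasing degree to noisy data and watch the training error keep falling while the test error blows up at higher degrees:

```python
# Toy experiment: higher-degree polynomials reduce training error (bias)
# but eventually hurt test error (variance).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                    random_state=0)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```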
With the Deep Learning era, where we have complex, deeper neural networks coupled with large volumes of data, are we better off now? We still have the same issues that existed earlier, but the way we handle them has changed, with better tools in hand.
To begin with, it has been shown time and again that a deep neural network can handle the Bias problem effectively without impacting the variance issue in any significant manner WHEN we use an appropriate regularization factor. While building neural networks we no longer get into the paradox-type situation - the high-level thumb rule is "solve the bias problem with a bigger / more complex neural network", and thereafter "use more data to tackle the variance problem".
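As a hedged sketch of that thumb rule - assuming Keras as the library, and treating the layer widths and the L2 factor of 0.01 purely as placeholders - a bigger network with weight regularization might look like this:

```python
# Sketch of "bigger network + regularization": assuming Keras; layer widths
# and the L2 factor below are illustrative placeholders.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(20,)),   # 20 input features (assumed)
    keras.layers.Dense(256, activation="relu",
                       kernel_regularizer=keras.regularizers.l2(0.01)),
    keras.layers.Dense(256, activation="relu",
                       kernel_regularizer=keras.regularizers.l2(0.01)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```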
What is stated above is a bit simplistic, but we do have more tools in hand - hyper-parameter tuning, batch normalization, dropout, early stopping - so we are better equipped to handle the Bias-Variance issues. There is of course a cost associated with this luxury - a larger network means more compute cost and the need for bigger and better IT infrastructure. Similarly, on the data side, it may not always be possible to get large volumes of data for every kind of situation.
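And a quick, illustrative sketch of some of those tools working together - batch normalization, dropout and early stopping - again with dummy data and placeholder hyper-parameter values, not recommendations:

```python
# Illustrative sketch of batch normalization, dropout and early stopping in Keras
# (dummy data, placeholder hyper-parameters).
import numpy as np
from tensorflow import keras

X = np.random.rand(2000, 20)   # dummy data standing in for a real training set
y = np.random.rand(2000)

model = keras.Sequential([
    keras.layers.Input(shape=(20,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.BatchNormalization(),   # batch normalization
    keras.layers.Dropout(0.3),           # dropout to curb overfitting
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Early stopping watches the dev-set loss and halts training once it stops improving.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                           restore_best_weights=True)
model.fit(X, y, validation_split=0.2, epochs=100,
          callbacks=[early_stop], verbose=0)
```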
So, the moral of today's blog is that we get out of one set of challenges only to get caught in newer ones. :-) That is how human evolution happened too, right??
Just musing.....
Suren