We have all heard terms such as High Variance, Low Variance, High Bias, and Low Bias while working with Regression and Classification algorithms. This blog helps you understand these concepts clearly and in a simple way.
Remember, the error here is the difference between a predicted point and the actual point, for example:
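```python
# A tiny sketch of the "error" for a single point (the numbers are made up).
actual = 10.0        # hypothetical true value
predicted = 8.5      # hypothetical model output

error = actual - predicted   # residual for this one point
print(error)                 # 1.5
```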
Bias:
In simple terms, bias refers to the error generated while training our model. Bias comes from the assumptions the model makes in order to learn from the given data, and those assumptions may or may not be correct.
Bias is of two types:
High Bias: the model has a high error and poor predictive performance.
Low Bias: the model has a small error and good predictive performance. A small sketch contrasting the two follows below.
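Here is a minimal sketch (my own illustration, not from the original post): a straight line makes a strong assumption about curved data and keeps a high training error (high bias), while a cubic fit does not (low bias).

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 50)
y = x**2 + rng.normal(0, 1, size=x.size)   # hypothetical curved data

for degree in (1, 3):                      # degree 1 = strong assumption (high bias)
    coeffs = np.polyfit(x, y, degree)      # fit a polynomial of this degree
    y_hat = np.polyval(coeffs, x)
    mse = np.mean((y - y_hat) ** 2)        # training error
    print(f"degree {degree}: training MSE = {mse:.2f}")
```

The degree-1 fit keeps a large training error no matter how much data you give it; that persistent error is the bias.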
Variance:
In simple terms, variance refers to how much the model's predictions vary with the data it is trained on. It shows up as the error generated when testing the model.
Variance is of two types:
High Variance: a large error generated during testing.
Low Variance: a small error generated during testing. A small sketch of how variance shows up follows below.
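Here is a minimal sketch (again my own illustration): refit the same kind of model on fresh random training sets and watch how much its prediction at one fixed point swings. The more flexible model swings far more, and that swing is the variance.

```python
import numpy as np

rng = np.random.default_rng(1)
x0 = 2.5                                    # fixed query point

for degree in (1, 9):                       # degree 9 = very flexible (high variance)
    preds = []
    for _ in range(200):                    # 200 fresh training sets
        x = rng.uniform(0, 5, 30)
        y = x**2 + rng.normal(0, 1, size=x.size)
        coeffs = np.polyfit(x, y, degree)
        preds.append(np.polyval(coeffs, x0))
    print(f"degree {degree}: spread of predictions = {np.std(preds):.2f}")
```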
Looking at the image below, you can get a clear idea.
Here is a simple explanation of the image. In the 1st figure, the fitted line passes close to the training data, so the model has trained well.
In the 2nd figure, the training points are far from the fitted line, so summing the residuals gives a large error.
In the 3rd figure, the training has gone well (low bias), and the fitted line is also able to predict the test points accurately.
In the 4th figure, even though the model is trained well, its error on the test data is large, i.e. it has high variance.
The bias-variance tradeoff can be explained in terms of four cases; a rough way to tell them apart from the training and testing errors is sketched right after the list:
- High Bias - Low Variance
- Low Bias - High Variance
- Low Bias - Low Variance
- High Bias - High Variance
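As a rough rule of thumb (my own summary, not from the original post), you can map training and testing errors onto these four cases. The `threshold` and `gap` cut-offs below are purely illustrative, not standard values.

```python
def diagnose(train_error, test_error, threshold=1.0, gap=0.5):
    """Classify the regime from error values.

    `threshold` and `gap` are illustrative cut-offs, not standard values.
    """
    high_bias = train_error > threshold               # poor fit on training data
    high_variance = (test_error - train_error) > gap  # big train/test gap
    if high_bias and high_variance:
        return "high bias, high variance (underfitting)"
    if high_variance:
        return "low bias, high variance (overfitting)"
    if high_bias:
        return "high bias, low variance"
    return "low bias, low variance (generalized)"

print(diagnose(train_error=0.2, test_error=2.0))  # -> overfitting
```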
High Bias - High Variance:
High bias means the model has a large error while training, and high variance means it also has a large error on the test data, so it is unable to predict accurately anywhere.
This case is called underfitting.
Low Bias - High Variance:
The model is well trained on the training data and has low bias, but it is not able to predict the testing data.
This case is called overfitting. Regularization techniques such as Ridge Regression are commonly used to reduce it.
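A minimal sketch of that idea, assuming scikit-learn is available (the synthetic data and the degree-9 features are my own illustration): Ridge adds an L2 penalty (`alpha`) that trades a little extra bias for a cut in variance, so you would typically expect its test score to hold up better than the unpenalized fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(2)
X = rng.uniform(0, 5, (60, 1))
y = X.ravel() ** 2 + rng.normal(0, 1, 60)        # hypothetical noisy data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, reg in [("plain", LinearRegression()), ("ridge", Ridge(alpha=1.0))]:
    # Flexible degree-9 features invite overfitting; Ridge penalizes it.
    model = make_pipeline(PolynomialFeatures(degree=9), StandardScaler(), reg)
    model.fit(X_train, y_train)
    print(name,
          "train R2:", round(model.score(X_train, y_train), 2),
          "test R2:", round(model.score(X_test, y_test), 2))
```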
Low Bias - Low Variance:
The model is trained well on the training data and is able to predict the testing data accurately. This is the model everyone prefers: a generalized model has low bias and low variance.
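One common way to check for this generalized behaviour, assuming scikit-learn (the synthetic dataset here is just an illustration): cross-validation scores that are both high and stable across folds suggest low bias and low variance.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5)  # R2 per fold
print(scores.mean(), scores.std())  # high mean, small spread = generalizes
```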
High Bias - Low Variance:
The model has a high error on the training data but is still able to predict the testing data. It can give good results, though not great ones. Just think: how can a model predict well if it hasn't trained well?
You should now be able to picture the plots for these cases yourself. If you find any mistakes or want to add anything to the blog, leave a comment; it gives everyone a chance to learn.
Thank You