Bias and Variance Trade-off in Machine Learning Models

May 22, 2024 • 7 min read


Bias and variance are two of the biggest factors in the accuracy of a machine learning model. If they are out of balance, your model can overfit or underfit, which hurts both accuracy and performance. We go over how to manage bias and variance in machine learning to get an optimal fit, and revisit overfitting and underfitting along the way.

Overfitting and Underfitting Overview

We have covered overfitting and underfitting before on the SabrePC Blog, so here’s a quick overview:

Overfitting is when a model fits the training dataset too closely, noise included, to the point where it fails to capture the underlying relationships between data points. Think of it like memorizing a math problem from the study guide without understanding the concept, then getting stuck when the test presents different values.

Underfitting is when the model has not learned the data to a sufficient and reliable level and lacks the capacity to capture the underlying patterns and relationships between data points. Think of it like not understanding a math concept at all and making an educated guess on the test.

Underfitting
  • High training error
  • High testing error
  • Not developed enough at learning
  • Inaccurate generalizations on any data

Overfitting
  • Low training error
  • High testing error
  • Too good at learning the training data
  • Poor generalization to new data

Ideal Complexity

The optimal model sits in the sweet spot: not too simple, not too complex. If a model is too simple, it cannot capture the important relationships in the data, leading to an underfitted model. If the model is too complex, it will start to memorize the training data instead of learning the underlying patterns, leading to overfitting. The goal of finding the ideal complexity is to strike a balance between bias and variance.

  • Bias refers to the error introduced by approximating a real-world problem with a simplified model. Suppose the data follows a complex function: a model with high bias oversimplifies it, for example by outputting a linear function, which leads to underfitting.
  • Variance refers to the model's sensitivity to the fluctuations in the training data. A model with high variance is highly influenced by the specific training data it was trained on and may not generalize well to unseen data. This phenomenon is often associated with overfitting.

  • High Bias, Low Variance: the model underfits
  • Low Bias, High Variance: the model overfits

The bias-variance tradeoff refers to the balance needed between bias and variance so that the trained model generalizes as well as possible when presented with new data.
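
For squared-error loss, the trade-off is often written as a decomposition of the expected prediction error. The notation here is an assumption added for illustration and does not come from the article: y = f(x) + ε is the observed target with noise variance σ², and \hat{f} is the model fitted on a randomly drawn training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Simple models tend to make the first term large, flexible models tend to make the second term large, and the noise term is outside the model's control.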

If you are creating a linear regression model, begin with a simple model with one feature and gradually include more features to make the model more complex. As complexity increases, we monitor the error so the model does not become more complex than the data supports.

We would then train each model on a portion of our data and evaluate its performance on a separate test set. To measure the prediction error on the test set, we could use a metric like mean squared error (MSE) or mean absolute error (MAE).
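
The procedure described here can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recipe from the article: the data is synthetic, `true_function` is a made-up ground truth, and polynomial degree stands in for "adding more features" (each extra degree adds one more input term to a linear regression). It assumes NumPy is installed.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_function(x):
    # Hypothetical ground truth, chosen only for illustration.
    return np.sin(2 * np.pi * x)


def sweep_complexity(degrees, n_train=30, n_test=200, noise=0.2, seed=0):
    """Fit one polynomial per degree; return {degree: (train_mse, test_mse)}."""
    rng = np.random.default_rng(seed)
    x_train = rng.uniform(0.0, 1.0, n_train)
    x_test = rng.uniform(0.0, 1.0, n_test)
    y_train = true_function(x_train) + rng.normal(0.0, noise, n_train)
    y_test = true_function(x_test) + rng.normal(0.0, noise, n_test)

    results = {}
    for degree in degrees:
        # Least-squares fit on the training portion only.
        model = Polynomial.fit(x_train, y_train, degree)
        train_mse = float(np.mean((model(x_train) - y_train) ** 2))
        # The held-out test set measures how well the fit generalizes.
        test_mse = float(np.mean((model(x_test) - y_test) ** 2))
        results[degree] = (train_mse, test_mse)
    return results


results = sweep_complexity(range(1, 10))
for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")

# The degree with the lowest test error is the candidate "ideal complexity".
best_degree = min(results, key=lambda d: results[d][1])
print("lowest test MSE at degree", best_degree)
```

Training error keeps falling as the degree grows, so it cannot be used to pick the model; the test error is the number to watch.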

To determine the optimal complexity, we aim for the region of the conceptual bias-variance graph where the falling bias curve and the rising variance curve meet. Around this point the test error is at its lowest, representing the optimal model complexity for this specific problem.

Sadly, this graph is conceptual. For a real predictive model, bias and variance cannot be calculated directly, because the true function behind the data is unknown. The idea is still to find that optimal zone, to within a few degrees of freedom.
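
On real data the true function is unknown, but in a simulation we choose it ourselves, and then both quantities can be estimated by refitting the model on many noisy copies of the training set. The sketch below is only an illustration under assumptions: `true_function`, the noise level, and the fixed training inputs are all made up, and NumPy is assumed to be installed.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_function(x):
    # Hypothetical ground truth; known here only because the data is simulated.
    return np.sin(2 * np.pi * x)


def estimate_bias_variance(degree, n_trials=200, n_train=30, noise=0.2, seed=0):
    """Estimate (bias^2, variance) of a polynomial fit, averaged over a grid."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_train)  # fixed inputs, fresh noise per trial
    x_eval = np.linspace(0.0, 1.0, 50)
    predictions = np.empty((n_trials, x_eval.size))

    for trial in range(n_trials):
        y_train = true_function(x_train) + rng.normal(0.0, noise, n_train)
        predictions[trial] = Polynomial.fit(x_train, y_train, degree)(x_eval)

    mean_prediction = predictions.mean(axis=0)
    # Bias^2: how far the average fit is from the truth.
    bias_squared = float(np.mean((mean_prediction - true_function(x_eval)) ** 2))
    # Variance: how much individual fits scatter around their own average.
    variance = float(np.mean(predictions.var(axis=0)))
    return bias_squared, variance


estimates = {degree: estimate_bias_variance(degree) for degree in (1, 3, 9)}
for degree, (bias_squared, variance) in estimates.items():
    print(f"degree {degree}: bias^2 {bias_squared:.4f}, variance {variance:.4f}")
```

In this setup the straight-line model shows the larger bias term and the degree 9 model the larger variance term, which is the pattern the conceptual graph describes.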


Model Evaluation Metrics

When it comes to checking how well our machine learning models are doing, we use different metrics to measure their performance. These metrics give us insights into how accurate our models are and whether they're making the right predictions. Let's take a look at some common metrics:

  1. Accuracy tells us how often our model is making correct predictions. It's a simple measure that gives us a good overall sense of performance.
  2. Precision and Recall: precision measures how many of the predicted positive cases were actually positive, while recall tells us how many of the actual positive cases our model managed to catch.
  3. F1 Score combines precision and recall into a single number, giving us a balanced view of our model's performance.
  4. R-squared helps us understand how well our regression model is fitting the data. It tells us the proportion of the variance in the data that our model explains.
  5. Mean Absolute Error (MAE) and Mean Squared Error (MSE) are for regression problems (predicting numbers). MAE tells us how far, on average, our predictions are from the actual values, while MSE squares each error before averaging, which gives more weight to bigger errors.
  6. ROC-AUC gives us an overall idea of how well our model can distinguish between different classes. It is commonly used when we have imbalanced data, where accuracy alone can be misleading.
  7. Confusion Matrix is a simple table that helps us understand where our model is making mistakes by showing us how many true positives, true negatives, false positives, and false negatives it's producing.
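
Several of the metrics above are simple enough to compute by hand, which can help make the definitions concrete. The following is a plain-Python sketch; the function names and the toy labels are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def regression_metrics(actual, predicted):
    n = len(actual)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    mean_actual = sum(actual) / n
    # Assumes the actual values are not all identical (otherwise ss_total is 0).
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    r_squared = 1 - (mse * n) / ss_total
    return {"mae": mae, "mse": mse, "r_squared": r_squared}


# Toy labels and values, made up for illustration.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 1])
reg = regression_metrics([3.0, 5.0, 2.0, 7.0], [2.5, 5.0, 4.0, 8.0])
print(clf)
print(reg)
```

On the toy labels the model catches three of the four positives (recall 0.75) but two of its five positive predictions are wrong (precision 0.6), a difference that accuracy alone would hide.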


Understanding the balance between bias and variance, recognizing the dangers of overfitting and underfitting, and selecting appropriate evaluation metrics are essential steps toward building reliable and effective models. Finding the sweet spot between the two extremes, by tuning parameters and adjusting training settings, is the key to creating models that perform well on unseen data. Read more on how Epochs, Batch Size, and Iterations affect training.

As you continue your journey in machine learning, keep in mind that it's not just about building models; it's about understanding the data, asking the right questions, and using the right tools to find meaningful answers. Building these models can start on a laptop, but if you’re getting into serious AI training, a high-performance GPU workstation can speed up training. Run these models on your desktop at home or configure your very own GPU (or multi-GPU) workstation by SabrePC.


