Unlocking the Secrets of Regression Metrics: A Friendly Guide to MAE, MSE, RMSE, and R-Squared

Dooinn Kim
3 min read · Aug 18, 2023


Photo by PEIWEN HE on Unsplash

In the previous article, you were given a sneak peek into the metrics used for validating your regression model. In this article, we will take a deeper dive into the content and nuances of those metrics.

Mean Absolute Error (MAE)

What Is It?

Mean Absolute Error (MAE) is like a friendly school teacher who grades papers. When your answers are off, MAE tells you how far off on average. It’s as simple as taking the absolute difference between the actual and predicted values and averaging them.

Interpretation

MAE is a clear and direct way to understand how wrong a model’s predictions are. If the MAE is 5, it means that on average, your predictions are 5 units away from the truth. Lower MAE means better predictions, like getting closer to a bull’s-eye in darts.
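To make this concrete, here is MAE computed by hand on a few made-up predictions (the numbers are purely illustrative):

```python
# Toy data (illustrative only)
y_true = [3, 5, 7]
y_pred = [2, 5, 10]

# Absolute errors: |3-2| = 1, |5-5| = 0, |7-10| = 3
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# Average them: (1 + 0 + 3) / 3
mae = sum(abs_errors) / len(abs_errors)
print("MAE:", mae)  # ≈ 1.33
```

So on average, these predictions miss the truth by about 1.33 units — no more, no less drama than that.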

Mean Squared Error (MSE)

What Is It?

Mean Squared Error (MSE) is like a strict coach who punishes bigger mistakes more. It squares the differences between actual and predicted values before averaging them, so bigger mistakes count way more.

Interpretation

MSE emphasizes larger errors. If you’re off by 2, MSE counts it as 4 (because 2² = 4). If you’re off by 10, MSE counts it as 100! This makes it sensitive to outliers and big mistakes. If your model’s making large errors, MSE will let you know!
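Using the same toy numbers as before, notice how squaring makes one error of 3 dominate the total:

```python
# Toy data (illustrative only)
y_true = [3, 5, 7]
y_pred = [2, 5, 10]

# Squared errors: 1, 0, 9 — the single miss of 3 contributes 9 out of 10
sq_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(sq_errors) / len(sq_errors)
print("MSE:", mse)  # ≈ 3.33
```

Compare that 3.33 to the MAE of 1.33 on the same data — the strict coach really does come down harder on the big mistake.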

Root Mean Squared Error (RMSE)

What Is It?

Root Mean Squared Error (RMSE) is the square root of MSE. It’s like turning the strict coach’s punishment back into human terms. While MSE might say “your error was 100,” RMSE will say “it was 10,” which is easier to understand.

Interpretation

RMSE is a balance between clarity and sensitivity to large mistakes. It’s more interpretable than MSE but still gives more weight to bigger mistakes. It’s like a translator that helps you understand how much you’re really off by.
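Sticking with the same made-up numbers, taking the square root brings the error back into the units of the target:

```python
import math

# Toy data (illustrative only)
y_true = [3, 5, 7]
y_pred = [2, 5, 10]

# MSE first, then its square root
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
rmse = math.sqrt(mse)  # back in the original units of y
print("RMSE:", rmse)  # ≈ 1.83
```

An RMSE of about 1.83 is directly comparable to the target values themselves, which is why many people report RMSE instead of raw MSE.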

R-Squared

What Is It?

R-Squared is like a cheerleader that tells you how much better you’re doing than if you just guessed the average every time. It typically ranges from 0 to 1, and closer to 1 means your model is awesome! (It can even dip below 0 when a model fits worse than simply guessing the average.)

Interpretation

R-Squared is a measure of how well your model fits the data compared to a simple average. If R-Squared is 0.8, it means your model explains 80% of the variability in the data. It’s like scoring 80% on a test — pretty good!
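Here is a minimal by-hand version of the R-Squared formula, using made-up predictions that fit the data closely:

```python
# Toy data (illustrative only)
y_true = [3, 5, 7]
y_pred = [2.5, 5.0, 7.5]

mean_y = sum(y_true) / len(y_true)  # the "just guess the average" baseline

# Residual sum of squares: how far the predictions are from the truth
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Total sum of squares: how far the truth is from its own mean
ss_tot = sum((t - mean_y) ** 2 for t in y_true)

r2 = 1 - ss_res / ss_tot
print("R-Squared:", r2)  # 0.9375 — about 94% of variability explained
```

In other words, this toy model explains roughly 94% of the variation that a plain "predict the mean" model would leave unexplained.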

Example in Python

Let’s create a simple example to illustrate these metrics in action.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Sample data
X = [[1], [2], [3], [4]]
y_true = [2, 4, 6, 8]

# Train the model
model = LinearRegression()
model.fit(X, y_true)

# Predictions
y_pred = model.predict(X)

# Calculate metrics
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_true, y_pred)

print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)
print("R-Squared:", r2)

Conclusion

These metrics are like different coaches in a game. Some are friendly, some are strict, and some cheer you on. By understanding their feedback, you can become a star player in the game of data science. Choose the right metric for your needs, and happy modeling! 🌟

So, after learning about those metrics, aren’t you intrigued to find out how to interpret whether your regression model is overfitting or underfitting? Check out the next article about this topic!

In the upcoming article, you will discover the meanings of Overfitting and Underfitting within the context of a Regression model, using specific metrics that we’ve covered, such as R², MAE, RMSE, and MSE. Don’t miss it! :)
