Interpreting the Root Mean Squared Error of a Linear Regression Model

  • Wow! I built something that can actually predict housing prices!
  • Ok, but how good are these predictions?

Log Transformation & Normalization

In this example, I am building a Linear Regression model to predict housing prices. The target variable here is price, which is typically in USD (or whatever currency you’re working with). In the process of building this model, I decided to log transform price (the target), since housing prices tend to be heavily right-skewed and the log transform brings them closer to a normal distribution.
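As a minimal sketch of that step (the DataFrame and column names here are hypothetical placeholders, not the article's actual dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical housing data; "sqft" and "price" are placeholder columns
df = pd.DataFrame({
    "sqft": [850, 1200, 2300, 3100],
    "price": [150_000, 210_000, 450_000, 780_000],
})

# Log transform the skewed target so the model trains on log-price
df["log_price"] = np.log(df["price"])

# np.exp reverses the transform exactly, recovering the USD prices
assert np.allclose(np.exp(df["log_price"]), df["price"])
```

Keeping both columns around makes it easy to sanity-check that `np.exp` undoes the transform before you rely on it later.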

Transforming Mean Squared Error Back to USD

After building the model, I can validate it with a train-test split or k-fold cross-validation. Model validation is important to see whether the model can predict the target on new data, not just the data it was trained on. We can analyze whether the model is overfitting (i.e. it predicts the training data very well but cannot generalize to new data) or underfitting (i.e. it is too generalized and thus produces predictions that are too far off). During model validation, we often analyze Mean Squared Error (MSE), the average squared difference between the model’s predicted target values and the actual target values (squaring keeps positive and negative errors from canceling out), or Root Mean Squared Error (RMSE), its square root, which puts the error back in the target’s units. In this example, I can use RMSE to see how far off the model’s predicted price generally is from the actual home price. One catch: since I trained the model on log-transformed prices, the raw RMSE is in log units, so I need to transform the predictions back to USD before I can interpret the error in dollars.
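A short sketch of that back-transformation, assuming scikit-learn and a toy dataset (the numbers and feature are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Hypothetical data: square footage vs. price in USD
X = np.array([[850], [1200], [1800], [2300], [3100], [4000]])
price = np.array([150_000, 210_000, 330_000, 450_000, 780_000, 1_100_000])

# Train on the log-transformed target
model = LinearRegression().fit(X, np.log(price))
log_preds = model.predict(X)

# RMSE in log units: hard to interpret as a dollar amount
rmse_log = np.sqrt(mean_squared_error(np.log(price), log_preds))

# Exponentiate predictions back to USD first, then compute RMSE
rmse_usd = np.sqrt(mean_squared_error(price, np.exp(log_preds)))

print(f"RMSE (log units): {rmse_log:.4f}")
print(f"RMSE (USD): ${rmse_usd:,.0f}")
```

Note that `np.exp` is applied to the predictions before the error is computed; exponentiating the log-space RMSE itself would not give a meaningful dollar figure, because the square root and the exponential do not commute with averaging.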

In conclusion…

If I could give you one takeaway from this article, it’s this: your error metric lives in the same units as the target you trained on. If you log transformed price, your RMSE is in log dollars, so transform your predictions back to USD before computing RMSE; otherwise the number won’t tell you how many dollars off your predictions typically are.

Tia Plagata


Data Scientist/Analyst | Yoga Teacher | Marketer | Life-Long Learner | github: https://github.com/tiaplagata