Predicting House Price Appreciation using Big Geo-Data and Machine Learning


Background Information

Is it possible to use big geo-data to predict house price appreciation? If so, what are the implications? Policy makers could better plan for future revitalization or development, and real estate developers could make data-driven decisions on where to invest.

A group of researchers from MIT, the University of Wisconsin and PUCPR in Brazil has tackled this challenge. Using four main sources of data, they ran a series of machine learning algorithms to predict where house price appreciation will occur across 20,000 homes in the Greater Boston Area. Each red dot represents one home in the dataset, which they joined to the four sources of data below.

Four Sources of Data Used in the Predictive Machine Learning Model:

1) House Information: House area, number of bedrooms, number of bathrooms and house photos;

2) Built Environment: Distances to the nearest metro station, university and other amenities, as well as Google Street View images retrieved through Google’s API;

3) Human Mobility: Mobility patterns derived from SafeGraph’s dataset of GPS records from individual cell phones, used to calculate average daily travel time; and

4) Socioeconomic Attributes: Demographic data, median income, ethnicity and employment.
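To make the Built Environment features more concrete, here is a minimal sketch of how a "distance to nearest metro station" value could be computed for one home. It is not the researchers' code: the haversine formula is a standard great-circle approximation chosen for illustration, the function names are hypothetical, and the coordinates are approximate placeholder values.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_amenity(home, amenities):
    """Return (name, distance_km) of the amenity closest to the home."""
    return min(
        ((name, haversine_km(*home, *coords)) for name, coords in amenities.items()),
        key=lambda pair: pair[1],
    )


# Approximate, illustrative coordinates (assumptions, not values from the study).
stations = {
    "Kendall/MIT": (42.3625, -71.0862),
    "Harvard": (42.3734, -71.1189),
}
home = (42.3601, -71.0942)

name, distance_km = nearest_amenity(home, stations)
print(f"Nearest station: {name} ({distance_km:.2f} km)")
```

Repeating this for every home and every amenity type would give one distance column per amenity, which can then be joined to the house records.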


Results of Machine Learning Output

They ran a series of experiments to see which of the four data sources above contributed the most to predictive accuracy.

Using the first data source (House Information) as a baseline, they added each of the other datasets one at a time and compared the resulting predictive accuracy. They found that Google’s Street View images led to the greatest increase in predictive power, meaning that local neighborhood characteristics, such as greenery and the quality of surrounding houses, improved the predictions the most.
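The comparison described above can be sketched in a few lines of code: fit a baseline model, add one feature group at a time, and measure the gain on held-out data. This is only an illustration under stated assumptions. The data is synthetic with arbitrary weights, so the ranking it prints says nothing about the study's findings. A plain linear regression stands in for the machine learning models the researchers actually used, and all function and group names are hypothetical.

```python
import random


def fit_ols(X, y):
    """Least-squares fit with intercept, solving the normal equations by Gauss-Jordan elimination."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(coef, X, y):
    """Coefficient of determination of the fitted model on (X, y)."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def select(X, cols):
    """Keep only the given column indices of each row."""
    return [[row[c] for c in cols] for row in X]


def incremental_gains(X_train, y_train, X_test, y_test, baseline, groups):
    """Held-out R^2 gain from adding each feature group to the baseline columns."""
    def score(cols):
        coef = fit_ols(select(X_train, cols), y_train)
        return r_squared(coef, select(X_test, cols), y_test)

    base = score(baseline)
    return {name: score(baseline + cols) - base for name, cols in groups.items()}


# Synthetic toy data: column 0 plays the role of house information, columns 1-3 the other sources.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(400)]
y = [0.5 * r[0] + 2.0 * r[1] + 0.3 * r[2] + 0.8 * r[3] + rng.gauss(0, 0.5) for r in X]

groups = {"built_environment": [1], "human_mobility": [2], "socioeconomic": [3]}
gains = incremental_gains(X[:300], y[:300], X[300:], y[300:], [0], groups)

for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name}: R^2 gain over baseline = {gain:.3f}")
```

Scoring on a held-out split rather than on the training rows matters here, because adding features can never lower the training fit of a linear model, so only held-out scores show whether a data source truly helps.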



The machine learning model using all four sources of data was able to predict price appreciation with a high degree of certainty. The houses that experienced the highest increase in value were small, low cost and located in a nice area. Other aspects, such as low travel time and a variety of neighborhood amenities, were also highly correlated with price appreciation. However, the model's performance weakened in Boston (image on the left), indicating that price changes in downtown areas are more complex and harder to predict.

We help solve for pricing ambiguity. We use a series of location, project and unit level attributes to ensure your project is priced effectively. We also monitor sales or lease-up velocity to ensure your new development is achieving the highest potential revenue.

Book a demo to learn more about how we optimize revenue!

Source: Kang, Y., Zhang, F., Peng, W., Gao, S., Rao, J., Duarte, F., & Ratti, C. (2020). Understanding house price appreciation using multi-source big geo-data and machine learning. Land Use Policy, 104919. doi:10.1016/j.landusepol.2020.104919
