Abstract
Real estate contributes significantly to all major economies around the world. In particular, house prices have a direct impact on stakeholders ranging from house buyers to financing companies. Consequently, a plethora of techniques have been developed for real estate price prediction. Most existing techniques rely on house features to build a variety of prediction models. Recognizing the effect of spatial dependence on house prices, some later works introduced spatial regression models to improve prediction performance. However, these models fail to take into account the geo-spatial context of neighborhood amenities, such as how close a house is to a train station, a highly ranked school, or a shopping center. Such contextual information may play a vital role in users’ interest in a house and thereby directly influence its price. In this paper, we propose to leverage graph neural networks to capture the geo-spatial context of the neighborhood of a house. In particular, we present a novel method, geo-spatial network embedding (GSNE), that learns the embeddings of houses and various types of points of interest (POIs) in the form of multipartite networks, where the houses and the POIs are represented as attributed nodes and the relationships between them as edges. Extensive experiments with a large number of regression techniques show that the embeddings produced by GSNE consistently and significantly improve house price prediction performance regardless of the downstream regression model. Relevant source code for GSNE is available at: https://github.com/sarathismg/gsne.
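To make the multipartite structure concrete, the following is a minimal sketch of how house-to-POI edges might be assembled from coordinates. It is not the GSNE implementation from the repository above. The function names, the 2 km radius, the inverse-distance edge weight, and the toy coordinates are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def build_multipartite_edges(houses, pois, radius_km=2.0):
    """Link each house to every POI within radius_km (hypothetical rule).

    houses: {house_id: (lat, lon)}
    pois:   {poi_id: (poi_type, lat, lon)}
    Returns {poi_type: [(house_id, poi_id, weight), ...]}, one edge list
    per house-POI partition. The weight 1 / (1 + distance) is an assumed
    choice: closer amenities get a larger weight.
    """
    edges = defaultdict(list)
    for h_id, (h_lat, h_lon) in houses.items():
        for p_id, (p_type, p_lat, p_lon) in pois.items():
            d = haversine_km(h_lat, h_lon, p_lat, p_lon)
            if d <= radius_km:
                edges[p_type].append((h_id, p_id, 1.0 / (1.0 + d)))
    return dict(edges)


# Toy, made-up coordinates for illustration only.
houses = {"h1": (-37.8136, 144.9631), "h2": (-37.9000, 145.1000)}
pois = {
    "station_a": ("train_station", -37.8183, 144.9671),
    "school_a": ("school", -37.8100, 144.9600),
    "school_b": ("school", -37.9050, 145.1050),
}
edges = build_multipartite_edges(houses, pois)
for poi_type, edge_list in edges.items():
    print(poi_type, [(h, p, round(w, 3)) for h, p, w in edge_list])
```

Each POI type yields its own bipartite edge list against the houses, and together these lists form the multipartite network. Node attributes (house features, POI properties) would be attached separately before any embedding is learned.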
Acknowledgements
We are grateful to Dr Zhifeng Bao, Associate Professor, RMIT University, Australia, for sharing the Melbourne housing price dataset with us. This work was done at DataLab, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET).
Additional information
Responsible editor: Michelangelo Ceci.
About this article
Cite this article
Das, S.S.S., Ali, M.E., Li, YF. et al. Boosting house price predictions using geo-spatial network embedding. Data Min Knowl Disc 35, 2221–2250 (2021). https://doi.org/10.1007/s10618-021-00789-x
DOI: https://doi.org/10.1007/s10618-021-00789-x