Urban neighborhood socioeconomic status (SES) inference: A machine learning approach based on semantic and sentimental analysis of online housing advertisements
- Authors
- Lingqi Wang;Shenjing He;Shiliang Su;Yu Li;Lirong Hu;Guie Li
- Source
- HABITAT INTERNATIONAL,Vol.124,P.102572
- Language
- English
- Keywords
- Author affiliations
- School of Resource and Environmental Sciences, Wuhan University, Wuhan, China;Department of Urban Planning and Design & Social Infrastructure for Equity and Wellbeing (SIEW) Lab, The University of Hong Kong, Hong Kong, China;School of Public Policy & Management, China University of Mining and Technology, Xuzhou, China;School of Management, Guangzhou University, Guangzhou, 510006, China;Department of Urban Studies, University of Glasgow, Glasgow, G12 8QQ, UK;Department of Building and Real Estate, The Hong Kong Polytechnic University, Hong Kong;Department of Agricultural Extension and Education, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran;Department of Agricultural Economics, Agricultural Sciences and Natural Resources University of Khuzestan, Mollasani, Iran;Department of Urban Design and Studies, Chung-Ang University, Seoul, South Korea;School of Applied Economics, Renmin University of China, Beijing, 100872, China;Department of Agricultural Economics, Ghent University, Coupure Links 653, 9000, Gent, Belgium;Department of Political Science, Northeastern University, 960A Renaissance Park, 360 Huntington Ave, Boston, MA, 02115, USA
- Abstract
- Understanding the dynamic distribution of residents' socioeconomic status (SES) across neighborhoods within cities is essential for urban planning and policy-making aligned with the Sustainable Development Goals 2030. Although previous work has clearly highlighted the promise of explicitly linking geographical features to SES, scholars remain reserved in their outlook, given the absence of a theoretical grounding, the heavy reliance on nontransparent proprietary data sources, and the relatively coarse resolution of the predictions. Drawing on a case study of the Hangzhou metropolitan area in China, this paper aims to address these problems by demonstrating a novel approach to neighborhood SES inference based on online housing advertisements. We first revisit the theoretical debates on the linkage between neighborhood SES and online housing advertisements. Then, a Naïve Bayes classifier is employed to semantically identify the topics in online housing advertisements, and the associated sentiments are quantified using a lexicon-based approach. Following that, seven commonly used machine learning algorithms are compared and used to infer fine-grained neighborhood SES at the scale of residential quarters, based on the housing attributes and the topics extracted from online housing advertisements. Results show that machine learning algorithms vary in predictive ability and that tree-based algorithms are much more powerful in inferring neighborhood SES. More specifically, we distinguish eight reliable features that not only receive relatively high importance estimates from all the machine learning algorithms but also exhibit great robustness in inferring neighborhood SES and show promising potential to be applied to unraveling social inequalities. We also observe noteworthy spatial heterogeneity in neighborhood SES across the research site.
The demonstrated approach not only enables policymakers to take stock of deprived neighborhoods in a timely manner, but also lays firm ground for framing contextualized strategies of urban governance. This study is among the first attempts to bridge the theoretical interpretation of housing attributes with the proxy indicator-based approach for fine-grained neighborhood SES measurement.
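
The two text-processing steps named in the abstract, Naïve Bayes topic identification and lexicon-based sentiment quantification, can be illustrated with a minimal sketch. This is not the authors' implementation: the training sentences, topic labels, sentiment lexicon, and all function names below are hypothetical, and the classifier is a plain multinomial Naïve Bayes with add-one smoothing chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (advertisement text, topic label).
TRAIN = [
    ("near metro station and bus stop", "transport"),
    ("walk to subway line and metro", "transport"),
    ("close to primary school and kindergarten", "education"),
    ("top school district near campus", "education"),
]

# Hypothetical sentiment lexicon: word -> polarity (+1 positive, -1 negative).
LEXICON = {"convenient": 1, "excellent": 1, "quiet": 1, "noisy": -1, "old": -1}


def train_naive_bayes(samples):
    """Collect the counts a multinomial Naive Bayes model needs."""
    doc_counts = Counter()              # documents per topic
    word_counts = defaultdict(Counter)  # word frequencies per topic
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        doc_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def predict_topic(text, model):
    """Return the topic with the highest log posterior (add-one smoothing)."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def sentiment_score(text, lexicon=LEXICON):
    """Average lexicon polarity over all tokens; 0.0 for empty text."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(lexicon.get(token, 0) for token in tokens) / len(tokens)


model = train_naive_bayes(TRAIN)
topic = predict_topic("convenient metro and bus", model)
score = sentiment_score("quiet and convenient but old")
```

In a pipeline like the one the abstract describes, per-topic labels and sentiment scores of this kind would presumably be aggregated per residential quarter and passed, together with housing attributes, to the downstream SES models; the abstract does not specify how that aggregation is done.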