TY - JOUR
T1 - Quality assessment of crowdsourced social media data for urban flood management
AU - Songchon, Chanin
AU - Wright, Grant
AU - Beevers, Lindsay Catherine
N1 - Funding Information:
We would like to thank the anonymous reviewers for their careful reading of the manuscript and their constructive comments and suggestions. This work was supported by the Office of the Civil Service Commission (OCSC) and the Ministry of Agriculture and Cooperatives (MOAC) of Thailand.
Publisher Copyright:
© 2021 Elsevier Ltd
PY - 2021/11
Y1 - 2021/11
AB - Urban flooding can cause widespread devastation in terms of loss of life and damage to property. As such, monitoring urban flood evolution is crucial in identifying the most affected areas, where emergency response resources should be directed. Flood monitoring through airborne or satellite remote sensing is often limited by weather conditions and urban topography. In contrast, crowdsourced data are not affected by weather or topography, and hence offer great potential for urban flood monitoring through real-time information shared by individuals. Despite these benefits, there is no guarantee of quality associated with crowdsourced data, which hampers their usability. In this paper, we present and evaluate two approaches (binary logistic regression and fuzzy logic) to assess the quality of crowdsourced social media data retrieved from the public Twitter archive. Input variables were constructed from Twitter metadata and spatiotemporal analysis. Both models were trained and tested using actual flood-related information tweeted during three consecutive years of flooding in Phetchaburi City, Thailand (2016 to 2018), and produced good results. The fuzzy logic approach is shown to perform better; however, its implementation involves significantly more subjectivity. The ability to assess data quality enables the uncertainty associated with crowdsourced social media data to be estimated, which allows this type of data to supplement conventional observations and hence improve flood management activities.
KW - Crowdsourcing
KW - Data quality
KW - Fuzzy logic
KW - Logistic regression
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85111906250&partnerID=8YFLogxK
U2 - 10.1016/j.compenvurbsys.2021.101690
DO - 10.1016/j.compenvurbsys.2021.101690
M3 - Article
SN - 0198-9715
VL - 90
JO - Computers, Environment and Urban Systems
JF - Computers, Environment and Urban Systems
M1 - 101690
ER -