TY - JOUR
T1 - A Machine Learning Approach to Predict Air Quality in California
AU - Castelli, Mauro
AU - Clemente, Fabiana Martins
AU - Popovič, Aleš
AU - Silva, Sara
AU - Vanneschi, Leonardo
N1 - info:eu-repo/grantAgreement/FCT/3599-PPCDT/DSAIPA%2FDS%2F0022%2F2018/PT#
info:eu-repo/grantAgreement/FCT/3599-PPCDT/PTDC%2FCCI-INF%2F29168%2F2017/PT#
Castelli, M., Clemente, F. M., Popovič, A., Silva, S., & Vanneschi, L. (2020). A Machine Learning Approach to Predict Air Quality in California. Complexity, 2020, 1-23. [8049504]. https://doi.org/10.1155/2020/8049504
PY - 2020/8/4
Y1 - 2020/8/4
N2 - Predicting air quality is a complex task due to the dynamic nature, volatility, and high variability in time and space of pollutants and particulates. At the same time, being able to model, predict, and monitor air quality is becoming more and more relevant, especially in urban areas, due to the observed critical impact of air pollution on citizens' health and the environment. In this paper, we employ a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the various tested alternatives, radial basis function (RBF) was the type of kernel that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables revealed a more successful strategy than selecting features using principal component analysis. The presented results demonstrate that SVR with RBF kernel allows us to accurately predict hourly pollutant concentrations, like carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5, as well as the hourly AQI for the state of California. Classification into six AQI categories defined by the US Environmental Protection Agency was performed with an accuracy of 94.1% on unseen validation data.
AB - Predicting air quality is a complex task due to the dynamic nature, volatility, and high variability in time and space of pollutants and particulates. At the same time, being able to model, predict, and monitor air quality is becoming more and more relevant, especially in urban areas, due to the observed critical impact of air pollution on citizens' health and the environment. In this paper, we employ a popular machine learning method, support vector regression (SVR), to forecast pollutant and particulate levels and to predict the air quality index (AQI). Among the various tested alternatives, radial basis function (RBF) was the type of kernel that allowed SVR to obtain the most accurate predictions. Using the whole set of available variables revealed a more successful strategy than selecting features using principal component analysis. The presented results demonstrate that SVR with RBF kernel allows us to accurately predict hourly pollutant concentrations, like carbon monoxide, sulfur dioxide, nitrogen dioxide, ground-level ozone, and particulate matter 2.5, as well as the hourly AQI for the state of California. Classification into six AQI categories defined by the US Environmental Protection Agency was performed with an accuracy of 94.1% on unseen validation data.
UR - http://www.scopus.com/inward/record.url?scp=85089776250&partnerID=8YFLogxK
UR - http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=WOS_CPL&DestLinkType=FullRecord&UT=000562873900003
U2 - 10.1155/2020/8049504
DO - 10.1155/2020/8049504
M3 - Article
AN - SCOPUS:85089776250
SN - 1076-2787
VL - 2020
SP - 1
EP - 23
JO - Complexity
JF - Complexity
M1 - 8049504
ER -