Machine learning approaches for outdoor air quality modelling: a systematic review

Yves Rybarczyk, Rasa Zalakeviciute

Research output: Contribution to journal (Review article)

Abstract

Current studies show that traditional deterministic models tend to struggle to capture the non-linear relationship between the concentration of air pollutants and their sources of emission and dispersion. To tackle such a limitation, the most promising approach is to use statistical models based on machine learning techniques. Nevertheless, it is puzzling why a certain algorithm is chosen over another for a given task. This systematic review intends to clarify this question by providing the reader with a comprehensive description of the principles underlying these algorithms and how they are applied to enhance prediction accuracy. A rigorous search that conforms to the PRISMA guideline is performed and results in the selection of the 46 most relevant journal papers in the area. Through a factorial analysis method, these studies are synthesized and linked to each other. The main findings of this literature review show that: (i) machine learning is mainly applied on the Eurasian and North American continents and (ii) estimation problems tend to implement Ensemble Learning and Regressions, whereas forecasting makes use of Neural Networks and Support Vector Machines. The next challenges of this approach are to improve the prediction of pollution peaks and of contaminants recently put in the spotlight (e.g., nanoparticles).

Original language: English
Article number: 2570
Journal: Applied Sciences (Switzerland)
Volume: 8
Issue number: 12
DOI: 10.3390/app8122570
Publication status: Published - 11 Dec 2018

Fingerprint

machine learning
air quality
contaminants
Learning systems
readers
continents
predictions
pollution
forecasting
learning
regression analysis
Air Pollutants
nanoparticles
Support vector machines
air
Impurities
Neural networks

Keywords

  • Atmospheric pollution
  • Data mining
  • Multiple correspondence analysis
  • Predictive models

Cite this

@article{c41317317cda4d749ee89edf95afe9b5,
title = "Machine learning approaches for outdoor air quality modelling: a systematic review",
abstract = "Current studies show that traditional deterministic models tend to struggle to capture the non-linear relationship between the concentration of air pollutants and their sources of emission and dispersion. To tackle such a limitation, the most promising approach is to use statistical models based on machine learning techniques. Nevertheless, it is puzzling why a certain algorithm is chosen over another for a given task. This systematic review intends to clarify this question by providing the reader with a comprehensive description of the principles underlying these algorithms and how they are applied to enhance prediction accuracy. A rigorous search that conforms to the PRISMA guideline is performed and results in the selection of the 46 most relevant journal papers in the area. Through a factorial analysis method, these studies are synthesized and linked to each other. The main findings of this literature review show that: (i) machine learning is mainly applied on the Eurasian and North American continents and (ii) estimation problems tend to implement Ensemble Learning and Regressions, whereas forecasting makes use of Neural Networks and Support Vector Machines. The next challenges of this approach are to improve the prediction of pollution peaks and of contaminants recently put in the spotlight (e.g., nanoparticles).",
keywords = "Atmospheric pollution, Data mining, Multiple correspondence analysis, Predictive models",
author = "Yves Rybarczyk and Rasa Zalakeviciute",
year = "2018",
month = "12",
day = "11",
doi = "10.3390/app8122570",
language = "English",
volume = "8",
journal = "Applied sciences-Basel",
issn = "2076-3417",
publisher = "MDPI AG",
number = "12",

}

Machine learning approaches for outdoor air quality modelling: a systematic review. / Rybarczyk, Yves; Zalakeviciute, Rasa.

In: Applied Sciences (Switzerland), Vol. 8, No. 12, 2570, 11.12.2018.

TY - JOUR

T1 - Machine learning approaches for outdoor air quality modelling: a systematic review

AU - Rybarczyk, Yves

AU - Zalakeviciute, Rasa

PY - 2018/12/11

Y1 - 2018/12/11

AB - Current studies show that traditional deterministic models tend to struggle to capture the non-linear relationship between the concentration of air pollutants and their sources of emission and dispersion. To tackle such a limitation, the most promising approach is to use statistical models based on machine learning techniques. Nevertheless, it is puzzling why a certain algorithm is chosen over another for a given task. This systematic review intends to clarify this question by providing the reader with a comprehensive description of the principles underlying these algorithms and how they are applied to enhance prediction accuracy. A rigorous search that conforms to the PRISMA guideline is performed and results in the selection of the 46 most relevant journal papers in the area. Through a factorial analysis method, these studies are synthesized and linked to each other. The main findings of this literature review show that: (i) machine learning is mainly applied on the Eurasian and North American continents and (ii) estimation problems tend to implement Ensemble Learning and Regressions, whereas forecasting makes use of Neural Networks and Support Vector Machines. The next challenges of this approach are to improve the prediction of pollution peaks and of contaminants recently put in the spotlight (e.g., nanoparticles).

KW - Atmospheric pollution

KW - Data mining

KW - Multiple correspondence analysis

KW - Predictive models

UR - http://www.scopus.com/inward/record.url?scp=85058241931&partnerID=8YFLogxK

U2 - 10.3390/app8122570

DO - 10.3390/app8122570

M3 - Review article

VL - 8

JO - Applied sciences-Basel

JF - Applied sciences-Basel

SN - 2076-3417

IS - 12

M1 - 2570

ER -