Abstract
In pursuit of accurate air pollution prediction, researchers have proposed a range of
methodologies, each using different data and approaches and each judged accurate within
its own context. Researchers have also used different combinations of data, such as
Meteorological, Traffic and Air Quality data. As a result, it remains an open question
which machine learning (ML) algorithm, or ensemble of algorithms, is best suited to a given
combination of data and of dependent and independent variables. While there is a clear
need for better-performing predictive models for air pollution, it is difficult to know
which combination of algorithms and data is best suited to each dependent variable. In
this study, we reviewed 26 recently published research articles, and the methods they
applied to different data, to identify which combination of ML algorithms and data works
best for predicting various air pollutants. The study revealed that, despite the
availability of many datasets, researchers in this domain cannot avoid using Air Quality
and Meteorological datasets. It also found that Random Forest appears to perform well
across various combinations of datasets.
Original language | English |
---|---|
Title of host publication | EDMIC 2021 CONFERENCE PROCEEDINGS ENVIRONMENTAL DESIGN & MANAGEMENT INTERNATIONAL CONFERENCE |
Subtitle of host publication | Confluence of Theory and Practice in the Built Environment: Beyond Theory into Practice |
Place of Publication | Faculty of Environmental Design and Management Obafemi Awolowo University, Ile-Ife |
Publisher | Obafemi Awolowo University, Ile-Ife |
ISBN (Print) | 978-37119-9-7 |
Publication status | Published - 6 Jul 2021 |