Short-Term Hourly Ozone Concentration Forecasting Using Functional Data Approach

Shah, Ismail; Gul, Naveed; Ali, Sajid; Houmani, Hassan

doi:10.3390/econometrics12020012

Air pollution, especially ground-level ozone, poses severe threats to human health and ecosystems. Accurate forecasting of ozone concentrations is essential for reducing its adverse effects. This study aims to use the functional time series approach to model ozone concentrations, a method less explored in the literature, and compare it with traditional time series and machine learning models. To this end, the ozone concentration hourly time series is first filtered for yearly seasonality using smoothing splines that lead us to the stochastic (residual) component. The stochastic component is modeled and forecast using a functional autoregressive model (FAR), where each daily ozone concentration profile is considered a single functional datum. For comparison purposes, different traditional and machine learning techniques, such as autoregressive integrated moving average (ARIMA), vector autoregressive (VAR), neural network autoregressive (NNAR), random forest (RF), and support vector machine (SVM), are also used to model and forecast the stochastic component. Once the forecast from the yearly seasonality component and stochastic component are obtained, both are added to obtain the final forecast. For empirical investigation, data consisting of hourly ozone measurements from Los Angeles from 2013 to 2017 are used, and one-day-ahead out-of-sample forecasts are obtained for a complete year. Based on the evaluation metrics, such as R2, root mean squared error (RMSE), and mean absolute error (MAE), the forecasting results indicate that the FAR outperforms the competitors in most scenarios, with the SVM model performing the least favorably across all cases.