ArXiv Preprint
Poor air quality can have a significant impact on human health. The National
Oceanic and Atmospheric Administration (NOAA) air quality forecasting guidance
is challenged by the increasing presence of extreme air quality events due to
extreme weather events such as wildfires and heatwaves. These extreme air
quality events further affect human health. Traditional methods used to correct
model bias make assumptions about linearity and the underlying distribution.
Extreme air quality events tend to occur without a strong signal leading up to
the event, which tends to cause existing methods to either under- or
overcompensate for the bias. Deep learning holds promise for air quality
forecasting in the presence of extreme air quality events due to its ability to
generalize and learn nonlinear relationships. However, in the presence of these
anomalous air quality events, standard deep network approaches that use a
single network for generalizing to future forecasts may not always provide the
best performance even with a full feature-set including geography and
meteorology. In this work we describe a method that combines unsupervised
learning and a forecast-aware bi-directional LSTM network to perform bias
correction for operational air quality forecasting using AirNow station data
for ozone and PM2.5 in the continental US. An unsupervised clustering method is
trained on station geographical features such as latitude and longitude,
urbanization, and elevation; the learned clusters then direct training by
partitioning the training data for the LSTM networks. The LSTMs are
forecast-aware and are implemented with a unique scheme that learns forward and
backward in
time across forecasting days. Measured by RMSE against the base forecast
model, the bias-corrected model shows significant improvement (27% lower RMSE
for ozone).
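
The RMSE comparison reported above can be illustrated with a minimal sketch. The function names and the synthetic ozone values below are assumptions made for illustration only; they are not the authors' code or data, and the printed percentage is unrelated to the 27% result.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    if len(pred) != len(obs) or not obs:
        raise ValueError("pred and obs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(pred, obs)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_reduction_pct(raw_forecast, corrected_forecast, obs):
    """Percent reduction in RMSE of the corrected forecast relative to the raw one."""
    base = rmse(raw_forecast, obs)
    if base == 0.0:
        raise ValueError("raw forecast already has zero RMSE")
    return 100.0 * (base - rmse(corrected_forecast, obs)) / base


# Synthetic values (hypothetical, not AirNow data).
obs = [40.0, 55.0, 62.0, 48.0]
raw = [o + 10.0 for o in obs]        # raw forecast with a constant +10 bias
corrected = [o + 5.0 for o in obs]   # corrected forecast with residual +5 bias

print(rmse_reduction_pct(raw, corrected, obs))
```

A positive value means the corrected forecast has lower RMSE than the raw forecast; a negative value would indicate that the correction made the forecast worse.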
Sophia Hamer, Jennifer Sleeman, Ivanka Stajner
2023-03-23