In Traffic injury prevention

OBJECTIVE : This study aimed to introduce the random forest (RF) method as a valuable tool for short-term crash frequency prediction. It also compares the forecast accuracy of the RF model with that of the classical seasonal autoregressive integrated moving average (SARIMA) model in the multivariate time-series analysis of crash counts.

METHODS : Fatal accidents reported by the police and intercity traffic flow extracted from loop detectors were aggregated monthly at the national level for intercity highways, from Farvardin 1395 to Mordad 1400 (65 months). The first 55 months were used as the training sample, and the remaining ten months as the test sample. The Box-Jenkins and random forest machine learning methods were employed for short-term crash frequency prediction. The mean absolute percentage error (MAPE) was used to compare the forecast accuracy of the developed models.
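
As a rough illustration of the evaluation setup described above, the following minimal Python sketch shows a chronological 55/10 split and a MAPE calculation. The function names and the placeholder series are hypothetical and are not taken from the study; the actual crash and traffic data are not reproduced here.

```python
def chronological_split(series, n_train):
    """Split a time series in order: the first n_train points are the
    training sample, the rest are the test sample (no shuffling)."""
    return series[:n_train], series[n_train:]


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical placeholder series of 65 monthly counts (NOT the study's data).
monthly_counts = [100 + (i % 12) for i in range(65)]
train, test = chronological_split(monthly_counts, 55)
```

Splitting in time order, rather than at random, keeps the test months strictly after the training months, which is what a short-term forecasting comparison needs.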

RESULTS : The random forest model with traffic flow, crash year, and month as exogenous variables (MAPE = 2.6) outperformed the best SARIMA (1,0,0) (1,0,0)12 model with traffic flow as the regressor (MAPE = 5.7).
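
For readers unfamiliar with the notation, one common way to write a SARIMA (1,0,0) (1,0,0)12 model with a single regressor is the regression-with-SARIMA-errors form sketched below. The symbols are assumptions introduced for illustration (y_t for the monthly crash count, x_t for traffic flow, B for the backshift operator, c for an intercept); the abstract does not state the exact parameterization the authors used.

```latex
% Sketch only: regression with SARIMA(1,0,0)(1,0,0)_{12} errors.
% y_t: monthly crash count, x_t: traffic flow (assumed symbols).
(1 - \phi_1 B)\,(1 - \Phi_1 B^{12})\,\bigl(y_t - c - \beta x_t\bigr) = \varepsilon_t,
\qquad B^{k} y_t = y_{t-k}
```

Here \phi_1 is the non-seasonal autoregressive coefficient, \Phi_1 the seasonal (lag-12) autoregressive coefficient, \beta the effect of traffic flow, and \varepsilon_t a white-noise error term.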

CONCLUSIONS : This study suggests that the random forest, as an ensemble learning algorithm, is a better crash prediction tool than the classical Box-Jenkins method, as it accounts for the non-linear dependencies in crash count time series. The results also show that the multivariate SARIMA (SARIMAX) model significantly outperforms its univariate counterpart, as it accounts for the simultaneous impacts of exogenous variables.

Nassiri Habibollah, Mohammadpour Seyed Iman, Dahaghin Mohammad


Forecast, SARIMAX, machine learning, random forest, time series, traffic crash