In Journal of biomedical informatics ; h5-index 55.0
OBJECTIVE : In the fields of medical care and research as well as hospital management, time series are an important part of the overall data basis. To ensure high quality standards and enable suitable decisions, tools for precise and generic imputations and forecasts that integrate the temporal dynamics are of great importance. Since forecasting and imputation tasks involve an inherent uncertainty, the focus of our work lay on a probabilistic multivariate generative approach that samples infillings or forecasts from an analysable distribution rather than producing deterministic results.
MATERIALS AND METHODS : For this task, we developed a system based on generative adversarial networks that consist of recurrent encoders and decoders with attention mechanisms and can learn the distribution of intervals from multivariate time series conditioned on the periods before and, if available, periods after the values that are to be predicted. For training, validation and testing, a data set of jointly measured blood pressure series (ABP) and electrocardiograms (ECG) (length: 1,250 time steps, corresponding to 10 s) was generated. For the imputation tasks, one interval of fixed length was masked randomly and independently in both channels of every sample. For the forecasting task, all masks were positioned at the end.
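The masking scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_masks`, the boolean-mask representation and the mask length of 250 steps are assumptions chosen for illustration (the abstract only states that the interval length is fixed). Only the series length of 1,250 and the two channels are taken from the text.

```python
import numpy as np

def make_masks(n_channels, length, mask_len, task, rng):
    """Return a boolean array of shape (n_channels, length).

    True marks the values that are hidden and have to be predicted.
    """
    mask = np.zeros((n_channels, length), dtype=bool)
    for c in range(n_channels):
        if task == "forecast":
            # Forecasting: the masked interval always sits at the end.
            start = length - mask_len
        elif task == "imputation":
            # Imputation: the start is drawn independently for each channel.
            start = int(rng.integers(0, length - mask_len + 1))
        else:
            raise ValueError(f"unknown task: {task!r}")
        mask[c, start:start + mask_len] = True
    return mask

rng = np.random.default_rng(0)
# 2 channels (ECG, ABP) and 1,250 steps as in the abstract; 250 is an assumed mask length.
imputation_mask = make_masks(2, 1250, 250, "imputation", rng)
forecast_mask = make_masks(2, 1250, 250, "forecast", rng)
```

In such a setup the unmasked values would serve as the conditioning context, and the model would be asked to generate only the positions where the mask is True.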
RESULTS : The models were trained on around 65,000 bivariate samples and tested against 14,000 series of different persons. For the evaluation, 50 samples were produced for every masked interval to estimate the range of the generated infillings or forecasts. The element-wise arithmetic average of these samples served as an estimator for the mean of the learned conditional distribution. The approach showed better results than a state-of-the-art probabilistic multivariate forecasting mechanism based on Gaussian copula transformation and recurrent neural networks. On the imputation task, the proposed method reached a mean squared error (MSE) of 0.057 on the ECG channel and an MSE of 28.30 on the ABP channel, while the baseline approach reached MSEs of 0.095 (ECG) and 229.1 (ABP). Moreover, on the forecasting task, the presented system achieved MSEs of 0.069 (ECG) and 33.73 (ABP), outperforming the recurrent copula approach, which reached MSEs of 0.082 (ECG) and 196.53 (ABP).
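The evaluation procedure described above (several generated samples per masked interval, an element-wise average as the estimator of the conditional mean, and a per-channel MSE) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the generated samples are replaced by synthetic noisy copies of a random target, and restricting the MSE to the masked positions is an assumption that the abstract does not confirm.

```python
import numpy as np

def mean_estimate(samples):
    """Element-wise average over the draws; samples has shape (n_draws, n_channels, length)."""
    return samples.mean(axis=0)

def sample_range(samples):
    """Element-wise minimum and maximum over the draws, as a simple spread estimate."""
    return samples.min(axis=0), samples.max(axis=0)

def masked_mse_per_channel(estimate, target, mask):
    """MSE per channel, computed only where mask is True (an assumption of this sketch)."""
    sq_err = (estimate - target) ** 2
    return np.array([sq_err[c, mask[c]].mean() for c in range(target.shape[0])])

rng = np.random.default_rng(0)
target = rng.normal(size=(2, 1250))            # synthetic stand-in for an (ECG, ABP) pair
mask = np.zeros((2, 1250), dtype=bool)
mask[:, -250:] = True                          # forecasting-style mask; 250 is an assumed length

# Stand-in for 50 draws from a trained generator: the target plus small noise.
samples = target + rng.normal(scale=0.1, size=(50, 2, 1250))

estimate = mean_estimate(samples)
lower, upper = sample_range(samples)
mse = masked_mse_per_channel(estimate, target, mask)
print(mse)  # one value per channel
```

Averaging the draws reduces the variance of the point estimate, while the range across draws retains the information about uncertainty that a single deterministic prediction would not provide.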
CONCLUSION : The presented generative probabilistic system for the imputation and forecasting of (medical) time series features the flexibility to handle masks of different sizes and positions, the ability to quantify uncertainty due to its probabilistic predictions, and an adjustable trade-off between the goals of minimising errors in individual predictions and minimising the distance between the learned and the real conditional distribution of the infillings or forecasts.
Festag Sven, Spreckelsen Cord
2023-Feb-13
Forecast, GAN, Imputation, Machine learning, Time series