In Journal of vascular surgery. Venous and lymphatic disorders
BACKGROUND : Post-thrombotic syndrome (PTS) is the most common chronic complication of deep venous thrombosis (DVT). Risk measurement and stratification of PTS are crucial for DVT patients. This study aimed to develop predictive models of PTS using machine learning (ML) for proximal DVT patients.
METHODS : Hospital inpatients from a DVT registry electronic health record (EHR) database were randomly divided into a derivation set and a validation set, and four predictive models were constructed using logistic regression, a simple decision tree, eXtreme Gradient Boosting (XGBoost), and random forest (RF) algorithms. The presence of PTS was defined according to the Villalta scale. The area under the receiver operating characteristic curve (AUC), decision-curve analysis (DCA), and calibration curves were used to evaluate the performance of these models. SHapley Additive exPlanations (SHAP) analysis was performed to explain the predictive models.
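The evaluation steps named in the methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the function names, the 30% validation fraction, and the fixed seed are assumptions made for illustration, since the abstract does not report the split ratio or the software used. The sketch shows a random derivation/validation split, the AUC computed as the probability that a randomly chosen PTS case receives a higher predicted risk than a randomly chosen non-PTS case, and the net benefit at a single threshold probability, which is the quantity a decision curve plots across thresholds.

```python
import random


def split_derivation_validation(records, validation_fraction=0.3, seed=0):
    """Randomly split records into (derivation, validation) lists.

    The 0.3 fraction and the seed are illustrative assumptions,
    not values reported in the abstract.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def net_benefit(labels, scores, threshold):
    """Net benefit at one threshold probability (0 < threshold < 1).

    Patients with score >= threshold are treated as predicted PTS.
    Net benefit = TP/n - FP/n * threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)
```

For example, with toy labels `[0, 0, 1, 1]` and predicted risks `[0.1, 0.4, 0.35, 0.8]` (made-up values, not study data), `auc` returns 0.75 because three of the four positive-negative pairs are ranked correctly. The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based computation would be preferable for larger data.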
RESULTS : Among the 300 patients, 126 developed PTS at 6 months post-DVT. The RF model exhibited the best performance among the four models, with an AUC of 0.891. The RF model demonstrated that Villalta score at admission, age, body mass index (BMI), and pain on calf compression were significant predictors of PTS, with accurate prediction at the individual level. The SHAP analysis suggested a nonlinear correlation between age and PTS, with two peak ages of onset at 50 and 70 years.
CONCLUSIONS : The current predictive model identified significant predictors and accurately predicted PTS for patients with proximal DVT. Moreover, the model demonstrated a nonlinear correlation between age and PTS, which might be valuable in risk measurement and stratification of PTS in proximal DVT patients.
Wu Zhaoyu, Li Yixuan, Lei Jiahao, Qiu Peng, Liu Haichun, Yang Xinrui, Chen Tao, Lu Xinwu
2022-Dec-26
age, deep venous thrombosis, machine learning, post-thrombotic syndrome, random forest