In Environmental health and preventive medicine
BACKGROUND : Previous cardiovascular risk prediction models in Japan have utilized prospective cohort studies with concise data. As health information, including health check-up records and administrative claims, becomes digitized and publicly available, applying large datasets based on such real-world data may improve prediction accuracy and support the social implementation of cardiovascular disease risk prediction models in preventive and clinical practice. In this study, classical regression and machine learning methods were explored to develop ischemic heart disease (IHD) and stroke prognostic models using real-world data.
METHODS : The IQVIA Japan Claims Database was searched to include 691,160 individuals (predominantly corporate employees working in secondary and tertiary industries, and their families) with at least one annual health check-up record during the identification period (April 2013-December 2018). The primary outcome of the study was the first recorded IHD or stroke event. Predictors were taken from the annual health check-up record at the index year-month, comprising demographic characteristics, laboratory tests, and questionnaire features. Four prediction models (Cox, Elnet-Cox, XGBoost, and Ensemble) were assessed in the present study to develop a cardiovascular disease risk prediction model for Japan.
RESULTS : The analysis cohort consisted of 572,971 individuals. All prediction models showed similarly good performance. Harrell's C-index was close to 0.9 for all IHD models, and above 0.7 for stroke models. In IHD models, age, sex, high-density lipoprotein, low-density lipoprotein, cholesterol, and systolic blood pressure had higher importance, while in stroke models systolic blood pressure and age had higher importance.
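For readers unfamiliar with the metric reported above, the following is a minimal sketch of how Harrell's C-index can be computed from follow-up times, event indicators, and predicted risk scores. It is not the authors' implementation: the function name `harrell_c_index`, its argument names, and the toy data are illustrative assumptions, and pairs with tied follow-up times are simply skipped for brevity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject with the
    shorter observed time also has the higher predicted risk.

    times       -- observed follow-up time per subject
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore tied times
        # a is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so values near 0.9 (IHD) and above 0.7 (stroke) indicate that the models order most comparable pairs correctly.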
CONCLUSION : Our study evaluated classical regression and machine learning algorithms for developing cardiovascular disease risk prediction models for IHD and stroke in Japan. The resulting models showed good predictive accuracy in a large population and can be applied to practical use.
Yoshida Shigeto, Tanaka Shu, Okada Masafumi, Ohki Takuya, Yamagishi Kazumasa, Okuno Yasushi
2023
Ischemic heart disease, Machine learning, Real-world data, Risk prediction model, Stroke