In Breast cancer research and treatment
BACKGROUND : The aim of this study was to develop a machine learning (ML)-based model to accurately predict pathologic complete response (pCR) to neoadjuvant chemotherapy (NAC) in breast cancer (BC) using pretreatment clinical and pathological characteristics from electronic medical record (EMR) data.
METHODS : EMR data from patients who were diagnosed with early or locally advanced BC and who received NAC followed by curative surgery were reviewed. A total of 16 clinical and pathological characteristics were selected to develop the ML model. We trained six ML models with default settings for multivariate analysis of the extracted variables.
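The comparison step described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the data are synthetic (16 features, roughly 30% positive class), the six scikit-learn classifiers are hypothetical stand-ins because the abstract names only LightGBM, `GradientBoostingClassifier` takes the place of LightGBM to avoid an extra dependency, and 5-fold cross-validation is an assumed evaluation scheme. The helper name `compare_models` is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the EMR table (assumption): 16 features, ~30% positives.
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)

# Six candidate classifiers, each left at its library defaults
# (random_state is set only to make the sketch reproducible).
candidates = {
    "logistic_regression": LogisticRegression(),
    "naive_bayes": GaussianNB(),
    "k_nearest_neighbors": KNeighborsClassifier(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return the mean cross-validated ROC AUC for each named model."""
    scores = {}
    for name, model in models.items():
        fold_auc = cross_val_score(model, X, y, cv=folds, scoring="roc_auc")
        scores[name] = float(fold_auc.mean())
    return scores


scores = compare_models(candidates, X, y)
best = max(scores, key=scores.get)  # model with the highest mean AUC
```

Ranking by mean AUC rather than accuracy matches the metric reported in the abstract and is less sensitive to the class imbalance.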
RESULTS : In total, 2065 patients were included in this analysis. Overall, 30.6% (n = 632) of patients achieved pCR. Among the six ML models, LightGBM had the highest area under the curve (AUC) for pCR prediction. After hyperparameter tuning with Bayesian optimization, the AUC was 0.810. The performance of the pCR prediction model was compared across histology-based subtypes. The AUC was highest in the HR+/HER2- subgroup and lowest in the HR-/HER2- subgroup (HR+/HER2- 0.841, HR+/HER2+ 0.716, HR-/HER2+ 0.753, HR-/HER2- 0.653).
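A subgroup comparison of this kind amounts to computing the AUC separately within each subtype. The sketch below uses only the standard library and computes the AUC as the probability that a randomly chosen pCR case receives a higher predicted score than a randomly chosen non-pCR case. The function names and the toy records are hypothetical; the scores are made up for illustration and are not study data.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(records):
    """Group (subtype, label, score) records by subtype and score each group."""
    grouped = defaultdict(lambda: ([], []))
    for subtype, label, score in records:
        grouped[subtype][0].append(label)
        grouped[subtype][1].append(score)
    return {k: roc_auc(ls, ss) for k, (ls, ss) in grouped.items()}


# Toy records (illustrative only): subtype, pCR label (1 = pCR), predicted score.
records = [
    ("HR+/HER2-", 1, 0.9), ("HR+/HER2-", 0, 0.2),
    ("HR+/HER2-", 0, 0.4), ("HR+/HER2-", 1, 0.7),
    ("HR-/HER2-", 1, 0.6), ("HR-/HER2-", 0, 0.7),
    ("HR-/HER2-", 1, 0.8), ("HR-/HER2-", 0, 0.5),
]
subgroup_auc = auc_by_subgroup(records)
```

The pairwise form is quadratic in group size, which is acceptable for a few thousand patients; a rank-based implementation would be preferable for much larger cohorts.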
CONCLUSIONS : An ML-based pCR prediction model using pretreatment clinical and pathological characteristics provided useful information for predicting pCR to NAC. This prediction model could help determine the treatment strategy for patients with BC for whom NAC is planned.
Kim Ji-Yeon, Jeon Eunjoo, Kwon Soonhwan, Jung Hyungsik, Joo Sunghoon, Park Youngmin, Lee Se Kyung, Lee Jeong Eon, Nam Seok Jin, Cho Eun Yoon, Park Yeon Hee, Ahn Jin Seok, Im Young-Hyuck
Breast cancer, Machine learning, Neoadjuvant chemotherapy, Pathologic complete response