In JCO Clinical Cancer Informatics
PURPOSE: Little is known about predicting 2-year cancer-specific survival (CSS) in the head and neck cancer (HNC) population. This study aimed to develop and validate machine learning models and a nomogram for predicting 2-year CSS in patients with HNC, using real-world data collected by major teaching and tertiary referral hospitals in New South Wales (NSW), Australia.
MATERIALS AND METHODS: Data collected in oncology information systems at multiple NSW Cancer Centres were extracted for 2,953 eligible adults diagnosed between 2000 and 2017 with squamous cell carcinoma of the head and neck. Death data were sourced from the National Death Index using record linkage. Machine learning and Cox regression/nomogram models were developed and internally validated in Python and R, respectively.
RESULTS: Machine learning models demonstrated the highest performance (C-index) in the larynx and nasopharynx cohorts (0.82), followed by the oropharynx cohort (0.79) and the hypopharynx and oral cavity cohorts (0.73). In the whole HNC population, C-indexes of 0.79 and 0.70 and Brier scores of 0.10 and 0.27 were reported for the machine learning and nomogram models, respectively. Cox regression analysis identified age, T and N classification, and time-corrected biologic equivalent dose in 2-Gy fractions as independent prognostic factors for 2-year CSS. N classification was the most important feature for prediction in the machine learning model, followed by age.
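For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a C-index and a Brier score can be computed. It is not the authors' code. The function names, the toy follow-up data, and the simplified handling of tied times are all assumptions made for illustration, and the Brier score shown ignores censoring before the 2-year horizon.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: share of comparable pairs ranked correctly.

    Simplification (assumption): pairs with identical follow-up times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time ended in an event.
        if not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher predicted risk died earlier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    return concordant / comparable


def brier_score(outcomes, predicted_probs):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, predicted_probs)) / len(outcomes)


# Hypothetical toy cohort: follow-up in months, 1 = cancer death, 0 = alive at 24 months.
times = [5, 10, 14, 20, 24, 24]
events = [1, 1, 1, 1, 0, 0]
# Hypothetical predicted probabilities of cancer death within 2 years.
predicted = [0.9, 0.4, 0.5, 0.6, 0.2, 0.1]

c_index = concordance_index(times, events, predicted)
brier = brier_score(events, predicted)
print(f"C-index: {c_index:.3f}, Brier score: {brier:.3f}")
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, whereas a lower Brier score indicates better-calibrated and more accurate probabilities.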
CONCLUSION: Machine learning and nomogram analysis predicted 2-year CSS with high performance using routinely collected and complete clinical information extracted from oncology information systems. These models function as visual decision-making tools to guide radiotherapy treatment decisions and provide insight into the prediction of survival outcomes in patients with HNC.
Kotevski Damian P, Smee Robert I, Vajdic Claire M, Field Matthew
2023-Jan