In JCO Clinical Cancer Informatics
PURPOSE : Building well-performing machine learning (ML) models in health care has always been challenging because of data-sharing concerns, yet ML approaches often require larger training samples than a single institution can provide. This paper explores several federated learning implementations by applying them both in a simulated environment and in an actual implementation using electronic health record data from two academic medical centers on a Microsoft Azure Cloud Databricks platform.
MATERIALS AND METHODS : Using two separate cloud tenants, ML models were created, trained, and exchanged between the institutions via a GitHub repository. Federated learning processes were applied to both artificial neural networks (ANNs) and logistic regression (LR) models on horizontal data sets that varied in count and availability. Incremental and cyclic federated learning models were tested in both simulated and real environments.
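The sequential exchange described above can be illustrated with a minimal sketch. It assumes that "incremental" means each institution trains the shared model once in turn and that "cyclic" means repeating that pass several times. The synthetic data, the NumPy logistic regression, and all names (`local_train`, `sequential_federated`, `make_site`) are hypothetical stand-ins, not the study's Databricks or GitHub pipeline.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_train(w, X, y, lr=0.1, epochs=50):
    """Continue gradient-descent training of LR weights on one site's data."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def sequential_federated(sites, n_features, cycles=1):
    """Pass weights from site to site; only weights move, never patient rows.

    cycles=1 approximates incremental learning, cycles>1 cyclic learning
    (an assumption about the terms, not the paper's exact procedure).
    """
    w = np.zeros(n_features)
    for _ in range(cycles):
        for X, y in sites:  # the visiting order of sites can change the result
            w = local_train(w, X, y)
    return w

def make_site(rng, n, true_w):
    """Synthetic horizontal data set: same features, different row counts."""
    X = rng.normal(size=(n, len(true_w)))
    y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)
    return X, y

def accuracy(w, X, y):
    return float(np.mean((sigmoid(X @ w) > 0.5) == y))

rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])   # hypothetical ground-truth weights
site_a = make_site(rng, 400, true_w)  # larger institution
site_b = make_site(rng, 150, true_w)  # smaller institution
test_set = make_site(rng, 1000, true_w)

w_inc = sequential_federated([site_a, site_b], 3, cycles=1)
w_cyc = sequential_federated([site_a, site_b], 3, cycles=5)
print("incremental accuracy:", accuracy(w_inc, *test_set))
print("cyclic accuracy:", accuracy(w_cyc, *test_set))
```

Reversing the list passed to `sequential_federated` gives a simple way to probe sensitivity to institution order in this toy setting.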
RESULTS : The cyclically trained ANN showed a 3% increase in performance, a significant improvement across most attempts (P < .05). Single weight neural network models showed improvement in some cases. However, LR models did not show much improvement after federated learning processes. The specific process that improved performance differed depending on the ML model and how federated learning was implemented. Moreover, we confirmed that the order of the institutions during training influenced the overall performance increase.
CONCLUSION : Unlike previous studies, our work demonstrates the implementation and effectiveness of federated learning processes beyond simulation. Additionally, we identified different federated learning models that achieved statistically significant performance improvements. More work is needed to achieve effective federated learning processes in biomedicine while preserving the security and privacy of the data.
Rajendran Suraj, Obeid Jihad S, Binol Hamidullah, D'Agostino Ralph, Foley Kristie, Zhang Wei, Austin Philip, Brakefield Joey, Gurcan Metin N, Topaloglu Umit