ArXiv Preprint
In recent years, large amounts of electronic health records (EHRs) concerning
chronic diseases, such as cancer, diabetes, and mental disease, have been
collected to facilitate medical diagnosis. The dynamic properties of EHRs
related to chronic diseases can be modeled efficiently with dynamic treatment
regimes (DTRs), which are sets of sequential decision rules. While
reinforcement learning (RL) is widely used to construct DTRs, there is
ongoing research into RL algorithms that can effectively handle large
amounts of data. In this paper, we present a distributed Q-learning
algorithm for generating DTRs. Our contributions are as follows: 1) From a
methodological perspective, we combine distributed learning with Q-learning to
obtain a scalable approach for generating DTRs from large amounts of data.
2) From a theoretical standpoint, we provide
generalization error bounds for the proposed distributed Q-learning algorithm,
which are derived within the framework of statistical learning theory. These
bounds quantify the relationships between sample size, prediction accuracy, and
computational burden, providing insights into the performance of the algorithm.
3) From an applied perspective, we demonstrate the effectiveness of our
proposed distributed Q-learning algorithm for DTRs by applying it to clinical
cancer treatments. The results show that our algorithm outperforms
traditional linear Q-learning and commonly used deep Q-learning in both
prediction accuracy and computational cost.
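
To make the idea of combining distributed learning with Q-learning concrete, the following is a minimal sketch and not the authors' implementation. It assumes a divide-and-conquer scheme in which the data are split across machines, each machine fits a local ridge-regression Q-function by backward induction, and the local coefficients are averaged. The feature map, the regularization parameter `lam`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def features(state, action):
    # Hypothetical feature map: intercept, state, action, interaction.
    s = np.asarray(state, dtype=float)
    a = np.asarray(action, dtype=float)
    return np.column_stack([np.ones_like(s), s, a, s * a])

def ridge_fit(X, y, lam=1e-3):
    # Closed-form ridge regression on one machine's block of data.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def q_values(theta, state, action_set):
    # Q(s, a) for every candidate action; result has shape (n, n_actions).
    s = np.asarray(state, dtype=float)
    return np.stack(
        [features(s, np.full_like(s, a)) @ theta for a in action_set], axis=1
    )

def distributed_q_learning(states, actions, rewards, action_set,
                           n_machines, lam=1e-3):
    # states, actions, rewards: arrays of shape (n_patients, n_stages).
    n, T = rewards.shape
    blocks = np.array_split(np.arange(n), n_machines)  # one block per machine
    thetas = [None] * T
    future = np.zeros(n)  # no value beyond the final stage
    for t in reversed(range(T)):  # backward induction over stages
        target = rewards[:, t] + future
        local = [
            ridge_fit(features(states[idx, t], actions[idx, t]),
                      target[idx], lam)
            for idx in blocks
        ]
        thetas[t] = np.mean(local, axis=0)  # average the local estimators
        future = q_values(thetas[t], states[:, t], action_set).max(axis=1)
    return thetas
```

In this sketch the estimated decision rule at stage `t` is the action that maximizes `q_values(thetas[t], state, action_set)`. Each machine solves only a small linear system on its own block, which is where the computational saving of the divide-and-conquer approach comes from.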
Di Wang, Yao Wang, Shaojie Tang, Shao-Bo Lin
2023-02-21