arXiv Preprint
Infectious disease outbreaks can have a disruptive impact on public health
and societal processes. As decision making in the context of epidemic
mitigation is hard, reinforcement learning provides a methodology to
automatically learn prevention strategies in combination with complex epidemic
models. Current research focuses on optimizing policies w.r.t. a single
objective, such as the pathogen's attack rate. However, as the mitigation of
epidemics involves distinct and possibly conflicting criteria (e.g.,
prevalence, mortality, morbidity, cost), a multi-objective approach is
warranted to learn balanced policies. To lift this decision-making process to
real-world epidemic models, we apply deep multi-objective reinforcement
learning and build upon a state-of-the-art algorithm, Pareto Conditioned
Networks (PCN), to learn a set of solutions that approximates the Pareto front
of the decision problem. We consider the first wave of the Belgian COVID-19
epidemic, which was mitigated by a lockdown, and study different deconfinement
strategies, aiming to minimize both COVID-19 cases (i.e., infections and
hospitalizations) and the societal burden that is induced by the applied
mitigation measures. We contribute a multi-objective Markov decision process
that encapsulates the stochastic compartment model that was used to inform
policy makers during the COVID-19 epidemic. As these social mitigation measures
are implemented in a continuous action space that modulates the contact matrix
of the age-structured epidemic model, we extend PCN to this setting. We
evaluate the solution set returned by PCN and observe that it correctly learns to
reduce the social burden whenever the hospitalization rates are sufficiently
low. In this work, we thus show that multi-objective reinforcement learning is
attainable in complex epidemiological models and provides essential insights to
balance complex mitigation policies.
Mathieu Reymond, Conor F. Hayes, Lander Willem, Roxana Rădulescu, Steven Abrams, Diederik M. Roijers, Enda Howley, Patrick Mannion, Niel Hens, Ann Nowé, Pieter Libin
2022-04-11
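
As an illustration of what "a set of solutions that approximates the Pareto front" means in the abstract above, the following is a minimal Python sketch of non-dominated filtering over two objectives. It is not PCN and not the authors' code: the function names and the numeric returns are hypothetical, chosen only for illustration, and both objectives are assumed to be expressed as quantities to maximize (negated hospitalizations and negated social burden).

```python
def dominates(a, b):
    """Return True if a is at least as good as b in every objective
    and strictly better in at least one (maximization)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical policy returns: (-hospitalizations, -social burden).
# These numbers are made up for illustration, not results from the paper.
returns = [
    (-100.0, -50.0),  # fewer restrictions, more hospitalizations
    (-80.0, -70.0),   # more restrictions, fewer hospitalizations
    (-120.0, -60.0),  # worse than the first point on both objectives
    (-80.0, -90.0),   # same hospitalizations as the second, higher burden
]

front = pareto_front(returns)
print(front)
```

The two points that survive represent different trade-offs between the objectives; neither is better than the other on both, which is why a multi-objective method returns a set of policies rather than a single one.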