ArXiv Preprint
COVID-19 has caused thousands of deaths around the world and has also resulted
in large-scale international economic disruption. Identifying the pathways associated
with this illness can help medical researchers to better understand the
properties of the condition. Such pathways can be identified by analyzing
medical records. It is crucial to develop tools and models that can aid
researchers with this analysis in a timely manner. However, medical records are
often unstructured clinical notes, which poses significant challenges for
developing automated systems. In this article, we propose a pipeline to aid
practitioners in analyzing clinical notes and revealing the pathways associated
with this disease. Our pipeline relies on topological properties and consists
of three steps: 1) pre-processing the clinical notes to extract the salient
concepts, 2) constructing a feature space of the patients to characterize the
extracted concepts, and finally, 3) leveraging the topological properties to
distill the available knowledge and visualize the result. Our experiments on a
publicly available dataset of COVID-19 clinical notes show that our pipeline
can extract meaningful pathways.
Negin Karisani, Daniel E. Platt, Saugata Basu, Laxmi Parida
2021-01-19