In Journal of Medical Engineering & Technology
Since the outbreak of the novel coronavirus, COVID-19 has spread rapidly across the globe. Because the symptoms of the disease vary widely, there is an urgent need to identify the high-risk groups of people who are most susceptible to this deadly virus, which would benefit health care. Using open-access data and machine learning algorithms, this paper aims to cluster countries into groups with similar profiles with respect to country-level parameters from before the COVID-19 pandemic. The purpose of the analysis is to measure the extent to which these major risk factors determine the mortality rate of coronavirus disease 2019. An unsupervised machine learning model (k-means) was applied to 208 countries to define data-driven clusters based on thirteen country-level parameters. One-way ANOVA was then used to compare the clusters in terms of total cases, total deaths, total cases per population, total deaths per population, and death rate. The models with four and seven clusters showed the best ability to stratify the countries according to total cases per population and death rate, with p-values of less than 0.05 and 0.001, respectively. However, the model could not stratify countries by total cases, total deaths, or total deaths per population.
Garg Poojita, Joshi Deepak
COVID-19, K-means, machine learning, risk-factors
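
The pipeline described in the abstract (standardize country-level parameters, cluster with k-means, then compare an outcome across clusters with one-way ANOVA) can be sketched as follows. This is a minimal illustration, not the authors' code: the six rows of data are invented toy values, the function names are hypothetical, and the deterministic farthest-point seeding is an assumption made here for reproducibility (the paper does not state how the centroids were initialized).

```python
from statistics import mean

def standardize(rows):
    """Z-score each column so no single parameter dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, mus)]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Assumed seeding: start at the first point, then repeatedly add the
    point farthest from all centroids chosen so far (deterministic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels

def anova_f(values, labels):
    """One-way ANOVA F statistic for an outcome grouped by cluster label."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    grand, n, g = mean(values), len(values), len(groups)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)
    return (ss_between / (g - 1)) / (ss_within / (n - g))

# Toy data (invented): two country-level parameters and one outcome per "country".
features = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [8.0, 9.0], [8.2, 9.1], [7.9, 8.8]]
death_rate = [0.5, 0.6, 0.4, 2.0, 2.1, 1.9]

labels = kmeans(standardize(features), k=2)
f_stat = anova_f(death_rate, labels)
print(labels, round(f_stat, 1))
```

A large F statistic indicates that the outcome differs between clusters more than within them. Turning it into a p-value requires the F distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.f_oneway`) reports both the statistic and the p-value.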