General General

ArXiv Preprint

Boolean Matrix Factorization (BMF) aims to find an approximation of a given binary matrix as the Boolean product of two low-rank binary matrices. Binary data is ubiquitous in many fields, and representing data by binary matrices is common in medicine, natural language processing, bioinformatics, and computer graphics, among many others. Unfortunately, BMF is computationally hard, and heuristic algorithms are used to compute Boolean factorizations. Very recently, a theoretical breakthrough was obtained independently by two research groups: Ban et al. (SODA 2019) and Fomin et al. (Trans. Algorithms 2020) showed that BMF admits an efficient polynomial-time approximation scheme (EPTAS). However, despite their theoretical importance, the double-exponential dependence of the running times on the rank makes these algorithms unimplementable in practice. The primary research question motivating our work is whether the theoretical advances on BMF could lead to practical algorithms. The main conceptual contribution of our work is the following: while the EPTAS for BMF is a purely theoretical advance, the general approach behind these algorithms could serve as the basis for designing better heuristics. We also use this strategy to develop new algorithms for the related $\mathbb{F}_p$-Matrix Factorization problem. Here, given a matrix $A$ over a finite field GF($p$), where $p$ is a prime, and an integer $r$, the objective is to find a matrix $B$ over the same field with GF($p$)-rank at most $r$ that minimizes some norm of $A-B$. Our empirical research on synthetic and real-world data demonstrates the advantage of the new algorithms over previous work on BMF and $\mathbb{F}_p$-Matrix Factorization.
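To make the objective concrete, the following is a minimal sketch of the Boolean product and the approximation error it is measured against. It is not the authors' algorithm: the matrices `A`, `U`, and `V` are hypothetical values chosen by hand, the helper names are illustrative, and the error is assumed to be the number of mismatched entries.

```python
def boolean_product(U, V):
    """Boolean product: entry (i, j) is the OR over k of (U[i][k] AND V[k][j])."""
    rows, inner, cols = len(U), len(V), len(V[0])
    return [
        [int(any(U[i][k] and V[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(rows)
    ]


def hamming_error(A, B):
    """Number of entries in which two binary matrices of equal shape differ."""
    return sum(a != b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


# Hypothetical 4 x 4 binary input and a hand-picked rank-2 factorization.
A = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
U = [[1, 0], [1, 1], [0, 1], [0, 1]]  # 4 x 2 binary factor
V = [[1, 1, 0, 0], [0, 0, 1, 1]]      # 2 x 4 binary factor

B = boolean_product(U, V)   # rank-2 Boolean approximation of A
error = hamming_error(A, B)  # entries where the approximation is wrong
```

A BMF heuristic would search over `U` and `V` to make `error` as small as possible for a fixed inner dimension; here the factors are fixed only to show how a candidate is scored.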

Fedor Fomin, Fahad Panolan, Anurag Patil, Adil Tanveer


General General

Application of machine learning in predicting blood flow and red cell distribution in capillary vessel networks.

In Journal of the Royal Society, Interface

Capillary blood vessels in the body partake in the exchange of gas and nutrients with tissues. They are interconnected via multiple vascular junctions forming the microvascular network. Distributions of blood flow and red blood cells (RBCs) in such networks are spatially uneven and vary in time. Since these distributions dictate the pathophysiology of tissues, knowing them is important. Theoretical models used to obtain flow and RBC distributions in large networks have limitations, as they treat each vessel as a one-dimensional segment and do not explicitly consider cell-cell and cell-vessel interactions. High-fidelity computational models that accurately model each individual RBC are computationally too expensive to predict haemodynamics in large vascular networks over a long time. Here we investigate the applicability of machine learning (ML) techniques to predict blood flow and RBC distributions in physiologically realistic vascular networks. We acquire data from high-fidelity simulations of deformable RBC suspensions flowing in the networks. With the flow and haematocrit specified at an inlet of the vasculature, the ML models predict the time-averaged flow rate and RBC distributions in the entire network, the time-dependent flow rate and haematocrit in each vessel and vascular bifurcation in isolation over a long time, and finally, simultaneous spatially and temporally evolving quantities through the vessel hierarchy in the networks.

Ebrahimi Saman, Bagchi Prosenjit


blood cell, computational fluid dynamics, haemodynamics, machine learning, microcirculation

Ophthalmology Ophthalmology

Machine learning classification of multiple sclerosis in children using optical coherence tomography.

In Multiple sclerosis (Houndmills, Basingstoke, England)

BACKGROUND : In children, multiple sclerosis (MS) is the ultimate diagnosis in only 1/5 to 1/3 of cases after a first episode of central nervous system (CNS) demyelination. As the visual pathway is frequently affected in MS and other CNS demyelinating disorders (DDs), structural retinal imaging such as optical coherence tomography (OCT) can be used to differentiate MS.

OBJECTIVE : This study aimed to investigate the utility of machine learning (ML) based on OCT features to identify distinct structural retinal features in children with DDs.

METHODS : This study included 512 eyes from 187 children with demyelinating diseases (n = 374 eyes) and 69 controls (n = 138 eyes). Input features of the analysis comprised 24 auto-segmented OCT features.

RESULTS : A Random Forest classifier with recursive feature elimination yielded the highest predictive values and identified DDs with 75% and MS with 80% accuracy, while multiclass distinction between MS and monophasic DD was performed with 64% accuracy. A set of eight retinal features was identified as the most important in this classification.

CONCLUSION : This study demonstrates that ML based on OCT features can be used to support a diagnosis of MS in children.

Ciftci Kavaklioglu Beyza, Erdman Lauren, Goldenberg Anna, Kavaklioglu Can, Alexander Cara, Oppermann Hannah M, Patel Amish, Hossain Soaad, Berenbaum Tara, Yau Olivia, Yea Carmen, Ly Mina, Costello Fiona, Mah Jean K, Reginald Arun, Banwell Brenda, Longoni Giulia, Ann Yeh E


Multiple sclerosis, optical coherence tomography, pediatric, retinal nerve fiber layer thickness, supervised learning

General General

COVID-RDNet: A novel coronavirus pneumonia classification model using the mixed dataset by CT and X-rays images.

In Biocybernetics and biomedical engineering

Coronavirus disease 2019 (COVID-19) testing relies on traditional screening methods, which require considerable manpower and material resources. Recently, with the aim of reducing radiation damage and improving effectiveness, deep learning methods that classify COVID-19-negative and COVID-19-positive cases using mixed datasets of CT and X-ray images have achieved remarkable results. However, the details presented in CT and X-ray images show both pathological diversity and similarity, which makes it harder for physicians to judge specific cases. On this basis, this paper proposes a novel coronavirus pneumonia classification model using a mixed dataset of CT and X-ray images. To solve the problem of feature similarity between other lung diseases and COVID-19, the extracted features are enhanced by an adaptive region enhancement algorithm. In addition, a deep network based on residual blocks and dense blocks is trained and tested. On the one hand, the residual blocks effectively improve the accuracy of the model, and the non-linear COVID-19 features are obtained through cross-layer links. On the other hand, the dense blocks effectively improve the robustness of the model by connecting local and abstract information. On mixed X-ray and CT datasets, the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), and accuracy can all reach 0.99. While respecting patient privacy and ethics, the proposed algorithm, which uses a mixed dataset from real cases, can effectively assist doctors in accurately classifying COVID-19-negative and COVID-19-positive cases to determine the infection status of patients.
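For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, PPV, NPV, and accuracy follow from the four confusion-matrix counts. The labels are made-up illustrative values, not data from the paper, and `binary_metrics` is a hypothetical helper name. AUC is omitted because it requires continuous scores rather than hard predictions.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive, 0 = negative).

    Assumes every denominator is non-zero, i.e. both classes occur in
    y_true and both classes are predicted at least once.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # share of actual positives detected
        "specificity": tn / (tn + fp),  # share of actual negatives cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are right
        "npv": tn / (tn + fn),          # share of negative calls that are right
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical ground truth and predictions for ten images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```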

Fang Lingling, Wang Xin


Adaptive region enhancement, COVID-19, Deep learning, Dense block, Mixed dataset

General General

Urban spatial risk prediction and optimization analysis of POI based on deep learning from the perspective of an epidemic.

In International journal of applied earth observation and geoinformation : ITC journal

From an epidemiological perspective, previous research on COVID-19 has generally been based on classical statistical analyses. As a result, spatial information is often not used effectively. This paper uses image-based neural networks to explore the relationship between urban spatial risk, the distribution of infected populations, and the design of urban facilities. To achieve this objective, we use spatio-temporal data on people infected with novel coronavirus pneumonia prior to 28 February 2020 in Wuhan. We then use kriging, a method of spatial interpolation, together with kernel density estimation to establish the epidemic heat distribution on fine grid units. We further evaluate the influence of nine major spatial risk factors, including the distribution of agencies, hospitals, park squares, sports fields, banks and hotels, by testing them for significant positive correlation with the distribution of the epidemic. The weights of these spatial risk factors are used for training Generative Adversarial Network (GAN) models, which predict the distribution of cases in a given area. The input image for the machine learning model is a city plan generated from public infrastructure, and the output image is a map of urban spatial risk factors in the given area. The results of the trained model demonstrate that optimising the relevant points of interest (POIs) in urban areas to effectively control potential risk factors can aid in managing the epidemic and preventing it from spreading further.
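The density estimation step described above can be illustrated with a Gaussian kernel density estimate evaluated on a regular grid. This is only a sketch under stated assumptions: the case coordinates, grid size, cell size, and bandwidth are invented for illustration, `gaussian_kde_grid` is a hypothetical helper, and the kriging step and the authors' actual parameters are not reproduced.

```python
import math


def gaussian_kde_grid(points, grid_size, cell, bandwidth):
    """Evaluate a 2-D Gaussian kernel density estimate at grid cell centres.

    points    : list of (x, y) case locations
    grid_size : number of cells per side of the square grid
    cell      : side length of one grid cell
    bandwidth : standard deviation of the isotropic Gaussian kernel
    """
    # Normalisation of a 2-D isotropic Gaussian, averaged over all points.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    grid = []
    for row in range(grid_size):
        y = (row + 0.5) * cell  # y coordinate of the cell centre
        grid_row = []
        for col in range(grid_size):
            x = (col + 0.5) * cell  # x coordinate of the cell centre
            total = sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
                for px, py in points
            )
            grid_row.append(norm * total)
        grid.append(grid_row)
    return grid


# Hypothetical case locations: a cluster near (1, 1) and one isolated case.
cases = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (3.5, 3.5)]
heat = gaussian_kde_grid(cases, grid_size=4, cell=1.0, bandwidth=0.5)
```

Cells near the cluster receive higher values than cells far from any case, which is the "heat" that a grid-based risk map would then carry forward.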

Zhang Yecheng, Zhang Qimin, Zhao Yuxuan, Deng Yunjie, Zheng Hao


Coronavirus disease, Deep learning, Design improvement, Incidence prediction, Spatial risk factors

General General

Predictive model of risk factors of High Flow Nasal Cannula using machine learning in COVID-19.

In Infectious Disease Modelling

With the rapid increase in the number of COVID-19 patients in Japan, the number of patients receiving oxygen at home has also increased rapidly, and some of these patients have died. An efficient approach to identifying high-risk patients with slowly progressing and rapidly worsening COVID-19, and to avoiding missed opportunities for therapeutic intervention, will improve patient prognosis and prevent medical complications. Patients admitted to medical institutions in Japan from November 14, 2020 to April 11, 2021 and registered in the COVID-19 Registry Japan were included. Risk factors for patients with High Flow Nasal Cannula invasive respiratory management or higher were comprehensively explored using machine learning. Age-specific cohorts were created, and severity prediction was performed separately for the patient surge period and for normal times. We obtained a model that predicted severe disease with a sensitivity of 57% when the specificity was set at 90% for those aged 40-59 years, and with a specificity of 50% and 43% when the sensitivity was set at 90% for those aged 60-79 years and 80 years and older, respectively. We identified lactate dehydrogenase (LDH) level as an important factor in predicting the severity of illness in all age groups. Using machine learning, we identified risk factors with high accuracy and predicted the severity of the disease. We plan to develop a tool that will be useful in determining the indications for hospitalisation, including early hospitalisation, for patients undergoing home care.
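Reporting "sensitivity when the specificity is set at 90%" amounts to choosing a decision threshold on the model's risk score. The sketch below shows one simple way to do that; the scores and labels are invented for illustration, `sensitivity_at_specificity` is a hypothetical helper, and the authors' actual procedure may differ.

```python
def sensitivity_at_specificity(scores, labels, target_specificity):
    """Find the lowest threshold whose specificity meets the target.

    A case is predicted severe when its score is >= the threshold.
    Returns (threshold, sensitivity, specificity), or None if no observed
    score used as a threshold reaches the target specificity.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    # Raising the threshold can only raise specificity and lower sensitivity,
    # so the first qualifying threshold gives the best achievable sensitivity.
    for threshold in sorted(set(scores)):
        specificity = sum(s < threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s >= threshold for s in positives) / len(positives)
            return threshold, sensitivity, specificity
    return None


# Hypothetical risk scores: ten non-severe (0) and five severe (1) patients.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.70,
          0.42, 0.55, 0.60, 0.80, 0.90]
labels = [0] * 10 + [1] * 5
result = sensitivity_at_specificity(scores, labels, target_specificity=0.90)
```

The mirror-image case in the abstract (specificity at a fixed 90% sensitivity) follows the same idea, scanning thresholds from high to low and stopping once the sensitivity target is met.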

Matsunaga Nobuaki, Kamata Keisuke, Asai Yusuke, Tsuzuki Shinya, Sakamoto Yasuaki, Ijichi Shinpei, Akiyama Takayuki, Yu Jiefu, Yamada Gen, Terada Mari, Suzuki Setsuko, Suzuki Kumiko, Saito Sho, Hayakawa Kayoko, Ohmagari Norio


COVID-19, Japan, Machine learning, Risk prediction, Severity