Receive a weekly summary and discussion of the week's top papers by leading researchers in the field.

General

A Self-Paced Regularization Framework for Partial-Label Learning.

In IEEE transactions on cybernetics

Partial-label learning (PLL) addresses the problem where each training instance is associated with a set of candidate labels, only one of which is correct. Most PLL algorithms try to disambiguate the candidate label set by either treating each candidate label equally or iteratively identifying the true label. Nonetheless, existing algorithms usually treat all labels and instances equally, ignoring the varying complexity of labels and instances during the learning stage. Inspired by the successful application of self-paced learning strategies in machine learning, we integrate the self-paced regime into the PLL framework and propose a novel self-paced PLL (SP-PLL) algorithm, which controls the learning process by ranking the priorities of the training examples together with their candidate labels during each learning iteration. Extensive experiments and comparisons with baseline methods demonstrate the effectiveness and robustness of the proposed method.
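The abstract does not spell out SP-PLL's exact regularizer, so as a rough illustration only, the classic hard self-paced scheme is sketched below: an (example, candidate-label) pair is admitted into training only when its current loss falls below a pace threshold that grows each iteration. All names and values here are illustrative, not the paper's.

```python
import numpy as np

def self_paced_weights(losses, lam):
    """Hard self-paced regularizer: include a pair only if its
    current loss is below the pace threshold lam (easy pairs first)."""
    return (losses < lam).astype(float)

# Toy per-(example, candidate-label) losses: 5 examples, 3 candidates each.
rng = np.random.default_rng(0)
losses = rng.uniform(0, 1, size=(5, 3))

lam, growth = 0.3, 1.5  # initial pace and its per-iteration growth factor
for it in range(3):
    v = self_paced_weights(losses, lam)  # 0/1 inclusion weights
    # ... here the PLL model would be refit using only pairs with v == 1,
    # and `losses` recomputed from the updated model ...
    lam *= growth                         # admit harder pairs next round
```

Growing `lam` is what makes the curriculum move from easy to hard, which matches the abstract's description of ranking priorities over iterations.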

Lyu Gengyu, Feng Songhe, Wang Tao, Lang Congyan


General

Estimation of causal effects of multiple treatments in observational studies with a binary outcome.

In Statistical methods in medical research

There is a dearth of robust methods to estimate the causal effects of multiple treatments when the outcome is binary. This paper uses two unique sets of simulations to propose and evaluate the use of Bayesian additive regression trees in such settings. First, we compare Bayesian additive regression trees to several approaches that have been proposed for continuous outcomes, including inverse probability of treatment weighting, the targeted maximum likelihood estimator, vector matching, and regression adjustment. Results suggest that under non-linearity and non-additivity of both the treatment-assignment and outcome-generating mechanisms, Bayesian additive regression trees, the targeted maximum likelihood estimator, and inverse probability of treatment weighting using generalized boosted models provide better bias reduction and smaller root mean squared error. Bayesian additive regression trees and the targeted maximum likelihood estimator provide more consistent 95% confidence interval coverage and better large-sample convergence properties. Second, we supply Bayesian additive regression trees with a strategy to identify a common support region for retaining inferential units and for avoiding extrapolation over areas of the covariate space where common support does not exist. Bayesian additive regression trees retain more inferential units than the generalized propensity score-based strategy and show lower bias, compared with the targeted maximum likelihood estimator or the generalized boosted model, in a variety of scenarios differing by the degree of covariate overlap. A case study examining the effects of three surgical approaches for non-small cell lung cancer demonstrates the methods.
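The counterfactual-prediction step shared by outcome-model approaches like this can be sketched as follows: fit a flexible model for P(Y=1 | treatment, covariates), then average predictions with every unit assigned to each arm. BART has no scikit-learn implementation, so a gradient-boosted classifier stands in for it here; all data and names are illustrative, not the paper's.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Simulated observational data: 3 treatment arms, binary outcome.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))                      # covariates
A = rng.integers(0, 3, size=n)                   # treatment received (0, 1, 2)
p = 1.0 / (1.0 + np.exp(-(X[:, 0] + 0.5 * A)))   # true outcome probability
Y = rng.binomial(1, p)

# Outcome model: the paper fits BART; a boosted classifier substitutes here.
model = GradientBoostingClassifier().fit(np.column_stack([A, X]), Y)

def mean_potential_outcome(a):
    """Average predicted outcome had every unit received treatment a."""
    Xa = np.column_stack([np.full(n, a), X])
    return model.predict_proba(Xa)[:, 1].mean()

# Causal risk difference between arms 2 and 0.
ate_20 = mean_potential_outcome(2) - mean_potential_outcome(0)
```

The common-support idea in the abstract corresponds to dropping units whose covariates place them where some arm's counterfactual prediction would be pure extrapolation, before this averaging step.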

Hu Liangyuan, Gu Chenyang, Lopez Michael, Ji Jiayi, Wisnivesky Juan


Causal inference, generalized propensity score, inverse probability of treatment weighting, machine learning, matching

Public Health

Role of biological Data Mining and Machine Learning Techniques in Detecting and Diagnosing the Novel Coronavirus (COVID-19): A Systematic Review.

In Journal of medical systems ; h5-index 48.0

Coronaviruses (CoVs) are a large family of viruses that are common in many animal species, including camels, cattle, cats and bats. Animal CoVs, such as Middle East respiratory syndrome-CoV, severe acute respiratory syndrome (SARS)-CoV, and the new virus named SARS-CoV-2, rarely infect and spread among humans. On January 30, 2020, the International Health Regulations Emergency Committee of the World Health Organisation declared the outbreak of the disease resulting from this new CoV, called 'COVID-19', a 'public health emergency of international concern'. This global pandemic has affected almost the whole planet and, as of the date of this article, has caused the death of more than 315,131 patients. In this context, publishers, journals and researchers are urged to pursue research across different domains and help stop the spread of this deadly virus. Increasing interest in developing artificial intelligence (AI) applications has produced tools addressing several medical problems. However, such applications remain insufficient given the high potential threat posed by this virus to global public health. This systematic review addresses automated AI applications based on data mining and machine learning (ML) algorithms for detecting and diagnosing COVID-19. We aimed to obtain an overview of this critical virus, address the limitations of utilising data mining and ML algorithms, and provide the health sector with the benefits of these techniques. We used five databases, namely, IEEE Xplore, Web of Science, PubMed, ScienceDirect and Scopus, and performed three sequences of search queries between 2010 and 2020. Strict exclusion criteria and a selection strategy were applied to screen the 1305 articles obtained. Only eight articles were fully evaluated and included in this review, a number that only emphasises the insufficiency of research in this important area. After analysing all included studies, the results were grouped by year of publication and by the commonly used data mining and ML algorithms. The results of all papers were discussed to identify gaps in the reviewed literature. Characteristics such as motivations, challenges, limitations, recommendations, case studies, and the features and classes used were analysed in detail. This study reviewed the state-of-the-art techniques for CoV prediction algorithms based on data mining and ML assessment. The reliability and acceptability of information and datasets extracted from the technologies implemented in the literature were considered. Findings showed that researchers must build on the insights they gain, focus on identifying solutions for CoV problems, and introduce new improvements. The growing emphasis on data mining and ML techniques in medical fields can provide the right environment for change and improvement.

Albahri A S, Hamid Rula A, Alwan Jwan K, Al-Qays Z T, Zaidan A A, Zaidan B B, Albahri A O S, AlAmoodi A H, Khlaf Jamal Mawlood, Almahdi E M, Thabet Eman, Hadi Suha M, Mohammed K I, Alsalem M A, Al-Obaidi Jameel R, Madhloom H T


Artificial Intelligence, Biological Data Mining, COVID-19, Coronaviruses, MERS-CoV, Machine Learning, SARS-CoV-2

General

Precisely Predicting Acute Kidney Injury with Convolutional Neural Network Based on Electronic Health Record Data

ArXiv Preprint

Acute Kidney Injury (AKI) commonly occurs in Intensive Care Unit (ICU) patients, especially adults, and is an independent risk factor for short-term and long-term mortality. Although researchers in recent years have highlighted the early prediction of AKI, the performance of existing models is not precise enough. The objective of this research is to precisely predict AKI by means of a Convolutional Neural Network applied to Electronic Health Record (EHR) data. The data sets used in this research are two public EHR databases: the MIMIC-III and eICU databases. In this study, we train and test several Convolutional Neural Network models as our AKI predictor, which can precisely predict whether a given patient will suffer from AKI after ICU admission according to the last measurements of 16 blood-gas and demographic features. The research is based on the Kidney Disease Improving Global Outcomes (KDIGO) criteria for the AKI definition. Our work greatly improves AKI prediction precision, with a best AUROC of 0.988 on the MIMIC-III data set and 0.936 on the eICU data set, both of which outperform state-of-the-art predictors. Moreover, the dimension of the input vector used in this predictor is much smaller than in other existing studies. Compared with existing AKI predictors, the predictor in this work greatly improves the precision of early AKI prediction by using a Convolutional Neural Network architecture and a more concise input vector. Early and precise prediction of AKI will greatly benefit treatment decisions, so we believe our work is a very helpful clinical application.
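The paper's exact architecture is not given in the abstract. As a minimal sketch of the idea only, the fragment below runs a 1-D convolution over a 16-feature input vector, pools, and applies a logistic readout; every weight and the network shape are hypothetical stand-ins.

```python
import numpy as np

def conv1d(x, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) over a feature vector."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) + bias
                     for i in range(len(x) - k + 1)])

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 16 features per patient, as in the paper (blood gas + demographics).
rng = np.random.default_rng(0)
x = rng.normal(size=16)

# Tiny illustrative network: one conv layer, global pooling, logistic output.
kernel = rng.normal(size=3)
feature_map = relu(conv1d(x, kernel))  # length 16 - 3 + 1 = 14
pooled = feature_map.mean()            # global average pooling
w, b = 1.3, -0.2                       # hypothetical readout weights
p_aki = sigmoid(w * pooled + b)        # predicted AKI probability in (0, 1)
```

In a real predictor the kernel and readout weights would be learned from labeled EHR records rather than drawn at random, and multiple filters and layers would be stacked.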

Yu Wang, JunPeng Bao, JianQiang Du, YongFeng Li


General

GreenSea: Visual Soccer Analysis Using Broad Learning System.

In IEEE transactions on cybernetics

Modern soccer increasingly places trust in visual analysis and statistics rather than relying only on human experience. However, soccer is such an extraordinarily complex game that no widely accepted quantitative analysis methods exist, and statistics collection and visualization are time consuming, resulting in numerous adjustments. To tackle this issue, we developed GreenSea, a visual-based assessment system designed for soccer game analysis, tactics, and training. The system uses a broad learning system (BLS) to train the model, avoiding the time-consuming training from which traditional deep learning may suffer. Users can apply multiple views of a soccer game, and visual summarization of essential statistics using advanced visualization and animation is available. A marking system trained by BLS is designed to perform quantitative analysis, and a novel recurrent discriminative BLS (RDBLS) is proposed to carry out long-term tracking. In our RDBLS, the structure is adjusted to achieve better performance on the binary classification problem of the discriminative model. Several experiments verify that our proposed RDBLS model outperforms the standard BLS and other methods. Two studies were conducted to verify the effectiveness of GreenSea. The first examined how GreenSea assists a youth-training coach in assessing each trainee's performance to select the most promising players. The second examined how GreenSea helped the U20 Shanghai soccer team coaching staff analyze games and devise tactics during the 13th National Games. Our studies show the usability of GreenSea and the value of our system to both amateur and expert users.
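A broad learning system avoids deep learning's slow training because only its output weights are learned, in closed form. A minimal sketch of a standard BLS, with toy data standing in for match statistics (the paper's RDBLS variant adds recurrent and discriminative structure not shown here):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                    # toy per-play statistics
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # toy binary target

# Mapped feature nodes: random linear maps of the input (kept fixed).
We = rng.normal(size=(8, 20))
Z = X @ We

# Enhancement nodes: nonlinear expansion of the feature nodes (also fixed).
Wh = rng.normal(size=(20, 40))
H = np.tanh(Z @ Wh)

# Only the output weights are trained, via ridge-regularized least
# squares in closed form -- the source of BLS's fast training.
A = np.hstack([Z, H])
lam = 1e-2
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

pred = (A @ W > 0.5).astype(float)
accuracy = (pred == y).mean()   # training accuracy on the toy data
```

Because the random feature and enhancement weights never change, adding nodes or data only requires updating the least-squares solution rather than re-running gradient descent.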

Sheng Bin, Li Ping, Zhang Yuhan, Mao Lijuan, Chen C L Philip


General

Resonant Machine Learning Based on Complex Growth Transform Dynamical Systems.

In IEEE transactions on neural networks and learning systems

Traditional energy-based learning models associate a single energy metric to each configuration of variables involved in the underlying optimization process. Such models associate the lowest energy state with the optimal configuration of variables under consideration and are thus inherently dissipative. In this article, we propose an energy-efficient learning framework that exploits structural and functional similarities between a machine-learning network and a general electrical network satisfying Tellegen's theorem. In contrast to the standard energy-based models, the proposed formulation associates two energy components, namely, active and reactive energy with the network. The formulation ensures that the network's active power is dissipated only during the process of learning, whereas the reactive power is maintained to be zero at all times. As a result, in steady state, the learned parameters are stored and self-sustained by electrical resonance determined by the network's nodal inductances and capacitances. Based on this approach, this article introduces three novel concepts: 1) a learning framework where the network's active-power dissipation is used as a regularization for a learning objective function that is subjected to zero total reactive-power constraint; 2) a dynamical system based on complex-domain, continuous-time growth transforms that optimizes the learning objective function and drives the network toward electrical resonance under steady-state operation; and 3) an annealing procedure that controls the tradeoff between active-power dissipation and the speed of convergence. As a representative example, we show how the proposed framework can be used for designing resonant support vector machines (SVMs), where the support vectors correspond to an LC network with self-sustained oscillations. We also show that this resonant network dissipates less active power compared with its non-resonant counterpart.
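The constrained learning problem described in concept 1) above can be written schematically as follows (the notation is ours, reconstructed from the abstract, not the paper's):

```latex
\min_{\mathbf{w}} \; \mathcal{L}(\mathbf{w}) \;+\; \lambda\, P_{\text{active}}(\mathbf{w})
\quad \text{subject to} \quad Q_{\text{reactive}}(\mathbf{w}) = 0,
```

where $\mathcal{L}$ is the learning objective, $P_{\text{active}}$ is the network's active-power dissipation acting as the regularizer, $Q_{\text{reactive}}$ is the total reactive power held at zero, and $\lambda$ is the tradeoff the annealing procedure in concept 3) controls.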

Chatterjee Oindrila, Chakrabartty Shantanu