
General General

Integration of the Extreme Gradient Boosting model with electronic health records to enable the early diagnosis of multiple sclerosis.

In Multiple sclerosis and related disorders

BACKGROUND : Delayed multiple sclerosis (MS) diagnoses are not uncommon, and an early diagnostic tool is urgently needed. We aimed to develop an effective tool, using electronic health records and machine learning techniques, for the early recognition of MS patients among hospital visitors in China.

METHODS : Two case sets were collected from January 2016 to December 2018. The training set had 239 MS cases and 1142 controls, and the test set had 23 MS cases and 92 controls. The utility of Extreme Gradient Boosting (XGBoost), Random Forest (RF), Naive Bayes, K-nearest-neighbor (KNN) and Support Vector Machine (SVM) in the early diagnosis of MS was evaluated by the area under the receiver operating characteristic curve, precision, recall, specificity, accuracy and F1 score.

RESULTS : XGBoost performed best and was used to generate the results. Thirty-four variables highly relevant to MS diagnosis were selected for the XGBoost model and ranked by their relative importance. The training set recall was 0.632, with a precision of 0.576, and the test set recall was 0.609, with a precision of 0.609. We found that 61%, 51%, and 49% of the patients could have been diagnosed with MS 1, 2, and 3 years earlier than their actual diagnosis, respectively.

CONCLUSIONS : A diagnostic tool for early MS recognition, based on the XGBoost model and electronic health records, was developed to help reduce diagnostic delays in MS.
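
The evaluation metrics named in the abstract all follow from a confusion matrix. The sketch below shows how they are computed. The counts are an assumption: they are not reported in the abstract, and were chosen only because they are consistent with the stated test set size (23 MS cases, 92 controls) and the reported test recall and precision of 0.609. The specificity and accuracy this produces are therefore illustrative, not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)            # share of predicted positives that are true
    recall = tp / (tp + fn)               # share of actual positives that are found
    specificity = tn / (tn + fp)          # share of actual negatives correctly rejected
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (assumption, not reported by the authors):
# 14 + 9 = 23 MS cases, 9 + 83 = 92 controls.
metrics = classification_metrics(tp=14, fp=9, fn=9, tn=83)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When precision and recall are equal, as here, the F1 score equals both, because it is their harmonic mean.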

Wang Ruoning, Luo Wenjing, Liu Zifeng, Liu Weilong, Liu Chunxin, Liu Xun, Zhu He, Li Rui, Song Jiafang, Hu Xueqiang, Han Sheng, Qiu Wei


Bayesian optimization, MS, XGBoost, early diagnostics, machine learning algorithms

General General

A program to automate the discovery of drugs for West Nile and Dengue virus-programmatic screening of over a billion compounds on PubChem, generation of drug leads and automated in silico modelling.

In Journal of biomolecular structure & dynamics

Our work consists of a Python program that programmatically mines PubChem to collect data for a machine learning-based AutoQSAR algorithm, which generates drug leads against the flaviviruses Dengue and West Nile. The drug leads generated by the program are fed as programmatic inputs to the AutoDock Vina package for automated in silico modelling of the interaction between the lead compounds and the chosen Dengue and West Nile drug target, methyltransferase, whose inhibition leads to the control of viral replication. The machine learning-based AutoQSAR algorithm involves feature selection, QSAR modelling, validation and prediction. Because the drug leads generated on each run reflect the constantly growing PubChem database, the program supports fast, dynamic drug lead generation against the West Nile and Dengue viruses. The program prints out the top drug leads after screening the PubChem library, which contains over a billion compounds. The interaction between the top drug lead compounds and the drug targets of West Nile and Dengue virus is modelled automatically by the tool, and the results are stored in the user's working folder. Thus, our program ushers in a new age of automatic ease in virtual drug screening and drug identification, combining programmatic data mining of chemical data libraries, drug lead generation through the machine learning-based AutoQSAR algorithm, and an automated in silico modelling run to study the interaction between the drug lead compounds and the drug target protein of West Nile and Dengue virus. The program is hosted, maintained and supported at the GitHub repository link given below. Communicated by Ramaswamy H. Sarma.
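
The four stages named in the abstract (feature selection, QSAR modelling, validation, prediction) can be illustrated with a deliberately small sketch. This is not the authors' AutoQSAR implementation: the descriptor values and activities are made-up toy data, and a single-descriptor least-squares line stands in for whatever model the program actually fits.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Hypothetical toy data: one row per compound, two made-up descriptors per row.
descriptors = [[1.0, 0.3], [2.0, 0.9], [3.0, 0.1], [4.0, 0.7], [5.0, 0.5]]
activity = [3.0, 5.0, 7.0, 9.0, 11.0]

# 1. Feature selection: keep the descriptor most correlated with activity.
columns = list(zip(*descriptors))
best = max(range(len(columns)), key=lambda j: abs(pearson(columns[j], activity)))
x = list(columns[best])

# 2. QSAR modelling: fit activity as a linear function of that descriptor.
slope, intercept = fit_line(x, activity)

# 3. Validation: leave-one-out mean absolute error.
loo_errors = []
for i in range(len(x)):
    s, b = fit_line(x[:i] + x[i + 1:], activity[:i] + activity[i + 1:])
    loo_errors.append(abs(s * x[i] + b - activity[i]))
loo_mae = mean(loo_errors)

# 4. Prediction: score a new, unseen compound with descriptor value 6.0.
predicted = slope * 6.0 + intercept
print(best, slope, intercept, loo_mae, predicted)
```

In a real pipeline the descriptors would be computed from the mined PubChem structures and the model would use many features; the sketch only fixes the order of the stages.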

Geoffrey Ben, Sanker Akhil, Madaj Rafal, Tresanco Mario Sergio Valdés, Upadhyay Manish, Gracia Judith


in silico modelling, AutoQSAR, drug discovery

General General

COVID-19 CT Image Synthesis with a Conditional Generative Adversarial Network.

In IEEE journal of biomedical and health informatics

Coronavirus disease 2019 (COVID-19) is an ongoing global pandemic that has spread rapidly since December 2019. Real-time reverse transcription polymerase chain reaction (rRT-PCR) and chest computed tomography (CT) imaging both play an important role in COVID-19 diagnosis. Chest CT imaging offers the benefits of quick reporting, low cost, and high sensitivity for the detection of pulmonary infection. Recently, deep-learning-based computer vision methods have demonstrated great promise for use in medical imaging applications, including X-rays, magnetic resonance imaging, and CT imaging. However, training a deep-learning model requires large volumes of data, and medical staff face a high risk when collecting COVID-19 CT data due to the high infectivity of the disease. Another issue is the lack of experts available for data labeling. To meet the data requirements for COVID-19 CT imaging, we propose a CT image synthesis approach based on a conditional generative adversarial network that can effectively generate high-quality and realistic COVID-19 CT images for use in deep-learning-based medical imaging tasks. Experimental results show that the proposed method outperforms other state-of-the-art image synthesis methods, and that the generated COVID-19 CT images show promise for various machine learning applications, including semantic segmentation and classification.
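
For orientation, the standard conditional GAN objective is given below. The abstract does not state the loss the authors actually use, so this is the generic formulation and may differ from theirs. The symbols are assumptions introduced here: $x$ is a real CT image, $y$ is the conditioning input, $z$ is a noise vector, $G$ is the generator and $D$ is the discriminator.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x,y}\bigl[\log D(x \mid y)\bigr]
  + \mathbb{E}_{z,y}\bigl[\log\bigl(1 - D(G(z \mid y) \mid y)\bigr)\bigr]
```

The discriminator is trained to tell real images from generated ones given the same condition, and the generator is trained to make that distinction fail.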

Jiang Yifan, Chen Han, Loew M H, Ko Hanseok


General General

Episodic memory governs choices: An RNN-based reinforcement learning model for decision-making task.

In Neural networks : the official journal of the International Neural Network Society

A typical way to study cognitive function is to record the electrical activity of animal neurons while the animals are trained to perform behavioral tasks. A key problem is that such recordings fail to capture all the relevant neurons in the animal brain. To alleviate this problem, we develop an RNN-based Actor-Critic framework, which is trained through reinforcement learning (RL) to solve two tasks analogous to the monkeys' decision-making tasks. The trained model reproduces some features of the neural activity recorded from animal brains, as well as some behavioral properties exhibited in animal experiments, suggesting that it can serve as a computational platform to explore other cognitive functions. Furthermore, we conduct behavioral experiments on our framework to explore an open question in neuroscience: which episodic memories in the hippocampus should be selected to ultimately govern future decisions. We find that retrieving salient events sampled from episodic memory shortens deliberation time in the decision-making process more effectively than retrieving common events. The results indicate that salient events stored in the hippocampus could be prioritized to propagate reward information, and thus allow decision-makers to learn a strategy faster.
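
The actor-critic idea behind the framework can be shown in a much simpler setting. The sketch below is a tabular actor-critic on a hypothetical two-choice task with single-step episodes. It is not the authors' RNN-based model: the task, the reward values and the learning rates are all assumptions made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
rewards = [0.0, 1.0]   # hypothetical task: choice 1 is always rewarded
prefs = [0.0, 0.0]     # actor: one preference per choice
value = 0.0            # critic: estimate of the expected reward
actor_lr, critic_lr = 0.1, 0.1

for _ in range(2000):
    probs = softmax(prefs)
    action = 0 if rng.random() < probs[0] else 1

    # The episode ends after one choice, so the TD error is reward minus value.
    td_error = rewards[action] - value

    # Critic update: move the value estimate toward the observed reward.
    value += critic_lr * td_error

    # Actor update: policy-gradient step for a softmax policy, scaled by the TD error.
    for b in range(2):
        grad = (1.0 if b == action else 0.0) - probs[b]
        prefs[b] += actor_lr * td_error * grad

final_probs = softmax(prefs)
print(final_probs, value)
```

The critic's TD error acts as the teaching signal for the actor, which is the same division of labour the abstract describes; the recurrent network replaces the two tables when choices depend on a history of observations.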

Zhang Xiaohan, Liu Lu, Long Guodong, Jiang Jing, Liu Shenquan


Actor–Critic, Episodic memory, Prefrontal cortex-basal ganglia circuit, Reinforcement learning

General General

An hybrid ECG-based deep network for the early identification of high-risk to major cardiovascular events for hypertension patients.

In Journal of biomedical informatics ; h5-index 55.0

BACKGROUND AND OBJECTIVE : As the population becomes older and more overweight, the number of potential high-risk subjects with hypertension continues to increase. ICT technologies can provide valuable support for the early assessment of such cases since the practice of conducting medical examinations for the early recognition of high-risk subjects affected by hypertension is quite difficult, time-consuming, and expensive.

METHODS : This paper presents a novel time series-based approach for the early identification of increases in hypertension to discriminate between cardiovascular high-risk and low-risk hypertensive patients through the analyses of electrocardiographic holter signals.

RESULTS : The experimental results show that the proposed model achieves excellent classification accuracy compared with the state of the art. The model reaches an average accuracy of 98%, and sensitivity and specificity both average 97%.

CONCLUSION : The analysis of the whole time series shows promising results in terms of highlighting the tiny differences between subjects affected by hypertension.

Paragliola Giovanni, Coronato Antonio


Deep learning, Early hypertension identification, Signal processing, Time series classification, eHealth

General General

Smart conversational agents for the detection of neuropsychiatric disorders: A systematic review.

In Journal of biomedical informatics ; h5-index 55.0

OBJECTIVE : To determine whether smart conversational agents can be used for the detection of neuropsychiatric disorders. To that end, we reviewed the technologies used, the mental disorders targeted and the validation procedures of relevant proposals in this field.

METHODS : We searched Scopus, PubMed, ProQuest, IEEE Xplore, Web of Science, CINAHL and the Cochrane Library using a predefined search strategy. Studies were included if they focused on neuropsychiatric disorders and involved conversational data for detection and diagnosis. They were assessed for eligibility by independent reviewers and ultimately included if a consensus was reached about their relevance.

RESULTS : 2356 references were initially retrieved. Eventually, articles -referring smart conversational agents- met the inclusion criteria. Out of the selected studies, are targeted at neurocognitive disorders, at depression and 3 at other conditions. They apply diverse technological solutions and analysis techniques ( use Artificial Intelligence), and they usually rely on gold standard tests for criterion validity assessment. Acceptability, reliability and other aspects of validity were rarely addressed.

CONCLUSION : The use of smart conversational agents for the detection of neuropsychiatric disorders is an emerging and promising field of research, with a broad coverage of mental disorders and extended use of AI. However, the few published studies did not undergo robust psychometric validation processes. Future research in this field would benefit from more rigorous validation mechanisms and standardized software and hardware platforms.

Pacheco-Lorenzo Moisés R, Valladares-Rodríguez Sonia M, Anido-Rifón Luis E, Fernández-Iglesias Manuel J


Conversational agent, Dementia, Depression, Detection, Diagnosis, Mental disorder, Virtual assistant