Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

Category articles

General

Machine learning approach for discrimination of genotypes based on bright-field cellular images.

In NPJ systems biology and applications

Morphological profiling combines established optical microscopy with cutting-edge machine-vision technologies and has accumulated successful applications in high-throughput phenotyping. One major question is how much information can be extracted from an image to identify genetic differences between cells. While fluorescent microscopy images of specific organelles have been broadly used for single-cell profiling, the potential of bright-field (BF) microscopy images of label-free cells remains to be tested. Here, we examine whether single-gene perturbations can be discriminated based on BF images of label-free cells using a machine learning approach. We acquired hundreds of BF images of single-gene mutant cells, quantified single-cell profiles consisting of texture features of cellular regions, and constructed a machine learning model to discriminate mutant cells from wild-type cells. Interestingly, the mutants were successfully discriminated from the wild type (area under the receiver operating characteristic curve = 0.773). The features that contributed to the discrimination were identified, and they included those related to the morphology of structures that appeared within cellular regions. Furthermore, functionally close gene pairs showed similar feature profiles of the mutant cells. Our study reveals that single-gene mutant cells can be discriminated from wild-type cells based on BF images, suggesting its potential as a useful tool for mutant cell profiling.
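
The readout here, per-cell classifier scores summarized by the area under the ROC curve, is a standard computation; a minimal numpy sketch with illustrative scores and labels (not the paper's data), using the rank-sum (Mann-Whitney) identity for the AUC:

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) identity, with tie handling."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for v in np.unique(scores):              # average ranks over tied scores
        mask = scores == v
        ranks[mask] = ranks[mask].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Hypothetical per-cell scores from a texture-feature classifier
scores = np.array([0.9, 0.8, 0.75, 0.4, 0.35, 0.1])
labels = np.array([1, 1, 0, 1, 0, 0])        # 1 = mutant, 0 = wild type
print(round(roc_auc(scores, labels), 3))     # 8/9 ≈ 0.889
```

An AUC of 0.773, as reported, means a randomly chosen mutant cell outranks a randomly chosen wild-type cell about 77% of the time.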

Suzuki Godai, Saito Yutaka, Seki Motoaki, Evans-Yamamoto Daniel, Negishi Mikiko, Kakoi Kentaro, Kawai Hiroki, Landry Christian R, Yachie Nozomu, Mitsuyama Toutai

2021-Jul-21

General

Defining the undefinable: the black box problem in healthcare artificial intelligence.

In Journal of medical ethics ; h5-index 34.0

The 'black box problem' is a long-standing talking point in debates about artificial intelligence (AI). It is a significant point of tension between ethicists, programmers, clinicians and anyone else working on developing AI for healthcare applications. However, the precise definition of these systems is often left vague, unclear or simply assumed to be standardised within AI circles. This leads to situations where individuals working on AI talk past each other, and the term has been invoked in numerous debates about opaque versus explainable systems. This paper proposes a coherent and clear definition of the black box problem to assist future discussions about AI in healthcare. This is accomplished by synthesising various definitions in the literature and examining several criteria that can be extrapolated from them.

Wadden Jordan Joseph

2021-Jul-21

clinical ethics, ethics, philosophy of medicine

General

Contemporary issues in the implementation of lung cancer screening.

In European respiratory review : an official journal of the European Respiratory Society

Lung cancer screening with low-dose computed tomography can reduce death from lung cancer by 20-24% in high-risk smokers. National lung cancer screening programmes have been implemented in the USA and Korea and are being implemented in Europe, Canada and other countries. Lung cancer screening is a process, not a test. It requires an organised programmatic approach to replicate the lung cancer mortality reduction and safety of pivotal clinical trials. Cost-effectiveness of a screening programme is strongly influenced by screening sensitivity and specificity, age to stop screening, integration of smoking cessation intervention for current smokers, screening uptake, nodule management and treatment costs. Appropriate management of screen-detected lung nodules has significant implications for healthcare resource utilisation and minimising harm from radiation exposure related to imaging studies, invasive procedures and clinically significant distress. This review focuses on selected contemporary issues on the path to implementing cost-effective lung cancer screening at the population level. The future impact of emerging technologies such as deep learning and biomarkers is also discussed.

Lam Stephen, Tammemagi Martin

2021-Sep-30

Public Health

Metabolomic analyses reveal new stage-specific features of COVID-19.

In The European respiratory journal

The current pandemic of coronavirus disease 2019 (COVID-19) has affected more than 160 million individuals and caused millions of deaths worldwide, at least in part because the pathophysiology of this disease remains unclear. Therefore, identifying the underlying molecular mechanisms of COVID-19 is critical to overcoming this pandemic. Metabolites mirror an individual's disease progression and can provide extensive insight into its pathophysiological significance. We provide a comprehensive metabolic characterization of sera from COVID-19 patients at all stages using untargeted and targeted metabolomic analyses. Compared with healthy controls, we observed distinct alteration patterns of circulating metabolites in the mild, severe and recovery stages, in both the discovery and validation cohorts, suggesting that metabolic reprogramming of glucose metabolism and the urea cycle are potential pathological mechanisms of COVID-19 progression. Our findings suggest that targeting glucose metabolism and the urea cycle may be a viable approach to fighting COVID-19 at various stages along the disease course.

Jia Hongling, Liu Chaowu, Li Dantong, Huang Qingsheng, Liu Dong, Zhang Ying, Ye Chang, Zhou Di, Wang Yang, Tan Yanlian, Li Kuibiao, Lin Fangqin, Zhang Haiqing, Lin Jingchao, Xu Yang, Liu Jingwen, Zeng Qing, Hong Jian, Chen Guobing, Zhang Hao, Zheng Lingling, Deng Xilong, Ke Changwen, Gao Yunfei, Fan Jun, Di Biao, Liang Huiying

2021-Jul-21

General

Diagnostic accuracy of a novel artificial intelligence system for adenoma detection in daily practice: a prospective non-randomized comparative study.

In Endoscopy ; h5-index 58.0

BACKGROUND AND AIMS : Adenoma detection rate (ADR) varies significantly between endoscopists, with adenoma miss rates (AMR) of up to 26%. Artificial intelligence (AI) systems may improve endoscopic quality and reduce the rate of interval cancer. We evaluated the efficacy of an AI system in real-time colonoscopy and its influence on AMR and ADR.

PATIENTS AND METHODS : In this prospective non-randomized comparative study, we analyzed 150 patients (age 65±14; 69 women, 81 men) undergoing diagnostic colonoscopy at a single endoscopy center in Germany from June to October 2020. Every patient was examined concurrently by an endoscopist and the AI system using two opposing screens. The AI system, GI Genius (Medtronic), overseen by a second observer, was not visible to the endoscopist. AMR was the primary outcome. Both methods were compared using the McNemar test.

RESULTS : There was no significant and no clinically relevant difference (p=0.754) in AMR between the AI system (6/197, 3.0%, 95%CI [1.1-6.5]) and routine colonoscopy (4/197, 2.0%, 95%CI [0.6-5.1]). The polyp miss rate of the AI system (14/311, 4.5%, 95%CI [2.5-7.4]) was not significantly different (p=0.720) from routine colonoscopy (17/311, 5.5%, 95%CI [3.2-8.6]). There was no significant difference (p=0.500) between the ADR with routine colonoscopy (78/150, 52.0%, 95%CI [43.7-60.2]) and the AI system (76/150, 50.7%, 95%CI [42.4-58.9]). Routine colonoscopy detected adenomas in two patients that were missed by the AI system.
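
The paired comparison above rests on the McNemar test, which looks only at discordant lesions (missed by one method but not the other). The abstract does not report the discordant counts, but if, hypothetically, every missed adenoma was discordant, the counts would be b = 6 (AI only) and c = 4 (routine only), and an exact binomial McNemar test then reproduces the reported p = 0.754:

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant-pair counts b and c."""
    n = b + c
    k = min(b, c)
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, p)

# Hypothetical discordant counts: missed by AI only (b) vs. routine only (c)
print(mcnemar_exact(6, 4))   # 0.75390625 ≈ 0.754
```

The exact version is appropriate here because the number of discordant pairs is small; the chi-square approximation would be unreliable at n = 10.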

CONCLUSION : The AI system had a comparable performance to experienced endoscopists during real-time colonoscopy with similar high ADR (>50%).

Zippelius Carolin, Alqahtani Saleh A, Schedel Jörg, Brookman-Amissah Dominic, Muehlenberg Klaus, Federle Christoph, Salzberger Andrea, Schorr Wolfgang, Pech Oliver

2021-Jul-22

General

Highly accurate protein structure prediction for the human proteome.

In Nature ; h5-index 368.0

Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally-determined structure1. Here we dramatically expand structural coverage by applying the state-of-the-art machine learning method, AlphaFold2, at scale to almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model, and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions likely to be disordered. Finally, we provide some case studies illustrating how high-quality predictions may be used to generate biological hypotheses. Importantly, we are making our predictions freely available to the community via a public database (hosted by the European Bioinformatics Institute at https://alphafold.ebi.ac.uk/ ). We anticipate that routine large-scale and high-accuracy structure prediction will become an important tool, allowing new questions to be addressed from a structural perspective.
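
Coverage figures such as "58% of residues confident, 36% very high" come from binning per-residue pLDDT confidence scores. A small sketch, assuming AlphaFold's conventional cutoffs (pLDDT > 70 confident, > 90 very high) and synthetic scores in place of the real proteome-wide array:

```python
import numpy as np

def plddt_coverage(plddt, confident=70.0, very_high=90.0):
    """Fractions of residues above the assumed pLDDT confidence cutoffs."""
    plddt = np.asarray(plddt, dtype=float)
    return (plddt > confident).mean(), (plddt > very_high).mean()

# Synthetic per-residue pLDDT scores standing in for a whole-proteome array
rng = np.random.default_rng(0)
plddt = rng.uniform(30, 100, size=100_000)
conf_frac, vh_frac = plddt_coverage(plddt)
print(f"confident: {conf_frac:.1%}, very high: {vh_frac:.1%}")
```

The "very high" set is by construction a subset of the "confident" set, which is why the paper can report 36% as a subset of the 58%.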

Tunyasuvunakool Kathryn, Adler Jonas, Wu Zachary, Green Tim, Zielinski Michal, Žídek Augustin, Bridgland Alex, Cowie Andrew, Meyer Clemens, Laydon Agata, Velankar Sameer, Kleywegt Gerard J, Bateman Alex, Evans Richard, Pritzel Alexander, Figurnov Michael, Ronneberger Olaf, Bates Russ, Kohl Simon A A, Potapenko Anna, Ballard Andrew J, Romera-Paredes Bernardino, Nikolov Stanislav, Jain Rishub, Clancy Ellen, Reiman David, Petersen Stig, Senior Andrew W, Kavukcuoglu Koray, Birney Ewan, Kohli Pushmeet, Jumper John, Hassabis Demis

2021-Jul-22

Oncology

Prediction of chemotherapy response in breast cancer patients at pre-treatment using second derivative texture of CT images and machine learning.

In Translational oncology

Although neoadjuvant chemotherapy (NAC) is a crucial component of treatment for locally advanced breast cancer (LABC), only about 70% of patients respond to it. Effective adjustment of NAC for individual patients can significantly improve survival rates of those resistant to standard regimens. Thus, the early prediction of NAC outcome is of great importance in facilitating a personalized paradigm for breast cancer therapeutics. In this study, quantitative computed tomography (qCT) parametric imaging in conjunction with machine learning techniques was investigated to predict LABC tumor response to NAC. Textural and second derivative textural (SDT) features of CT images of 72 patients diagnosed with LABC were analysed before the initiation of NAC to quantify intra-tumor heterogeneity. These quantitative features were processed through a correlation-based feature reduction followed by a sequential feature selection with a bootstrap 0.632+ area under the receiver operating characteristic (ROC) curve (AUC0.632+) criterion. The best feature subset consisted of a combination of one textural and three SDT features. Using these features, an AdaBoost decision tree could predict the patient response with a cross-validated AUC0.632+, accuracy, sensitivity and specificity of 0.88, 85%, 88% and 75%, respectively. This study demonstrates, for the first time, that a combination of textural and SDT features of CT images can be used to predict breast cancer response to NAC prior to the start of treatment, which can potentially facilitate early therapy adjustments.
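
The first step of the pipeline, correlation-based feature reduction, can be sketched as a greedy pass that drops the later feature of any pair whose absolute correlation exceeds a threshold; the data and the 0.95 cutoff below are illustrative, not the paper's:

```python
import numpy as np

def drop_correlated(X, thresh=0.95):
    """Greedily keep features uncorrelated (|r| <= thresh) with all kept so far."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= thresh for k in keep):
            keep.append(j)
    return keep

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X = np.column_stack([X, X[:, 0] * 2 + 1e-6])  # 4th column duplicates the 1st
print(drop_correlated(X))                     # the redundant column is dropped
```

Reducing redundant features this way before sequential selection keeps the bootstrap AUC criterion from repeatedly evaluating near-identical candidates.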

Moghadas-Dastjerdi Hadi, Rahman Shan-E-Tallat Hira, Sannachi Lakshmanan, Wright Frances C, Gandhi Sonal, Trudeau Maureen E, Sadeghi-Naini Ali, Czarnota Gregory J

2021-Jul-19

Derivative textures, Locally advanced breast cancer (LABC), Machine learning, Neoadjuvant chemotherapy (NAC), Personalized medicine, Quantitative computed tomography (qCT)

Public Health

Neonatal mortality prediction with routinely collected data: a machine learning approach.

In BMC pediatrics ; h5-index 44.0

BACKGROUND : Recent decreases in neonatal mortality have been slower than expected for most countries. This study aims to predict the risk of neonatal mortality using only data routinely available from birth records in the largest city of the Americas.

METHODS : A probabilistic linkage of every birth record occurring in the municipality of São Paulo, Brazil, between 2012 and 2017 was performed with the death records from 2012 to 2018 (1,202,843 births and 447,687 deaths), and a total of 7282 neonatal deaths were identified (a neonatal mortality rate of 6.46 per 1000 live births). Births from 2012 to 2016 (N = 941,308; or 83.44% of the total) were used to train five different machine learning algorithms, while births occurring in 2017 (N = 186,854; or 16.56% of the total) were used to test their predictive performance on new unseen data.

RESULTS : The best performance was obtained by the extreme gradient boosting trees (XGBoost) algorithm, with a very high AUC of 0.97 and F1-score of 0.55. The 5% of births with the highest predicted risk of neonatal death included more than 90% of the actual neonatal deaths. On the other hand, there were no deaths among the 5% of births with the lowest predicted risk. There were no significant differences in predictive performance for vulnerable subgroups. The use of a smaller number of variables (WHO's five minimum perinatal indicators) decreased overall performance, but the results still remained high (AUC of 0.91). With the addition of only three more variables, we achieved the same predictive performance (AUC of 0.97) as using all 23 variables originally available from the Brazilian birth records.
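
The claim that "the 5% of births with the highest predicted risk included more than 90% of deaths" is a capture-rate computation on the held-out year; a sketch with a synthetic risk score and outcome (the data and model score below are illustrative, not the São Paulo records):

```python
import numpy as np

def capture_rate(risk, died, frac=0.05):
    """Fraction of all deaths falling in the top `frac` of predicted risk."""
    n_top = max(1, int(len(risk) * frac))
    top = np.argsort(risk)[-n_top:]
    return died[top].sum() / died.sum()

rng = np.random.default_rng(2)
n = 10_000
latent = rng.normal(size=n)
died = (latent + 0.5 * rng.normal(size=n) > 2.8).astype(int)  # rare outcome
risk = latent + 0.3 * rng.normal(size=n)                      # imperfect model score
print(round(capture_rate(risk, died), 2))
```

This metric is often more actionable than the AUC for screening programmes, since it directly answers "how many deaths would a follow-up of the top 5% reach?".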

CONCLUSION : Machine learning algorithms were able to identify with very high predictive performance the neonatal mortality risk of newborns using only routinely collected data.

Batista André F M, Diniz Carmen S G, Bonilha Eliana A, Kawachi Ichiro, Chiavegatto Filho Alexandre D P

2021-Jul-21

Artificial intelligence, Birth records, Brazil, Machine learning, Neonatal mortality, Prediction

General

Ant colony optimization with Cauchy and greedy Levy mutations for multilevel COVID-19 X-ray image segmentation.

In Computers in biology and medicine

This paper focuses on multilevel COVID-19 X-ray image segmentation based on swarm intelligence optimization to improve the diagnostic level of COVID-19. We present a new ant colony optimization with the Cauchy mutation and the greedy Levy mutation, termed CLACO, for continuous domains. Specifically, the Cauchy mutation is applied to the end phase of ant foraging in CLACO to enhance its searchability and to boost its convergence rate. The greedy Levy mutation is applied to the optimal ant individuals to confer an improved ability to jump out of local optima. Furthermore, this paper develops a novel CLACO-based multilevel image segmentation method, termed CLACO-MIS. Using 2D Kapur's entropy as the CLACO fitness function based on 2D histograms consisting of non-local mean filtered images and grayscale images, CLACO-MIS was successfully applied to the segmentation of COVID-19 X-ray images. A comparison of CLACO with some relevant variants and other excellent peers on 30 benchmark functions from IEEE CEC2014 demonstrates the superior performance of CLACO in terms of search capability, convergence speed, and the ability to escape local optima. Moreover, CLACO-MIS was shown to have a better segmentation effect and a stronger adaptability at different threshold levels than other methods in segmentation experiments on COVID-19 X-ray images. Therefore, CLACO-MIS has great potential to be used for improving the diagnostic level of COVID-19. This research will host a webservice for any question at https://aliasgharheidari.com.
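
The 2D Kapur fitness used above generalizes the classic 1D criterion, which picks the threshold maximizing the summed entropies of the two histogram classes. A 1D, single-threshold sketch (the paper's version is 2D, multilevel, and optimized with CLACO rather than exhaustive search; the toy histogram is illustrative):

```python
import numpy as np

def kapur_threshold(hist):
    """Single Kapur threshold for a grayscale histogram (1D sketch)."""
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, len(p)):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        h = 0.0
        for cls in (p[:t] / w0, p[t:] / w1):   # entropy of each class
            nz = cls[cls > 0]
            h += -(nz * np.log(nz)).sum()
        if h > best_h:
            best_h, best_t = h, t
    return best_t

# Bimodal toy histogram over 8 gray levels; the gap separates the two modes
hist = np.array([10.0, 10, 10, 0, 0, 10, 10, 10])
print(kapur_threshold(hist))
```

Exhaustive search is fine for one threshold, but the search space grows combinatorially with the number of thresholds, which is why multilevel variants turn to swarm optimizers such as CLACO.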

Liu Lei, Zhao Dong, Yu Fanhua, Heidari Ali Asghar, Li Chengye, Ouyang Jinsheng, Chen Huiling, Mafarja Majdi, Turabieh Hamza, Pan Jingye

2021-Jul-03

Ant colony optimization, COVID-19, Diagnosis, Image, Meta-heuristic, Swarm-intelligence

General

Role of deep learning in brain tumor detection and classification (2015 to 2020): A review.

In Computerized medical imaging and graphics : the official journal of the Computerized Medical Imaging Society

During the last decade, computer vision and machine learning have revolutionized the world in every way possible. Deep learning is a subfield of machine learning that has shown remarkable results in many fields, especially the biomedical field, owing to its ability to handle huge amounts of data. Its potential has also been applied and tested in the detection of brain tumors using MRI images for effective prognosis, and it has shown remarkable performance. The main objective of this work is to present a detailed critical analysis of the research already done to detect and classify brain tumors through MRI images in the recent past. This analysis is specifically beneficial for researchers who are experts in deep learning and are interested in applying their expertise to brain tumor detection and classification. As a first step, a brief review of the past research papers using deep learning for brain tumor classification and detection is carried out. Afterwards, a critical analysis of the deep learning techniques proposed in these research papers (2015-2020) is presented in tabular form. Finally, the conclusion highlights the merits and demerits of deep neural networks. The results formulated in this paper will provide future researchers with a thorough comparison of recent studies, along with an idea of the effectiveness of various deep learning approaches. We are confident that this study will greatly assist in the advancement of brain tumor research.

Nazir Maria, Shakil Sadia, Khurshid Khurram

2021-May-15

Brain tumor, Deep learning, Machine learning, Neural networks

General

Leveraging unsupervised image registration for discovery of landmark shape descriptor.

In Medical image analysis

In current biological and medical research, statistical shape modeling (SSM) provides an essential framework for the characterization of anatomy/morphology. Such analysis is often driven by the identification of a relatively small number of geometrically consistent features found across the samples of a population. These features can subsequently provide information about the population shape variation. Dense correspondence models can provide ease of computation and yield an interpretable low-dimensional shape descriptor when followed by dimensionality reduction. However, automatic methods for obtaining such correspondences usually require image segmentation followed by significant preprocessing, which is taxing in terms of both computation and human resources. In many cases, the segmentation and subsequent processing require manual guidance and anatomy-specific domain expertise. This paper proposes a self-supervised deep learning approach for discovering landmarks from images that can directly be used as a shape descriptor for subsequent analysis. We use landmark-driven image registration as the primary task to force the neural network to discover landmarks that register the images well. We also propose a regularization term that allows for robust optimization of the neural network and ensures that the landmarks uniformly span the image domain. The proposed method circumvents segmentation and preprocessing and directly produces a usable shape descriptor using just 2D or 3D images. In addition, we propose two variants of the training loss function that allow prior shape information to be integrated into the model. We apply this framework to several 2D and 3D datasets to obtain their shape descriptors. We analyze these shape descriptors for their efficacy in capturing shape information by performing different shape-driven applications depending on the data, ranging from shape clustering to severity prediction to outcome diagnosis.

Bhalodia Riddhish, Elhabian Shireen, Kavan Ladislav, Whitaker Ross

2021-Jul-09

Image registration, Machine learning, Self-supervised learning, Statistical shape modeling

General

Online sensorimotor learning and adaptation for inverse dynamics control.

In Neural networks : the official journal of the International Neural Network Society

We propose a micro-data (<10 trials) sensorimotor learning and adaptation (SEED) model for human-like arm inverse dynamics control. The SEED model consists of a feedforward Gaussian motor primitive (GATE) neural network and an adaptive feedback impedance (AIM) mechanism. Sensorimotor weights are learned over trials in the GATE network, while the AIM mechanism tunes impedance gains online within a trial. The model was validated on periodic and non-periodic tracking tasks with a two-joint robot arm. The proposed model enables the arm to stably learn the tasks within 10 trials, compared to the thousands of trials required by state-of-the-art deep learning. The model facilitates the exploration of unknown arm dynamics, in which the elbow joint requires much less active control than the shoulder, accounting for less than 3% of the overall effort. This finding complies with a proximal-distal control gradient in human arm control. Taken together, the proposed SEED model paves the way for data-efficient sensorimotor learning and adaptation of human-like arm movement.
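
The feedback-impedance idea can be illustrated with a fixed-gain, one-joint simulation: torque is generated from stiffness and damping gains on the tracking error. All numbers below (inertia, friction, gains) are assumed for illustration; the paper's AIM mechanism additionally adapts the gains online, which is not shown here:

```python
# One-joint arm: I*qdd = tau - b*qd  (inertia I and friction b are assumed)
I, b, dt = 1.0, 0.5, 0.01
Kp, Kd = 25.0, 8.0                  # impedance gains: stiffness and damping
q, qd, target = 0.0, 0.0, 1.0

errors = []
for _ in range(500):                # 5 s of simulated time, Euler integration
    tau = Kp * (target - q) + Kd * (0.0 - qd)   # impedance control law
    qdd = (tau - b * qd) / I
    qd += qdd * dt
    q += qd * dt
    errors.append(abs(target - q))

print(f"initial error {errors[0]:.3f}, final error {errors[-1]:.6f}")
```

In an adaptive scheme, the gains themselves would be updated from the running error within the trial, trading compliance against tracking accuracy.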

Xiong Xiaofeng, Manoonpong Poramate

2021-Jul-09

Gaussian model, Neural network, Robot control, Variable compliant control

General

Deep learning for sex classification in resting-state and task functional brain networks from the UK Biobank.

In NeuroImage ; h5-index 117.0

Classification of whole-brain functional connectivity MRI data with convolutional neural networks (CNNs) has shown promise, but the complexity of these models impedes understanding of which aspects of brain activity contribute to classification. While visualization techniques have been developed to interpret CNNs, bias inherent in the method of encoding abstract input data, as well as the natural variance of deep learning models, detracts from the accuracy of these techniques. We introduce a stochastic encoding method in an ensemble of CNNs to classify functional connectomes by sex. We applied our method to resting-state and task data from the UK Biobank, using two visualization techniques to measure the salience of three brain networks involved in task and resting states, and their interaction. To regress out confounding factors such as head motion, age, and intracranial volume, we introduced a multivariate balancing algorithm to ensure equal distributions of such covariates between classes in our data. We achieved a final AUROC of 0.8459. We found that resting-state data classifies more accurately than task data, with the inner salience network playing the most important role of the three networks in the classification of resting-state data, and connections to the central executive network in task data.
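
Covariate balancing of the kind described can be sketched in a simplified, univariate form: bin the confound (e.g. age) and subsample each class to the per-bin minimum so both classes share its distribution. The paper's algorithm is multivariate; the binning scheme and data below are assumed for illustration:

```python
import numpy as np

def balance(covariate, cls, n_bins=5, seed=0):
    """Subsample indices so both classes match per-bin counts of `covariate`."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(covariate.min(), covariate.max(), n_bins + 1)[1:-1]
    bins = np.digitize(covariate, edges)
    keep = []
    for b in np.unique(bins):
        idx0 = np.where((bins == b) & (cls == 0))[0]
        idx1 = np.where((bins == b) & (cls == 1))[0]
        n = min(len(idx0), len(idx1))
        if n == 0:
            continue
        keep += list(rng.choice(idx0, n, replace=False))
        keep += list(rng.choice(idx1, n, replace=False))
    return np.sort(np.array(keep))

rng = np.random.default_rng(3)
age = np.concatenate([rng.normal(55, 8, 500), rng.normal(62, 8, 500)])
cls = np.concatenate([np.zeros(500, int), np.ones(500, int)])
kept = balance(age, cls)   # after balancing, each bin holds both classes equally
```

Balancing by subsampling, rather than regressing confounds out of the inputs, keeps the CNN's input encoding untouched at the cost of discarding some data.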

Leming Matthew, Suckling John

2021-Jul-19

General

Innovations in ex vivo Light Sheet Fluorescence Microscopy.

In Progress in biophysics and molecular biology

Light Sheet Fluorescence Microscopy (LSFM) has revolutionized optical imaging of biological specimens, as this technique can produce 3D fluorescence images of entire samples with high spatiotemporal resolution. In this manuscript, we aim to provide readers with an overview of the field of LSFM on ex vivo samples. Recent advances in LSFM architectures have made the technique widely accessible and have improved its acquisition speed and resolution, among other features. These developments are strongly supported by quantitative analysis of the huge image volumes produced, thanks to the boost in computational capacities, the advent of deep learning techniques, and the combination of LSFM with other imaging modalities. Namely, LSFM allows for the characterization of biological structures, disease manifestations, and drug efficacy studies. This information can ultimately serve to develop novel diagnostic procedures, treatments, and even to model organ physiology in health and disease.

Delgado-Rodriguez Pablo, Brooks Claire Jordan, Vaquero Juan José, Munoz-Barrutia Arrate

2021-Jul-19

Architecture, Fluorescence, Image analysis, Microscopy, Multimodal, Quantification

General

Proteome plasticity in response to persistent environmental change.

In Molecular cell ; h5-index 132.0

Temperature is a variable component of the environment, and all organisms must deal with or adapt to temperature change. Acute temperature change activates cellular stress responses, resulting in refolding or removal of damaged proteins. However, how organisms adapt to long-term temperature change remains largely unexplored. Here we report that budding yeast responds to long-term high temperature challenge by switching from chaperone induction to reduction of temperature-sensitive proteins and re-localizing a portion of its proteome. Surprisingly, we also find that many proteins adopt an alternative conformation. Using Fet3p as an example, we find that the temperature-dependent conformational difference is accompanied by distinct thermostability, subcellular localization, and, importantly, cellular functions. We postulate that, in addition to the known mechanisms of adaptation, conformational plasticity allows some polypeptides to acquire new biophysical properties and functions when environmental change endures.

Domnauer Matthew, Zheng Fan, Li Liying, Zhang Yanxiao, Chang Catherine E, Unruh Jay R, Conkright-Fincham Juliana, McCroskey Scott, Florens Laurence, Zhang Ying, Seidel Christopher, Fong Benjamin, Schilling Birgit, Sharma Rishi, Ramanathan Arvind, Si Kausik, Zhou Chuankai

2021-Jul-13

Fet3, environmental stress, machine learning, moonlighting functions, protein conformation changes, thermal acclimation

General

Bridging neuronal correlations and dimensionality reduction.

In Neuron ; h5-index 148.0

Two commonly used approaches to study interactions among neurons are spike count correlation, which describes pairs of neurons, and dimensionality reduction, applied to a population of neurons. Although both approaches have been used to study trial-to-trial neuronal variability correlated among neurons, they are often used in isolation and have not been directly related. We first established concrete mathematical and empirical relationships between pairwise correlation and metrics of population-wide covariability based on dimensionality reduction. Applying these insights to macaque V4 population recordings, we found that the previously reported decrease in mean pairwise correlation associated with attention stemmed from three distinct changes in population-wide covariability. Overall, our work builds the intuition and formalism to bridge between pairwise correlation and population-wide covariability and presents a cautionary tale about the inferences one can make about population activity by using a single statistic, whether it be mean pairwise correlation or dimensionality.
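
The bridge between the two approaches can be illustrated numerically: a single latent signal shared across neurons both raises mean pairwise spike-count correlation and concentrates population covariance in one dimension. A toy simulation (not the paper's derivation or the V4 data; loadings and noise are synthetic):

```python
import numpy as np

rng = np.random.default_rng(4)
n_neurons, n_trials = 50, 1000
shared = rng.normal(size=n_trials)                 # one shared latent fluctuation
loadings = rng.uniform(0.5, 1.0, n_neurons)        # per-neuron coupling strengths
counts = np.outer(shared, loadings) + rng.normal(size=(n_trials, n_neurons))

# Pairwise view: mean spike count correlation over all neuron pairs
corr = np.corrcoef(counts, rowvar=False)
mean_rsc = corr[np.triu_indices(n_neurons, k=1)].mean()

# Population view: share of variance captured by the top covariance dimension
evals = np.linalg.eigvalsh(np.cov(counts, rowvar=False))
top_frac = evals[-1] / evals.sum()

print(f"mean pairwise correlation {mean_rsc:.2f}, top-dimension share {top_frac:.2f}")
```

Both statistics are driven by the same shared latent, which is exactly why a change in one (e.g. with attention) can stem from several distinct changes in population-wide covariability.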

Umakantha Akash, Morina Rudina, Cowley Benjamin R, Snyder Adam C, Smith Matthew A, Yu Byron M

2021-Jul-16

dimensionality reduction, neuronal population, spatial attention, spike count correlation, visual area V4

General

Finding a new balance between a genetics-first or phenotype-first approach to the study of disease.

In Neuron ; h5-index 148.0

Successes in neuroscience using a genetics-first approach to characterizing disorders such as autism have eclipsed the scientific and clinical value of a comprehensive phenotype-first approach, whether clinical or molecular. Recent high-throughput phenotyping techniques using machine learning, electronic medical records, and even administrative databases show the value of a synthesis between the two approaches.

Kohane Isaac S

2021-Jul-21

autism, electronic health record, endophenotypes, genetics-first, phenome-wide study, phenotyping, real-world data

General

Predicting mutant outcome by combining deep mutational scanning and machine learning.

In Proteins

MOTIVATION : Deep mutational scanning provides an unprecedented wealth of quantitative data regarding the functional outcome of mutations in proteins. A single experiment may measure properties (e.g., structural stability) of numerous protein variants. Leveraging the experimental data to gain insights about unexplored regions of the mutational landscape is a major computational challenge. Such insights may facilitate further experimental work and accelerate the development of novel protein variants with beneficial therapeutic or industrially relevant properties. Here we present a novel machine learning approach for the prediction of functional mutation outcomes in the context of deep mutational screens.

RESULTS : Using sequence (one-hot) features of variants with known properties, as well as structural features derived from models thereof, we train predictive statistical models to estimate the unknown properties of other variants. The utility of the new computational scheme is demonstrated using five sets of mutational scanning data, denoted "targets": (a) protease specificity of APPI (amyloid precursor protein inhibitor) variants; (b-d) three stability-related properties of IGBPG (immunoglobulin G-binding β1 domain of streptococcal protein G) variants; and (e) fluorescence of GFP (green fluorescent protein) variants. Performance is measured by the overall correlation of the predicted and observed properties, and by enrichment, the ability to predict the most potent variants and presumably guide further experiments. Despite the diversity of the targets, the statistical models can generalize from variant examples and predict the properties of test variants with both single and multiple mutations.
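
The sequence (one-hot) features mentioned above are simple to construct: each position contributes a 20-dimensional indicator over the amino-acid alphabet, and a regularized linear model can then be fit in closed form. A sketch with a toy alphabet-length-4 variant set and hypothetical property values (not the paper's data or exact model):

```python
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seq):
    """Flattened per-position one-hot encoding of a protein sequence."""
    x = np.zeros((len(seq), len(AMINO)))
    for i, aa in enumerate(seq):
        x[i, AMINO.index(aa)] = 1.0
    return x.ravel()

variants = ["ACDE", "ACDF", "GCDE", "ACDE"]
X = np.stack([one_hot(s) for s in variants])
y = np.array([1.0, 0.4, 0.7, 1.0])            # hypothetical measured property

lam = 0.1                                      # ridge penalty (assumed)
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
pred = X @ w                                   # identical variants get identical predictions
```

Tree ensembles such as the random forests named in the keywords can consume the same one-hot matrix, optionally concatenated with structure-derived features.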

Sarfati Hagit, Naftaly Si, Papo Niv, Keasar Chen

2021-Jul-22

Deep mutational scanning, Machine learning, Mutant outcome, Prediction, Protein library, Protein-protein interactions, Random forest, Specificity, Structural features, Structural stability

General General

Disease ontologies for knowledge graphs.

In BMC bioinformatics

BACKGROUND : Data integration to build a biomedical knowledge graph is a challenging task. Multiple disease ontologies are used across data sources and publications, each with its own hierarchy. A common task is to map between ontologies, find disease clusters, and finally build a representation of the chosen disease area. There is a shortage of published resources and tools to facilitate interactive, efficient and flexible cross-referencing and analysis of the multiple disease ontologies commonly found in data sources and research.

RESULTS : Our results are represented as a knowledge graph solution that uses disease ontology cross-references and facilitates switching between ontology hierarchies for data integration and other tasks.

CONCLUSIONS : Grakn core with the pre-installed "Disease ontologies for knowledge graphs" facilitates building the biomedical knowledge graph and provides an elegant solution to the multiple-disease-ontologies problem.
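The cross-referencing idea can be illustrated in a few lines: treat cross-references as edges and group connected terms into disease clusters with a small union-find. The ontology identifiers below are illustrative, not taken from the paper:

```python
from collections import defaultdict

# Hypothetical cross-reference pairs between disease ontology terms.
xrefs = [
    ("DOID:1612", "MONDO:0007254"),
    ("MONDO:0007254", "MESH:D001943"),
    ("DOID:9352", "MONDO:0005148"),
]

def disease_clusters(pairs):
    """Group ontology terms linked by cross-references (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for term in parent:
        clusters[find(term)].add(term)
    return sorted(map(sorted, clusters.values()))

clusters = disease_clusters(xrefs)
```

Each resulting cluster is a set of identifiers that refer to the same disease across hierarchies, which is the switching step the knowledge graph supports.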

Kurbatova Natalja, Swiers Rowan

2021-Jul-21

Data integration, Knowledge graph, Ontologies

General General

Computational predictions for protein sequences of COVID-19 virus via machine learning algorithms.

In Medical & biological engineering & computing ; h5-index 32.0

The rapid spread of coronavirus disease (COVID-19) has become a worldwide pandemic, with more than 15 million patients reported across 27 countries. Therefore, the computational biology of this virus as it correlates with the human population urgently needs to be understood. In this paper, the classification of human COVID-19 protein sequences according to country is presented based on machine learning algorithms. The proposed model distinguishes 9238 sequences using three stages: data preprocessing, data labeling, and classification. In the first stage, the data preprocessing step converts the amino acids of COVID-19 protein sequences into eight groups of numbers based on the amino acids' volume and dipole, following the conjoint triad (CT) method. In the second stage, there are two methods for labeling data from the 27 countries, numbered 0 to 26. The first method assigns one number to each country according to the country code numbers, while the second method uses binary elements for each country. In the last stage, machine learning algorithms are used to classify the COVID-19 protein sequences according to their countries. The obtained results demonstrate 100% accuracy, 100% sensitivity, and 90% specificity using the country-based binary labeling method with a linear support vector machine (SVM) classifier. Furthermore, with significant infection data, the USA is more prone to correct classification compared to other countries with fewer data. The unbalanced data for COVID-19 protein sequences is considered a major issue, especially as the available US data represent 76% of the total 9238 sequences. The proposed model will act as a prediction tool for COVID-19 protein sequences in different countries.
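The paper groups amino acids into eight classes by volume and dipole; the sketch below uses the classic seven-class conjoint triad grouping (Shen et al.) to show the idea of converting a protein sequence into triad-count features. The exact grouping used by the authors may differ:

```python
from collections import Counter
from itertools import product

# Classic seven-class CT grouping by dipole/volume (illustrative; the paper
# uses an eight-group variant).
GROUPS = {"AGV": 0, "ILFP": 1, "YMTS": 2, "HNQW": 3, "RK": 4, "DE": 5, "C": 6}
AA2GROUP = {aa: g for letters, g in GROUPS.items() for aa in letters}

def conjoint_triad(seq):
    """Count overlapping triads of amino-acid group codes (7^3 features)."""
    codes = [AA2GROUP[aa] for aa in seq]
    counts = Counter(zip(codes, codes[1:], codes[2:]))
    return [counts.get(t, 0) for t in product(range(7), repeat=3)]

features = conjoint_triad("MKTAYDE")
```

The fixed-length count vector is what makes sequences of different lengths comparable inputs for an SVM or similar classifier.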

Afify Heba M, Zanaty Muhammad S

2021-Jul-22

COVID-19 protein sequences, Conjoint triad (CT), Machine learning algorithms, Support vector machine (SVM)

Ophthalmology Ophthalmology

Individualized Glaucoma Change Detection Using Deep Learning Auto Encoder-Based Regions of Interest.

In Translational vision science & technology

Purpose : To compare change over time in eye-specific optical coherence tomography (OCT) retinal nerve fiber layer (RNFL)-based region-of-interest (ROI) maps developed using unsupervised deep-learning auto-encoders (DL-AE) to circumpapillary RNFL (cpRNFL) thickness for the detection of glaucomatous progression.

Methods : Forty-four progressing glaucoma eyes (by stereophotograph assessment), 189 nonprogressing glaucoma eyes (by stereophotograph assessment), and 109 healthy eyes were followed for ≥3 years with ≥4 visits using OCT. The San Diego Automated Layer Segmentation Algorithm was used to automatically segment the RNFL layer from raw three-dimensional OCT images. For each longitudinal series, DL-AEs were used to generate individualized eye-based ROI maps by identifying RNFL regions of likely progression and no change. Sensitivities and specificities for detecting change over time and rates of change over time were compared for the DL-AE ROI and global cpRNFL thickness measurements derived from a 2.22-mm to 3.45-mm annulus centered on the optic disc.

Results : The sensitivity for detecting change in progressing eyes was greater for DL-AE ROIs than for global cpRNFL annulus thicknesses (0.90 and 0.63, respectively). The specificity for detecting not likely progression in nonprogressing eyes was similar (0.92 and 0.93, respectively). The mean rates of change in DL-AE ROI were significantly faster than for cpRNFL annulus thickness in progressing eyes (-1.28 µm/y vs. -0.83 µm/y) and nonprogressing eyes (-1.03 µm/y vs. -0.78 µm/y).

Conclusions : Eye-specific ROIs identified using DL-AE analysis of OCT images show promise for improving assessment of glaucomatous progression.

Translational Relevance : The detection and monitoring of structural glaucomatous progression can be improved by considering eye-specific regions of likely progression identified using deep learning.

Bowd Christopher, Belghith Akram, Christopher Mark, Goldbaum Michael H, Fazio Massimo A, Girkin Christopher A, Liebmann Jeffrey M, de Moraes Carlos Gustavo, Weinreb Robert N, Zangwill Linda M

2021-Jul-01

General General

Magnetic-resonance-based measurement of electromagnetic fields and conductivity in vivo using single current administration-A machine learning approach.

In PloS one ; h5-index 176.0

Diffusion tensor magnetic resonance electrical impedance tomography (DT-MREIT) is a newly developed technique that combines MR-based measurements of magnetic flux density with diffusion tensor MRI (DT-MRI) data to reconstruct electrical conductivity tensor distributions. DT-MREIT techniques normally require injection of two independent current patterns for unique reconstruction of conductivity characteristics. In this paper, we demonstrate an algorithm that can be used to reconstruct the position-dependent scale factor relating conductivity and diffusion tensors, using flux density data measured from only one current injection. We demonstrate how these images can also be used to reconstruct electric field and current density distributions. Reconstructions were performed using a mimetic algorithm and simulations of magnetic flux density from complementary electrode montages, combined with a small-scale machine learning approach. In a biological tissue phantom, we found that the method reduced relative errors between single-current and two-current DT-MREIT results to around 10%. For in vivo human experimental data the error was about 15%. These results suggest that incorporation of machine learning may make it easier to recover electrical conductivity tensors and electric field images during neuromodulation therapy without the need for multiple current administrations.

Sajib Saurav Z K, Chauhan Munish, Kwon Oh In, Sadleir Rosalind J

2021

General General

The School Attachment Monitor-A novel computational tool for assessment of attachment in middle childhood.

In PloS one ; h5-index 176.0

BACKGROUND : Attachment research has been limited by the lack of quick and easy measures. We report the development and validation of the School Attachment Monitor (SAM), a novel measure for large-scale assessment of attachment in children aged 5-9 in the general population. SAM offers automatic presentation, on computer, of story-stems based on the Manchester Child Attachment Story Task (MCAST), without the need for trained administrators. SAM is delivered by novel software which interacts with child participants, starting with warm-up activities to familiarise them with the task. Children's story completion is video recorded and augmented by 'smart dolls' that the child can hold and manipulate, with movement sensors for data collection. The design of SAM was informed by children in the target age range to establish their task understanding and incorporate their innovative ideas for improving the SAM software.

METHODS : 130 children aged 5-9 years were recruited from mainstream primary schools. In Phase 1, 61 children completed both SAM and MCAST. Inter-rater reliability and rating concordance were compared between SAM and MCAST. In Phase 2, a further 44 children completed SAM and, including the children who completed SAM in Phase 1 (total n = 105), a machine learning algorithm was developed using a "majority vote" procedure in which, for each child, 500 non-overlapping video frames contribute to the decision.
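The "majority vote" aggregation over video frames can be sketched in a few lines; the frame counts and labels below are invented:

```python
from collections import Counter

def majority_vote(frame_labels):
    """Aggregate per-frame classifier predictions into one decision per child."""
    return Counter(frame_labels).most_common(1)[0][0]

# e.g. 500 non-overlapping frames, each classified secure/insecure.
frames = ["secure"] * 320 + ["insecure"] * 180
decision = majority_vote(frames)
```

Voting over many frames smooths out individual frame-level misclassifications, which is why per-child concordance can exceed per-frame accuracy.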

RESULTS : Using manual rating, SAM-MCAST concordance was excellent (89% secure versus insecure; 97% organised versus disorganised; 86% four-way). Comparison of human ratings of SAM versus the machine learning algorithm showed over 80% concordance.

CONCLUSIONS : We have developed a new tool for measuring attachment at the population level, which has good reliability compared to a validated attachment measure and has the potential for automatic rating, opening the door to measurement of attachment in large populations.

Rooksby Maki, Di Folco Simona, Tayarani Mohammad, Vo Dong-Bach, Huan Rui, Vinciarelli Alessandro, Brewster Stephen A, Minnis Helen

2021

General General

A simple parametric representation of the Hodgkin-Huxley model.

In PloS one ; h5-index 176.0

The Hodgkin-Huxley model, decades after its first presentation, is still a reference model in neuroscience as it has successfully reproduced the electrophysiological activity of many organisms. The primary signal in the model represents the membrane potential of a neuron. A simple representation of this signal is presented in this paper. The new proposal is an adapted Frequency Modulated Möbius multicomponent model defined as a signal plus error model in which the signal is decomposed as a sum of waves. The main strengths of the method are the simple parametric formulation, the interpretability and flexibility of the parameters that describe and discriminate the waveforms, the estimators' identifiability and accuracy, and the robustness against noise. The approach is validated with a broad simulation experiment of Hodgkin-Huxley signals and real data from squid giant axons. Interesting differences between simulated and real data emerge from the comparison of the parameter configurations. Furthermore, the potential of the FMM parameters to predict Hodgkin-Huxley model parameters is shown using different Machine Learning methods. Finally, promising contributions of the approach in Spike Sorting and cell-type classification are detailed.
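In the authors' earlier FMM papers, a single wave component has the closed form M + A·cos(β + 2·arctan(ω·tan((t − α)/2))); the sketch below, with arbitrary parameter values, generates one such wave over one cycle:

```python
import numpy as np

def fmm_wave(t, M, A, alpha, beta, omega):
    """Single Frequency Modulated Moebius (FMM) wave component."""
    phase = beta + 2.0 * np.arctan(omega * np.tan((t - alpha) / 2.0))
    return M + A * np.cos(phase)

# One cycle of t in [0, 2*pi); parameter values are arbitrary.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
signal = fmm_wave(t, M=0.0, A=1.0, alpha=np.pi, beta=0.0, omega=0.2)
```

Small ω produces the sharp, spike-like asymmetric waves that make the model a good fit for action potentials; ω = 1 recovers an ordinary cosine.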

Rodríguez-Collado Alejandro, Rueda Cristina

2021

General General

A machine learning analysis of risk and protective factors of suicidal thoughts and behaviors in college students.

In Journal of American college health : J of ACH

OBJECTIVE : To identify robust and reproducible factors associated with suicidal thoughts and behaviors (STBs) in college students.

METHODS : 356 first-year university students completed a large battery of demographic and clinically relevant self-report measures during the first semester of college and at end of year (n = 228). The Suicide Behaviors Questionnaire-Revised (SBQ-R) assessed STBs. A machine learning (ML) pipeline using stacking and nested cross-validation examined correlates of SBQ-R scores.

RESULTS : 9.6% of students were identified by the SBQ-R as being at significant risk of STBs. The ML algorithm explained 28.3% of the variance (95% CI: 28-28.5%) in baseline SBQ-R scores, with depression severity, social isolation, meaning and purpose in life, and positive affect among the most important factors. There was a significant reduction in STBs at end of year, with only 1.8% of students identified as being at significant risk.
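Nested cross-validation of the kind described, with an inner loop for hyperparameter tuning and an outer loop for unbiased performance estimation, can be sketched on synthetic data. The estimator and grid below are placeholders, not the study's stacking pipeline:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic stand-in for the questionnaire data (not the study's dataset).
X, y = make_regression(n_samples=120, n_features=10, noise=5.0, random_state=0)

inner = KFold(n_splits=3, shuffle=True, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=0)

# Inner loop tunes the hyperparameter; outer loop estimates generalization.
search = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=inner)
scores = cross_val_score(search, X, y, cv=outer, scoring="r2")
```

Because tuning happens entirely inside each outer fold, the outer scores estimate variance explained on genuinely unseen data, which is the quantity the abstract reports.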

CONCLUSION : Analyses replicated known factors associated with STBs during the first semester of college and identified novel, potentially modifiable factors including positive affect and social connectedness.

Kirlic Namik, Akeman Elisabeth, DeVille Danielle C, Yeh Hung-Wen, Cosgrove Kelly T, McDermott Timothy J, Touthang James, Clausen Ashley, Paulus Martin P, Aupperle Robin L

2021-Jul-22

Surgery Surgery

Emerging Technologies for In Vitro Inhalation Toxicology.

In Advanced healthcare materials

Respiratory toxicology remains a major research area in the 21st century, since the current scenario of airborne viral infection transmission and pollutant inhalation is expected to raise annual morbidity beyond 2 million. Clinical and epidemiological research connecting human exposure to air contaminants with adverse pulmonary health outcomes is, therefore, an immediate subject of human health assessment. Important observations in defining the systemic effects of environmental contaminants on inhalation metabolic dysfunction, liver health, and the gastrointestinal tract have been well explored with in vivo models. In this review, a framework is provided and a paradigm established for inhalation toxicity testing in vitro, with a brief overview of breathing Lungs-on-Chip (LoC) design concepts. Optimized bioengineering approaches and microfluidics are presented with their fundamental pros and cons. Different strategies that researchers apply in inhalation toxicity studies to assess a variety of inhalable substances, and the relevant LoC approaches, are described. A case study from the published literature is discussed, and arguments about reproducibility and in vitro/in vivo correlations are framed. Finally, the opportunities and challenges in soft robotics and in a systems inhalation toxicology approach integrating bioengineering, machine learning, and artificial intelligence for future toxicology are discussed.

Singh Ajay Vikram, Romeo Anthony, Scott Kassandra, Wagener Sandra, Leibrock Lars, Laux Peter, Luch Andreas, Kerkar Pranali, Balakrishnan Shidin, Dakua Sarada Prasad, Park Byung-Wook

2021-Jul-22

air-liquid-interfaces, inhalation, lungs-on-chip, machine learning, toxicology

Pathology Pathology

Quantitative particle agglutination assay for point-of-care testing using mobile holographic imaging and deep learning.

In Lab on a chip

Particle agglutination assays are widely adopted immunological tests that are based on antigen-antibody interactions. Antibody-coated microscopic particles are mixed with a test sample that potentially contains the target antigen, as a result of which the particles form clusters, with a size that is a function of the antigen concentration and the reaction time. Here, we present a quantitative particle agglutination assay that combines mobile lens-free microscopy and deep learning for rapidly measuring the concentration of a target analyte; as its proof-of-concept, we demonstrate high-sensitivity C-reactive protein (hs-CRP) testing using human serum samples. A dual-channel capillary lateral flow device is designed to host the agglutination reaction using 4 μL of serum sample with a material cost of 1.79 cents per test. A mobile lens-free microscope records time-lapsed inline holograms of the lateral flow device, monitoring the agglutination process over 3 min. These captured holograms are processed, and at each frame the number and area of the particle clusters are automatically extracted and fed into shallow neural networks to predict the CRP concentration. 189 measurements using 88 unique patient serum samples were utilized to train, validate and blindly test our platform, which matched the corresponding ground truth concentrations in the hs-CRP range (0-10 μg mL⁻¹) with an R² value of 0.912. This computational sensing platform was also able to successfully differentiate very high CRP concentrations (e.g., >10-500 μg mL⁻¹) from the hs-CRP range. This mobile, cost-effective and quantitative particle agglutination assay can be useful for various point-of-care sensing needs and global health related applications.

Luo Yi, Joung Hyou-Arm, Esparza Sarah, Rao Jingyou, Garner Omai, Ozcan Aydogan

2021-Jul-22

Radiology Radiology

Using Deep Learning Segmentation for Endotracheal Tube Position Assessment.

In Journal of thoracic imaging

PURPOSE : The purpose of this study was to determine the efficacy of using deep learning segmentation for endotracheal tube (ETT) position on frontal chest x-rays (CXRs).

MATERIALS AND METHODS : This was a retrospective trial involving 936 deidentified frontal CXRs divided into a training set (676), a validation set (50), and two test sets (210 total): an "internal test" set of 100 CXRs from the same institution and an "external test" set of 110 CXRs from a different institution. Each image was labeled by 2 radiologists with the ETT-carina distance. On the training images, 1 radiologist manually segmented the ETT tip and the inferior wall of the carina. A U-NET architecture was constructed to label each pixel of the CXR as belonging to either the ETT, the carina, or neither. This labeling allowed the distance between the ETT and carina to be compared with the average of the 2 radiologists' measurements. The interclass correlation coefficients and the means and SDs of the absolute differences between the U-NET and the radiologists were calculated.

RESULTS : The mean absolute differences between the U-NET and average of radiologist measurements were 0.60±0.61 and 0.48±0.47 cm on the internal and external datasets, respectively. The interclass correlation coefficients were 0.87 (0.82, 0.91) and 0.92 (0.88, 0.94) on the internal and external datasets, respectively.

CONCLUSION : The U-NET model had excellent reliability and performance similar to radiologists in assessing ETT-carina distance.

Schultheis William G, Lakhani Paras

2021-Jul-21

Dermatology Dermatology

Anaphylaxis and digital medicine.

In Current opinion in allergy and clinical immunology

PURPOSE OF THE REVIEW : Digital medicine (mHealth) aims to help patients and healthcare providers (HCPs) improve and facilitate the provision of patient care. It encompasses equipment/connected medical devices, mHealth services and mHealth apps (apps). An updated review on digital health in anaphylaxis is proposed.

RECENT FINDINGS : In anaphylaxis, mHealth is used in electronic health records and registries. It will greatly benefit from the new International Classification of Diseases-11 rules and artificial intelligence. Telehealth was revolutionised by the coronavirus disease 2019 pandemic, and the lessons learnt should be extended to shared decision making in anaphylaxis. Very few apps exist, none of them validated, and there is an urgent need to develop and validate such tools.

SUMMARY : Although digital health appears to be of great importance in anaphylaxis, it is still insufficiently used.

Anto Aram, Sousa-Pinto Bernardo, Bousquet Jean

2021-Jul-20

General General

Predicting Writing Styles of Web-Based Materials for Children's Health Education Using the Selection of Semantic Features: Machine Learning Approach.

In JMIR medical informatics ; h5-index 23.0

BACKGROUND : Medical writing styles can have an impact on the understandability of health educational resources. Amid current web-based health information research, there is a dearth of research-based evidence that demonstrates what constitutes the best practice of the development of web-based health resources on children's health promotion and education.

OBJECTIVE : Using authoritative and highly influential web-based children's health educational resources from the Nemours Foundation, the largest not-for-profit organization promoting children's health and well-being, we aimed to develop machine learning algorithms that discriminate and predict the writing styles of health educational resources on children's versus adult health promotion, across a variety of health educational resources aimed at the general public.

METHODS : The selection of natural language features as predictor variables for the algorithms went through initial automatic feature selection using a ridge classifier, support vector machine, extreme gradient boosted tree, and recursive feature elimination, followed by revision by education experts. We compared algorithms using the automatically selected (n=19) and linguistically enhanced (n=20) feature sets, with the initial feature set (n=115) as the baseline.

RESULTS : Using five-fold cross-validation, compared with the baseline (115 features), the Gaussian Naive Bayes model (20 features) achieved statistically higher mean sensitivity (P=.02; 95% CI -0.016 to 0.1929), mean specificity (P=.02; 95% CI -0.016 to 0.199), mean area under the receiver operating characteristic curve (P=.02; 95% CI -0.007 to 0.140), and mean macro F1 (P=.006; 95% CI 0.016-0.167). The statistically improved performance of the final model (20 features) is in contrast to the statistically insignificant changes between the original feature set (n=115) and the automatically selected features (n=19): mean sensitivity (P=.13; 95% CI -0.1699 to 0.0681), mean specificity (P=.10; 95% CI -0.1389 to 0.4017), mean area under the receiver operating characteristic curve (P=.008; 95% CI 0.0059-0.1126), and mean macro F1 (P=.98; 95% CI -0.0555 to 0.0548). This demonstrates the importance and effectiveness of combining automatic feature selection and expert-based linguistic revision to develop the most effective machine learning algorithms from high-dimensional data sets.
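The combination of automatic feature selection and a Gaussian Naive Bayes classifier can be sketched on synthetic data; the single recursive-feature-elimination step below stands in for the paper's multi-method selection plus expert linguistic revision:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the 115 linguistic features in the study.
X, y = make_classification(n_samples=200, n_features=115, n_informative=20,
                           random_state=0)

# Select down to 20 features, then fit a Gaussian Naive Bayes classifier.
pipe = make_pipeline(
    RFE(RidgeClassifier(), n_features_to_select=20, step=5),
    GaussianNB(),
)
scores = cross_val_score(pipe, X, y, cv=5)
```

Putting selection inside the pipeline ensures that, under cross-validation, features are chosen only from each training fold, avoiding the leakage that selecting once on all data would cause.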

CONCLUSIONS : We developed new evaluation tools for the discrimination and prediction of writing styles of web-based health resources for children's health education and promotion among parents and caregivers of children. User-adaptive automatic assessment of web-based health content holds great promise for distant and remote health education among young readers. Our study leveraged the precision and adaptability of machine learning algorithms and insights from health linguistics to help advance this significant yet understudied area of research.

Xie Wenxiu, Ji Meng, Liu Yanmeng, Hao Tianyong, Chow Chi-Yin

2021-Jul-22

health educational resource development, health linguistics, machine learning, online health education

General General

Automated detection of muscle fatigue conditions from cyclostationary based geometric features of surface electromyography signals.

In Computer methods in biomechanics and biomedical engineering

In this study, an attempt has been made to develop an automated muscle fatigue detection system using cyclostationary-based geometric features of surface electromyography (sEMG) signals. For this purpose, signals are acquired from 58 healthy volunteers under dynamic muscle-fatiguing contractions. The sEMG signals are preprocessed, and epochs of signals under nonfatigue and fatigue conditions are considered for the analysis. A computationally efficient fast Fourier transform-based accumulation algorithm is adapted to compute the spectral correlation density coefficients. The boundary of the spectral density coefficients in the complex plane is obtained using the alpha shape method. The geometric features, namely perimeter, area, circularity, bending energy, eccentricity and inertia, are extracted from the shape, and machine learning models based on the multilayer perceptron (MLP) and extreme learning machine (ELM) are developed using these biomarkers. The results show that cyclostationarity increases in the fatigue condition. All the extracted features are found to differ significantly between the two conditions. The ELM model based on prominent features classifies the sEMG signals with a maximum accuracy of 94.09% and an F-score of 93.75%. Therefore, the proposed approach appears to be useful for analysing fatiguing contractions in neuromuscular conditions.
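Some of the geometric features named above have simple closed forms; a sketch for perimeter, area (shoelace formula), and circularity of a polygonal boundary (the alpha-shape extraction itself is omitted):

```python
import math

def shape_features(points):
    """Perimeter, area (shoelace) and circularity of a closed polygon boundary."""
    n = len(points)
    perimeter = sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))
    area = 0.5 * abs(sum(points[i][0] * points[(i + 1) % n][1]
                         - points[(i + 1) % n][0] * points[i][1]
                         for i in range(n)))
    circularity = 4.0 * math.pi * area / perimeter ** 2  # 1.0 for a circle
    return perimeter, area, circularity

# Unit square as a toy boundary.
p, a, c = shape_features([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Circularity is scale-invariant, so it captures how the boundary's shape (not its size) changes between nonfatigue and fatigue conditions.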

K Divya Bharathi, P A Karthick, S Ramakrishnan

2021-Jul-22

Fatigue analysis, artificial neural networks, cyclostationarity, geometric features, surface electromyography

Public Health Public Health

Exploring Feasibility of Multivariate Deep Learning Models in Predicting COVID-19 Epidemic.

In Frontiers in public health

Background: Mathematical models are powerful tools to study COVID-19. However, one fundamental challenge in current modeling approaches is the lack of accurate and comprehensive data. Complex epidemiological systems such as COVID-19 are especially challenging to the commonly used mechanistic model when our understanding of this pandemic rapidly refreshes.

Objective: We aim to develop a data-driven workflow to extract, process, and develop deep learning (DL) methods to model the COVID-19 epidemic. We provide an alternative modeling approach to complement the current mechanistic modeling paradigm.

Method: We extensively searched, extracted, and annotated relevant datasets from over 60 official press releases in Hubei, China, in 2020. Multivariate long short-term memory (LSTM) models were developed with different architectures to track and predict multivariate COVID-19 time series for 1, 2, and 3 days ahead. As a comparison, univariate LSTMs were also developed to track new cases, total cases, and new deaths.

Results: A comprehensive dataset with 10 variables was retrieved and processed for 125 days in Hubei. Multivariate LSTM had reasonably good predictability on new deaths, hospitalization of both severe and critical patients, total discharges, and total monitored in hospital. Multivariate LSTM showed better results for new and total cases, and new deaths for 1-day-ahead prediction than univariate counterparts, but not for 2-day and 3-day-ahead predictions. Besides, more complex LSTM architecture seemed not to increase overall predictability in this study.

Conclusion: This study demonstrates the feasibility of DL models to complement current mechanistic approaches when the exact epidemiological mechanisms are still under investigation.
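Before any LSTM can be trained, the multivariate series must be framed as supervised windows; a minimal sketch with random stand-in data shaped like the Hubei dataset (125 days × 10 variables), not the actual data:

```python
import numpy as np

def make_windows(series, lookback, horizon):
    """Turn a (T, k) multivariate series into (X, y) pairs for
    horizon-step-ahead forecasting, as fed to an LSTM."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])          # past `lookback` days
        y.append(series[t + lookback + horizon - 1])  # target day
    return np.array(X), np.array(y)

# Random stand-in: 125 days x 10 epidemic variables.
rng = np.random.default_rng(0)
series = rng.random((125, 10))
X, y = make_windows(series, lookback=7, horizon=1)
```

The resulting X has shape (samples, timesteps, variables), the 3-D input layout that LSTM layers in common DL frameworks expect; changing `horizon` to 2 or 3 gives the 2- and 3-day-ahead tasks.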

Chen Shi, Paul Rajib, Janies Daniel, Murphy Keith, Feng Tinghao, Thill Jean-Claude

2021

COVID-19, deep learning, epidemic, modeling, multivariate

General General

Machine Learning Derived Blueprint for Rational Design of the Effective Single-Atom Cathode Catalyst of the Lithium-Sulfur Battery.

In The journal of physical chemistry letters ; h5-index 129.0

The "shuttle effect" and sluggish kinetics at the cathode significantly hinder further improvements of the lithium-sulfur (Li-S) battery, a candidate next-generation energy storage technology. Herein, machine learning based on high-throughput density functional theory calculations is employed to establish the pattern of polysulfide adsorption and screen supported single-atom catalysts (SACs). The adsorptions are classified into two categories, which successfully distinguish S-S bond breaking from the others. Moreover, a general trend of polysulfide adsorption was established regarding both the kind of metal and the nitrogen configuration on the support. The regression model has a mean absolute error of 0.14 eV and exhibited faithful predictive ability. Based on the adsorption energies of soluble polysulfides and the overpotential, the most promising SAC was proposed, and a volcano curve was found. Finally, a reactivity map is supplied to guide SAC design for the Li-S battery.

Lian Zan, Yang Min, Jan Faheem, Li Bo

2021-Jul-22

General General

Applied Machine Learning for Prediction of CO2 Adsorption on Biomass Waste-Derived Porous Carbons.

In Environmental science & technology ; h5-index 132.0

Biomass waste-derived porous carbons (BWDPCs) are a class of complex materials that are widely used in sustainable waste management and carbon capture. However, their diverse textural properties, the presence of various functional groups, and the varied temperatures and pressures to which they are subjected during CO2 adsorption make it challenging to understand the underlying mechanism of CO2 adsorption. Here, we compiled a data set including 527 data points collected from peer-reviewed publications and applied machine learning to systematically map CO2 adsorption as a function of the textural and compositional properties of BWDPCs and adsorption parameters. Various tree-based models were devised, where the gradient boosting decision trees (GBDTs) had the best predictive performance with R2 of 0.98 and 0.84 on the training and test data, respectively. Further, the BWDPCs in the compiled data set were classified into regular porous carbons (RPCs) and heteroatom-doped porous carbons (HDPCs), where again the GBDT model had R2 of 0.99 and 0.98 on the training and 0.86 and 0.79 on the test data for the RPCs and HDPCs, respectively. Feature importance revealed the significance of adsorption parameters, textural properties, and compositional properties in the order of precedence for BWDPC-based CO2 adsorption, effectively guiding the synthesis of porous carbons for CO2 adsorption applications.
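The train/test R² evaluation of a gradient-boosted model can be sketched with synthetic data standing in for the 527-point data set; the feature count, noise level, and split are invented:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in: textural/compositional descriptors -> CO2 uptake.
X, y = make_regression(n_samples=527, n_features=12, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

gbdt = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
r2_train = gbdt.score(X_tr, y_tr)   # cf. the paper's 0.98
r2_test = gbdt.score(X_te, y_te)    # cf. the paper's 0.84

# Feature importances: the quantity behind the paper's precedence ranking.
importances = gbdt.feature_importances_
```

The gap between training and test R², visible here just as in the paper, is the usual sign of mild overfitting; the importances sum to one and rank the descriptors' contributions.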

Yuan Xiangzhou, Suvarna Manu, Low Sean, Dissanayake Pavani Dulanja, Lee Ki Bong, Li Jie, Wang Xiaonan, Ok Yong Sik

2021-Jul-22

carbon materials, gas adsorption and separation, gradient boosting decision trees, low carbon technology, machine learning, sustainable waste management

General General

Exploiting the power of information in medical education.

In Medical teacher

The explosion of medical information demands a thorough reconsideration of medical education, including what we teach and assess, how we educate, and whom we educate. Physicians of the future will need to be self-aware, self-directed, resource-effective team players who can synthesize and apply summarized information and communicate clearly. Training in metacognition, data science, informatics, and artificial intelligence is needed. Education programs must shift focus from content delivery to providing students explicit scaffolding for future learning, such as the Master Adaptive Learner model. Additionally, educators should leverage informatics to improve the process of education and foster individualized, precision education. Finally, attributes of the successful physician of the future should inform adjustments in recruitment and admissions processes. This paper explores how member schools of the American Medical Association Accelerating Change in Medical Education Consortium adjusted all aspects of educational programming in acknowledgment of the rapid expansion of information.

Cutrer William B, Spickard W Anderson, Triola Marc M, Allen Bradley L, Spell Nathan, Herrine Steven K, Dalrymple John L, Gorman Paul N, Lomis Kimberly D

2021-Jul

Metacognition, active learning, artificial intelligence, clinical informatics, electronic health record, medical education

General General

iDNA6mA-Rice-DL: A local web server for identifying DNA N6-methyladenine sites in rice genome by deep learning method.

In Journal of bioinformatics and computational biology

Accurate detection of N6-methyladenine (6mA) sites by biochemical experiments will help to reveal their biological functions; however, these wet-lab experiments are laborious and expensive. It is therefore necessary to introduce a powerful computational model to identify 6mA sites on a genomic scale, especially for plant genomes. In view of this, we propose a model called iDNA6mA-Rice-DL for the effective identification of 6mA sites in the rice genome, an intelligent computing model based on a deep learning method. Traditional machine learning methods require features to be prepared in advance for analysis. In contrast, our proposed model automatically encodes and extracts key DNA features through an embedding layer and several groups of dense layers. We use an independent dataset to evaluate the generalization ability of our model. An area under the receiver operating characteristic curve (auROC) of 0.98 with an accuracy of 95.96% was obtained. The experimental results demonstrate that our model performs well in predicting 6mA sites in the rice genome. A user-friendly local web server has been established. The Docker image of the local web server can be freely downloaded at https://hub.docker.com/r/his1server/idna6ma-rice-dl.

He Shiqian, Kong Liang, Chen Jing

2021-Jul-21

6mA, DNA, Docker, deep learning, web server

Pathology Pathology

Fragmentation patterns and personalized sequencing of cell-free DNA in urine and plasma of glioma patients.

In EMBO molecular medicine

Glioma-derived cell-free DNA (cfDNA) is challenging to detect using liquid biopsy because quantities in body fluids are low. We determined the glioma-derived DNA fraction in cerebrospinal fluid (CSF), plasma, and urine samples from patients using sequencing of personalized capture panels guided by analysis of matched tumor biopsies. By sequencing cfDNA across thousands of mutations, identified individually in each patient's tumor, we detected tumor-derived DNA in the majority of CSF (7/8), plasma (10/12), and urine samples (10/16), with a median tumor fraction of 6.4 × 10⁻³, 3.1 × 10⁻⁵, and 4.7 × 10⁻⁵, respectively. We identified a shift in the size distribution of tumor-derived cfDNA fragments in these body fluids. We further analyzed cfDNA fragment sizes using whole-genome sequencing, in urine samples from 35 glioma patients, 27 individuals with non-malignant brain disorders, and 26 healthy individuals. cfDNA in urine of glioma patients was significantly more fragmented compared to urine from patients with non-malignant brain disorders (P = 1.7 × 10⁻²) and healthy individuals (P = 5.2 × 10⁻⁹). Machine learning models integrating fragment length could differentiate urine samples from glioma patients (AUC = 0.80-0.91) suggesting possibilities for truly non-invasive cancer detection.
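A minimal sketch of one common fragmentomics feature consistent with the size-distribution shift described above: the fraction of short cfDNA fragments. The 20-150 bp cutoffs and the sample lengths are illustrative assumptions, not the paper's exact feature set:

```python
# Sketch (assumption, not the paper's pipeline): compute the proportion of
# short cfDNA fragments; more-fragmented samples score higher.
def short_fragment_fraction(lengths, lo=20, hi=150):
    """Fraction of fragments whose length (bp) falls in [lo, hi]."""
    in_range = sum(1 for n in lengths if lo <= n <= hi)
    return in_range / len(lengths)

example_lengths = [95, 110, 130, 145, 166, 180]  # made-up fragment sizes
print(short_fragment_fraction(example_lengths))
```

A feature like this, computed per sample, is the kind of input a fragment-length-aware classifier could integrate.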

Mouliere Florent, Smith Christopher G, Heider Katrin, Su Jing, van der Pol Ymke, Thompson Mareike, Morris James, Wan Jonathan C M, Chandrananda Dineika, Hadfield James, Grzelak Marta, Hudecova Irena, Couturier Dominique-Laurent, Cooper Wendy, Zhao Hui, Gale Davina, Eldridge Matthew, Watts Colin, Brindle Kevin, Rosenfeld Nitzan, Mair Richard

2021-Jul-22

cell-free DNA, circulating tumor DNA, fragmentomics, gliomas, liquid biopsy

General General

Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14.

In Proteins

Substantial progress in protein structure prediction has been made since CASP13 by utilizing deep learning and residue-residue distance prediction. Inspired by these advances, we improved our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (1) a new deep learning-based protein inter-residue distance predictor to improve template-free (ab initio) tertiary structure prediction, (2) an enhanced template-based tertiary structure prediction method, and (3) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor was ranked 7th out of 146 predictors in tertiary structure prediction and 3rd out of 136 predictors in inter-domain structure prediction. The results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, the template-free modeling performs better than the template-based modeling not only on hard targets but also on targets that have homologous templates. The performance of the template-free modeling largely depends on the accuracy of distance prediction, which is closely related to the quality of multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at https://github.com/jianlin-cheng/MULTICOM_Human_CASP14/tree/CASP14_DeepRank3 and https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.

Liu Jian, Wu Tianqi, Guo Zhiye, Hou Jie, Cheng Jianlin

2021-Jul-21

inter-residue distance prediction, protein quality assessment, protein structure prediction

Public Health Public Health

Pathways to performance in undergraduate medical students: role of conscientiousness and the perceived educational environment.

In Advances in health sciences education : theory and practice

This study examined conscientiousness and the perceived educational environment as independent and interactive predictors of medical students' performance within Biggs' theoretical model of learning. Conscientiousness, the perceived educational environment, and learning approaches were assessed at the beginning of the third year in 268 medical students at the University of Geneva, Switzerland. Performance was examined at the end of the third year via a computer-based assessment (CBA) and the Objective Structured Clinical Examination (OSCE). Path analysis was used to test the proposed model, whereby conscientiousness and the perceived educational environment predicted performance directly and indirectly via students' learning approaches. A second model included interaction effects. The proposed model provided the best fit and explained 45% of the variance in CBA performance, and 23% of the variance in OSCE performance. Conscientiousness positively predicted CBA performance directly (β = 0.19, p < 0.001) and indirectly via a deep learning approach (β = 0.05, p = 0.012). The perceived educational environment positively predicted CBA performance indirectly only (β = 0.02, p = 0.011). Neither conscientiousness nor the perceived educational environment predicted OSCE performance. Model 2 had an acceptable but weaker fit. In this model, there was a significant cross-over interaction effect (β = 0.16, p < 0.01): conscientiousness positively predicted OSCE performance when perceptions of the educational environment were the most positive, but negatively predicted performance when perceptions were the least positive. The findings suggest that both conscientiousness and perceptions of the educational environment predict CBA performance. Research should further examine interactions between personality traits and the medical school environment to inform strategies aimed at improving OSCE performance.

Schrempft S, Piumatti G, Gerbase M W, Baroffio A

2021-Jul-22

Academic performance, Conscientiousness, Learning approaches, Medical school, Objective structured clinical exam, Perceived educational environment, Undergraduate medical students

General General

Computational predictions for protein sequences of COVID-19 virus via machine learning algorithms.

In Medical & biological engineering & computing ; h5-index 32.0

The rapid spread of coronavirus disease (COVID-19) has become a worldwide pandemic, with more than 15 million patients reported in 27 countries. Therefore, the computational biology of this virus and how it relates to the human population urgently needs to be understood. In this paper, the classification of human protein sequences of COVID-19 according to country is presented based on machine learning algorithms. The proposed model distinguishes 9,238 sequences using three stages: data preprocessing, data labeling, and classification. In the first stage, data preprocessing converts the amino acids of COVID-19 protein sequences into eight groups of numbers based on the amino acids' volume and dipole, following the conjoint triad (CT) method. In the second stage, there are two methods for labeling the data from 27 countries, numbered 0 to 26. The first method assigns one number to each country according to its country code, while the second uses binary elements for each country. In the last stage, machine learning algorithms are used to discriminate the COVID-19 protein sequences according to their countries. The obtained results demonstrate 100% accuracy, 100% sensitivity, and 90% specificity via the country-based binary labeling method with a linear support vector machine (SVM) classifier. Furthermore, with its substantial amount of infection data, the USA is more prone to correct classification than countries with fewer data. The unbalanced data for COVID-19 protein sequences is considered a major issue, especially as the US data represents 76% of the total of 9,238 sequences. The proposed model will act as a prediction tool for COVID-19 protein sequences in different countries.
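The conjoint triad (CT) preprocessing the abstract mentions can be sketched as follows; the amino-acid grouping table here is an illustrative assumption rather than the authors' exact eight-group scheme:

```python
# Sketch of conjoint-triad (CT) style preprocessing: map each amino acid
# to a group number by dipole/volume, then count overlapping triads of
# group codes. The grouping below is illustrative, not the paper's table.
from collections import Counter

GROUPS = {
    "A": 1, "G": 1, "V": 1,
    "I": 2, "L": 2, "F": 2, "P": 2,
    "Y": 3, "M": 3, "T": 3, "S": 3,
    "H": 4, "N": 4, "Q": 4, "W": 4,
    "R": 5, "K": 5,
    "D": 6, "E": 6,
    "C": 7,
}

def ct_features(seq):
    """Counter of group-code triads over a protein sequence."""
    codes = [GROUPS[a] for a in seq if a in GROUPS]
    return Counter(tuple(codes[i:i + 3]) for i in range(len(codes) - 2))

print(ct_features("MKVLAG"))
```

The resulting triad counts form a fixed-length numeric vector suitable for an SVM or similar classifier.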

Afify Heba M, Zanaty Muhammad S

2021-Jul-22

COVID-19 protein sequences, Conjoint triad (CT), Machine learning algorithms, Support vector machine (SVM)

Radiology Radiology

Automated detection and segmentation of sclerotic spinal lesions on body CTs using a deep convolutional neural network.

In Skeletal radiology

PURPOSE : To develop a deep convolutional neural network capable of detecting spinal sclerotic metastases on body CTs.

MATERIALS AND METHODS : Our study was IRB-approved and HIPAA-compliant. Cases of confirmed sclerotic bone metastases in chest, abdomen, and pelvis CTs were identified. Images were manually segmented for 3 classes: background, normal bone, and sclerotic lesion(s). If multiple lesions were present on a slice, all lesions were segmented. A total of 600 images were obtained, with a 90/10 training/testing split. Images were stored as 128 × 128 pixel grayscale and the training dataset underwent a processing pipeline of histogram equalization and data augmentation. We trained our model from scratch on Keras/TensorFlow using an 80/20 training/validation split and a U-Net architecture (64 batch size, 100 epochs, dropout 0.25, initial learning rate 0.0001, sigmoid activation). We also tested our model's true negative and false positive rate with 1104 non-pathologic images. Global sensitivity measured model detection of any lesion on a single image, local sensitivity and positive predictive value (PPV) measured model detection of each lesion on a given image, and local specificity measured the false positive rate in non-pathologic bone.

RESULTS : Dice scores were 0.83 for lesion, 0.96 for non-pathologic bone, and 0.99 for background. Global sensitivity was 95% (57/60), local sensitivity was 92% (89/97), local PPV was 97% (89/92), and local specificity was 87% (958/1104).
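The Dice scores above compare predicted and ground-truth masks; a minimal sketch of the metric on flat binary masks (real pipelines operate on image arrays, and these inputs are made up):

```python
# Sketch (assumption): Dice coefficient for binary segmentation masks
# stored as flat 0/1 lists.
def dice(pred, truth):
    """Dice = 2*|P∩T| / (|P| + |T|); defined as 1.0 when both masks are empty."""
    overlap = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * overlap / total if total else 1.0

print(dice([1, 1, 0, 0], [1, 0, 0, 0]))  # 2*1 / (2 + 1)
```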

CONCLUSION : A deep convolutional neural network has the potential to assist in detecting sclerotic spinal metastases.

Chang Connie Y, Buckless Colleen, Yeh Kaitlyn J, Torriani Martin

2021-Jul-22

Artificial intelligence, Bone lesions, Deep convolutional neural network, Sclerotic

General General

Functional label-free assessment of fibroblast differentiation in 3D collagen-I-matrices using particle image velocimetry.

In Biomaterials science

Fibroblasts are a diverse population of connective tissue cells that are a key component in physiological wound healing. Myofibroblasts are differentiated fibroblasts occurring in various physiological and pathological conditions, like in the healing of wounds or in the tumour microenvironment. They exhibit important functions compared to fibroblasts in terms of proliferation, protein secretion, and contractility. The gold standard to distinguish myofibroblasts is alpha-smooth muscle actin (αSMA) expression and its incorporation in stress fibres, which is only revealed by gene expression analysis and immunostaining. Here, we introduce an approach to functionally determine the myofibroblast status of live fibroblasts directly in in vitro cell culture by analysing their ability to contract the extracellular matrix around them without the need for labelling. It is based on particle image velocimetry algorithms applied to dynamic deformations of the extracellular matrix network structure imaged by phase contrast microscopy. Advanced image analysis allows us to distinguish between various differentiation stages of fibroblasts including the dynamic change over several days. We further apply machine learning classification to automatically evaluate different cell culture conditions. With this new method, we provide a versatile tool to functionally evaluate the dynamic process of fibroblast differentiation. It can be applied for in vitro screening studies in biomimetic 3D cell cultures with options to extend it to other cell systems with contractile phenotypes.

Riedl Philipp, Pompe Tilo

2021-Jul-22

General General

Improving common bacterial blight phenotyping by using rub-inoculation and machine learning: cheaper, better, faster, stronger.

In Phytopathology

Accurate assessment of plant symptoms plays a key role for measuring the impact of pathogens during plant-pathogen interaction. Common bacterial blight caused by Xanthomonas phaseoli pv. phaseoli and Xanthomonas citri pv. fuscans (Xpp-Xcf) is a major threat to common bean. The pathogenicity of these bacteria is variable among strains, and depends mainly on a type III secretion system and associated type III effectors such as transcription activator-like effectors (TALEs). Because the impact of a single gene is often small and difficult to detect, a discriminating methodology is required to distinguish the slight phenotype changes induced during the progression of the disease. Here, we compared two different inoculation and symptom assessment methods for their ability to distinguish two tal mutants from their corresponding wild-type strains. Interestingly, rub-inoculation of the first leaves combined with symptom assessment by machine learning-based imaging allowed significant distinction between wild-type and mutant strains. By contrast, dip-inoculation of first trifoliate leaves combined with chlorophyll fluorescence imaging did not differentiate the strains. Furthermore, the new method developed here led to the miniaturization of pathogenicity tests and significant time savings.

Foucher Justine, Ruh Mylène, Briand Martial, Préveaux Anne, Barbazange Florian, Boureau Tristan, Jacques Marie-Agnès, Chen Nicolas

2021-Jul-21

Bacterial Pathogens, Techniques

Public Health Public Health

DEELIG: A Deep Learning Approach to Predict Protein-Ligand Binding Affinity.

In Bioinformatics and biology insights

Protein-ligand binding prediction has extensive biological significance. Binding affinity helps in understanding the degree of protein-ligand interactions and is a useful measure in drug design. Protein-ligand docking using virtual screening and molecular dynamic simulations are required to predict the binding affinity of a ligand to its cognate receptor. Performing such analyses to cover the entire chemical space of small molecules requires intense computational power. Recent developments using deep learning have enabled us to make sense of massive amounts of complex data sets where the ability of the model to "learn" intrinsic patterns in a complex plane of data is the strength of the approach. Here, we have incorporated convolutional neural networks to find spatial relationships among data to help us predict affinity of binding of proteins in whole superfamilies toward a diverse set of ligands without the need of a docked pose or complex as user input. The models were trained and validated using a stringent methodology for feature extraction. Our model performs better in comparison to some existing methods used widely and is suitable for predictions on high-resolution protein crystal structures (⩽2.5 Å) and nonpeptide ligands as individual inputs. Our approach to network construction and training on a protein-ligand data set prepared in-house has yielded significant insights. We have also tested DEELIG on a few COVID-19 main protease-inhibitor complexes relevant to the current public health scenario. DEELIG-based predictions can be incorporated in existing databases including RCSB PDB, PDBMoad, and PDBbind in filling missing binding affinity data for protein-ligand complexes.

Ahmed Asad, Mam Bhavika, Sowdhamini Ramanathan

2021

Binding affinity, PDB, convolutional neural networks, deep learning, drug discovery, protein-ligand binding, supervised learning

General General

A Review on Meat Quality Evaluation Methods Based on Non-Destructive Computer Vision and Artificial Intelligence Technologies.

In Food science of animal resources

Increasing meat demand in terms of both quality and quantity in conjunction with feeding a growing population has resulted in regulatory agencies imposing stringent guidelines on meat quality and safety. Objective and accurate rapid non-destructive detection methods and evaluation techniques based on artificial intelligence have become a research hotspot in recent years and have been widely applied in the meat industry. Therefore, this review surveyed the key technologies of non-destructive detection for meat quality, mainly including ultrasonic technology, machine (computer) vision technology, near-infrared spectroscopy technology, hyperspectral technology, Raman spectra technology, and electronic nose/tongue. The technical characteristics and evaluation methods were compared and analyzed; the practical applications of non-destructive detection technologies in meat quality assessment were explored; and the current challenges and future research directions were discussed. The literature presented in this review clearly demonstrates that previous research on non-destructive technologies is of great significance to ensure consumers' urgent demand for high-quality meat by promoting automatic, real-time inspection and quality control in meat production. In the near future, with ever-growing application requirements and research developments, the trend will be to integrate such systems to provide effective solutions for various meat quality evaluation applications.

Shi Yinyan, Wang Xiaochan, Borhan Md Saidul, Young Jennifer, Newman David, Berg Eric, Sun Xin

2021-Jul

grading assessment, industrial application, key technology, meat quality, non-destructive detection

General General

Label-free screening of brain tissue myelin content using phase imaging with computational specificity (PICS).

In APL photonics

Inadequate myelination in the central nervous system is associated with neurodevelopmental complications. Thus, quantitative, high spatial resolution measurements of myelin levels are highly desirable. We used spatial light interference microscopy (SLIM), a highly sensitive quantitative phase imaging (QPI) technique, to correlate the dry mass content of myelin in piglet brain tissue with dietary changes and gestational size. We combined SLIM micrographs with an artificial intelligence (AI) classifying model that allows us to discern subtle disparities in myelin distributions with high accuracy. This concept of combining QPI label-free data with AI for the purpose of extracting molecular specificity has recently been introduced by our laboratory as phase imaging with computational specificity. Training on 8000 SLIM images of piglet brain tissue with the 71-layer transfer learning model Xception, we created a two-parameter classification to differentiate gestational size and diet type with an accuracy of 82% and 80%, respectively. To our knowledge, this type of evaluation is impossible to perform by an expert pathologist or other techniques.

Fanous Michael, Shi Chuqiao, Caputo Megan P, Rund Laurie A, Johnson Rodney W, Das Tapas, Kuchan Matthew J, Sobh Nahil, Popescu Gabriel

2021-Jul-01

Public Health Public Health

A Differential Threshold of Breakfast, Caffeine and Food Groups May Be Impacting Mental Well-Being in Young Adults: The Mediation Effect of Exercise.

In Frontiers in nutrition

Diet and exercise are known to influence mental health. However, the interaction between diet, dietary practices, and exercise and its impact on the mood of young adults (YA) is poorly understood. YA are inherently at risk for mental distress. They tend to consume a low-quality diet and are generally active. The purpose of the study was to assess these relationships through validating causal loop diagrams (CLD) that describe these connections by using a system dynamics (SD) modeling methodology. Adults 18-29 years were invited to complete the Food-Mood questionnaire. The anonymous questionnaire link was distributed to several institutional listservs and via several social media platforms targeting young adults. A multi-level analysis, including machine learning techniques, was used to assess these relationships. The key findings were then built into gender-based CLDs, which suggest that a differential repertoire may be needed to optimize diet quality, exercise, and mental well-being. Additionally, a potential net threshold for dietary factors and exercise may be needed to achieve mental well-being in young adults. Moreover, our findings suggest that exercise may boost the enhancing effect of food groups on mental well-being and may lessen the negative impact of dietary impediments on mental well-being.

Begdache Lina, Kianmehr Hamed, Najjar Helen, Witt Dylan, Sabounchi Nasim S

2021

caffeine, dietary patterns, exercise, food groups, gender, mediation, mental health, young adults

General General

Early Prediction of Mortality, Severity, and Length of Stay in the Intensive Care Unit of Sepsis Patients Based on Sepsis 3.0 by Machine Learning Models.

In Frontiers in medicine

Background: Early prediction of the clinical outcome of patients with sepsis is of great significance and can guide treatment and reduce the mortality of patients. However, this remains difficult for clinicians. Methods: A total of 2,224 patients with sepsis were involved over a 3-year period (2016-2018) in the intensive care unit (ICU) of Peking Union Medical College Hospital. With all the key medical data from the first 6 h in the ICU, three machine learning models, logistic regression, random forest, and XGBoost, were used to predict mortality, severity (sepsis/septic shock), and length of ICU stay (LOS) (>6 days, ≤ 6 days). Missing data imputation and oversampling were completed on the dataset before introduction into the models. Results: Compared to the mortality and LOS predictions, the severity prediction achieved the best classification results, based on the area under the receiver operating characteristic curve (AUC), with the random forest classifier (sensitivity = 0.65, specificity = 0.73, F1 score = 0.72, AUC = 0.79). The random forest model also showed the best overall performance (mortality prediction: sensitivity = 0.50, specificity = 0.84, F1 score = 0.66, AUC = 0.74; LOS prediction: sensitivity = 0.79, specificity = 0.66, F1 score = 0.69, AUC = 0.76) among the three models. The predictive ability of the SOFA score itself was inferior to that of the above three models. Conclusions: Using the random forest classifier in the first 6 h of ICU admission can provide a comprehensive early warning of sepsis, which will contribute to the formulation and management of clinical decisions and the allocation and management of resources.
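The sensitivity, specificity, and F1 scores reported above all derive from confusion-matrix counts; a small sketch with made-up counts (not the study's data):

```python
# Sketch: relate sensitivity, specificity, and F1 to confusion-matrix
# counts. The counts below are illustrative, not from the study.
def metrics(tp, fp, tn, fn):
    sens = tp / (tp + fn)   # recall on positives
    spec = tn / (tn + fp)   # recall on negatives
    prec = tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    return sens, spec, f1

sens, spec, f1 = metrics(tp=50, fp=10, tn=30, fn=10)
print(round(sens, 3), round(spec, 3), round(f1, 3))
```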

Su Longxiang, Xu Zheng, Chang Fengxiang, Ma Yingying, Liu Shengjun, Jiang Huizhen, Wang Hao, Li Dongkai, Chen Huan, Zhou Xiang, Hong Na, Zhu Weiguo, Long Yun

2021

machine learning, outcome, prediction, sepsis, sequential (sepsis-related) organ failure assessment

General General

An Optimization Algorithm for Computer-Aided Diagnosis of Breast Cancer Based on Support Vector Machine.

In Frontiers in bioengineering and biotechnology

Breast cancer is among the cancers to which women are most vulnerable; its incidence rate in China is increasing at an annual rate of 3%, and onset is occurring at younger ages. Therefore, it is necessary to conduct research on the risk of breast cancer, including the causes of the disease and the prediction of breast cancer risk based on historical data. Statistical learning from data is an important branch of modern computational intelligence technology. Using machine learning methods to predict and judge unknown data provides a new approach to breast cancer diagnosis. In this paper, an improved optimization algorithm (GSP_SVM) is proposed by combining genetic algorithm, particle swarm optimization, and simulated annealing with the support vector machine algorithm. The results show that the classification accuracy, MCC, AUC, and other indicators reach a very high level. Comparison with other optimization algorithms shows that this method can provide effective support for decision-making in breast cancer auxiliary diagnosis, thus significantly improving the diagnostic efficiency of medical institutions. Finally, this paper also preliminarily explores the effect of applying this algorithm to detecting and classifying breast cancer at different stages, and discusses its application to multi-class classification by comparing it with other algorithms.
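The simulated-annealing component of a hybrid search like GSP_SVM can be sketched as below. The objective here is a toy stand-in for cross-validated SVM accuracy, since the paper's actual scoring function and parameter ranges are not given:

```python
# Sketch (assumption): simulated annealing over SVM-style hyperparameters
# (log C, log gamma). `toy_score` mimics a smooth CV-accuracy surface and
# is NOT the paper's objective.
import math, random

def toy_score(log_c, log_gamma):
    # peak at log C = 2, log gamma = -3
    return math.exp(-((log_c - 2) ** 2 + (log_gamma + 3) ** 2))

def anneal(score, start, steps=2000, temp0=1.0, seed=0):
    rng = random.Random(seed)
    best = cur = start
    for k in range(steps):
        temp = temp0 * (1 - k / steps) + 1e-9       # linear cooling
        cand = (cur[0] + rng.gauss(0, 0.5), cur[1] + rng.gauss(0, 0.5))
        delta = score(*cand) - score(*cur)
        # accept improvements always, worse moves with Boltzmann probability
        if delta > 0 or rng.random() < math.exp(delta / temp):
            cur = cand
            if score(*cur) > score(*best):
                best = cur
    return best

best = anneal(toy_score, start=(0.0, 0.0))
print(best)  # best-scoring (log C, log gamma) found
```

In a real GSP_SVM-style pipeline, the genetic and particle-swarm stages would propose candidate populations and a step like this would refine them against actual cross-validated accuracy.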

Dou Yifeng, Meng Wentao

2021

breast cancer, classification, computer-aided diagnosis, machine learning, optimization, support vector machine

Public Health Public Health

The Optimal Machine Learning-Based Missing Data Imputation for the Cox Proportional Hazard Model.

In Frontiers in public health

An adequate imputation of missing data would significantly preserve statistical power and avoid erroneous conclusions. In the era of big data, machine learning is a great tool to infer missing values. The root mean square error (RMSE) and the proportion of falsely classified entries (PFC) are two standard statistics to evaluate imputation accuracy. However, the use of various imputation strategies with the Cox proportional hazards model requires deliberate study, and their validity under different missing mechanisms is unknown. In this research, we propose supervised and unsupervised imputations and examine four machine learning-based imputation strategies. We conducted a simulation study under various scenarios with several parameters, such as sample size, missing rate, and different missing mechanisms. The results revealed the type-I errors of the different imputation techniques in the survival data. The simulation results show that the non-parametric "missForest", based on unsupervised imputation, is the only robust method without inflated type-I errors under all missing mechanisms. In contrast, other methods are not valid for testing when the missing pattern is informative. Statistical analysis that is improperly conducted with missing data may lead to erroneous conclusions. This research provides a clear guideline for a valid survival analysis using the Cox proportional hazards model with machine learning-based imputations.
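The two imputation-accuracy statistics named above can be computed directly; the data below are illustrative:

```python
# Sketch: RMSE for continuous variables and PFC (proportion of falsely
# classified entries) for categorical ones, comparing imputed values
# against the held-out truth.
import math

def rmse(imputed, truth):
    return math.sqrt(sum((i - t) ** 2 for i, t in zip(imputed, truth)) / len(truth))

def pfc(imputed, truth):
    return sum(i != t for i, t in zip(imputed, truth)) / len(truth)

print(rmse([1.0, 2.5], [1.0, 2.0]))           # sqrt(0.25 / 2)
print(pfc(["a", "b", "b"], ["a", "b", "c"]))  # 1/3
```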

Guo Chao-Yu, Yang Ying-Chen, Chen Yi-Hau

2021

cox proportional hazard model, k-nearest neighbors imputation, machine learning, random forest imputation, survival data simulation

Public Health Public Health

Exploring Feasibility of Multivariate Deep Learning Models in Predicting COVID-19 Epidemic.

In Frontiers in public health

Background: Mathematical models are powerful tools to study COVID-19. However, one fundamental challenge in current modeling approaches is the lack of accurate and comprehensive data. Complex epidemiological systems such as COVID-19 are especially challenging to the commonly used mechanistic model when our understanding of this pandemic rapidly evolves. Objective: We aim to develop a data-driven workflow to extract, process, and develop deep learning (DL) methods to model the COVID-19 epidemic. We provide an alternative modeling approach to complement the current mechanistic modeling paradigm. Method: We extensively searched, extracted, and annotated relevant datasets from over 60 official press releases in Hubei, China, in 2020. Multivariate long short-term memory (LSTM) models were developed with different architectures to track and predict multivariate COVID-19 time series for 1, 2, and 3 days ahead. As a comparison, univariate LSTMs were also developed to track new cases, total cases, and new deaths. Results: A comprehensive dataset with 10 variables was retrieved and processed for 125 days in Hubei. Multivariate LSTM had reasonably good predictability on new deaths, hospitalization of both severe and critical patients, total discharges, and total monitored in hospital. Multivariate LSTM showed better results for new and total cases, and new deaths for 1-day-ahead prediction than univariate counterparts, but not for 2-day and 3-day-ahead predictions. Moreover, more complex LSTM architectures did not appear to increase overall predictability in this study. Conclusion: This study demonstrates the feasibility of DL models to complement current mechanistic approaches when the exact epidemiological mechanisms are still under investigation.
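Before any LSTM can be trained, the multivariate daily series must be framed as supervised windows for k-day-ahead prediction. A minimal sketch of that framing step; the lookback and horizon values are assumptions, not the paper's settings:

```python
# Sketch (assumption): turn a multivariate daily series into supervised
# (window, target) pairs for k-day-ahead prediction.
def make_windows(series, lookback, horizon, target_idx):
    """series: list of [v1, v2, ...] rows, one per day.
    Returns (X, y): X[i] is `lookback` consecutive rows; y[i] is the value
    of variable `target_idx` `horizon` days after the window ends."""
    X, y = [], []
    for end in range(lookback, len(series) - horizon + 1):
        X.append(series[end - lookback:end])
        y.append(series[end + horizon - 1][target_idx])
    return X, y

days = [[i, 10 * i] for i in range(6)]  # 2 variables, 6 days (toy data)
X, y = make_windows(days, lookback=3, horizon=1, target_idx=1)
print(len(X), y)
```

Each X[i] would feed an LSTM as a (lookback × variables) input, with y[i] as the regression target.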

Chen Shi, Paul Rajib, Janies Daniel, Murphy Keith, Feng Tinghao, Thill Jean-Claude

2021

COVID-19, deep learning, epidemic, modeling, multivariate

Radiology Radiology

DeepCUBIT: Predicting Lymphovascular Invasion or Pathological Lymph Node Involvement of Clinical T1 Stage Non-Small Cell Lung Cancer on Chest CT Scan Using Deep Cubical Nodule Transfer Learning Algorithm.

In Frontiers in oncology

The prediction of lymphovascular invasion (LVI) or pathological nodal involvement of tumor cells is critical for successful treatment in early stage non-small cell lung cancer (NSCLC). We developed and validated a Deep Cubical Nodule Transfer Learning Algorithm (DeepCUBIT) using transfer learning and 3D Convolutional Neural Network (CNN) to predict LVI or pathological nodal involvement on chest CT images. A total of 695 preoperative CT images of resected NSCLC with tumor size of less than or equal to 3 cm from 2008 to 2015 were used to train and validate the DeepCUBIT model using the five-fold cross-validation method. We also used tumor size and consolidation to tumor ratio (C/T ratio) to build a support vector machine (SVM) classifier. Two-hundred and fifty-four out of 695 samples (36.5%) had LVI or nodal involvement. An integrated model (3D CNN + Tumor size + C/T ratio) showed sensitivity of 31.8%, specificity of 89.8%, accuracy of 76.4%, and AUC of 0.759 on the external validation cohort. Three single SVM models, using 3D CNN (DeepCUBIT), tumor size, or C/T ratio, showed AUCs of 0.717, 0.630 and 0.683, respectively, on the external validation cohort. DeepCUBIT was the best single model, compared to the models using only C/T ratio or tumor size. In addition, the DeepCUBIT model could significantly identify the prognosis of resected NSCLC patients even in stage I. DeepCUBIT using transfer learning and 3D CNN can accurately predict LVI or nodal involvement in cT1 size NSCLC on CT images. Thus, it can provide a more accurate selection of candidates who will benefit from limited surgery without increasing the risk of recurrence.

Beck Kyongmin Sarah, Gil Bomi, Na Sae Jung, Hong Ji Hyung, Chun Sang Hoon, An Ho Jung, Kim Jae Jun, Hong Soon Auck, Lee Bora, Shim Won Sang, Park Sungsoo, Ko Yoon Ho

2021

computed tomography, deep learning, lobectomy, non-small cell lung cancer, prognosis

Oncology Oncology

An immune-based risk-stratification system for predicting prognosis in pulmonary sarcomatoid carcinoma (PSC).

In Oncoimmunology

Pulmonary sarcomatoid carcinoma (PSC) is an uncommon subtype of lung cancer for which immune checkpoint blockade holds promise of clinical benefit. However, virtually nothing is known about the expression of common immune checkpoints in PSC. Here, we performed immunohistochemistry (IHC) to detect nine immune-related proteins in 97 PSC patients. Based on univariable Cox regression, random forests were used to establish risk models for OS and DFS. Moreover, we used GSEA, CIBERSORT, and ImmuCellAI to analyze the enriched pathways and microenvironment. Univariable analysis revealed that CD4 (P = 0.008), programmed cell death protein 1 (PD-1; P = 0.003), and galectin-9 (Gal-9) on tumor cells (TCs; P = 0.021) were independent predictors of DFS, while CD4 (P = 0.020), PD-1 (P = 0.004), Gal-9 (P = 0.033), and HLA on TILs (P = 0.031) were significant for OS. Meanwhile, the expression level of CD8 played a marginal role in DFS (P = 0.061), limited by the number of patients. The combination of Gal-9 on TCs with CD4 and PD-1 on TILs demonstrated the most accurate prediction for DFS (AUC: 0.636-0.791, F1-score: 0.635-0.799) and a dramatic improvement over TNM stage (P < 0.001 for F1-scores of 1-y, 3-y, and 5-y DFS). A similar finding was also observed in the predictive ability of CD4 for OS (AUC: 0.602-0.678, F1-score: 0.635-0.679). CD4 was negatively associated with the infiltration of neutrophils (P = 0.015). PDCD1 (the coding gene of PD-1) was positively correlated with the numbers of exhausted T cells (Texs; P = 0.020) and induced regulatory T cells (iTregs; P = 0.021), and LGALS9 (the coding gene of Gal-9) was positively related to the level of dendritic cells (DCs; P = 0.021). Furthermore, higher combined levels of CD4 and PDCD1 on TILs and LGALS9 on TCs were shown to be associated with greater infiltration of M1-type macrophages (P < 0.05). We confirmed the expression status of nine immune-related proteins and established a TNM-Immune system for OS and DFS in PSC to assist clinical risk-stratification.

Guo Haoyue, Li Binglei, Diao Li, Wang Hao, Chen Peixin, Jiang Minlin, Zhao Lishu, He Yayi, Zhou Caicun

2021

Pulmonary sarcomatoid carcinoma, immune checkpoint, immunohistochemistry, machine learning, prognosis

General General

Opportunistic diagnosis of osteoporosis, fragile bone strength and vertebral fractures from routine CT scans: a review of approved technology systems and pathways to implementation.

In Therapeutic advances in musculoskeletal disease

Osteoporosis causes bones to become weak, porous and fracture more easily. While a vertebral fracture is the archetypal fracture of osteoporosis, it is also the most difficult to diagnose clinically. Patients often suffer further spine or other fractures, deformity, height loss and pain before diagnosis. There were an estimated 520,000 fragility fractures in the United Kingdom (UK) in 2017 (costing £4.5 billion), a figure set to increase 30% by 2030. One way to improve both vertebral fracture identification and the diagnosis of osteoporosis is to assess a patient's spine or hips during routine computed tomography (CT) scans. Patients attend routine CT for diagnosis and monitoring of various medical conditions, but the skeleton can be overlooked as radiologists concentrate on the primary reason for scanning. More than half a million CT scans done each year in the National Health Service (NHS) could potentially be screened for osteoporosis (increasing 5% annually). If CT-based screening became embedded in practice, then the technique could have a positive clinical impact in the identification of fragility fracture and/or low bone density. Several companies have developed software methods to diagnose osteoporosis/fragile bone strength and/or identify vertebral fractures in CT datasets, using various methods that include image processing, computational modelling, artificial intelligence and biomechanical engineering concepts. Technology to evaluate Hounsfield units is used to calculate bone density, but not necessarily bone strength. In this rapid evidence review, we summarise the current literature underpinning approved technologies for opportunistic screening of routine CT images to identify fractures, bone density or strength information. We highlight how other new software technologies have become embedded in NHS clinical practice (having overcome barriers to implementation) and highlight how the novel osteoporosis technologies could follow suit. We define the key unanswered questions where further research is needed to enable the adoption of these technologies for maximal patient benefit.

Aggarwal Veena, Maslen Christina, Abel Richard L, Bhattacharya Pinaki, Bromiley Paul A, Clark Emma M, Compston Juliet E, Crabtree Nicola, Gregory Jennifer S, Kariki Eleni P, Harvey Nicholas C, Ward Kate A, Poole Kenneth E S

2021

Osteoporosis, QCT, artificial intelligence, computed tomography, epidemiology, fragility fracture, innovation, screening, technology, vertebral fracture

General General

A Multitask Approach to Learn Molecular Properties.

In Journal of chemical information and modeling

The endeavor to build a robust multitask model that resolves intertask correlations has lasted for many years. A multitask deep neural network, the most widely used multitask framework, nevertheless suffers from several issues, such as inconsistent performance improvement over independent model benchmarks. This research introduces an alternative framework based on problem transformation methods. We build our multitask models on the stacking of a base regressor and classifier, where multitarget predictions are realized in an additional training stage on an expanded molecular feature space. The architecture is implemented on the QM9, Alchemy, and Tox21 datasets using a variety of baseline machine learning techniques. The resulting multitask models show a 1-10% gain in forecasting precision, with task prediction accuracy consistently improved over the independent single-target models. The proposed method demonstrates notable superiority in tackling intertarget dependence and, moreover, great potential to simulate a wide range of molecular properties under the transformation framework.
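The stacking idea in this abstract can be sketched in a few lines: a first stage fits one independent model per target, then a second stage re-predicts each target on the feature space expanded with the other targets' stage-1 predictions. The toy 1-nearest-neighbour base learner and the data below are illustrative assumptions, not the paper's models.

```python
# Sketch of problem-transformation (stacked) multitask regression.
# A toy 1-NN regressor stands in for the paper's base learners.

def knn1_predict(train_X, train_y, x):
    """Predict with the single nearest training point (toy base regressor)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def stacked_multitask_fit_predict(X, Y, x_new):
    """Stage 1: one independent model per target.
    Stage 2: re-predict each target on the feature space expanded with the
    stage-1 predictions of the *other* targets (problem transformation)."""
    n_targets = len(Y[0])
    # Stage-1 predictions for every training row and for the query point
    stage1_train = [[knn1_predict(X, [r[t] for r in Y], x)
                     for t in range(n_targets)] for x in X]
    stage1_new = [knn1_predict(X, [r[t] for r in Y], x_new)
                  for t in range(n_targets)]
    preds = []
    for t in range(n_targets):
        # Expanded features = original features + other targets' predictions
        X_exp = [x + [p for j, p in enumerate(s1) if j != t]
                 for x, s1 in zip(X, stage1_train)]
        x_exp = x_new + [p for j, p in enumerate(stage1_new) if j != t]
        preds.append(knn1_predict(X_exp, [r[t] for r in Y], x_exp))
    return preds
```

The second stage sees correlated targets as extra inputs, which is how the transformation framework captures intertarget dependence without a shared network.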

Tan Zheng, Li Yan, Shi Weimei, Yang Shiqing

2021-Jul-21

Public Health Public Health

Metabolomic analyses reveal new stage-specific features of COVID-19.

In The European respiratory journal

The current pandemic of coronavirus disease 2019 (COVID-19) has affected more than 160 million individuals and caused millions of deaths worldwide, at least in part because the pathophysiology of this disease remains unclarified. Identifying the underlying molecular mechanisms of COVID-19 is therefore critical to overcoming the pandemic. Metabolites mirror an individual's disease progression and offer extensive insight into its pathophysiological significance. We provide a comprehensive metabolic characterization of sera from COVID-19 patients at all stages using untargeted and targeted metabolomic analysis. Compared with healthy controls, we observed distinct alteration patterns of circulating metabolites at the mild, severe, and recovery stages, in both the discovery and validation cohorts, suggesting that metabolic reprogramming of glucose metabolism and the urea cycle are potential pathological mechanisms of COVID-19 progression. Our findings suggest that targeting glucose metabolism and the urea cycle may be a viable approach to fighting COVID-19 at various stages along the disease course.

Jia Hongling, Liu Chaowu, Li Dantong, Huang Qingsheng, Liu Dong, Zhang Ying, Ye Chang, Zhou Di, Wang Yang, Tan Yanlian, Li Kuibiao, Lin Fangqin, Zhang Haiqing, Lin Jingchao, Xu Yang, Liu Jingwen, Zeng Qing, Hong Jian, Chen Guobing, Zhang Hao, Zheng Lingling, Deng Xilong, Ke Changwen, Gao Yunfei, Fan Jun, Di Biao, Liang Huiying

2021-Jul-21

General General

Using Machine Learning to Characterize Atrial Fibrotic Substrate From Intracardiac Signals With a Hybrid in silico and in vivo Dataset.

In Frontiers in physiology

In patients with atrial fibrillation, intracardiac electrogram signal amplitude is known to decrease with increased structural tissue remodeling, referred to as fibrosis. In addition to the isolation of the pulmonary veins, fibrotic sites are considered a suitable target for catheter ablation. However, it remains an open challenge to find fibrotic areas and to differentiate their density and transmurality. This study aims to identify the volume fraction and transmurality of fibrosis in the atrial substrate. Simulated cardiac electrograms, combined with a generalized model of clinical noise, reproduce clinically measured signals. Our hybrid dataset approach combines in silico and clinical electrograms to train a decision tree classifier to characterize the fibrotic atrial substrate. This approach captures different in vivo dynamics of the electrical propagation reflected on healthy electrogram morphology and synergistically combines it with synthetic fibrotic electrograms from in silico experiments. The machine learning algorithm was tested on five patients and compared against clinical voltage maps as a proof of concept, distinguishing non-fibrotic from fibrotic tissue and characterizing the patient's fibrotic tissue in terms of density and transmurality. The proposed approach can be used to overcome a single voltage cut-off value to identify fibrotic tissue and guide ablation targeting fibrotic areas.

Sánchez Jorge, Luongo Giorgio, Nothstein Mark, Unger Laura A, Saiz Javier, Trenor Beatriz, Luik Armin, Dössel Olaf, Loewe Axel

2021

atrial fibrillation, bidomain, cardiac modeling, density, fibrosis, machine learning, transmurality

General General

HADLN: Hybrid Attention-Based Deep Learning Network for Automated Arrhythmia Classification.

In Frontiers in physiology

In recent years, with the development of artificial intelligence, deep learning models have achieved initial success in ECG data analysis, especially the detection of atrial fibrillation. To solve the problems of ignored context correlations and gradient dispersion in traditional deep convolutional neural network models, a hybrid attention-based deep learning network (HADLN) is proposed for arrhythmia classification. HADLN exploits the advantages of the residual network (ResNet) and bidirectional long short-term memory (Bi-LSTM) architectures to obtain fused features containing both local and global information, and improves the interpretability of the model through an attention mechanism. The method is trained and verified on the PhysioNet 2017 challenge dataset. Without loss of generality, ECG signals are classified into four categories: atrial fibrillation, noise, other, and normal. By combining the fused features with the attention mechanism, the learned model gains a substantial improvement in classification performance and a degree of interpretability. Experimental results show that the proposed HADLN method achieves a precision of 0.866, recall of 0.859, accuracy of 0.867, and F1-score of 0.880 under 10-fold cross-validation.
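The attention weighting that HADLN relies on can be illustrated independently of the full ResNet/Bi-LSTM stack: softmax-normalized scores produce a weighted sum of per-timestep feature vectors. The function and values below are a generic sketch, not the paper's layer.

```python
# Generic attention pooling over a feature sequence (illustrative only).
import math

def attention_pool(features, scores):
    """features: list of per-timestep vectors; scores: one scalar per step.
    Returns the softmax(score)-weighted sum of the vectors."""
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)]
```

Because the weights are explicit, inspecting them shows which time steps dominate the pooled representation, which is the interpretability benefit the abstract mentions.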

Jiang Mingfeng, Gu Jiayan, Li Yang, Wei Bo, Zhang Jucheng, Wang Zhikang, Xia Ling

2021

ResNet, arrhythmia classification, attention mechanism, bidirectional LSTM, deep learning

General General

Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices.

In Frontiers in computational neuroscience

Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent accurate inference of natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers), by combining noise-aware training to combat inherent PCM drift and noise sources, together with reduced-precision digital attention-block computation down to INT6.

Spoon Katie, Tsai Hsinyu, Chen An, Rasch Malte J, Ambrogio Stefano, Mackin Charles, Fasoli Andrea, Friz Alexander M, Narayanan Pritish, Stanisavljevic Milos, Burr Geoffrey W

2021

BERT, DNN, PCM, RRAM, Transformer, analog accelerators, in-memory computing

Radiology Radiology

Thalamus Radiomics-Based Disease Identification and Prediction of Early Treatment Response for Schizophrenia.

In Frontiers in neuroscience ; h5-index 72.0

Background : Emerging evidence suggests structural and functional disruptions of the thalamus in schizophrenia, but whether thalamus abnormalities are able to be used for disease identification and prediction of early treatment response in schizophrenia remains to be determined. This study aims at developing and validating a method of disease identification and prediction of treatment response by multi-dimensional thalamic features derived from magnetic resonance imaging in schizophrenia patients using radiomics approaches.

Methods : A total of 390 subjects, including patients with schizophrenia and healthy controls, participated in this study, among which 109 out of 191 patients had clinical characteristics of early outcome (61 responders and 48 non-responders). Thalamus-based radiomics features were extracted and selected. The diagnostic and predictive capacity of multi-dimensional thalamic features was evaluated using radiomics approach.

Results : Using radiomics features, the classifier discriminated patients from healthy controls with an accuracy of 68%. The features were further confirmed in a random forest prediction of treatment response, with an accuracy of 75%.

Conclusion : Our study demonstrates a radiomics approach using multiple thalamic features to identify schizophrenia and predict early treatment response. Thalamus-based classification holds promise for application in schizophrenia identification and treatment selection.

Cui Long-Biao, Zhang Ya-Juan, Lu Hong-Liang, Liu Lin, Zhang Hai-Jun, Fu Yu-Fei, Wu Xu-Sha, Xu Yong-Qiang, Li Xiao-Sa, Qiao Yu-Ting, Qin Wei, Yin Hong, Cao Feng

2021

diagnosis, machine learning, radiomics, schizophrenia, thalamus, treatment

Public Health Public Health

DEELIG: A Deep Learning Approach to Predict Protein-Ligand Binding Affinity.

In Bioinformatics and biology insights

Protein-ligand binding prediction has extensive biological significance. Binding affinity helps in understanding the degree of protein-ligand interactions and is a useful measure in drug design. Protein-ligand docking using virtual screening and molecular dynamics simulations is required to predict the binding affinity of a ligand to its cognate receptor. Performing such analyses to cover the entire chemical space of small molecules requires intense computational power. Recent developments in deep learning have enabled us to make sense of massive amounts of complex data, where the ability of the model to "learn" intrinsic patterns in a complex plane of data is the strength of the approach. Here, we have incorporated convolutional neural networks to find spatial relationships among data, helping us predict the binding affinity of proteins across whole superfamilies toward a diverse set of ligands without the need for a docked pose or complex as user input. The models were trained and validated using a stringent methodology for feature extraction. Our model outperforms some widely used existing methods and is suitable for predictions on high-resolution protein crystal structures (⩽2.5 Å) and nonpeptide ligands as individual inputs. Our approach to network construction and training on a protein-ligand dataset prepared in-house has yielded significant insights. We have also tested DEELIG on a few COVID-19 main protease-inhibitor complexes relevant to the current public health scenario. DEELIG-based predictions can be incorporated into existing databases, including RCSB PDB, Binding MOAD, and PDBbind, to fill in missing binding affinity data for protein-ligand complexes.

Ahmed Asad, Mam Bhavika, Sowdhamini Ramanathan

2021

Binding affinity, PDB, convolutional neural networks, deep learning, drug discovery, protein-ligand binding, supervised learning

Radiology Radiology

Automated cortical thickness measurement of the mandibular condyle head on CBCT images using a deep learning method.

In Scientific reports ; h5-index 158.0

This study proposes a deep learning model for cortical bone segmentation in the mandibular condyle head using cone-beam computed tomography (CBCT) and an automated method for measuring cortical thickness with a color display based on the segmentation results. In total, 12,800 CBCT images from 25 normal subjects, manually labeled by an oral radiologist, served as the gold-standard. The segmentation model combined a modified U-Net and a convolutional neural network for target region classification. Model performance was evaluated using intersection over union (IoU) and the Hausdorff distance in comparison with the gold standard. The second automated model measured the cortical thickness based on a three-dimensional (3D) model rendered from the segmentation results and presented a color visualization of the measurements. The IoU and Hausdorff distance showed high accuracy (0.870 and 0.928 for marrow bone and 0.734 and 1.247 for cortical bone, respectively). A visual comparison of the 3D color maps showed a similar trend to the gold standard. This algorithm for automatic segmentation of the mandibular condyle head and visualization of the measured cortical thickness as a 3D-rendered model with a color map may contribute to the automated quantification of bone thickness changes of the temporomandibular joint complex on CBCT.
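The two segmentation metrics reported in this abstract are straightforward to compute on binary masks represented as sets of pixel coordinates; the re-implementation below is an illustrative sketch, not the study's code.

```python
# Illustrative IoU and Hausdorff distance on toy pixel-coordinate sets.
import math

def iou(pred, gold):
    """Intersection over union of two pixel sets (1.0 = perfect overlap)."""
    return len(pred & gold) / len(pred | gold)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets, in pixels:
    the worst-case distance from a point in one set to the other set."""
    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))
```

IoU rewards bulk overlap while the Hausdorff distance penalizes the single worst boundary error, which is why segmentation studies such as this one report both.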

Kim Young Hyun, Shin Jin Young, Lee Ari, Park Seungtae, Han Sang-Sun, Hwang Hyung Ju

2021-Jul-21

Surgery Surgery

Machine learning-based preoperative datamining can predict the therapeutic outcome of sleep surgery in OSA subjects.

In Scientific reports ; h5-index 158.0

Increasing recognition of anatomical obstruction has led to a large variety of sleep surgeries to improve the anatomic collapse underlying obstructive sleep apnea (OSA), so predicting whether sleep surgery will have a successful outcome is very important. The aim of this study was to assess a machine learning-based clinical model that predicts the success rate of sleep surgery in OSA subjects. The success rate predicted by machine learning and the subjective surgical outcome predicted by physicians were compared with the actual success rate in a male-dominated cohort of 163 OSA subjects. Machine learning models based on sleep parameters and endoscopic findings of the upper airway predicted surgical success more accurately than the sleep surgeons' subjective estimates. Among logistic regression and three machine learning models, the gradient boosting model performed best at predicting surgical success, as evaluated by pre- and post-operative polysomnography or home sleep apnea testing; its accuracy (0.708) was significantly higher than that of the logistic regression model (0.542). Our data demonstrate that data mining-driven prediction such as gradient boosting achieves higher accuracy for surgical outcomes, so machine learning models can give OSA subjects accurate information on surgical outcomes before surgery.

Kim Jin Youp, Kong Hyoun-Joong, Kim Su Hwan, Lee Sangjun, Kang Seung Heon, Han Seung Cheol, Kim Do Won, Ji Jeong-Yeon, Kim Hyun Jik

2021-Jul-21

General General

Retrospective study of glycemic variability, BMI, and blood pressure in diabetes patients in the Digital Twin Precision Treatment Program.

In Scientific reports ; h5-index 158.0

The objective of this retrospective observational cohort study was to measure glycemic variability and reductions in body mass index (BMI), blood pressure (BP), and use of antihypertensive medications in type 2 diabetes (T2D) patients participating in the digital twin-enabled Twin Precision Treatment (TPT) Program. Study participants included 19 females and 45 males with T2D who chose to participate in the TPT Program and adhered to program protocols. Nine additional enrollees were excluded due to major program non-adherence. Enrollees were required to have adequate hepatic and renal function, no myocardial infarction, stroke, or angina ≤ 90 days before enrollment, and no history of ketoacidosis or major psychiatric disorders. The TPT program uses Digital Twin technology, machine learning algorithms, and precision nutrition to aid treatment of patients with T2D. Each study participant had ≥ 3 months of follow-up. Outcome measures included glucose percentage coefficient of variation (%CV), low blood glucose index (LBGI), high blood glucose index (HBGI), systolic and diastolic BP, number of antihypertensive medications, and BMI. Sixty-four patients participated in the program. Mean (± standard deviation) %CV, LBGI, and HBGI values were low (17.34 ± 4.35, 1.37 ± 1.37, and 2.13 ± 2.79, respectively) throughout the 90-day program. BMI decreased from 29.23 ± 5.83 at baseline to 27.43 ± 5.25 kg/m2. Systolic BP fell from 134.72 ± 17.73 to 124.58 ± 11.62 mm Hg. Diastolic BP decreased from 83.95 ± 10.20 to 80.33 ± 7.04 mm Hg. The percent of patients taking antihypertensive medications decreased from 35.9% at baseline to 4.7% at 90 days. During 90 days of the TPT Program, patients achieved low glycemic variability and significant reductions in BMI and BP. Antihypertensive medication use was eliminated in nearly all patients. Future research will focus on randomized case-control comparisons.
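Of the outcomes reported in this study, the glucose percentage coefficient of variation has a simple definition, %CV = (SD / mean) × 100; the function and sample glucose values below are illustrative only.

```python
# Illustrative glycemic-variability metric: glucose %CV.
from statistics import mean, stdev

def percent_cv(glucose_mg_dl):
    """%CV = (sample SD / mean) * 100. Lower values indicate more
    stable glucose; the cohort above averaged about 17%."""
    return stdev(glucose_mg_dl) / mean(glucose_mg_dl) * 100
```

For example, readings of 100, 110, 90, 105 and 95 mg/dL give a %CV just under 8, well within the low-variability range reported for the cohort.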

Shamanna Paramesh, Dharmalingam Mala, Sahay Rakesh, Mohammed Jahangir, Mohamed Maluk, Poon Terrence, Kleinman Nathan, Thajudeen Mohamed

2021-Jul-21

General General

A multi-hazard map-based flooding, gully erosion, forest fires, and earthquakes in Iran.

In Scientific reports ; h5-index 158.0

We used three state-of-the-art machine learning techniques (boosted regression tree, random forest, and support vector machine) to produce a multi-hazard (MHR) map illustrating areas susceptible to flooding, gully erosion, forest fires, and earthquakes in Kohgiluyeh and Boyer-Ahmad Province, Iran. The earthquake hazard map was derived from a probabilistic seismic hazard analysis. The mean decrease Gini (MDG) method was implemented to determine the relative importance of effective factors on the spatial occurrence of each of the four hazards. Area under the curve (AUC) plots, based on a validation dataset, were created for the maps generated using the three algorithms to compare the results. The random forest model had the highest predictive accuracy, with AUC values of 0.994, 0.982, and 0.885 for gully erosion, flooding, and forest fires, respectively. Approximately 41%, 40%, 28%, and 3% of the study area are at risk of forest fires, earthquakes, floods, and gully erosion, respectively.
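The AUC values used above to compare models can be computed directly from susceptibility scores and hazard labels via the Mann-Whitney formulation: AUC is the probability that a randomly chosen hazard location receives a higher score than a randomly chosen non-hazard location. The sketch below uses toy data, not the study's.

```python
# Illustrative ROC AUC via the Mann-Whitney (rank) formulation.

def auc(scores, labels):
    """scores: model susceptibility scores; labels: 1 = hazard, 0 = not.
    Counts the fraction of positive/negative pairs ranked correctly
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.994, as reported for gully erosion, means the model ranks virtually every hazard location above every non-hazard one in the validation set.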

Pouyan Soheila, Pourghasemi Hamid Reza, Bordbar Mojgan, Rahmanian Soroor, Clague John J

2021-Jul-21

General General

Integrating ensemble systems biology feature selection and bimodal deep neural network for breast cancer prognosis prediction.

In Scientific reports ; h5-index 158.0

Breast cancer is a heterogeneous disease. To guide proper treatment decisions for each patient, robust prognostic biomarkers, which allow reliable prognosis prediction, are necessary. Gene feature selection based on microarray data is an approach to discover potential biomarkers systematically. However, standard pure-statistical feature selection approaches often fail to incorporate prior biological knowledge and select genes that lack biological insights. Besides, due to the high dimensionality and low sample size properties of microarray data, selecting robust gene features is an intrinsically challenging problem. We hence combined systems biology feature selection with ensemble learning in this study, aiming to select genes with biological insights and robust prognostic predictive power. Moreover, to capture breast cancer's complex molecular processes, we adopted a multi-gene approach to predict the prognosis status using deep learning classifiers. We found that all ensemble approaches could improve feature selection robustness, wherein the hybrid ensemble approach led to the most robust result. Among all prognosis prediction models, the bimodal deep neural network (DNN) achieved the highest test performance, further verified by survival analysis. In summary, this study demonstrated the potential of combining ensemble learning and bimodal DNN in guiding precision medicine.
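The ensemble feature-selection idea above, keeping genes that are repeatedly selected across resampled runs, can be sketched with a simple frequency threshold; the helper and the 50% cutoff below are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative ensemble feature selection by selection frequency.
from collections import Counter

def ensemble_select(runs, threshold=0.5):
    """runs: one set of selected features per resampled run.
    Keep features chosen in at least `threshold` fraction of runs,
    which favors selections that are robust to sampling noise."""
    counts = Counter(f for run in runs for f in set(run))
    return {f for f, c in counts.items() if c / len(runs) >= threshold}
```

Features that survive such a vote are, by construction, less sensitive to the high-dimension/low-sample instability the abstract describes.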

Cheng Li-Hsin, Hsu Te-Cheng, Lin Che

2021-Jul-21

General General

Pendulum test in chronic hemiplegic stroke population: additional ambulatory information beyond spasticity.

In Scientific reports ; h5-index 158.0

Spasticity measured by manual tests, such as the modified Ashworth scale (MAS), may not sufficiently reflect mobility function in stroke survivors. This study aims to identify additional ambulatory information provided by the pendulum test. Clinical assessments including Brünnstrom recovery stage, manual muscle test, MAS, Tinetti test (TT), Timed up and go test, 10-m walk test (10-MWT), and Barthel index were applied to 40 ambulant chronic stroke patients. The pendular parameters, first swing excursion (FSE) and relaxation index (RI), were extracted by an electrogoniometer. The correlations among these variables were analyzed by Spearman and Pearson partial correlation tests. After controlling for motor recovery (Brünnstrom recovery stage), the MAS of the paretic knee extensor was negatively correlated with the gait score of the TT (r =  - 0.355, p = 0.027), while the FSE was positively correlated with the balance score of the TT (r = 0.378, p = 0.018). RI was associated with the comfortable speed on the 10-MWT (r = 0.367, p = 0.022). These results suggest that a decrease in knee extensor spasticity is linked to better gait and balance in chronic stroke patients. The pendular parameters provide additional ambulatory information complementary to the MAS. The pendulum test is a potential tool for patient selection and outcome assessment after spasticity treatment in the chronic stroke population.

Huang Yin-Kai Dean, Li Wei, Chou Yi-Lin, Hung Erica Shih-Wei, Kang Jiunn-Horng

2021-Jul-20

Oncology Oncology

The impact of site-specific digital histology signatures on deep learning model accuracy and bias.

In Nature communications ; h5-index 260.0

The Cancer Genome Atlas (TCGA) is one of the largest biorepositories of digital histology. Deep learning (DL) models have been trained on TCGA to predict numerous features directly from histology, including survival, gene expression patterns, and driver mutations. However, we demonstrate that these features vary substantially across tissue submitting sites in TCGA for over 3,000 patients with six cancer subtypes. Additionally, we show that histologic image differences between submitting sites can easily be identified with DL. Site detection remains possible despite commonly used color normalization and augmentation methods, and we quantify the image characteristics constituting this site-specific digital histology signature. We demonstrate that these site-specific signatures lead to biased accuracy for prediction of features including survival, genomic mutations, and tumor stage. Furthermore, ethnicity can also be inferred from site-specific signatures, which must be accounted for to ensure equitable application of DL. These site-specific signatures can lead to overoptimistic estimates of model performance, and we propose a quadratic programming method that abrogates this bias by ensuring models are not trained and validated on samples from the same site.
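The core constraint behind the bias correction described above is that no tissue-submitting site may contribute samples to both the training and validation data. The paper enforces this (while also balancing outcome labels) with quadratic programming; the greedy sketch below conveys only the site-separation constraint and uses hypothetical sample/site identifiers.

```python
# Illustrative site-preserving fold assignment (greedy stand-in for the
# paper's quadratic-programming approach).

def site_preserving_split(samples, n_folds=3):
    """samples: list of (sample_id, site_id) pairs. Assigns whole sites to
    the currently smallest fold, so no site ever spans train and validation."""
    by_site = {}
    for sample_id, site_id in samples:
        by_site.setdefault(site_id, []).append(sample_id)
    folds = [[] for _ in range(n_folds)]
    # Place largest sites first to keep fold sizes roughly balanced.
    for site_id in sorted(by_site, key=lambda s: -len(by_site[s])):
        min(folds, key=len).extend(by_site[site_id])
    return folds
```

With sites kept intact per fold, a model can no longer inflate its validation score by recognizing a site's staining signature rather than the biology.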

Howard Frederick M, Dolezal James, Kochanny Sara, Schulte Jefree, Chen Heather, Heij Lara, Huo Dezheng, Nanda Rita, Olopade Olufunmilayo I, Kather Jakob N, Cipriani Nicole, Grossman Robert L, Pearson Alexander T

2021-07-20

General General

A generalizable and accessible approach to machine learning with global satellite imagery.

In Nature communications ; h5-index 260.0

Combining satellite imagery with machine learning (SIML) has the potential to address global challenges by remotely estimating socioeconomic and environmental conditions in data-poor regions, yet the resource requirements of SIML limit its accessibility and use. We show that a single encoding of satellite imagery can generalize across diverse prediction tasks (e.g., forest cover, house price, road length). Our method achieves accuracy competitive with deep neural networks at orders of magnitude lower computational cost, scales globally, delivers label super-resolution predictions, and facilitates characterizations of uncertainty. Since image encodings are shared across tasks, they can be centrally computed and distributed to unlimited researchers, who need only fit a linear regression to their own ground truth data in order to achieve state-of-the-art SIML performance.
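The end-user step described above, fitting a linear model to shared encodings and local ground truth, reduces to a least-squares fit. The paper uses ridge regression over many encoding features; the single-feature ordinary-least-squares function below is a toy sketch with made-up values.

```python
# Toy single-feature OLS fit, standing in for the regression a user would
# run on distributed satellite-image encodings plus local labels.
from statistics import mean

def fit_linear(encodings, labels):
    """Return (slope, intercept) minimising squared error for one feature."""
    mx, my = mean(encodings), mean(labels)
    slope = (sum((x - mx) * (y - my) for x, y in zip(encodings, labels))
             / sum((x - mx) ** 2 for x in encodings))
    return slope, my - slope * mx
```

Because the expensive encoding step is computed once and shared, this cheap final regression is all each researcher needs to run locally.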

Rolf Esther, Proctor Jonathan, Carleton Tamma, Bolliger Ian, Shankar Vaishaal, Ishihara Miyabi, Recht Benjamin, Hsiang Solomon

2021-07-20

General General

Effects of eight neuropsychiatric copy number variants on human brain structure.

In Translational psychiatry ; h5-index 60.0

Many copy number variants (CNVs) confer risk for the same range of neurodevelopmental symptoms and psychiatric conditions including autism and schizophrenia. Yet, to date neuroimaging studies have typically been carried out one mutation at a time, showing that CNVs have large effects on brain anatomy. Here, we aimed to characterize and quantify the distinct brain morphometry effects and latent dimensions across 8 neuropsychiatric CNVs. We analyzed T1-weighted MRI data from clinically and non-clinically ascertained CNV carriers (deletion/duplication) at the 1q21.1 (n = 39/28), 16p11.2 (n = 87/78), 22q11.2 (n = 75/30), and 15q11.2 (n = 72/76) loci as well as 1296 non-carriers (controls). Case-control contrasts of all examined genomic loci demonstrated effects on brain anatomy, with deletions and duplications showing mirror effects at the global and regional levels. Although CNVs mainly showed distinct brain patterns, principal component analysis (PCA) loaded subsets of CNVs on two latent brain dimensions, which explained 32 and 29% of the variance of the 8 Cohen's d maps. The cingulate gyrus, insula, supplementary motor cortex, and cerebellum were identified by PCA and multi-view pattern learning as top regions contributing to latent dimension shared across subsets of CNVs. The large proportion of distinct CNV effects on brain morphology may explain the small neuroimaging effect sizes reported in polygenic psychiatric conditions. Nevertheless, latent gene brain morphology dimensions will help subgroup the rapidly expanding landscape of neuropsychiatric variants and dissect the heterogeneity of idiopathic conditions.

Modenato Claudia, Kumar Kuldeep, Moreau Clara, Martin-Brevet Sandra, Huguet Guillaume, Schramm Catherine, Jean-Louis Martineau, Martin Charles-Olivier, Younis Nadine, Tamer Petra, Douard Elise, Thébault-Dagher Fanny, Côté Valérie, Charlebois Audrey-Rose, Deguire Florence, Maillard Anne M, Rodriguez-Herreros Borja, Pain Aurèlie, Richetin Sonia, Melie-Garcia Lester, Kushan Leila, Silva Ana I, van den Bree Marianne B M, Linden David E J, Owen Michael J, Hall Jeremy, Lippé Sarah, Chakravarty Mallar, Bzdok Danilo, Bearden Carrie E, Draganski Bogdan, Jacquemont Sébastien

2021-Jul-20

General General

Binocular vision supports the development of scene segmentation capabilities: Evidence from a deep learning model.

In Journal of vision ; h5-index 47.0

The application of deep learning techniques has led to substantial progress in solving a number of critical problems in machine vision, including fundamental problems of scene segmentation and depth estimation. Here, we report a novel deep neural network model, capable of simultaneous scene segmentation and depth estimation from a pair of binocular images. By manipulating the arrangement of binocular image pairs, presenting the model with standard left-right image pairs, identical image pairs or swapped left-right images, we show that performance levels depend on the presence of appropriate binocular image arrangements. Segmentation and depth estimation performance are both impaired when images are swapped. Segmentation performance levels are maintained, however, for identical image pairs, despite the absence of binocular disparity information. Critically, these performance levels exceed those found for an equivalent, monocularly trained, segmentation model. These results provide evidence that binocular image differences support both the direct recovery of depth and segmentation information, and the enhanced learning of monocular segmentation signals. This finding suggests that binocular vision may play an important role in visual development. Better understanding of this role may hold implications for the study and treatment of developmentally acquired perceptual impairments.

Goutcher Ross, Barrington Christian, Hibbard Paul B, Graham Bruce

2021-Jul-06

General General

Dual-Sampling Attention Pooling for Graph Neural Networks on 3D Mesh.

In Computer methods and programs in biomedicine

Mesh is an essential and effective data representation of a 3D shape, and 3D mesh segmentation is a fundamental task in computer vision and graphics. It has recently been addressed with multi-scale deep learning frameworks, in which the sampling method is of key significance. Previous sampling methods rarely consider the receptive-field contour of a vertex, leading to a loss of scale consistency in vertex features. Uniform sampling, meanwhile, ensures the utmost uniformity of the vertex distribution of the sampled mesh. Consequently, to efficiently improve the scale consistency of vertex features, this study first used uniform sampling to construct a multi-scale mesh hierarchy. To address the drawback of uniform sampling, namely its smoothing effect, vertex clustering sampling was also used, because it preserves the geometric structure, especially edge information. Combining the merits of these two sampling methods yields more complete information about the 3D shape. Moreover, we adopted an attention mechanism to better realize cross-scale shape feature transfer: shape features are transferred between scales through the construction of a novel graph structure. On this basis, we propose dual-sampling attention pooling for graph neural networks on 3D meshes. Experiments on three datasets show that the proposed method is highly competitive.

Wen Tingxi, Zhuang Jiafu, Du Yu, Yang Linjie, Xu Jianfei

2021-Jun-30

3D shape semantic segmentation, Graph convolutional networks, attention mechanism, deep learning

General General

Automated Data Quality Control in FDOPA brain PET Imaging using Deep Learning.

In Computer methods and programs in biomedicine

INTRODUCTION : With biomedical imaging research increasingly using large datasets, it becomes critical to find operator-free methods to quality control the data collected and the associated analysis. Attempts to use artificial intelligence (AI) to perform automated quality control (QC) for both single-site and multi-site datasets have been explored in some neuroimaging techniques (e.g. EEG or MRI), although these methods struggle to find replication in other domains. The aim of this study is to test the feasibility of an automated QC pipeline for brain [18F]-FDOPA PET imaging as a biomarker for the dopamine system.

METHODS : Two different Convolutional Neural Networks (CNNs) were used and combined to assess spatial misalignment to a standard template and the signal-to-noise ratio (SNR) of 200 static [18F]-FDOPA PET images, acquired from three different PET/CT scanners, that had been manually quality controlled. These scans were combined with an additional 400 scans in which misalignment (200 scans) and low SNR (200 scans) were simulated. A cross-validation was performed, where 80% of the data were used for training and 20% for validation. Two additional datasets of [18F]-FDOPA PET images (50 and 100 scans, respectively, with at least 80% good-quality images) were used for out-of-sample validation.

RESULTS : The CNN performance was excellent in the training dataset (accuracy for motion: 0.86 ± 0.01, accuracy for SNR: 0.69 ± 0.01), leading to 100% accurate QC classification when applied to the two out-of-sample datasets. Reducing the data dimensionality from 3D to 1D affected the generalizability of the CNNs, especially when the classifiers were applied to the out-of-sample data.

CONCLUSIONS : This feasibility study shows that it is possible to perform automatic QC of [18F]-FDOPA PET imaging with CNNs. The approach has the potential to be extended to other PET tracers in both brain and non-brain applications, but it is dependent on the availability of large datasets necessary for the algorithm training.

Pontoriero Antonella D, Nordio Giovanna, Easmin Rubaida, Giacomel Alessio, Santangelo Barbara, Jahuar Sameer, Bonoldi Ilaria, Rogdaki Maria, Turkheimer Federico, Howes Oliver, Veronese Mattia

2021-Jun-22

FDOPA, PET, QC, convolutional neural networks, quality control

General General

Interpretable prioritization of splice variants in diagnostic next-generation sequencing.

In American journal of human genetics

A critical challenge in genetic diagnostics is the computational assessment of candidate splice variants, specifically the interpretation of nucleotide changes located outside of the highly conserved dinucleotide sequences at the 5' and 3' ends of introns. To address this gap, we developed the Super Quick Information-content Random-forest Learning of Splice variants (SQUIRLS) algorithm. SQUIRLS generates a small set of interpretable features for machine learning by calculating the information-content of wild-type and variant sequences of canonical and cryptic splice sites, assessing changes in candidate splicing regulatory sequences, and incorporating characteristics of the sequence such as exon length, disruptions of the AG exclusion zone, and conservation. We curated a comprehensive collection of disease-associated splice-altering variants at positions outside of the highly conserved AG/GT dinucleotides at the termini of introns. SQUIRLS trains two random-forest classifiers for the donor and for the acceptor and combines their outputs by logistic regression to yield a final score. We show that SQUIRLS transcends previous state-of-the-art accuracy in classifying splice variants as assessed by rank analysis in simulated exomes, and is significantly faster than competing methods. SQUIRLS provides tabular output files for incorporation into diagnostic pipelines for exome and genome analysis, as well as visualizations that contextualize predicted effects of variants on splicing to make it easier to interpret splice variants in diagnostic settings.
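The two-stage design described above (one random forest per splice-site class, with the forest scores combined by logistic regression into a final score) can be sketched as follows. This is a minimal illustration on synthetic data, not SQUIRLS' actual features or code; for real stacking, the combiner should be fit on out-of-fold scores to avoid optimistic bias.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for donor-site and acceptor-site feature matrices
# (SQUIRLS' real features are information-content and conservation based).
n = 400
X_donor = rng.normal(size=(n, 4))
X_accept = rng.normal(size=(n, 4))
# Synthetic label depending on both feature groups.
y = ((X_donor[:, 0] + X_accept[:, 0]) > 0).astype(int)

# Stage 1: one random forest per splice-site class.
rf_donor = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_donor, y)
rf_accept = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_accept, y)

# Stage 2: logistic regression over the two forest scores yields the final score.
# (Fit on in-sample scores here purely for brevity.)
scores = np.column_stack([
    rf_donor.predict_proba(X_donor)[:, 1],
    rf_accept.predict_proba(X_accept)[:, 1],
])
combiner = LogisticRegression().fit(scores, y)
final = combiner.predict_proba(scores)[:, 1]  # one score per variant
```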

Danis Daniel, Jacobsen Julius O B, Carmody Leigh C, Gargano Michael A, McMurry Julie A, Hegde Ayushi, Haendel Melissa A, Valentini Giorgio, Smedley Damian, Robinson Peter N

2021-Jul-14

Mendelian genetics, bioinformatics, cryptic splicing, exome sequencing, genome sequencing, machine learning, random forest, sequence logo, splice mutation, splice variant, splicing

General General

Transcriptome analysis reveals key genes modulated by ALK5 inhibition in a bleomycin model of systemic sclerosis.

In Rheumatology (Oxford, England)

OBJECTIVE : Systemic sclerosis (SSc) is a rheumatic autoimmune disease affecting roughly 20 000 people worldwide and characterized by excessive collagen accumulation in the skin and internal organs. Despite the high morbidity and mortality associated with SSc, there are no approved disease-modifying agents. Our objective in this study was to explore transcriptomic and model-based drug discovery approaches for systemic sclerosis.

METHODS : In this study, we explored the molecular basis for SSc pathogenesis in a well-studied mouse model of scleroderma. We profiled the skin and lung transcriptomes of mice at multiple timepoints, analyzing the differential gene expression that underscores the development and resolution of bleomycin-induced fibrosis.

RESULTS : We observed shared expression signatures of upregulation and downregulation in fibrotic skin and lung tissue, with significant upregulation of key pro-fibrotic genes including GDF15, Saa3, Cxcl10, Spp1, and Timp1. To identify changes in gene expression in response to anti-fibrotic therapy, we assessed the effect of TGF-β pathway inhibition via the oral ALK5 (TGF-β receptor I) inhibitor SB525334 and observed a time-lagged response in the lung relative to the skin. We also implemented a machine learning algorithm that showed promise at predicting lung function using transcriptome data from both skin and lung biopsies.

CONCLUSION : This study provides the most comprehensive look at the gene expression dynamics of an animal model of systemic sclerosis to date, provides a rich dataset for future comparative fibrotic disease research, and helps refine our understanding of pathways at work during SSc pathogenesis and intervention.

Decato Benjamin E, Ammar Ron, Reinke-Breen Lauren, Thompson John R, Azzara Anthony V

2021-Jul-21

ALK5 inhibitor, Bleomycin, Fibrosis, RNA-seq, Systemic sclerosis, scleroderma

General General

CCIP: Predicting CTCF-mediated chromatin loops with transitivity.

In Bioinformatics (Oxford, England)

MOTIVATION : CTCF-mediated chromatin loops underlie the formation of topological associating domains (TADs) and serve as the structural basis for transcriptional regulation. However, the formation mechanism of these loops remains unclear, and the genome-wide mapping of these loops is costly and difficult. Motivated by the recent studies on the formation mechanism of CTCF-mediated loops, we studied the possibility of making use of transitivity-related information of interacting CTCF anchors to predict CTCF loops computationally. In this context, transitivity arises when two CTCF anchors interact with the same third anchor by the loop extrusion mechanism and bring themselves close to each other spatially to form an indirect loop.

RESULTS : To determine whether transitivity is informative for predicting CTCF loops and to obtain an accurate and low-cost prediction method, we proposed a two-stage random-forest-based machine learning method, CCIP (CTCF-mediated Chromatin Interaction Prediction), to predict CTCF-mediated chromatin loops. Our two-stage learning approach makes it possible for us to train a prediction model by taking advantage of transitivity-related information as well as functional genome data and genomic data. Experimental studies showed that our method predicts CTCF-mediated loops more accurately than other methods and that transitivity, when used as a properly defined attribute, is informative for predicting CTCF loops. Furthermore, we found that transitivity explains the formation of tandem CTCF loops and facilitates enhancer-promoter interactions. Our work contributes to the understanding of the formation mechanism and function of CTCF-mediated chromatin loops.

AVAILABILITY AND IMPLEMENTATION : The source code of CCIP can be accessed at: https://github.com/GaoLabXDU/CCIP.

SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Wang Weibing, Gao Lin, Ye Yusen, Gao Yong

2021-Jul-20

Surgery Surgery

Automatic cell counting from stimulated Raman imaging using deep learning.

In PloS one ; h5-index 176.0

In this paper, we propose an automatic cell counting framework for stimulated Raman scattering (SRS) images, which can assist tumor tissue characteristic analysis, cancer diagnosis, and surgery planning. SRS microscopy has advanced tumor diagnosis and surgery by mapping lipids and proteins from fresh specimens and rapidly revealing fundamental diagnostic hallmarks of tumors at high resolution. However, cell counting from label-free SRS images has been challenging due to the limited contrast between cells and tissue, along with the heterogeneity of tissue morphology and biochemical composition. To this end, a deep learning-based cell counting scheme is proposed by modifying and applying U-Net, an effective medical image semantic segmentation model that requires only a small number of training samples. The distance transform and watershed segmentation algorithms are also implemented to yield the cell instance segmentation and cell counting results. By performing cell counting on SRS images of real human brain tumor specimens, promising results are obtained, with an area under the curve (AUC) above 98% and R = 0.97 for the cell counting correlation between SRS and histological images with hematoxylin and eosin (H&E) staining. The proposed cell counting scheme illustrates the possibility and potential of performing cell counting automatically in near real time and encourages the study of applying deep learning techniques in biomedical and pathological image analyses.
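The distance-transform step of such a counting pipeline can be illustrated with a toy example using SciPy only; the paper's U-Net segmentation step is replaced here by a hand-made binary mask, so this is a sketch of the general technique rather than the authors' implementation.

```python
import numpy as np
from scipy import ndimage as ndi

# Toy binary mask: two touching "cells" (overlapping circles). In the paper
# this mask would come from the U-Net segmentation of an SRS image.
yy, xx = np.mgrid[0:40, 0:60]
mask = ((yy - 20) ** 2 + (xx - 20) ** 2 < 100) | \
       ((yy - 20) ** 2 + (xx - 38) ** 2 < 100)

# Naive connected-component counting merges the touching cells into one object.
n_naive = ndi.label(mask)[1]          # -> 1

# Distance transform: each foreground pixel's distance to the background.
dist = ndi.distance_transform_edt(mask)

# Cell centers show up as local maxima of the distance map; labeling them
# gives one marker per cell. A watershed seeded from these markers would
# then split the mask into per-cell instance segments.
peaks = (ndi.maximum_filter(dist, size=9) == dist) & (dist > 5)
markers, n_cells = ndi.label(peaks)   # -> 2 cells
```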

Zhang Qianqian, Yun Kyung Keun, Wang Hao, Yoon Sang Won, Lu Fake, Won Daehan

2021

General General

Survival prognostic factors in patients with acute myeloid leukemia using machine learning techniques.

In PloS one ; h5-index 176.0

This paper identifies prognostic factors for survival in patients with acute myeloid leukemia (AML) using machine learning techniques. We have integrated machine learning with feature selection methods and compared their performances to identify the most suitable factors for assessing the survival of AML patients. Here, six data mining algorithms, including Decision Tree, Random Forest, Logistic Regression, Naive Bayes, W-Bayes Net, and Gradient Boosted Tree (GBT), are employed for the detection model and implemented using the common data mining tool RapidMiner and an open-source R package. To improve the predictive ability of our model, a set of features was selected by employing multiple feature selection methods. The accuracy of classification was obtained using 10-fold cross-validation for the various combinations of feature selection methods and machine learning algorithms. The performance of the models was assessed by various measurement indexes, including accuracy, kappa, sensitivity, specificity, positive predictive value, negative predictive value, and area under the ROC curve (AUC). Our results showed that GBT, with an accuracy of 85.17% and AUC of 0.930, combined with feature selection via the Relief algorithm, has the best performance in predicting the survival rate of AML patients.
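The core evaluation protocol (10-fold cross-validation of a gradient boosted tree) can be sketched with scikit-learn on synthetic data; the actual AML cohort features and the Relief-based feature selection are not reproduced here (Relief is not part of scikit-learn).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the AML cohort (features are hypothetical).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# 10-fold cross-validated accuracy and AUC for a gradient boosted tree,
# mirroring the evaluation protocol described in the abstract.
gbt = GradientBoostingClassifier(random_state=0)
acc = cross_val_score(gbt, X, y, cv=10, scoring="accuracy").mean()
auc = cross_val_score(gbt, X, y, cv=10, scoring="roc_auc").mean()
```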

Karami Keyvan, Akbari Mahboubeh, Moradi Mohammad-Taher, Soleymani Bijan, Fallahi Hossein

2021

Public Health Public Health

Cost-effectiveness of artificial intelligence monitoring for active tuberculosis treatment: A modeling study.

In PloS one ; h5-index 176.0

BACKGROUND : Tuberculosis (TB) incidence in Los Angeles County, California, USA (5.7 per 100,000) is significantly higher than the U.S. national average (2.9 per 100,000). Directly observed therapy (DOT) is the preferred strategy for active TB treatment but requires substantial resources. We partnered with the Los Angeles County Department of Public Health (LACDPH) to evaluate the cost-effectiveness of AiCure, an artificial intelligence (AI) platform that allows for automated treatment monitoring.

METHODS : We used a Markov model to compare DOT versus AiCure for active TB treatment in LA County. Each cohort transitioned between health states at rates estimated using data from a pilot study for AiCure (N = 43) and comparable historical controls for DOT (N = 71). We estimated total costs (2017, USD) and quality-adjusted life years (QALYs) over a 16-month horizon to calculate the incremental cost-effectiveness ratio (ICER) and net monetary benefits (NMB) of AiCure. To assess robustness, we conducted deterministic (DSA) and probabilistic sensitivity analyses (PSA).

RESULTS : For the average patient, AiCure was dominant over DOT. DOT treatment cost $4,894 and generated 1.03 QALYs over 16 months. AiCure treatment cost $2,668 for 1.05 QALYs. At a willingness-to-pay threshold of $150K/QALY, the incremental NMB per patient under AiCure was $4,973. In univariate DSA, NMB were most sensitive to monthly doses and vocational nurse wage; however, AiCure remained dominant. In PSA, AiCure was dominant in 93.5% of 10,000 simulations (cost-effective in 96.4%).
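The incremental cost-effectiveness arithmetic behind these figures is straightforward. The sketch below recomputes it from the rounded numbers in the abstract, so the NMB comes out near, but not exactly at, the reported $4,973, which was derived from unrounded model output.

```python
# Cost-effectiveness arithmetic from the abstract's rounded figures.
cost_dot, qaly_dot = 4894.0, 1.03   # directly observed therapy
cost_ai,  qaly_ai  = 2668.0, 1.05   # AiCure
wtp = 150_000.0                      # willingness to pay per QALY

d_cost = cost_ai - cost_dot  # negative: AiCure is cheaper
d_qaly = qaly_ai - qaly_dot  # positive: AiCure gains QALYs

# Lower cost with higher QALYs means AiCure "dominates" DOT, so the
# ICER (d_cost / d_qaly) is not reported as a meaningful ratio.
nmb = d_qaly * wtp - d_cost  # incremental net monetary benefit
```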

CONCLUSIONS : AiCure for treatment of active TB is cost-effective for patients in LA County, California. Increased use of AI platforms in other jurisdictions could facilitate the CDC's vision of TB elimination.

Salcedo Jonathan, Rosales Monica, Kim Jeniffer S, Nuno Daisy, Suen Sze-Chuan, Chang Alicia H

2021

General General

[Artificial intelligence, radiomics and pathomics to predict response and survival of patients treated with radiations].

In Cancer radiotherapie : journal de la Societe francaise de radiotherapie oncologique

Artificial intelligence approaches in medicine are increasingly used and are extremely promising, owing to the growing volume and variety of data they can exploit. In particular, the computational analysis of medical images, whether radiological (radiomics) or anatomopathological (pathomics), has shown very promising results for predicting the prognosis and treatment response of cancer patients. Radiotherapy is a discipline that particularly benefits from these new approaches based on computer science and imaging. This review presents the main principles of artificial intelligence approaches, in particular machine learning, the principles of radiomic and pathomic approaches, and their potential for predicting the prognosis of patients treated with radiotherapy.

Sun R, Lerousseau M, Henry T, Carré A, Leroy A, Estienne T, Niyoteka S, Bockel S, Rouyar A, Alvarez Andres É, Benzazon N, Battistella E, Classe M, Robert C, Scoazec J Y, Deutsch É

2021-Jul-17

Artificial intelligence, Dosiomics, Dosiomique, Intelligence artificielle, Pathomics, Pathomique, Radiomics, Radiomique, Radiotherapy, Radiothérapie

General General

PHOTONAI-A Python API for rapid machine learning model development.

In PloS one ; h5-index 176.0

PHOTONAI is a high-level Python API designed to simplify and accelerate machine learning model development. It functions as a unifying framework that allows the user to easily access and combine algorithms from different toolboxes into custom algorithm sequences. It is especially designed to support the iterative model development process and automates the repetitive training, hyperparameter optimization, and evaluation tasks. Importantly, the workflow ensures unbiased performance estimates while still allowing the user to fully customize the machine learning analysis. PHOTONAI extends existing solutions with a novel pipeline implementation supporting more complex data streams, feature combinations, and algorithm selection. Metrics and results can be conveniently visualized using the PHOTONAI Explorer, and predictive models are shareable in a standardized format for further external validation or application. A growing add-on ecosystem allows researchers to offer data-modality-specific algorithms to the community and enhance machine learning in the life sciences. Its practical utility is demonstrated on an exemplary medical machine learning problem, achieving a state-of-the-art solution in a few lines of code. Source code is publicly available on GitHub, while examples and documentation can be found at www.photon-ai.com.

Leenings Ramona, Winter Nils Ralf, Plagwitz Lucas, Holstein Vincent, Ernsting Jan, Sarink Kelvin, Fisch Lukas, Steenweg Jakob, Kleine-Vennekate Leon, Gebker Julian, Emden Daniel, Grotegerd Dominik, Opel Nils, Risse Benjamin, Jiang Xiaoyi, Dannlowski Udo, Hahn Tim

2021

General General

PEDF, a pleiotropic WTC-LI biomarker: Machine learning biomarker identification and validation.

In PLoS computational biology

Biomarkers predict World Trade Center-Lung Injury (WTC-LI); however, there remains unaddressed multicollinearity in our serum cytokines, chemokines, and high-throughput platform datasets used to phenotype WTC-disease. To address this concern, we used automated, machine-learning, high-dimensional data pruning, and validated identified biomarkers. The parent cohort consisted of male, never-smoking firefighters with WTC-LI (FEV1%Pred < lower limit of normal (LLN); n = 100) and controls (n = 127) and had their biomarkers assessed. Cases and controls (n = 15/group) underwent untargeted metabolomics, and feature selection was then performed on metabolites, cytokines, chemokines, and clinical data. Cytokines, chemokines, and clinical biomarkers were validated in the non-overlapping parent cohort via binary logistic regression with 5-fold cross-validation. Random forests of metabolites (n = 580), clinical biomarkers (n = 5), and previously assayed cytokines and chemokines (n = 106) identified that the top 5% of biomarkers important to class separation included pigment epithelium-derived factor (PEDF), macrophage derived chemokine (MDC), systolic blood pressure, macrophage inflammatory protein-4 (MIP-4), growth-regulated oncogene protein (GRO), monocyte chemoattractant protein-1 (MCP-1), apolipoprotein-AII (Apo-AII), cell membrane metabolites (sphingolipids, phospholipids), and branched-chain amino acids. Validated models via confounder-adjusted (age on 9/11, BMI, exposure, and pre-9/11 FEV1%Pred) binary logistic regression had an AUCROC of 0.90 (0.84-0.96). Decreased PEDF and MIP-4, and increased Apo-AII were associated with increased odds of WTC-LI. Increased GRO and MCP-1, and simultaneously decreased MDC, were associated with decreased odds of WTC-LI. In conclusion, automated data pruning identified novel WTC-LI biomarkers; performance was validated in an independent cohort.
One biomarker, PEDF, an antiangiogenic agent, is a novel predictive biomarker of particulate-matter-related lung disease. Other biomarkers (GRO, MCP-1, MDC, MIP-4) reveal immune cell involvement in WTC-LI pathogenesis. The findings of our automated biomarker identification warrant further investigation into these potential pharmacotherapy targets.

Crowley George, Kim James, Kwon Sophia, Lam Rachel, Prezant David J, Liu Mengling, Nolan Anna

2021-Jul-21

General General

Sequence to Sequence ECG Cardiac Rhythm Classification using Convolutional Recurrent Neural Networks.

In IEEE journal of biomedical and health informatics

This paper proposes a novel deep learning architecture combining Convolutional Neural Network (CNN) layers and Recurrent Neural Network (RNN) layers that can be used to perform segmentation and classification of 5 cardiac rhythms based on ECG recordings. The algorithm is developed in a sequence-to-sequence setting, where the input is a sequence of five-second ECG signal sliding windows and the output is a sequence of cardiac rhythm labels. The novel architecture processes as input both the spectrograms of the ECG signal and the heartbeats' signal waveform. Additionally, we are able to train the model in the presence of label noise. The model's performance and generalizability are verified on an external database different from the one used for training. Experimental results show this approach can achieve an average F1 score of 0.89 (averaged across the 5 classes). The proposed model also achieves classification performance comparable to the existing state-of-the-art approach with considerably fewer training parameters.

Pokaprakarn Teeranan, Kitzmiller Rebecca R, Moorman Randall, Lake Douglas E, Krishnamurthy Ashok K, Kosorok Michael

2021-Jul-21

General General

Proximity of Cellular and Physiological Response Failures in Sepsis.

In IEEE journal of biomedical and health informatics

Sepsis is a devastating multi-stage health condition with a high mortality rate. Its complexity, prevalence, and the dependency of its outcomes on early detection have attracted substantial attention from the data science and machine learning communities. Previous studies rely on individual cellular and physiological responses representing organ system failures to predict health outcomes or the onset of different sepsis stages. However, it is known that organ systems' failures and dynamics are not independent events. In this study, we identify the dependency patterns of significant proximate sepsis-related failures of cellular and physiological responses using data from 12,223 adult patients hospitalized between July 2013 and December 2015. The results show that proximate failures of cellular and physiological responses create better feature sets for outcome prediction than individual responses. Our findings reveal the few significant proximate failures that play major roles in predicting patients' outcomes. This study's results can be readily translated into clinical practice and can inform the prediction and improvement of patients' conditions and outcomes.

Jazayeri Ali, Capan Muge, Ivy Julie, Arnold Ryan, Yang Christopher C

2021-Jul-21

General General

Respiratory Event Detection during Sleep Using Electrocardiogram and Respiratory Related Signals: Using Polysomnogram and Patch-Type Wearable Device Data.

In IEEE journal of biomedical and health informatics

This paper presents an automatic algorithm for the detection of respiratory events in patients using electrocardiogram (ECG) and respiratory signals. The proposed method was developed using data from polysomnograms (PSG) and from a patch-type device. In total, data of 1,285 subjects were used for algorithm development and evaluation. The proposed method involved respiratory event detection and apnea-hypopnea index (AHI) estimation. Handcrafted features from the ECG and respiratory signals were applied to machine learning algorithms including linear discriminant analysis, quadratic discriminant analysis, random forest, multi-layer perceptron, and the support vector machine (SVM). High performance was demonstrated when using the SVM, with an overall accuracy of 83% and a Cohen's kappa of 0.53 for minute-by-minute respiratory event detection. The correlation coefficient between the reference AHI obtained using the PSG and the AHI estimated by the proposed method was 0.87. Furthermore, patient classification based on an AHI cutoff of 15 showed an accuracy of 87% and a Cohen's kappa of 0.72. The proposed method improves performance by recording the ECG and respiratory signals simultaneously. Overall, it can be used to lower the development cost of commercial software owing to the use of open datasets.
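The AHI estimation and cutoff classification step described above can be sketched in a few lines. The numbers are illustrative, not from the study's data, and minute-level detections are treated here as event counts, which is a simplifying assumption.

```python
def estimate_ahi(n_events, total_sleep_minutes):
    """Apnea-hypopnea index = respiratory events per hour of sleep."""
    return n_events / (total_sleep_minutes / 60.0)

def classify_patient(ahi, cutoff=15.0):
    """Binary screening label using the abstract's AHI cutoff of 15."""
    return "positive" if ahi >= cutoff else "negative"

# Illustrative patient: 96 detected events over 6 hours of sleep.
ahi = estimate_ahi(n_events=96, total_sleep_minutes=360)  # 16 events/hour
label = classify_patient(ahi)                             # "positive"
```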

Yeo Minsoo, Byun Hoonsuk, Lee Jiyeon, Byun Jungick, Rhee Hak Young, Shin Wonchul, Yoon Heenam

2021-Jul-21

General General

RobustSleepNet: Transfer learning for automated sleep staging at scale.

In IEEE transactions on neural systems and rehabilitation engineering : a publication of the IEEE Engineering in Medicine and Biology Society

Sleep disorder diagnosis relies on the analysis of polysomnography (PSG) records. As a preliminary step of this examination, sleep stages are systematically determined. In practice, sleep stage classification relies on the visual inspection of 30-second epochs of polysomnography signals. Numerous automatic approaches have been developed to replace this tedious and expensive task. Although these methods demonstrated better performance than human sleep experts on specific datasets, they remain largely unused in sleep clinics. The main reason is that each sleep clinic uses a specific PSG montage that most automatic approaches cannot handle out of the box. Moreover, even when the PSG montage is compatible, publications have shown that automatic approaches perform poorly on unseen data with different demographics. To address these issues, we introduce RobustSleepNet, a deep learning model for automatic sleep stage classification able to handle arbitrary PSG montages. We trained and evaluated this model in a leave-one-dataset-out fashion on a large corpus of 8 heterogeneous sleep staging datasets to make it robust to demographic changes. When evaluated on an unseen dataset, RobustSleepNet reaches 97% of the F1 of a model explicitly trained on this dataset. Hence, RobustSleepNet unlocks the possibility of performing high-quality out-of-the-box automatic sleep staging with any clinical setup. We further show that finetuning RobustSleepNet, using a part of the unseen dataset, increases the F1 by 2% compared with a model trained specifically for this dataset. Therefore, finetuning might be used to reach a state-of-the-art level of performance on a specific population.

Guillot Antoine, Thorey Valentin

2021-Jul-21

General General

The computer says no: AI, health law, ethics and patient safety.

In British journal of nursing (Mark Allen Publishing)

John Tingle, Lecturer in Law, Birmingham Law School, University of Birmingham, discusses some recent reports on artificial intelligence (AI) and machine learning in the context of law, ethics and patient safety.

Tingle John

2021-Jul-22

General General

Lab-on-Eyeglasses to Monitor Kidneys and Strengthen Vulnerable Populations in Pandemics: Machine Learning in Predicting Serum Creatinine Using Tear Creatinine.

In Analytical chemistry

The serum creatinine level is commonly recognized as a measure of glomerular filtration rate (GFR) and is defined as an indicator of overall renal health. A typical procedure in determining kidney performance is venipuncture to obtain serum creatinine in the blood, which requires a skilled technician to perform on a laboratory basis and multiple clinical steps to acquire a meaningful result. Recently, wearable sensors have undergone immense development, especially for noninvasive health monitoring without the need for a blood sample. This article addresses a fiber-based sensing device selective for tear creatinine, which was fabricated using a copper-containing benzenedicarboxylate (BDC) metal-organic framework (MOF) bound with graphene oxide-Cu(II) and hybridized with Cu2O nanoparticles (NPs). Density functional theory (DFT) was employed to study the binding energies of creatinine toward the ternary hybrid materials that irreversibly occurred at pendant copper ions attached to the BDC segments. Electrochemical impedance spectroscopy (EIS) was utilized to probe the unique charge-transfer resistances of the derived sensing materials. The single-use modified sensor achieved 95.1% selectivity efficiency toward the determination of tear creatinine contents from 1.6 to 2400 μM over 10 repeated measurements in the presence of the interfering species dopamine, urea, and uric acid. Supervised machine learning achieved 83.3% accuracy in distinguishing among low, moderate, and high-normal serum creatinine levels by evaluating tear creatinine. With only one step of collecting tears, this lab-on-eyeglasses with disposable hybrid textile electrodes selective for tear creatinine may be greatly beneficial for remote point-of-care (POC) kidney monitoring in vulnerable populations, especially during pandemics.

Kalasin Surachate, Sangnuang Pantawan, Surareungchai Werasak

2021-Jul-21

oncology Oncology

Quick Annotator: an open-source digital pathology based rapid image annotation tool.

In The journal of pathology. Clinical research

Image-based biomarker discovery typically requires accurate segmentation of histologic structures (e.g. cell nuclei, tubules, and epithelial regions) in digital pathology whole slide images (WSIs). Unfortunately, annotating each structure of interest is laborious and often intractable even in moderately sized cohorts. Here, we present an open-source tool, Quick Annotator (QA), designed to improve annotation efficiency of histologic structures by orders of magnitude. While the user annotates regions of interest (ROIs) via an intuitive web interface, a deep learning (DL) model is concurrently optimized using these annotations and applied to the ROI. The user iteratively reviews DL results to either (1) accept accurately annotated regions or (2) correct erroneously segmented structures to improve subsequent model suggestions, before transitioning to other ROIs. We demonstrate the effectiveness of QA over comparable manual efforts via three use cases. These include annotating (1) 337,386 nuclei in 5 pancreatic WSIs, (2) 5,692 tubules in 10 colorectal WSIs, and (3) 14,187 regions of epithelium in 10 breast WSIs. Efficiency gains in terms of annotations per second of 102×, 9×, and 39× were, respectively, witnessed while retaining f-scores >0.95, suggesting that QA may be a valuable tool for efficiently fully annotating WSIs employed in downstream biomarker studies.

Miao Runtian, Toth Robert, Zhou Yu, Madabhushi Anant, Janowczyk Andrew

2021-Jul-19

active learning, annotations, computational pathology, deep learning, digital pathology, efficiency, epithelium, nuclei, open-source tool, tubules

General General

The application of pangenomics and machine learning in genomic selection in plants.

In The plant genome

Genomic selection approaches have increased the speed of plant breeding, leading to growing crop yields over the last decade. However, climate change is impacting current and future yields, resulting in the need to further accelerate breeding efforts to cope with these changing conditions. Here we present approaches to accelerate plant breeding and incorporate nonadditive effects in genomic selection by applying state-of-the-art machine learning approaches. These approaches are made more powerful by the inclusion of pangenomes, which represent the entire genome content of a species. Understanding the strengths and limitations of machine learning methods, compared with more traditional genomic selection efforts, is paramount to the successful application of these methods in crop breeding. We describe examples of genomic selection and pangenome-based approaches in crop breeding, discuss machine learning-specific challenges, and highlight the potential for the application of machine learning in genomic selection. We believe that careful implementation of machine learning approaches will support crop improvement to help counter the adverse outcomes of climate change on crop production.

Bayer Philipp E, Petereit Jakob, Danilevicz Monica Furaste, Anderson Robyn, Batley Jacqueline, Edwards David

2021-Jul-20

oncology Oncology

[Evaluation of quality of life: Clinical relevance for patient].

In Cancer radiotherapie : journal de la Societe francaise de radiotherapie oncologique

The quality of life of patients and its evaluation remain among the primary objectives in oncology. Different methods and tools for evaluating quality of life have been developed with the objective of providing a global evaluation across its different aspects, be they physical, emotional, psychological, or social. Quality of life questionnaires improve and simplify the re-evaluation and follow-up of patients during clinical trials. Patient-reported outcome measures (PROMs) evaluate quality of life as experienced by the patients (patient-reported outcomes [PROs]) and allow physicians to take a personalized treatment approach. In radiotherapy, PROMs are a useful tool for the follow-up of patients during or after treatment. Technological advances, notably in data collection but also in data integration and processing with artificial intelligence, will allow these evaluation tools to be integrated into the management of patients in oncology.

Dossun C, Popescu B V, Antoni D

2021-Jul-17

EPROS, Evaluation, PROS, Patient, Quality of life, Qualité de vie, ePROS, Évaluation

General General

Deep-learning-based motion correction in OCT angiography.

In Journal of biophotonics

Optical coherence tomography angiography (OCTA) is a widely applied tool to image microvascular networks with high spatial resolution and sensitivity. Due to limited imaging speed, the artifacts caused by tissue motion can severely compromise visualization of the microvascular networks and quantification of OCTA images. In this paper, we propose a deep-learning-based framework to effectively correct motion artifacts and retrieve microvascular architectures. This method comprised two deep neural networks in which the first subnet was applied to distinguish motion corrupted B-scan images from a volumetric dataset. Based on the classification results, the artifacts could be removed from the en face maximum-intensity-projection (MIP) OCTA image. To restore the disturbed vasculature induced by artifact removal, the second subnet, an inpainting neural network, was utilized to reconnect the broken vascular networks. We applied the method to postprocess OCTA images of the microvascular networks in mouse cortex in vivo. Both image comparison and quantitative analysis show that the proposed method can significantly improve OCTA images by efficiently recovering microvasculature from the overwhelming motion artifacts.

Li Ang, Du Congwu, Pan Yingtian

2021-Jul-20

OCTA, deep neural networks, microvascular network, motion correction
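
The projection step of the pipeline above can be illustrated concretely: once the first subnet has flagged motion-corrupted B-scans, the en face MIP is built only from the clean slices. A minimal sketch (the per-slice corruption flags and the toy volume are hypothetical placeholders, and the inpainting subnet that would afterwards reconnect broken vessels is not modelled):

```python
def mip_excluding_corrupted(volume, corrupted):
    """Build an en face maximum-intensity projection (MIP) from the
    slices of `volume` (list of 2-D slices, each a list of pixel rows)
    whose `corrupted` flag is False."""
    clean = [s for s, bad in zip(volume, corrupted) if not bad]
    if not clean:
        raise ValueError("no clean slices to project")
    rows, cols = len(clean[0]), len(clean[0][0])
    return [
        [max(s[r][c] for s in clean) for c in range(cols)]
        for r in range(rows)
    ]

# Toy example: three 2x2 slices; slice 1 is flagged as motion-corrupted.
vol = [
    [[1, 2], [3, 4]],
    [[9, 9], [9, 9]],   # bright motion artifact, excluded from the MIP
    [[0, 5], [2, 1]],
]
mip = mip_excluding_corrupted(vol, [False, True, False])
# mip == [[1, 5], [3, 4]]
```

Excluding flagged slices leaves gaps in the vasculature, which is exactly why the paper pairs this step with an inpainting network.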

oncology Oncology

Unsupervised flow cytometry analysis in hematological malignancies: A new paradigm.

In International journal of laboratory hematology ; h5-index 29.0

Ever since hematopoietic cells became "events" enumerated and characterized in suspension by cell counters or flow cytometers, researchers and engineers have strived to refine the acquisition and display of the electronic signals generated. A large array of solutions was then developed to best identify the numerous cell subsets that can be delineated, notably among hematopoietic cells. As instruments became more and more stable and robust, the focus moved to analytic software. Almost concomitantly, the capacity increased to use large panels (both with mass and classical cytometry) and to apply artificial intelligence/machine learning for their analysis. The combination of these concepts raised new analytical possibilities, opening an unprecedented field of subtle exploration for many conditions, including hematopoiesis and hematological disorders. In this review, the general concepts and progress achieved in the development of new analytical approaches for exploring high-dimensional data sets at the single-cell level will be described as they appeared over the past few years. A larger and more practical part will detail the various steps that need to be mastered, both in data acquisition and in the preanalytical check of data files. Finally, a step-by-step explanation of the solution in development to combine the Bioconductor clustering algorithm FlowSOM and the popular and widely used software Kaluza® (Beckman Coulter) will be presented. The aim of this review is to point out that the day when these advances reach routine hematology laboratories does not seem so far away.

Béné Marie C, Lacombe Francis, Porwit Anna

2021-Jul

artificial intelligence, flow cytometry, machine learning, unsupervised analysis

Pathology Pathology

Machine learning in health care and laboratory medicine: General overview of supervised learning and Auto-ML.

In International journal of laboratory hematology ; h5-index 29.0

Artificial Intelligence (AI) and machine learning (ML) have now spawned a new field within health care and health science research. These new predictive analytics tools are starting to change various facets of our clinical care domains, including the practice of laboratory medicine. Many of these ML tools and studies are also starting to populate our literature landscape, but the average reader's unfamiliarity with the basic knowledge and critical concepts of AI/ML demands that we better prepare our audience for such relatively new concepts. A fundamental knowledge of such platforms will inevitably enhance cross-disciplinary literacy and ultimately lead to enhanced integration and understanding of such tools within our discipline. In this review, we provide a general outline of AI/ML along with an overview of the fundamental concepts of ML categories, specifically supervised, unsupervised, and reinforcement learning. Additionally, since the vast majority of our current approaches within ML in laboratory medicine and health care involve supervised algorithms, we will predominantly concentrate on such platforms. Finally, the need to make such tools more accessible to the average investigator is becoming a major driving force for automation within these ML platforms. This has given rise to the automated ML (Auto-ML) world, which will undoubtedly help shape the future of ML within health care. Hence, an overview of Auto-ML is also covered within this manuscript, which will hopefully enrich the reader's understanding and appreciation of such tools, and the case for embracing them.

Rashidi Hooman H, Tran Nam, Albahra Samer, Dang Luke T

2021-Jul

Algorithm, artificial intelligence, auto-ML, feature selection, principal component analysis

General General

Facial recognition accuracy in photographs of Thai neonates with Down syndrome among physicians and the Face2Gene application.

In American journal of medical genetics. Part A

Down syndrome (DS) is typically recognizable in those who present with multiple dysmorphisms, especially facial phenotypes. However, as the presentation of DS in neonates is less obvious, a phenotype-based presumptive diagnosis is more challenging. Recently, an artificial intelligence (AI) application, Face2Gene, was developed to help physicians recognize specific genetic syndromes from two-dimensional facial photos. To date, no study has compared accuracy between physicians and such applications. Our objective was to compare the facial recognition accuracy of DS in Thai neonates, using facial photographs, between physicians and Face2Gene. Sixty-four Thai neonates at Thammasat University Hospital, with genetic testing and signed parental consent, were divided into a DS group (25) and a non-DS group (39). The non-DS group was further divided into unaffected neonates (19) and those affected with other syndromes (20). Our results revealed that physician accuracy (89%) was higher than that of Face2Gene (81%); however, the application had higher sensitivity (100%) than the physicians (86%). While this application can serve as a helpful assistant in recognizing genetic syndromes such as DS, aiding clinicians in identifying DS facial features in neonates, it is not a replacement for well-trained doctors.

Srisraluang Wewika, Rojnueangnit Kitiwan

2021-Jul-21

Face2Gene, Thai neonates with Down syndrome, facial recognition

Radiology Radiology

Classification of focal liver lesions in CT images using convolutional neural networks with lesion information augmented patches and synthetic data augmentation.

In Medical physics ; h5-index 59.0

PURPOSE : We propose a deep learning method that classifies focal liver lesions (FLLs) into cysts, hemangiomas, and metastases from portal phase abdominal CT images. We propose a synthetic data augmentation process to alleviate the class imbalance and the Lesion INformation Augmented (LINA) patch to improve the learning efficiency.

METHODS : A dataset of 502 portal phase CT scans of 1,290 focal liver lesions (FLLs) was used. First, to alleviate the class imbalance and to diversify the training data patterns, we suggest synthetic training data augmentation using DCGAN-based lesion mask synthesis and pix2pix-based mask-to-image translation. Second, to improve the learning efficiency of convolutional neural networks (CNNs) for the small lesions, we propose a novel type of input patch termed the LINA patch to emphasize the lesion texture information while also maintaining the lesion boundary information in the patches. Third, we construct a multi-scale CNN through a model ensemble of ResNet-18 CNNs trained on LINA patches of various mini-patch sizes.

RESULTS : The experiments demonstrate that (1) the synthetic data augmentation method shows characteristics different from but complementary to those of conventional real data augmentation in augmenting data distributions, (2) the proposed LINA patches improve classification performance compared to existing types of CNN input patches due to the enhanced texture and boundary information in the small lesions, and (3) through an ensemble of LINA patch-trained CNNs with different mini-patch sizes, the multi-scale CNN further improves overall classification performance. As a result, the proposed method achieved an accuracy of 87.30%, showing improvements of 10.81%p and 15.0%p compared to the conventional image patch-trained CNN and the texture feature-trained SVM, respectively.

CONCLUSIONS : The proposed synthetic data augmentation method shows promising results in improving data diversity and alleviating class imbalance, and the proposed LINA patches enhance learning efficiency compared to existing input image patches.

Lee Hansang, Lee Haeil, Hong Helen, Bae Heejin, Lim Joon Seok, Kim Junmo

2021-Jul-21

Liver metastasis, classification, computed tomography, deep learning, generative adversarial network
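
The multi-scale model above combines ResNet-18 CNNs trained at different mini-patch sizes through an ensemble. The abstract does not specify the combination rule, but a common choice is soft voting over class probabilities; a minimal sketch of that (the three probability vectors are made-up illustrations, not model outputs):

```python
def soft_vote(prob_lists):
    """Average class-probability vectors from several models and
    return (index of the winning class, averaged probabilities)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three hypothetical models scoring one lesion as (cyst, hemangioma, metastasis):
preds = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
]
label, avg = soft_vote(preds)
# label == 1 (hemangioma), since its averaged probability is highest
```

Averaging probabilities (rather than hard majority voting) lets a confident model at one patch scale outweigh uncertain ones at other scales.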

oncology Oncology

Noise2Context: Context-assisted Learning 3D Thin-layer for Low Dose CT.

In Medical physics ; h5-index 59.0

PURPOSE : Computed tomography (CT) plays a vital role in medical diagnosis, assessment, and therapy planning. In clinical practice, concerns about X-ray radiation exposure have attracted increasing attention. To lower the X-ray dose, low-dose CT (LDCT) has been widely adopted in certain scenarios, although it degrades CT image quality. In this paper, we propose a deep learning-based method that can train denoising neural networks without any clean data.

METHODS : For 3D thin-slice LDCT scanning, we first derive an unsupervised loss function that is equivalent to a supervised loss function with paired noisy and clean samples when the noise in different slices from a single scan is uncorrelated and zero-mean. We then train the denoising neural network to map each noisy LDCT image to its two adjacent LDCT images within a single 3D thin-layer LDCT scan simultaneously. In essence, under some latent assumptions, we propose an unsupervised loss function that exploits the similarity between adjacent CT slices in 3D thin-layer LDCT to train the denoising neural network in an unsupervised manner.

RESULTS : Experiments were carried out on the Mayo LDCT dataset and a realistic pig head. On the Mayo LDCT dataset, our unsupervised method obtained performance comparable to that of the supervised baseline. On the realistic pig head, our method achieved the best performance at different noise levels compared with all other methods, demonstrating the superiority and robustness of the proposed Noise2Context.

CONCLUSIONS : We present a generalizable LDCT image denoising method that requires no clean data. As a result, our method dispenses not only with complex hand-crafted image priors but also with large amounts of paired high-quality training data.

Zhang Zhicheng, Liang Xiaokun, Zhao Wei, Xing Lei

2021-Jul-21

Deep learning, Image denoising, Low dose CT, Unsupervised learning
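
The Noise2Context objective described in METHODS can be written down compactly: the denoiser is trained so that its output for slice k matches the two neighbouring noisy slices k-1 and k+1, which under the zero-mean, slice-uncorrelated noise assumption is equivalent to regressing toward the clean signal. A minimal sketch of the loss (the identity "denoiser" and the toy 1-D slices are placeholders, not the paper's network or data):

```python
def noise2context_loss(denoise, volume):
    """Mean squared error between the denoised interior slices and
    their two adjacent noisy slices, averaged over the 3-D stack."""
    total, count = 0.0, 0
    for k in range(1, len(volume) - 1):
        out = denoise(volume[k])
        for neighbour in (volume[k - 1], volume[k + 1]):
            for o, t in zip(out, neighbour):
                total += (o - t) ** 2
                count += 1
    return total / count

# Toy 1-D "slices"; the placeholder denoiser just returns its input.
vol = [[1.0, 2.0], [1.5, 2.5], [2.0, 3.0]]
loss = noise2context_loss(lambda s: list(s), vol)
# loss == 0.25 (four squared residuals of 0.5, averaged)
```

In practice `denoise` would be a neural network and this scalar would be minimised by gradient descent over many scans.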

Radiology Radiology

Technical Note: Comparison of Convolutional Neural Networks for Detecting Large Vessel Occlusion on Computed Tomography Angiography.

In Medical physics ; h5-index 59.0

PURPOSE : Artificial intelligence diagnosis and triage of large vessel occlusion may quicken clinical response for a subset of time-sensitive acute ischemic stroke patients, improving outcomes. Differences in architectural elements within data-driven convolutional neural network (CNN) models impact performance. Foreknowledge of effective model architectural elements for domain-specific problems can narrow the search for candidate models and inform strategic model design and adaptation to optimize performance on available data. Here, we study CNN architectures with a range of learnable parameters and which span inclusion of architectural elements, such as parallel processing branches and residual connections with varying methods of recombining residual information.

METHODS : We compare five CNNs: ResNet-50, DenseNet-121, EfficientNet-B0, PhiNet, and an Inception module-based network, on a computed tomography angiography large vessel occlusion detection task. The models were trained and preliminarily evaluated with 10-fold cross-validation on preprocessed scans (n=240). An ablation study was performed on PhiNet due to superior cross-validated test performance across accuracy, precision, recall, specificity, and F1 score. The final evaluation of all models was performed on a withheld external validation set (n=60) and these predictions were subsequently calibrated with sigmoid curves.

RESULTS : Uncalibrated results on the withheld external validation set show that DenseNet-121 had the best average performance on accuracy, precision, recall, specificity, and F1 score. After calibration DenseNet-121 maintained superior performance on all metrics except recall.

CONCLUSIONS : The number of learnable parameters in our five models and the best-ablated PhiNet related directly to cross-validated test performance: the smaller the model, the better. However, this pattern did not hold when looking at generalization on the withheld external validation set. DenseNet-121 generalized the best; we posit this was due to its heavy use of residual connections utilizing concatenation, which causes feature maps from earlier layers to be used deeper in the network while aiding gradient flow and regularization.

Remedios Lucas W, Lingam Sneha, Remedios Samuel W, Gao Riqiang, Clark Stephen W, Davis Larry T, Landman Bennett A

2021-Jul-21

Computed Tomography Angiography (CTA), Convolutional Neural Network, Deep Learning, Image Classification, Large Vessel Occlusion
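
The 10-fold cross-validation used for preliminary evaluation partitions the n=240 preprocessed scans into ten disjoint folds, each serving once as the held-out test set. A minimal index-splitting sketch (plain sequential folds; the study's actual shuffling/stratification policy is not stated in the abstract):

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation
    over n samples, spreading any remainder across the first folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

folds = list(k_fold_indices(240, 10))
# 10 folds of 24 test scans each; every scan is tested exactly once.
```

Each model's cross-validated metrics are then the average over the ten test folds, while the external validation set (n=60) stays untouched until the final evaluation.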

General General

A deep learning approach to automate whole-genome prediction of diverse epigenomic modifications in plants.

In The New phytologist

Epigenetic modifications function in gene transcription, RNA metabolism, and other biological processes. However, multiple factors currently limit the scientific utility of epigenomic datasets generated for plants. Here, using deep-learning approaches, we developed a Smart Model for Epigenetics in Plants (SMEP) to predict six types of epigenomic modifications: DNA 5-methylcytosine (5mC) and N6-methyladenosine (6mA) methylation, RNA N6-methyladenosine (m6A) methylation, and three types of histone modification. Using the datasets from the japonica rice Nipponbare, SMEP achieved 95% prediction accuracy for 6mA, and also achieved around 80% for 5mC, m6A, and the three types of histone modification based on the 10-fold cross-validation. Additionally, >95% of the 6mA peaks detected after a heat-shock treatment were predicted. We also successfully applied the SMEP for examining epigenomic modifications in indica rice 93-11 and even the B73 maize line. Taken together, we show that the deep-learning-enabled SMEP can reliably mine epigenomic datasets from diverse plants to yield actionable insights about epigenomic sites. Thus, our work opens new avenues for the application of predictive tools to facilitate functional research, and will almost certainly increase the efficiency of genome engineering efforts.

Wang Yifan, Zhang Pingxian, Guo Weijun, Liu Hanqing, Li Xiulan, Zhang Qian, Du Zhuoying, Hu Guihua, Han Xiao, Pu Li, Tian Jian, Gu Xiaofeng

2021-Jul-20

DNA methylation, RNA methylation, artificial intelligence, convolutional neural networks, deep learning, histone modification

General General

Signs of the times: universalism and localism.

In BJPsych international

At a time when nationalism has reappeared in Europe, when COVID-19 is not yet quarantined and when compassion coexists with grief, there is a need to consider the impact of these societal changes on international collaboration, frequency and management of perinatal mental disorder - and new roles for psychiatrists and other health professionals in the digital and AI (artificial intelligence) post-COVID era.

Cox John

2020-Aug

COVID-19, Perinatal mental health, cross-cultural psychiatry, international psychiatry, person-centred care

Surgery Surgery

Automatic Extraction of Lung Cancer Staging Information From Computed Tomography Reports: Deep Learning Approach.

In JMIR medical informatics ; h5-index 23.0

BACKGROUND : Lung cancer is the leading cause of cancer deaths worldwide. Clinical staging of lung cancer plays a crucial role in making treatment decisions and evaluating prognosis. However, in clinical practice, approximately one-half of the clinical stages of lung cancer patients are inconsistent with their pathological stages. As one of the most important diagnostic modalities for staging, chest computed tomography (CT) provides a wealth of information about cancer staging, but the free-text nature of CT reports hinders their computational processing.

OBJECTIVE : We aimed to automatically extract the staging-related information from CT reports to support accurate clinical staging of lung cancer.

METHODS : In this study, we developed an information extraction (IE) system to extract staging-related information from CT reports. The system consists of three parts: named entity recognition (NER), relation classification (RC), and postprocessing (PP). We first summarized 22 questions about lung cancer staging based on the TNM staging guideline. Next, three state-of-the-art NER algorithms were implemented to recognize the entities of interest. Then, we designed a novel RC method using the relation sign constraint (RSC) to classify the relations between entities. Finally, a rule-based PP module was established to obtain formatted answers using the results of NER and RC.

RESULTS : We evaluated the developed IE system on a clinical data set containing 392 chest CT reports collected from the Department of Thoracic Surgery II in the Peking University Cancer Hospital. The experimental results showed that the bidirectional encoder representation from transformers (BERT) model outperformed the iterated dilated convolutional neural networks-conditional random field (ID-CNN-CRF) and bidirectional long short-term memory networks-conditional random field (Bi-LSTM-CRF) for NER tasks with macro-F1 scores of 80.97% and 90.06% under the exact and inexact matching schemes, respectively. For the RC task, the proposed RSC showed better performance than the baseline methods. Further, the BERT-RSC model achieved the best performance with a macro-F1 score of 97.13% and a micro-F1 score of 98.37%. Moreover, the rule-based PP module could correctly obtain the formatted results using the extractions of NER and RC, achieving a macro-F1 score of 94.57% and a micro-F1 score of 96.74% for all the 22 questions.

CONCLUSIONS : We conclude that the developed IE system can effectively and accurately extract information about lung cancer staging from CT reports. Experimental results show that the extracted results have significant potential for further use in stage verification and prediction to facilitate accurate clinical staging.

Hu Danqing, Zhang Huanyao, Li Shaolei, Wang Yuhong, Wu Nan, Lu Xudong

2021-Jul-21

clinical staging, information extraction, lung cancer, named entity recognition, relation classification
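
The RESULTS above report both macro-F1 (the unweighted mean of per-question F1 scores) and micro-F1 (F1 computed over counts pooled across all questions), which diverge when question difficulty or frequency varies. A minimal sketch of the distinction (the two per-question count triples are made up for illustration):

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive, false-negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def macro_micro_f1(counts):
    """counts: list of (tp, fp, fn) tuples, one per class/question."""
    macro = sum(f1(*c) for c in counts) / len(counts)
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return macro, micro

# Two hypothetical questions: one easy and frequent, one hard and rare.
macro, micro = macro_micro_f1([(90, 5, 5), (10, 10, 10)])
# micro > macro here: pooled counts are dominated by the frequent question,
# while macro weights both questions equally.
```

Reporting both, as the study does, guards against a system that looks strong only because it handles the common questions well.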

Surgery Surgery

Application of machine learning to predict the outcome of pediatric traumatic brain injury.

In Chinese journal of traumatology = Zhonghua chuang shang za zhi

PURPOSE : Traumatic brain injury (TBI) is a major cause of mortality and disability, particularly in children. Machine learning (ML) comprises computational algorithms that can be applied as clinical prediction tools. The present study aims to assess how well ML can predict the functional outcomes of pediatric TBI.

METHODS : A retrospective cohort study was performed targeting children with TBI who were admitted to the trauma center of southern Thailand between January 2009 and July 2020. Patients were excluded if they (1) did not undergo a CT scan of the brain, (2) died within the first 24 h, (3) had incomplete medical records for the admission, or (4) were unable to provide updated outcomes. Clinical and radiologic characteristics were collected, such as vital signs, Glasgow coma scale score, and characteristics of intracranial injuries. The functional outcome was assessed using the King's Outcome Scale for Childhood Head Injury and dichotomized into favourable and unfavourable outcomes: good recovery and moderate disability were categorized as the former, whereas death, vegetative state, and severe disability were categorized as the latter. The prognostic factors were estimated using traditional binary logistic regression. By data splitting, 70% of the data were used for training the ML models and the remaining 30% were used for testing them. Supervised algorithms including support vector machines, neural networks, random forest, logistic regression, naive Bayes, and k-nearest neighbors were used to train the ML models, whose predictive performance was then evaluated on the testing dataset.

RESULTS : There were 828 patients in the cohort. The median age was 72 months (interquartile range 104.7 months, range 2-179 months). Road traffic accident was the most common mechanism of injury, accounting for 68.7%. At hospital discharge, favourable outcomes were achieved in 97.0% of patients, while the mortality rate was 2.2%. Glasgow coma scale score, hypotension, pupillary light reflex, and subarachnoid haemorrhage were associated with TBI outcomes in traditional binary logistic regression; hence, these 4 prognostic factors were used for building and testing the ML models. The support vector machine model had the best performance for predicting pediatric TBI outcomes: sensitivity 0.95, specificity 0.60, positive predictive value 0.99, negative predictive value 1.0, accuracy 0.94, and area under the receiver operating characteristic curve 0.78.

CONCLUSION : The ML algorithms in the present study have high sensitivity; they therefore have the potential to serve as screening tools for predicting functional outcomes and counselling on prognosis in the general practice of pediatric TBI.

Tunthanathip Thara, Oearsakul Thakul

2021-Jun-08

Logistic regression, Machine learning, Pediatrics, Random forest, Support vector machine, Traumatic brain injury
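
The performance figures quoted in the RESULTS above (sensitivity, specificity, positive/negative predictive value, accuracy) all derive from a single 2×2 confusion matrix. A minimal sketch with hypothetical counts chosen only for illustration (not the study's data):

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),              # recall on true positives
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts from a heavily imbalanced cohort (mostly favourable
# outcomes, as in the study): 100 positives, 5 negatives.
m = binary_metrics(tp=95, fp=2, tn=3, fn=5)
# sensitivity 0.95, specificity 0.6
```

Note how, with so few unfavourable outcomes, accuracy and PPV stay high almost regardless of how the rare class is handled, which is why the abstract's emphasis on sensitivity as the screening-relevant metric is apt.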

Public Health Public Health

Chest X-ray analysis with deep learning-based software as a triage test for pulmonary tuberculosis: an individual patient data meta-analysis of diagnostic accuracy.

In Clinical infectious diseases : an official publication of the Infectious Diseases Society of America

BACKGROUND : Automated radiologic analysis using computer-aided detection software (CAD) could facilitate chest X-ray (CXR) use in tuberculosis diagnosis. There is little to no evidence on the accuracy of commercially-available deep learning-based CAD in different populations, including patients with smear-negative tuberculosis and people living with HIV (PLWH).

METHODS : We collected CXRs and individual patient data (IPD) from studies evaluating CAD in patients self-referring for tuberculosis symptoms with culture or nucleic acid amplification testing as the reference. We re-analyzed CXRs with three CAD (CAD4TB version (v) 6, Lunit v3.1.0.0, and qXR v2). We estimated sensitivity and specificity within each study and pooled using IPD meta-analysis. We used multivariable meta-regression to identify characteristics modifying accuracy.

RESULTS : We included CXRs and IPD of 3727/3967 participants from 4/7 eligible studies. 17% (621/3727) were PLWH. 17% (645/3727) had microbiologically-confirmed tuberculosis. Despite using the same threshold score for classifying CXR in every study, sensitivity and specificity varied from study to study. The software had similar unadjusted accuracy (at 90% pooled sensitivity, pooled specificities were: CAD4TBv6, 56.9% [95%CI:51.7-61.9]; Lunit, 54.1% [44.6-63.3]; qXRv2, 60.5% [51.7-68.6]). Adjusted absolute differences in pooled sensitivity between PLWH and HIV-uninfected participants were: CAD4TBv6, -13.4% [-21.1, -6.9]; Lunit, +2.2% [-3.6, +6.3]; qXRv2, -13.4% [-21.5, -6.6]; between smear-negative and smear-positive tuberculosis they were: CAD4TBv6, -12.3% [-19.5, -6.1]; Lunit, -17.2% [-24.6, -10.5]; qXRv2, -16.6% [-24.4, -9.9]. Accuracy was similar to that of human readers.

CONCLUSIONS : For CAD CXR analysis to be implemented as a high-sensitivity tuberculosis rule-out test, users will need threshold scores identified from their own patient populations, and stratified by HIV- and smear-status.

Tavaziva Gamuchirai, Harris Miriam, Abidi Syed K, Geric Coralie, Breuninger Marianne, Dheda Keertan, Esmail Aliasgar, Muyoyeta Monde, Reither Klaus, Majidulla Arman, Khan Aamir J, Campbell Jonathon R, David Pierre-Marie, Denkinger Claudia, Miller Cecily, Nathavitharana Ruvandhi, Pai Madhukar, Benedetti Andrea, Khan Faiz Ahmad

2021-Jul-21

Tuberculosis, accuracy, chest X-ray, deep learning, individual patient data meta-analysis
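
The pooled specificities above are each reported at a CAD threshold chosen to reach 90% sensitivity, and the CONCLUSIONS recommend that users derive such thresholds from their own populations. Choosing a threshold this way from abnormality scores and reference labels can be sketched as follows (the scores and labels below are made-up toy data, not from the meta-analysis):

```python
import math

def threshold_for_sensitivity(scores, labels, target=0.90):
    """Return the highest abnormality-score threshold at which at least
    `target` of the reference-positive cases score >= threshold, plus
    the specificity achieved at that threshold."""
    pos = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    neg = [s for s, y in zip(scores, labels) if not y]
    need = math.ceil(target * len(pos))   # minimum positives to capture
    thr = pos[need - 1]
    specificity = sum(s < thr for s in neg) / len(neg)
    return thr, specificity

# Toy data: 6 culture-positive (1) and 6 culture-negative (0) participants.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,    1,   1,   1,   1,   0,    1,   0,   0,   0,   0,   0]
thr, spec = threshold_for_sensitivity(scores, labels)
# thr == 0.5; 5 of 6 negatives fall below it, so specificity ≈ 0.83
```

Because the score distributions differ across populations (e.g. by HIV and smear status, as the adjusted analyses show), a threshold tuned on one cohort will generally not hit 90% sensitivity on another, which is the study's central caution.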

General General

Trends in computational molecular catalyst design.

In Dalton transactions (Cambridge, England : 2003)

Computational methods have emerged as a powerful tool to augment traditional experimental molecular catalyst design by providing useful predictions of catalyst performance and decreasing the time needed for catalyst screening. In this perspective, we discuss three approaches for computational molecular catalyst design: (i) the reaction mechanism-based approach that calculates all relevant elementary steps, finds the rate and selectivity determining steps, and ultimately makes predictions on catalyst performance based on kinetic analysis, (ii) the descriptor-based approach where physical/chemical considerations are used to find molecular properties as predictors of catalyst performance, and (iii) the data-driven approach where statistical analysis as well as machine learning (ML) methods are used to obtain relationships between available data/features and catalyst performance. Following an introduction to these approaches, we cover their strengths and weaknesses and highlight some recent key applications. Furthermore, we present an outlook on how the currently applied approaches may evolve in the near future by addressing how recent developments in building automated computational workflows and implementing advanced ML models hold promise for reducing human workload, eliminating human bias, and speeding up computational catalyst design at the same time. Finally, we provide our viewpoint on how some of the challenges associated with the up-and-coming approaches driven by automation and ML may be resolved.

Soyemi Ademola, Szilvási Tibor

2021-Jul-21

General General

Functional binding dynamics relevant to the evolution of zoonotic spillovers in endemic and emergent Betacoronavirus strains.

In Journal of biomolecular structure & dynamics

Comparative functional analysis of the dynamic interactions between various Betacoronavirus mutant strains and broadly utilized target proteins such as ACE2 and CD26 is crucial for a more complete understanding of zoonotic spillovers of viruses that cause diseases such as COVID-19. Here, we employ machine learning on replicated sets of nanosecond-scale GPU-accelerated molecular dynamics simulations to statistically compare and classify atom motions of these target proteins in both the presence and absence of different endemic and emergent strains of the viral receptor binding domain (RBD) of the S spike glycoprotein. A multi-agent classifier successfully identified functional binding dynamics that are evolutionarily conserved from bat CoV-HKU4 to human endemic/emergent strains. Conserved dynamics regions of ACE2 involve both the N-terminal helices, as well as a region of more transient dynamics encompassing residues K353, Q325 and a novel motif AAQPFLL 386-92 that appears to coordinate their dynamic interactions with the viral RBD at N501. We also demonstrate that the functional evolution of Betacoronavirus zoonotic spillovers involving ACE2 interaction dynamics is likely pre-adapted from two precise and stable binding sites involving the viral bat progenitor strain's interaction with CD26 at SAMLI 291-5 and SS 333-334. Our analyses further indicate that the human endemic strains hCoV-HKU1 and hCoV-OC43 have evolved more stable N-terminal helix interactions through enhancement of an interfacing loop region on the viral RBD, whereas the highly transmissible SARS-CoV-2 variants (B.1.1.7, B.1.351 and P.1) have evolved more stable viral binding via more focused interactions between the viral N501 and ACE2 K353 alone.

Rynkiewicz Patrick, Lynch Miranda L, Cui Feng, Hudson André O, Babbitt Gregory A

2021-Jul-21

COVID 19, Molecular dynamics, molecular evolution, viral binding

Radiology Radiology

Radiologists' Increasing Role in Population Health Management: AJR Expert Panel Narrative Review.

In AJR. American journal of roentgenology

Population health management (PHM) is the holistic process of improving health outcomes of groups of individuals through the support of appropriate financial and care models. Radiologists' presence at the intersection of many aspects of healthcare, including screening, diagnostic imaging, and image-guided therapies, provides significant opportunity for increased radiologist engagement in PHM. Further, innovations in artificial intelligence and imaging informatics will serve as critical tools to improve value in healthcare through evidence-based and equitable approaches. Given radiologists' limited engagement in PHM to date, it is imperative to define the specialty's PHM priorities so that the radiologists' full value in improving population health is realized. In this expert review, we explore programs and future directions for radiology in PHM.

Porembka Jessica H, Lee Ryan K, Spalluto Lucy B, Yee Judy, Krishnaraj Arun, Zaidi Syed, Brewington Cecelia

2021-Jul-21

AUC, Abdominal aortic aneurysm screening, Accountable care organization, Artificial intelligence, BPCI, Breast cancer screening, Care coordination, Clinical decision support, Colorectal cancer screening, HEDIS, Lung cancer screening, MACRA, MIPS, Opportunistic imaging, PAMA, Population health, Population health management, Radiology, Screening, Value-based care

Radiology Radiology

Deep learning to automate the labelling of head MRI datasets for computer vision applications.

In European radiology ; h5-index 62.0

OBJECTIVES : The purpose of this study was to build a deep learning model to derive labels from neuroradiology reports and assign these to the corresponding examinations, overcoming a bottleneck to computer vision model development.

METHODS : Reference-standard labels were generated by a team of neuroradiologists for model training and evaluation. Three thousand examinations were labelled for the presence or absence of any abnormality by manually scrutinising the corresponding radiology reports ('reference-standard report labels'); a subset of these examinations (n = 250) were assigned 'reference-standard image labels' by interrogating the actual images. Separately, 2000 reports were labelled for the presence or absence of 7 specialised categories of abnormality (acute stroke, mass, atrophy, vascular abnormality, small vessel disease, white matter inflammation, encephalomalacia), with a subset of these examinations (n = 700) also assigned reference-standard image labels. A deep learning model was trained using labelled reports and validated in two ways: comparing predicted labels to (i) reference-standard report labels and (ii) reference-standard image labels. The area under the receiver operating characteristic curve (AUC-ROC) was used to quantify model performance. Accuracy, sensitivity, specificity, and F1 score were also calculated.
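
Model performance here is summarized by the area under the ROC curve, which has a convenient rank-based interpretation: the probability that a randomly chosen positive examination receives a higher score than a randomly chosen negative one. A minimal pure-Python sketch of that computation (illustrative only; the study would have used a standard library implementation):

```python
def auc_roc(labels, scores):
    """Rank-based AUC-ROC: P(score of a random positive > score of a random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    # Ties between a positive and a negative score count as half a "win".
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Predictions that rank every abnormal examination above every normal one give an AUC of 1.0; fully reversed rankings give 0.0.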

RESULTS : Accurate classification (AUC-ROC > 0.95) was achieved for all categories when tested against reference-standard report labels. A drop in performance (ΔAUC-ROC > 0.02) was seen for three categories (atrophy, encephalomalacia, vascular) when tested against reference-standard image labels, highlighting discrepancies in the original reports. Once trained, the model assigned labels to 121,556 examinations in under 30 min.

CONCLUSIONS : Our model accurately classifies head MRI examinations, enabling automated dataset labelling for downstream computer vision applications.

KEY POINTS : • Deep learning is poised to revolutionise image recognition tasks in radiology; however, a barrier to clinical adoption is the difficulty of obtaining large labelled datasets for model training. • We demonstrate a deep learning model which can derive labels from neuroradiology reports and assign these to the corresponding examinations at scale, facilitating the development of downstream computer vision models. • We rigorously tested our model by comparing labels predicted on the basis of neuroradiology reports with two sets of reference-standard labels: (1) labels derived by manually scrutinising each radiology report and (2) labels derived by interrogating the actual images.

Wood David A, Kafiabadi Sina, Al Busaidi Aisha, Guilhem Emily L, Lynch Jeremy, Townend Matthew K, Montvila Antanas, Kiik Martin, Siddiqui Juveria, Gadapa Naveen, Benger Matthew D, Mazumder Asif, Barker Gareth, Ourselin Sebastian, Cole James H, Booth Thomas C

2021-Jul-20

Data curation, Deep learning, Magnetic resonance imaging, Natural language processing, Radiology

Radiology Radiology

Performance of automatic machine learning versus radiologists in the evaluation of endometrium on computed tomography.

In Abdominal radiology (New York)

PURPOSE : In this study, we developed radiomic models that utilize a combination of imaging features and clinical variables to distinguish endometrial cancer (EC) from normal endometrium on routine computed tomography (CT).

METHODS : A total of 926 patients, consisting of 416 with endometrial cancer (EC) and 510 with normal endometrium, were included. The CT images of these patients were segmented manually and divided into training, validation, testing, and external testing sets. Non-texture and texture features of these images, with the endometrium or uterus as the region of interest, were extracted. The clinical feature "age" was also included in the feature set. A feature selection method and a machine learning classifier were applied to the normalized feature set. This manually optimized combination was then compared with the best pipeline exported by the Tree-Based Pipeline Optimization Tool (TPOT) on the testing and external testing sets. The performance of these machine learning pipelines was compared to that of radiologists.

RESULTS : On the external testing set, the manually optimized expert pipeline, using the "reliefF" feature selection method and the "Bagging" classifier, achieved a test ROC AUC of 0.73, accuracy of 0.73 (95% CI 0.62-0.82), sensitivity of 0.64 (95% CI 0.45-0.79), and specificity of 0.78 (95% CI 0.65-0.87), while TPOT achieved a test ROC AUC of 0.79, accuracy of 0.80 (95% CI 0.70-0.87), sensitivity of 0.61 (95% CI 0.43-0.77), and specificity of 0.90 (95% CI 0.78-0.96). When compared to average radiologist performance, TPOT achieved higher test accuracy (0.80 vs. 0.49, p < 0.001) and specificity (0.90 vs. 0.51, p < 0.001), with comparable sensitivity (0.61 vs. 0.46, p = 0.130).
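
The accuracy, sensitivity, and specificity figures above all derive from a 2x2 confusion matrix. A small pure-Python sketch of these metrics (illustrative; the function name is mine, not from the paper's code):

```python
def confusion_stats(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and specificity from binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }
```

The TPOT result above is a typical trade-off: a higher decision threshold raises specificity (0.90) at the cost of sensitivity (0.61).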

CONCLUSION : Our results demonstrate that automatic machine learning can distinguish EC from normal endometrium on routine CT imaging with higher accuracy and specificity than radiologists.

Li Dan, Hu Rong, Li Huizhou, Cai Yeyu, Zhang Paul J, Wu Jing, Zhu Chengzhang, Bai Harrison X

2021-Jul-21

Automatic machine learning, Computed tomography, Endometrial cancer, Radiomics

General General

Hydrogen storage in MOFs: Machine learning for finding a needle in a haystack.

In Patterns (New York, N.Y.)

In recent years, machine learning (ML) has grown exponentially within the field of structure-property prediction in materials science. In this issue of Patterns, Ahmed and Siegel scrutinize several redeveloped ML techniques for systematic investigation of over 900,000 metal-organic framework (MOF) structures, taken from 19 databases, to discover new, potentially record-breaking, hydrogen-storage materials.

Glasby Lawson T, Moghadam Peyman Z

2021-Jul-09

General General

Advancing sensory neuroprosthetics using artificial brain networks.

In Patterns (New York, N.Y.)

Implementation of effective brain or neural stimulation protocols for restoration of complex sensory perception, e.g., in the visual domain, is an unresolved challenge. By leveraging the capacity of deep learning to model the brain's visual system, optic nerve stimulation patterns could be derived that are predictive of neural responses of higher-level cortical visual areas in silico. This novel approach could be generalized to optimize different types of neuroprosthetics or bidirectional brain-computer interfaces (BCIs).

Haslacher David, Nasr Khaled, Soekadar Surjo R

2021-Jul-09

General General

Looking through glass: Knowledge discovery from materials science literature using natural language processing.

In Patterns (New York, N.Y.)

Most of the knowledge in materials science literature is in the form of unstructured data such as text and images. Here, we present a framework employing natural language processing, which automates text and image comprehension and precision knowledge extraction from inorganic glasses' literature. The abstracts are automatically categorized using latent Dirichlet allocation (LDA) to classify and search semantically linked publications. Similarly, a comprehensive summary of images and plots is presented using the caption cluster plot (CCP), providing direct access to images buried in the papers. Finally, we combine the LDA and CCP with chemical elements to present an elemental map, a topical and image-wise distribution of elements occurring in the literature. Overall, the framework presented here can be a generic and powerful tool to extract and disseminate material-specific information on composition-structure-processing-property dataspaces, allowing insights into fundamental problems relevant to the materials science community and accelerated materials discovery.
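
The elemental map described above ultimately rests on counting, per document, which chemical element symbols occur. A toy sketch of that counting step (the element subset and function name are mine; the paper's pipeline is far richer, combining LDA topics and caption clusters with the element counts):

```python
import re
from collections import Counter

ELEMENTS = ["Si", "Na", "Ca", "B", "Al"]  # illustrative subset of symbols

def elemental_map(abstracts):
    """Count how many abstracts mention each element symbol as a whole word."""
    counts = Counter()
    for text in abstracts:
        for element in ELEMENTS:
            if re.search(rf"\b{element}\b", text):
                counts[element] += 1
    return counts
```

Joining such counts with the LDA topic of each abstract gives the topical-and-elemental distribution the authors visualize.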

Venugopal Vineeth, Sahoo Sourav, Zaki Mohd, Agarwal Manish, Gosvami Nitya Nand, Krishnan N M Anoop

2021-Jul-09

artificial intelligence, glass science, knowledge discovery, materials science, natural language processing

General General

Structuring clinical text with AI: Old versus new natural language processing techniques evaluated on eight common cardiovascular diseases.

In Patterns (New York, N.Y.)

Free-text clinical notes in electronic health records are difficult to mine, while structured diagnostic codes can be missing or erroneous. To improve the quality of diagnostic codes, this work extracts diagnostic codes from free-text notes: five old and new word vectorization methods were used to vectorize Stanford progress notes and predict eight ICD-10 codes of common cardiovascular diseases with logistic regression. The models showed good performance, with TF-IDF as the best vectorization model, showing the highest AUROC (0.9499-0.9915) and AUPRC (0.2956-0.8072). The models also showed transferability when tested on MIMIC-III data, with AUROC from 0.7952 to 0.9790 and AUPRC from 0.2353 to 0.8084. Model interpretability was demonstrated by important words with clinical meanings matching each disease. This study shows the feasibility of accurately extracting structured diagnostic codes, imputing missing codes, and correcting erroneous codes from free-text clinical notes for information retrieval and downstream machine-learning applications.
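
TF-IDF, the best-performing vectorization here, weights a word by its frequency within a note and discounts words common across all notes. A bare-bones sketch of the weighting (real pipelines such as scikit-learn's add smoothing and normalization, so exact values differ):

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each whitespace-tokenized document to {term: tf * idf} weights."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()                       # document frequency per term
    for tokens in tokenized:
        df.update(set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors
```

A term appearing in every note gets weight zero; rarer, note-specific terms dominate the vector, which is what lets a linear model pick out disease-specific vocabulary.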

Zhan Xianghao, Humbert-Droz Marie, Mukherjee Pritam, Gevaert Olivier

2021-Jul-09

ICD-10 codes, cardiovascular disease, clinical notes, interpretability, natural language processing

oncology Oncology

Exploring perceptions of healthcare technologies enabled by artificial intelligence: an online, scenario-based survey.

In BMC medical informatics and decision making ; h5-index 38.0

BACKGROUND : Healthcare is expected to increasingly integrate technologies enabled by artificial intelligence (AI) into patient care. Understanding perceptions of these tools is essential to successful development and adoption. This exploratory study gauged participants' level of openness, concern, and perceived benefit associated with AI-driven healthcare technologies. We also explored socio-demographic, health-related, and psychosocial correlates of these perceptions.

METHODS : We developed a measure depicting six AI-driven technologies that either diagnose, predict, or suggest treatment. We administered the measure via an online survey to adults (N = 936) in the United States using MTurk, a crowdsourcing platform. Participants indicated their level of openness to using the AI technology in the healthcare scenario. Items reflecting potential concerns and benefits associated with each technology accompanied the scenarios. Participants rated the extent to which the statements of concerns and benefits influenced their perception of favorability toward the technology. Participants completed measures of socio-demographics, health variables, and psychosocial variables such as trust in the healthcare system and trust in technology. Exploratory and confirmatory factor analyses of the concern and benefit items identified two factors representing overall level of concern and perceived benefit. Descriptive analyses examined levels of openness, concern, and perceived benefit. Correlational analyses explored associations of socio-demographic, health, and psychosocial variables with openness, concern, and benefit scores, while multivariable regression models examined these relationships concurrently.

RESULTS : Participants were moderately open to AI-driven healthcare technologies (M = 3.1/5.0 ± 0.9), but there was variation depending on the type of application, and the statements of concerns and benefits swayed views. Trust in the healthcare system and trust in technology were the strongest, most consistent correlates of openness, concern, and perceived benefit. Most other socio-demographic, health-related, and psychosocial variables were less strongly, or not, associated, but multivariable models indicated some personality characteristics (e.g., conscientiousness and agreeableness) and socio-demographics (e.g., full-time employment, age, sex, and race) were modestly related to perceptions.

CONCLUSIONS : Participants' openness appears tenuous, suggesting early promotion strategies and experiences with novel AI technologies may strongly influence views, especially if implementation of AI technologies increases or undermines trust. The exploratory nature of these findings warrants additional research.

Antes Alison L, Burrous Sara, Sisk Bryan A, Schuelke Matthew J, Keune Jason D, DuBois James M

2021-Jul-20

Acceptance of healthcare, Artificial intelligence, Benefits, Bioethics, Concerns, Machine learning, Openness, Perceptions

General General

Functional binding dynamics relevant to the evolution of zoonotic spillovers in endemic and emergent Betacoronavirus strains.

In Journal of biomolecular structure & dynamics

Comparative functional analysis of the dynamic interactions between various Betacoronavirus mutant strains and broadly utilized target proteins such as ACE2 and CD26 is crucial for a more complete understanding of zoonotic spillovers of viruses that cause diseases such as COVID-19. Here, we apply machine learning to replicated sets of nanosecond-scale, GPU-accelerated molecular dynamics simulations to statistically compare and classify atom motions of these target proteins in both the presence and absence of different endemic and emergent strains of the viral receptor binding domain (RBD) of the S spike glycoprotein. A multi-agent classifier successfully identified functional binding dynamics that are evolutionarily conserved from bat CoV-HKU4 to human endemic/emergent strains. Conserved dynamics regions of ACE2 involve the N-terminal helices as well as a region of more transient dynamics encompassing residues K353, Q325 and a novel motif AAQPFLL 386-92 that appears to coordinate their dynamic interactions with the viral RBD at N501. We also demonstrate that the functional evolution of Betacoronavirus zoonotic spillovers involving ACE2 interaction dynamics was likely pre-adapted from two precise and stable binding sites involving the viral bat progenitor strain's interaction with CD26 at SAMLI 291-5 and SS 333-334. Our analyses further indicate that the human endemic strains hCoV-HKU1 and hCoV-OC43 have evolved more stable N-terminal helix interactions through enhancement of an interfacing loop region on the viral RBD, whereas the highly transmissible SARS-CoV-2 variants (B.1.1.7, B.1.351 and P.1) have evolved more stable viral binding via more focused interactions between the viral N501 and ACE2 K353 alone. Communicated by Ramaswamy H. Sarma.

Rynkiewicz Patrick, Lynch Miranda L, Cui Feng, Hudson André O, Babbitt Gregory A

2021-Jul-21

COVID 19, Molecular dynamics, molecular evolution, viral binding

General General

Segmentation of Cardiac Structures via Successive Subspace Learning with Saab Transform from Cine MRI

ArXiv Preprint

Assessment of cardiovascular disease (CVD) with cine magnetic resonance imaging (MRI) has been used to non-invasively evaluate detailed cardiac structure and function. Accurate segmentation of cardiac structures from cine MRI is a crucial step for early diagnosis and prognosis of CVD, and has been greatly improved with convolutional neural networks (CNN). There are, however, a number of limitations identified in CNN models, such as limited interpretability and high complexity, that limit their use in clinical practice. In this work, to address these limitations, we propose a lightweight and interpretable machine learning model, successive subspace learning with the subspace approximation with adjusted bias (Saab) transform, for accurate and efficient segmentation from cine MRI. Specifically, our segmentation framework is comprised of the following steps: (1) sequential expansion of near-to-far neighborhood at different resolutions; (2) channel-wise subspace approximation using the Saab transform for unsupervised dimension reduction; (3) class-wise entropy-guided feature selection for supervised dimension reduction; (4) concatenation of features and pixel-wise classification with gradient boost; and (5) conditional random field for post-processing. Experimental results on the ACDC 2017 segmentation database showed that our framework performed better than state-of-the-art U-Net models with 200$\times$ fewer parameters in delineating the left ventricle, right ventricle, and myocardium, thus showing its potential to be used in clinical practice.
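
The Saab transform at the core of step (2) is, in essence, a data-driven PCA with a bias adjustment: a mean ("DC") component plus the leading principal components ("AC kernels") of local patches. A simplified numpy sketch of that idea (my simplification, not the authors' implementation; the actual Saab transform additionally adds a bias to keep responses non-negative):

```python
import numpy as np

def saab_like_transform(patches, num_kernels):
    """Project flattened patches onto a DC (mean) component plus top PCA kernels."""
    dc = patches.mean(axis=1, keepdims=True)        # per-patch mean response
    ac = patches - dc                               # zero-mean residual
    cov = np.cov(ac, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:num_kernels]   # highest-variance directions
    return np.hstack([dc, ac @ eigvecs[:, top]])
```

Stacking such transforms over growing neighborhoods yields the "successive subspace" hierarchy without any backpropagation, which is what keeps the parameter count low.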

Xiaofeng Liu, Fangxu Xing, Hanna K. Gaggin, Weichung Wang, C. -C. Jay Kuo, Georges El Fakhri, Jonghye Woo

2021-07-22

General General

A machine learning framework to optimize optic nerve electrical stimulation for vision restoration.

In Patterns (New York, N.Y.)

Optic nerve electrical stimulation is a promising technique to restore vision in blind subjects. Machine learning methods can be used to select effective stimulation protocols, but they require a model of the stimulated system to generate enough training data. Here, we use a convolutional neural network (CNN) as a model of the ventral visual stream. A genetic algorithm drives the activation of the units in a layer of the CNN representing a cortical region toward a desired pattern, by refining the activation imposed at a layer representing the optic nerve. To simulate the pattern of activation elicited by the sites of an electrode array, a simple point-source model was introduced and its optimization process was investigated for static and dynamic scenes. Psychophysical data confirm that our stimulation evolution framework produces results compatible with natural vision. Machine learning approaches could become a very powerful tool to optimize and personalize neuroprosthetic systems.
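
The optimization loop described above, a genetic algorithm refining stimulation vectors toward a desired activation, can be sketched in a few lines of pure Python (all parameter names and settings here are illustrative; the paper's fitness function is a CNN-layer activation match, replaced here by an arbitrary callable):

```python
import random

def genetic_search(fitness, dim, pop_size=30, generations=40, mut_rate=0.1, seed=0):
    """Minimal real-valued GA: truncation selection, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)        # best candidates first
        parents = pop[: pop_size // 2]             # keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [x + rng.gauss(0, 0.2) if rng.random() < mut_rate else x
                     for x in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In the paper's setting, `fitness` would score how closely the CNN activation elicited by a candidate stimulation pattern matches the desired cortical pattern.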

Romeni Simone, Zoccolan Davide, Micera Silvestro

2021-Jul-09

convolutional neural networks, genetic algorithms, neuroprosthetics, optic nerve stimulation, optimization, sensory restoration, vision restoration

General General

Predicting material microstructure evolution via data-driven machine learning.

In Patterns (New York, N.Y.)

Predicting microstructure evolution can be a formidable challenge, yet it is essential to building microstructure-processing-property relationships. Yang et al. offer a new solution to traditional partial differential equation-based simulations: a data-driven machine learning approach motivated by the practical needs to accelerate the materials design process and deal with incomplete information in the real world of microstructure simulation.

Kautz Elizabeth J

2021-Jul-09

Public Health Public Health

Privacy-preserving data sharing via probabilistic modeling.

In Patterns (New York, N.Y.)

Differential privacy allows quantifying the privacy loss resulting from accessing sensitive personal data. Repeated accesses to the underlying data incur increasing loss. Releasing data as privacy-preserving synthetic data would avoid this limitation but leaves open the question of what kind of synthetic data to design. We propose formulating the problem of private data release through probabilistic modeling. This approach transforms the problem of designing the synthetic data into choosing a model for the data, also allowing the inclusion of prior knowledge, which improves the quality of the synthetic data. We demonstrate empirically, in an epidemiological study, that statistical discoveries can be reliably reproduced from the synthetic data. We expect the method to have broad use in creating high-quality anonymized data twins of key datasets for research.
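
As a toy illustration of the release-once idea — spend the privacy budget on a noisy summary, then sample synthetic data from it indefinitely — here is a Laplace-mechanism histogram release in pure Python (a deliberately crude stand-in for the paper's full probabilistic-modeling framework; all names are mine):

```python
import math
import random

def laplace_noise(rng, scale):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * (1 if u >= 0 else -1) * math.log(1 - 2 * abs(u))

def dp_synthetic_sample(data, bins, epsilon, n_samples, seed=0):
    """Release a Laplace-noised histogram, then sample synthetic values from it."""
    rng = random.Random(seed)
    counts = [sum(lo <= x < hi for x in data) for lo, hi in bins]
    # Each record touches one bin, so sensitivity is 1 and the noise scale is 1/epsilon.
    noisy = [max(0.0, c + laplace_noise(rng, 1.0 / epsilon)) for c in counts]
    total = sum(noisy) or 1.0
    idx = rng.choices(range(len(bins)), weights=[c / total for c in noisy], k=n_samples)
    return [rng.uniform(*bins[i]) for i in idx]
```

Downstream analyses then touch only the synthetic sample, so repeated queries incur no further privacy loss.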

Jälkö Joonas, Lagerspetz Eemil, Haukka Jari, Tarkoma Sasu, Honkela Antti, Kaski Samuel

2021-Jul-09

differential privacy, machine learning, open data, probabilistic modeling, synthetic data

General General

An in silico drug repositioning workflow for host-based antivirals.

In STAR protocols

Drug repositioning represents a cost- and time-efficient strategy for drug development. Artificial intelligence-based algorithms have been applied in drug repositioning by predicting drug-target interactions in an efficient and high throughput manner. Here, we present a workflow of in silico drug repositioning for host-based antivirals using specially defined targets, a refined list of drug candidates, and an easily implemented computational framework. The workflow described here can also apply to more general purposes, especially when given a user-defined druggable target gene set. For complete details on the use and execution of this protocol, please refer to Li et al. (2021).

Li Zexu, Yao Yingjia, Cheng Xiaolong, Li Wei, Fei Teng

2021-Sep-17

Bioinformatics, High Throughput Screening, Immunology, Microbiology, Molecular Biology, Structural Biology

General General

Qualification of Soybean Responses to Flooding Stress Using UAV-Based Imagery and Deep Learning.

In Plant phenomics (Washington, D.C.)

Soybean is sensitive to flooding stress, which may result in poor seed quality and significant yield reduction. Soybean production under flooding could be sustained by developing flood-tolerant cultivars through breeding programs. Conventionally, soybean tolerance to flooding in field conditions is evaluated by visually rating the shoot injury/damage due to flooding stress, which is labor-intensive and subject to human error. Recent developments in field high-throughput phenotyping technology have shown great potential for measuring crop traits and detecting crop responses to abiotic and biotic stresses. The goal of this study was to investigate the potential of estimating flood-induced soybean injuries using UAV-based image features collected at different flight heights. The flooding injury score (FIS) of 724 soybean breeding plots was rated visually by breeders when soybean showed obvious injury symptoms. Aerial images were taken on the same day using a five-band multispectral camera and an infrared (IR) thermal camera at 20, 50, and 80 m above ground. Five image features, i.e., canopy temperature, normalized difference vegetation index, canopy area, width, and length, were extracted from the images at the three flight heights. A deep learning model was used to classify the soybean breeding plots into five FIS ratings based on the extracted image features. Results show that the image features were significantly different at the three flight heights. The best classification performance, 0.9 for the five-level FIS, was obtained by the model developed using image features at 20 m. The results indicate that the proposed method is very promising for estimating FIS in soybean breeding.
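
Of the five image features, the normalized difference vegetation index (NDVI) has a standard closed form, (NIR - Red) / (NIR + Red), computed per pixel from the multispectral bands. A minimal sketch (band extraction and radiometric calibration are camera-specific and omitted here):

```python
def ndvi(nir, red):
    """Per-pixel NDVI: (NIR - Red) / (NIR + Red), with 0.0 for empty denominators."""
    return [(n - r) / (n + r) if (n + r) else 0.0 for n, r in zip(nir, red)]
```

Healthy canopy reflects strongly in near-infrared and absorbs red, pushing NDVI toward 1; flood-injured plots drift toward 0, which is why NDVI discriminates between FIS ratings.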

Zhou Jing, Mou Huawei, Zhou Jianfeng, Ali Md Liakat, Ye Heng, Chen Pengyin, Nguyen Henry T

2021

General General

Machine learning identifies novel markers predicting functional decline in older adults.

In Brain communications

The ability to carry out instrumental activities of daily living, such as paying bills, remembering appointments, and shopping alone, decreases with age, yet there are remarkable individual differences in the rate of decline among older adults. Understanding variables associated with a decline in instrumental activities of daily living is critical to providing appropriate intervention to prolong independence. Prior research suggests that cognitive measures, neuroimaging and fluid-based biomarkers predict functional decline. However, a priori selection of variables can lead to the over-valuation of certain variables and exclusion of others that may be predictive. In this study, we used machine learning techniques to select a wide range of baseline variables that best predicted functional decline in two years in individuals from the Alzheimer's Disease Neuroimaging Initiative dataset. The sample included 398 individuals characterized as cognitively normal or mild cognitive impairment. Support vector machine classification algorithms were used to identify the most predictive modality from five different data modality types (demographics, structural MRI, fluorodeoxyglucose-PET, neurocognitive and genetic/fluid-based biomarkers). In addition, variable selection identified individual variables across all modalities that best predicted functional decline in a testing sample. Of the five modalities examined, neurocognitive measures demonstrated the best accuracy in predicting functional decline (accuracy = 74.2%; area under the curve = 0.77), followed by fluorodeoxyglucose-PET (accuracy = 70.8%; area under the curve = 0.66). The individual variables with the greatest discriminatory ability for predicting functional decline included partner report of language in the Everyday Cognition questionnaire, the ADAS13, and activity of the left angular gyrus using fluorodeoxyglucose-PET. These three variables collectively explained 32% of the total variance in functional decline. Taken together, the machine learning model identified novel biomarkers that may be involved in the processing, retrieval, and conceptual integration of semantic information and which predict functional decline two years after assessment. These findings may be used to explore the clinical utility of the Everyday Cognition as a non-invasive, cost- and time-effective tool to predict future functional decline.
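
"Explained 32% of the total variance" is conventionally a coefficient of determination (R-squared) of 0.32 for the three-variable model. As a reminder of the computation (a generic sketch, not the authors' code):

```python
def r_squared(y_true, y_pred):
    """Fraction of outcome variance explained by the predictions: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_res / ss_tot
```

Perfect predictions give 1.0; always predicting the outcome mean gives 0.0.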

Valerio Kate E, Prieto Sarah, Hasselbach Alexander N, Moody Jena N, Hayes Scott M, Hayes Jasmeet P

2021-Jul

ADNI, IADL, angular gyrus, everyday cognition, machine learning

General General

The automation of doctors and machines: A classification for AI in medicine (ADAM framework).

In Future healthcare journal

The advances in artificial intelligence (AI) provide an opportunity to expand the frontier of medicine to improve diagnosis, efficiency and management. By extension of being able to perform any task that a human could, a machine that meets the requirements of artificial general intelligence ('strong' AI; AGI) possesses the basic necessities to perform as, or at least qualify to become, a doctor. In this emerging field, this article explores the distinctions between doctors and AGI, and the prerequisites for AGI performing as clinicians. In doing so, it necessitates the requirement for a classification of medical AI and prepares for the development of AGI. With its imminent arrival, it is beneficial to create a framework from which leading institutions can define specific criteria for AGI.

Kazzazi Fawz

2021-Jul

AI, artificial intelligence, clinical governance, medical technology

General General

The future of acute and emergency care.

In Future healthcare journal

Improved outcomes for acutely unwell patients are predicated on early identification of deterioration, accelerating the time to accurate diagnosis of the underlying condition, selection and titration of treatments that target biological phenotypes, and personalised endpoints to achieve optimal benefit yet minimise iatrogenic harm. Technological developments entering routine clinical practice over the next decade will deliver a sea change in patient management. Enhanced point of care diagnostics, more sophisticated physiological and biochemical monitoring with superior analytics and computer-aided support tools will all add considerable artificial intelligence to complement clinical skills. Experts in different fields of emergency and critical care medicine offer their perspectives as to which research developments could make a big difference within the next decade.

Newcombe Virginia, Coats Timothy, Dark Paul, Gordon Anthony, Harris Steve, McAuley Danny F, Menon David K, Price Susanna, Puthucheary Zudin, Singer Mervyn

2021-Jul

acute care, emergency care, precision medicine, stratified medicine

General General

Artificial intelligence in healthcare: transforming the practice of medicine.

In Future healthcare journal

Artificial intelligence (AI) is a powerful and disruptive area of computer science, with the potential to fundamentally transform the practice of medicine and the delivery of healthcare. In this review article, we outline recent breakthroughs in the application of AI in healthcare, describe a roadmap to building effective, reliable and safe AI systems, and discuss the possible future direction of AI augmented healthcare systems.

Bajwa Junaid, Munir Usman, Nori Aditya, Williams Bryan

2021-Jul

AI, digital health

General General

Sustaining COVID-19 pandemic lockdown era air pollution impact through utilization of more renewable energy resources.

In Heliyon

The lockdown engendered by the COVID-19 pandemic has impacted the environment positively by reducing emissions of greenhouse gases, CO2, CO and other pollutants into the atmosphere below pre-COVID-19 levels. There are fears that the environmental gains made during COVID-19 may be frittered away as nations around the world make serious efforts to boost the COVID-19-recessed economy through massive investments in sectors of the economy that are not environmentally friendly. This paper emphasizes the importance of maintaining COVID-19 pandemic-era environmental impact levels in the post-COVID-19 era without retarding efforts toward economic recovery. World Health Organization (WHO) data from six regions between April and August 2020 were evaluated. Emission levels during the COVID-19 lockdown were reviewed. The global renewable energy potentials were ascertained. The paper suggests that investment in renewable energy resources for various countries' energy needs will help sustain the green and clean environment created by the COVID-19 lockdown even after the lockdown era. Also, building large-scale and distributed energy storage infrastructure and applying artificial intelligence would ensure security of energy supply and handle the unstable nature of solar and wind energy. The COVID-19 lockdown significantly reduced air pollution. The application of biofuels to generate energy and power was found to significantly reduce air pollutant emissions, similar to the COVID-19 lockdown.

Rita Eyisi, Chizoo Esonye, Cyril Ume Sunday

2021-Jul

COVID-19 pandemic, Energy policy, Environmental pollution, Lockdown, Renewable energy

General General

Cortical thickness distinguishes between major depression and schizophrenia in adolescents.

In BMC psychiatry

BACKGROUND : Early diagnosis of adolescent psychiatric disorders is crucial for early intervention. However, there is extensive comorbidity between affective and psychotic disorders, which increases the difficulty of making precise diagnoses among adolescents.

METHODS : We obtained structural magnetic resonance imaging scans from 150 adolescents, including 67 patients with major depressive disorder (MDD), 47 patients with schizophrenia (SCZ), and 34 healthy controls (HC), to explore whether psychiatric disorders could be identified using a machine learning technique. Specifically, we used a support vector machine with the leave-one-out cross-validation method to distinguish among adolescents with MDD, adolescents with SCZ, and healthy controls.
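
Leave-one-out cross-validation trains on all subjects but one and tests on the single held-out subject, cycling through every subject. A pure-Python sketch of the harness (with a nearest-centroid classifier standing in for the paper's SVM; all names here are illustrative):

```python
def loocv_accuracy(X, y, train_fn, predict_fn):
    """Leave-one-out CV: hold out each sample once, train on the rest, score it."""
    correct = 0
    for i in range(len(X)):
        X_tr = X[:i] + X[i + 1:]
        y_tr = y[:i] + y[i + 1:]
        model = train_fn(X_tr, y_tr)
        correct += predict_fn(model, X[i]) == y[i]
    return correct / len(X)

def train_centroid(X, y):
    """Nearest-centroid stand-in for the paper's SVM classifier."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict_centroid(centroids, x):
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda label: dist(centroids[label], x))
```

With only 150 subjects, LOOCV makes the most of the data at the cost of fitting n models.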

RESULTS : We found that cortical thickness distinguished a) MDD from HC with 79.21% accuracy, where the temporal pole had the highest weight; and b) SCZ from HC with 69.88% accuracy, where the left superior temporal sulcus had the highest weight. Notably, adolescents with MDD and SCZ could be classified with 62.93% accuracy, where the right pars triangularis had the highest weight.

CONCLUSIONS : Our findings suggest that cortical thickness may be a critical biological feature in the diagnosis of adolescent psychiatric disorders. These findings might be helpful to establish an early prediction model for adolescents to better diagnose psychiatric disorders.

Zhou Zheyi, Wang Kangcheng, Tang Jinxiang, Wei Dongtao, Song Li, Peng Yadong, Fu Yixiao, Qiu Jiang

2021-Jul-20

Adolescence, Cortical thickness, Depression, Machine learning, Schizophrenia

General General

Sustaining COVID-19 pandemic lockdown era air pollution impact through utilization of more renewable energy resources.

In Heliyon

The lockdown engendered by the COVID-19 pandemic has had a positive impact on the environment, reducing emissions of greenhouse gases, CO2, CO, and other pollutants into the atmosphere below pre-COVID-19 levels. There are fears that the environmental gains made during COVID-19 may be frittered away as nations around the world make serious efforts to boost the pandemic-recessed economy through massive investments in sectors that are not environmentally friendly. This paper emphasizes the importance of maintaining the pandemic-era environmental impact levels in the post-COVID-19 era without retarding efforts toward economic recovery. World Health Organization (WHO) data from six regions between April and August 2020 were evaluated. Emission levels during the COVID-19 lockdown were reviewed. The global renewable energy potentials were ascertained. The paper suggests that investment in renewable energy resources for various countries' energy needs will help sustain the green and clean environment created by the COVID-19 lockdown even after the lockdown era. Also, building large-scale and distributed energy storage infrastructure and applying artificial intelligence would ensure security of energy supply and handle the unstable nature of solar and wind energy. The COVID-19 lockdown significantly reduced air pollution. The application of biofuels to generate energy and power was found to reduce air pollutant emissions to a degree similar to the COVID-19 lockdown.

Rita Eyisi, Chizoo Esonye, Cyril Ume Sunday

2021-Jul

COVID-19 pandemic, Energy policy, Environmental pollution, Lockdown, Renewable energy

General General

Deep 3D-CNN for Depression Diagnosis with Facial Video Recording of Self-Rating Depression Scale Questionnaire

ArXiv Preprint

The Self-Rating Depression Scale (SDS) questionnaire is commonly utilized for effective preliminary depression screening. However, this uncontrolled self-administered measure may be readily influenced by insouciant or dishonest responses, yielding findings that differ from the clinician-administered diagnosis. Facial expressions (FE) and behaviors are important in clinician-administered assessments but are underappreciated in self-administered evaluations. In this study, we use a new dataset of 200 participants to demonstrate the validity of self-rating questionnaires and their accompanying question-by-question video recordings. We offer an end-to-end system that handles the facial video recording, conditioned on the questionnaire answers and the response time, to automatically interpret depression from the SDS assessment and the associated video. We modified a 3D-CNN for temporal feature extraction and compared various state-of-the-art temporal modeling techniques. The superior performance of our system shows the validity of combining facial video recordings with the SDS score for more accurate self-diagnosis.

Wanqing Xie, Lizhong Liang, Yao Lu, Hui Luo, Xiaofeng Liu

2021-07-22

General General

Data about fall events and ordinary daily activities from a sensorized smart floor.

In Data in brief

A smart floor with 16 embedded pressure sensors was used to record 420 simulated fall events performed by 60 volunteers. Each participant performed seven fall events selected from the guidelines defined in a previous study. Raw data were grouped and well organized in CSV format. The data was collected for the development of a non-intrusive fall detection solution based on the smart floor. Indeed, the collected data can be used to further improve the current solution by proposing new fall detection techniques for the correct identification of accidental fall events on the smart floor. The gathered fall simulation data is associated with participants' demographic characteristics, useful for future expansions of the smart floor solution beyond the fall detection problem.

Tošić Aleksandar, Hrovatin Niki, Vičič Jernej

2021-Aug

Elderly, Fall detection, Machine learning, Sensor networks, Smart floor

General General

ERpred: a web server for the prediction of subtype-specific estrogen receptor antagonists.

In PeerJ

Estrogen receptors alpha and beta (ERα and ERβ) are responsible for breast cancer metastasis through their involvement in clinical outcomes. Estradiol and hormone replacement therapy target both ERs, but this often leads to an increased risk of breast and endometrial cancers as well as thromboembolism. A major challenge is posed for the development of compounds possessing ER subtype specificity. Herein, we present a large-scale classification structure-activity relationship (CSAR) study of inhibitors from the ChEMBL database, which consisted of an initial set of 11,618 compounds for ERα and 7,810 compounds for ERβ. The IC50 was selected as the bioactivity unit for further investigation, and after the data curation process this led to a final data set of 1,593 and 1,281 compounds for ERα and ERβ, respectively. We employed the random forest (RF) algorithm for model building, and of the 12 fingerprint types, the model built using the PubChem fingerprint was the most robust (accuracy of 94.65% and 92.25% and Matthews correlation coefficient (MCC) of 0.89 and 0.76 for ERα and ERβ, respectively) and was therefore selected for feature interpretation. Results indicated the importance of features pertaining to aromatic rings, nitrogen-containing functional groups, and aliphatic hydrocarbons. Finally, the model was deployed as the publicly available web server called ERpred at http://codes.bio/erpred, where users can submit SMILES notation as the input query for prediction of the bioactivity against ERα and ERβ.
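The modeling step described above, a random forest trained on binary fingerprint features and judged by MCC, looks roughly like the sketch below. The data are synthetic stand-ins: real inputs would be PubChem fingerprints computed from SMILES strings, which requires a cheminformatics toolkit not shown here.

```python
# Illustrative sketch (not the authors' pipeline): random forest on binary,
# fingerprint-like features, evaluated with the Matthews correlation coefficient.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import matthews_corrcoef
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(600, 100))      # 600 "compounds" x 100 fingerprint bits
y = (X[:, :5].sum(axis=1) >= 3).astype(int)  # activity tied to a few bits

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
mcc = matthews_corrcoef(y_te, clf.predict(X_te))
print(f"MCC: {mcc:.2f}")
```

The fitted forest's `feature_importances_` attribute supports the kind of feature interpretation (aromatic rings, nitrogen-containing groups) reported in the abstract.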

Schaduangrat Nalini, Malik Aijaz Ahmad, Nantasenamat Chanin

2021

Breast cancer, Data science, Estrogen, Estrogen receptor, Machine learning, QSAR, Quantitative structure-activity relationship

General General

Statistical and machine learning methods for spatially resolved transcriptomics with histology.

In Computational and structural biotechnology journal

Recent developments in spatially resolved transcriptomics (SRT) technologies have enabled scientists to get an integrated understanding of cells in their morphological context. Applications of these technologies in diverse tissues and diseases have transformed our views of transcriptional complexity. Most published studies utilized tools developed for single-cell RNA sequencing (scRNA-seq) for data analysis. However, SRT data exhibit different properties from scRNA-seq. To take full advantage of the added dimension on spatial location information in such data, new methods that are tailored for SRT are needed. Additionally, SRT data often have companion high-resolution histology information available. Incorporating histological features in gene expression analysis is an underexplored area. In this review, we will focus on the statistical and machine learning aspects for SRT data analysis and discuss how spatial location and histology information can be integrated with gene expression to advance our understanding of the transcriptional complexity. We also point out open problems and future research directions in this field.

Hu Jian, Schroeder Amelia, Coleman Kyle, Chen Chixiang, Auerbach Benjamin J, Li Mingyao

2021

Cell-cell communications, Celltype deconvolution, Spatial clustering, Spatially resolved transcriptomics, Spatially variable genes

General General

Integration strategies of multi-omics data for machine learning analysis.

In Computational and structural biotechnology journal

Increased availability of high-throughput technologies has generated an ever-growing number of omics data that seek to portray many different but complementary biological layers including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, only include one omic measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods/frameworks into five different integration strategies: early, mixed, intermediate, late and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies by paying special attention to machine learning applications.
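Two of the strategies named above can be made concrete with a minimal sketch on synthetic omics matrices: "early" integration concatenates features from all layers before modeling, while "late" integration fits one model per layer and combines their predictions. The data, layer names, and the choice of logistic regression are all illustrative assumptions.

```python
# Minimal sketch of early vs. late multi-omics integration on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 300
genomics = rng.normal(size=(n, 50))    # synthetic "genomics" layer
proteomics = rng.normal(size=(n, 30))  # synthetic "proteomics" layer
y = ((genomics[:, 0] + proteomics[:, 0]) > 0).astype(int)  # label spans layers

idx_tr, idx_te = train_test_split(np.arange(n), random_state=2)

# Early integration: one model on the concatenated feature matrix
X_all = np.hstack([genomics, proteomics])
early = LogisticRegression(max_iter=1000).fit(X_all[idx_tr], y[idx_tr])
acc_early = early.score(X_all[idx_te], y[idx_te])

# Late integration: one model per omic layer, predicted probabilities averaged
probs = []
for X in (genomics, proteomics):
    m = LogisticRegression(max_iter=1000).fit(X[idx_tr], y[idx_tr])
    probs.append(m.predict_proba(X[idx_te])[:, 1])
acc_late = np.mean((np.mean(probs, axis=0) > 0.5) == y[idx_te])
print(f"early: {acc_early:.2f}, late: {acc_late:.2f}")
```

Because the label here depends on both layers jointly, early integration can model the cross-layer signal directly, while late integration must recover it by combining per-layer predictions.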

Picard Milan, Scott-Boyer Marie-Pier, Bodein Antoine, Périn Olivier, Droit Arnaud

2021

Deep learning, Integration strategy, Machine learning, Multi-omics, Multi-view, Network

General General

Structure-based in silico approaches for drug discovery against Mycobacterium tuberculosis.

In Computational and structural biotechnology journal

Mycobacterium tuberculosis is the causative agent of TB and was estimated to cause 1.4 million deaths in 2019, alongside 10 million new infections. Drug resistance is a growing issue, with multi-drug resistant infections representing 3.3% of all new infections; hence, novel antimycobacterial drugs are urgently required to combat this growing health emergency. Alongside this, increased knowledge of gene essentiality in the pathogenic organism and larger compound databases can aid the discovery of new drug compounds. The number of protein structures, X-ray based and modelled, is increasing and now covers more than 80% of all predicted M. tuberculosis proteins, allowing novel targets to be investigated. This review will focus on structure-based in silico approaches for drug discovery, covering a range of complexities and computational demands, with associated antimycobacterial examples. These include molecular docking, molecular dynamics simulations, ensemble docking and free energy calculations. Applications of machine learning to each of these approaches will be discussed. Experimental validation of computational hits is an essential component, which is unfortunately missing from many current studies. The future outlooks of these approaches will also be discussed.

Kingdon Alexander D H, Alderwick Luke J

2021

Docking, Drug discovery, In silico, Machine learning, Mycobacterium tuberculosis, Tuberculosis; Abbreviations: CV, collective variable; LIE, linear interaction energy; MD, molecular dynamics; MDR, multi-drug resistant; MMPB(GB)SA, molecular mechanics with Poisson-Boltzmann (or generalised Born) and surface area solvation; Mt, Mycobacterium tuberculosis; PTC, peptidyl transferase centre; RMSD, root-mean-square deviation; TB, tuberculosis; cMD, classical molecular dynamics; cryo-EM, cryogenic electron microscopy; ns, nanosecond

Cardiology Cardiology

Clinical Feature-Based Machine Learning Model for 1-Year Mortality Risk Prediction of ST-Segment Elevation Myocardial Infarction in Patients with Hyperuricemia: A Retrospective Study.

In Computational and mathematical methods in medicine

Accurate risk assessment of high-risk patients is essential in clinical practice. However, there is no practical method to predict or monitor the prognosis of patients with ST-segment elevation myocardial infarction (STEMI) complicated by hyperuricemia. We aimed to evaluate the performance of different machine learning models for the prediction of 1-year mortality in STEMI patients with hyperuricemia. We compared five machine learning models (logistic regression, k-nearest neighbor, CatBoost, random forest, and XGBoost) with the traditional Global Registry of Acute Coronary Events (GRACE) risk score. We registered patients aged >18 years diagnosed with STEMI and hyperuricemia at the Affiliated Hospital of Zunyi Medical University between January 2016 and January 2020. Overall, 656 patients were enrolled (average age, 62.5 ± 13.6 years; 83.6% male). All patients underwent emergency percutaneous coronary intervention. We evaluated the performance of the five machine learning classifiers and the GRACE risk model in predicting 1-year mortality. The area under the curve (AUC) of the six models, including the GRACE risk model, ranged from 0.75 to 0.88. Among all the models, CatBoost had the highest predictive accuracy (0.89), AUC (0.87), precision (0.84), and F1 value (0.44). After hybrid sampling technique optimization, CatBoost had the highest accuracy (0.96), AUC (0.99), precision (0.95), and F1 value (0.97). Machine learning algorithms, especially the CatBoost model, can accurately predict the mortality associated with STEMI complicated by hyperuricemia after a 1-year follow-up.
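The model-comparison step described here, several classifiers scored by ROC AUC on the same held-out split, can be sketched as below. To keep the example self-contained, CatBoost and XGBoost are swapped for scikit-learn models, and the imbalanced synthetic data merely mimics a rare-outcome task like 1-year mortality.

```python
# Sketch of comparing classifiers by ROC AUC on one held-out split.
# CatBoost/XGBoost replaced by scikit-learn stand-ins; data are synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 656 samples with ~10% positives, mimicking a rare clinical outcome
X, y = make_classification(n_samples=656, n_features=20, weights=[0.9, 0.1],
                           random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbor": KNeighborsClassifier(),
    "random forest": RandomForestClassifier(random_state=7),
    "gradient boosting": GradientBoostingClassifier(random_state=7),
}
aucs = {name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()}
for name, auc in aucs.items():
    print(f"{name}: AUC = {auc:.2f}")
```

Stratifying the split preserves the outcome rate in both partitions, which matters when positives are this scarce; the paper's "hybrid sampling" step would additionally rebalance the training set before fitting.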

Bai Zhixun, Lu Jing, Li Ting, Ma Yi, Liu Zhijiang, Zhao Ranzun, Wang Zhenglong, Shi Bei

2021

General General

Multiscale Convolutional Neural Networks with Attention for Plant Species Recognition.

In Computational intelligence and neuroscience

Plant species recognition is a critical step in protecting plant diversity. Leaf-based plant species recognition research is important and challenging due to the large within-class difference and between-class similarity of leaves and the wide variation in leaf size, color, shape, texture, and venation. Most existing plant leaf recognition methods typically normalize all leaf images to the same size and then recognize them at one scale, which results in unsatisfactory performance. A novel multiscale convolutional neural network with attention (AMSCNN) model is constructed for plant species recognition. In AMSCNN, multiscale convolution is used to learn the low-frequency and high-frequency features of the input images, and an attention mechanism is utilized to capture rich contextual relationships for better feature extraction and improved network training. Extensive experiments on the plant leaf dataset demonstrate the remarkable performance of AMSCNN compared with hand-crafted feature-based methods and deep-neural-network-based methods. The maximum accuracy attained by AMSCNN is 95.28%.

Wang Xianfeng, Zhang Chuanlei, Zhang Shanwen

2021

Public Health Public Health

Health informatics publication trends in Saudi Arabia: a bibliometric analysis over the last twenty-four years.

In Journal of the Medical Library Association : JMLA

Objective : Understanding health informatics (HI) publication trends in Saudi Arabia may serve as a framework for future research efforts and contribute toward meeting national "e-Health" goals. The authors' intention was to understand the state of the HI field in Saudi Arabia by exploring publication trends and their alignment with national goals.

Methods : A scoping review was performed to identify HI publications from Saudi Arabia in PubMed, Embase, and Web of Science. We analyzed publication trends based on topics, keywords, and how they align with the Ministry of Health's (MOH's) "digital health journey" framework.

Results : The total number of publications included was 242. We found 1 (0.4%) publication in 1995-1999, 11 (4.5%) publications in 2000-2009, and 230 (95.0%) publications in 2010-2019. We categorized publications into 3 main HI fields and 4 subfields: 73.1% (n=177) of publications were in clinical informatics (85.1%, n=151 medical informatics; 5.6%, n=10 pharmacy informatics; 6.8%, n=12 nursing informatics; 2.3%, n=4 dental informatics); 22.3% (n=54) were in consumer health informatics; and 4.5% (n=11) were in public health informatics. The most common keyword was "medical informatics" (21.5%, n=52). MOH framework-based analysis showed that most publications were categorized as "digitally enabled care" and "digital health foundations."

Conclusions : The years of 2000-2009 may be seen as an infancy stage of the HI field in Saudi Arabia. Exploring how the Saudi Arabian MOH's e-Health initiatives may influence research is valuable for advancing the field. Data exchange and interoperability, artificial intelligence, and intelligent health enterprises might be future research directions in Saudi Arabia.

Binkheder Samar, Aldekhyyel Raniah, Almulhem Jwaher

2021-Apr-01

bibliometric analysis, biomedical informatics, clinical informatics, consumer health informatics, health informatics, public health informatics

Surgery Surgery

Development of a system to support warfarin dose decisions using deep neural networks.

In Scientific reports ; h5-index 158.0

The first aim of this study was to develop a prothrombin time international normalized ratio (PT INR) prediction model. The second aim was to develop a warfarin maintenance dose decision support system as a precise warfarin dosing platform. Data from 19,719 inpatients at three institutions were analyzed. The PT INR prediction algorithm included dense and recurrent neural networks, and was designed to predict the 5th-day PT INR from data of days 1-4. Data from patients in one hospital (n = 22,314) was used to train the algorithm, which was tested with the datasets from the other two hospitals (n = 12,673). The performance of 5th-day PT INR prediction was compared with 2000 predictions made by 10 expert physicians. A generator of individualized warfarin dose-PT INR tables, which simulated the repeated administration of varying doses of warfarin, was developed based on the prediction model. The algorithm outperformed humans in terms of predictions within ± 0.3 of the actual value (machine learning algorithm: 10,650/12,673 cases (84.0%); expert physicians: 1647/2000 cases (81.9%); P = 0.014). In the individualized warfarin dose-PT INR tables generated by the algorithm, the 8th-day PT INR predictions were within 0.3 of the actual value in 450/842 cases (53.4%). An artificial intelligence-based warfarin dosing algorithm using a recurrent neural network outperformed expert physicians in predicting future PT INRs. An individualized warfarin dose-PT INR table generator constructed based on this algorithm was acceptable.
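The forecasting framing described above, using days 1-4 of a series to predict day 5 and scoring predictions by the ± 0.3 accuracy band, can be sketched as follows. The paper used dense and recurrent neural networks; a plain linear regressor stands in here so the example stays self-contained, and the INR series are synthetic.

```python
# Sketch of the day 1-4 -> day 5 forecasting setup with synthetic PT INR series.
# A linear regressor stands in for the paper's dense/recurrent networks.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
# 200 synthetic patients: INR drifting upward over 5 days, with noise
base = rng.uniform(1.0, 2.0, size=(200, 1))
drift = np.linspace(0.0, 0.8, 5)
series = base + drift + rng.normal(scale=0.05, size=(200, 5))

X, y = series[:, :4], series[:, 4]          # inputs: days 1-4; target: day 5
model = LinearRegression().fit(X[:150], y[:150])
pred = model.predict(X[150:])
within_03 = np.mean(np.abs(pred - y[150:]) <= 0.3)  # the paper's accuracy band
print(f"within +/-0.3 of actual: {within_03:.1%}")
```

A recurrent network would consume the same day 1-4 values as a sequence rather than a flat feature vector, letting it also condition on covariates such as daily warfarin dose.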

Lee Heemoon, Kim Hyun Joo, Chang Hyoung Woo, Kim Dong Jung, Mo Jonghoon, Kim Ji-Eon

2021-Jul-20

General General

SpheroidPicker for automated 3D cell culture manipulation using deep learning.

In Scientific reports ; h5-index 158.0

Recent statistics report that more than 3.7 million new cases of cancer occur in Europe yearly, and the disease accounts for approximately 20% of all deaths. High-throughput screening of cancer cell cultures has dominated the search for novel, effective anticancer therapies in the past decades. Recently, functional assays with patient-derived ex vivo 3D cell cultures have gained importance for drug discovery and precision medicine. We recently evaluated the major advancements and needs for 3D cell culture screening, and concluded that strictly standardized and robust sample preparation is the most desired development. Here we propose an artificial intelligence-guided low-cost 3D cell culture delivery system. It consists of a light microscope, a micromanipulator, a syringe pump, and a controller computer. The system performs morphology-based feature analysis on spheroids and can select uniformly sized or shaped spheroids to transfer them between various sample holders. It can select the samples from standard sample holders, including Petri dishes and microwell plates, and then transfer them to a variety of holders up to 384-well plates. The device performs reliable semi- and fully automated spheroid transfer. This results in highly controlled experimental conditions and eliminates non-trivial side effects of sample variability, which is a key step toward next-generation precision medicine.

Grexa Istvan, Diosdi Akos, Harmati Maria, Kriston Andras, Moshkov Nikita, Buzas Krisztina, Pietiäinen Vilja, Koos Krisztian, Horvath Peter

2021-Jul-20

oncology Oncology

Radiogenomic and Deep Learning Network Approaches to Predict KRAS Mutation from Radiotherapy Plan CT.

In Anticancer research

BACKGROUND/AIM : We aimed to investigate the role of radiogenomic and deep learning approaches in predicting the KRAS mutation status of a tumor using radiotherapy planning computed tomography (CT) images in patients with locally advanced rectal cancer.

PATIENTS AND METHODS : After surgical resection, 30 (27.3%) of 110 patients were found to carry a KRAS mutation. For the radiogenomic model, a total of 378 texture features were extracted from the boost clinical target volume (CTV) in the radiotherapy planning CT images. For the deep learning model, we constructed a simple deep learning network that received a three-dimensional input from the CTV.

RESULTS : The predictive ability of the radiogenomic score model revealed an AUC of 0.73 for KRAS mutation, whereas the deep learning model demonstrated worse performance, with an AUC of 0.63.

CONCLUSION : The radiogenomic score model was a more feasible approach to predict KRAS status than the deep learning model.

Jang Bum-Sup, Song Changhoon, Kang Sung-Bum, Kim Jae-Sung

2021-Aug

KRAS, Radiogenomics, chemoradiation, clinical target volume, deep learning, rectal cancer

General General

A machine learning-based workflow for automatic detection of anomalies in machine tools.

In ISA transactions

Despite the increased sensor-based data collection in Industry 4.0, the practical use of this data is still in its infancy. In contrast, academic literature provides several approaches to detect machine failures but, in most cases, relies on simulations and vast amounts of training data. Since it is often not practical to collect such amounts of data in an industrial context, we propose an approach to detect the current production mode and machine degradation states on a comparably small data set. Our approach integrates domain knowledge about manufacturing systems into a highly generalizable end-to-end workflow ranging from raw data processing, phase segmentation, data resampling, and feature extraction to machine tool anomaly detection. The workflow applies unsupervised clustering techniques to identify the current production mode and supervised classification models for detecting the present degradation. A resampling strategy and classical machine learning models enable the workflow to handle small data sets and distinguish between normal and abnormal machine tool behavior. To the best of our knowledge, there exists no such end-to-end workflow in the literature that uses the entire machine signal as input to identify anomalies for individual tools. Our evaluation with data from a real multi-purpose machine shows that the proposed workflow detects anomalies with an average F1-score of almost 93%.
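A toy version of the two-stage workflow described above, unsupervised clustering to recover the production mode followed by a supervised classifier for degradation detection scored with F1, is sketched below. The sensor features, mode structure, and degradation shift are all synthetic assumptions standing in for real machine-tool signals.

```python
# Toy sketch of the two-stage workflow: unsupervised mode identification,
# then supervised anomaly (degradation) classification scored with F1.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
# Two production modes with clearly different feature baselines
mode = rng.integers(0, 2, size=400)
X = rng.normal(size=(400, 6)) + mode[:, None] * 3.0
# Degradation shifts a subset of the features for ~20% of samples
degraded = rng.random(400) < 0.2
X[degraded, :2] += 1.5

# Stage 1: identify the production mode without labels
est_mode = KMeans(n_clusters=2, n_init=10, random_state=5).fit_predict(X)

# Stage 2: classify normal vs. degraded, using the estimated mode as a feature
X2 = np.column_stack([X, est_mode])
X_tr, X_te, y_tr, y_te = train_test_split(X2, degraded, random_state=5)
clf = RandomForestClassifier(random_state=5).fit(X_tr, y_tr)
f1 = f1_score(y_te, clf.predict(X_te))
print(f"F1: {f1:.2f}")
```

Feeding the estimated mode into the classifier mirrors the workflow's idea that degradation must be judged relative to the current production mode, not against a single global baseline.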

Züfle Marwin, Moog Felix, Lesch Veronika, Krupitzer Christian, Kounev Samuel

2021-Jul-08

Anomaly detection, Clustering, Industrial Internet-of-Things, Industry 4.0, Predictive maintenance

General General

Artificial intelligence models for tooth-supported fixed and removable prosthodontics: A systematic review.

In The Journal of prosthetic dentistry ; h5-index 51.0

STATEMENT OF PROBLEM : Artificial intelligence applications are increasing in prosthodontics. Still, the current development and performance of artificial intelligence in prosthodontic applications have not yet been systematically documented and analyzed.

PURPOSE : The purpose of this systematic review was to assess the performance of the artificial intelligence models in prosthodontics for tooth shade selection, automation of restoration design, mapping the tooth preparation finishing line, optimizing the manufacturing casting, predicting facial changes in patients with removable prostheses, and designing removable partial dentures.

MATERIAL AND METHODS : An electronic systematic review was performed in MEDLINE/PubMed, EMBASE, Web of Science, Cochrane, and Scopus. A manual search was also conducted. Studies with artificial intelligence models were selected based on 6 criteria: tooth shade selection, automated fabrication of dental restorations, mapping the finishing line of tooth preparations, optimizing the manufacturing casting process, predicting facial changes in patients with removable prostheses, and designing removable partial dentures. Two investigators independently evaluated the quality assessment of the studies by applying the Joanna Briggs Institute Critical Appraisal Checklist for Quasi-Experimental Studies (nonrandomized experimental studies). A third investigator was consulted to resolve lack of consensus.

RESULTS : A total of 36 articles were reviewed and classified into 6 groups based on the application of the artificial intelligence model. One article reported on the development of an artificial intelligence model for tooth shade selection, reporting better shade matching than with conventional visual selection; 14 articles reported on the feasibility of automated design of dental restorations using different artificial intelligence models; 1 artificial intelligence model was able to mark the margin line without manual interaction with an average accuracy ranging from 90.6% to 97.4%; 2 investigations developed artificial intelligence algorithms for optimizing the manufacturing casting process, reporting an improvement of the design process, minimizing the porosity on the cast metal, and reducing the overall manufacturing time; 1 study proposed an artificial intelligence model that was able to predict facial changes in patients using removable prostheses; and 17 investigations that developed clinical decision support, expert systems for designing removable partial dentures for clinicians and educational purposes, computer-aided learning with video interactive programs for student learning, and automated removable partial denture design.

CONCLUSIONS : Artificial intelligence models have shown the potential for providing a reliable diagnostic tool for tooth shade selection, automated restoration design, mapping the preparation finishing line, optimizing the manufacturing casting, predicting facial changes in patients with removable prostheses, and designing removable partial dentures, but they are still in development. Additional studies are needed to further develop and assess their clinical performance.

Revilla-León Marta, Gómez-Polo Miguel, Vyas Shantanu, Barmak Basir A, Gallucci German O, Att Wael, Özcan Mutlu, Krishnamurthy Vinayak R

2021-Jul-16

General General

Data-driven approach for tailoring facilitation strategies to overcome implementation barriers in community pharmacy.

In Implementation science : IS

BACKGROUND : Implementation research has delved into barriers to implementing change and interventions for the implementation of innovation in practice. A gap remains, however, in connecting implementation barriers to the most effective implementation strategies and in providing a more tailored approach during implementation. This study aimed to explore barriers to the implementation of professional services in community pharmacies and to predict the effectiveness of facilitation strategies to overcome implementation barriers using machine learning techniques.

METHODS : Six change facilitators facilitated a 2-year change programme aimed at implementing professional services across community pharmacies in Australia. A mixed methods approach was used where barriers were identified by change facilitators during the implementation study. Change facilitators trialled and recorded tailored facilitation strategies delivered to overcome identified barriers. Barriers were coded according to implementation factors derived from the Consolidated Framework for Implementation Research and the Theoretical Domains Framework. Tailored facilitation strategies were coded into 16 facilitation categories. To predict the effectiveness of these strategies, data mining with random forest was used to provide the highest level of accuracy. A predictive resolution percentage was established for each implementation strategy in relation to the barriers that were resolved by that particular strategy.

RESULTS : During the 2-year programme, 1131 barriers and facilitation strategies were recorded by change facilitators. The most frequently identified barriers were a 'lack of ability to plan for change', 'lack of internal supporters for the change', 'lack of knowledge and experience', 'lack of monitoring and feedback', 'lack of individual alignment with the change', 'undefined change objectives', 'lack of objective feedback' and 'lack of time'. The random forest algorithm used was able to provide 96.9% prediction accuracy. The strategy category with the highest predicted resolution rate across the most number of implementation barriers was 'to empower stakeholders to develop objectives and solve problems'.

CONCLUSIONS : Results from this study have provided a better understanding of implementation barriers in community pharmacy and how data-driven approaches can be used to predict the effectiveness of facilitation strategies to overcome implementation barriers. Tailored facilitation strategies such as these can increase the rate of real-time implementation of innovations in healthcare, leading to an industry that can confidently and efficiently adapt to continuous change.

Moussa Lydia, Benrimoj Shalom, Musial Katarzyna, Kocbek Simon, Garcia-Cardenas Victoria

2021-Jul-19

Change facilitation, Change management, Determinants, Facilitation strategies, Implementation factors, Machine learning, Organisational change, Pharmacy practice, Random forest, Tailored interventions

General General

Growth rate-dependent flexural rigidity of microtubules influences pattern formation in collective motion.

In Journal of nanobiotechnology

BACKGROUND : Microtubules (MTs) are highly dynamic tubular cytoskeleton filaments that are essential for cellular morphology and intracellular transport. In vivo, the flexural rigidity of MTs can be dynamically regulated depending on their intracellular function. In the in vitro reconstructed MT-motor system, flexural rigidity affects MT gliding behaviors and trajectories. Despite the importance of flexural rigidity for both biological functions and in vitro applications, there is no clear interpretation of the regulation of MT flexural rigidity, and the results of many studies are contradictory. These discrepancies impede our understanding of the regulation of MT flexural rigidity, thereby challenging its precise manipulation.

RESULTS : Here, plausible explanations for these discrepancies are provided and a new method to evaluate MT rigidity is developed. Moreover, a new relationship between the dynamics and mechanics of MTs is revealed: MT flexural rigidity decreases through three phases as the growth rate increases, which offers a method of designing MT flexural rigidity by regulating the growth rate. To test the validity of this method, the gliding performances of MTs with different flexural rigidities, polymerized at different growth rates, are examined. The growth rate-dependent flexural rigidity of MTs is experimentally found to influence pattern formation in collective motion using a gliding motility assay, which is further validated using machine learning.

CONCLUSION : Our study establishes a robust quantitative method for measurement and design of MT flexural rigidity to study its influences on MT gliding assays, collective motion, and other biological activities in vitro. The new relationship about the growth rate and rigidity of MTs updates current concepts on the dynamics and mechanics of MTs and provides comparable data for investigating the regulation mechanism of MT rigidity in vivo in the future.

Zhou Hang, Isozaki Naoto, Fujimoto Kazuya, Yokokawa Ryuji

2021-Jul-19

Collective motion, Flexural rigidity, Growth rate, Localization precision, Microtubule

Pathology Pathology

HistoCartography: A Toolkit for Graph Analytics in Digital Pathology

ArXiv Preprint

Advances in entity-graph based analysis of histopathology images have brought in a new paradigm to describe tissue composition and to learn the tissue structure-to-function relationship. Entity-graphs offer flexible and scalable representations to characterize tissue organization, while allowing the incorporation of prior pathological knowledge to further support model interpretability and explainability. However, entity-graph analysis requires prerequisite knowledge of image-to-graph translation and of state-of-the-art machine learning algorithms for graph-structured data, which can hinder its adoption. In this work, we aim to alleviate these issues by developing HistoCartography, a standardized python API with the necessary preprocessing, machine learning, and explainability tools to facilitate graph analytics in computational pathology. Further, we have benchmarked the computational time and performance on multiple datasets across different imaging types and histopathology tasks to highlight the applicability of the API for building computational pathology workflows.

Guillaume Jaume, Pushpak Pati, Valentin Anklin, Antonio Foncubierta, Maria Gabrani

2021-07-21

General General

Recent advancement in nano-optical strategies for detection of pathogenic bacteria and their metabolites in food safety.

In Critical reviews in food science and nutrition ; h5-index 70.0

Pathogenic bacteria and their metabolites are the leading risk factor in food safety and one of the major threats to human health because of their capability to trigger diseases with high morbidity and mortality. Nano-optical sensors for bacteria sensing have been greatly explored with the emergence of nanotechnology and artificial intelligence. In addition, with the rapid development of cross-fusion technology, nano-optical sensors integrated with other technologies show great potential for sensing bacteria and their metabolites. This review focuses on nano-optical strategies for sensing bacteria and their metabolites in the field of food safety over the past three years, based on surface-enhanced Raman scattering (SERS), fluorescence, and colorimetric biosensors and their integration with microfluidic, electrochemical, and nucleic acid amplification platforms. Compared with traditional techniques, nano-optical sensors have greatly improved sensitivity while reducing detection time and cost. However, challenges remain for the simple fabrication of biosensors and their practical application in complex matrices. Thus, improvements or novelty in pretreatment methods will be a trend in the upcoming future.

Xu Yi, Hassan Md Mehedi, Sharma Arumugam Selva, Li Huanhuan, Chen Quansheng

2021-Jul-20

Nano optical sensor, bacterial metabolites, food safety, integrated platform, pathogenic bacteria

Surgery Surgery

A Point Cloud Generative Model via Tree-Structured Graph Convolutions for 3D Brain Shape Reconstruction

ArXiv Preprint

Fusing medical images and the corresponding 3D shape representation can provide complementary information and microstructure details to improve the operational performance and accuracy in brain surgery. However, compared to the substantial image data, it is almost impossible to obtain intraoperative 3D shape information using physical methods such as sensor scanning, especially in minimally invasive surgery and robot-guided surgery. In this paper, a general generative adversarial network (GAN) architecture based on graph convolutional networks is proposed to reconstruct the 3D point clouds (PCs) of brains from a single 2D image, thus relieving the limitation of acquiring 3D shape data during surgery. Specifically, a tree-structured generative mechanism is constructed to use the latent vector effectively and transfer features between hidden layers accurately. With the proposed generative model, a spontaneous image-to-PC conversion is performed in real time. Competitive qualitative and quantitative experimental results have been achieved with our model. Across multiple evaluation methods, the proposed model outperforms another common point cloud generative model, PointOutNet.

Bowen Hu, Baiying Lei, Yanyan Shen, Yong Liu, Shuqiang Wang

2021-07-21

General General

A deep learning method for single-trial EEG classification in RSVP task based on spatiotemporal features of ERPs.

In Journal of neural engineering ; h5-index 52.0

OBJECTIVE : Single-trial electroencephalography (EEG) classification is of great importance in the rapid serial visual presentation (RSVP) task. Convolutional neural networks (CNNs), as one of the mainstream deep learning methods, have been proven effective in extracting RSVP EEG features. However, most existing CNN models for EEG classification do not adequately account for the phase-locked characteristic of event-related potential (ERP) components in their architecture design. Here, we propose a novel CNN model that makes better use of the phase-locked characteristic to extract spatiotemporal features for single-trial RSVP EEG classification. Based on the phase-locked characteristic, the spatial distributions of the main ERP component in different periods can be learned separately.

APPROACH : In this work, we propose a novel CNN model to achieve superior performance on single-trial RSVP EEG classification. We introduce the combination of the standard convolutional layer, the permute layer and the depthwise convolutional layer to separately operate the spatial convolution in different periods, which more fully utilizes the phase-locked characteristic of ERPs for classification. We compare our model with several traditional and deep-learning methods in the classification performance. Moreover, we use spatial topography and saliency map to visually analyze the ERP features extracted by our model.

MAIN RESULTS : The results show that our model obtains better classification performance than the reference methods. The spatial topographies of each subject exhibit the typical ERP spatial distribution in different time periods, and the saliency map of each subject illustrates the discriminative electrodes and the meaningful temporal features.

SIGNIFICANCE : Our model is designed with better consideration of the phase-locked ERP characteristic and reaches excellent performance on single-trial RSVP EEG classification.

Zang Boyu, Lin Yanfei, Liu Zhiwen, Gao Xiaorong

2021-Jul-20

EEG, RSVP, convolutional neural network, deep learning, event-related potential

General General

Maneuverable gait selection for a novel fish-inspired robot using a CMAES-assisted workflow.

In Bioinspiration & biomimetics

Among underwater vehicles, fish-inspired designs are often selected for their efficient gaits; these designs, however, remain limited in their maneuverability, especially in confined spaces. This paper presents a new design for a fish-inspired robot with two degree-of-freedom pectoral fins and a single degree-of-freedom caudal fin. This robot has been designed to operate in open-channel canals in the presence of external disturbances. With the complex interactions of water in mind, the composition of goal-specific swimming gaits is trained via a machine learning workflow in which automated trials in the lab are used to select a subset of potential gaits for outdoor trials. The goal of this process is to minimize the time cost of outdoor experimentation through the identification and transfer of high-performing gaits with the understanding that, in the absence of complete replication of the intended target environment, some or many of these gaits must be eliminated in the real world. This process is motivated by the challenge of balancing the optimization of complex, high degree-of-freedom robots for disturbance-heavy, random, niche environments against the limitations of current machine learning techniques in real-world experiments, and has been used in the design process as well as across a number of locomotion goals. The key contribution of this paper involves finding strategies that leverage online learning methods to train a bio-inspired fish robot by identifying high-performing gaits that have a consistent performance both in the laboratory experiments and the intended operating environment. Using the workflow described herein, the resulting robot can reach a forward swimming speed of 0.385 m/s (0.71 body lengths per second) and can achieve a near-zero turning radius.
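The gait-selection workflow above is driven by an evolution strategy (CMA-ES). As a hedged illustration of the general idea only, not the paper's actual workflow, the sketch below runs a minimal (1+1) evolution strategy that mutates a two-parameter gait vector and keeps the mutant when a stand-in "swimming speed" objective improves; the objective, parameters, and settings are all invented.

```python
# Minimal (1+1) evolution strategy: hill-climb a parameter vector by Gaussian
# mutation, accepting only improvements. Illustrative stand-in for the
# CMA-ES-assisted gait search; not the robot's real objective or code.
import random

def es_optimize(fitness, x0, sigma=0.2, iters=200, seed=0):
    """Maximize `fitness` starting from x0 with isotropic Gaussian mutations."""
    rng = random.Random(seed)
    x, fx = list(x0), fitness(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0, sigma) for xi in x]
        fc = fitness(cand)
        if fc > fx:  # keep the mutant only if the gait performs better
            x, fx = cand, fc
    return x, fx

# Stand-in objective with a single peak at gait parameters (0.5, -0.2).
speed = lambda p: -((p[0] - 0.5) ** 2 + (p[1] + 0.2) ** 2)
best, best_speed = es_optimize(speed, [0.0, 0.0])
```

In the paper's setting the fitness evaluation is a physical (lab or outdoor) trial, which is why the workflow filters lab-selected gaits before spending outdoor experiment time.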

Sharifzadeh Mohammad, Jiang Yuhao, Salimi Lafmejani Amir, Nichols Kevin, Aukes Daniel McConnell

2021-Jul-20

Evolution strategy, Experimental training, Fish-inspired robot, Gait selection, Maneuverability, Pectoral fins, Training workflow

Radiology Radiology

Advances in micro-CT imaging of small animals.

In Physica medica : PM : an international journal devoted to the applications of physics to medicine and biology : official journal of the Italian Association of Biomedical Physics (AIFB)

PURPOSE : Micron-scale computed tomography (micro-CT) imaging is a ubiquitous, cost-effective, and non-invasive three-dimensional imaging modality. We review recent developments and applications of micro-CT for preclinical research.

METHODS : Based on a comprehensive review of recent micro-CT literature, we summarize features of state-of-the-art hardware and ongoing challenges and promising research directions in the field.

RESULTS : Representative features of commercially available micro-CT scanners and some new applications for both in vivo and ex vivo imaging are described. New advancements include spectral scanning using dual-energy micro-CT based on energy-integrating detectors or a new generation of photon-counting x-ray detectors (PCDs). Beyond two-material discrimination, PCDs enable quantitative differentiation of intrinsic tissues from one or more extrinsic contrast agents. When these extrinsic contrast agents are incorporated into a nanoparticle platform (e.g. liposomes), novel micro-CT imaging applications are possible such as combined therapy and diagnostic imaging in the field of cancer theranostics. Another major area of research in micro-CT is in x-ray phase contrast (XPC) imaging. XPC imaging opens CT to many new imaging applications because phase changes are more sensitive to density variations in soft tissues than standard absorption imaging. We further review the impact of deep learning on micro-CT. We feature several recent works which have successfully applied deep learning to micro-CT data, and we outline several challenges specific to micro-CT.
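The two-material discrimination mentioned above reduces, in its simplest textbook form, to inverting a small linear mixing model per voxel: attenuation at each energy is a weighted sum of basis-material attenuations. The sketch below solves that 2x2 system; the attenuation coefficients are invented for illustration and are not from any scanner.

```python
# Toy two-material decomposition for dual-energy CT: invert the 2x2 linear
# mixing model to recover basis-material fractions from measurements at two
# energies. Coefficients are hypothetical, for illustration only.
def decompose(m_low, m_high, basis):
    """Solve the 2x2 mixing system for basis-material fractions (a, b)."""
    (aL, bL), (aH, bH) = basis  # rows: low/high energy; cols: materials A, B
    det = aL * bH - bL * aH
    a = (m_low * bH - bL * m_high) / det
    b = (aL * m_high - m_low * aH) / det
    return a, b

# Hypothetical basis attenuations (1/cm) of materials A and B at two energies.
basis = ((0.20, 0.50), (0.18, 0.30))
# Simulate a 70/30 mixture, then recover the fractions.
m_low = 0.20 * 0.7 + 0.50 * 0.3
m_high = 0.18 * 0.7 + 0.30 * 0.3
a, b = decompose(m_low, m_high, basis)
print(round(a, 3), round(b, 3))  # recovers the 0.7 / 0.3 mixture
```

Photon-counting detectors extend this idea to more energy bins, allowing more than two basis materials (e.g. tissue plus one or more extrinsic contrast agents).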

CONCLUSIONS : All of these advancements establish micro-CT imaging at the forefront of preclinical research, able to provide anatomical, functional, and even molecular information while serving as a testbench for translational research.

Clark D P, Badea C T

2021-Jul-17

Contrast agents, Deep learning, Micro-CT, Nanoparticles, Phase contrast, Photon counting detector, Preclinical, Spectral CT, Theranostics

General General

Changes in alcohol use during the COVID-19 pandemic among American veterans.

In Addictive behaviors ; h5-index 60.0

BACKGROUND : The COVID-19 pandemic has had considerable behavioral health implications globally. One subgroup that may be of particular concern is U.S. veterans, who are susceptible to mental health and substance use concerns. The current study aimed to investigate changes in alcohol use and binge drinking before and during the first year of the pandemic among U.S. veterans, and how pre-pandemic mental health disorders, namely posttraumatic stress disorder (PTSD), and COVID-19-related factors like loneliness, negative reactions to COVID-19, and economic hardship influenced alcohol use trends.

METHODS : 1230 veterans were recruited in February 2020 as part of a larger survey study on veteran health behaviors. Veterans were asked to complete follow-up assessments throughout the pandemic at 6, 9, and 12 months.

RESULTS : Overall, veterans reported a significant decrease in alcohol use (IRR = 0.98) and binge drinking (IRR = 0.11). However, women, racial/ethnic minority veterans, and those with pre-existing PTSD exhibited smaller decreases in alcohol use and binge drinking and overall higher rates of use compared to men, White veterans, and those without PTSD. Both economic hardship and negative reactions to COVID-19 were associated with greater alcohol use and binge drinking, whereas loneliness showed a negative association with both.

CONCLUSIONS : Veterans reported decreases in alcohol use and binge drinking throughout the pandemic, with heterogeneity in these outcomes noted for higher risk groups. Special research and clinical attention should be given to the behavioral health care needs of veterans in the post-pandemic period.

Davis Jordan P, Prindle John, Castro Carl C, Saba Shaddy, Fitzke Reagan E, Pedersen Eric R

2021-Jul-15

Active duty, COVID-19, Drug use, Longitudinal, Trauma, Veterans administration

Ophthalmology Ophthalmology

Can artificial intelligence predict glaucomatous visual field progression?: A spatial-ordinal convolutional neural network model.

In American journal of ophthalmology ; h5-index 67.0

PURPOSE : To develop an artificial neural network model incorporating both spatial and ordinal approaches to predict glaucomatous visual field (VF) progression.

DESIGN : Cohort study.

PARTICIPANTS : From a cohort of primary open-angle glaucoma patients, 9,212 eyes of 6,047 patients who underwent regular reliable VF examinations for >4 years were included.

METHODS : We constructed all possible spatial-ordinal tensors by stacking three consecutive VF tests (VF-blocks) with at least 3 years of follow-up. Trend-based, event-based, and combined criteria were defined to determine progression. VF-blocks were considered "progressed" if progression occurred within 3 years; the progression was further confirmed after 3 years. We constructed six convolutional neural network (NN) models and two linear models: regression on global indices and pointwise linear regression (PLR). We compared the area under the receiver operating characteristic curve (AUROC) of each model for the prediction of glaucomatous VF progression.

RESULTS : Among 43,260 VF-blocks, 4,406 (10.2%), 4,376 (10.1%), and 2,394 (5.5%) were classified as progression based on the trend-based, event-based, and combined criteria, respectively. For all three criteria, the progression group was significantly older and had worse initial mean deviation (MD) and visual field index (VFI) than the non-progression group (p < 0.001 for all). The best-performing NN model had an AUROC of 0.864, with a sensitivity of 0.42 at a specificity of 0.95. In contrast, PLR yielded an AUROC of 0.611, with a sensitivity of 0.28 at a specificity of 0.84.
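The AUROC figures reported above can be computed without any plotting: the AUROC equals the probability that a randomly chosen progressed case is scored higher than a non-progressed one (the rank/Mann-Whitney formulation). The sketch below shows this from scratch on made-up labels and scores, not the study's data.

```python
# Rank-based AUROC: fraction of (positive, negative) pairs where the positive
# receives the higher score, counting ties as half. Toy data for illustration.
def auroc(labels, scores):
    """Area under the ROC curve for binary labels and real-valued scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0, 0]       # 1 = progressed VF-block (hypothetical)
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
print(round(auroc(labels, scores), 3))  # 0.917
```

A single (sensitivity, specificity) pair, as reported for PLR, corresponds to only one operating point on this curve, which is why it understates a model's discrimination relative to the full AUROC.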

CONCLUSIONS : The NN models incorporating spatial-ordinal characteristics demonstrated significantly better performance than the linear models in the prediction of glaucomatous VF progression.

Shon Kilhwan, Sung Kyung Rim, Shin Joong Won

2021-Jul-17

artificial intelligence, glaucoma, machine learning, visual field

General General

Improve automatic detection of animal call sequences with temporal context.

In Journal of the Royal Society, Interface

Many animals rely on long-form communication, in the form of songs, for vital functions such as mate attraction and territorial defence. We explored the prospect of improving automatic recognition performance by using the temporal context inherent in song. The ability to accurately detect sequences of calls has implications for conservation and biological studies. We show that the performance of a convolutional neural network (CNN), designed to detect song notes (calls) in short-duration audio segments, can be improved by combining it with a recurrent network designed to process sequences of learned representations from the CNN on a longer time scale. The combined system of independently trained CNN and long short-term memory (LSTM) network models exploits the temporal patterns between song notes. We demonstrate the technique using recordings of fin whale (Balaenoptera physalus) songs, which comprise patterned sequences of characteristic notes. We evaluated several variants of the CNN + LSTM network. Relative to the baseline CNN model, the CNN + LSTM models reduced performance variance, offering a 9-17% increase in area under the precision-recall curve and a 9-18% increase in peak F1-scores. These results show that the inclusion of temporal information may offer a valuable pathway for improving the automatic recognition and transcription of wildlife recordings.
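The gains above are reported as area under the precision-recall curve and peak F1-score. The CNN and LSTM models themselves are beyond a short sketch, but the two evaluation metrics can be computed from scratch, as below, on toy detector scores (not the fin-whale data).

```python
# Precision-recall curve, its trapezoidal area, and peak F1, from scratch.
# Toy labels/scores stand in for per-segment call detections.
def pr_curve(labels, scores):
    """(precision, recall) at every threshold, sweeping scores high to low."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    for i in order:
        tp += labels[i]
        fp += 1 - labels[i]
        points.append((tp / (tp + fp), tp / total_pos))
    return points

def pr_auc(points):
    """Trapezoidal area under the precision-recall curve."""
    area, prev_r, prev_p = 0.0, 0.0, 1.0  # convention: precision 1 at recall 0
    for p, r in points:
        area += (r - prev_r) * (p + prev_p) / 2
        prev_r, prev_p = r, p
    return area

def peak_f1(points):
    return max(2 * p * r / (p + r) for p, r in points if p + r > 0)

labels = [1, 0, 1, 1, 0, 0]
scores = [0.95, 0.85, 0.8, 0.6, 0.4, 0.2]
pts = pr_curve(labels, scores)
```

The paper's CNN + LSTM variants improve exactly these quantities by letting the LSTM exploit the patterned timing between song notes that per-segment CNN scores ignore.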

Madhusudhana Shyam, Shiu Yu, Klinck Holger, Fleishman Erica, Liu Xiaobai, Nosal Eva-Marie, Helble Tyler, Cholewiak Danielle, Gillespie Douglas, Širović Ana, Roch Marie A

2021-Jul

bioacoustics, improved performance, machine learning, passive acoustic monitoring, robust automatic recognition, temporal context

Radiology Radiology

Deep learning neural networks to differentiate Stafne's bone cavity from pathological radiolucent lesions of the mandible in heterogeneous panoramic radiography.

In PloS one ; h5-index 176.0

This study aimed to develop a high-performance deep learning algorithm to differentiate Stafne's bone cavity (SBC) from cysts and tumors of the jaw based on images acquired from various panoramic radiographic systems. Data sets included 176 Stafne's bone cavities and 282 odontogenic cysts and tumors of the mandible (98 dentigerous cysts, 91 odontogenic keratocysts, and 93 ameloblastomas) that required surgical removal. Panoramic radiographs were obtained using three different imaging systems. The trained model showed 99.25% accuracy, 98.08% sensitivity, and 100% specificity for SBC classification, with one misclassified SBC case. When traced back with the Grad-CAM and Guided Grad-CAM methods, the algorithm was confirmed to recognize the typical imaging features of SBC in panoramic radiography regardless of the imaging system. The deep learning model for differentiating SBC from odontogenic cysts and tumors thus showed high performance with images obtained from multiple panoramic systems. The present algorithm is expected to be a useful tool for clinicians, as it diagnoses SBCs in panoramic radiography and can prevent unnecessary examinations for patients. Additionally, it would support clinicians in deciding on further examinations or referrals to surgeons in cases where even experts are unsure of the diagnosis from panoramic radiography alone.

Lee Ari, Kim Min Su, Han Sang-Sun, Park PooGyeon, Lee Chena, Yun Jong Pil

2021

General General

Performance and scaling behavior of bioinformatic applications in virtualization environments to create awareness for the efficient use of compute resources.

In PLoS computational biology

The large amount of biological data available today makes it necessary to use tools and applications based on the sophisticated and efficient algorithms developed in bioinformatics. Further, access to high-performance computing resources is necessary to achieve results in reasonable time. To speed up applications and utilize available compute resources as efficiently as possible, software developers make use of parallelization mechanisms such as multithreading. Many of the available tools in bioinformatics offer multithreading capabilities, but more compute power is not always helpful. In this study we investigated the behavior of well-known bioinformatics applications with our benchmarking tool suite BOOTABLE, regarding their performance in terms of scaling, different virtual environments, and different datasets. The tool suite includes the tools BBMap, Bowtie2, BWA, Velvet, IDBA, SPAdes, Clustal Omega, MAFFT, SINA, and GROMACS. In addition, we added an application using the machine learning framework TensorFlow. Machine learning is not directly part of bioinformatics but is applied to many biological problems, especially in the context of medical images (X-ray photographs). The mentioned tools were analyzed in two different virtual environments: a virtual machine environment based on the OpenStack cloud software and a Docker environment. The measured performance values were compared to a bare-metal setup and among each other. The study reveals that the virtual environments produce an overhead in the range of seven to twenty-five percent compared to the bare-metal environment. The scaling measurements showed that some of the analyzed tools do not benefit from larger amounts of computing resources, whereas others showed an almost linear scaling behavior. The findings of this study have been generalized as far as possible and should help users to find the best amount of resources for their analysis. Further, the results provide valuable information for resource providers to handle their resources as efficiently as possible and raise the user community's awareness of the efficient usage of computing resources.
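The two quantities at the heart of the study, virtualization overhead relative to bare metal and multithread scaling behavior, are simple ratios. The helpers below make the definitions concrete; the runtimes are invented, not BOOTABLE measurements.

```python
# Virtualization overhead and parallel scaling efficiency as simple ratios.
# Example runtimes are hypothetical.
def overhead_pct(bare_metal_s, virtual_s):
    """Extra runtime of the virtual environment, as a percentage of bare metal."""
    return 100.0 * (virtual_s - bare_metal_s) / bare_metal_s

def scaling_efficiency(t1, tn, n):
    """Observed speedup on n threads divided by the ideal speedup n.

    1.0 means linear scaling; values near 0 mean extra threads do not help.
    """
    return (t1 / tn) / n

print(overhead_pct(100.0, 112.0))          # 12% overhead, within the reported 7-25% range
print(scaling_efficiency(800.0, 110.0, 8))  # sub-linear scaling on 8 threads
```

Computing the efficiency per thread count is what distinguishes the tools with almost linear scaling from those that plateau.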

Hanussek Maximilian, Bartusch Felix, Krüger Jens

2021-Jul-20

Radiology Radiology

Next-generation sequencing in a large pedigree segregating visceral artery aneurysms suggests potential role of COL4A1/COL4A2 in disease etiology.

In Vascular

BACKGROUND : Visceral artery aneurysms (VAAs) can be fatal if ruptured. Although rupture is a relatively rare event, it carries a contemporary mortality rate of approximately 12%. VAAs have multiple possible causes, one of which is genetic predisposition. Here, we present a striking family with seven individuals affected by VAAs and one individual affected by a visceral artery pseudoaneurysm.

METHODS : We exome sequenced the affected family members and the parents of the proband to find a possible underlying genetic defect. As exome sequencing did not reveal any feasible protein-coding variants, we combined whole-genome sequencing of two individuals with linkage analysis to find a plausible non-coding culprit variant. Variants were ranked by the deep learning framework DeepSEA.

RESULTS : Two of seven top-ranking variants, NC_000013.11:g.108154659C>T and NC_000013.11:g.110409638C>T, were found in all VAA-affected individuals, but not in the individual affected by the pseudoaneurysm. The second variant is in a candidate cis-regulatory element in the fourth intron of COL4A2, proximal to COL4A1.

CONCLUSIONS : As type IV collagens are essential for the stability and integrity of the vascular basement membrane and involved in vascular disease, we conclude that COL4A1 and COL4A2 are strong candidates for VAA susceptibility genes.

Donner Iikki, Sipilä Lauri J, Plaketti Roosa-Maria, Kuosmanen Anna, Forsström Linda, Katainen Riku, Kuismin Outi, Aavikko Mervi, Romsi Pekka, Kariniemi Juho, Aaltonen Lauri A

2021-Jul-19

Aneurysm, COL4A1, COL4A2, genetic susceptibility, next-generation sequencing, non-coding variants

Radiology Radiology

An artificial intelligence natural language processing pipeline for information extraction in neuroradiology

ArXiv Preprint

The use of electronic health records in medical research is difficult because of their unstructured format. Extracting information within reports and summarising patient presentations in a way amenable to downstream analysis would be enormously beneficial for operational and clinical research. In this work we present a natural language processing pipeline for information extraction from radiological reports in neurology. Our pipeline uses a hybrid sequence of rule-based and artificial intelligence models to accurately extract and summarise neurological reports. We train and evaluate a custom language model on a corpus of 150,000 radiological reports from MRI imaging at the National Hospital for Neurology and Neurosurgery, London. We also present results for standard NLP tasks on domain-specific neuroradiology datasets. We show that our pipeline, called 'neuroNLP', can reliably extract clinically relevant information from these reports, enabling downstream modelling of reports and associated imaging on a heretofore unprecedented scale.
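A hybrid pipeline of this kind typically opens with a rule-based stage before any learned model runs. The sketch below shows what such a stage can look like: a few regex rules pulling simple fields out of a report. The patterns, field names, and example report are purely illustrative and are not neuroNLP's actual rules.

```python
# Toy rule-based extraction stage for radiology reports: each field is the
# first match of a hand-written regex. Hypothetical patterns, not neuroNLP's.
import re

PATTERNS = {
    "modality": re.compile(r"\b(MRI|CT|MRA)\b", re.IGNORECASE),
    "laterality": re.compile(r"\b(left|right|bilateral)\b", re.IGNORECASE),
    "negation": re.compile(r"\bno (?:evidence of|acute)\b", re.IGNORECASE),
}

def extract(report):
    """Return the first match (lowercased) or None for each rule-based field."""
    out = {}
    for field, pat in PATTERNS.items():
        m = pat.search(report)
        out[field] = m.group(0).lower() if m else None
    return out

report = "MRI brain: no evidence of acute infarct. Chronic left frontal lesion."
print(extract(report))
```

Fields that rules cannot capture reliably (e.g. summarising a free-text impression) are what the learned language-model components of the pipeline are for.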

Henry Watkins, Robert Gray, Ashwani Jha, Parashkev Nachev

2021-07-21

General General

Determining soil particle-size distribution from infrared spectra using machine learning predictions: Methodology and modeling.

In PloS one ; h5-index 176.0

Accuracy of infrared (IR) models to measure soil particle-size distribution (PSD) depends on soil preparation, methodology (sedimentation, laser), settling times, and relevant soil features. Compositional soil data may require isometric log-ratio (ilr) transformation to avoid numerical biases. Machine learning can relate the numerous independent variables that may impact NIR spectra to assess particle-size distribution. Our objective was to reach high IRS prediction accuracy across a large range of PSD methods and soil properties. A total of 1298 soil samples from eastern Canada were IR-scanned. Spectra were processed by Stochastic Gradient Boosting (SGB) to predict sand, silt, clay, and carbon. Slope and intercept of the log-log relationships between settling time and suspension density function (SDF) (R2 = 0.84-0.92) performed similarly to NIR spectra using either ilr-transformed (R2 = 0.81-0.93) or raw percentages (R2 = 0.76-0.94). Settling times of 0.67 min and 2 h were the most accurate for NIR predictions (R2 = 0.49-0.79). The NIR prediction for the sand sieving method (R2 = 0.66) was more accurate than for the sedimentation method (R2 = 0.53). The NIR 2X gain was less accurate (R2 = 0.69-0.92) than the 4X gain (R2 = 0.87-0.95). The MIR spectra (R2 = 0.45-0.80) performed better than the NIR spectra (R2 = 0.40-0.71). Adding soil carbon, reconstituted bulk density, pH, red-green-blue color, and oxalate and Mehlich3 extracts returned R2 values of 0.86-0.91 for texture prediction. In addition to the slope and intercept of the SDF, the 4X gain, method and pre-treatment classes, soil carbon and color appeared to be promising features for routine SGB-processed NIR particle-size analysis. Machine learning methods support cost-effective NIR soil texture analysis.
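The ilr transformation mentioned above maps a D-part composition (e.g. sand, silt, clay fractions summing to 1) to D-1 unconstrained coordinates, removing the constant-sum bias before regression. Below is a minimal sketch using the standard sequential-binary-partition form; the example fractions are made up and this is not the authors' exact implementation.

```python
# Minimal isometric log-ratio (ilr) transform for compositional data,
# using the sequential-binary-partition coordinates. Example values invented.
import math

def ilr(parts):
    """Map a D-part composition (positive parts summing to 1) to D-1 coordinates."""
    z = []
    for i in range(1, len(parts)):
        # Geometric mean of the first i parts, contrasted against part i+1.
        gmean = math.exp(sum(math.log(p) for p in parts[:i]) / i)
        z.append(math.sqrt(i / (i + 1)) * math.log(gmean / parts[i]))
    return z

sand, silt, clay = 0.50, 0.30, 0.20
coords = ilr([sand, silt, clay])
print(len(coords))  # 2 unconstrained coordinates for a 3-part composition
```

A perfectly balanced composition maps to the origin, and Euclidean distances between ilr coordinates respect the Aitchison geometry of compositions, which is what makes them safe inputs for standard regression models.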

Parent Elizabeth Jeanne, Parent Serge-Étienne, Parent Léon Etienne

2021

General General

A Theoretical Insight Into the Effect of Loss Function for Deep Semantic-Preserving Learning.

In IEEE transactions on neural networks and learning systems

Good generalization performance is the fundamental goal of any machine learning algorithm. Using the uniform stability concept, this article theoretically proves that the choice of loss function impacts the generalization performance of a trained deep neural network (DNN). The adopted stability-based framework provides an effective tool for comparing the generalization error bound with respect to the utilized loss function. The main result of our analysis is that using an effective loss function makes stochastic gradient descent more stable, which consequently leads to a tighter generalization error bound and thus better generalization performance. To validate our analysis, we study learning problems in which the classes are semantically correlated. To capture this semantic similarity of neighboring classes, we adopt the well-known semantics-preserving learning framework, namely label distribution learning (LDL). We propose two novel loss functions for the LDL framework and theoretically show that they provide stronger stability than the other widely used loss functions adopted for training DNNs. The experimental results on three applications with semantically correlated classes, including facial age estimation, head pose estimation, and image esthetic assessment, validate the theoretical insights gained by our analysis and demonstrate the usefulness of the proposed loss functions in practical applications.

Akbari Ali, Awais Muhammad, Bashar Manijeh, Kittler Josef

2021-Jul-20

General General

MCA-Net: Multi-feature coding and attention convolutional neural network for predicting lncRNA-disease association.

In IEEE/ACM transactions on computational biology and bioinformatics

With the advent of the era of big data, it is troublesome to accurately predict the associations between lncRNAs and diseases based on traditional biological experiments, because such experiments are time-consuming and subjective. In this paper, we propose a novel deep learning method for predicting lncRNA-disease associations using multi-feature coding and an attention convolutional neural network (MCA-Net). We first calculate six similarity features to extract different types of lncRNA and disease feature information. Second, a multi-feature coding method is proposed to construct the feature vectors of lncRNA-disease association samples by integrating the six similarity features. Furthermore, an attention convolutional neural network is developed to identify lncRNA-disease associations under 10-fold cross-validation. Finally, we evaluate the performance of MCA-Net from different perspectives, including the effects of the model parameters, distinct deep learning models, and the necessity of the attention mechanism. We also compare MCA-Net with several state-of-the-art methods on three publicly available datasets, i.e., LncRNADisease, Lnc2Cancer, and LncRNADisease2.0. The results show that MCA-Net outperforms the state-of-the-art methods on all three datasets. Moreover, case studies on breast cancer and lung cancer further verify that MCA-Net is effective and accurate for lncRNA-disease association prediction.

Zhang Yuan, Ye Fei, Gao Xieping

2021-Jul-20

General General

Continuous Gait Phase Estimation using LSTM for Robotic Transfemoral Prosthesis Across Walking Speeds.

In IEEE transactions on neural systems and rehabilitation engineering : a publication of the IEEE Engineering in Medicine and Biology Society

User gait phase estimation plays a key role in the seamless control of lower-limb robotic assistive devices (e.g., exoskeletons or prostheses) during ambulation. To achieve this, several studies have attempted to estimate the gait phase using a thigh or shank angle. However, their estimates deviated from the actual gait and varied across walking speeds. In this study, we investigated different sensor setups for a machine learning approach to obtain more accurate and consistent gait phase estimation for a robotic transfemoral prosthesis over different walking speeds. Considering the transfemoral prosthetic application, we proposed two different sensor setups: i) the angular positions and velocities of both thigh and torso (S1), and ii) the angular positions and velocities of both thigh and torso plus heel force data (S2). The proposed setups and method were experimentally evaluated with three healthy young subjects at four different walking speeds: 0.5, 1.0, 1.5, and 2.0 m/s. Both setups yielded robust and accurate gait phase estimation with respect to the ground truth (loss value of S1: 4.54e-03 vs. S2: 4.70e-03). S1 had the advantage of a simple equipment setup using only two IMUs, while S2 estimated heel-strikes more accurately than S1 by using the additional heel force data. The choice between the two sensor setups can depend on the researchers' preferences regarding the device setup or the focus of interest.

Lee Jinwon, Hong Woolim, Hur Pilwon

2021-Jul-20

General General

SANTIA: a Matlab-based open-source toolbox for artifact detection and removal from extracellular neuronal signals.

In Brain informatics

Neuronal signals generally represent activation of the neuronal networks and give insights into brain functionalities. They are considered fingerprints of actions and of their processing across different structures of the brain. These recordings generate a large volume of data that is susceptible to noise and artifacts. Therefore, reviewing these data to ensure high quality, by automatically detecting and removing artifacts, is imperative. Toward this aim, this work proposes a custom-developed automatic artifact removal toolbox named SANTIA (SigMate Advanced: a Novel Tool for Identification of Artifacts in Neuronal Signals). Developed in Matlab, SANTIA is an open-source toolbox that applies neural network-based machine learning techniques to label and train models to detect artifacts in invasive neuronal signals known as local field potentials.

Fabietti Marcos, Mahmud Mufti, Lotfi Ahmad, Kaiser M Shamim, Averna Alberto, Guggenmos David J, Nudo Randolph J, Chiappalone Michela, Chen Jianhui

2021-Jul-20

Artifacts, Local field potential, Machine learning, Neural networks, Neuronal signals

General General

Predicting Antituberculosis Drug-Induced Liver Injury Using an Interpretable Machine Learning Method: Model Development and Validation Study.

In JMIR medical informatics ; h5-index 23.0

BACKGROUND : Tuberculosis (TB) is a pandemic, being one of the top 10 causes of death and the leading cause of death from a single infectious agent. Drug-induced liver injury (DILI) is the most common and serious side effect during the treatment of TB.

OBJECTIVE : We aim to predict the status of liver injury in patients with TB at the clinical treatment stage.

METHODS : We designed an interpretable prediction model based on the XGBoost algorithm and identified the most robust and meaningful predictors of the risk of TB-DILI on the basis of clinical data extracted from the Hospital Information System of Shenzhen Nanshan Center for Chronic Disease Control from 2014 to 2019.

RESULTS : In total, 757 patients were included, and 287 (38%) had developed TB-DILI. Based on relative importance values and the area under the receiver operating characteristic curve, the machine learning tools selected the patients' most recent alanine transaminase level, the average rate of change of the patients' last 2 alanine transaminase measurements, the cumulative dose of pyrazinamide, and the cumulative dose of ethambutol as the best predictors for assessing the risk of TB-DILI. In the validation dataset, the model had a precision of 90%, recall of 74%, classification accuracy of 76%, and balanced error rate of 77% in predicting cases of TB-DILI. The area under the receiver operating characteristic curve upon 10-fold cross-validation was 0.912 (95% CI 0.890-0.935). In addition, the model provided high-risk warnings a median of 15 (IQR 7.3-27.5) days in advance of DILI onset.

CONCLUSIONS : Our model shows high accuracy and interpretability in predicting cases of TB-DILI, which can provide useful information to clinicians to adjust the medication regimen and avoid more serious liver injury in patients.

Zhong Tao, Zhuang Zian, Dong Xiaoli, Wong Ka Hing, Wong Wing Tak, Wang Jian, He Daihai, Liu Shengyuan

2021-Jul-20

XGBoost algorithm, accuracy, drug, drug-induced liver injury, high accuracy, injury, interpretability, interpretation, liver, machine learning, model, prediction, treatment, tuberculosis
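
For reference, the precision, recall, and accuracy figures reported above can all be derived from a single confusion matrix. A minimal sketch with hypothetical counts (not the study's actual confusion matrix), using the standard definition of balanced error rate as one minus balanced accuracy:

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, accuracy, and balanced error rate from
    confusion-matrix counts (true/false positives and negatives)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    balanced_error = 1 - (recall + specificity) / 2
    return precision, recall, accuracy, balanced_error

# Hypothetical counts chosen so that precision comes out at 90%.
p, r, a, ber = classification_metrics(tp=45, fp=5, fn=16, tn=34)
```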

Radiology Radiology

Emerging role of artificial intelligence in stroke imaging.

In Expert review of neurotherapeutics

Introduction: The recognition and therapy of patients with stroke are becoming progressively intricate as additional treatment choices become accessible and new associations between disease characteristics and treatment response are incessantly uncovered. Therefore, clinicians must regularly learn new skills, stay up to date with the literature, and integrate advances into daily practice. The application of artificial intelligence (AI) to assist clinical decision making could diminish inter-rater variation in routine clinical practice and accelerate the mining of vital data that could improve recognition of patients with stroke and forecasting of treatment responses and patient outcomes. Areas covered: In this review, the authors provide an up-to-date review of AI in stroke, analyzing the latest papers on this subject. These have been divided into two main groups: stroke diagnosis and outcome prediction. Expert opinion: The highest value of AI is its capability to merge, select, and condense a large number of clinical and imaging features of a single patient and to associate these with fitted models that have gone through robust assessment and optimization with large cohorts of data to support clinical decision making.

Corrias Giuseppe, Mazzotta Andrea, Melis Marta, Cademartiri Filippo, Yang Qi, Suri Jasjit S, Saba Luca

2021-Jul-20

Supervised artificial intelligence, image optimization and analysis, perfusion imaging, stroke

General General

Tracking the Differentiation Status of Human Neural Stem Cells through Label-Free Raman Spectroscopy and Machine Learning-Based Analysis.

In Analytical chemistry

The ability to noninvasively monitor stem cell differentiation is important to stem cell studies. Raman spectroscopy is a non-harmful imaging approach that acquires cellular biochemical signatures. Herein, we report the first use of label-free Raman spectroscopy to characterize the gradual change during the differentiation of live human neural stem cells (NSCs) in in vitro cultures. Raman spectra of 600-1800 cm⁻¹ were measured in human NSC cultures from the undifferentiated stage (NSC-predominant) to the highly differentiated one (neuron-predominant) and subsequently analyzed using various mathematical methods. Hierarchical cluster analysis distinguished the two cell types (NSCs and neurons) through the spectra. The subsequently derived differentiation rate matched that measured by immunocytochemistry. The key spectral biomarkers were identified by time-dependent trend analysis and principal component analysis. Furthermore, through machine learning-based analysis, a set of eight spectral data points was found to be highly accurate in classifying cell types and predicting the differentiation rate. The predictive accuracy was highest using an artificial neural network (ANN) and slightly lower using a logistic regression model or linear discriminant analysis. In conclusion, label-free Raman spectroscopy with the aid of machine learning analysis can provide noninvasive classification of cell types at the single-cell level and thus accurately track human NSC differentiation. A set of eight spectral data points combined with the ANN method was found to be the most efficient and accurate. Establishing this non-harmful and efficient strategy will shed light on in vivo and clinical studies of NSCs.

Geng Junnan, Zhang Wei, Chen Cheng, Zhang Han, Zhou Anhong, Huang Yu

2021-Jul-20
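
As a hedged illustration of the pipeline described above (dimensionality reduction followed by a small neural-network classifier), the sketch below classifies synthetic two-band spectra. The peak positions, noise level, and model sizes are assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
wavenumbers = np.linspace(600, 1800, 300)   # same spectral window as the study

def spectrum(peak):
    """Toy Raman spectrum: one Gaussian band plus measurement noise."""
    return (np.exp(-((wavenumbers - peak) ** 2) / (2 * 30 ** 2))
            + 0.05 * rng.standard_normal(wavenumbers.size))

# Two hypothetical cell populations with bands at different Raman shifts.
X = np.array([spectrum(1004) for _ in range(60)]
             + [spectrum(1660) for _ in range(60)])
y = np.array([0] * 60 + [1] * 60)  # 0 = NSC-like, 1 = neuron-like (illustrative)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0, stratify=y)
clf = make_pipeline(StandardScaler(), PCA(n_components=8),
                    MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Reducing each 300-point spectrum to a few components before the ANN mirrors the idea of selecting a small set of informative spectral data points.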

General General

Multivariate Machine Learning Analyses in Identification of Major Depressive Disorder Using Resting-State Functional Connectivity: A Multicentral Study.

In ACS chemical neuroscience

Diagnosis of major depressive disorder (MDD) using resting-state functional connectivity (rs-FC) data faces many challenges, such as high dimensionality, small samples, and individual differences. To assess the clinical value of rs-FC in MDD and identify a potential rs-FC machine learning (ML) model for the individualized diagnosis of MDD, a progressive three-step ML analysis was performed on the rs-FC data, including six different ML algorithms and two dimension reduction methods, to investigate the classification performance of ML models in a multicentral, large-sample dataset [1021 MDD patients and 1100 normal controls (NCs)]. Furthermore, a linear least-squares regression model was used to assess the relationships between rs-FC features and the severity of clinical symptoms in MDD patients. Among the ML methods used, the rs-FC model constructed by the eXtreme Gradient Boosting (XGBoost) method showed the optimal classification performance for distinguishing MDD patients from NCs at the individual level (accuracy = 0.728, sensitivity = 0.720, specificity = 0.739, area under the curve = 0.831). Meanwhile, the rs-FCs identified by the XGBoost model were primarily distributed within and between the default mode network, limbic network, and visual network. More importantly, the 17-item individual Hamilton Depression Scale scores of MDD patients could be accurately predicted using the rs-FC features identified by the XGBoost model (adjusted R² = 0.180, root mean squared error = 0.946). The XGBoost model using rs-FCs showed the optimal classification performance between MDD patients and NCs, with good generalizability and neuroscientific interpretability.

Shi Yachen, Zhang Linhai, Wang Zan, Lu Xiang, Wang Tao, Zhou Deyu, Zhang Zhijun

2021-Jul-20

Major depressive disorder, classification, eXtreme Gradient Boosting, machine learning, multiple-center, resting-state functional connectivity
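
The adjusted R² reported above penalizes the ordinary R² for the number of predictors, which matters when many rs-FC features enter a regression. A small sketch of the standard formula with hypothetical sample and feature counts (the abstract does not report the exact n and p used):

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2: shrinks R^2 toward zero as predictors are added
    relative to the number of samples."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Hypothetical values: 1021 patients and 80 selected rs-FC features.
adj = adjusted_r2(r2=0.25, n_samples=1021, n_features=80)
```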

General General

Appetitive olfactory learning suffers in ants when octopamine or dopamine receptors are blocked.

In The Journal of experimental biology

Associative learning relies on the detection of coincidence between a stimulus and a reward or punishment. In the insect brain, this process is carried out in the mushroom bodies under the control of octopaminergic and dopaminergic neurons. It was long assumed that appetitive learning is governed by octopaminergic neurons, while dopamine is required for aversive learning. This view has recently been challenged: both neurotransmitters are involved in both types of learning in bees and flies. Here, we test which neurotransmitters are required for appetitive learning in ants. We trained Lasius niger workers to discriminate two mixtures of linear hydrocarbons and to associate one of them with a sucrose reward. We analysed the walking paths of the ants using machine learning and found that the ants spent more time near the rewarded odour than near the other, a preference that was stable for at least 24 hours. We then treated the ants before learning with either epinastine, an octopamine receptor blocker, or flupentixol, a dopamine receptor blocker. Ants with blocked octopamine receptors did not prefer the rewarded odour. Octopamine signalling is thus necessary for appetitive learning of olfactory cues, likely because it conveys information about odours or reward to the mushroom body. In contrast, ants with blocked dopamine receptors initially learned the rewarded odour but failed to retrieve this memory 24 hours later. Dopamine is thus likely required for long-term memory consolidation, independent of short-term memory formation. Our results show that appetitive olfactory learning depends on both octopamine and dopamine signalling in ants.

Wissink Maarten, Nehring Volker

2021-Jul-20

Associative learning, Lasius niger, Long term memory, Neurotransmitters, Short term memory

Public Health Public Health

A Machine Learning Approach for Investigating Delirium as a Multifactorial Syndrome.

In International journal of environmental research and public health ; h5-index 73.0

Delirium is a psycho-organic syndrome common in hospitalized patients, especially the elderly, and is associated with poor clinical outcomes. This study aims to identify the predictors most strongly associated with the risk of delirium episodes using a machine learning technique (MLT). A random forest (RF) algorithm was used to evaluate the association between subjects' characteristics and the 4AT (the 4 A's test) score, a screening tool for delirium. The RF algorithm was implemented using information on demographic characteristics, comorbidities, drugs, and procedures. Of the 78 patients enrolled in the study, 49 (63%) were at risk for delirium and 32 (41%) had at least one episode of delirium during hospitalization (38% in orthopedics and 31% both in internal medicine and in the geriatric ward). The model explained 75.8% of the variability of the 4AT score with a root mean squared error of 3.29. Higher age, the presence of dementia, physical restraint, diabetes, and a lower degree were the variables associated with an increase in the 4AT score. Random forest is a valid method for investigating the patient characteristics associated with delirium onset, even in small case series. The use of this model may allow for early detection of delirium onset, so that proper adjustments in healthcare assistance can be planned.

Ocagli Honoria, Bottigliengo Daniele, Lorenzoni Giulia, Azzolina Danila, Acar Aslihan S, Sorgato Silvia, Stivanello Lucia, Degan Mario, Gregori Dario

2021-Jul-02

aging, delirium, machine learning technique, nursing, random forest
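
A minimal sketch of the random forest approach described above: a regressor is fitted to a synthetic 4AT-like score and variable importances are read off. The predictors, effect sizes, and data are invented for illustration, not taken from the study:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 78  # same order of magnitude as the study's sample size

# Hypothetical predictors: age, dementia, restraint, diabetes, education.
X = np.column_stack([
    rng.normal(80, 8, n),        # age (years)
    rng.integers(0, 2, n),       # dementia (0/1)
    rng.integers(0, 2, n),       # physical restraint (0/1)
    rng.integers(0, 2, n),       # diabetes (0/1)
    rng.integers(0, 4, n),       # education level (ordinal)
])
# Synthetic 4AT-like score driven mainly by age and dementia (illustrative).
y = 0.1 * X[:, 0] + 3 * X[:, 1] + rng.normal(0, 1, n)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
importance = dict(zip(["age", "dementia", "restraint", "diabetes", "education"],
                      rf.feature_importances_))
```

The impurity-based importances sum to 1 and rank predictors by how much they reduce prediction error across the forest.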

General General

Changes in alcohol use during the COVID-19 pandemic among American veterans.

In Addictive behaviors ; h5-index 60.0

BACKGROUND : The COVID-19 pandemic has had considerable behavioral health implications globally. One subgroup that may be of particular concern is U.S. veterans, who are susceptible to mental health and substance use concerns. The current study aimed to investigate changes in alcohol use and binge drinking before and during the first year of the pandemic among U.S. veterans, and how pre-pandemic mental health disorders, namely posttraumatic stress disorder (PTSD), and COVID-19-related factors like loneliness, negative reactions to COVID-19, and economic hardship influenced alcohol use trends.

METHODS : 1230 veterans were recruited in February 2020 as part of a larger survey study on veteran health behaviors. Veterans were asked to complete follow-up assessments throughout the pandemic at 6, 9, and 12 months.

RESULTS : Overall, veterans reported a significant decrease in alcohol use (IRR = 0.98) and binge drinking (IRR = 0.11). However, women, racial/ethnic minority veterans, and those with pre-existing PTSD exhibited smaller decreases in alcohol use and binge drinking and overall higher rates of use compared to men, White veterans, and those without PTSD. Both economic hardship and negative reactions to COVID-19 were associated with greater alcohol use and binge drinking, whereas loneliness showed a negative association with alcohol use and binge drinking.

CONCLUSIONS : Veterans reported decreases in alcohol use and binge drinking throughout the pandemic, with heterogeneity in these outcomes noted for higher risk groups. Special research and clinical attention should be given to the behavioral health care needs of veterans in the post-pandemic period.

Davis Jordan P, Prindle John, Castro Carl C, Saba Shaddy, Fitzke Reagan E, Pedersen Eric R

2021-Jul-15

Active duty, COVID-19, Drug use, Longitudinal, Trauma, Veterans administration

General General

Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies

ArXiv Preprint

Objective: Temporal electronic health records (EHRs) can be a wealth of information for secondary uses, such as clinical events prediction or chronic disease management. However, challenges exist for temporal data representation. We therefore sought to identify these challenges and evaluate novel methodologies for addressing them through a systematic examination of deep learning solutions. Methods: We searched five databases (PubMed, EMBASE, the Institute of Electrical and Electronics Engineers [IEEE] Xplore Digital Library, the Association for Computing Machinery [ACM] digital library, and Web of Science) complemented with hand-searching in several prestigious computer science conference proceedings. We sought articles that reported deep learning methodologies on temporal data representation in structured EHR data from January 1, 2010, to August 30, 2020. We summarized and analyzed the selected articles from three perspectives: nature of time series, methodology, and model implementation. Results: We included 98 articles related to temporal data representation using deep learning. Four major challenges were identified, including data irregularity, data heterogeneity, data sparsity, and model opacity. We then studied how deep learning techniques were applied to address these challenges. Finally, we discuss some open challenges arising from deep learning. Conclusion: Temporal EHR data present several major challenges for clinical prediction modeling and data utilization. To some extent, current deep learning solutions can address these challenges. Future studies can consider designing comprehensive and integrated solutions. Moreover, researchers should incorporate additional clinical domain knowledge into study designs and enhance the interpretability of the model to facilitate its implementation in clinical practice.

Feng Xie, Han Yuan, Yilin Ning, Marcus Eng Hock Ong, Mengling Feng, Wynne Hsu, Bibhas Chakraborty, Nan Liu

2021-07-21

General General

Quantitative neurogenetics: applications in understanding disease.

In Biochemical Society transactions

Neurodevelopmental and neurodegenerative disorders (NNDs) are a group of conditions with a broad range of core symptoms and co-morbidities, associated with dysfunction of the central nervous system. Improvements in high-throughput sequencing have led to the detection of putative genetic risk loci for NNDs; however, quantitative neurogenetic approaches need to be further developed in order to establish causality and the underlying molecular genetic mechanisms of pathogenesis. Here, we discuss an approach for prioritizing the contribution of genetic risk loci to complex NND pathogenesis by estimating the possible impacts of these loci on gene regulation. Furthermore, we highlight the use of a tissue-specificity gene expression index and the application of artificial intelligence (AI) to improve the interpretation of the role of genetic risk elements in NND pathogenesis. Given that NND symptoms are associated with brain dysfunction, risk loci with direct, causative actions would comprise genes with essential functions in neural cells that are highly expressed in the brain. Indeed, NND risk genes implicated in brain dysfunction are disproportionately enriched in the brain compared with other tissues, which we refer to as brain-specific expressed genes. In addition, the tissue-specificity gene expression index can be used as a handle to identify non-brain contexts that are involved in NND pathogenesis. Lastly, we discuss how an AI approach provides the opportunity to integrate the biological impacts of risk loci to identify putative combinations of causative relationships through which genetic factors contribute to NND pathogenesis.

Afrasiabi Ali, Keane Jeremy T, Heng Julian Ik-Tsen, Palmer Elizabeth E, Lovell Nigel H, Alinejad-Rokny Hamid

2021-Jul-20

artificial intelligence, functional analysis, genomic variants, health data analytics, neurogenetics disorders, tissue-specific gene expression

Radiology Radiology

DeepRePath: Identifying the Prognostic Features of Early-Stage Lung Adenocarcinoma Using Multi-Scale Pathology Images and Deep Convolutional Neural Networks.

In Cancers

The prognosis of patients with lung adenocarcinoma (LUAD), especially early-stage LUAD, is dependent on clinicopathological features. However, their predictive utility is limited. In this study, we developed and trained a DeepRePath model based on a deep convolutional neural network (CNN) using multi-scale pathology images to predict the prognosis of patients with early-stage LUAD. DeepRePath was pre-trained with 1067 hematoxylin and eosin-stained whole-slide images of LUAD from The Cancer Genome Atlas. DeepRePath was further trained and validated using two separate CNNs and multi-scale pathology images of 393 resected lung cancer specimens from patients with stage I and II LUAD. Of the 393 patients, 95 developed recurrence after surgical resection. The DeepRePath model showed average area under the curve (AUC) scores of 0.77 and 0.76 in cohort I and cohort II (external validation set), respectively. Owing to its low performance, DeepRePath cannot be used as an automated tool in a clinical setting. When gradient-weighted class activation mapping was used, DeepRePath indicated an association between atypical nuclei, discohesive tumor cells, and tumor necrosis in pathology images showing recurrence. Despite the limitations associated with a relatively small number of patients, the DeepRePath model based on CNNs with transfer learning could predict recurrence after curative resection of early-stage LUAD using multi-scale pathology images.

Shim Won Sang, Yim Kwangil, Kim Tae-Jung, Sung Yeoun Eun, Lee Gyeongyun, Hong Ji Hyung, Chun Sang Hoon, Kim Seoree, An Ho Jung, Na Sae Jung, Kim Jae Jun, Moon Mi Hyoung, Moon Seok Whan, Park Sungsoo, Hong Soon Auck, Ko Yoon Ho

2021-Jul-01

deep learning, lung adenocarcinoma, pathology image, prognosis

Surgery Surgery

Development of a Deep-Learning Pipeline to Recognize and Characterize Macrophages in Colo-Rectal Liver Metastasis.

In Cancers

Quantitative analysis of the Tumor Microenvironment (TME) provides prognostic and predictive information in several human cancers but, with few exceptions, is not performed in daily clinical practice since it is extremely time-consuming. We recently showed that the morphology of Tumor-Associated Macrophages (TAMs) correlates with outcome in patients with Colo-Rectal Liver Metastases (CLM). However, as for other TME components, recognizing and characterizing hundreds of TAMs in a single histopathological slide is unfeasible. To accelerate this process, we explored a deep learning-based solution. We tested three Convolutional Neural Networks (CNNs), namely UNet, SegNet, and DeepLab-v3, with three different segmentation strategies: semantic segmentation, pixel penalties, and instance segmentation. The different experiments were compared according to the Intersection over Union (IoU), a metric describing the similarity between what the CNN predicts as a TAM and the ground truth, and the Symmetric Best Dice (SBD), which indicates the ability of the CNN to separate different TAMs. UNet and SegNet showed intrinsic limitations in discriminating single TAMs (highest SBD 61.34±2.21), whereas DeepLab-v3 accurately recognized TAMs from the background (IoU 89.13±3.85) and separated different TAMs (SBD 79.00±3.72). This deep-learning pipeline to recognize TAMs in digital slides will allow the characterization of TAM-related metrics in daily clinical practice, enabling the implementation of prognostic tools.

Cancian Pierandrea, Cortese Nina, Donadon Matteo, Di Maio Marco, Soldani Cristiana, Marchesi Federica, Savevski Victor, Santambrogio Marco Domenico, Cerina Luca, Laino Maria Elena, Torzilli Guido, Mantovani Alberto, Terracciano Luigi, Roncalli Massimo, Di Tommaso Luca

2021-Jul-01

artificial intelligence, colo-rectal liver metastases, deep learning, digital pathology, macrophages
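
The two evaluation metrics above are straightforward to compute for binary masks. A small sketch on toy 8×8 masks (SBD additionally matches predicted and ground-truth instances and takes the best Dice per pair; only the underlying IoU and Dice overlaps are shown here):

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union for two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return inter / union if union else 1.0

def dice(pred, truth):
    """Dice coefficient, the overlap score underlying Symmetric Best Dice."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2 * inter / total if total else 1.0

# Two overlapping 4x4 squares standing in for a predicted and a true TAM mask.
pred = np.zeros((8, 8), dtype=bool); pred[2:6, 2:6] = True
truth = np.zeros((8, 8), dtype=bool); truth[3:7, 3:7] = True
```

Here the masks share a 3×3 intersection, so IoU = 9/23 while Dice = 18/32; Dice is always at least as large as IoU for the same pair of masks.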

General General

Analysis of Periodontal Conditions in the Provinces of Vietnam: Results From the National Dental Survey.

In Asia-Pacific journal of public health

Nationwide dental health surveys are crucial for providing essential information on dental health and dental condition-related problems in the community. However, the relationship between periodontal conditions and sociodemographic data has not been well investigated in Vietnam. Using data from the National Oral Health Survey in 2019, we applied several machine learning methods to this dataset to investigate the impacts of sociodemographic features on gingival bleeding, periodontal pockets, and the Community Periodontal Index. In the experiments, LightGBM produced the maximum AUC (area under the curve) value of 0.744. The other models, in descending order, were logistic regression (0.705), LogitBoost (0.704), and random forest (0.684). All methods resulted in significantly high overall accuracies, all exceeding 90%. The results show that the gradient boosting model can predict well the relationship between periodontal conditions and sociodemographic data. The investigated model also reveals that geographic region has the most significant influence on dental health, while consumption of sweet foods/drinks is the second most important factor. These findings advocate for a region-specific approach to the dental care program and the implementation of a sugar-risk-food reduction program.

Minh Nguyen Thi Hong, Cao Binh Tran, Dinh Hai Trinh, Thuy Duong Nguyen, Bui Quang-Thanh

2021-Jul-20

National Oral Health Survey, Vietnam, machine learning, periodontal conditions, sociodemographics

General General

Prediction of Neurological Outcomes in Out-of-hospital Cardiac Arrest Survivors Immediately after Return of Spontaneous Circulation: Ensemble Technique with Four Machine Learning Models.

In Journal of Korean medical science

BACKGROUND : We performed this study to establish a prediction model for 1-year neurological outcomes in out-of-hospital cardiac arrest (OHCA) patients who achieved return of spontaneous circulation (ROSC) immediately after ROSC using machine learning methods.

METHODS : We performed a retrospective analysis of an OHCA survivor registry. Patients aged ≥ 18 years were included. Study participants who had registered between March 31, 2013 and December 31, 2018 were divided into a development dataset (80% of total) and an internal validation dataset (20% of total), and those who had registered between January 1, 2019 and December 31, 2019 were assigned to an external validation dataset. Four machine learning methods, including random forest, support vector machine, ElasticNet, and extreme gradient boosting, were implemented to establish prediction models with the development dataset, and the ensemble technique was used to build the final prediction model. The prediction performance of the model in the internal validation and the external validation datasets was described with accuracy, area under the receiver-operating characteristic curve, area under the precision-recall curve, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Furthermore, we established multivariable logistic regression models with the development set and compared their prediction performance with the ensemble models. The primary outcome was an unfavorable 1-year neurological outcome.

RESULTS : A total of 1,207 patients were included in the study. Among them, 631, 139, and 153 were assigned to the development, the internal validation, and the external validation datasets, respectively. Prediction performance metrics for the ensemble prediction model in the internal validation dataset were as follows: accuracy, 0.9620 (95% confidence interval [CI], 0.9352-0.9889); area under the receiver-operating characteristic curve, 0.9800 (95% CI, 0.9612-0.9988); area under the precision-recall curve, 0.9950 (95% CI, 0.9860-1.0000); sensitivity, 0.9594 (95% CI, 0.9245-0.9943); specificity, 0.9714 (95% CI, 0.9162-1.0000); PPV, 0.9916 (95% CI, 0.9752-1.0000); NPV, 0.8718 (95% CI, 0.7669-0.9767). Prediction performance metrics for the model in the external validation dataset were as follows: accuracy, 0.8509 (95% CI, 0.7825-0.9192); area under the receiver-operating characteristic curve, 0.9301 (95% CI, 0.8845-0.9756); area under the precision-recall curve, 0.9476 (95% CI, 0.9087-0.9867); sensitivity, 0.9595 (95% CI, 0.9145-1.0000); specificity, 0.6500 (95% CI, 0.5022-0.7978); PPV, 0.8353 (95% CI, 0.7564-0.9142); NPV, 0.8966 (95% CI, 0.7857-1.0000). All the prediction metrics were higher for the ensemble models, except NPVs in both the internal and the external validation datasets.

CONCLUSION : We established an ensemble model for the prediction of unfavorable 1-year neurological outcomes in OHCA survivors using four machine learning methods. The prediction performance of the ensemble model was higher than that of the multivariable logistic regression model, although its performance decreased slightly in the external validation dataset.

Heo Ji Han, Kim Taegyun, Shin Jonghwan, Suh Gil Joon, Kim Joonghee, Jung Yoon Sun, Park Seung Min, Kim Sungwan

2021-Jul-19

Cardiopulmonary Resuscitation, Heart Arrest, Machine Learning
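
As an illustrative sketch of the ensemble strategy described above, the snippet below soft-votes four learners on synthetic data. scikit-learn's GradientBoostingClassifier stands in for XGBoost and a penalized logistic regression stands in for ElasticNet, so this is not the authors' implementation:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary-outcome data standing in for registry features.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("enet", LogisticRegression(penalty="elasticnet", l1_ratio=0.5,
                                    solver="saga", max_iter=5000)),
        ("gb", GradientBoostingClassifier(random_state=0)),  # XGBoost stand-in
    ],
    voting="soft",  # average the four models' predicted probabilities
)
ensemble.fit(X_tr, y_tr)
acc = ensemble.score(X_te, y_te)
```

Soft voting averages per-class probabilities across the four learners, which is one common way to realize the "ensemble technique" the abstract refers to.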

Radiology Radiology

Improving detection accuracy of perfusion defect in standard dose SPECT-myocardial perfusion imaging by deep-learning denoising.

In Journal of nuclear cardiology : official publication of the American Society of Nuclear Cardiology

BACKGROUND : We previously developed a deep-learning (DL) network for image denoising in SPECT-myocardial perfusion imaging (MPI). Here we investigate whether this DL network can be utilized for improving detection of perfusion defects in standard-dose clinical acquisitions.

METHODS : To quantify perfusion-defect detection accuracy, we conducted a receiver-operating characteristic (ROC) analysis on reconstructed images with and without processing by the DL network, using a set of clinical SPECT-MPI data from 190 subjects. For perfusion-defect detection, hybrid studies were used as ground truth, which were created from clinically normal studies with simulated realistic lesions inserted. We considered ordered-subset expectation-maximization (OSEM) reconstruction with corrections for attenuation, resolution, and scatter and with 3D Gaussian post-filtering. Total perfusion deficit (TPD) scores, computed by Quantitative Perfusion SPECT (QPS) software, were used to evaluate the reconstructed images.

RESULTS : Compared to reconstruction with optimal Gaussian post-filtering (sigma = 1.2 voxels), further DL denoising increased the area under the ROC curve (AUC) from 0.80 to 0.88 (P-value < 10⁻⁴). For reconstruction with less Gaussian post-filtering (sigma = 0.8 voxels), and thus better spatial resolution, DL denoising increased the AUC value from 0.78 to 0.86 (P-value < 10⁻⁴) while preserving the better spatial resolution in the reconstruction.

CONCLUSIONS : DL denoising can effectively improve the detection of abnormal defects in standard-dose SPECT-MPI images over conventional reconstruction.

Liu Junchi, Yang Yongyi, Wernick Miles N, Pretorius P Hendrik, Slomka Piotr J, King Michael A

2021-Jul-19

SPECT-MPI, deep learning, noise-to-noise training, post-reconstruction filtering

General General

Recent trends in artificial intelligence-driven identification and development of anti-neurodegenerative therapeutic agents.

In Molecular diversity

Neurological disorders affect various aspects of life. Finding drugs for the central nervous system is a very challenging and complex task due to the involvement of the blood-brain barrier and P-glycoprotein, and the high attrition rates of such drugs. The availability of big data in online databases and resources has enabled the emergence of artificial intelligence techniques, including machine learning, to analyze and process data and predict unknown data with high efficiency. The use of these modern techniques has revolutionized the whole drug development paradigm, with an unprecedented acceleration in central nervous system drug discovery programs. Also, the new deep learning architectures proposed in many recent works have given a better understanding of how artificial intelligence can tackle the big, complex problems posed by central nervous system disorders. Therefore, the present review provides comprehensive and up-to-date information on machine learning/artificial intelligence-driven efforts in the brain care domain. In addition, a brief overview is presented of machine learning algorithms and their uses in structure-based drug design, ligand-based drug design, ADMET prediction, de novo drug design, and drug repurposing. Lastly, we conclude by discussing the major challenges and limitations posed and how they can be tackled in the future using these modern machine learning/artificial intelligence approaches.

Kashyap Kushagra, Siddiqi Mohammad Imran

2021-Jul-19

Artificial intelligence, Blood–brain barrier, CNS drug discovery, Deep learning, Machine learning, Neurological disorders

Pathology Pathology

Identification of diatom taxonomy by a combination of region-based fully convolutional network, online hard example mining, and shape priors of diatoms.

In International journal of legal medicine

The diatom test is one of the commonly used diagnostic methods for drowning in forensic pathology, providing supportive evidence of drowning. However, in forensic practice, it is time-consuming and laborious for forensic experts to classify and count diatoms, whereas artificial intelligence (AI) is superior to human experts in processing data and carrying out classification tasks. Some AI techniques have focused on searching for and classifying diatoms, but they either could not classify diatoms correctly or were time-consuming. Conventional deep detection networks have been used to overcome these problems but failed to detect heavily occluded diatoms and diatoms that closely resemble the background, which could lead to false positives or false negatives. To address these problems, an improved region-based fully convolutional network (R-FCN) with online hard example mining (OHEM) and shape priors of diatoms was proposed. OHEM was coupled with the R-FCN to boost its capacity to detect heavily occluded diatoms and diatoms closely resembling the background, and priors on the shapes of common diatoms were explored and introduced into the anchor generation strategy of the region proposal network in the R-FCN to locate diatoms precisely. The results showed that the proposed approach significantly outperforms several state-of-the-art methods and can detect diatoms precisely without missing heavily occluded diatoms or diatoms closely resembling the background. From this study, we conclude that (1) the proposed model can locate the position and identify the genera of common diatoms more accurately; (2) this method can reduce false positives and false negatives in forensic practice; and (3) it is a time-saving method that can be introduced into forensic practice.

Deng Jiehang, Guo Wenquan, Zhao Youwei, Liu Jingjian, Lai Runhao, Gu Guosheng, Zhang Yalong, Li Qi, Liu Chao, Zhao Jian

2021-Jul-20

Diatom test, Drowning, Forensic science, Taxonomy identification
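The online hard example mining (OHEM) step described in the abstract above can be illustrated with a minimal sketch (this is not the authors' implementation): after computing a loss for every region proposal, only the highest-loss fraction is kept for backpropagation, concentrating training on hard cases such as occluded objects or objects that resemble the background. The `keep_ratio` value and the toy losses here are illustrative assumptions.

```python
import numpy as np

def ohem_select(losses, keep_ratio=0.25):
    """Online hard example mining (sketch): keep only the highest-loss
    fraction of candidate proposals so training focuses on hard cases."""
    losses = np.asarray(losses, dtype=float)
    k = max(1, int(len(losses) * keep_ratio))
    hard_idx = np.argsort(losses)[::-1][:k]   # indices of the k largest losses
    return np.sort(hard_idx), losses[hard_idx].mean()

# Toy example: 8 region proposals, 2 of which are much harder than the rest.
losses = [0.1, 0.05, 2.3, 0.2, 1.8, 0.15, 0.08, 0.12]
idx, mean_hard_loss = ohem_select(losses, keep_ratio=0.25)
# idx → [2, 4]: only the two hardest proposals contribute to the update.
```

In a real detector the mean hard loss would be backpropagated in place of the average over all proposals.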

General General

ChronoRoot: High-throughput phenotyping by deep segmentation networks reveals novel temporal parameters of plant root system architecture.

In GigaScience

BACKGROUND : Deep learning methods have outperformed previous techniques in most computer vision tasks, including image-based plant phenotyping. However, massive data collection of root traits and the development of associated artificial intelligence approaches have been hampered by the inaccessibility of the rhizosphere. Here we present ChronoRoot, a system that combines 3D-printed open hardware with deep segmentation networks for high-temporal-resolution phenotyping of plant roots in agarized medium.

RESULTS : We developed a novel deep learning-based root extraction method that leverages the latest advances in convolutional neural networks for image segmentation and incorporates temporal consistency into the root system architecture reconstruction process. Automatic extraction of phenotypic parameters from sequences of images allowed a comprehensive characterization of the root system growth dynamics. Furthermore, novel time-associated parameters emerged from the analysis of spectral features derived from temporal signals.

CONCLUSIONS : Our work shows that the combination of machine intelligence methods and a 3D-printed device expands the possibilities of root high-throughput phenotyping for genetics and natural variation studies, as well as the screening of clock-related mutants, revealing novel root traits.

Gaggion Nicolás, Ariel Federico, Daric Vladimir, Lambert Éric, Legendre Simon, Roulé Thomas, Camoirano Alejandra, Milone Diego H, Crespi Martin, Blein Thomas, Ferrante Enzo

2021-Jul-20

3D-printed hardware, convolutional neural networks, image segmentation, root system architecture, temporal phenotyping

General General

M2aia-Interactive, fast, and memory-efficient analysis of 2D and 3D multi-modal mass spectrometry imaging data.

In GigaScience

BACKGROUND : Mass spectrometry imaging (MSI) is a label-free analysis method for resolving bio-molecules or pharmaceuticals in the spatial domain. It offers unique perspectives for the examination of entire organs or other tissue specimens. Owing to the increasing capabilities of modern MSI devices, the use of 3D and multi-modal MSI becomes feasible in routine applications, resulting in hundreds of gigabytes of data. To fully leverage such MSI acquisitions, interactive tools for 3D image reconstruction, visualization, and analysis are required, which preferably should be open source to allow scientists to develop custom extensions.

FINDINGS : We introduce M2aia (MSI applications for interactive analysis in MITK), a software tool providing interactive and memory-efficient data access and signal processing of multiple large MSI datasets stored in imzML format. M2aia extends MITK, a popular open-source tool in medical image processing. Besides the steps of a typical signal processing workflow, M2aia offers fast visual interaction, image segmentation, deformable 3D image reconstruction, and multi-modal registration. A unique feature is that fused data with individual mass axes can be visualized in a shared coordinate system. We demonstrate features of M2aia by reanalyzing an N-glycan mouse kidney dataset and 3D reconstruction and multi-modal image registration of a lipid and peptide dataset of a mouse brain, which we make publicly available.

CONCLUSIONS : To our knowledge, M2aia is the first extensible open-source application that enables a fast, user-friendly, and interactive exploration of large datasets. M2aia is applicable to a wide range of MSI analysis tasks.

Cordes Jonas, Enzlein Thomas, Marsching Christian, Hinze Marven, Engelhardt Sandy, Hopf Carsten, Wolf Ivo

2021-Jul-20

image reconstruction, image registration, interactive visualization, mass spectrometry imaging, multi-modal, three-dimensional

General General

Design of a Spark Big Data Framework for PM2.5 Air Pollution Forecasting.

In International journal of environmental research and public health ; h5-index 73.0

In recent years, with rapid economic development, air pollution has become extremely serious, causing many negative effects on health, the environment, and medical costs. PM2.5 is one of the main components of air pollution, so knowing PM2.5 air quality in advance is important for health. Many air quality studies are based on the government's official monitoring stations, which cannot be widely deployed due to high cost. Furthermore, government monitoring stations update only once an hour, making it hard to capture short-term PM2.5 concentration peaks with little warning. Taiwan has long suffered from relatively poor air quality, and air pollutant monitoring data from the LASS community can provide much broader coverage; however, the data volume is huge and difficult to integrate or analyze, and calculating, analyzing, and predicting from short-term data across many stations is computationally demanding. This study therefore proposes a PM2.5 instant prediction architecture based on the Spark big data framework, which alleviates the high computational requirements and handles the huge data volume from the LASS community. The proposed framework is divided into three modules. It collects real-time PM2.5 data and performs ensemble learning with three machine learning algorithms (linear regression, random forest, and gradient boosting decision trees) to predict the PM2.5 concentration 30 to 180 min ahead, with an accompanying visualization graph. The experimental results show that the proposed Spark ensemble prediction model performs best for 30-min-ahead prediction (R2 up to 0.96), and the ensemble model outperforms any single machine learning model. The proposed Spark big data framework can provide short-term PM2.5 forecasts and help decision-makers take proper action immediately.

Shih Dong-Her, To Thi Hien, Nguyen Ly Sy Phu, Wu Ting-Wei, You Wen-Ting

2021-Jul-02

PM2.5 predictions, Spark, air pollution, big data, ensemble model, machine learning
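The ensemble step described above, combining linear regression, random forest, and gradient-boosted trees, can be sketched as an unweighted average of base-learner predictions scored with R². The prediction arrays below are made-up values for illustration, not data from the paper, and the paper's actual combination scheme may differ.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical predictions from three base learners for next-30-min PM2.5 (µg/m³).
y_true = np.array([12.0, 18.0, 25.0, 30.0, 22.0])
preds = {
    "lr":   np.array([11.0, 19.5, 23.0, 31.0, 20.0]),
    "rf":   np.array([12.5, 17.0, 26.0, 29.0, 23.5]),
    "gbdt": np.array([13.0, 18.5, 24.5, 30.5, 21.0]),
}
ensemble = np.mean(list(preds.values()), axis=0)  # simple unweighted average
```

Because the base learners make partly uncorrelated errors, averaging them tends to cancel noise, which is why the ensemble can outperform every individual model, as reported in the study.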

General General

Water-soluble tocopherol derivatives inhibit SARS-CoV-2 RNA-dependent RNA polymerase.

In bioRxiv : the preprint server for biology

The recent emergence of a novel coronavirus, SARS-CoV-2, has led to the global pandemic of the severe disease COVID-19 in humans. While efforts to quickly identify effective antiviral therapies have focused largely on repurposing existing drugs [1-4], the current standard of care, remdesivir, remains the only authorized antiviral intervention for COVID-19 and provides only modest clinical benefits [5]. Here we show that water-soluble derivatives of α-tocopherol have potent antiviral activity and synergize with remdesivir as inhibitors of the SARS-CoV-2 RNA-dependent RNA polymerase (RdRp). Through an artificial-intelligence-driven in silico screen and an in vitro viral inhibition assay, we identified D-α-tocopherol polyethylene glycol succinate (TPGS) as an effective antiviral against SARS-CoV-2 and β-coronaviruses more broadly that also displays strong synergy with remdesivir. We subsequently determined that TPGS and other water-soluble derivatives of α-tocopherol inhibit the transcriptional activity of purified SARS-CoV-2 RdRp and identified affinity binding sites for these compounds within a conserved, hydrophobic interface between SARS-CoV-2 nonstructural protein 7 and nonstructural protein 8 that is functionally implicated in the assembly of the SARS-CoV-2 RdRp [6]. In summary, we conclude that solubilizing modifications to α-tocopherol allow it to interact with the SARS-CoV-2 RdRp, making it an effective antiviral molecule alone and even more so in combination with remdesivir. These findings are significant given that many tocopherol derivatives, including TPGS, are considered safe for humans, are orally bioavailable, and dramatically enhance the activity of the only approved antiviral for SARS-CoV-2 infection [7-9].

Pacl Hayden T, Tipper Jennifer L, Sevalkar Ritesh R, Crouse Andrew, Crowder Camerron, Holder Gillian D, Kuhlman Charles J, Chinta Krishna C, Nadeem Sajid, Green Todd J, Petit Chad M, Steyn Adrie J C, Might Matthew, Harrod Kevin S

2021-Jul-14

General General

Online structural kernel selection for mobile health

ArXiv Preprint

Motivated by the need for efficient and personalized learning in mobile health, we investigate the problem of online kernel selection for Gaussian Process regression in the multi-task setting. We propose a novel generative process on the kernel composition for this purpose. Our method demonstrates that trajectories of kernel evolutions can be transferred between users to improve learning and that the kernels themselves are meaningful for an mHealth prediction goal.

Eura Shin, Pedja Klasnja, Susan Murphy, Finale Doshi-Velez

2021-07-21

General General

O-Net: An Overall Convolutional Network for Segmentation Tasks.

In Machine learning in medical imaging. MLMI (Workshop)

Convolutional neural networks (CNNs) have recently become popular for classification and segmentation, with numerous network architectures offering substantial performance improvements. Their value has been particularly appreciated in biomedical applications, where even a small improvement in the predicted segmented region (e.g., a malignancy) relative to the ground truth can potentially lead to better diagnosis or treatment planning. Here, we introduce a novel architecture, the Overall Convolutional Network (O-Net), which takes advantage of different pooling levels and convolutional layers to extract deeper local features and richer global context. Our quantitative results on 2D images from two distinct datasets show that O-Net achieves a higher Dice coefficient than either a U-Net or a Pyramid Scene Parsing Network. We also examine the stability of results across training and validation sets, which indicates the robustness of the model to new datasets. In addition, we pair the decoder with different encoders, including a simple encoder, VGG Net, and ResNet; the ResNet encoder improved the results in most cases.

Maghsoudi Omid Haji, Gastounioti Aimilia, Pantalone Lauren, Davatzikos Christos, Bakas Spyridon, Kontos Despina

2020-Oct

Biomedical imaging, Deep learning, Segmentation
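The Dice coefficient used above to compare O-Net against U-Net and PSPNet measures the overlap between a predicted and a ground-truth binary mask, 2|A ∩ B| / (|A| + |B|). A minimal implementation (the small smoothing term `eps` is a common convention to avoid division by zero, assumed here rather than taken from the paper):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2*|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, bool)
    target = np.asarray(target, bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 2x3 masks: intersection = 2 pixels, |pred| = |target| = 3 → Dice = 4/6.
pred   = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_coefficient(pred, target)
```

A Dice score of 1 means a perfect match and 0 means no overlap, which makes it well suited to small structures where pixel accuracy would be dominated by the background.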

Pathology Pathology

A pan-cancer landscape of somatic mutations in non-unique regions of the human genome.

In Nature biotechnology ; h5-index 151.0

A substantial fraction of the human genome displays high sequence similarity with at least one other genomic sequence, posing a challenge for the identification of somatic mutations from short-read sequencing data. Here we annotate genomic variants in 2,658 cancers from the Pan-Cancer Analysis of Whole Genomes (PCAWG) cohort with links to similar sites across the human genome. We train a machine learning model to use signals distributed over multiple genomic sites to call somatic events in non-unique regions and validate the data against linked-read sequencing in an independent dataset. Using this approach, we uncover previously hidden mutations in ~1,700 coding sequences and in thousands of regulatory elements, including in known cancer genes, immunoglobulins and highly mutated gene families. Mutations in non-unique regions are consistent with mutations in unique regions in terms of mutation burden and substitution profiles. The analysis provides a systematic summary of the mutation events in non-unique regions at a genome-wide scale across multiple human cancers.

Tarabichi Maxime, Demeulemeester Jonas, Verfaillie Annelien, Flanagan Adrienne M, Van Loo Peter, Konopka Tomasz

2021-Jul-19

General General

Comparison of two simulators for individual based models in HIV epidemiology in a population with HSV 2 in Yaoundé (Cameroon).

In Scientific reports ; h5-index 158.0

Model comparisons have been widely used to guide intervention strategies to control infectious diseases. Agreement between different models is crucial for providing robust evidence for policy-makers because differences in model properties can influence their predictions. In this study, we compared models implemented by two individual-based model simulators for HIV epidemiology in a heterosexual population with Herpes simplex virus type-2 (HSV-2). For each model simulator, we constructed four models, starting from a simplified basic model and stepwise including more model complexity. For the resulting eight models, the predictions of the impact of behavioural interventions on the HIV epidemic in Yaoundé-Cameroon were compared. The results show that differences in model assumptions and model complexity can influence the size of the predicted impact of the intervention, as well as the predicted qualitative behaviour of the HIV epidemic after the intervention. These differences in predictions of an intervention were also observed for two models that agreed in their predictions of the HIV epidemic in the absence of that intervention. Without additional data, it is impossible to determine which of these two models is the most reliable. These findings highlight the importance of making more data available for the calibration and validation of epidemiological models.

Hendrickx Diana M, Sousa João Dinis, Libin Pieter J K, Delva Wim, Liesenborgs Jori, Hens Niel, Müller Viktor, Vandamme Anne-Mieke

2021-Jul-19

General General

Accurate prediction of breast cancer survival through coherent voting networks with gene expression profiling.

In Scientific reports ; h5-index 158.0

For a patient affected by breast cancer, after tumor removal it is necessary to decide which adjuvant therapy can prevent tumor relapse and the formation of metastases. Predicting the outcome of adjuvant therapy tailored to the patient is hard due to the heterogeneous nature of the disease. We devised a methodology for predicting 5-year survival based on the new machine learning paradigm of coherent voting networks, with improved accuracy over state-of-the-art prediction methods. The 'coherent voting communities' metaphor provides a certificate justifying the survival prediction for an individual patient, facilitating its acceptance in practice, in the vein of explainable artificial intelligence. The proposed method is quite flexible and applicable to other types of cancer.

Pellegrini Marco

2021-Jul-19

Radiology Radiology

An integrated machine learning framework for a discriminative analysis of schizophrenia using multi-biological data.

In Scientific reports ; h5-index 158.0

Finding effective and objective biomarkers to inform the diagnosis of schizophrenia is of great importance yet remains challenging. Relatively little work has been conducted on multi-biological data for the diagnosis of schizophrenia. In this cross-sectional study, we extracted multiple features from three types of biological data: gut microbiota data, blood data, and electroencephalogram data. Then, an integrated machine learning framework consisting of five classifiers, three feature selection algorithms, and four cross-validation methods was used to discriminate patients with schizophrenia from healthy controls. Our results show that the support vector machine classifier without feature selection, using multi-biological input features, achieved the best performance, with an accuracy of 91.7% and an AUC of 96.5% (p < 0.05). These results indicate that multi-biological data showed better discriminative capacity for patients with schizophrenia than any single type of biological data. The top 5% discriminative features selected from the optimal model include the gut microbiota features (Lactobacillus, Haemophilus, and Prevotella), the blood features (superoxide dismutase level, monocyte-lymphocyte ratio, and neutrophil count), and the electroencephalogram features (nodal local efficiency, nodal efficiency, and nodal shortest path length in the temporal and frontal-parietal brain areas). The proposed integrated framework may be helpful for understanding the pathophysiology of schizophrenia and developing biomarkers for schizophrenia using multi-biological data.

Ke Peng-Fei, Xiong Dong-Sheng, Li Jia-Hui, Pan Zhi-Lin, Zhou Jing, Li Shi-Jia, Song Jie, Chen Xiao-Yi, Li Gui-Xiang, Chen Jun, Li Xiao-Bo, Ning Yu-Ping, Wu Feng-Chun, Wu Kai

2021-Jul-19
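The AUC reported above has a useful rank interpretation: it equals the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen healthy control. A small sketch with hypothetical scores (the score values are invented for illustration and are unrelated to the study's data):

```python
def roc_auc(scores_pos, scores_neg):
    """AUC via the Mann-Whitney rank statistic: the fraction of
    (positive, negative) pairs where the positive is scored higher;
    ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for 4 patients and 4 controls.
auc = roc_auc([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
# 15 of 16 pairs are correctly ordered → AUC = 0.9375.
```

Unlike accuracy, this statistic does not depend on a decision threshold, which is why it is preferred for comparing classifiers whose operating points may differ.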

General General

A deep learning model for predicting next-generation sequencing depth from DNA sequence.

In Nature communications ; h5-index 260.0

Targeted high-throughput DNA sequencing is a primary approach for genomics and molecular diagnostics, and more recently as a readout for DNA information storage. Oligonucleotide probes used to enrich gene loci of interest have different hybridization kinetics, resulting in non-uniform coverage that increases sequencing costs and decreases sequencing sensitivities. Here, we present a deep learning model (DLM) for predicting Next-Generation Sequencing (NGS) depth from DNA probe sequences. Our DLM includes a bidirectional recurrent neural network that takes as input both DNA nucleotide identities as well as the calculated probability of the nucleotide being unpaired. We apply our DLM to three different NGS panels: a 39,145-plex panel for human single nucleotide polymorphisms (SNP), a 2000-plex panel for human long non-coding RNA (lncRNA), and a 7373-plex panel targeting non-human sequences for DNA information storage. In cross-validation, our DLM predicts sequencing depth to within a factor of 3 with 93% accuracy for the SNP panel, and 99% accuracy for the non-human panel. In independent testing, the DLM predicts the lncRNA panel with 89% accuracy when trained on the SNP panel. The same model is also effective at predicting the measured single-plex kinetic rate constants of DNA hybridization and strand displacement.

Zhang Jinny X, Yordanov Boyan, Gaunt Alexander, Wang Michael X, Dai Peng, Chen Yuan-Jyue, Zhang Kerou, Fang John Z, Dalchau Neil, Li Jiaming, Phillips Andrew, Zhang David Yu

2021-07-19
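The paper's headline metric, depth predicted "to within a factor of 3", can be computed as the fraction of probes whose symmetric fold change between predicted and measured depth is at most 3. A minimal sketch; the depth values below are illustrative assumptions, not data from the study:

```python
import numpy as np

def within_fold_accuracy(depth_true, depth_pred, fold=3.0):
    """Fraction of probes whose predicted depth is within a
    multiplicative factor `fold` of the measured depth."""
    t = np.asarray(depth_true, float)
    p = np.asarray(depth_pred, float)
    ratio = np.maximum(p / t, t / p)   # symmetric fold change, always >= 1
    return float(np.mean(ratio <= fold))

# Hypothetical measured vs. predicted read depths for 5 probes.
true_depth = [100, 200, 50, 400, 10]
pred_depth = [250, 180, 20, 500, 35]
acc = within_fold_accuracy(true_depth, pred_depth, fold=3.0)
# 4 of 5 probes fall within a factor of 3 → acc = 0.8.
```

A multiplicative tolerance is the natural choice here because sequencing depth varies over orders of magnitude across probes, so an additive error bound would be meaningless for low-depth targets.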

Internal Medicine Internal Medicine

Deep learning for abdominal ultrasound: A computer-aided diagnostic system for the severity of fatty liver.

In Journal of the Chinese Medical Association : JCMA

BACKGROUND : The prevalence of nonalcoholic fatty liver disease is increasing over time worldwide, with trends similar to those of diabetes and obesity. A liver biopsy, the gold standard of diagnosis, is not favored due to its invasiveness. Meanwhile, noninvasive evaluation methods for fatty liver are still either very expensive or demonstrate poor diagnostic performance, thus limiting their applications. We developed neural network-based models to assess fatty liver and classify its severity using B-mode ultrasound (US) images.

METHODS : We followed STARD guidelines to report this study. In this retrospective study, we utilized B-mode US images from a consecutive series of patients to develop four-class, two-class, and three-class diagnostic prediction models. The images were eligible if confirmed by at least two gastroenterologists. We compared pretrained convolutional neural network models, consisting of VGG19, ResNet-50 v2, MobileNet v2, Xception, and Inception v2. For validation, we utilized 20% of the dataset resulting in >100 images for each severity category.

RESULTS : There were 21,855 images from 2,070 patients classified as normal (N = 11,307), mild (N = 4,467), moderate (N = 3,155), or severe steatosis (N = 2,926). We selected ResNet-50 v2 as the final model because it performed best. The areas under the receiver operating characteristic curves were 0.974 (mild steatosis vs. others), 0.971 (moderate steatosis vs. others), 0.981 (severe steatosis vs. others), 0.985 (any severity vs. normal), and 0.996 (moderate-to-severe steatosis/clinically abnormal vs. normal-to-mild steatosis/clinically normal).

CONCLUSION : Our deep learning models achieved predictive performances comparable to the most accurate, yet expensive, noninvasive diagnostic methods for fatty liver. Given their discriminative ability, including for mild steatosis, a significant impact on clinical applications for fatty liver is expected. However, machine-dependent variation, motion artifacts, the lack of secondary confirmation from other tools, and hospital-dependent regional bias still need to be overcome.

Chou Tsung-Hsien, Yeh Hsing-Jung, Chang Chun-Chao, Tang Jui-Hsiang, Kao Wei-Yu, Su I-Chia, Li Chien-Hung, Chang Wei-Hao, Huang Chun-Kai, Sufriyana Herdiantri, Su Emily Chia-Yu

2021-Jul-16

General General

A method for measuring investigative journalism in local newspapers.

In Proceedings of the National Academy of Sciences of the United States of America

Major changes to the operation of local newsrooms (ownership restructuring, layoffs, and a reorientation away from print advertising) have become commonplace in the last few decades. However, there have been few systematic attempts to characterize the impact of these changes on the types of reporting that local newsrooms produce. In this paper, we propose a method to measure the investigative content of news articles based on article text and influence on subsequent articles. We use our method to examine over-time and cross-sectional patterns in news production by local newspapers in the United States over the past decade. We find surprising stability in the quantity of investigative articles produced over most of the time period examined, but a notable decline in the last 2 y of the decade, corresponding to a recent wave of newsroom layoffs.

Turkel Eray, Saha Anish, Owen Rhett Carson, Martin Gregory J, Vasserman Shoshana

2021-Jul-27

journalistic impact, local news, machine learning

Dermatology Dermatology

DNA methylation-based prediction of response to immune checkpoint inhibition in metastatic melanoma.

In Journal for immunotherapy of cancer

BACKGROUND : Therapies based on targeting immune checkpoints have revolutionized the treatment of metastatic melanoma in recent years. Still, biomarkers predicting long-term therapy responses are lacking.

METHODS : A novel approach of reference-free deconvolution of large-scale DNA methylation data enabled us to develop a machine learning classifier based on CpG sites, specific for latent methylation components (LMC), that allowed for patient allocation to prognostic clusters. DNA methylation data were processed using reference-free analyses (MeDeCom) and reference-based computational tumor deconvolution (MethylCIBERSORT, LUMP).

RESULTS : We provide evidence that DNA methylation signatures of tumor tissue from cutaneous metastases are predictive for therapy response to immune checkpoint inhibition in patients with stage IV metastatic melanoma.

CONCLUSIONS : These results demonstrate that LMC-based segregation of large-scale DNA methylation data is a promising tool for classifier development and treatment response estimation in cancer patients under targeted immunotherapy.

Filipski Katharina, Scherer Michael, Zeiner Kim N, Bucher Andreas, Kleemann Johannes, Jurmeister Philipp, Hartung Tabea I, Meissner Markus, Plate Karl H, Fenton Tim R, Walter Jörn, Tierling Sascha, Schilling Bastian, Zeiner Pia S, Harter Patrick N

2021-Jul

biomarkers, biostatistics, immunotherapy, melanoma, tumor, tumor biomarkers

General General

[Use of machine learning for the prediction of stress using the example of logistics].

In Zeitschrift fur Arbeitswissenschaft

Stress and its complex effects have been researched since the beginning of the 20th century. The manifold psychological and physical stressors in the world of work can, in sum, lead to disorders of the organism and to illness. Since the physical and subjective consequences of stress vary individually, no absolute threshold values can be determined. In this article, machine learning (ML) methods are used to systematically recognize patterns in physiological and subjective stress parameters and to predict stress. The logistics sector serves as a practical application case in which stress factors are often rooted in the activity and work organisation. One design element for the prevention of stress is the work break. ML methods are used to investigate the extent to which stress can be predicted on the basis of physiological and subjective parameters in order to recommend breaks individually. The article presents the interim status of a software solution for dynamic break management in logistics. Practical relevance: The aim of the software solution "Dynamic Break" is to prevent stress resulting from mental and physical stress factors in logistics and to keep employees healthy, satisfied, fit for work, and productive in the long term. Individualized rest breaks as a design element can support companies in deploying human resources more flexibly in line with the dynamic requirements of logistics.

Foot Hermann, Mättig Benedikt, Fiolka Michael, Grylewicz Tim, Ten Hompel Michael, Kretschmer Veronika

2021-Jul-13

Break management, Machine learning, Psychophysiology, Sensor technology, Stress

General General

A concise review: the synergy between artificial intelligence and biomedical nanomaterials that empowers nanomedicine.

In Biomedical materials (Bristol, England)

Nanomedicine has recently experienced unprecedented growth and development. However, the complexity of operations at the nanoscale introduces a layer of difficulty in the clinical translation of nanodrugs and biomedical nanotechnology. This problem is further exacerbated when engineering and optimizing nanomaterials for biomedical purposes. To navigate this issue, artificial intelligence algorithms have been applied for data analysis and inference, allowing for a more applicable understanding of the complex interaction amongst the abundant variables in a system involving the synthesis or use of nanomedicine. Here, we report on the current relationship and implications of nanomedicine and artificial intelligence. Particularly, we explore artificial intelligence as a tool for enabling nanomedicine in the context of nanodrug screening and development, brain machine interfaces and nanotoxicology. We also report on the current state and future direction of nanomedicine and artificial intelligence in cancer, diabetes, and neurological disorder therapy.

Hayat Hasaan, Nukala Arijit, Nyamira Anthony, Fan Jinda, Wang Ping

2021-Jul-19

Artificial Intelligence, Deep Learning, Machine Learning, Nanomedicine, Nanotechnology, Theranostic, Type 1 Diabetes

General General

Abnormal causal connectivity of left superior temporal gyrus in drug-naïve first- episode adolescent-onset schizophrenia: A resting-state fMRI study.

In Psychiatry research. Neuroimaging

This study aimed to investigate the alterations of causal connectivity between the brain regions in Adolescent-onset schizophrenia (AOS) patients. Thirty-two first-episode drug-naïve AOS patients and 27 healthy controls (HC) were recruited for resting-state functional MRI scanning. The brain region with the between-group difference in regional homogeneity (ReHo) values was chosen as a seed to perform the Granger causality analysis (GCA) and further detect the alterations of causal connectivity in AOS. AOS patients exhibited increased ReHo values in left superior temporal gyrus (STG) compared with HCs. Significantly decreased values of outgoing Granger causality from left STG to right superior frontal gyrus and right angular gyrus were observed in GC mapping for AOS. Significantly stronger causal outflow from left STG to right insula and stronger causal inflow from right middle occipital gyrus (MOG) to left STG were also observed in AOS patients. Based on assessments of the two strengthened causal connectivity of the left STG with insula and MOG, a discriminant model could identify all patients from controls with 94.9% accuracy. This study indicated that alterations of directional connections in left STG may play an important role in the pathogenesis of AOS and serve as potential biomarkers for the disease.

Lyu Hailong, Jiao Jianping, Feng Guoxun, Wang Xinxin, Sun Bin, Zhao Zhiyong, Shang Desheng, Pan Fen, Xu Weijuan, Duan Jinfeng, Zhou Qingshuang, Hu Shaohua, Xu Yi, Xu Dongrong, Huang Manli

2021-Jul-03

Adolescent-onset schizophrenia, Granger causality analysis, Machine learning, Regional homogeneity, fMRI
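Granger causality analysis as used above asks whether past values of one region's signal improve prediction of another region's signal beyond what the target's own history provides. A minimal lag-1 sketch using ordinary least squares F-testing on synthetic series (illustrative only, not the study's pipeline; real fMRI GCA involves preprocessing and statistical thresholding not shown here):

```python
import numpy as np

def granger_f(x, y, lag=1):
    """One-directional Granger causality x -> y (sketch): compare an AR
    model of y on its own lags (restricted) with a model that also
    includes lags of x (full) via an F statistic."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(y) - lag
    Y = y[lag:]
    y_lags = np.column_stack([y[lag - i - 1: len(y) - i - 1] for i in range(lag)])
    x_lags = np.column_stack([x[lag - i - 1: len(x) - i - 1] for i in range(lag)])
    ones = np.ones((n, 1))
    X_r = np.hstack([ones, y_lags])           # restricted: y's own history
    X_f = np.hstack([ones, y_lags, x_lags])   # full: adds x's history
    rss = lambda X: np.sum((Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]) ** 2)
    rss_r, rss_f = rss(X_r), rss(X_f)
    df1, df2 = lag, n - X_f.shape[1]
    return ((rss_r - rss_f) / df1) / (rss_f / df2)

# Toy series in which x drives y with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.roll(x, 1) * 0.8 + rng.normal(scale=0.1, size=200)
y[0] = 0.0
```

A large F in the x → y direction but not in y → x is what "stronger causal outflow" from one region to another refers to in the abstract.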

General General

Fusion-based framework for meteorological drought modeling using remotely sensed datasets under climate change scenarios: Resilience, vulnerability, and frequency analysis.

In Journal of environmental management

Severe drought events in recent decades and their catastrophic effects have called for drought prediction and monitoring needed for developing drought readiness plans and mitigation measures. This study used a fusion-based framework for meteorological drought modeling for the historical (1983-2016) and future (2020-2050) periods using remotely sensed datasets versus ground-based observations and climate change scenarios. To this aim, high-resolution remotely sensed precipitation datasets, including PERSIANN-CDR and CHIRPS (multi-source products), ERA5 (reanalysis datasets), and GPCC (gauge-interpolated datasets), were employed to estimate non-parametric SPI (nSPI) as a meteorological drought index against local observations. For more accurate drought evaluation, all stations were classified into different clusters using the K-means clustering algorithm based on ground-based nSPI. Then, four Individual Artificial Intelligence (IAI) models, including Adaptive Neuro-Fuzzy Inference System (ANFIS), Group Method of Data Handling (GMDH), Multi-Layer Perceptron (MLP), and General Regression Neural Network (GRNN), were developed for drought modeling within each cluster. Finally, two advanced fusion-based methods, including Multi-Model Super Ensemble (MMSE) as a linear weighted model and a nonlinear model called machine learning Random Forest (RF), combined results by IAI models using different remotely sensed datasets. The proposed framework was implemented to simulate each remotely sensed precipitation data for the future based on CORDEX regional climate models (RCMs) under RCP4.5 and RCP8.5 scenarios for drought projection. The efficiency of IAI and fusion models was evaluated using statistical error metrics, including the coefficient of determination (R2), Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE). 
The proposed methodology was employed in the Gavkhooni basin of Iran, and the results showed that the RF model, with the lowest estimation error (RMSE of 0.391 and R2 of 0.810), outperformed all other models. Finally, the resilience, vulnerability, and frequency of probability metrics indicated that drought at the 12-month time scale affected the basin more severely than at other time scales.
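The nonlinear fusion step can be sketched with scikit-learn; the nSPI series, the four IAI model outputs, and their noise levels below are simulated stand-ins, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical ground-based nSPI series and predictions from four
# individual AI (IAI) models driven by different precipitation datasets.
n_months = 300
nspi_true = rng.normal(0.0, 1.0, n_months)
iai_preds = np.column_stack(
    [nspi_true + rng.normal(0.0, s, n_months) for s in (0.3, 0.4, 0.5, 0.6)]
)

# Nonlinear fusion: a Random Forest learns to combine the IAI outputs,
# implicitly down-weighting the noisier models.
train, test = slice(0, 240), slice(240, None)
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(iai_preds[train], nspi_true[train])
fused = rf.predict(iai_preds[test])

rmse = float(np.sqrt(np.mean((fused - nspi_true[test]) ** 2)))
```

In the study, the fusion models are trained per K-means station cluster against ground-based nSPI; a single series is used here only to illustrate the principle.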

Fooladi Mahmood, Golmohammadi Mohammad H, Safavi Hamid R, Singh Vijay P

2021-Jul-16

CORDEX RCMs, Frequency, Ground-based observations, Individual and fusion models, Meteorological drought, Remotely sensed datasets, Resilience, Vulnerability

General General

MIDCAN: A multiple input deep convolutional attention network for Covid-19 diagnosis based on chest CT and chest X-ray.

In Pattern recognition letters

Background : COVID-19 had caused 3.34 million deaths as of 13 May 2021, and new confirmed cases and deaths continue to be reported every day.

Method : This study investigated whether fusing chest CT with chest X-ray can help improve the AI's diagnosis performance. Data harmonization is employed to make a homogeneous dataset. We create an end-to-end multiple-input deep convolutional attention network (MIDCAN) by using the convolutional block attention module (CBAM). One input of our model receives the 3D chest CT image, and the other input receives the 2D X-ray image. In addition, multiple-way data augmentation is used to generate synthetic data for the training set. Grad-CAM is used to produce explainable heatmaps.
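The channel-attention branch of CBAM can be illustrated in plain NumPy; the MLP weights below are random placeholders (in MIDCAN they are learned end-to-end), and CBAM's spatial-attention branch is omitted:

```python
import numpy as np

def channel_attention(feat, reduction=4):
    """Channel-attention sketch in the spirit of CBAM: squeeze the
    spatial dimensions with average- and max-pooling, pass both
    through a shared two-layer MLP, and gate channels with a sigmoid."""
    c = feat.shape[0]
    rng = np.random.default_rng(0)
    w1 = rng.normal(0, 0.1, (c // reduction, c))  # shared MLP, hidden layer
    w2 = rng.normal(0, 0.1, (c, c // reduction))  # shared MLP, output layer
    avg = feat.mean(axis=(1, 2))                  # squeezed descriptors
    mx = feat.max(axis=(1, 2))
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden activation
    gate = 1.0 / (1.0 + np.exp(-(mlp(avg) + mlp(mx))))  # per-channel weight
    return feat * gate[:, None, None]

feat = np.random.default_rng(1).normal(size=(8, 16, 16))  # C x H x W features
out = channel_attention(feat)
```

Applying such gating after each modality's feature extractor and then combining the gated features is one common way to realize a multiple-input design.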

Results : The proposed MIDCAN achieves a sensitivity of 98.10±1.88%, a specificity of 97.95±2.26%, and an accuracy of 98.02±1.35%.

Conclusion : Our MIDCAN method provides better results than 8 state-of-the-art approaches. We demonstrate that using multiple modalities can achieve better results than a single modality. We also demonstrate that CBAM can help improve the diagnosis performance.

Zhang Yu-Dong, Zhang Zheng, Zhang Xin, Wang Shui-Hua

2021-Jul-14

Automatic differentiation, COVID-19, Chest CT, Chest X-ray, Convolutional neural network, Data harmonization, Deep learning, Multimodality, Multiple input

General General

Identifying Communities at Risk for COVID-19-Related Burden Across 500 U.S. Cities and within New York City: Unsupervised Learning of Co-Prevalence of Health Indicators.

In JMIR public health and surveillance

BACKGROUND : While it is well-known that older individuals with certain comorbidities are at highest risk for complications related to COVID-19 including hospitalization and death, we lack tools to identify communities at highest risk with fine-grained spatial resolution. Information collected at a county level obscures local risk and complex interactions between clinical comorbidities, the built environment, population factors, and other social determinants of health.

OBJECTIVE : To develop a COVID-19 Community Risk Score that summarizes complex disease prevalence together with age and sex, and to compare the score to social determinants of health indicators and built environment measures derived from satellite images using deep learning.

METHODS : We developed a robust COVID-19 Community Risk Score (COVID-19 Risk Score) that summarizes the complex disease co-occurrences (using data for 2019) for individual census tracts with unsupervised learning; diseases were selected on the basis of their association with risk for COVID-19 complications, such as death. We mapped the COVID-19 Risk Score to corresponding zip codes in New York City and associated the score with COVID-19-related death. We further modeled the variance of the COVID-19 Risk Score using satellite imagery and social determinants of health.

RESULTS : Using 2019 chronic disease data, the COVID-19 Risk Score describes 85% of variation in co-occurrence of 15 diseases and health behaviors that are risk factors for COVID-19 complications among ~28K census tract neighborhoods (median population size of tracts: 4,091). The COVID-19 Risk Score is associated with a 40% greater risk for COVID-19 related death across New York City (April and September 2020) for a 1 standard deviation (SD) change in the score (risk ratio for 1 SD change in COVID-19 Risk Score: 1.4, P < .001) at the zip code level. Satellite imagery coupled with social determinants of health explain nearly 90% of the variance in the COVID-19 Risk Score in the United States in census tracts (r2 = 0.87).

CONCLUSIONS : The COVID-19 Risk Score localizes risk at the census tract level and was able to predict COVID-19 related mortality in New York City. The built environment explained significant variations in the score, suggesting risk models could be enhanced with satellite imagery.
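The abstract does not name the specific unsupervised method, but a first principal component of standardized disease prevalences is one standard way to obtain such a one-number co-occurrence score; the tract-level data here are simulated, so the method choice and numbers are purely illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical prevalence (%) of 15 conditions across 1,000 tracts,
# driven by one latent "community burden" factor plus noise.
burden = rng.normal(0, 1, (1000, 1))
loadings = rng.uniform(0.5, 1.5, (1, 15))
prevalence = 10 + burden @ loadings + rng.normal(0, 0.5, (1000, 15))

# Standardize, then summarize co-occurrence with the first PC.
X = StandardScaler().fit_transform(prevalence)
pca = PCA(n_components=1).fit(X)
risk_score = pca.transform(X).ravel()  # one score per tract

var_explained = float(pca.explained_variance_ratio_[0])
```

The `explained_variance_ratio_` value plays the role of the "85% of variation" figure reported in the abstract; a score expressed in standard-deviation units then supports risk-ratio statements per 1 SD change.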

Deonarine Andrew, Lyons Genevieve, Lakhani Chirag

2021-Jul-15

Oncology Oncology

A Computational Tumor-Infiltrating Lymphocyte Assessment Method Comparable with Visual Reporting Guidelines for Triple-Negative Breast Cancer.

In EBioMedicine

BACKGROUND : Tumor-infiltrating lymphocytes (TILs) are clinically significant in triple-negative breast cancer (TNBC). Although a standardized methodology for visual TILs assessment (VTA) exists, it has several inherent limitations. We established a deep learning-based computational TIL assessment (CTA) method broadly following the VTA guideline and compared it with VTA for TNBC to determine the prognostic value of the CTA and a reasonable CTA workflow for clinical practice.

METHODS : We trained three deep neural networks for nuclei segmentation, nuclei classification and necrosis classification to establish a CTA workflow. The automatic TIL (aTIL) score generated was compared with manual TIL (mTIL) scores provided by three pathologists in an Asian (n = 184) and a Caucasian (n = 117) TNBC cohort to evaluate scoring concordance and prognostic value.

FINDINGS : The intraclass correlations (ICCs) between aTILs and mTILs varied from 0.40 to 0.70 in two cohorts. Multivariate Cox proportional hazards analysis revealed that the aTIL score was associated with disease free survival (DFS) in both cohorts, as either a continuous [hazard ratio (HR)=0.96, 95% CI 0.94-0.99] or dichotomous variable (HR=0.29, 95% CI 0.12-0.72). A higher C-index was observed in a composite mTIL/aTIL three-tier stratification model than in the dichotomous model, using either mTILs or aTILs alone.

INTERPRETATION : The current study provides a useful tool for stromal TIL assessment and prognosis evaluation for patients with TNBC. A workflow integrating both VTA and CTA may aid pathologists in performing risk management and decision-making tasks.

Sun Peng, He Jiehua, Chao Xue, Chen Keming, Xu Yuanyuan, Huang Qitao, Yun Jingping, Li Mei, Luo Rongzhen, Kuang Jinbo, Wang Huajia, Li Haosen, Hui Hui, Xu Shuoyu

2021-Jul-16

Deep learning, Prognosis, Triple-negative breast cancer, Tumor-infiltrating lymphocyte

Cardiology Cardiology

A machine learning-based risk stratification model for ventricular tachycardia and heart failure in hypertrophic cardiomyopathy.

In Computers in biology and medicine

BACKGROUND : Machine learning (ML) and artificial intelligence are emerging as important components of precision medicine that enhance diagnosis and risk stratification. Risk stratification tools for hypertrophic cardiomyopathy (HCM) exist, but they are based on traditional statistical methods. The aim of this study was to develop a novel machine learning risk stratification tool for predicting 5-year risk in HCM and to determine whether its predictive accuracy is higher than that of state-of-the-art tools.

METHOD : Data from a total of 2302 patients were used. The data comprised demographic characteristics, genetic data, clinical investigations, medications, and disease-related events. Four classification models were applied to model the risk level, and their decisions were explained using the SHAP (SHapley Additive exPlanations) method. Unwanted cardiac events were defined as sustained ventricular tachycardia (VT), heart failure (HF), ICD activation, sudden cardiac death (SCD), cardiac death, and all-cause death.

RESULTS : The proposed machine learning approach outperformed similar existing risk-stratification models for SCD, cardiac death, and all-cause death risk stratification, achieving higher AUC by 17%, 9%, and 1%, respectively. The boosted-trees model achieved the best overall AUC of 0.82. The resulting model most accurately predicts VT, HF, and ICD activation, with AUCs of 0.90, 0.88, and 0.87, respectively.
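A minimal boosted-trees classifier with AUC evaluation can be sketched with scikit-learn on simulated (hypothetical) patient features; the SHAP explanation step reported in the study is omitted here:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical HCM cohort: six clinical features and a binary
# ventricular-tachycardia outcome loosely tied to the first two features.
X = rng.normal(size=(2000, 6))
p = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] - 0.5 * X[:, 1])))
y = (rng.uniform(size=2000) < p).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```

The AUC on held-out data is the quantity the abstract reports per outcome; in practice each outcome (VT, HF, ICD activation, etc.) would get its own model and evaluation.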

CONCLUSIONS : The proposed risk-stratification model demonstrates high accuracy in predicting events in patients with hypertrophic cardiomyopathy. The use of a machine-learning risk stratification model may improve patient management, clinical practice, and outcomes in general.

Smole Tim, Žunkovič Bojan, Pičulin Matej, Kokalj Enja, Robnik-Šikonja Marko, Kukar Matjaž, Fotiadis Dimitrios I, Pezoulas Vasileios C, Tachos Nikolaos S, Barlocco Fausto, Mazzarotto Francesco, Popović Dejana, Maier Lars, Velicki Lazar, MacGowan Guy A, Olivotto Iacopo, Filipović Nenad, Jakovljević Djordje G, Bosnić Zoran

2021-Jul-12

Artificial intelligence, Hypertrophic cardiomyopathy, Machine learning, Risk stratification

General General

Enhancing the estimation of fiber orientation distributions using convolutional neural networks.

In Computers in biology and medicine

Local fiber orientation distributions (FODs) can be computed from diffusion magnetic resonance imaging (dMRI). The accuracy and ability of FODs to resolve complex fiber configurations benefit from acquisition protocols that sample a high number of gradient directions, a high maximum b-value, and multiple b-values. However, acquisition time and scanners that follow these standards are limited in clinical settings, often resulting in dMRI acquired at a single shell (single b-value). In this work, we learn improved FODs from clinically acquired dMRI. We evaluate patch-based 3D convolutional neural networks (CNNs) on their ability to regress multi-shell FODs from single-shell FODs, using constrained spherical deconvolution (CSD). We evaluate U-Net and High-Resolution Network (HighResNet) 3D CNN architectures on data from the Human Connectome Project and an in-house dataset. We evaluate how well each CNN can resolve FODs 1) when training and testing on datasets with the same dMRI acquisition protocol; 2) when testing on a dataset with a different dMRI acquisition protocol than used to train the CNN; and 3) when testing on a dataset with fewer gradient directions than used to train the CNN. This work is a step towards more accurate FOD estimation in time- and resource-limited clinical environments.

Lucena Oeslle, Vos Sjoerd B, Vakharia Vejay, Duncan John, Ashkan Keyoumars, Sparks Rachel, Ourselin Sebastien

2021-Jul-14

Constrained spherical deconvolution, Deep learning, Diffusion weighted image, Tractography

General General

In silico, in vitro, and in vivo machine learning in synthetic biology and metabolic engineering.

In Current opinion in chemical biology

Among the main learning methods reviewed in this study and used in synthetic biology and metabolic engineering are supervised learning, reinforcement and active learning, and in vitro or in vivo learning. In the context of biosynthesis, supervised machine learning is being exploited to predict biological sequence activities, predict structures and engineer sequences, and optimize culture conditions. Active and reinforcement learning methods use training sets acquired through an iterative process generally involving experimental measurements. They are applied to design, engineer, and optimize metabolic pathways and bioprocesses. The nascent but promising developments with in vitro and in vivo learning comprise molecular circuits performing simple tasks such as pattern recognition and classification.

Faulon Jean-Loup, Faure Léon

2021-Jul-16

Active learning, Artificial neural networks, Machine learning, Metabolic engineering, Perceptron, Reinforcement learning, Synthetic biology

General General

Deriving a joint risk estimate from dynamic data collected at motorcycle rides.

In Accident; analysis and prevention

Making motorcycle rides safer through advanced technology is an ongoing challenge in the context of developing driving assistant systems and safety infrastructure. Determining which section of a road and which driving behaviour is "safe" or "unsafe" is rarely possible due to individual differences in driving experience, driving style, fitness, and potentially available assistant systems. This study investigates the feasibility of a new approach to quantify motorcycle riding risk for an experimental sample of bikers by collecting motorcycle-specific dynamic data from several riders on selected road sections. Comparing clustered dynamics with the observed dynamic data at known risk spots, we provide a method to represent individual risk estimates in a single risk map for the investigated road section. This yields a map of potential risk spots, based on an aggregation of individual risk estimates. The risk map is optimized to include most of the previous accident sites while keeping the overall area classified as risky small. As such, with data collected on a large scale, the presented methodology could guide safety inspections at the highlighted areas of a risk map and be the basis of further studies into the safety-relevant differences in driving styles.
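The clustering of ride dynamics can be sketched as follows; the lean-angle and deceleration features, the three riding regimes, and the rule for flagging the "risky" cluster are all hypothetical illustrations of the approach, not the study's measurements:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical per-segment ride dynamics: lean angle (deg) and
# longitudinal deceleration (m/s^2); two benign regimes plus a
# harsher regime concentrated near known risk spots.
calm = rng.normal([10, 1.0], [3, 0.3], (300, 2))
cornering = rng.normal([30, 2.0], [4, 0.4], (300, 2))
harsh = rng.normal([45, 5.0], [4, 0.5], (60, 2))
X = np.vstack([calm, cornering, harsh])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Flag the cluster with the strongest mean deceleration as "risky";
# mapping its segments back to road coordinates would give the risk map.
decel_by_cluster = [X[km.labels_ == k, 1].mean() for k in range(3)]
risky = int(np.argmax(decel_by_cluster))
risky_share = float(np.mean(km.labels_ == risky))
```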

Hula Andreas, Fürnsinn Florian, Schwieger Klemens, Saleh Peter, Neumann Manfred, Ecker Horst

2021-Jul-16

Accident spots, Human behaviour, Machine learning, Motorcycle safety, Risk map, Statistics

General General

Low-shot transfer with attention for highly imbalanced cursive character recognition.

In Neural networks : the official journal of the International Neural Network Society

Recognition of ancient Korean-Chinese cursive characters (Hanja) is a challenging problem, mainly because of the large number of classes, damaged cursive characters, various handwriting styles, and similar confusable characters. The data also suffer from a lack of training samples and class imbalance. To address these problems, we propose a unified Regularized Low-shot Attention Transfer with Imbalance τ-Normalizing (RELATIN) framework. It handles instance-poor classes using a novel low-shot regularizer that encourages the norms of the weight vectors for classes with few samples to be aligned to those of many-shot classes. To overcome the class imbalance problem, we incorporate a decoupled classifier that rectifies the decision boundaries via classifier weight-scaling into the proposed low-shot regularizer framework. To address the limited training data issue, the proposed framework performs Jensen-Shannon divergence-based data augmentation and incorporates an attention module that aligns the most attentive features of the pretrained network to a target network. We verify the proposed RELATIN framework using highly imbalanced ancient cursive handwritten character datasets. The results suggest that (i) extreme class imbalance has a detrimental effect on classification performance; (ii) the proposed low-shot regularizer aligns the norm of the classifier in favor of classes with few samples; (iii) weight-scaling of the decoupled classifier for addressing class imbalance was dominant across all the other baseline conditions; (iv) the further addition of the attention module selects more representative feature maps from the base pretrained model; and (v) the proposed RELATIN framework results in superior representations for addressing the extreme class imbalance issue.
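The weight-scaling idea behind a decoupled τ-normalized classifier can be shown in a few lines of NumPy; the weight matrix is random and τ = 1 is an illustrative choice:

```python
import numpy as np

def tau_normalize(W, tau=1.0):
    """tau-normalized classifier sketch: rescale each class's weight
    vector by its norm raised to tau, shrinking the advantage that
    many-shot classes gain through larger weight norms."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / (norms ** tau)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 16))   # 5 classes x 16 features
W[0] *= 10.0                   # a many-shot class with an inflated norm
W_tau = tau_normalize(W, tau=1.0)

norms_after = np.linalg.norm(W_tau, axis=1)
```

With τ = 1 all class weight vectors end up with unit norm; intermediate τ values interpolate between the original and fully normalized classifier, which is the tuning knob such methods expose.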

Jalali Amin, Kavuri Swathi, Lee Minho

2021-Jul-08

Attention transfer learning, Decoupled τ-normalized classifier, Highly imbalanced data samples, Low-shot regularizer, Traditional cursive character recognition

Pathology Pathology

BrcaSeg: A Deep Learning Approach for Tissue Quantification and Genomic Correlations of Histopathological Images.

In Genomics, proteomics & bioinformatics

Epithelial and stromal tissues are components of the tumor microenvironment and play a major role in tumor initiation and progression. Distinguishing stroma from epithelial tissues is critically important for spatial characterization of the tumor microenvironment. We propose BrcaSeg, an image analysis pipeline based on a convolutional neural network (CNN) model to classify epithelial and stromal regions in whole-slide hematoxylin and eosin (H&E) stained histopathological images. The CNN model was trained using well-annotated breast cancer tissue microarrays and validated with images from The Cancer Genome Atlas (TCGA) Program. BrcaSeg achieves a classification accuracy of 91.02%, which outperforms other state-of-the-art methods. Using this model, we generated pixel-level epithelial/stromal tissue maps for 1000 TCGA breast cancer slide images that are paired with gene expression data. We subsequently estimated the epithelial and stromal ratios and performed correlation analysis to model the relationship between gene expression and tissue ratios. Gene Ontology (GO) enrichment analyses of genes that were highly correlated with tissue ratios suggest that the same tissue was associated with similar biological processes in different breast cancer subtypes, whereas each subtype also had its own idiosyncratic biological processes governing the development of these tissues. Taken together, our approach can lead to new insights into the relationships between image-based phenotypes and their underlying genomic events and biological processes for all types of solid tumors. BrcaSeg can be accessed at https://github.com/Serian1992/ImgBio.
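The downstream correlation analysis reduces to a per-gene Pearson test between tissue ratio and expression; a sketch on simulated slide-level values (all numbers hypothetical):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical per-slide stromal ratios (from the segmentation maps)
# and the expression of one gene across 1,000 slides.
stromal_ratio = rng.uniform(0.1, 0.7, 1000)
expression = 2.0 * stromal_ratio + rng.normal(0, 0.3, 1000)

r, p = pearsonr(stromal_ratio, expression)
```

Genes whose |r| clears a threshold (with a multiple-testing correction across all genes) would then be fed into the GO enrichment step.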

Lu Zixiao, Zhan Xiaohui, Wu Yi, Cheng Jun, Shao Wei, Ni Dong, Han Zhi, Zhang Jie, Feng Qianjin, Huang Kun

2021-Jul-16

Breast cancer, Computational pathology, Deep learning, Integrative genomics, Whole-slide tissue image

Public Health Public Health

Generalizability of heterogeneous treatment effects based on causal forests applied to two randomized clinical trials of intensive glycemic control.

In Annals of epidemiology ; h5-index 39.0

PURPOSE : Machine learning is an attractive tool for identifying heterogeneous treatment effects (HTE) of interventions but generalizability of machine learning derived HTE remains unclear. We examined generalizability of HTE detected using causal forests in two similarly designed randomized trials in type 2 diabetes patients.

METHODS : We evaluated published HTE of intensive versus standard glycemic control on all-cause mortality from the Action to Control Cardiovascular Risk in Diabetes study (ACCORD) in a second trial, the Veterans Affairs Diabetes Trial (VADT). We then applied causal forests to VADT, ACCORD, and pooled data from both studies and compared variable importance and subgroup effects across samples.

RESULTS : HTE in ACCORD did not replicate in similar subgroups in VADT, but variable importance was correlated between VADT and ACCORD (Kendall's tau-b 0.75). Applying causal forests to pooled individual-level data yielded seven subgroups with similar HTE across both studies, ranging from risk difference of all-cause mortality of -3.9% (95% CI -7.0, -0.8) to 4.7% (95% CI 1.8, 7.5).
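Kendall's tau-b between two variable-importance profiles is a one-liner with SciPy; the importance values below are invented for illustration (they yield tau-b of about 0.71, close to the reported 0.75):

```python
import numpy as np
from scipy.stats import kendalltau

# Hypothetical variable-importance scores from causal forests fit
# separately to two trials (higher = more important); the two
# profiles agree on the broad ordering but swap a few neighbors.
importance_accord = np.array([0.30, 0.22, 0.15, 0.12, 0.10, 0.06, 0.05])
importance_vadt = np.array([0.24, 0.25, 0.10, 0.18, 0.09, 0.06, 0.08])

tau, p = kendalltau(importance_accord, importance_vadt)  # tau-b by default
```

A high tau-b says the two forests rank covariates similarly, which, as the abstract notes, does not by itself guarantee that the subgroup effects replicate.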

CONCLUSION : Machine learning detection of HTE subgroups from randomized trials may not generalize across study samples even when variable importance is correlated. Pooling individual-level data may overcome differences in study populations and/or differences in interventions that limit HTE generalizability.

Raghavan Sridharan, Josey Kevin, Bahn Gideon, Reda Domenic, Basu Sanjay, Berkowitz Seth A, Emanuele Nicholas, Reaven Peter, Ghosh Debashis

2021-Jul-16

causal forests, generalizability, glycemic control, heterogeneous treatment effects

General General

Spontaneous transient brain states in EEG source space in disorders of consciousness.

In NeuroImage ; h5-index 117.0

Spontaneous transient states were recently identified by functional magnetic resonance imaging and magnetoencephalography in healthy subjects. They organize and coordinate neural activity in brain networks, but how they are altered in abnormal brain conditions is unknown. Here, we conducted a transient-state analysis in resting-state electroencephalography (EEG) source space and developed a state-transfer analysis for patients with disorders of consciousness (DOC). These analyses uncovered distinct neural coordination patterns, including spatial power patterns, temporal dynamics, spectral shifts, and connectivity construction varying at potentially very fast (millisecond) timescales, across groups with different consciousness levels: healthy subjects, patients in a minimally conscious state (MCS), and patients with vegetative state/unresponsive wakefulness syndrome (VS/UWS). Machine learning based on transient-state features revealed high classification accuracy between MCS and VS/UWS. This study develops a methodology for transient-state analysis in EEG source space and in abnormal brain conditions. The findings correlate spontaneous transient states with human consciousness and suggest potential roles for transient states in brain disease assessment.

Bai Yang, He Jianghong, Xia Xiaoyu, Wang Yong, Yang Yi, Di Haibo, Li Xiaoli, Ziemann Ulf

2021-Jul-16

Disorder of consciousness, Electroencephalography, Hidden Markov model, Transient state

Radiology Radiology

Accelerating Quantitative Susceptibility and R2* Mapping using Incoherent Undersampling and Deep Neural Network Reconstruction.

In NeuroImage ; h5-index 117.0

Quantitative susceptibility mapping (QSM) and R2* mapping are MRI post-processing methods that quantify tissue magnetic susceptibility and transverse relaxation rate distributions. However, QSM and R2* acquisitions are relatively slow, even with parallel imaging. Incoherent undersampling and compressed sensing reconstruction techniques have been used to accelerate traditional magnitude-based MRI acquisitions; however, most do not recover the full phase signal, as required by QSM, due to its non-convex nature. In this study, a learning-based Deep Complex Residual Network (DCRNet) is proposed to recover both the magnitude and phase images from incoherently undersampled data, enabling high acceleration of QSM and R2* acquisition. Magnitude, phase, R2*, and QSM results from DCRNet were compared with two iterative and one deep learning methods on retrospectively undersampled acquisitions from six healthy volunteers, one intracranial hemorrhage patient, and one multiple sclerosis patient, as well as one prospectively undersampled healthy subject, using a 7T scanner. Peak signal-to-noise ratio (PSNR), structural similarity (SSIM), root-mean-squared error (RMSE), and region-of-interest susceptibility and R2* measurements are reported for numerical comparison. The proposed DCRNet method substantially reduced artifacts and blurring compared to the other methods and resulted in the highest PSNR and SSIM and the lowest RMSE on the magnitude, R2*, local field, and susceptibility maps. Compared to the two iterative and one deep learning methods, the DCRNet demonstrated a 3.2% to 9.1% accuracy improvement in deep grey matter susceptibility when accelerated by a factor of four. The DCRNet also dramatically shortened the reconstruction time of single 2D brain images from 36-140 seconds using conventional approaches to only 15-70 milliseconds.
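Two of the reported metrics, RMSE and PSNR, are straightforward to compute; this sketch uses a synthetic image pair (SSIM needs the full windowed formulation and is omitted):

```python
import numpy as np

def rmse(ref, img):
    """Root-mean-squared error between a reference and a reconstruction."""
    return float(np.sqrt(np.mean((ref - img) ** 2)))

def psnr(ref, img):
    """Peak signal-to-noise ratio in dB against the reference's peak value."""
    peak = float(ref.max())
    return float(20.0 * np.log10(peak / rmse(ref, img)))

rng = np.random.default_rng(0)
ref = rng.uniform(0, 1, (64, 64))                          # "ground truth" image
noisy = np.clip(ref + rng.normal(0, 0.05, ref.shape), 0, 1)  # reconstruction

err = rmse(ref, noisy)
quality = psnr(ref, noisy)
```

Higher PSNR and lower RMSE both indicate a reconstruction closer to the fully sampled reference, which is the direction of improvement the abstract reports for DCRNet.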

Gao Yang, Cloos Martijn, Liu Feng, Crozier Stuart, Pike G Bruce, Sun Hongfu

2021-Jul-16

Deep Complex Residual Network (DCRNet), MRI phase acceleration, QSM acceleration, compressed sensing, quantitative susceptibility mapping (QSM)

General General

Prioritization of disease genes from GWAS using ensemble-based positive-unlabeled learning.

In European journal of human genetics : EJHG

A primary challenge in understanding disease biology from genome-wide association studies (GWAS) arises from the inability to directly implicate causal genes from association data. Integration of multiple-omics data sources potentially provides important functional links between associated variants and candidate genes. Machine learning is well positioned to take advantage of a variety of such data and provide a solution for the prioritization of disease genes. Yet, classical positive-negative classifiers impose strong limitations on the gene prioritization procedure, such as a lack of reliable non-causal genes for training. Here, we developed a novel gene prioritization tool, Gene Prioritizer (GPrior). It is an ensemble of five positive-unlabeled bagging classifiers (Logistic Regression, Support Vector Machine, Random Forest, Decision Tree, Adaptive Boosting) that treats all genes of unknown relevance as an unlabeled set. GPrior selects an optimal composition of algorithms to tune the model for each specific phenotype. Altogether, GPrior fills an important niche among methods for GWAS data post-processing, significantly improving the ability to pinpoint disease genes compared to existing solutions.
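The positive-unlabeled bagging idea can be sketched with a single logistic-regression base learner; the gene features are simulated, and GPrior itself ensembles five classifier types with an optimized composition, which this sketch does not reproduce:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical gene features: 40 known disease genes (positives) and
# 960 genes of unknown relevance (unlabeled).
X_pos = rng.normal(0.8, 1.0, (40, 5))
X_unl = rng.normal(0.0, 1.0, (960, 5))

def pu_bagging_scores(X_pos, X_unl, n_rounds=50):
    """PU bagging sketch: repeatedly treat a random unlabeled subsample
    as negatives, train a base classifier, score the out-of-bag
    unlabeled genes, and average the scores across rounds."""
    scores = np.zeros(len(X_unl))
    counts = np.zeros(len(X_unl))
    for seed in range(n_rounds):
        r = np.random.default_rng(seed)
        idx = r.choice(len(X_unl), size=len(X_pos), replace=False)
        X = np.vstack([X_pos, X_unl[idx]])
        y = np.r_[np.ones(len(X_pos)), np.zeros(len(idx))]
        clf = LogisticRegression().fit(X, y)
        oob = np.setdiff1d(np.arange(len(X_unl)), idx)
        scores[oob] += clf.predict_proba(X_unl[oob])[:, 1]
        counts[oob] += 1
    return scores / np.maximum(counts, 1)

priority = pu_bagging_scores(X_pos, X_unl)  # higher = more disease-like
```

Ranking the unlabeled genes by `priority` gives the prioritization; averaging only out-of-bag scores keeps each gene's score from being contaminated by rounds in which it was used as a pseudo-negative.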

Kolosov Nikita, Daly Mark J, Artomov Mykyta

2021-Jul-19

General General

Mining of Opinions on COVID-19 Large-Scale Social Restrictions in Indonesia: Public Sentiment and Emotion Analysis on Online Media.

In Journal of medical Internet research ; h5-index 88.0

BACKGROUND : Among the successful measures to curb the spread of COVID-19 in large populations is the implementation of a movement restriction order. Globally, countries implementing strict movement control were more successful in controlling the spread of the virus than countries with less stringent measures. Society's adherence to the movement control order has helped expedite the flattening of the pandemic curve, as seen in countries such as China and Malaysia. At the same time, some countries face challenges with society's nonconformity toward movement restriction orders due to various claims, including human rights violations as well as socio-cultural and economic issues. In Indonesia, society's adherence to its Large-Scale Social Restrictions (LSSR) order is also a challenge to achieve. Indonesia is regarded as among the worst-performing Southeast Asian countries in terms of managing the spread of COVID-19, as evidenced by the significant number of daily confirmed cases and a total number of deaths amounting to more than 6% of active cases as of May 2020.

OBJECTIVE : To explore public sentiments and emotions toward the LSSR and identify issues, fear and reluctance to observe this restriction among the Indonesian public.

METHODS : This study adopted a sentiment analysis method with a supervised machine learning approach applied to COVID-19-related posts on selected media platforms: Twitter, Facebook, Instagram, and YouTube. The analysis was also performed on COVID-19-related news from more than 500 online news platforms recognized by the Indonesian Press Council. Social media posts and news originating from Indonesian online media between March 31 and May 31, 2020 were analyzed. Emotion analysis on the Twitter platform was also performed to identify collective public emotions toward the LSSR.
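A supervised sentiment classifier of the kind described can be sketched with a TF-IDF pipeline; the example posts and labels are invented (the study's corpus is Indonesian-language and far larger):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled posts; labels: 1 = positive, 0 = negative sentiment.
posts = [
    "the restrictions keep our families safe",
    "i trust the government to manage this well",
    "staying home is the right thing to do",
    "these rules destroy small businesses",
    "the policy is confusing and unfair",
    "people are losing their jobs because of this",
]
labels = [1, 1, 1, 0, 0, 0]

# Vectorize text with TF-IDF, then fit a linear classifier on the labels.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(posts, labels)
train_acc = clf.score(posts, labels)
```

In practice the labeled set would be large and held-out data would be used for evaluation; the fitted model is then applied to the full corpus to tally sentiment shares.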

RESULTS : The study found that positive sentiment surpassed the other sentiment categories, with 1,002,947 mentions (52%) of the total data collected via the search engine. Negative sentiment was recorded at 36% and neutral sentiment at 13%. The analysis of Twitter posts also showed that the majority of the public expressed the emotion of "trust" toward the LSSR.

CONCLUSIONS : Public sentiment toward the LSSR appeared to be positive despite doubts about government consistency in executing the LSSR. The emotion analysis also concluded that the majority of people believe the LSSR is the best method to break the chain of COVID-19 transmission. Overall, Indonesians showed trust and expressed hope in the government's ability to manage the current global health crisis and win against COVID-19.

Tri Sakti Andi Muhammad, Mohamad Emma, Azlan Arina Anis

2021-Jun-15

Public Health Public Health

Development of a smartphone-based lateral-flow imaging system using machine-learning classifiers for detection of Salmonella spp.

In Journal of microbiological methods

Salmonella spp. are foodborne pathogens frequently found in raw meat, egg products, and milk. Salmonella is responsible for numerous outbreaks, making it a frequent and major public-health concern. Many studies have recently reported handheld, rapid devices for microbial detection. This study explored a smartphone-based lateral-flow assay analyzer that employed machine-learning algorithms to detect various concentrations of Salmonella spp. from test line images. When cell numbers are low, a faint test line is difficult to detect, leading to misleading results. Hence, this study focused on the development of a smartphone-based lateral-flow assay (SLFA) to distinguish ambiguous test-line concentrations with higher confidence. A smartphone cradle was designed with an angled slot to maximize the intensity, and the optimal direction of the incident light was determined. Furthermore, combinations of color spaces and machine-learning algorithms were applied to the SLFA for classification. The combination of L*a*b and RGB color spaces with SVM and KNN classifiers achieved high accuracy (95.56%). A blind test was conducted to evaluate the performance of the device; the machine-learning techniques reported fewer errors than visual inspection. The smartphone-based lateral-flow assay provided accurate interpretation, with a detection limit of 5 × 10⁴ CFU/mL, using commercially available lateral-flow assays.
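The classification stage, color features fed to SVM and KNN classifiers, can be sketched with scikit-learn; the RGB/L*a*b* feature values for the two concentration classes below are invented for illustration:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical mean test-line color features (R, G, B, L*, a*, b*) for
# strips at "low" vs "high" Salmonella concentration; the test line
# darkens and reddens as concentration rises.
low = rng.normal([200, 180, 180, 75, 8, 4], 6, (80, 6))
high = rng.normal([180, 140, 150, 62, 20, 6], 6, (80, 6))
X = np.vstack([low, high])
y = np.r_[np.zeros(80), np.ones(80)]

# Cross-validated accuracy for both classifier families used in the study.
accs = [
    float(cross_val_score(clf, X, y, cv=5).mean())
    for clf in (SVC(), KNeighborsClassifier(n_neighbors=5))
]
```

A real pipeline would first segment the test-line region from the cradle-captured photo and convert RGB to L*a*b* before extracting these features.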

Min Hyun Jung, Mina Hansel A, Deering Amanda J, Bae Euiwon

2021-Jul-16

Lateral flow assay, Machine learning algorithms, Salmonella, Smartphone biosensor

General General

A deep learning algorithm for sleep stage scoring in mice based on a multimodal network with fine-tuning technique.

In Neuroscience research

Sleep stage scoring is important for determining sleep structure in preclinical and clinical research. The aim of this study was to develop an automatic sleep stage classification system for mice with a new deep neural network algorithm. For base feature extraction, wake-sleep and rapid eye movement (REM)-non-rapid eye movement (NREM) models were developed by extracting defining features from mouse-derived electromyogram (EMG) and electroencephalogram (EEG) signals, respectively. The wake-sleep model and the REM-NREM sleep model were integrated using three different algorithms: a rule-based integration approach, an ensemble stacking approach, and a multimodal approach with fine-tuning. The deep learning algorithm assessing sleep stages by the multimodal fine-tuning approach showed high potential for increasing the accuracy of sleep stage scoring in mice and promoting sleep research.
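Of the three integration approaches named above, the rule-based one is the simplest to sketch; the per-epoch probabilities and the 0.5 thresholds here are hypothetical:

```python
import numpy as np

def integrate_stages(p_wake, p_rem):
    """Rule-based integration sketch: the EMG-driven model decides
    Wake vs Sleep; for sleep epochs, the EEG-driven model decides
    REM vs NREM."""
    return np.where(p_wake > 0.5, "Wake",
                    np.where(p_rem > 0.5, "REM", "NREM"))

# Hypothetical per-epoch probabilities from the two base models.
p_wake = np.array([0.9, 0.2, 0.1, 0.4])
p_rem = np.array([0.1, 0.8, 0.2, 0.3])
stages = integrate_stages(p_wake, p_rem)
```

The stacking and fine-tuning approaches instead learn the combination, either by training a meta-model on the two outputs or by jointly tuning the fused network.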

Akada Keishi, Yagi Takuya, Miura Yuji, Beuckmann Carsten T, Koyama Noriyuki, Aoshima Ken

2021-Jul-16

Algorithm, Deep learning, NREM sleep, REM sleep, Sleep stage scoring

General General

Few-Shot Learning in Spiking Neural Networks by Multi-Timescale Optimization.

In Neural computation

Learning new concepts rapidly from a few examples is an open issue in spike-based machine learning. This few-shot learning poses substantial challenges to the current learning methodologies of spiking neural networks (SNNs) due to the lack of task-related prior knowledge. The recent learning-to-learn (L2L) approach allows SNNs to acquire prior knowledge through example-level learning and task-level optimization. However, existing L2L-based frameworks do not target the neural dynamics (i.e., neuronal and synaptic parameter changes) on different timescales. This diversity of temporal dynamics is an important attribute in spike-based learning, which facilitates the networks to rapidly acquire knowledge from very few examples and gradually integrate this knowledge. In this work, we consider the neural dynamics on various timescales and provide a multi-timescale optimization (MTSO) framework for SNNs. This framework introduces an adaptive-gated LSTM to accommodate two different timescales of neural dynamics: short-term learning and long-term evolution. Short-term learning is a fast knowledge acquisition process achieved by a novel surrogate gradient online learning (SGOL) algorithm, where the LSTM guides gradient updating of the SNN on a short timescale through an adaptive learning rate and weight decay gating. The long-term evolution aims to slowly integrate the acquired knowledge into prior knowledge, which can be achieved by optimizing the LSTM guidance process to tune SNN parameters on a long timescale. Experimental results demonstrate that the collaborative optimization of multi-timescale neural dynamics can make SNNs achieve promising performance on few-shot learning tasks.
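The core trick in surrogate gradient learning is replacing the zero-almost-everywhere derivative of the spike function with a smooth stand-in; a common choice (the derivative of a fast sigmoid) is sketched below, with a hypothetical sharpness parameter beta:

```python
import numpy as np

def surrogate_spike_grad(v, v_th=1.0, beta=5.0):
    """Surrogate gradient sketch: the Heaviside spike nonlinearity has
    zero gradient almost everywhere, so backpropagation substitutes a
    smooth surrogate; here, the derivative of a fast sigmoid centered
    at the firing threshold v_th."""
    return beta / (1.0 + beta * np.abs(v - v_th)) ** 2

v = np.linspace(-1.0, 3.0, 401)   # a grid of membrane potentials
g = surrogate_spike_grad(v)

# The surrogate peaks at the threshold and decays away from it, so
# neurons near firing receive the largest weight updates.
peak_v = float(v[np.argmax(g)])
```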

Jiang Runhao, Zhang Jie, Yan Rui, Tang Huajin

2021-Jul-19

Radiology Radiology

Fully automated waist circumference measurement on abdominal CT: Comparison with manual measurements and potential value for identifying overweight and obesity as an adjunct output of CT scan.

In PloS one ; h5-index 176.0

OBJECTIVE : Waist circumference (WC) is a widely accepted anthropometric parameter of central obesity. We investigated a fully automated body segmentation algorithm for measuring WC on abdominal computed tomography (CT) in comparison to manual WC measurements (WC-manual) and evaluated the performance of CT-measured WC for identifying overweight/obesity.

MATERIALS AND METHODS : This retrospective study included consecutive adults who underwent both abdominal CT scans and manual WC measurements at a health check-up between January 2013 and November 2019. Mid-waist WCs were automatically measured on noncontrast axial CT images using a deep learning-based body segmentation algorithm. The association between CT-measured WC and WC-manual was assessed by Pearson correlation analysis, and their agreement was assessed through Bland-Altman analysis. The performance of these WC measurements for identifying overweight/obesity (i.e., body mass index [BMI] ≥25 kg/m2) was evaluated using receiver operating characteristic (ROC) curve analysis.

RESULTS : Among 763 subjects whose abdominal CT scans were analyzed using a fully automated body segmentation algorithm, CT-measured WCs were successfully obtained in 757 adults (326 women; mean age, 54.3 years; 64 women and 182 men with overweight/obesity). CT-measured WC was strongly correlated with WC-manual (r = 0.919, p < 0.001), and showed a mean difference of 6.1 cm with limits of agreement between -1.8 cm and 14.0 cm in comparison to WC-manual. For identifying overweight/obesity, CT-measured WC showed excellent performance, with areas under the ROC curve (AUCs) of 0.960 (95% CI, 0.933-0.979) in women and 0.909 (95% CI, 0.878-0.935) in men, which were comparable to WC-manual (AUCs of 0.965 [95% CI, 0.938-0.982] and 0.916 [95% CI, 0.886-0.941]; p = 0.735 and 0.437, respectively).
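
The Bland-Altman analysis reported above (mean difference 6.1 cm with limits of agreement from -1.8 cm to 14.0 cm) reduces to the mean of the pairwise differences plus or minus 1.96 standard deviations. A minimal sketch with made-up waist circumference values, chosen only so the bias comes out near the reported 6.1 cm:

```python
import numpy as np

def bland_altman(a, b):
    """Mean difference and 95% limits of agreement between paired measurements."""
    d = np.asarray(a, float) - np.asarray(b, float)
    mean_diff = d.mean()
    sd = d.std(ddof=1)                       # sample standard deviation of differences
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd

# Toy data: CT-measured WC vs manual WC in cm (illustrative values only)
ct     = [92.0, 101.5, 88.0, 110.2, 95.3]
manual = [86.1,  95.0, 82.5, 103.9, 89.0]
mean_diff, lo, hi = bland_altman(ct, manual)
print(round(mean_diff, 1))  # 6.1 (average bias of CT over manual measurement)
```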

CONCLUSION : CT-measured WC using a fully automated body segmentation algorithm was closely correlated with manually measured WC. While radiation exposure may limit its general use, it can serve as an adjunctive output of abdominal CT scans to identify overweight/obesity.

Joo Ijin, Kwak Min-Sun, Park Dae Hyun, Yoon Soon Ho

2021

General General

Neuron tracing and quantitative analyses of dendritic architecture reveal symmetrical three-way-junctions and phenotypes of git-1 in C. elegans.

In PLoS computational biology

Complex dendritic trees are a distinctive feature of neurons. Alterations to dendritic morphology are associated with developmental, behavioral and neurodegenerative changes. The highly-arborized PVD neuron of C. elegans serves as a model to study dendritic patterning; however, quantitative, objective and automated analyses of PVD morphology are missing. Here, we present a method for neuronal feature extraction, based on deep-learning and fitting algorithms. The extracted neuronal architecture is represented by a database of structural elements for abstracted analysis. We obtain excellent automatic tracing of PVD trees and uncover that dendritic junctions are unevenly distributed. Surprisingly, these junctions are three-way-symmetrical on average, while dendritic processes are arranged orthogonally. We quantify the effect of mutation in git-1, a regulator of dendritic spine formation, on PVD morphology and discover a localized reduction in junctions. Our findings shed new light on PVD architecture, demonstrate the effectiveness of our objective analyses of dendritic morphology, and suggest molecular control mechanisms.
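
The reported three-way symmetry of junctions can be checked numerically: at a perfectly symmetric junction, the three branch directions are 120 degrees apart pairwise. A small sketch of that computation (not the authors' pipeline; the branch vectors are synthetic):

```python
import numpy as np

def junction_angles(vectors):
    """Pairwise angles (degrees) between three branch direction vectors at a junction."""
    v = [np.asarray(x, float) / np.linalg.norm(x) for x in vectors]
    pairs = [(0, 1), (1, 2), (0, 2)]
    # clip guards against tiny floating-point excursions outside [-1, 1]
    return [np.degrees(np.arccos(np.clip(np.dot(v[i], v[j]), -1.0, 1.0)))
            for i, j in pairs]

# A perfectly symmetric three-way junction: branches 120 degrees apart
branches = [(1.0, 0.0),
            (np.cos(np.radians(120)), np.sin(np.radians(120))),
            (np.cos(np.radians(240)), np.sin(np.radians(240)))]
print(np.allclose(junction_angles(branches), 120.0))  # True
```

Averaging such angles over many traced junctions is one way to quantify the symmetry the abstract describes.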

Yuval Omer, Iosilevskii Yael, Meledin Anna, Podbilewicz Benjamin, Shemesh Tom

2021-Jul-19

General General

Identifying Communities at Risk for COVID-19-Related Burden Across 500 U.S. Cities and within New York City: Unsupervised Learning of Co-Prevalence of Health Indicators.

In JMIR public health and surveillance

BACKGROUND : While it is well-known that older individuals with certain comorbidities are at highest risk for complications related to COVID-19 including hospitalization and death, we lack tools to identify communities at highest risk with fine-grained spatial resolution. Information collected at a county level obscures local risk and complex interactions between clinical comorbidities, the built environment, population factors, and other social determinants of health.

OBJECTIVE : To develop a COVID-19 Community Risk Score that summarizes complex disease prevalence together with age and sex, and to compare the score with different social determinants of health indicators and built environment measures derived from satellite images using deep learning.

METHODS : We developed a robust COVID-19 Community Risk Score (COVID-19 Risk Score) that uses unsupervised learning to summarize the complex disease co-occurrences (using data for 2019) for individual census tracts, with diseases selected on the basis of their association with risk for COVID-19 complications, such as death. We mapped the COVID-19 Risk Score to corresponding zip codes in New York City and associated the score with COVID-19-related death. We further modeled the variance of the COVID-19 Risk Score using satellite imagery and social determinants of health.

RESULTS : Using 2019 chronic disease data, the COVID-19 Risk Score describes 85% of variation in co-occurrence of 15 diseases and health behaviors that are risk factors for COVID-19 complications among ~28K census tract neighborhoods (median population size of tracts: 4,091). The COVID-19 Risk Score is associated with a 40% greater risk for COVID-19 related death across New York City (April and September 2020) for a 1 standard deviation (SD) change in the score (risk ratio for 1 SD change in COVID-19 Risk Score: 1.4, P < .001) at the zip code level. Satellite imagery coupled with social determinants of health explain nearly 90% of the variance in the COVID-19 Risk Score in the United States in census tracts (r2 = 0.87).

CONCLUSIONS : The COVID-19 Risk Score localizes risk at the census tract level and was able to predict COVID-19 related mortality in New York City. The built environment explained significant variations in the score, suggesting risk models could be enhanced with satellite imagery.
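
The paper summarizes disease co-prevalence with unsupervised learning; one standard way to collapse many correlated indicators into a single score is the first principal component of the standardized indicator matrix. The sketch below is an assumption for illustration (synthetic data, and PCA in place of whatever method the authors actually used), showing how one component can capture most of the shared variance:

```python
import numpy as np

def risk_score_pca(X):
    """First principal component as a community risk score (illustrative).

    X: (tracts x indicators) matrix of disease/behavior prevalences.
    Returns z-scored component scores and the fraction of variance explained.
    """
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each indicator
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)        # eigenvalues in ascending order
    pc1 = eigvec[:, -1]                         # direction of largest variance
    scores = Xc @ pc1
    explained = eigval[-1] / eigval.sum()
    return (scores - scores.mean()) / scores.std(), explained

# Synthetic tracts: 15 indicators driven by one shared "burden" factor plus noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = latent @ rng.normal(size=(1, 15)) + 0.3 * rng.normal(size=(200, 15))
scores, explained = risk_score_pca(X)
print(explained > 0.5)  # one component captures most co-prevalence variance
```

Because the score is z-scored, "risk per 1 SD change" statements like the one in the abstract map directly onto unit changes of this score.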

Deonarine Andrew, Lyons Genevieve, Lakhani Chirag

2021-Jul-15

General General

Mining of Opinions on COVID-19 Large-Scale Social Restrictions in Indonesia: Public Sentiment and Emotion Analysis on Online Media.

In Journal of medical Internet research ; h5-index 88.0

BACKGROUND : Among the successful measures to curb the spread of COVID-19 in large populations is the implementation of movement restriction orders. Globally, countries implementing strict movement control were more successful in containing the spread of the virus than countries with less stringent measures. Society's adherence to movement control orders has helped expedite flattening of the pandemic curve, as seen in countries such as China and Malaysia. At the same time, some countries face challenges with society's nonconformity to movement restriction orders due to claims of human rights violations as well as socio-cultural and economic issues. In Indonesia, society's adherence to the Large-Scale Social Restrictions (LSSR) order is also a challenge to achieve. Indonesia is regarded as among the worst Southeast Asian countries in terms of managing the spread of COVID-19, as evidenced by the significant number of daily confirmed cases and a total number of deaths amounting to more than 6% of active cases as of May 2020.

OBJECTIVE : To explore public sentiments and emotions toward the LSSR and identify issues, fear and reluctance to observe this restriction among the Indonesian public.

METHODS : This study adopts a sentiment analysis method with a supervised machine learning approach on COVID-19-related posts on selected media platforms: Twitter, Facebook, Instagram, and YouTube. The analysis was also done on COVID-19-related news contained in more than 500 online news platforms recognized by the Indonesian Press Council. Social media posts and news originating from Indonesian online media between March 31 and May 31, 2020 were analyzed. Emotion analysis on the Twitter platform was also performed to identify collective public emotions toward the LSSR.

RESULTS : The study found that positive sentiment surpassed the other sentiment categories, with 1,002,947 mentions (52%) of the total data collected via the search engine. Negative sentiment was recorded at 36% and neutral sentiment at 13%. The analysis of Twitter posts also showed that the majority of the public expressed the emotion of "trust" toward the LSSR.

CONCLUSIONS : Public sentiment toward the LSSR appeared to be positive despite doubts about government consistency in executing the LSSR. The emotion analysis also concluded that the majority of people believe in the LSSR as the best method to break the chain of COVID-19 transmission. Overall, Indonesians showed trust and expressed hope in the government's ability to manage this global health crisis and win against COVID-19.

Tri Sakti Andi Muhammad, Mohamad Emma, Azlan Arina Anis

2021-Jun-15

General General

Task-induced Pyramid and Attention GAN for Multimodal Brain Image Imputation and Classification in Alzheimer's Disease.

In IEEE journal of biomedical and health informatics

With the advance of medical imaging technologies, multimodal images such as magnetic resonance images (MRI) and positron emission tomography (PET) can capture subtle structural and functional changes of the brain, facilitating the diagnosis of brain diseases such as Alzheimer's disease (AD). In practice, multimodal images may be incomplete since PET is often missing due to high financial cost or limited availability. Most existing methods simply exclude subjects with missing data, which unfortunately reduces the sample size. In addition, how to extract and combine multimodal features remains challenging. To address these problems, we propose a deep learning framework that integrates a task-induced pyramid and attention generative adversarial network (TPA-GAN) with a pathwise transfer dense convolution network (PT-DCN) for imputation and classification of multimodal brain images. First, we propose a TPA-GAN that integrates pyramid convolution, an attention module, and a disease classification task into a GAN for generating the missing PET data from the corresponding MRI. Then, with the imputed multimodal brain images, we build a dense convolution network with pathwise transfer blocks to gradually learn and combine multimodal features for final disease classification. Experiments on the ADNI-1 and ADNI-2 datasets show that our proposed method achieves superior performance in image imputation and brain disease diagnosis compared to state-of-the-art methods.

Gao Xingyu, Shi Feng, Shen Dinggang, Liu Manhua

2021-Jul-19

General General

A Supervised Learning Algorithm for Multilayer Spiking Neural Networks Based on Temporal Coding Toward Energy-Efficient VLSI Processor Design.

In IEEE transactions on neural networks and learning systems

Spiking neural networks (SNNs) are brain-inspired mathematical models with the ability to process information in the form of spikes. SNNs are expected to provide not only new machine-learning algorithms but also energy-efficient computational models when implemented in very-large-scale integration (VLSI) circuits. In this article, we propose a novel supervised learning algorithm for SNNs based on temporal coding. A spiking neuron in this algorithm is designed to facilitate analog VLSI implementations with analog resistive memory, by which ultrahigh energy efficiency can be achieved. We also propose several techniques to improve the performance on recognition tasks and show that the classification accuracy of the proposed algorithm is as high as that of the state-of-the-art temporal coding SNN algorithms on the MNIST and Fashion-MNIST datasets. Finally, we discuss the robustness of the proposed SNNs against variations that arise from the device manufacturing process and are unavoidable in analog VLSI implementation. We also propose a technique to suppress the effects of variations in the manufacturing process on the recognition performance.

Sakemi Yusuke, Morino Kai, Morie Takashi, Aihara Kazuyuki

2021-Jul-19

General General

On sketch-based selections from scatterplots using KDE, compared to Mahalanobis and CNN brushing.

In IEEE computer graphics and applications

Fast and accurate brushing is crucial in visual data exploration, and sketch-based solutions are successful methods. In this paper, we detail a solution based on kernel density estimation (KDE), which computes a data subset selection in a scatterplot from a simple click-and-drag interaction. We explain how this technique relates to two alternative approaches, i.e., Mahalanobis brushing and CNN brushing. To study this relation, we conducted two user studies and present a quantitative three-fold comparison as well as additional details about the prevalence of the cases in which each technique succeeds or fails. With this, we also provide a comparison between empirical modeling and implicit modeling by deep learning in terms of accuracy, efficiency, generality, and interpretability.

Fan Chaoran, Hauser Helwig

2021-Jul-19

General General

Establishing a second-generation artificial intelligence-based system for improving diagnosis, treatment, and monitoring of patients with rare diseases.

In European journal of human genetics : EJHG

Patients with rare diseases are a major challenge for healthcare systems. These patients face three major obstacles: late diagnosis and misdiagnosis, lack of proper response to therapies, and absence of valid monitoring tools. We reviewed the relevant literature on first-generation artificial intelligence (AI) algorithms which were designed to improve the management of chronic diseases. The shortage of big data resources and the inability to provide patients with clinical value limit the use of these AI platforms by patients and physicians. In the present study, we reviewed the relevant literature on the obstacles encountered in the management of patients with rare diseases. Examples of currently available AI platforms are presented. The use of second-generation AI-based systems that are patient-tailored is presented. The system provides a means for early diagnosis and a method for improving the response to therapies based on clinically meaningful outcome parameters. The system may offer a patient-tailored monitoring tool that is based on parameters that are relevant to patients and caregivers and provides a clinically meaningful tool for follow-up. The system can provide an inclusive solution for patients with rare diseases and ensures adherence based on clinical responses. It has the potential advantage of not being dependent on large datasets and is a dynamic system that adapts to ongoing changes in patients' disease and response to therapy.

Hurvitz Noa, Azmanov Henny, Kesler Asa, Ilan Yaron

2021-Jul-19

General General

Deep Learning Analysis in Prediction of COVID-19 Infection Status Using Chest CT Scan Features.

In Advances in experimental medicine and biology

BACKGROUND AND AIMS : Non-contrast chest computed tomography (CT) scanning is one of the important tools for evaluating lung lesions. The aim of this study was to use a deep learning approach to classify patients with COVID-19 into critical and non-critical groups according to their CT features.

METHODS : This retrospective study was carried out from March to April 2020 in Baqiyatallah Hospital, Tehran, Iran. Of a total of 1078 patients with COVID-19 pneumonia who underwent chest CT, 169 were critical cases and 909 were non-critical. Deep learning neural networks were used to classify samples as critical or non-critical according to the chest CT results.

RESULTS : The best accuracy of prediction was seen with the presence of diffuse opacities and lesion distribution (both = 0.91, 95% CI: 0.83-0.99). The largest sensitivity was achieved using lesion distribution (0.74, 95% CI: 0.55-0.93), and the largest specificity was for the presence of diffuse opacities (0.95, 95% CI: 0.9-1). The total model showed an accuracy of 0.89 (95% CI: 0.79-0.99), and the corresponding sensitivity and specificity were 0.71 (95% CI: 0.51-0.91) and 0.93 (95% CI: 0.87-0.96), respectively.

CONCLUSIONS : The results showed that CT scan features can accurately classify and predict critical and non-critical COVID-19 cases.
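
The reported sensitivity of 0.71 and specificity of 0.93 follow directly from confusion-matrix counts. A minimal sketch with illustrative counts chosen to be consistent with the study's 169 critical and 909 non-critical cases (the individual cell counts are assumptions, not from the paper):

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity and specificity from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative counts: 120 + 49 = 169 critical, 845 + 64 = 909 non-critical
sens, spec = sens_spec(tp=120, fn=49, tn=845, fp=64)
print(round(sens, 2), round(spec, 2))  # 0.71 0.93
```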

Pourhoseingholi Asma, Vahedi Mohsen, Chaibakhsh Samira, Pourhoseingholi Mohamad Amin, Vahedian-Azimi Amir, Guest Paul C, Rahimi-Bashar Farshid, Sahebkar Amirhossein

2021

COVID-2019, Chest CT scan, Computed tomography, Deep learning, Prediction

General General

A Visual Analytics Approach for Structural Differences among Graphs via Deep Learning.

In IEEE computer graphics and applications

Representing and analyzing structural differences among graphs helps gain insight into difference-related patterns such as the dynamic evolution of graphs. Conventional solutions leverage representation learning techniques to encode structural information but lack an intuitive way of studying the structural semantics of graphs. In this paper, we propose a representation-and-analysis scheme for structural differences among graphs. We propose a deep learning based embedding technique (Delta2vec) to encode multiple graphs while preserving the semantics of structural differences. We design and implement a web-based visual analytics system to support comparative study of features learned from the embeddings. One distinctive feature of our approach is that it supports semantics-aware construction, quantification, and investigation of latent relations encoded in graphs. We validate the usability and effectiveness of our approach through case studies with three datasets.

Han Dongming, Pan Jiacheng, Xie Cong, Zhao Xiaodong, Luo Xiao-Nan, Chen Wei

2021-Jul-19

oncology Oncology

Prediction of COVID-19 deterioration in high-risk patients at diagnosis: an early warning score for advanced COVID-19 developed by machine learning.

In Infection

PURPOSE : While more advanced COVID-19 necessitates medical interventions and hospitalization, patients with mild COVID-19 do not require these. Identifying patients at risk of progressing to advanced COVID-19 might guide treatment decisions, particularly for better prioritizing patients in need of hospitalization.

METHODS : We developed a machine learning-based predictor for deriving a clinical score identifying patients with asymptomatic/mild COVID-19 at risk of progressing to advanced COVID-19. Clinical data from SARS-CoV-2 positive patients from the multicenter Lean European Open Survey on SARS-CoV-2 Infected Patients (LEOSS) were used for discovery (2020-03-16 to 2020-07-14) and validation (data from 2020-07-15 to 2021-02-16).

RESULTS : The LEOSS dataset contains 473 baseline patient parameters measured at the first patient contact. After training the predictor model on a training dataset comprising 1233 patients, 20 of the 473 parameters were selected for the predictor model. From the predictor model, we delineated a composite predictive score (SACOV-19, Score for the prediction of an Advanced stage of COVID-19) with eleven variables. In the validation cohort (n = 2264 patients), we observed good prediction performance with an area under the curve (AUC) of 0.73 ± 0.01. Besides temperature, age, body mass index and smoking habit, variables indicating pulmonary involvement (respiration rate, oxygen saturation, dyspnea), inflammation (CRP, LDH, lymphocyte counts), and acute kidney injury at diagnosis were identified. For better interpretability, the predictor was translated into a web interface.
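
The validation AUC of 0.73 reported above can be understood via the rank interpretation of the ROC curve: the AUC equals the probability that a randomly chosen patient who progressed receives a higher score than a randomly chosen patient who did not. A minimal numpy sketch of that statistic (toy scores, not SACOV-19 outputs):

```python
import numpy as np

def auc(scores, labels):
    """AUC as the Mann-Whitney rank statistic: the probability that a random
    positive case scores higher than a random negative case (ties count half)."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Toy composite scores: higher should indicate progression to advanced COVID-19
print(auc(scores=[0.9, 0.4, 0.8, 0.2], labels=[1, 1, 0, 0]))  # 0.75
```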

CONCLUSION : We present a machine learning-based predictor model and a clinical score for identifying patients at risk of developing advanced COVID-19.

Jakob Carolin E M, Mahajan Ujjwal Mukund, Oswald Marcus, Stecher Melanie, Schons Maximilian, Mayerle Julia, Rieg Siegbert, Pletz Mathias, Merle Uta, Wille Kai, Borgmann Stefan, Spinner Christoph D, Dolff Sebastian, Scherer Clemens, Pilgram Lisa, Rüthrich Maria, Hanses Frank, Hower Martin, Strauß Richard, Massberg Steffen, Er Ahmet Görkem, Jung Norma, Vehreschild Jörg Janne, Stubbe Hans, Tometten Lukas, König Rainer

2021-Jul-19

Advanced stage, COVID-19, Complicated stage, LEOSS, Machine learning, Predictive model

Pathology Pathology

Can AI-assisted microscope facilitate breast HER2 interpretation? A multi-institutional ring study.

In Virchows Archiv : an international journal of pathology

The level of human epidermal growth factor receptor-2 (HER2) protein and gene expression in breast cancer is an essential factor in judging the prognosis of breast cancer patients. Several investigations have shown high intraobserver and interobserver variability in the evaluation of HER2 staining by visual examination. In this study, we propose an artificial intelligence (AI)-assisted microscope to improve HER2 assessment accuracy and reliability. Our AI-assisted microscope consists of a conventional microscope equipped with a cell-level classification-based HER2 scoring algorithm and an augmented reality module that enables pathologists to obtain AI results in real time. We organized a three-round ring study of 50 cases of infiltrating duct carcinoma not otherwise specified (NOS) without neoadjuvant treatment, and recruited 33 pathologists from 6 hospitals. In the first ring study (RS1), the pathologists read 50 HER2 whole-slide images (WSIs) through an online system. After a 2-week washout period, they read the HER2 slides using a conventional microscope in RS2. After another 2-week washout period, the pathologists used our AI-assisted microscope for assisted interpretation in RS3. The consistency and accuracy of HER2 assessment with the AI-assisted microscope were significantly improved (p < 0.001) over those obtained using a conventional microscope and online WSI reading. Specifically, our AI-assisted microscope improved the precision of immunohistochemistry (IHC) 3+ and 2+ scoring while ensuring the recall of fluorescence in situ hybridization (FISH)-positive results among IHC 2+ cases. Also, the average acceptance rate of AI results across all pathologists was 0.90, demonstrating that the pathologists agreed with most AI scoring results.

Yue Meng, Zhang Jun, Wang Xinran, Yan Kezhou, Cai Lijing, Tian Kuan, Niu Shuyao, Han Xiao, Yu Yongqiang, Huang Junzhou, Han Dandan, Yao Jianhua, Liu Yueping

2021-Jul-19

Artificial intelligence–assisted microscope, Breast cancer, HER2

General General

Towards automatic diagnosis of rheumatic heart disease on echocardiographic exams through video-based deep learning.

In Journal of the American Medical Informatics Association : JAMIA

OBJECTIVE : Rheumatic heart disease (RHD) affects an estimated 39 million people worldwide and is the most common acquired heart disease in children and young adults. Echocardiograms are the gold standard for diagnosis of RHD, but there is a shortage of skilled experts to allow widespread screening for early detection and prevention of disease progression. We propose an automated RHD diagnosis system that can help bridge this gap.

MATERIALS AND METHODS : Experiments were conducted on a dataset with 11 646 echocardiography videos from 912 exams, obtained during screenings in underdeveloped areas of Brazil and Uganda. We address the challenges of RHD identification with a 3D convolutional neural network (C3D), comparing its performance with a 2D convolutional neural network (VGG16) that is commonly used in the echocardiogram literature. We also propose a supervised aggregation technique to combine video predictions into a single exam diagnosis.

RESULTS : The proposed approach obtained an accuracy of 72.77% for exam diagnosis. The results for the C3D were significantly better than those obtained by the VGG16 network for videos, showing the importance of considering temporal information during diagnosis. The proposed aggregation model showed significantly better accuracy than the majority voting strategy and also appears to be capable of capturing underlying biases in the neural network output distribution, balancing them for a more correct diagnosis.

CONCLUSION : Automatic diagnosis of echo-detected RHD is feasible and, with further research, has the potential to reduce the workload of experts, enabling the implementation of more widespread screening programs worldwide.

Martins João Francisco B S, Nascimento Erickson R, Nascimento Bruno R, Sable Craig A, Beaton Andrea Z, Ribeiro Antônio L, Meira Wagner, Pappa Gisele L

2021-Jul-19

deep learning, echocardiography, low-cost imaging, meta-learning, rheumatic heart disease, screening

General General

Machine learning for initial insulin estimation in hospitalized patients.

In Journal of the American Medical Informatics Association : JAMIA

OBJECTIVE : The study sought to determine whether machine learning can predict initial inpatient total daily dose (TDD) of insulin from electronic health records more accurately than existing guideline-based dosing recommendations.

MATERIALS AND METHODS : Using electronic health records from a tertiary academic center between 2008 and 2020 of 16 848 inpatients receiving subcutaneous insulin who achieved target blood glucose control of 100-180 mg/dL on a calendar day, we trained an ensemble machine learning algorithm consisting of regularized regression, random forest, and gradient boosted tree models for 2-stage TDD prediction. We evaluated the ability to predict patients requiring more than 6 units TDD and their point-value TDDs to achieve target glucose control.

RESULTS : The method achieves an area under the receiver-operating characteristic curve of 0.85 (95% confidence interval [CI], 0.84-0.87) and area under the precision-recall curve of 0.65 (95% CI, 0.64-0.67) for classifying patients who require more than 6 units TDD. For patients requiring more than 6 units TDD, the mean absolute percent error in dose prediction based on standard clinical calculators using patient weight is in the range of 136%-329%, while the regression model based on weight improves to 60% (95% CI, 57%-63%), and the full ensemble model further improves to 51% (95% CI, 48%-54%).
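
The mean absolute percent error used above to compare dose predictors is straightforward to compute. A minimal sketch with made-up total-daily-dose values (the numbers are illustrative, not from the study):

```python
import numpy as np

def mape(predicted, actual):
    """Mean absolute percent error of dose predictions."""
    p, a = np.asarray(predicted, float), np.asarray(actual, float)
    return float(np.mean(np.abs(p - a) / a) * 100)

# Toy total-daily-dose values in units of insulin (illustrative only)
actual    = [10, 20, 40, 80]
predicted = [12, 15, 50, 60]
# per-patient errors: 20%, 25%, 25%, 25% -> mean 23.75%
print(round(mape(predicted, actual), 2))  # 23.75
```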

DISCUSSION : Owing to the narrow therapeutic window and wide individual variability, insulin dosing requires adaptive and predictive approaches that can be supported through data-driven analytic tools.

CONCLUSIONS : Machine learning approaches based on readily available electronic medical records can discriminate which inpatients will require more than 6 units TDD and estimate individual doses more accurately than standard guidelines and practices.

Nguyen Minh, Jankovic Ivana, Kalesinskas Laurynas, Baiocchi Michael, Chen Jonathan H

2021-Jul-19

clinical decision support, diabetes mellitus, insulin, machine learning, medical informatics

General General

AVPIden: a new scheme for identification and functional prediction of antiviral peptides based on machine learning approaches.

In Briefings in bioinformatics

Antiviral peptides (AVPs) are a kind of antimicrobial peptide (AMP) with the potential ability to fight viral infection. Machine learning-based prediction with a computational biology approach can facilitate the development of novel therapeutic agents. In this study, we propose a two-stage classification scheme, named AVPIden, for predicting AVPs and their functional activities against different viruses. The first stage distinguishes AVPs from a broad-spectrum peptide collection, including not only regular peptides (non-AMP) but also AMPs without antiviral functions (non-AVP). The second stage characterizes one or more virus families or species that the AVP targets. Imbalanced learning is utilized to improve the performance of prediction. AVPIden uses multiple descriptors to precisely describe peptide properties and adopts explainable machine learning strategies based on the Shapley value to examine how the descriptors impact antiviral activities. Finally, the evaluation performance of the proposed model suggests its ability to predict antiviral activities and their potential functions against six virus families (Coronaviridae, Retroviridae, Herpesviridae, Paramyxoviridae, Orthomyxoviridae, Flaviviridae) and eight virus species (FIV, HCV, HIV, HPIV3, HSV1, INFVA, RSV, SARS-CoV). AVPIden provides an option for reinforcing the development of AVPs with computer-aided methods and has been deployed at http://awi.cuhk.edu.cn/AVPIden/.

Pang Yuxuan, Yao Lantian, Jhong Jhih-Hua, Wang Zhuo, Lee Tzong-Yi

2021-Jul-19

antimicrobial peptide, antiviral peptide, imbalanced learning, machine learning

General General

Automatization and self-maintenance of the O-GlcNAcome catalog: a smart scientific database.

In Database : the journal of biological databases and curation

Post-translational modifications (PTMs) are ubiquitous and essential for protein function and signaling, motivating the need for sustainable and open models of web databases. Highly conserved O-GlcNAcylation is a case example of one of the most recently discovered PTMs, investigated by a growing community. Historically, details about O-GlcNAcylated proteins and sites were dispersed across the literature and in non-O-GlcNAc-focused, rapidly outdated, or now defunct web databases. In a first effort to fill the gap, we recently published a human O-GlcNAcome catalog with a basic web interface. Based on the enthusiasm generated by this first resource, we extended our O-GlcNAcome catalog to include data from 42 distinct organisms and released the O-GlcNAc Database v1.2. In this version, more than 14 500 O-GlcNAcylated proteins and 11 000 O-GlcNAcylation sites are referenced from the curation of 2200 publications. In this article, we also present the extensive features of the O-GlcNAc Database, including the user-friendly interface, back-end, and client-server interactions. We particularly emphasize our workflow, involving a mostly automated and self-maintained database, including machine learning approaches for text mining. We hope that this software model will be useful beyond the O-GlcNAc community for setting up new smart scientific online databases in a short period of time. Indeed, this database system can be administered with little to no programming skills and is meant to be an example of a useful, sustainable, and cost-efficient resource that relies exclusively on free open-source software elements (www.oglcnac.mcw.edu).

Malard Florian, Wulff-Fuentes Eugenia, Berendt Rex R, Didier Guillaume, Olivier-Van Stichelen Stephanie

2021-Jul-19

General

A Molecular Simulation Study of Silica/Polysulfone Mixed Matrix Membrane for Mixed Gas Separation.

In Polymers

Polysulfone-based mixed matrix membranes (MMMs) incorporating silica nanoparticles are a new generation of materials under ongoing research and development for gas separation. However, the attributes of a better-performing MMM cannot be precisely studied under experimental conditions, so an atomistic-scale study is required to elucidate the separation performance of silica/polysulfone MMMs. As most research and empirical models for gas transport properties have been limited to pure gases, a computational framework for molecular simulation is required to study mixed gas transport in silica/polysulfone MMMs and thereby reflect real membrane separation. In this work, Monte Carlo (MC) and molecular dynamics (MD) simulations were employed to study the solubility and diffusivity of CO2/CH4 at varying gas concentrations (i.e., 30% CO2/CH4, 50% CO2/CH4, and 70% CO2/CH4) and silica contents (i.e., 15-30 wt.%). The accuracy of the simulated structures was validated against published literature, followed by study of the gas transport properties at 308.15 K and 1 atm. The simulations showed an increase in free volume with increasing weight percentage of silica. Pure gases consistently exhibited higher transport properties than mixed gas conditions, and the mixed gases showed competitive transport behavior that becomes more apparent as the CO2 fraction increases. In this context, permeation increased for mixed gases with increasing gas concentration (i.e., 70% CO2/CH4 > 50% CO2/CH4 > 30% CO2/CH4). The diffusivity, solubility, and permeability of the mixed gases increased consistently up to 25 wt.% silica, followed by a decrease at 30 wt.%. An empirical model based on a parallel resistance approach was developed by incorporating mathematical formulations for solubility and permeability. The model results were compared with the simulation results to quantify the effect of mixed gas transport, showing percentage errors of 18% and 15% for permeability and solubility, respectively, relative to the simulation data. This study provides a basis for future understanding of MMMs using molecular simulation and modeling techniques under mixed gas conditions that reflect real membrane separation.

Asif Khadija, Lock Serene Sow Mun, Taqvi Syed Ali Ammar, Jusoh Norwahyu, Yiin Chung Loong, Chin Bridgid Lai Fui, Loy Adrian Chun Minh

2021-Jul-01

CO2/CH4 gas transport, empirical modelling, mixed gas, mixed matrix membrane, molecular simulation, polysulfone, silica

General

Design, Development, and Evaluation of a Telemedicine Platform for Patients With Sleep Apnea (Ognomy): Design Science Research Approach.

In JMIR formative research

BACKGROUND : With an aging population and the escalating cost of care, telemedicine has become a societal imperative. Telemedicine alternatives are especially relevant to patients seeking care for sleep apnea, with its prevalence approaching one billion cases worldwide. Increasing awareness has led to a surge in demand for sleep apnea care; however, there is a shortage of the resources and expertise necessary to cater to the rising demand.

OBJECTIVE : The aim of this study is to design, develop, and evaluate a telemedicine platform, called Ognomy, for the consultation, diagnosis, and treatment of patients with sleep apnea.

METHODS : Using the design science research methodology, we developed a telemedicine platform for patients with sleep apnea. To explore the problem, in the analysis phase, we conducted two brainstorming workshops and structured interviews with 6 subject matter experts to gather requirements. Following that, we conducted three design and architectural review sessions to define and evaluate the system architecture. Subsequently, we conducted 14 formative usability assessments to improve the user interface of the system. In addition, 3 trained test engineers performed end-to-end system testing to comprehensively evaluate the platform.

RESULTS : Patient registration and data collection, physician appointments, video consultation, and patient progress tracking emerged as critical functional requirements. A telemedicine platform comprising four artifacts (a mobile app for patients, a web app for providers, a dashboard for reporting, and an artificial intelligence-based chatbot for customer onboarding and support) was developed to meet these requirements. Design reviews emphasized the need for highly cohesive but loosely coupled interaction among the platform's components, which was achieved through a layered modular architecture using third-party application programming interfaces. Meanwhile, critical findings from the formative usability assessments focused on the need for a more straightforward onboarding process for patients, better status indicators during patient registration, and reorganization of the appointment calendar. Feedback from the design reviews and usability assessments was translated into technical improvements and design enhancements that were implemented in subsequent iterations.

CONCLUSIONS : Sleep apnea is an underdiagnosed and undertreated condition. However, with increasing awareness, the demand for quality sleep apnea care is likely to surge, and creative alternatives are needed. The results of this study demonstrate the successful application of a framework using a design science research paradigm to design, develop, and evaluate a telemedicine platform for patients with sleep apnea and their providers.

Mulgund Pavankumar, Sharman Raj, Rifkin Daniel, Marrazzo Sam

2021-Jul-19

design science research, mHealth, mobile health, mobile phone, sleep apnea, sleep apnea care, telemedicine, telemedicine platform, web application

General

Portable Device Improves the Detection of Atrial Fibrillation After Ablation.

In International heart journal

Asymptomatic recurrences of atrial fibrillation (AF) have been found to be common after ablation. A randomized controlled trial of AF screening using a handheld single-lead ECG monitor (BigThumb®) or a traditional follow-up strategy was conducted in patients with non-valvular AF after catheter ablation. Consecutive patients were randomized to either the BigThumb Group (BT Group) or the Traditional Follow-up Group (TF Group). The ECGs collected via BigThumb were analyzed using the automated AF detection algorithm, an artificial intelligence (AI) algorithm, and cardiologists' manual review. Subsequent changes in patients' adherence to oral anticoagulation were also recorded. In this study, we examined 218 patients (109 in each group). After a follow-up of 345.4 ± 60.2 days, the AF-free survival rate was 64.2% in the BT Group and 78.9% in the TF Group (P = 0.0163), with better adherence to oral anticoagulation in the BT Group (P = 0.0052). Participants in the BT Group recorded 26,133 ECGs, of which 3299 (12.6%) were diagnosed as AF by cardiologists' manual review. The sensitivity and specificity of the AI algorithm were 94.4% and 98.5%, respectively, significantly higher than those of the automated AF detection algorithm (90.7% and 96.2%). Our findings indicate that follow-up after AF ablation using BigThumb leads to more frequent detection of AF recurrence and better adherence to oral anticoagulation. The AI algorithm improves the accuracy of ECG diagnosis and has the potential to reduce manual review.

Huang Songqun, Zhao Teng, Liu Chao, Qin Aihong, Dong Shaohua, Yuan Binhang, Xing Wenhui, Guo Zhifu, Huang Xinmiao, Cha Yongmei, Cao Jiang

2021-Jul-17

Artificial intelligence algorithm, BigThumb, Catheter ablation, Follow-up strategy, Rhythm monitoring
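Sensitivity and specificity figures like those reported above follow directly from confusion-matrix counts; a minimal helper, with invented counts for illustration (not the study's data):

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Invented counts for illustration only.
sens, spec = sensitivity_specificity(tp=94, fn=6, tn=985, fp=15)
print(sens, spec)  # → 0.94 0.985
```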

Oncology

Prediction of COVID-19 deterioration in high-risk patients at diagnosis: an early warning score for advanced COVID-19 developed by machine learning.

In Infection

PURPOSE : While more advanced COVID-19 necessitates medical interventions and hospitalization, patients with mild COVID-19 do not require this. Identifying patients at risk of progressing to advanced COVID-19 might guide treatment decisions, particularly for better prioritizing patients in need for hospitalization.

METHODS : We developed a machine learning-based predictor for deriving a clinical score identifying patients with asymptomatic/mild COVID-19 at risk of progressing to advanced COVID-19. Clinical data from SARS-CoV-2 positive patients from the multicenter Lean European Open Survey on SARS-CoV-2 Infected Patients (LEOSS) were used for discovery (2020-03-16 to 2020-07-14) and validation (data from 2020-07-15 to 2021-02-16).

RESULTS : The LEOSS dataset contains 473 baseline patient parameters measured at the first patient contact. After training the predictor model on a training dataset comprising 1233 patients, 20 of the 473 parameters were selected for the predictor model. From the predictor model, we delineated a composite predictive score (SACOV-19, Score for the prediction of an Advanced stage of COVID-19) with eleven variables. In the validation cohort (n = 2264 patients), we observed good prediction performance with an area under the curve (AUC) of 0.73 ± 0.01. Besides temperature, age, body mass index and smoking habit, variables indicating pulmonary involvement (respiration rate, oxygen saturation, dyspnea), inflammation (CRP, LDH, lymphocyte counts), and acute kidney injury at diagnosis were identified. For better interpretability, the predictor was translated into a web interface.

CONCLUSION : We present a machine learning-based predictor model and a clinical score for identifying patients at risk of developing advanced COVID-19.

Jakob Carolin E M, Mahajan Ujjwal Mukund, Oswald Marcus, Stecher Melanie, Schons Maximilian, Mayerle Julia, Rieg Siegbert, Pletz Mathias, Merle Uta, Wille Kai, Borgmann Stefan, Spinner Christoph D, Dolff Sebastian, Scherer Clemens, Pilgram Lisa, Rüthrich Maria, Hanses Frank, Hower Martin, Strauß Richard, Massberg Steffen, Er Ahmet Görkem, Jung Norma, Vehreschild Jörg Janne, Stubbe Hans, Tometten Lukas, König Rainer

2021-Jul-19

Advanced stage, COVID-19, Complicated stage, LEOSS, Machine learning, Predictive model
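AUC values like the 0.73 reported above have a direct probabilistic reading: the chance that a randomly chosen case receives a higher score than a randomly chosen control. A minimal O(n²) pure-Python computation of this rank-sum form, with illustrative scores rather than LEOSS data:

```python
def roc_auc(labels, scores):
    # Rank-sum (Mann-Whitney U) formulation of AUC, counting ties as 0.5.
    wins, pairs = 0.0, 0
    for lab_pos, s_pos in zip(labels, scores):
        if lab_pos != 1:
            continue
        for lab_neg, s_neg in zip(labels, scores):
            if lab_neg != 0:
                continue
            pairs += 1
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / pairs

# Illustrative risk scores: 1 = progressed to advanced COVID-19, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
print(round(roc_auc(labels, scores), 3))  # → 0.917
```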

General

Protective Behavioral Strategies and Alcohol Use While Pregaming: The Moderating Role of Depression and Anxiety Symptoms.

In Substance use & misuse ; h5-index 30.0

The study evaluated the moderating role of anxiety and depression symptoms on the association between subscales on the Protective Behavioral Strategies for Pregaming (PBSP) scale (safety and familiarity, setting drink limits, pacing drinking, and minimizing intoxication) and alcohol consumption during pregaming. Methods: Participants were 359 traditional-age undergraduate college students (M = 20, SD = 1.37; 61.7% female; 61.2% White) who reported pregaming in the past year. All participants completed measures through an online survey which evaluated PBSP, depression and anxiety symptoms, and alcohol use during pregame events in the past month. Results: Among students with high depression symptoms, more frequent use of PBSP to minimize intoxication was not associated with alcohol consumption levels, whereas among those with low depression symptoms, higher use of PBSP to minimize intoxication was associated with higher alcohol consumption. Among those with high anxiety symptoms, more frequent use of PBSP to minimize intoxication was associated with lower alcohol consumption at pregaming events, whereas among those with low anxiety symptoms, use of this PBSP was associated with higher alcohol consumption. More frequent use of PBSP related to safety and familiarity among those with high anxiety symptoms was unrelated to alcohol consumption during pregaming, whereas among those low in anxiety symptoms, more frequent use of this PBSP was associated with lower alcohol consumption. Conclusion: The findings begin to inform clinical care and intervention techniques aimed at reducing harm associated with risky drinking practices among a vulnerable subset of college students.

Hummer Justin F, Davis Jordan P, Christie Nina, Pedersen Eric R

2021-Jul-19

Pregaming, alcohol use, college students, mental health, protective behavioral strategies

Ophthalmology

Machine learning in optical coherence tomography angiography.

In Experimental biology and medicine (Maywood, N.J.)

Optical coherence tomography angiography (OCTA) offers a noninvasive label-free solution for imaging retinal vasculatures at the capillary level resolution. In principle, improved resolution implies a better chance to reveal subtle microvascular distortions associated with eye diseases that are asymptomatic in early stages. However, massive screening requires experienced clinicians to manually examine retinal images, which may result in human error and hinder objective screening. Recently, quantitative OCTA features have been developed to standardize and document retinal vascular changes. The feasibility of using quantitative OCTA features for machine learning classification of different retinopathies has been demonstrated. Deep learning-based applications have also been explored for automatic OCTA image analysis and disease classification. In this article, we summarize recent developments of quantitative OCTA features, machine learning image analysis, and classification.

Le David, Son Taeyoon, Yao Xincheng

2021-Jul-19

Retina, artificial intelligence, convolutional neural network, deep learning, machine learning, optical coherence tomography angiography, retinopathy
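As a concrete instance of the quantitative OCTA features discussed above, vessel (perfusion) density is simply the vessel-pixel fraction of a binarized angiogram. The 0/1 grid below is a toy map, not real OCTA data:

```python
def vessel_density(binary_map):
    # Fraction of pixels flagged as vessel (1) in a binarized angiogram.
    pixels = [p for row in binary_map for p in row]
    return sum(pixels) / len(pixels)

# Toy 3x4 binarized angiogram.
angiogram = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
print(vessel_density(angiogram))  # → 0.5
```

Features of this kind, computed per region of the retina, are what get standardized and fed into the machine learning classifiers the review surveys.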

General

Flexible Artificial Synapses with a Biocompatible Maltose-Ascorbic Acid Electrolyte Gate for Neuromorphic Computing.

In ACS applied materials & interfaces ; h5-index 147.0

As hardware implementation is widely regarded as an important step toward realizing brain-like computers and artificial intelligence systems, the development of artificial synaptic electronics that can simulate biological synaptic functions is an emerging research field. Among the various types of artificial synapses, synaptic transistors using an electrolyte as the gate electrode have been implemented, as the high capacitance of the electrolyte increases the driving current and lowers operating voltages. Here, transistors using maltose-ascorbic acid as the proton-conducting electrolyte are proposed. A novel electrolyte composed of maltose and ascorbic acid, both of which are biocompatible, enables the migration of protons. This allows the channel conductance of the transistors to be modulated with the gate input pulse voltage, and fundamental synaptic functions including excitatory postsynaptic current, paired-pulse facilitation, long-term potentiation, and long-term depression can be successfully emulated. Furthermore, the maltose-ascorbic acid electrolyte (MAE)-gated synaptic transistors exhibit high mechanical endurance, with near-linear conductivity modulation and repeatability after 1000 bending cycles under a curvature radius of 5 mm. Benefiting from its excellent biodegradability and biocompatibility, the proposed MAE has potential applications in environmentally friendly, economical, and high-performance neuromorphic electronics, which can be further applied to dermal electronics and implantable electronics in the future.

Qin Wei, Kang Byung Ha, Kim Hyun Jae

2021-Jul-19

artificial synapse, ascorbic acid, indium tin oxide, maltose, synaptic transistor

General

Time-Frequency Decomposition of Scalp Electroencephalograms Improves Deep Learning-Based Epilepsy Diagnosis.

In International journal of neural systems

Epilepsy diagnosis based on Interictal Epileptiform Discharges (IEDs) in scalp electroencephalograms (EEGs) is laborious and often subjective. Therefore, it is necessary to build an effective IED detector and an automatic method to classify IED-free versus IED EEGs. In this study, we evaluate features that may provide reliable IED detection and EEG classification. Specifically, we investigate the IED detector based on convolutional neural network (ConvNet) with different input features (temporal, spectral, and wavelet features). We explore different ConvNet architectures and types, including 1D (one-dimensional) ConvNet, 2D (two-dimensional) ConvNet, and noise injection at various layers. We evaluate the EEG classification performance on five independent datasets. The 1D ConvNet with preprocessed full-frequency EEG signal and frequency bands (delta, theta, alpha, beta) with Gaussian additive noise at the output layer achieved the best IED detection results with a false detection rate of 0.23/min at 90% sensitivity. The EEG classification system obtained a mean EEG classification Leave-One-Institution-Out (LOIO) cross-validation (CV) balanced accuracy (BAC) of 78.1% (area under the curve (AUC) of 0.839) and Leave-One-Subject-Out (LOSO) CV BAC of 79.5% (AUC of 0.856). Since the proposed classification system only takes a few seconds to analyze a 30-min routine EEG, it may help in reducing the human effort required for epilepsy diagnosis.

Thangavel Prasanth, Thomas John, Peh Wei Yan, Jing Jin, Yuvaraj Rajamanickam, Cash Sydney S, Chaudhari Rima, Karia Sagar, Rathakrishnan Rahul, Saini Vinay, Shah Nilesh, Srivastava Rohit, Tan Yee-Leng, Westover Brandon, Dauwels Justin

2021-Jul-16

Deep learning, EEG classification, convolutional neural networks, interictal epileptiform discharges, multiple features, noise injection
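The Gaussian additive noise injection that the study found helpful at the output layer can be illustrated framework-free. This stand-in perturbs a layer's activations during training only and leaves inference untouched; a real model would implement it inside a deep learning framework:

```python
import random

def noisy_output(activations, sigma=0.1, training=True, rng=None):
    # Training: add zero-mean Gaussian noise to each activation,
    # acting as a regularizer. Inference: pass through unchanged.
    if not training:
        return list(activations)
    rng = rng or random.Random(0)
    return [a + rng.gauss(0.0, sigma) for a in activations]

acts = [0.5, -0.2, 1.0]
print(noisy_output(acts, training=False))  # → [0.5, -0.2, 1.0]
```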

Surgery

Does It Measure Up?

In World journal for pediatric & congenital heart surgery

Measuring outcomes in pediatric cardiac care has been one of the more widespread, and at the same time controversial and often polarizing, quality improvement initiatives undertaken in the medical field. Risk models, such as the Society of Thoracic Surgeons Congenital Heart Surgery Risk Model, have been developed to account for comorbidities while predicting the expected mortality for a given surgical encounter. In this issue of the journal, Bertsimas and colleagues report on machine learning approaches to predict adverse outcomes in congenital heart surgery using the European Congenital Heart Surgeons Association's congenital database. A head-to-head comparison of machine learning models and the currently available risk models utilizing the same data set are required to better understand the strengths and weaknesses of each of these approaches. Such a focused analysis will shed light on future approaches for risk modeling, which will undoubtedly continue to benefit from the guidance provided by expert clinical intuition.

Kumar S Ram

2021-Jul

General

Machine Learning to Predict Quasicrystals from Chemical Compositions.

In Advanced materials (Deerfield Beach, Fla.)

Quasicrystals have emerged as the third class of solid-state materials, distinguished from periodic crystals and amorphous solids: they have long-range order without periodicity, exhibiting rotational symmetries that are, in most cases, disallowed for periodic crystals. To date, more than one hundred stable quasicrystals have been reported, leading to the discovery of many new and exciting phenomena. However, the pace of discovery of new quasicrystals has slowed in recent years, largely owing to the lack of clear guiding principles for the synthesis of new quasicrystals. Here, it is shown that the discovery of new quasicrystals can be accelerated with a simple machine-learning workflow. Using a list of the chemical compositions of known stable quasicrystals, approximant crystals, and ordinary crystals, a prediction model is trained to solve the three-class classification task, and its predictive ability is evaluated against the observed phase diagrams of ternary aluminum systems. The validation experiments strongly support the superior predictive power of machine learning, with the overall accuracy of the phase prediction task reaching ≈0.728. Furthermore, by analyzing the input-output relationships black-boxed in the model, nontrivial, human-interpretable empirical equations describing conditions necessary for stable quasicrystal formation are identified.

Liu Chang, Fujita Erina, Katsura Yukari, Inada Yuki, Ishikawa Asuka, Tamura Ryuji, Kimura Kaoru, Yoshida Ryo

2021-Jul-19

approximant crystals, high-throughput screening, machine learning, materials informatics, quasicrystals
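The three-class task above can be caricatured as nearest-centroid classification over composition vectors. The centroids and compositions here are invented; the paper's actual model and composition descriptors are more elaborate:

```python
def classify(composition, centroids):
    # Assign the label whose centroid is closest in squared distance.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(composition, centroids[label]))

# Invented ternary composition centroids (element fractions summing to 1).
centroids = {
    "quasicrystal": [0.60, 0.25, 0.15],
    "approximant":  [0.55, 0.30, 0.15],
    "ordinary":     [0.33, 0.33, 0.34],
}
print(classify([0.62, 0.24, 0.14], centroids))  # → quasicrystal
```

The appeal of a composition-only formulation is exactly what the abstract exploits: a candidate can be screened before any structure is known.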

General

Predicting plasmid persistence in microbial communities by coarse-grained modeling.

In BioEssays : news and reviews in molecular, cellular and developmental biology

Plasmids are a major type of mobile genetic element (MGE) that mediates horizontal gene transfer. The stable maintenance of plasmids plays a critical role in the function and survival of microbial populations. However, predicting and controlling plasmid persistence and abundance in complex microbial communities remain challenging. Computationally, this challenge arises from the combinatorial explosion associated with the conventional modeling framework. Recently, a plasmid-centric framework (PCF) has been developed to overcome this computational bottleneck. This framework enables the derivation of a simple metric, the persistence potential, to predict plasmid persistence and abundance. Here, we discuss how PCF can be extended to account for plasmid interactions. We also discuss how such model-guided predictions of plasmid fates can benefit from the development of new experimental tools and data-driven computational methods.

Wang Teng, Weiss Andrea, Ha Yuanchi, You Lingchong

2021-Jul-18

coarse-grained model, horizontal gene transfer, machine learning, microbial communities, mobile genetic elements, next generation sequencing, plasmid persistence

General

A Supervised Image Registration Approach for Late Gadolinium Enhanced MRI and Cine Cardiac MRI Using Convolutional Neural Networks.

In Medical image understanding and analysis : 24th Annual Conference, MIUA 2020, Oxford, UK, July 15-17, 2020, Proceedings. Medical Image Understanding and Analysis (Conference) (24th : 2020 : Online)

Late gadolinium enhanced (LGE) cardiac magnetic resonance (CMR) imaging is the current gold standard for assessing myocardium viability for patients diagnosed with myocardial infarction, myocarditis or cardiomyopathy. This imaging method enables the identification and quantification of myocardial tissue regions that appear hyper-enhanced. However, the delineation of the myocardium is hampered by the reduced contrast between the myocardium and the left ventricle (LV) blood-pool due to the gadolinium-based contrast agent. The balanced-Steady State Free Precession (bSSFP) cine CMR imaging provides high resolution images with superior contrast between the myocardium and the LV blood-pool. Hence, the registration of the LGE CMR images and the bSSFP cine CMR images is a vital step for accurate localization and quantification of the compromised myocardial tissue. Here, we propose a Spatial Transformer Network (STN) inspired convolutional neural network (CNN) architecture to perform supervised registration of bSSFP cine CMR and LGE CMR images. We evaluate our proposed method on the 2019 Multi-Sequence Cardiac Magnetic Resonance Segmentation Challenge (MS-CMRSeg) dataset and use several evaluation metrics, including the center-to-center LV and right ventricle (RV) blood-pool distance, and the contour-to-contour blood-pool and myocardium distance between the LGE and bSSFP CMR images. Specifically, we showed that our registration method reduced the bSSFP to LGE LV blood-pool center distance from 3.28mm before registration to 2.27mm post registration and RV blood-pool center distance from 4.35mm before registration to 2.52mm post registration. We also show that the average surface distance (ASD) between bSSFP and LGE is reduced from 2.53mm to 2.09mm, 1.78mm to 1.40mm and 2.42mm to 1.73mm for LV blood-pool, LV myocardium and RV blood-pool, respectively.

Upendra Roshan Reddy, Simon Richard, Linte Cristian A

2020-Jul

Cine cardiac MRI, Deep learning, Image registration, Late gadolinium enhanced MRI

General

Detection of Junctional Ectopic Tachycardia by Central Venous Pressure.

In Artificial intelligence in medicine. Conference on Artificial Intelligence in Medicine (2005- )

Central venous pressure (CVP) is the blood pressure in the venae cavae, near the right atrium of the heart. This signal waveform is commonly collected in clinical settings, and yet there has been limited discussion of using this data for detecting arrhythmia and other cardiac events. In this paper, we develop a signal processing and feature engineering pipeline for CVP waveform analysis. Through a case study on pediatric junctional ectopic tachycardia (JET), we show that our extracted CVP features reliably detect JET with comparable results to the more commonly used electrocardiogram (ECG) features. This machine learning pipeline can thus improve the clinical diagnosis and ICU monitoring of arrhythmia. It also corroborates and complements the ECG-based diagnosis, especially when the ECG measurements are unavailable or corrupted.

Tan Xin, Dai Yanwan, Humayun Ahmed Imtiaz, Chen Haoze, Allen Genevera I, Jain Parag N

2021-Jun

Automatic Arrhythmia Detection, Central Venous Pressure, Junctional Ectopic Tachycardia, Physiological Signal Feature Extraction
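A feature-engineering step for a CVP waveform might look like the sketch below: summary statistics plus a crude peak count over a window, the kind of hand-crafted features that can feed an arrhythmia classifier. The waveform and feature choices are illustrative, not the paper's pipeline:

```python
import math

def cvp_features(signal):
    # Summary statistics plus a naive local-maximum peak count.
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    peaks = sum(
        1 for i in range(1, n - 1)
        if signal[i - 1] < signal[i] > signal[i + 1]
    )
    return {"mean": mean, "std": math.sqrt(var), "peaks": peaks}

# Toy periodic sequence standing in for a CVP trace (mmHg samples).
wave = [5, 8, 6, 9, 7, 5, 8, 6, 9, 7]
print(cvp_features(wave)["peaks"])  # → 4
```

A peak count per unit time is effectively a rate estimate, which hints at why CVP features can corroborate ECG-based detection of tachycardia.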

General

Double-jeopardy: scRNA-seq doublet/multiplet detection using multi-omic profiling.

In Cell reports methods

The computational detection and exclusion of cellular doublets and/or multiplets is a cornerstone for identifying true biological signals in single-cell RNA sequencing (scRNA-seq) data. Current methods do not sensitively identify both heterotypic and homotypic doublets and/or multiplets. Here, we describe a machine learning approach for doublet/multiplet detection that utilizes VDJ-seq and/or CITE-seq data to predict their presence based on transcriptional features associated with identified hybrid droplets. This approach highlights the utility of leveraging multi-omic single-cell information for the generation of high-quality datasets. Our method has high sensitivity and specificity in inflammatory-cell-dominant scRNA-seq samples, thus presenting a powerful approach to ensuring high-quality scRNA-seq data.

Sun Bo, Bugarin-Estrada Emmanuel, Overend Lauren Elizabeth, Walker Catherine Elizabeth, Tucci Felicia Anna, Bashford-Rogers Rachael Jennifer Mary

2021-May-24

ADT, B cell receptor, CITE-seq, T cell receptor, doublets, multi-omics profiling, single-cell transcriptomics

Pathology

Integration of histopathological images and multi-dimensional omics analyses predicts molecular features and prognosis in high-grade serous ovarian cancer.

In Gynecologic oncology ; h5-index 67.0

OBJECTIVE : This study used histopathological image features to predict molecular features, and combined with multi-dimensional omics data to predict overall survival (OS) in high-grade serous ovarian cancer (HGSOC).

METHODS : Patients from The Cancer Genome Atlas (TCGA) were divided into a training set (n = 115) and a test set (n = 114). In addition, we collected tissue microarrays from 92 patients as an external validation set. Quantitative features were extracted from histopathological images using CellProfiler and utilized to establish prediction models with machine learning methods in the training set. The prediction performance was assessed in the test set and the validation set.

RESULTS : The prediction models were able to identify BRCA1 mutation (AUC = 0.952), BRCA2 mutation (AUC = 0.912), microsatellite instability-high (AUC = 0.919), microsatellite stable (AUC = 0.924), and molecular subtypes: proliferative (AUC = 0.961), differentiated (AUC = 0.952), immunoreactive (AUC = 0.941), mesenchymal (AUC = 0.918) in test set. The prognostic model based on histopathological image features could predict OS in test set (5-year AUC = 0.825) and validation set (5-year AUC = 0.703). We next explored the integrative prognostic models of image features, genomics, transcriptomics and proteomics. In test set, the models combining two omics had higher prediction accuracy, such as image features and genomics (5-year AUC = 0.834). The multi-omics model including all features showed the best prediction performance (5-year AUC = 0.911). According to risk score of multi-omics model, the high-risk and low-risk groups had significant survival differences (HR = 18.23, p < 0.001).

CONCLUSIONS : These results indicated the potential ability of histopathological image features to predict above molecular features and survival risk of HGSOC patients. The integration of image features and multi-omics data may improve prognosis prediction in HGSOC patients.

Zeng Hao, Chen Linyan, Zhang Mingxuan, Luo Yuling, Ma Xuelei

2021-Jul-15

Genomics, Histopathology, Ovarian cancer, Proteomics, Transcriptomics
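The high-risk versus low-risk stratification reported above is typically a median split on a learned risk score. A toy version with invented weights and feature rows (the study's actual score combines image and multi-omics features):

```python
def risk_scores(feature_rows, weights):
    # Linear risk score: weighted sum of each patient's features.
    return [sum(w * x for w, x in zip(weights, row)) for row in feature_rows]

def median_split(scores):
    # Patients at or above the median score are called high-risk.
    cut = sorted(scores)[len(scores) // 2]
    return ["high" if s >= cut else "low" for s in scores]

# Invented per-patient feature rows and weights, for illustration only.
rows = [[0.2, 1.1], [0.9, 0.3], [0.5, 0.8], [0.1, 0.2], [0.7, 0.9]]
groups = median_split(risk_scores(rows, weights=[0.6, 0.4]))
print(groups)  # → ['low', 'high', 'high', 'low', 'high']
```

Survival differences between the resulting groups can then be tested with a log-rank statistic or summarized as a hazard ratio, as the abstract reports.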

Oncology

Comparison of Clinical Characteristics Among COVID-19 and Non-COVID-19 Pediatric Pneumonias: A Multicenter Cross-Sectional Study.

In Frontiers in cellular and infection microbiology ; h5-index 53.0

Background : The pandemic of Coronavirus Disease 2019 (COVID-19) brings new challenges for pediatricians, especially in differentiating it from non-COVID-19 pneumonia during the peak pneumonia season. We aimed to compare the clinical characteristics of pediatric patients with COVID-19 pneumonia and pneumonias caused by other respiratory pathogens.

Methods : We conducted a multi-center, cross-sectional study of pediatric inpatients in China. Based on pathogenic test results, pediatric patients were divided into three groups, including COVID-19 pneumonia group, Non-COVID-19 viral (NCV) pneumonia group and Non-viral (NV) pneumonia group. Their clinical characteristics were compared by Kruskal-Wallis H test or chi-square test.

Results : A total of 636 pediatric pneumonia inpatients were included in the analysis: 87 in the COVID-19 group, 194 in the NCV group, and 355 in the NV group. Compared with NCV and NV patients, COVID-19 patients were older (median age 6.33 years, IQR 2.00-12.00), and relatively fewer COVID-19 patients presented with fever (63.2%), cough (60.9%), shortness of breath (1.1%), or abnormal pulmonary auscultation (18.4%). The results were verified by comparing COVID-19, respiratory syncytial virus (RSV), and influenza A (IFA) pneumonia patients. Approximately 42.5%, 44.8%, and 12.6% of the COVID-19 patients presented with only ground-glass opacity (GGO), only consolidation, and both changes on computed tomography (CT) scans, respectively; these proportions were similar to those in the NCV and NV groups (p>0.05). Only 47.1% of COVID-19 patients had bilateral pneumonia, significantly lower than the proportion of nearly 80% in the other two groups. COVID-19 patients showed lower proportions of increased white blood cell count (16.5%) and abnormal procalcitonin (PCT) (10.7%), and a higher proportion of decreased lymphocyte count (44.0%), compared with the other two groups.

Conclusion : Most clinical characteristics of pediatric COVID-19 pneumonia were milder than those of non-COVID-19 pneumonias. However, lymphocytopenia remained a prominent feature of pediatric COVID-19 pneumonia.

Jia Zhongwei, Yan Xiangyu, Gao Liwei, Ding Shenggang, Bai Yan, Zheng Yuejie, Cui Yuxia, Wang Xianfeng, Li Jingfeng, Lu Gen, Xu Yi, Zhang Xiangyu, Li Junhua, Chen Ning, Shang Yunxiao, Han Mingfeng, Liu Jun, Zhou Hourong, Li Cen, Lu Wanqiu, Liu Jun, Wang Lina, Fan Qihong, Wu Jiang, Shen Hanling, Jiao Rong, Chen Chunxi, Gao Xiaoling, Tian Maoqiang, Lu Wei, Yang Yonghong, Wong Gary Wing-Kin, Wang Tianyou, Jin Runming, Shen Adong, Xu Baoping, Shen Kunling

2021

COVID-19 pneumonia, clinical characteristics, non-viral pneumonia, pediatric patients, viral pneumonia

General General

Responsible and Regulatory Conform Machine Learning for Medicine: A Survey of Technical Challenges and Solutions

ArXiv Preprint

Machine learning is expected to fuel significant improvements in medical care. To ensure that fundamental principles such as beneficence, respect for human autonomy, prevention of harm, justice, privacy, and transparency are respected, medical machine learning applications must be developed responsibly. In this paper, we survey the technical challenges involved in creating medical machine learning systems responsibly and in conformity with existing regulations, as well as possible solutions to address these challenges. We begin by providing a brief overview of existing regulations affecting medical machine learning, showing that properties such as safety, robustness, reliability, privacy, security, transparency, explainability, and nondiscrimination are all demanded already by existing law and regulations - albeit, in many cases, to an uncertain degree. Next, we discuss the underlying technical challenges, possible ways for addressing them, and their respective merits and drawbacks. We notice that distribution shift, spurious correlations, model underspecification, and data scarcity represent severe challenges in the medical context (and others) that are very difficult to solve with classical black-box deep neural networks. Important measures that may help to address these challenges include the use of large and representative datasets and federated learning as a means to that end, the careful exploitation of domain knowledge wherever feasible, the use of inherently transparent models, comprehensive model testing and verification, as well as stakeholder inclusion.

Eike Petersen, Yannik Potdevin, Esfandiar Mohammadi, Stephan Zidowitz, Sabrina Breyer, Dirk Nowotka, Sandra Henn, Ludwig Pechmann, Martin Leucker, Philipp Rostalski, Christian Herzog

2021-07-20

General General

Data Science and Analytics: An Overview from Data-Driven Smart Computing, Decision-Making and Applications Perspective.

In SN computer science

The digital world has a wealth of data, such as internet of things (IoT) data, business data, health data, mobile data, urban data, security data, and many more, in the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR). Extracting knowledge or useful insights from these data enables smart decision-making in various application domains. In the area of data science, advanced analytics methods including machine learning modeling can provide actionable insights or deeper knowledge about data, making the computing process automatic and smart. In this paper, we present a comprehensive view of "Data Science", including the various types of advanced analytics methods that can be applied to enhance the intelligence and capabilities of an application through smart decision-making in different scenarios. We also discuss and summarize ten potential real-world application domains, including business, healthcare, cybersecurity, and urban and rural data science, taking into account data-driven smart computing and decision-making. On this basis, we finally highlight the challenges and potential research directions within the scope of our study. Overall, this paper aims to serve as a reference point on data science and advanced analytics for researchers, decision-makers, and application developers, particularly from the data-driven solution point of view for real-world problems.

Sarker Iqbal H

2021

Advanced analytics, Data science, Data science applications, Decision-making, Deep learning, Machine learning, Predictive analytics, Smart computing

General General

A Full-Stack Application for Detecting Seizures and Reducing Data During Continuous Electroencephalogram Monitoring.

In Critical care explorations

Continuous electroencephalogram monitoring is associated with lower mortality in critically ill patients; however, it is underused due to the resource-intensive nature of manually interpreting prolonged streams of continuous electroencephalogram data. Here, we present a novel real-time, machine learning-based alerting and monitoring system for epilepsy and seizures that dramatically reduces the amount of manual electroencephalogram review.

METHODS : We developed a custom data reduction algorithm using a random forest and deployed it within an online cloud-based platform, which streams data and communicates interactively with caregivers via a web interface to display algorithm results. We trained the real-time, machine learning-based alerting and monitoring system on continuous electroencephalogram recordings from 77 patients undergoing routine scalp ICU electroencephalogram monitoring and tested it on an additional 20 patients.

RESULTS : We achieved a mean seizure sensitivity of 84% in cross-validation and 85% in testing, as well as a mean specificity of 83% in cross-validation and 86% in testing, corresponding to a high level of data reduction. This study validates a platform for machine learning-assisted continuous electroencephalogram analysis and represents a meaningful step toward improving utility and decreasing cost of continuous electroencephalogram monitoring. We also make our high-quality annotated dataset of 97 ICU continuous electroencephalogram recordings public for others to validate and improve upon our methods.
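
The sensitivity and specificity figures reported above can be computed from a confusion matrix; the sketch below uses invented epoch labels purely for illustration.

```python
# Sensitivity (seizure detection rate) and specificity from binary
# predictions, the two metrics reported for the seizure detector.

def sensitivity_specificity(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # fraction of true seizures detected
    specificity = tn / (tn + fp)   # fraction of non-seizure epochs kept out
    return sensitivity, specificity

# Illustrative labels: 1 = seizure epoch, 0 = non-seizure epoch.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # 0.75 and 5/6
```

High specificity is what drives the data reduction: the fewer false positives, the fewer epochs a reviewer must inspect.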

Bernabei John M, Owoputi Olaoluwa, Small Shyon D, Nyema Nathaniel T, Dumenyo Elom, Kim Joongwon, Baldassano Steven N, Painter Christopher, Conrad Erin C, Ganguly Taneeta M, Balu Ramani, Davis Kathryn A, Levine Joshua M, Pathmanathan Jay, Litt Brian

2021-Jul

critical care, electroencephalography, epilepsy, machine learning, seizures, software

General General

Principles and Practice of Explainable Machine Learning.

In Frontiers in big data

Artificial intelligence (AI) provides many opportunities to improve private and public life. Discovering patterns and structures in large troves of data in an automated manner is a core component of data science, and currently drives applications in diverse areas such as computational biology, law and finance. However, this highly positive impact is coupled with a significant challenge: how do we understand the decisions suggested by these systems so that we can trust them? In this report, we focus specifically on data-driven methods, machine learning (ML) and pattern recognition models in particular, to survey and distill the results and observations from the literature. The purpose of this report can be especially appreciated by noting that ML models are increasingly deployed in a wide range of businesses. However, with the increasing prevalence and complexity of methods, business stakeholders, at the very least, have a growing number of concerns about the drawbacks of models, data-specific biases, and so on. Analogously, data science practitioners are often not aware of approaches emerging from the academic literature, or may struggle to appreciate the differences between methods, and so end up using industry standards such as SHAP. Here, we have undertaken a survey to help industry practitioners (and data scientists more broadly) better understand the field of explainable machine learning and apply the right tools. Our latter sections build a narrative around a putative data scientist and discuss how she might go about explaining her models by asking the right questions. From an organizational viewpoint, after motivating the area broadly, we discuss the main developments, including the principles that allow us to study transparent models vs. opaque models, as well as model-specific and model-agnostic post-hoc explainability approaches. We also briefly reflect on deep learning models, and conclude with a discussion about future research directions.
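
One of the simplest model-agnostic post-hoc explainability methods the survey's taxonomy covers is permutation feature importance: shuffle one feature and measure the drop in accuracy. The sketch below uses a hypothetical rule-based "model" and synthetic data so it stays self-contained; it is not code from the survey.

```python
import random

def model_predict(row):
    # Hypothetical toy classifier: predicts 1 when feature 0 exceeds 0.5.
    return 1 if row[0] > 0.5 else 0

def accuracy(X, y, predict):
    return sum(1 for row, t in zip(X, y) if predict(row) == t) / len(y)

def permutation_importance(X, y, predict, feature, seed=0):
    """Drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(X, y, predict)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X, column)]
    return baseline - accuracy(X_perm, y, predict)

rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]  # labels depend on feature 0 only

print(permutation_importance(X, y, model_predict, 0))  # large drop
print(permutation_importance(X, y, model_predict, 1))  # 0.0: feature unused
```

Because the method only needs predictions, it applies to any opaque model, which is exactly what "model-agnostic" means in the survey's sense.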

Belle Vaishak, Papantonis Ioannis

2021

black-box models, explainable AI, machine learning, survey, transparent models

General General

Fast and Cost-Effective Mathematical Models for Hydrocarbon-Immiscible Water Alternating Gas Incremental Recovery Factor Prediction.

In ACS omega

Predicting the incremental recovery factor of an enhanced oil recovery (EOR) technique is a crucial task. It requires significant investment and expert knowledge to evaluate the EOR incremental recovery factor, design a pilot, and upscale pilot results. Water-alternating-gas (WAG) injection is one of the proven EOR technologies, with an incremental recovery factor typically ranging from 5 to 10%. The current approach to evaluating the WAG process, reservoir modeling, is a very time-consuming and costly task. The objective of this research is to develop fast and cost-effective mathematical models for evaluating the hydrocarbon-immiscible WAG (HC-IWAG) incremental recovery factor for medium-to-light oil in undersaturated reservoirs, designing WAG pilots, and upscaling pilot results. This integrated research involved a WAG literature review, WAG modeling, and selected machine learning techniques, namely stepwise regression and the group method of data handling. First, the important parameters for predicting the WAG incremental recovery factor were selected, including reservoir properties, rock and fluid properties, and the WAG injection scheme. Second, extensive WAG and waterflood modeling was carried out, involving more than a thousand reservoir models. Third, WAG incremental recovery factor mathematical predictive models were developed and tested using the group method of data handling and stepwise regression techniques. The HC-IWAG incremental recovery factor models achieved a coefficient of determination of about 0.75 using 13 predictors. The developed WAG predictive models are interpretable, user-friendly mathematical formulas that can help subsurface teams in a variety of ways: identifying the best candidates for WAG injection, evaluating and optimizing the WAG process, designing successful WAG pilots, and upscaling WAG pilot results to full-field scale, all in a short time, at low cost, and with reasonable accuracy.
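
The coefficient of determination quoted above (about 0.75) measures how much of the variance in the recovery factor the formulas explain; a minimal sketch, with made-up recovery factors, looks like this:

```python
# Coefficient of determination (R^2): 1 - SS_res / SS_tot, the
# goodness-of-fit measure quoted for the incremental-recovery-factor models.

def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical incremental recovery factors (%) vs. model predictions.
observed  = [5.2, 6.8, 7.5, 9.1, 10.0]
predicted = [5.0, 7.0, 7.2, 9.5, 9.8]
print(round(r_squared(observed, predicted), 3))
```

R^2 = 1 means a perfect fit; R^2 = 0 means the model does no better than predicting the mean recovery factor.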

Belazreg Lazreg, Mahmood Syed Mohammad, Aulia Akmal

2021-Jul-13

General General

Mycobacterium tuberculosis Cell Wall Permeability Model Generation Using Chemoinformatics and Machine Learning Approaches.

In ACS omega

The drug-resistant strains of Mycobacterium tuberculosis (M.tb) are evolving at an alarming rate, indicating the urgent need for the development of novel antitubercular drugs. However, genetic mutations, the complex cell wall system of M.tb, and influx-efflux transporter systems are major permeability barriers that significantly affect the activity of M.tb drugs. Thus, most small molecules are ineffective at arresting M.tb cell growth, even though they are effective at the cellular level. To address the permeability issue, different machine learning models that effectively distinguish permeable and impermeable compounds were developed. The enzyme-based (IC50) and cell-based (minimal inhibitory concentration) data were considered for the classification of M.tb permeable and impermeable compounds. It was assumed that compounds with high activity in both enzyme-based and cell-based assays possess the required M.tb cell wall permeability. The XGBoost model outperformed the other models generated with algorithms such as random forest, support vector machine, and naïve Bayes, and was further validated using the validation data set (21 permeable and 19 impermeable compounds). The obtained machine learning models suggested that descriptors such as molecular weight, atom type, electrotopological state, hydrogen bond donor/acceptor counts, and extended topochemical atoms of molecules are the major determining factors for both M.tb cell permeability and inhibitory activity. Furthermore, potential antimycobacterial drugs were identified using computational drug repurposing. All approved drugs from DrugBank were collected and screened using the developed permeability model. The screened compounds were given as input to the PASS server for the identification of possible antimycobacterial compounds. The drugs retained after the two filters were docked to the active sites of 10 different potential antimycobacterial drug targets. The results of this study may improve the understanding of M.tb permeability and activity, aiding the development of novel antimycobacterial drugs.

Nagamani Selvaraman, Sastry G Narahari

2021-Jul-13

General General

Density Functional Theory and Machine Learning Description and Prediction of Oxygen Atom Chemisorption on Platinum Surfaces and Nanoparticles.

In ACS omega

Elucidating chemical interactions between catalyst surfaces and adsorbates is crucial for understanding surface chemical reactivity. Herein, interactions between O atoms and Pt surfaces and nanoparticles are described as a linear combination of the properties of pristine surfaces and isolated nanoparticles. The energetics of O chemisorption onto Pt surfaces were described using only two descriptors related to surface geometrical features. The relatively high coefficient of determination and low mean absolute error between the density functional theory-calculated and predicted O binding energies indicate good accuracy of the model. For Pt nanoparticles, O binding is described by the geometrical features and electronic properties of isolated nanoparticles. Using a linear combination of five descriptors and accounting for nanoparticle size effects and adsorption site types, the O binding energy was estimated with a higher accuracy than with conventional single-descriptor models. Finally, these five descriptors were used in a general model that decomposes O binding energetics on Pt surfaces and nanoparticles. Good correlation was achieved between the calculated and predicted O binding energies, and model validation confirmed its accuracy. This is the first model that considers the nanoparticle size effect and all possible adsorption sites on Pt nanoparticles and surfaces.

Rivera Rocabado David S, Nanba Yusuke, Koyama Michihisa

2021-Jul-13

General General

Accelerating the Selection of Covalent Organic Frameworks with Automated Machine Learning.

In ACS omega

Covalent organic frameworks (COFs) have the advantages of high thermal stability and large specific surface area and have great application prospects in the fields of gas storage and catalysis. This article focuses on the methane (CH4) working capacity of COFs. Due to the vast number of possible COF structures, it is time-consuming to use traditional calculation methods to find suitable materials, so it is important to apply appropriate machine learning (ML) algorithms to build accurate prediction models. A major obstacle to the use of ML algorithms is that an algorithm's performance may be affected by many design decisions, and finding appropriate algorithms and model parameters is quite a challenge for nonprofessionals. In this work, we use automated machine learning (AutoML) to analyze the CH4 working capacity of 403,959 COFs. We explore the relationship between the working capacity and 23 features, such as the structure, chemical characteristics, and atom types of the COFs. We then compare the tree-based pipeline optimization tool (TPOT), an AutoML method, with traditional ML methods with manually set model parameters, including multiple linear regression, support vector machine, decision tree, and random forest. We find that TPOT not only removes the need for complex data preprocessing and model parameter tuning but also shows higher performance than the traditional ML models, and, compared with traditional grand canonical Monte Carlo simulations, it can save a lot of time. AutoML lowers the barrier for researchers outside the field, allowing them to configure experiments automatically and obtain highly accurate and easy-to-understand results, which is of great significance for materials screening.
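
The core idea behind AutoML tools like TPOT, evaluating candidate model configurations on held-out data and keeping the best, can be illustrated with a drastically simplified sketch (TPOT itself searches whole pipelines with genetic programming; nothing here reproduces its actual API, and the data are invented):

```python
# Minimal model-selection loop: score each candidate configuration
# (here, the k of a toy 1-D k-NN classifier) on a held-out split and
# keep the best one.

def knn_predict(train_X, train_y, x, k):
    neighbors = sorted(zip(train_X, train_y), key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 >= k else 0  # majority vote, ties go to class 1

def evaluate(train, valid, k):
    train_X, train_y = train
    valid_X, valid_y = valid
    correct = sum(1 for x, t in zip(valid_X, valid_y)
                  if knn_predict(train_X, train_y, x, k) == t)
    return correct / len(valid_y)

# Toy 1-D data: class 1 for x >= 0.5.
train = ([0.1, 0.2, 0.3, 0.6, 0.7, 0.9], [0, 0, 0, 1, 1, 1])
valid = ([0.15, 0.4, 0.55, 0.8], [0, 0, 1, 1])

best = max([1, 3, 5], key=lambda k: evaluate(train, valid, k))
print("best k:", best, "accuracy:", evaluate(train, valid, best))
```

Real AutoML systems add cross-validation, pipeline operators (preprocessing, feature selection), and a smarter search strategy on top of this loop.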

Yang Peisong, Zhang Huan, Lai Xin, Wang Kunfeng, Yang Qingyuan, Yu Duli

2021-Jul-13

General General

Improving the phishing website detection using empirical analysis of Function Tree and its variants.

In Heliyon

The phishing attack is one of the most complex threats that have put internet users and legitimate web resource owners at risk. The recent rise in the number of phishing attacks has instilled distrust in legitimate internet users, making them feel less safe even in the presence of powerful antivirus apps. Reports of rising financial damage from phishing website attacks have caused grave concern. Several methods, including blacklists and machine learning-based models, have been proposed to combat phishing website attacks. The blacklist anti-phishing method has been faulted for failing to detect new phishing URLs due to its reliance on compiled lists of blacklisted phishing URLs, while many ML methods for detecting phishing websites report relatively low detection accuracy and high false alarm rates. Hence, this research proposes Functional Tree (FT)-based meta-learning models for detecting phishing websites; that is, it investigates improving phishing website detection through an empirical analysis of FT and its variants. The proposed models outperformed the baseline classifiers, meta-learners, and hybrid models used for phishing website detection in existing studies. Moreover, the proposed FT-based meta-learners detect legitimate and phishing websites with accuracy as high as 98.51% and a false positive rate as low as 0.015. Hence, the deployment and adoption of FT and its meta-learner variants for phishing website detection and related cybersecurity applications are recommended.

Balogun Abdullateef O, Adewole Kayode S, Raheem Muiz O, Akande Oluwatobi N, Usman-Hamza Fatima E, Mabayoje Modinat A, Akintola Abimbola G, Asaju-Gbolagade Ayisat W, Jimoh Muhammed K, Jimoh Rasheed G, Adeyemo Victor E

2021-Jul

Bagging, Boosting, Ensemble, Functional trees, Machine learning, Meta-learning, Phishing websites, Rotation forest

General General

Predictive modelling of hypoxic ischaemic encephalopathy risk following perinatal asphyxia.

In Heliyon

Hypoxic Ischemic Encephalopathy (HIE) remains a major cause of neurological disability. Early intervention with therapeutic hypothermia improves outcome, but prediction of HIE is difficult and no single clinical marker is reliable. Machine learning algorithms may allow identification of patterns in clinical data to improve prognostic power. Here we examine the use of a Random Forest machine learning algorithm with five-fold cross-validation to predict the occurrence of HIE in a prospective cohort of infants with perinatal asphyxia. Infants with perinatal asphyxia were recruited at birth, and their neonatal course was followed for the development of HIE. Clinical variables were recorded for each infant, including maternal demographics, delivery details, and the infant's condition at birth. We found that the strongest predictors of HIE were the infant's condition at birth (as expressed by the Apgar score), the need for resuscitation, and the first postnatal measures of pH, lactate, and base deficit. Random Forest models combining features including Apgar score, most intensive resuscitation, maternal age, and infant birth weight, both with and without the biochemical markers of pH, lactate, and base deficit, resulted in a sensitivity of 56-100% and a specificity of 78-99%. This study presents a dynamic method of rapid classification that has the potential to be easily adapted and implemented in a clinical setting, with or without the availability of blood gas analysis. Our results demonstrate that applying machine learning algorithms to readily available clinical data may support clinicians in the early and accurate identification of infants who will develop HIE. We anticipate our models to be a starting point for the development of a more sophisticated clinical decision support system to help identify which infants will benefit from early therapeutic hypothermia.
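
Five-fold cross-validation, the evaluation scheme used here, partitions the cohort so that every infant appears in exactly one validation fold; a minimal sketch of the split logic (not the study's code) is:

```python
import random

# k-fold cross-validation split: shuffle the sample indices once, then
# deal them into k disjoint validation folds; each fold's training set
# is everything not in that fold.

def k_fold_indices(n_samples, k=5, seed=0):
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # round-robin deal into k folds
    return [(sorted(set(indices) - set(fold)), sorted(fold))
            for fold in folds]

# Toy cohort of 10 infants split into 5 folds of 2.
splits = k_fold_indices(10, k=5)
for train_idx, valid_idx in splits:
    print("validate on:", valid_idx)
```

A model (e.g. a Random Forest) is then fit on each training set and scored on the matching validation fold, and the k scores are averaged.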

Mooney Catherine, O’Boyle Daragh, Finder Mikael, Hallberg Boubou, Walsh Brian H, Henshall David C, Boylan Geraldine B, Murray Deirdre M

2021-Jul

Acidosis, Clinical risk prediction, Hypoxic ischaemic encephalopathy, Machine learning, Neonatal encephalopathy, Perinatal asphyxia

General General

Dataset for toothbrushing activity using brush-attached and wearable sensors.

In Data in brief

Maintaining oral hygiene is very important for a healthy life. Poor toothbrushing is one of the leading causes of tooth decay and other gum problems, yet many people do not brush their teeth properly, and very limited technology is available to help assess toothbrushing quality. Human Activity Recognition (HAR) applications have seen tremendous growth in recent years. In this work, we treat adherence to standard toothbrushing practice as an activity recognition problem. We investigate this problem and collect experimental data using a brush-attached and a wearable sensor while users brush their teeth. In this paper, we extend our previous toothbrushing activity dataset [1] by including more experiments and adding a new sensor, and we discuss and analyse the collection of the dataset. We use an Inertial Measurement Unit (IMU) sensor to collect the time-series data for the toothbrushing activity. We recruited 22 healthy participants and collected data in two different settings as they brushed their teeth in five different locations using both electric and manual brushes. In total, we recorded 120 toothbrushing sessions using both the brush-attached and the wearable sensor.

Hussain Zawar, Waterworth David, Mahmood Adnan, Sheng Quan Z, Zhang Wei Emma

2021-Aug

Activity recognition, Machine learning, Sensor, Smart toothbrush, Toothbrushing

Radiology Radiology

Using deep learning convolutional neural networks to automatically perform cerebral aqueduct CSF flow analysis.

In Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia

Since the development of phase-contrast magnetic resonance imaging (PC-MRI), quantification of cerebrospinal fluid (CSF) flow across the cerebral aqueduct has been utilized for the diagnosis of conditions such as normal pressure hydrocephalus (NPH). This study aims to develop an automated method of aqueduct CSF flow analysis using convolutional neural networks (CNNs), which can replace the current standard involving manual segmentation of an aqueduct region of interest (ROI). Retrospective analysis was performed on 333 patients who underwent PC-MRI, totaling 353 imaging studies. Aqueduct flow measurements using manual ROI placement were performed independently by two radiologists. Two CNN architectures, MultiResUNet and UNet, were trained using ROI data from the senior radiologist, with the PC-MRI studies randomly divided into training (80%) and validation (20%) datasets. Segmentation performance was assessed using the Dice similarity coefficient (DSC), and CSF flow parameters were calculated from both manual and CNN-derived ROIs. MultiResUNet, UNet, and the second radiologist (Rater 2) had DSCs of 0.933, 0.928, and 0.867, respectively, with p < 0.001 between the CNNs and Rater 2. Comparison of CSF flow parameters showed excellent intraclass correlation coefficients (ICCs) for MultiResUNet, with the lowest being 0.67; for UNet, lower ICCs of -0.01 to 0.56 were observed. Only 3/353 (0.8%) studies failed to have appropriate ROIs placed by MultiResUNet, compared with 12/353 (3.4%) failed cases for UNet. In conclusion, CNNs were able to measure aqueductal CSF flow with performance similar to that of a senior neuroradiologist, and MultiResUNet demonstrated fewer segmentation failures and more consistent flow measurements than the widely adopted UNet.
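
The Dice similarity coefficient used to score the segmentations is twice the overlap of two masks divided by the sum of their sizes; a minimal sketch on flattened toy masks (not the study's data) is:

```python
# Dice similarity coefficient between two binary segmentation masks,
# the overlap metric used to compare CNN-derived and manual ROIs.

def dice_coefficient(mask_a, mask_b):
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a == 1 and b == 1)
    return 2 * intersection / (sum(mask_a) + sum(mask_b))

# Flattened toy masks (1 = pixel inside the aqueduct ROI).
manual_roi = [0, 1, 1, 1, 0, 0, 1, 0]
cnn_roi    = [0, 1, 1, 0, 0, 0, 1, 1]
print(dice_coefficient(manual_roi, cnn_roi))  # 2*3 / (4+4) = 0.75
```

A DSC of 1.0 means identical masks; the reported 0.93+ scores indicate near-complete overlap between CNN and senior-radiologist ROIs.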

Tsou Cheng-Hsien, Cheng Yun-Chung, Huang Chin-Yin, Chen Jeon-Hor, Chen Wen-Hsien, Chai Jyh-Wen, Chen Clayton Chi-Chang

2021-Aug

Cerebral aqueduct, Cerebrospinal fluid, Deep learning, Magnetic resonance imaging

General General

Correcting data imbalance for semi-supervised COVID-19 detection using X-ray chest images.

In Applied soft computing

A key factor in the fight against viral diseases such as the coronavirus (COVID-19) is the identification of virus carriers as early and quickly as possible, in a cheap and efficient manner. The application of deep learning to the classification of chest X-ray images of COVID-19 patients could become a useful pre-diagnostic detection methodology. However, deep learning architectures require large labelled datasets. This is often a limitation when the subject of research is relatively new, as in the case of the virus outbreak, where dealing with small labelled datasets is a challenge. Moreover, in such a context, the datasets are also highly imbalanced, with few observations from positive cases of the new disease. In this work we evaluate the performance of the semi-supervised deep learning architecture known as MixMatch with a very limited number of labelled observations and highly imbalanced labelled datasets. We demonstrate the critical impact of data imbalance on the model's accuracy. Therefore, we propose a simple approach for correcting data imbalance by re-weighting each observation in the loss function, giving a higher weight to observations corresponding to the under-represented class. For unlabelled observations, we use the pseudo and augmented labels calculated by MixMatch to choose the appropriate weight. The proposed method improved classification accuracy by up to 18% with respect to the non-balanced MixMatch algorithm. We tested our proposed approach with several available datasets using 10, 15 and 20 labelled observations for binary classification (COVID-19 positive and normal cases). For multi-class classification (COVID-19 positive, pneumonia and normal cases), we tested 30, 50, 70 and 90 labelled observations. Additionally, a new dataset is included among the tested datasets, composed of chest X-ray images of Costa Rican adult patients.
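
The re-weighting idea described above can be sketched for a plain binary cross-entropy loss: each observation's loss term is scaled by the inverse frequency of its class, so the under-represented class weighs more. The paper applies this inside MixMatch's loss; the numbers below are invented, and this shows only the weighting mechanism.

```python
import math

def class_weights(labels):
    """Inverse-frequency class weights: w_c = N / (num_classes * count_c)."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * count) for c, count in counts.items()}

def weighted_cross_entropy(labels, probs, weights):
    """Per-observation weighted binary cross-entropy (mean over the batch)."""
    total = 0.0
    for y, p in zip(labels, probs):
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1 - p))
    return total / len(labels)

# Imbalanced toy batch: 1 positive (COVID-19) vs. 4 negatives.
labels = [1, 0, 0, 0, 0]
probs  = [0.6, 0.2, 0.1, 0.3, 0.2]   # predicted P(class = 1)
w = class_weights(labels)
print(w)  # the rare positive class gets the larger weight
print(round(weighted_cross_entropy(labels, probs, w), 3))
```

With these weights, a misclassified positive observation contributes four times as much gradient as a misclassified negative, counteracting the 4:1 imbalance.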

Calderon-Ramirez Saul, Yang Shengxiang, Moemeni Armaghan, Elizondo David, Colreavy-Donnelly Simon, Chavarría-Estrada Luis Fernando, Molina-Cabello Miguel A

2021-Nov

COVID-19, Computer aided diagnosis, Coronavirus, Data imbalance, Semi-supervised learning

Surgery Surgery

Hybrid neural network reduced order modelling for turbulent flows with geometric parameters

ArXiv Preprint

Geometrically parametrized partial differential equations are nowadays widely used in many different fields, such as shape optimization processes or patient-specific surgery studies. This work focuses on advances in this area that increase accuracy over previous approaches while maintaining a high cost-benefit ratio. Its main contribution is a new technique that combines a classical Galerkin-projection approach with a data-driven method to obtain a versatile and accurate algorithm for the resolution of geometrically parametrized incompressible turbulent Navier-Stokes problems. The effectiveness of this procedure is demonstrated on two different test cases: a classical academic back step problem and a shape-deformation Ahmed body application. The results detail the properties of the architecture we developed and outline possible future directions for this work.

Matteo Zancanaro, Markus Mrosek, Giovanni Stabile, Carsten Othmer, Gianluigi Rozza

2021-07-20

Radiology Radiology

Artificial Intelligence in the Management of Anterior Cruciate Ligament Injuries.

In Orthopaedic journal of sports medicine

Background : Technological innovation is a key component of orthopaedic surgery. With the integration of powerful technologies in surgery and clinical practice, artificial intelligence (AI) may become an important tool for orthopaedic surgeons in the future. Through adaptive learning and problem solving that serve to constantly increase accuracy, machine learning algorithms show great promise in orthopaedics.

Purpose : To investigate the current and potential uses of AI in the management of anterior cruciate ligament (ACL) injury.

Study Design : Systematic review; Level of evidence, 3.

Methods : A systematic review of the PubMed, MEDLINE, Embase, Web of Science, and SPORTDiscus databases between their start and August 12, 2020, was performed by 2 independent reviewers. Inclusion criteria included application of AI anywhere along the spectrum of predicting, diagnosing, and managing ACL injuries. Exclusion criteria included non-English publications, conference abstracts, review articles, and meta-analyses. Statistical analysis could not be performed because of data heterogeneity; therefore, a descriptive analysis was undertaken.

Results : A total of 19 publications were included after screening. Applications were divided based on the different stages of the clinical course in ACL injury: prediction (n = 2), diagnosis (n = 12), intraoperative application (n = 1), and postoperative care and rehabilitation (n = 4). AI-based technologies were used in a wide variety of applications, including image interpretation, automated chart review, assistance in the physical examination via optical tracking using infrared cameras or electromagnetic sensors, generation of predictive models, and optimization of postoperative care and rehabilitation.

Conclusion : There is an increasing interest in AI among orthopaedic surgeons, as reflected by the applications for ACL injury presented in this review. Although some studies showed similar or better outcomes using AI compared with traditional techniques, many challenges need to be addressed before this technology is ready for widespread use.

Corban Jason, Lorange Justin-Pierre, Laverdiere Carl, Khoury Jason, Rachevsky Gil, Burman Mark, Martineau Paul Andre

2021-Jul

anterior cruciate ligament, gait analysis, general, imaging and radiology, injury prevention, physical therapy/rehabilitation

General General

Deep Learning Based Prediction of Atrial Fibrillation Disease Progression with Endocardial Electrograms in a Canine Model.

In Computing in cardiology

Objective : We sought to determine whether electrical patterns in endocardial wavefronts contained elements specific to atrial fibrillation (AF) disease progression.

Methods : A canine paced model (n=7, female mongrel, 29±2 kg) of persistent AF was endocardially mapped with a 64-electrode basket catheter during periods of AF at 1 month, 3 months, and 6 months post-implantation of the stimulator. A 50-layer residual network was then trained to map half-second electrogram samples to their source timepoint.

Results : The trained network achieved final validation and testing accuracies of 51.6% and 48.5%, respectively. Per-class F1 scores were 24%, 59%, and 53% for the 1 month, 3 month, and 6 month inputs from the testing dataset.

Conclusion : Differentiation of AF based on its time progression was shown to be feasible with a deep learning method. This is promising for differentiating treatment based on disease progression though low accuracy with earlier timepoints may be an obstacle to identifying nascent AF.
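
The per-class F1 scores reported above are the harmonic mean of precision and recall computed one class at a time; a minimal sketch with invented timepoint labels is:

```python
# Per-class F1: for each class c, treat c as the positive class, compute
# precision and recall, and take their harmonic mean.

def f1_per_class(y_true, y_pred, classes):
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[c] = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
    return scores

# Toy timepoint labels for illustration only.
y_true = ["1m", "1m", "3m", "3m", "6m", "6m"]
y_pred = ["3m", "1m", "3m", "3m", "6m", "3m"]
print(f1_per_class(y_true, y_pred, ["1m", "3m", "6m"]))
```

A low F1 for the 1 month class, as reported here, indicates that early-timepoint electrograms are frequently confused with the later classes.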

Hunt Bram, Kwan Eugene, McMillan Mark, Dosdall Derek, MacLeod Rob, Ranjan Ravi

2020-Sep

General General

The use of explainable artificial intelligence to explore types of fenestral otosclerosis misdiagnosed when using temporal bone high-resolution computed tomography.

In Annals of translational medicine

Background : The purpose of this study was to explore the common characteristics of fenestral otosclerosis (OS) cases that are misdiagnosed, and to develop a deep learning model for the diagnosis of fenestral OS based on temporal bone high-resolution computed tomography scans.

Methods : We conducted a study to explicitly analyze the clinical performance of otolaryngologists in diagnosing fenestral OS and developed an explainable deep learning model, using 134,574 temporal bone high-resolution computed tomography (HRCT) slices collected from 1,294 patients, for the automatic diagnosis of fenestral OS. We prospectively created an external test set of 31,774 CT slices from 144 patients, containing 86 fenestral OS ears and 202 normal ears, and used it to evaluate the performance of our otosclerosis-Logical Neural Network (LNN) model and assess its potential clinical utility. In addition, we compared the diagnostic acumen of seven otolaryngologists with the otosclerosis-LNN approach on a clinical test set containing 78 fenestral OS ears and 62 normal ears. Finally, to evaluate the assisting value of the model, the seven participants were again invited to classify all cases in the clinical test set, this time after referring to the diagnostic results of the model (to which they had previously been blinded).

Results : The diagnostic performance of the otologists was not satisfactory, and the CT samples that were misdiagnosed shared similar characteristics. Based on this finding, we defined three subtypes of fenestral OS lesions suitable for clinical diagnosis guidance: "focal", "transitional", and "typical" fenestral OS. Most encouragingly, the model achieved an area under the curve (AUC) of 99.5% (per-ear sensitivity of 96.4%, per-ear specificity of 98.9%) on the prospective external test set. Furthermore, we used this model to assist otologists and observed a consistent and significant improvement in diagnostic performance, especially for the newly defined focal and transitional fenestral OS, which had accounted for the initially high misdiagnosis rate.

Conclusions : Our findings on the fine-grained classification of fenestral OS could have implications for future diagnosis and prevention programs. In addition, our deep OS localization network is an effective approach for assisting otologists with the significant challenge of fenestral OS misdiagnosis.

Tan Weimin, Guan Pengfei, Wu Lingjie, Chen Hedan, Li Jichun, Ling Yu, Fan Ting, Wang Yunfeng, Li Jian, Yan Bo

2021-Jun

Fenestral otosclerosis, artificial intelligence (AI), deep learning, high-resolution computed tomography

Cardiology Cardiology

Automated Left Ventricle Ischemic Scar Detection in CT Using Deep Neural Networks.

In Frontiers in cardiovascular medicine

Objectives: The aim of this study is to develop a scar detection method for routine computed tomography angiography (CTA) imaging using deep convolutional neural networks (CNN), which relies solely on anatomical information as input and is compatible with existing clinical workflows. Background: Identifying cardiac patients with scar tissue is important for assisting diagnosis and guiding interventions. Late gadolinium enhancement (LGE) magnetic resonance imaging (MRI) is the gold standard for scar imaging; however, there are common instances where it is contraindicated. CTA is an alternative imaging modality that has fewer contraindications and is faster than cardiovascular MRI but is unable to reliably image scar. Methods: A dataset of LGE MRI (200 patients, 83 with scar) was used to train and validate a CNN to detect ischemic scar slices using segmentation masks as input to the network. MRIs were segmented to produce 3D left ventricle meshes, which were sampled at points along the short axis to extract anatomical masks, with scar labels from LGE as ground truth. The trained CNN was tested on an independent CTA dataset (25 patients, with ground truth established with paired LGE MRI). Automated segmentation was performed to provide the same input format of anatomical masks for the network. The CNN was compared against manual reading of the CTA dataset by 3 experts. Results: A cross-validated accuracy of 84.7% (AUC: 0.896) was achieved for detecting scar slices in the left ventricle on the MRI data. The trained network was then tested on the CTA-derived data, with no further training, where it achieved an accuracy of 88.3% (AUC: 0.901). The automated pipeline outperformed the manual reading by clinicians. Conclusion: Automatic ischemic scar detection can be performed from a routine cardiac CTA, without any scar-specific imaging or contrast agents, and requires only a single acquisition in the cardiac cycle.
In a clinical setting, with near zero additional cost, scar presence could be detected to triage images, reduce reading times, and guide clinical decision-making.

O’Brien Hugh, Whitaker John, Singh Sidhu Baldeep, Gould Justin, Kurzendorfer Tanja, O’Neill Mark D, Rajani Ronak, Grigoryan Karine, Rinaldi Christopher Aldo, Taylor Jonathan, Rhode Kawal, Mountney Peter, Niederer Steven

2021

automated classification, computed tomography angiography, convolutional neural network, deep learning, fibrosis, left ventricle

Pathology Pathology

Advances in Imaging Modalities, Artificial Intelligence, and Single Cell Biomarker Analysis, and Their Applications in Cytopathology.

In Frontiers in medicine

Several advances in recent decades in digital imaging, artificial intelligence, and multiplex modalities have improved our ability to automatically analyze and interpret imaging data. Imaging technologies such as optical coherence tomography, optical projection tomography, and quantitative phase microscopy allow analysis of tissues and cells in three dimensions and with subcellular granularity. Improvements in computer vision and machine learning have made algorithms more successful in automatically identifying important features to diagnose disease. Many new automated multiplex modalities such as antibody barcoding with cleavable DNA (ABCD), single cell analysis for tumor phenotyping (SCANT), fast analytical screening technique fine needle aspiration (FAST-FNA), and portable fluorescence-based image cytometry analyzer (CytoPAN) are under investigation. These have shown great promise in their ability to automatically analyze several biomarkers concurrently with high sensitivity, even in paucicellular samples, lending themselves well as tools in FNA. Although not yet widely adopted for clinical use, many have already been applied successfully to human samples. Once clinically validated, some of these technologies are poised to change the routine practice of cytopathology.

Lau Ryan P, Kim Teresa H, Rao Jianyu

2021

computational cytopathology, computational pathology, molecular cytopathology, multiplex immunofluorescence, single cell biomarker analysis

Surgery Surgery

Machine Learning Prediction Models for Mechanically Ventilated Patients: Analyses of the MIMIC-III Database.

In Frontiers in medicine

Background: Mechanically ventilated patients in the intensive care unit (ICU) have high mortality rates. There are multiple prediction scores, such as the Simplified Acute Physiology Score II (SAPS II), Oxford Acute Severity of Illness Score (OASIS), and Sequential Organ Failure Assessment (SOFA), widely used in the general ICU population. We aimed to establish prediction scores for mechanically ventilated patients by combining these disease severity scores with other features available on the first day of admission. Methods: A retrospective administrative database study of the Medical Information Mart for Intensive Care (MIMIC-III) database was conducted. The exposures of interest consisted of the demographics, pre-ICU comorbidity, ICU diagnosis, disease severity scores, vital signs, and laboratory test results on the first day of ICU admission. Hospital mortality was used as the outcome. We used the machine learning methods of k-nearest neighbors (KNN), logistic regression, bagging, decision tree, random forest, Extreme Gradient Boosting (XGBoost), and neural network for model establishment. A sample of 70% of the cohort was used as the training set; the remaining 30% was used for testing. Areas under the receiver operating characteristic curves (AUCs) and calibration plots were constructed for the evaluation and comparison of the models' performance. The significance of the risk factors was identified through the models, and the top factors were reported. Results: A total of 28,530 subjects were enrolled through the screening of the MIMIC-III database. After data preprocessing, 25,659 adult patients with 66 predictors were included in the model analyses. With the training set, the models of KNN, logistic regression, decision tree, random forest, neural network, bagging, and XGBoost were established, and on the testing set they obtained AUCs of 0.806, 0.818, 0.743, 0.819, 0.780, 0.803, and 0.821, respectively.
The calibration curves of all the models, except for the neural network, performed well. The XGBoost model performed best among the seven models. The top five predictors were age, respiratory dysfunction, SAPS II score, maximum hemoglobin, and minimum lactate. Conclusion: The current study indicates that models using risk factors available on the first day of admission can successfully predict mortality in ventilated patients. The XGBoost model performed best among the seven machine learning models.
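The AUC figures used to compare the seven models above are rank statistics: the probability that a randomly chosen positive case is scored above a randomly chosen negative one. A minimal stdlib-only sketch (labels and scores below are illustrative, not MIMIC-III data):

```python
def roc_auc(y_true, scores):
    """AUC via the Mann-Whitney U formulation: probability that a random
    positive is scored above a random negative (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: most positives are ranked above the negatives.
y = [0, 0, 1, 1, 0, 1]
s = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
print(roc_auc(y, s))  # ≈ 0.889
```

In practice a library routine (e.g. scikit-learn's `roc_auc_score`) would be used, but the quantity being compared across the seven models is exactly this statistic.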

Zhu Yibing, Zhang Jin, Wang Guowei, Yao Renqi, Ren Chao, Chen Ge, Jin Xin, Guo Junyang, Liu Shi, Zheng Hua, Chen Yan, Guo Qianqian, Li Lin, Du Bin, Xi Xiuming, Li Wei, Huang Huibin, Li Yang, Yu Qian

2021

death, intensive care unit, machine learning, mechanical ventilation, prediction model

General General

Bioaugmentation Technology for Treatment of Toxic and Refractory Organic Waste Water Based on Artificial Intelligence.

In Frontiers in bioengineering and biotechnology

With the development of modern chemical synthesis technology, toxic and harmful compounds have increased sharply. In order to improve the removal efficiency of refractory organic matter in waste water, the method of adding powdered activated carbon (PAC) to the system for adsorption was adopted. Through the analysis of organic matter removal before and after waste water treatment, it can be seen that PAC readily adsorbs hydrophobic organic matter, while activated sludge readily removes hydrophilic and weakly hydrophobic neutral organic matter. The powdered activated carbon-activated sludge SBR (PAC-AS) system is clearly superior to the AS and PAC systems in removing organic matter of hydrophilic and hydrophobic components; that is, biodegradation and PAC adsorption are additive. Compared with the control system, the Chemical Oxygen Demand (COD) removal rate of refractory substances increased by 8.36%. PAC had a good adsorption effect on organic compounds of small molecular weight, but the effect gradually weakened with increasing molecular weight, and PAC had no adsorption effect on macromolecular organic compounds. Based on fuzzy control theory, an agent-based control system for the ozone oxidation process of industrial waste water, built on Mobile Agent Server (MAS) theory and realized with a fuzzy control method, was established. The simulation results showed strong stability and verified the feasibility and adaptability of the distributed intelligent waste water treatment system based on MAS theory in the actual control process.

Yanbo Jiang, Jianyi Jiang, Xiandong Wei, Wei Ling, Lincheng Jiang

2021

artificial intelligence, biofortification, fuzzy neural network, intelligent control, waste water treatment

oncology Oncology

Comparison of Clinical Characteristics Among COVID-19 and Non-COVID-19 Pediatric Pneumonias: A Multicenter Cross-Sectional Study.

In Frontiers in cellular and infection microbiology ; h5-index 53.0

Background : The pandemic of Coronavirus Disease 2019 (COVID-19) brings new challenges for pediatricians, especially the differentiation from non-COVID-19 pneumonia in the peak season of pneumonia. We aimed to compare the clinical characteristics of pediatric patients with COVID-19 pneumonia and pneumonias caused by other respiratory pathogens.

Methods : We conducted a multi-center, cross-sectional study of pediatric inpatients in China. Based on pathogenic test results, pediatric patients were divided into three groups: a COVID-19 pneumonia group, a non-COVID-19 viral (NCV) pneumonia group, and a non-viral (NV) pneumonia group. Their clinical characteristics were compared by the Kruskal-Wallis H test or the chi-square test.

Results : A total of 636 pediatric pneumonia inpatients were included in the analysis: 87 in the COVID-19 group, 194 in the NCV group, and 355 in the NV group. Compared with NCV and NV patients, COVID-19 patients were older (median age 6.33, IQR 2.00-12.00 years), and relatively fewer COVID-19 patients presented fever (63.2%), cough (60.9%), shortness of breath (1.1%), or abnormal pulmonary auscultation (18.4%). The results were verified by the comparison of COVID-19, respiratory syncytial virus (RSV), and influenza A (IFA) pneumonia patients. Approximately 42.5%, 44.8%, and 12.6% of the COVID-19 patients presented only ground-glass opacity (GGO), only consolidation, or both changes on computed tomography (CT) scans, respectively; the proportions were similar to those in the NCV and NV groups (p>0.05). Only 47.1% of COVID-19 patients had bilateral pneumonia, significantly lower than the proportion of nearly 80% in the other two groups. COVID-19 patients presented lower proportions of increased white blood cell count (16.5%) and abnormal procalcitonin (PCT) (10.7%), and a higher proportion of decreased lymphocyte count (44.0%), compared with the other two groups.

Conclusion : Most clinical characteristics of pediatric COVID-19 pneumonia patients were milder than those of non-COVID-19 patients. However, lymphocytopenia remained a prominent feature of pediatric COVID-19 pneumonia.

Jia Zhongwei, Yan Xiangyu, Gao Liwei, Ding Shenggang, Bai Yan, Zheng Yuejie, Cui Yuxia, Wang Xianfeng, Li Jingfeng, Lu Gen, Xu Yi, Zhang Xiangyu, Li Junhua, Chen Ning, Shang Yunxiao, Han Mingfeng, Liu Jun, Zhou Hourong, Li Cen, Lu Wanqiu, Liu Jun, Wang Lina, Fan Qihong, Wu Jiang, Shen Hanling, Jiao Rong, Chen Chunxi, Gao Xiaoling, Tian Maoqiang, Lu Wei, Yang Yonghong, Wong Gary Wing-Kin, Wang Tianyou, Jin Runming, Shen Adong, Xu Baoping, Shen Kunling

2021

COVID-19 pneumonia, clinical characteristics, non-viral pneumonia, pediatric patients, viral pneumonia

General General

CeCILE - An Artificial Intelligence Based Cell-Detection for the Evaluation of Radiation Effects in Eucaryotic Cells.

In Frontiers in oncology

In-vitro cellular studies are the fundamental basis for the development of novel radiotherapy methods. To assess different endpoints of cellular reactions to irradiation, such as proliferation, cell cycle arrest, and cell death, several assays are used in radiobiological research as standard methods. For example, the colony forming assay investigates cell survival and the Caspase3/7-Sytox assay cell death. The major limitation of these assays is that analysis occurs at a fixed timepoint after irradiation; thus, little is known about the reactions before or after the assay is performed. Additionally, these assays need special treatments, which influence cell behavior and health. In this study, a completely new method is proposed to tackle these challenges: a deep-learning algorithm called CeCILE (Cell Classification and In-vitro Lifecycle Evaluation), which is used to detect and analyze cells in videos obtained from phase-contrast microscopy. With this method, we can observe and analyze the behavior and health conditions of single cells over several days after treatment, up to a sample size of 100 cells per image frame. To train CeCILE, we built a dataset by labeling cells on microscopic images and assigning each cell a class label that defines its state in the cell cycle. After successful training of CeCILE, we irradiated CHO-K1 cells with 4 Gy protons, imaged them for 2 days with a microscope equipped with a live-cell-imaging set-up, and analyzed the videos with CeCILE and by hand. From this analysis, we gained information about cell numbers, cell divisions, and cell deaths over time. In this first proof of principle, CeCILE achieved results similar to those of the colony forming and Caspase3/7-Sytox assays in this experiment. Therefore, CeCILE has the potential to assess the same endpoints as state-of-the-art assays while giving extra information about the evolution of cell numbers, cell state, and cell cycle.
Additionally, CeCILE will be extended to track individual cells and their descendants throughout the whole video, to follow the behavior of each cell and its progeny after irradiation. This tracking capability could take radiobiological research to the next level, enabling a better understanding of cellular reactions to radiation.

Rudigkeit Sarah, Reindl Julian B, Matejka Nicole, Ramson Rika, Sammer Matthias, Dollinger Günther, Reindl Judith

2021

cell-tracking, deep-learning, lifecycle analysis, phase-contrast microscopy, radiobiology

Radiology Radiology

Deep Neural Network Analysis of Pathology Images With Integrated Molecular Data for Enhanced Glioma Classification and Grading.

In Frontiers in oncology

Gliomas are primary brain tumors that originate from glial cells. Classification and grading of these tumors is critical to prognosis and treatment planning. The current criteria for glioma classification in the central nervous system (CNS) were introduced by the World Health Organization (WHO) in 2016 and require the integration of histology with genomics. In 2017, the Consortium to Inform Molecular and Practical Approaches to CNS Tumor Taxonomy (cIMPACT-NOW) was established to provide up-to-date recommendations for CNS tumor classification, which the WHO is expected to adopt in its upcoming edition. In this work, we propose a novel glioma analytical method that, for the first time in the literature, integrates a cellularity feature derived from the digital analysis of brain histopathology images with molecular features, following the latest WHO criteria. We first propose a novel over-segmentation strategy for region-of-interest (ROI) selection in large histopathology whole slide images (WSIs). A Deep Neural Network (DNN)-based classification method then fuses molecular features with cellularity features to improve tumor classification performance. We evaluate the proposed method on 549 patient cases from The Cancer Genome Atlas (TCGA) dataset. The cross-validated classification accuracies are 93.81% for lower-grade glioma (LGG) vs. high-grade glioma (HGG) using a regular DNN, and 73.95% for LGG II vs. LGG III using a residual neural network (ResNet) DNN, respectively. Our experiments suggest that the type of deep learning network has a significant impact on tumor subtype discrimination between LGG II and LGG III. These results outperform state-of-the-art methods in classifying LGG II vs. LGG III and offer competitive performance in distinguishing LGG vs. HGG in the literature. In addition, we also investigate molecular subtype classification using pathology images and cellularity information.
Finally, for the first time in the literature, this work shows the promise of cellularity quantification for predicting brain tumor grading in LGGs with IDH mutations.

Pei Linmin, Jones Karra A, Shboul Zeina A, Chen James Y, Iftekharuddin Khan M

2021

IDH mutation, brain tumor classification and grading, cellularity, central nervous system tumor, deep neural network, glioma, molecular, radiomics

oncology Oncology

Potential and limitations of radiomics in neuro-oncology.

In Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia

Radiomics seeks to apply classical methods of image processing to obtain quantitative parameters from imaging. Derived features are subsequently fed into algorithmic models to aid clinical decision making. The application of radiomics and machine learning techniques to clinical medicine remains in its infancy. The great potential of radiomics lies in its objective, granular approach to investigating clinical imaging. In neuro-oncology, advanced machine learning techniques, particularly deep learning, are at the forefront of new discoveries in the field. However, despite the great promise of machine learning aided radiomic approaches, the current use remains confined to scholarly research, without real-world deployment in neuro-oncology. The paucity of data, inconsistencies in preprocessing, radiomic feature instability, and the rarity of the events of interest are critical barriers to clinical translation. In this article, we will outline the major steps in the process of radiomics, as well as review advances and challenges in the field as they pertain to neuro-oncology.

Taha Birra, Boley Daniel, Sun Ju, Chen Clark

2021-Aug

Deep learning, Imaging, Machine learning, Neuro-oncology, Radiomics

General General

Fusion of AI techniques to tackle COVID-19 pandemic: models, incidence rates, and future trends.

In Multimedia systems

The COVID-19 pandemic is rapidly spreading across the globe and has infected millions of people, taking hundreds of thousands of lives. Over the years, the role of artificial intelligence (AI) has been on the rise as its algorithms have become more and more accurate, and it is thought that its role in strengthening the existing healthcare system will be profound. Moreover, the pandemic has brought an opportunity to showcase the potential of AI-healthcare integration, as the current infrastructure worldwide is overwhelmed and crumbling. Due to AI's flexibility and adaptability, it can be used as a tool to tackle COVID-19. Motivated by these facts, in this paper we survey how AI techniques can handle the COVID-19 pandemic situation and present the merits and demerits of these techniques. This paper presents a comprehensive end-to-end review of the AI techniques that can be used to tackle all areas of the pandemic. Further, we systematically discuss the issues of COVID-19 and, based on the literature review, suggest potential countermeasures using AI techniques. In the end, we analyze various open research issues and challenges associated with integrating AI techniques into the COVID-19 response.

Shah Het, Shah Saiyam, Tanwar Sudeep, Gupta Rajesh, Kumar Neeraj

2021-Jul-13

AI, COVID-19, Deep learning, Healthcare, Machine learning

General General

MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning

ArXiv Preprint

Healthcare representation learning on the Electronic Health Record (EHR) is seen as crucial for predictive analytics in the medical field. Many natural language processing techniques, such as word2vec, RNN, and self-attention, have been adapted for use in hierarchical and time-stamped EHR data, but fail when they lack either general or task-specific data. Hence, some recent works train healthcare representations by incorporating a medical ontology (a.k.a. knowledge graph) via self-supervised tasks like diagnosis prediction, but (1) the small-scale, monotonous ontology is insufficient for robust learning, and (2) critical contexts or dependencies underlying patient journeys are never exploited to enhance ontology learning. To address this, we propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO), for healthcare representation learning and predictive analytics. Specifically, it consists of task-specific representation learning and graph-embedding modules that learn the patient journey and medical ontology interactively. Consequently, this creates a mutual integration that benefits both healthcare representation learning and medical ontology embedding. Moreover, such integration is achieved by joint training of both task-specific predictive and ontology-based disease typing tasks, based on fused embeddings of the two modules. Experiments conducted on two real-world diagnosis prediction datasets show that our healthcare representation model MIMO not only achieves better predictive results than previous state-of-the-art approaches, regardless of sufficient or insufficient training data, but also derives more interpretable embeddings of diagnoses.

Xueping Peng, Guodong Long, Tao Shen, Sen Wang, Zhendong Niu, Chengqi Zhang

2021-07-20

General General

Easily Created Prediction Model Using Automated Artificial Intelligence Framework (Prediction One, Sony Network Communications Inc., Tokyo, Japan) for Subarachnoid Hemorrhage Outcomes Treated by Coiling and Delayed Cerebral Ischemia.

In Cureus

Introduction Reliable prediction models for subarachnoid hemorrhage (SAH) outcomes and delayed cerebral ischemia (DCI) are needed to decide the treatment strategy. Automated artificial intelligence (AutoAI) is attractive, but there are few reports on AutoAI-based models for SAH functional outcomes and DCI. We herein built models using an AutoAI framework, Prediction One (Sony Network Communications Inc., Tokyo, Japan), and compared them to other previous statistical prediction scores. Methods We used an open dataset of 298 SAH patients who had a non-severe neurological grade and were treated by coiling. A modified Rankin Scale of 0-3 at six months was defined as a favorable functional outcome, and DCI occurrence was defined as another outcome. We randomly divided the patients into a 248-patient training dataset and a 50-patient test dataset. Prediction One made the models using the training dataset with 5-fold cross-validation. We evaluated the models using the test dataset and compared their areas under the curve (AUCs) with those of the modified SAFIRE score and the Fisher computed tomography (CT) scale for predicting the outcomes. Results The AUCs of the AutoAI-based models for functional outcome in the training and test datasets were 0.994 and 0.801, and those for DCI occurrence were 0.969 and 0.650. The AUCs for functional outcome calculated using the modified SAFIRE score were 0.844 and 0.892, and those for DCI occurrence calculated using the Fisher CT scale were 0.577 and 0.544. Conclusions We easily and quickly made AutoAI-based prediction models. The models' AUCs were not inferior to those of the previous prediction models despite the ease of creation.
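The 5-fold cross-validation mentioned above is generic fold bookkeeping rather than anything specific to Prediction One; as a hedged sketch of what such a scheme does under the hood (not the tool's internal logic):

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle n sample indices, partition them into k folds, and
    yield (train_idx, val_idx) pairs, one per fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

# 248 training patients split into 5 folds, as in the study.
folds = list(k_fold_indices(248, 5))
print([len(val) for _, val in folds])  # fold sizes: 50, 50, 50, 49, 49
```

Each patient appears in exactly one validation fold, so every model is scored on data it never saw during that fold's training.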

Katsuki Masahito, Kawamura Shin, Koh Akihito

2021-Jun

automated artificial intelligence (autoai), deep learning (dl), delayed cerebral ischemia (dci), machine learning (ml), subarachnoid hemorrhage (sah)

General General

Hand tremor detection in videos with cluttered background using neural network based approaches.

In Health information science and systems

With the increasing prevalence of neurodegenerative diseases, including Parkinson's disease, hand tremor detection has become a popular research topic because it helps with diagnosis and the tracking of disease progression. Conventional hand tremor detection algorithms involve wearable sensors. A non-invasive hand tremor detection algorithm using videos as input is desirable, but the existing video-based algorithms are sensitive to environmental conditions. An algorithm capable of detecting hand tremor in videos with a cluttered background would allow videos recorded in a non-research environment to be used: clinicians and researchers could use videos collected from patients and participants in their own home environment or in standard clinical settings. Neural network based machine learning architectures provide highly accurate classification results in related fields, including hand gesture recognition and body movement detection. We thus investigated the accuracy of advanced neural network architectures in automatically detecting hand tremor in videos with a cluttered background. We examined configurations with different sets of features and neural network based classification models, compared the performance of the different combinations, and selected the combination that provided the highest accuracy of hand tremor detection. We used cross-validation to test the accuracy of the trained model's predictions. The highest classification accuracy for automatically detecting tremor (vs. non-tremor) was 80.6%, obtained using a Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model and features based on measures of frequency and amplitude change.
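Frequency-based tremor features of the kind mentioned above typically come from a spectral analysis of a tracked hand-position signal. A toy stdlib-only sketch of estimating a dominant tremor frequency with a discrete Fourier transform (illustrative only, not the paper's pipeline):

```python
import cmath
import math

def dominant_frequency(signal, fps):
    """Return the frequency (Hz) of the largest non-DC DFT component."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fps / n

# Synthetic 5 Hz "tremor" sampled at 30 frames per second for 3 seconds.
fps, seconds = 30, 3
sig = [math.sin(2 * math.pi * 5 * t / fps) for t in range(fps * seconds)]
print(dominant_frequency(sig, fps))  # → 5.0
```

A real pipeline would first extract the hand trajectory from video frames; the spectral step afterwards reduces to exactly this kind of computation (usually via an FFT).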

Wang Xinyi, Garg Saurabh, Tran Son N, Bai Quan, Alty Jane

2021-Dec

Advanced neural network, Hand tremor detection, Machine learning, Videos with cluttered background

General General

ACP-DA: Improving the Prediction of Anticancer Peptides Using Data Augmentation.

In Frontiers in genetics ; h5-index 62.0

Anticancer peptides (ACPs) have provided a promising perspective for cancer treatment, and the prediction of ACPs is very important for the discovery of new cancer treatment drugs. It is time-consuming and expensive to identify ACPs with experimental methods, so computational methods for ACP identification are urgently needed. Many effective computational methods, especially machine learning-based methods, have been proposed for such predictions. Most of the current machine learning methods try to find suitable features or design effective feature learning techniques to accurately represent ACPs. However, the performance of these methods can be further improved for cases with insufficient numbers of samples. In this article, we propose an ACP prediction model called ACP-DA (Data Augmentation), which uses data augmentation on insufficient samples to improve prediction performance. In our method, to better exploit the information in peptide sequences, peptide sequences are represented by integrating binary profile features and AAindex features, and the samples in the training set are then augmented in the feature space. After data augmentation, the samples are used to train the machine learning model, which is used to predict ACPs. The performance of ACP-DA exceeds that of existing methods, and ACP-DA achieves better performance in the prediction of ACPs compared with a method without data augmentation. The proposed method is available at http://github.com/chenxgscuec/ACPDA.
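The abstract does not specify ACP-DA's exact augmentation rule, but a common way to augment samples in feature space, shown here purely as a hypothetical illustration, is SMOTE-style interpolation between same-class feature vectors:

```python
import random

def augment_in_feature_space(samples, n_new, seed=0):
    """Create n_new synthetic feature vectors by linear interpolation
    between random pairs of same-class samples (SMOTE-like)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([ai + lam * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic

# Two peptide feature vectors (values made up); make 3 synthetic ones.
acp_features = [[0.1, 0.8, 0.3], [0.2, 0.6, 0.5]]
new = augment_in_feature_space(acp_features, 3)
assert all(len(v) == 3 for v in new)
```

Because each synthetic vector lies on the segment between two real same-class samples, the augmented set stays inside the class's region of feature space, which is what makes this style of augmentation safe for small datasets.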

Chen Xian-Gan, Zhang Wen, Yang Xiaofei, Li Chenhong, Chen Hengling

2021

anticancer peptide prediction, data augmentation, feature representation, machine learning, multilayer perceptron

General General

ImputEHR: A Visualization Tool of Imputation for the Prediction of Biomedical Data.

In Frontiers in genetics ; h5-index 62.0

Electronic health records (EHRs) have been widely adopted in recent years, but often include a high proportion of missing data, which can create difficulties in implementing machine learning and other tools of personalized medicine. Complete datasets are preferred for a number of analysis methods, and successful imputation of missing EHR data can improve interpretation and increase our power to predict health outcomes. However, the most popular imputation methods mainly require scripting skills and are implemented using various packages and syntax. Thus, the implementation of a full suite of methods is generally out of reach to all except experienced data scientists. Moreover, imputation is often treated as a separate exercise from exploratory data analysis, but it should be considered part of the data exploration process. We have created a new graphical tool, ImputEHR, that is built on a Python base and allows implementation of a range of simple and sophisticated (e.g., gradient-boosted tree-based and neural network) data imputation approaches. In addition to imputation, the tool enables data exploration for informed decision-making, as well as machine learning prediction tools for response data selected by the user. Although the approach works for any missing data problem, the tool is primarily motivated by problems encountered for EHR and other biomedical data. We illustrate the tool using multiple real datasets, providing performance measures of imputation and downstream predictive analysis.
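As a hypothetical baseline for what such a tool automates, the simplest imputation strategy fills each missing value with a column statistic. A stdlib-only sketch (ImputEHR's actual methods, e.g. gradient-boosted trees, are far more sophisticated; the data below is made up):

```python
def mean_impute(rows):
    """Replace None entries with the column mean of the observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]

# Toy EHR-like table with missing lab values (None).
ehr = [[5.1, None, 120.0],
       [4.9, 36.6, None],
       [None, 37.0, 118.0]]
print(mean_impute(ehr))  # column means 5.0, 36.8, 119.0 fill the gaps
```

Mean imputation ignores correlations between columns; model-based imputers (the tree-based and neural approaches the tool offers) predict each missing entry from the other columns instead.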

Zhou Yi-Hui, Saghapour Ehsan

2021

decision trees, electronic health records, gradient boosting, imputation, prediction

General General

Bronchopulmonary Dysplasia Predicted by Developing a Machine Learning Model of Genetic and Clinical Information.

In Frontiers in genetics ; h5-index 62.0

Background : An early and accurate evaluation of the risk of bronchopulmonary dysplasia (BPD) in premature infants is pivotal in implementing preventive strategies. Current BPD risk prediction models include only clinical factors, without genetic factors, and are either too complex to be practical or provide only poor-to-moderate discrimination. We aim to establish the role of genetic factors in early and accurate BPD risk prediction.

Methods : Exome sequencing was performed in a cohort of 245 premature infants (gestational age <32 weeks), with 131 BPD infants and 114 infants without BPD as controls. A gene burden test was performed to find risk genes with loss-of-function mutations or missense mutations over-represented in BPD and severe BPD (sBPD) patients, with risk gene sets (RGS) defined as BPD-RGS and sBPD-RGS, respectively. We then developed two predictive models for the risk of BPD and sBPD by integrating patient clinical and genetic features. The performance of the models was evaluated using the area under the receiver operating characteristic curve (AUROC).

Results : Thirty and 21 genes were included in BPD-RGS and sBPD-RGS, respectively. In the independent testing dataset, the predictive model for BPD, which combined the BPD-RGS and basic clinical risk factors, showed better discrimination than the model based on basic clinical features alone (AUROC 0.915 vs. 0.814; P = 0.013). The same was observed for the predictive model for sBPD (AUROC 0.907 vs. 0.826; P = 0.016).
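The AUROC values compared above have a simple probabilistic reading: the chance that a randomly chosen BPD infant is scored higher than a randomly chosen control. A minimal sketch of that rank-based computation (toy labels and scores; not the study's data):

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive is scored above a randomly chosen negative,
    with ties counting one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [1, 1, 0, 0, 1, 0]
s = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = auroc(y, s)  # 8 of 9 positive/negative pairs correctly ordered
```

An AUROC of 0.915 therefore means the combined model orders about 91.5% of BPD/control pairs correctly.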

Conclusion : This study suggests that genetic information contributes to susceptibility to BPD. The predictive model in this study, which combined BPD-RGS with basic clinical risk factors, can thus accurately stratify BPD risk in premature infants.

Dai Dan, Chen Huiyao, Dong Xinran, Chen Jinglong, Mei Mei, Lu Yulan, Yang Lin, Wu Bingbing, Cao Yun, Wang Jin, Zhou Wenhao, Qian Liling

2021

bronchopulmonary dysplasia, exome sequencing, machine learning, prediction model, premature infants

General General

Bioinformatic Analysis of Temporal and Spatial Proteome Alterations During Infections.

In Frontiers in genetics ; h5-index 62.0

Microbial pathogens have evolved numerous mechanisms to hijack host systems, thus causing disease. This is mediated by alterations in the combined host-pathogen proteome in time and space. Mass spectrometry-based proteomics approaches have been developed and tailored to map disease progression. The result is complex multidimensional data that pose numerous analytic challenges for downstream interpretation. However, a systematic review of approaches for the downstream analysis of such data has been lacking in the field. In this review, we detail the steps of a typical temporal and spatial analysis, including data pre-processing (i.e., quality control, data normalization, the imputation of missing values, and dimensionality reduction), different statistical and machine learning approaches, validation, interpretation, and the extraction of biological information from mass spectrometry data. We also discuss current best practices for these steps based on a collection of independent studies to guide users in selecting the most suitable strategies for their dataset and analysis objectives. Moreover, we have compiled a list of commonly used R software packages for each step of the analysis, which can be easily integrated into one's analysis pipeline. Furthermore, we guide readers through the various analysis steps by applying these workflows to mock and host-pathogen interaction data from public datasets. The workflows presented in this review will serve as an introduction for data analysis novices, while also helping established users update their data analysis pipelines. We conclude the review by discussing future directions and developments in temporal and spatial proteomics and data analysis approaches. Data analysis code prepared for this review is available from https://github.com/BabuLab-UofR/TempSpac, where guidelines and sample datasets are also offered for testing purposes.
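One common pre-processing step named in the review is data normalization. As an illustration only (the review covers many alternatives, and this particular choice is an assumption), a frequent proteomics recipe is to log2-transform intensities and shift each sample so that all sample medians align:

```python
import math

def median_normalize(intensity_matrix):
    """Log2-transform, then shift each sample (column) so that all
    sample medians coincide, removing per-run loading differences."""
    logged = [[math.log2(v) for v in row] for row in intensity_matrix]
    cols = list(zip(*logged))
    medians = [sorted(c)[len(c) // 2] for c in cols]
    target = sorted(medians)[len(medians) // 2]  # overall reference median
    return [[v - medians[j] + target for j, v in enumerate(row)]
            for row in logged]

# rows = proteins, columns = samples (toy intensities)
mat = [[4.0, 8.0], [16.0, 32.0], [64.0, 128.0]]
norm = median_normalize(mat)  # both sample columns now share a median
```

Imputation, dimensionality reduction, and downstream statistics then operate on the normalized matrix.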

Rahmatbakhsh Matineh, Gagarinova Alla, Babu Mohan

2021

clustering, data imputation, host-pathogen interactions, normalization, principal component analysis, self-organizing maps, spatial proteomics, temporal proteomics

General General

Prediction of Alternative Drug-Induced Liver Injury Classifications Using Molecular Descriptors, Gene Expression Perturbation, and Toxicology Reports.

In Frontiers in genetics ; h5-index 62.0

Motivation: Drug-induced liver injury (DILI) is one of the primary problems in drug development. Early prediction of DILI, based on the chemical properties of substances and experiments performed on cell lines, would bring a significant reduction in the cost of clinical trials and faster development of drugs. The current study aims to build predictive models of the risk of DILI for chemical compounds using multiple sources of information. Methods: Using several supervised machine learning algorithms, we built predictive models for several alternative splits of compounds between DILI and non-DILI classes. To this end, we used chemical properties of the given compounds, their effects on gene expression levels in six human cell lines treated with them, as well as their toxicological profiles. First, we identified the most informative variables in all data sets. Then, these variables were used to build machine learning models. Finally, composite models were built with the Super Learner approach. All modeling was performed using multiple repeats of cross-validation for unbiased and precise estimates of performance. Results: With one exception, gene expression profiles of human cell lines were non-informative and resulted in random models. Toxicological reports were not useful for prediction of DILI. The best results were obtained for models discerning between harmless compounds and those for which any level of DILI was observed (AUC = 0.75). These models were built with the Random Forest algorithm using molecular descriptors.
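The Super Learner step combines the base models' predictions. A heavily simplified stand-in for that idea (the real Super Learner fits a meta-learner on cross-validated predictions; here weights are just proportional to each model's CV score, and all names and numbers are hypothetical):

```python
def super_learner_combine(base_preds, cv_scores):
    """Combine base-model probability predictions with weights
    proportional to each model's cross-validated score - a simplified
    stand-in for the Super Learner's meta-learning step."""
    total = sum(cv_scores)
    weights = [s / total for s in cv_scores]
    n = len(base_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, base_preds))
            for i in range(n)]

rf_probs = [0.9, 0.2, 0.7]   # hypothetical Random Forest DILI probabilities
svm_probs = [0.8, 0.4, 0.5]  # hypothetical SVM DILI probabilities
combined = super_learner_combine([rf_probs, svm_probs],
                                 cv_scores=[0.75, 0.25])
```

The combined predictions can then be evaluated with the same repeated cross-validation used for the base models.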

Lesiński Wojciech, Mnich Krzysztof, Rudnicki Witold R

2021

data integration, drug induced liver injury, feature selection, machine learning, random forest

General General

TheLNet270v1 - A Novel Deep-Network Architecture for the Automatic Classification of Thermal Images for Greenhouse Plants.

In Frontiers in plant science

The real challenge in separating leaf pixels from background pixels in thermal images is associated with various factors, such as the amount of thermal radiation emitted and reflected by the targeted plant, absorption of reflected radiation by the humidity of the greenhouse, and the outside environment. We propose TheLNet270v1 (thermal leaf network with 270 layers, version 1) to recover the leaf canopy from its background in real time with higher accuracy than previous systems. The proposed network achieved an accuracy of 91% (mean boundary F1 score, or BF score) in distinguishing canopy pixels from background pixels and segmenting the image into two classes: leaf and background. We evaluated the segmentation performance using more than 13,766 images and obtained 95.75% training and 95.23% validation accuracies without overfitting issues. This research aimed to develop a deep learning technique for the automatic segmentation of thermal images to continuously monitor the canopy surface temperature inside a greenhouse.

Islam Md Parvez, Nakano Yuka, Lee Unseok, Tokuda Keinichi, Kochi Nobuo

2021

classification, deep learning, network architecture, segmentation, thermal image

General General

Tumor-Associated Tertiary Lymphoid Structures: From Basic and Clinical Knowledge to Therapeutic Manipulation.

In Frontiers in immunology ; h5-index 100.0

The tumor microenvironment is a complex ecosystem, almost unique to each patient. Most available therapies target tumor cells according to their molecular characteristics, angiogenesis, or immune cells involved in tumor immune-surveillance. Unfortunately, only a limited number of patients benefit in the long term from these treatments, which are often associated with relapses, in spite of the remarkable progress obtained with the advent of immune checkpoint inhibitors (ICP). The presence of "hot" tumors is a determining parameter for selecting therapies targeting the patient's immunity, even though some such tumors still do not respond to treatment. In human studies, an in-depth analysis of the organization and interactions of tumor-infiltrating immune cells has revealed the presence of an ectopic lymphoid organization termed tertiary lymphoid structures (TLS) in a large number of tumors. Their marked similarity to secondary lymphoid organs has suggested that TLS are an "anti-tumor school" and an "antibody factory" that fight malignant cells. They are effectively associated with long-term survival in most solid tumors, and their presence has recently been shown to predict response to ICP. This review discusses the relationship between TLS and the molecular characteristics of tumors and the presence of oncogenic viruses, as well as their role when targeted therapies are used. We also present some aspects of TLS biology in non-tumor inflammatory diseases and discuss the putative common characteristics that they share with tumor-associated TLS. A detailed overview of the different pre-clinical models available to investigate TLS function and neogenesis is also presented. Finally, new approaches aimed at a better understanding of the role and function of TLS, such as the use of spheroids and organoids and of artificial intelligence algorithms, are also discussed. In conclusion, increasing our knowledge of TLS will undoubtedly improve prognostic prediction and treatment selection in cancer patients, with key consequences for next-generation immunotherapy.

Domblides Charlotte, Rochefort Juliette, Riffard Clémence, Panouillot Marylou, Lescaille Géraldine, Teillaud Jean-Luc, Mateo Véronique, Dieu-Nosjean Marie-Caroline

2021

artificial intelligence, biomarker, cancer, lymphoid neogenesis, organoid, tertiary lymphoid structure, therapeutic intervention, tumor model

General General

LSTM Neural Network for Inferring Conduction Velocity Distribution in Demyelinating Neuropathies.

In Frontiers in neurology

Waveform analysis of the compound muscle action potential (CMAP) is important for detailed analysis of the conduction velocities of individual axons, as seen in temporal dispersion. This understanding is limited because the conduction velocity distribution cannot easily be recovered from a CMAP waveform. Given the recent advent of artificial intelligence, this study aimed to assess whether the conduction velocity (CV) distribution can be inferred from the CMAP by deep learning algorithms. Simulated CMAP waveforms were constructed from a single motor unit potential and randomly created CV histograms (n = 12,000). After training on these data with various recurrent neural networks (RNNs), CV inference was tested by the network. Among simple RNN, long short-term memory (LSTM), and gated recurrent unit architectures, the best accuracy and loss profiles were shown by a two-layer bidirectional LSTM, with training and validation accuracies of 0.954 and 0.975, respectively. Training a recurrent neural network can thus accurately infer the conduction velocity distribution in a wide variety of simulated demyelinating neuropathies. Using deep learning techniques, the CV distribution can be assessed in a non-invasive manner.
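The simulation step the abstract describes, building a CMAP from a single motor unit potential (MUP) and a CV histogram, amounts to summing delayed copies of the MUP, with each delay set by conduction distance divided by velocity. A minimal sketch under assumed parameters (the distance, sampling interval, and toy waveform are illustrative, not the study's values):

```python
def simulate_cmap(mup, velocities, distance=0.24, dt=0.0001):
    """Build a simulated CMAP by summing copies of a single motor unit
    potential, each delayed by distance/velocity (converted to samples)."""
    delays = [int(round(distance / v / dt)) for v in velocities]
    length = max(delays) + len(mup)
    cmap = [0.0] * length
    for d in delays:
        for i, amplitude in enumerate(mup):
            cmap[d + i] += amplitude
    return cmap

mup = [0.0, 1.0, -0.5, 0.0]      # toy single-MUP waveform
velocities = [48.0, 40.0, 30.0]  # m/s, as if drawn from a CV histogram
signal = simulate_cmap(mup, velocities)
```

The trained network then solves the inverse problem: given `signal`, infer the histogram that `velocities` was drawn from.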

Nodera Hiroyuki, Matsui Makoto

2021

conduction, deep learning, demyelination, nerve conduction studies, recurrent neural networks

Public Health Public Health

Machine-learning based feature selection for a non-invasive breathing change detection.

In BioData mining

BACKGROUND : Chronic Obstructive Pulmonary Disease (COPD) is one of the top 10 causes of death worldwide, representing a major public health problem. Researchers have been looking for new technologies and methods for patient monitoring, aiming at early identification of acute exacerbation events. Many of these works have focused on breathing rate variation, while achieving unsatisfactory sensitivity and/or specificity. This study aims to identify breathing features that better describe respiratory pattern changes in a short-term adjustment of the load-capacity-drive balance, using exercising data.

RESULTS : Under all tested circumstances, breathing rate alone showed poor capability for classifying rest and effort periods. The best performances were achieved when using Fourier coefficients or when combining breathing rate with the signal amplitude and/or ARIMA coefficients.
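The winning feature sets above combine amplitude with spectral information. As an illustration of what such a feature vector might look like (the exact windowing, coefficient counts, and ARIMA terms used in the study are not reproduced; this sketch keeps only amplitude plus the first DFT magnitudes), one can write:

```python
import math

def breathing_features(signal, n_coeffs=3):
    """Toy feature vector for a breathing window: peak-to-peak amplitude
    plus the normalized magnitudes of the first DFT coefficients."""
    n = len(signal)
    feats = [max(signal) - min(signal)]  # amplitude
    for k in range(1, n_coeffs + 1):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        feats.append(math.hypot(re, im) / n)
    return feats

n = 64
# a clean "rest" window: 2 breathing cycles per window
rest = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
feats = breathing_features(rest)
```

A classifier trained on such vectors can separate rest from effort even when the breathing rate alone barely changes.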

CONCLUSIONS : Breathing rate alone is a poor feature for predicting breathing change, and the addition of any of the other proposed features improves the classification power. Thus, combinations of features should be considered for enhancing exacerbation prediction methods based on the breathing signal.

TRIAL REGISTRATION : ClinicalTrials NCT03753386. Registered 27 November 2018, https://clinicaltrials.gov/show/NCT03753386.

Pegoraro Juliana Alves, Lavault Sophie, Wattiez Nicolas, Similowski Thomas, Gonzalez-Bermejo Jésus, Birmelé Etienne

2021-Jul-18

Chronic obstructive pulmonary disease (COPD), Classification, Novelty detection, Respiratory pattern, Telemonitoring

General General

MIDCAN: A multiple input deep convolutional attention network for Covid-19 diagnosis based on chest CT and chest X-ray.

In Pattern recognition letters

Background : COVID-19 had caused 3.34 million deaths as of 13 May 2021 and continues to cause new confirmed cases and deaths every day.

Method : This study investigated whether fusing chest CT with chest X-ray can help improve the AI's diagnosis performance. Data harmonization is employed to make a homogeneous dataset. We create an end-to-end multiple-input deep convolutional attention network (MIDCAN) using the convolutional block attention module (CBAM). One input of our model receives the 3D chest CT image, and the other input receives the 2D X-ray image. In addition, multiple-way data augmentation is used to generate synthetic data for the training set, and Grad-CAM is used to produce explainable heatmaps.
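CBAM, named in the method, combines channel and spatial attention. To make the channel half concrete, here is a toy sketch of channel attention only: pool each channel, pass the pooled vector through a small MLP, squash with a sigmoid, and rescale the channels. The fixed weights and two-channel input are hypothetical (in CBAM the MLP is learned and both max- and average-pooling are used):

```python
import math

def channel_attention(feature_maps, w1, w2):
    """CBAM-style channel attention (sketch): global-average-pool each
    channel, apply a tiny MLP, sigmoid the logits, and rescale channels."""
    pooled = [sum(ch) / len(ch) for ch in feature_maps]
    hidden = [max(0.0, sum(w1[i][j] * pooled[j] for j in range(len(pooled))))
              for i in range(len(w1))]                       # ReLU layer
    logits = [sum(w2[i][j] * hidden[j] for j in range(len(hidden)))
              for i in range(len(w2))]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]     # sigmoid
    return [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]

fm = [[1.0, 3.0], [2.0, 2.0]]  # two flattened channels
w1 = [[0.5, 0.5]]              # reduce 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]           # expand back to 2 channel logits
out = channel_attention(fm, w1, w2)
```

In MIDCAN such attention blocks reweight the CT and X-ray feature channels before the fused prediction head.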

Results : The proposed MIDCAN achieves a sensitivity of 98.10±1.88%, a specificity of 97.95±2.26%, and an accuracy of 98.02±1.35%.

Conclusion : Our MIDCAN method provides better results than 8 state-of-the-art approaches. We demonstrate that using multiple modalities achieves better results than any individual modality, and that CBAM helps improve the diagnosis performance.

Zhang Yu-Dong, Zhang Zheng, Zhang Xin, Wang Shui-Hua

2021-Jul-14

Automatic differentiation, COVID-19, Chest CT, Chest X-ray, Convolutional neural network, Data harmonization, Deep learning, Multimodality, Multiple input

Radiology Radiology

Automated Segmentation and Volume Measurement of Intracranial Carotid Artery Calcification on Non-Contrast CT

ArXiv Preprint

Purpose: To evaluate a fully-automated deep-learning-based method for assessment of intracranial carotid artery calcification (ICAC). Methods: Two observers manually delineated ICAC in non-contrast CT scans of 2,319 participants (mean age 69 (SD 7) years; 1154 women) of the Rotterdam Study, prospectively collected between 2003 and 2006. These data were used to retrospectively develop and validate a deep-learning-based method for automated ICAC delineation and volume measurement. To evaluate the method, we compared manual and automatic assessment (computed using ten-fold cross-validation) with respect to 1) the agreement with an independent observer's assessment (available in a random subset of 47 scans); 2) the accuracy in delineating ICAC as judged via blinded visual comparison by an expert; 3) the association with first stroke incidence from the scan date until 2012. All method performance metrics were computed using 10-fold cross-validation. Results: The automated delineation of ICAC reached sensitivity of 83.8% and positive predictive value (PPV) of 88%. The intraclass correlation between automatic and manual ICAC volume measures was 0.98 (95% CI: 0.97, 0.98; computed in the entire dataset). Measured between the assessments of independent observers, sensitivity was 73.9%, PPV was 89.5%, and intraclass correlation was 0.91 (95% CI: 0.84, 0.95; computed in the 47-scan subset). In the blinded visual comparisons, automatic delineations were more accurate than manual ones (p-value = 0.01). The association of ICAC volume with incident stroke was similarly strong for both automated (hazard ratio, 1.38 (95% CI: 1.12, 1.75)) and manually measured volumes (hazard ratio, 1.48 (95% CI: 1.20, 1.87)). Conclusions: The developed model was capable of automated segmentation and volume quantification of ICAC with accuracy comparable to human experts.
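The delineation metrics reported above (sensitivity and PPV) are defined directly from voxel overlap counts. A minimal sketch, with toy counts chosen to echo the reported 83.8%/88% values (the actual per-scan counts are not given in the abstract):

```python
def delineation_metrics(tp, fp, fn):
    """Voxel-overlap metrics for comparing automatic and manual
    delineations: sensitivity (recall) and positive predictive value."""
    sensitivity = tp / (tp + fn)  # fraction of manual voxels recovered
    ppv = tp / (tp + fp)          # fraction of automatic voxels correct
    return sensitivity, ppv

# toy voxel counts for a single scan
sens, ppv = delineation_metrics(tp=88, fp=12, fn=17)
```

Volume agreement, by contrast, is summarized with the intraclass correlation between the automatic and manual volume measurements.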

Gerda Bortsova, Daniel Bos, Florian Dubost, Meike W. Vernooij, M. Kamran Ikram, Gijs van Tulder, Marleen de Bruijne

2021-07-20

Cardiology Cardiology

Left Atrial Wall Stress and the Long-Term Outcome of Catheter Ablation of Atrial Fibrillation: An Artificial Intelligence-Based Prediction of Atrial Wall Stress.

In Frontiers in physiology

Atrial stretch may contribute to the mechanism of atrial fibrillation (AF) recurrence after atrial fibrillation catheter ablation (AFCA). We tested whether the left atrial (LA) wall stress (LAW-stress[measured]) could be predicted by artificial intelligence (AI) using non-invasive parameters (LAW-stress[AI]) and whether rhythm outcome after AFCA could be predicted by LAW-stress[AI] in an independent cohort. Cohort 1 included 2223 patients, and cohort 2 included 658 patients who underwent AFCA. LAW-stress[measured] was calculated with the Law of Laplace, using the LA diameter measured by echocardiography, the peak LA pressure measured during the procedure, and the LA wall thickness measured by customized software (AMBER) using computed tomography. The highest quartile (Q4) of LAW-stress[measured] was predicted and validated by AI using non-invasive clinical parameters, including non-paroxysmal type of AF, age, presence of hypertension, diabetes, vascular disease, and heart failure, left ventricular ejection fraction, and the ratio of the peak mitral flow velocity of the early rapid filling to the early diastolic velocity of the mitral annulus (E/Em). We tested the AF/atrial tachycardia recurrence 3 months after the blanking period after AFCA using LAW-stress[measured] and LAW-stress[AI] in cohort 1 and LAW-stress[AI] in cohort 2. LAW-stress[measured] was independently associated with non-paroxysmal AF (p < 0.001), diabetes (p = 0.012), vascular disease (p = 0.002), body mass index (p < 0.001), E/Em (p < 0.001), and mean LA voltage measured by electrogram voltage mapping (p < 0.001). The best-performing AI model had acceptable prediction power for predicting Q4-LAW-stress[measured] (area under the receiver operating characteristic curve 0.734). During 26.0 (12.0-52.0) months of follow-up, AF recurrence was significantly higher in the Q4-LAW-stress[measured] group [log-rank p = 0.001, hazard ratio 2.43 (1.21-4.90), p = 0.013] and the Q4-LAW-stress[AI] group (log-rank p = 0.039) in cohort 1. In cohort 2, the Q4-LAW-stress[AI] group consistently showed worse rhythm outcomes (log-rank p < 0.001). Higher LAW-stress was associated with poorer rhythm outcomes after AFCA. AI was able to predict this complex but useful prognostic parameter from non-invasive parameters with moderate accuracy.
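The Law of Laplace used for LAW-stress[measured] relates pressure, chamber radius, and wall thickness. As a sketch only, using the thin-walled-sphere form sigma = P * r / (2 * t) (whether the study used this exact form, and its unit conventions, are assumptions; the input values below are invented):

```python
def la_wall_stress(pressure_mmhg, diameter_mm, thickness_mm):
    """Thin-wall Law of Laplace sketch: stress = P * r / (2 * t).
    Pressure is converted from mmHg to kPa; lengths are in mm."""
    pressure_kpa = pressure_mmhg * 0.133322  # 1 mmHg = 0.133322 kPa
    radius = diameter_mm / 2.0
    return pressure_kpa * radius / (2.0 * thickness_mm)

# hypothetical patient: 20 mmHg peak LA pressure, 40 mm LA diameter,
# 2 mm wall thickness
stress = la_wall_stress(pressure_mmhg=20.0, diameter_mm=40.0,
                        thickness_mm=2.0)
```

The formula makes the clinical intuition explicit: stress rises with pressure and chamber size, and falls with a thicker wall.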

Lee Jae-Hyuk, Kwon Oh-Seok, Shim Jaemin, Lee Jisu, Han Hee-Jin, Yu Hee Tae, Kim Tae-Hoon, Uhm Jae-Sun, Joung Boyoung, Lee Moon-Hyoung, Kim Young-Hoon, Pak Hui-Nam

2021

artificial intelligence, atrial fibrillation, atrial wall stress, catheter ablation, rhythm outcome

Pathology Pathology

Deep Shape Features for Predicting Future Intracranial Aneurysm Growth.

In Frontiers in physiology

Introduction: Intracranial aneurysms (IAs) are a common vascular pathology and are associated with a risk of rupture, which is often fatal. Aneurysm growth is considered a surrogate of rupture risk; therefore, this study aimed to develop and evaluate prediction models of future IA growth based on baseline aneurysm morphology as a computer-aided treatment decision support. Materials and methods: Follow-up CT angiography (CTA) and magnetic resonance angiography (MRA) angiograms of 39 patients with 44 IAs were classified by an expert as growing or stable (25/19). From the angiograms, vascular surface meshes were extracted, and the aneurysm shape was characterized by established morphologic features and novel deep shape features. The features corresponding to the baseline aneurysms were used to predict future aneurysm growth using univariate thresholding, multivariate random forest and multi-layer perceptron (MLP) learning, and deep shape learning based on the PointNet++ model. Results: The proposed deep shape feature learning method achieved an accuracy of 0.82 (sensitivity = 0.96, specificity = 0.63), while the multivariate learning and univariate thresholding methods were inferior, with accuracies of up to 0.68 and 0.63, respectively. Conclusion: High-performing classification of future growing IAs renders the proposed deep shape feature learning approach a key enabling tool for managing rupture risk in the "no treatment" paradigm of patient follow-up imaging.

Bizjak Žiga, Pernuš Franjo, Špiclin Žiga

2021

classification, deep learning, growth prediction, intracranial aneurysm, morphologic features, vascular disease

General General

Learning Then, Learning Now, and Every Second in Between: Lifelong Learning With a Simulated Humanoid Robot.

In Frontiers in neurorobotics

Long-term human-robot interaction requires the continuous acquisition of knowledge. This ability is referred to as lifelong learning (LL). LL is a long-standing challenge in machine learning due to catastrophic forgetting: continuously learning from novel experiences leads to a decrease in performance on previously acquired knowledge. Two recently published LL approaches are the Growing Dual-Memory (GDM) and the Self-organizing Incremental Neural Network+ (SOINN+). Both are growing neural networks that create new neurons in response to novel sensory experiences. The latter approach shows state-of-the-art clustering performance on sequentially available data with low memory requirements regarding the number of nodes; however, its classification capabilities have not been investigated. Two novel contributions are made in our research paper: (I) An extended SOINN+ approach, called associative SOINN+ (A-SOINN+), is proposed. It adopts two main properties of the GDM model to facilitate classification. (II) A new LL object recognition dataset (v-NICO-World-LL) is presented. It is recorded in a nearly photorealistic virtual environment, where a virtual humanoid robot manipulates 100 different objects belonging to 10 classes. Real-world and artificially created background images, grouped into four different complexity levels, are utilized. The A-SOINN+ reaches similar state-of-the-art classification accuracy results as the best GDM architecture of this work and consists of 30 to 350 times fewer neurons, evaluated on two LL object recognition datasets, the novel v-NICO-World-LL and the well-known CORe50. Furthermore, we observe an approximately 268 times lower training time. These reduced numbers result in lower memory and computational requirements, indicating higher suitability for autonomous social robots with low computational resources to facilitate a more efficient LL during long-term human-robot interactions.
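The defining behaviour of growing networks like GDM and SOINN+, creating a new neuron when an input is novel, can be caricatured in a few lines. This is a generic sketch, not the actual SOINN+ algorithm (which uses adaptive thresholds, edges between nodes, and node pruning); the fixed threshold and learning rate are assumptions:

```python
def grow_network(stream, new_node_threshold, lr=0.5):
    """Growing-network sketch: insert a new node when an input is farther
    than a threshold from its nearest node; otherwise nudge the winner."""
    nodes = []
    for x in stream:
        if not nodes:
            nodes.append(list(x))
            continue
        dists = [sum((a - b) ** 2 for a, b in zip(x, n)) ** 0.5
                 for n in nodes]
        winner = dists.index(min(dists))
        if dists[winner] > new_node_threshold:
            nodes.append(list(x))            # novel input -> new neuron
        else:
            nodes[winner] = [n + lr * (a - n)  # familiar -> adapt winner
                             for a, n in zip(x, nodes[winner])]
    return nodes

stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
nodes = grow_network(stream, new_node_threshold=1.0)  # two clusters -> 2 nodes
```

Because only the winning node moves, previously learned nodes are left untouched, which is exactly how such architectures mitigate catastrophic forgetting.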

Logacjov Aleksej, Kerzel Matthias, Wermter Stefan

2021

growing dual-memory, lifelong learning, lifelong learning dataset, long-term human-robot interaction, self-organizing incremental neural network, simulated humanoid robot

General General

Revisiting Persistent Neuronal Activity During Covert Spatial Attention.

In Frontiers in neural circuits

Persistent activity has been observed in the prefrontal cortex (PFC), in particular during the delay periods of visual attention tasks. Classical approaches based on activity averaged over multiple trials have revealed that such activity encodes information about the attentional instruction provided in such tasks. However, single-trial approaches have shown that activity in this area is sparse rather than persistent, and highly heterogeneous not only within trials but also between trials. This observation raises the question of how persistent the putatively persistent attention-related prefrontal activity is, and how it contributes to spatial attention. In this paper, we review recent evidence that precisely deconstructs the persistence of neural activity in the PFC in the context of attention orienting. The inclusion of machine-learning methods for decoding the information reveals that attention orienting is a highly dynamic process, possessing intrinsic oscillatory dynamics working at multiple timescales spanning from milliseconds to minutes. Dimensionality reduction methods further show that this persistent activity dynamically incorporates multiple sources of information. This novel framework reflects a high complexity in the neural representation of attention-related information in the PFC, and shows how its computational organization predicts behavior.

Amengual Julian L, Ben Hamed Suliann

2021

alpha oscillations, decoding, mixed-selectivity, neurophysiology, persistent activity, population activity, prefrontal cortex, spatial attention

General General

Possibilistic Clustering-Promoting Semi-Supervised Learning for EEG-Based Emotion Recognition.

In Frontiers in neuroscience ; h5-index 72.0

The goal of the latest brain-computer interfaces is to perform accurate emotion recognition through the customization of recognizers to each subject. In the field of machine learning, graph-based semi-supervised learning (GSSL) has attracted increasing attention due to its intuitive formulation and good learning performance for emotion recognition. However, existing GSSL methods are sensitive, or not robust enough, to noise or outlier electroencephalogram (EEG)-based data, since each individual subject may present noise or outlier EEG patterns in the same scenario. To address this problem, in this paper we propose a Possibilistic Clustering-Promoting semi-supervised learning method for EEG-based emotion recognition. Specifically, it constrains each instance to have the same label membership value as its local weighted mean, to improve the reliability of the recognition method. In addition, a regularization term based on fuzzy entropy is introduced into the objective function, and the generalization ability of the membership function is enhanced by increasing the amount of sample discrimination information, which improves the robustness of the method to noise and outliers. Extensive experimental results on three real datasets (i.e., DEAP, SEED, and SEED-IV) show that the proposed method improves the reliability and robustness of EEG-based emotion recognition.

Dan Yufang, Tao Jianwen, Fu Jianjing, Zhou Di

2021

electroencephalogram, emotion recognition, fuzzy entropy, membership function, semi-supervised classification

Public Health Public Health

Serverless Workflows for Containerised Applications in the Cloud Continuum.

In Journal of grid computing

This paper introduces an open-source platform to support serverless computing for scientific data-processing workflow-based applications across the Cloud continuum (i.e., simultaneously involving both on-premises and public Cloud platforms to process data captured at the edge). This is achieved via dynamic resource provisioning for FaaS platforms compatible with scale-to-zero approaches that minimise resource usage and cost for dynamic workloads with different elasticity requirements. The platform combines dynamically deployed auto-scaled Kubernetes clusters on on-premises Clouds with automated Cloud bursting into AWS Lambda to achieve higher levels of elasticity. The platform is assessed with a public-health use case for smart cities: detecting people not wearing face masks in captured videos. In this data-driven containerised workflow, faces are blurred for enhanced anonymity in the on-premises Cloud, and detection via Deep Learning models is performed in AWS Lambda. The results indicate that hybrid workflows across the Cloud continuum can efficiently perform local data processing for enhanced regulatory compliance and use Cloud bursting for increased levels of elasticity.

Risco Sebastián, Moltó Germán, Naranjo Diana M, Blanquer Ignacio

2021

Cloud computing, Containers, Serverless computing, Workflow

General General

Correcting data imbalance for semi-supervised COVID-19 detection using X-ray chest images.

In Applied soft computing

A key factor in the fight against viral diseases such as the coronavirus (COVID-19) is the identification of virus carriers as early and quickly as possible, in a cheap and efficient manner. The application of deep learning for image classification of chest X-ray images of COVID-19 patients could become a useful pre-diagnostic detection methodology. However, deep learning architectures require large labelled datasets. This is often a limitation when the subject of research is relatively new, as in the case of the virus outbreak, where dealing with small labelled datasets is a challenge. Moreover, in such contexts, the datasets are also highly imbalanced, with few observations from positive cases of the new disease. In this work we evaluate the performance of the semi-supervised deep learning architecture known as MixMatch with a very limited number of labelled observations and highly imbalanced labelled datasets. We demonstrate the critical impact of data imbalance on the model's accuracy. We therefore propose a simple approach for correcting data imbalance, by re-weighting each observation in the loss function, giving a higher weight to the observations corresponding to the under-represented class. For unlabelled observations, we use the pseudo and augmented labels calculated by MixMatch to choose the appropriate weight. The proposed method improved classification accuracy by up to 18% compared with the non-balanced MixMatch algorithm. We tested our proposed approach with several available datasets using 10, 15 and 20 labelled observations, for binary classification (COVID-19 positive and normal cases). For multi-class classification (COVID-19 positive, pneumonia and normal cases), we tested 30, 50, 70 and 90 labelled observations. Additionally, a new dataset is included among the tested datasets, composed of chest X-ray images of Costa Rican adult patients.
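The re-weighting idea, a higher loss weight for the under-represented class, is commonly implemented with weights inversely proportional to class frequency. This sketch shows that scheme for labelled observations only (how the paper derives weights for unlabelled data from MixMatch's pseudo-labels is not reproduced, and the exact weighting formula is an assumption):

```python
def class_balance_weights(labels):
    """Per-observation loss weights inversely proportional to class
    frequency, so the minority class contributes more to the loss."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, k = len(labels), len(counts)
    # weight n / (k * count) keeps the total weight equal to n
    return [n / (k * counts[y]) for y in labels]

# 1 = COVID-19 positive (minority), 0 = normal
weights = class_balance_weights([0, 0, 0, 1])
```

Each observation's loss term is then multiplied by its weight, so the single positive case above counts three times as much as each negative one.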

Calderon-Ramirez Saul, Yang Shengxiang, Moemeni Armaghan, Elizondo David, Colreavy-Donnelly Simon, Chavarría-Estrada Luis Fernando, Molina-Cabello Miguel A

2021-Nov

COVID-19, Computer aided diagnosis, Coronavirus, Data imbalance, Semi-supervised learning

General General

Improve teaching with modalities and collaborative groups in an LMS: an analysis of monitoring using visualisation techniques.

In Journal of computing in higher education

Monitoring students in Learning Management Systems (LMS) throughout the teaching-learning process has been shown to be a very effective technique for detecting students at risk. Likewise, the teaching style in the LMS conditions the type of student behaviours on the platform and the learning outcomes. The main objective of this study was to test the effectiveness of three teaching modalities (all using Online Project-Based Learning -OPBL- and Flipped Classroom experiences, and differing in the use of virtual laboratories and an Intelligent Personal Assistant -IPA-) on Moodle behaviour and student performance, taking into account the covariate "collaborative group". Both quantitative and qualitative research methods were used. With regard to the quantitative analysis, differences were found in student behaviour in Moodle and in learning outcomes with respect to the teaching modalities that included virtual laboratories. Similarly, the qualitative study analysed the behaviour patterns found in each collaborative group in the three teaching modalities studied. The results indicate that the collaborative group homogenises the learning outcomes, but not the behaviour pattern of each member. Future research will address the analysis of collaborative behaviour in LMSs according to different variables (motivation and metacognitive strategies of students, number of members, interactions between students and teacher in the LMS, etc.).

Sáiz-Manzanares María Consuelo, Marticorena-Sánchez Raúl, Rodríguez-Díez Juan José, Rodríguez-Arribas Sandra, Díez-Pastor José Francisco, Ji Yi Peng

2021-Jul-13

Heat map, Machine learning techniques, Monitoring students, Online project-based learning, Self-regulated learning, Visualisation techniques

General General

Fusion of AI techniques to tackle COVID-19 pandemic: models, incidence rates, and future trends.

In Multimedia systems

The COVID-19 pandemic is rapidly spreading across the globe and has infected millions of people, taking hundreds of thousands of lives. Over the years, the role of artificial intelligence (AI) has been on the rise as its algorithms become more and more accurate, and it is thought that its role in strengthening the existing healthcare system will be profound. Moreover, the pandemic brought an opportunity to showcase the potential of AI-healthcare integration, as the current infrastructure worldwide is overwhelmed and crumbling. Due to AI's flexibility and adaptability, it can be used as a tool to tackle COVID-19. Motivated by these facts, in this paper we survey how AI techniques can handle the COVID-19 pandemic situation and present the merits and demerits of these techniques. This paper presents a comprehensive end-to-end review of the AI techniques that can be used to tackle all areas of the pandemic. Further, we systematically discuss the issues of COVID-19 and, based on the literature review, suggest potential countermeasures using AI techniques. In the end, we analyze various open research issues and challenges associated with integrating AI techniques into the COVID-19 response.

Shah Het, Shah Saiyam, Tanwar Sudeep, Gupta Rajesh, Kumar Neeraj

2021-Jul-13

AI, COVID-19, Deep learning, Healthcare, Machine learning

General General

New but for whom? Discourses of innovation in precision agriculture.

In Agriculture and human values

We describe how the set of tools, practices, and social relations known as "precision agriculture" is defined, promoted, and debated. To do so, we perform a critical discourse analysis of popular and trade press websites. Promoters of precision agriculture champion how big data analytics, automated equipment, and decision-support software will optimize yields in the face of narrow margins and public concern about farming's environmental impacts. At its core, however, the idea of farmers leveraging digital infrastructure in their operations is not new, as agronomic research in this vein has existed for over 30 years. Contemporary discourse in precision ag tends to favour emerging digital technologies themselves over their embeddedness in longstanding precision management approaches. Following several strands of science and technology studies (STS) research, we explore what rhetorical emphasis on technical innovation achieves, and argue that this discourse of novelty is a reinvention of precision agriculture in the context of the growing "smart" agricultural economy. We overview six tensions that remain unresolved in this promotional rhetoric, concerning the definitions, history, goals, adoption, uses, and impacts of precision agriculture. We then synthesize these in a discussion of the extent to which digital tools are believed to displace farmer decision-making and whether digital agriculture addresses the biophysical heterogeneity of farm landscapes or land itself has become an "experimental technology"-a way to advance the general development of artificial intelligence. This discussion ultimately helps us name a larger dilemma: that the smart agricultural economy is perhaps less about supporting land and its stewards than promising future tech and profits.

Duncan Emily, Glaros Alesandros, Ross Dennis Z, Nost Eric

2021-Jul-14

Digital agriculture, Discourse, Innovation, Precision agriculture

General General

Machine Learning for Real-World Evidence Analysis of COVID-19 Pharmacotherapy

ArXiv Preprint

Introduction: Real-world data generated from clinical practice can be used to analyze the real-world evidence (RWE) of COVID-19 pharmacotherapy and validate the results of randomized clinical trials (RCTs). Machine learning (ML) methods are being used in RWE and are promising tools for precision medicine. In this study, ML methods are applied to study the efficacy of therapies on COVID-19 hospital admissions in the Valencian Region in Spain. Methods: 5244 and 1312 COVID-19 hospital admissions, dated between January 2020 and January 2021 from 10 health departments, were used respectively for training and validation of separate treatment-effect models (TE-ML) for remdesivir, corticosteroids, tocilizumab, lopinavir-ritonavir, azithromycin and chloroquine/hydroxychloroquine. 2390 admissions from 2 additional health departments were reserved as an independent test set to retrospectively analyze the survival benefits of therapies in the populations selected by the TE-ML models, using Cox proportional hazards models. TE-ML models were adjusted using treatment propensity scores to control for pre-treatment confounding variables associated with the outcome, and further evaluated for futility. The ML architecture was based on boosted decision trees. Results: In the populations identified by the TE-ML models, only remdesivir and tocilizumab were significantly associated with an increase in survival time, with hazard ratios of 0.41 (P = 0.04) and 0.21 (P = 0.001), respectively. No survival benefits from chloroquine derivatives, lopinavir-ritonavir or azithromycin were demonstrated. Tools to explain the predictions of TE-ML models are explored at the patient level as potential aids for personalized decision making and precision medicine. Conclusion: ML methods are suitable tools for the RWE analysis of COVID-19 pharmacotherapies. The results obtained reproduce published RWE results and validate the results from RCTs.
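The propensity-score adjustment described here is closely related to inverse-probability-of-treatment weighting. A generic sketch of that idea, with hypothetical patients and not the authors' exact TE-ML adjustment:

```python
def ipw_weights(treated, propensity):
    """Inverse-probability-of-treatment weights: treated patients get
    1/e(x), controls get 1/(1 - e(x)), where e(x) is the estimated
    probability of receiving the treatment given pre-treatment covariates."""
    return [1.0 / p if t else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]

# Hypothetical cohort: two treated patients, two controls.
w = ipw_weights([1, 1, 0, 0], [0.8, 0.5, 0.5, 0.2])
# A treated patient with a low propensity (unlikely to be treated given
# their covariates) is up-weighted, balancing the pseudo-population.
```

Weighting the survival analysis with such weights mimics a randomized comparison by removing the association between covariates and treatment assignment.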

Aurelia Bustos, Patricio Mas_Serrano, Mari L. Boquera, Jose M. Salinas

2021-07-19

General General

Analysis of training and seed bias in small molecules generated with a conditional graph-based variational autoencoder -- Insights for practical AI-driven molecule generation

ArXiv Preprint

The application of deep learning to generative molecule design has shown early promise for accelerating lead series development. However, questions remain concerning how factors like training, dataset, and seed bias impact the technology's utility to medicinal and computational chemists. In this work, we analyze the impact of seed and training bias on the output of an activity-conditioned graph-based variational autoencoder (VAE). Leveraging a massive, labeled dataset corresponding to the dopamine D2 receptor, our graph-based generative model is shown to excel in producing desired conditioned activities and favorable unconditioned physical properties in generated molecules. We implement an activity swapping method that allows for the activation, deactivation, or retention of activity of molecular seeds, and we apply independent deep learning classifiers to verify the generative results. Overall, we uncover relationships between noise, molecular seeds, and training set selection across a range of latent-space sampling procedures, providing important insights for practical AI-driven molecule generation.

Seung-gu Kang, Joseph A. Morrone, Jeffrey K. Weber, Wendy D. Cornell

2021-07-19

General General

Wave-Informed Matrix Factorization with Global Optimality Guarantees

ArXiv Preprint

With the recent success of representation learning methods, which includes deep learning as a special case, there has been considerable interest in developing representation learning techniques that can incorporate known physical constraints into the learned representation. As one example, in many applications that involve a signal propagating through physical media (e.g., optics, acoustics, fluid dynamics, etc), it is known that the dynamics of the signal must satisfy constraints imposed by the wave equation. Here we propose a matrix factorization technique that decomposes such signals into a sum of components, where each component is regularized to ensure that it satisfies wave equation constraints. Although our proposed formulation is non-convex, we prove that our model can be efficiently solved to global optimality in polynomial time. We demonstrate the benefits of our work by applications in structural health monitoring, where prior work has attempted to solve this problem using sparse dictionary learning approaches that do not come with any theoretical guarantees regarding convergence to global optimality and employ heuristics to capture desired physical constraints.
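The wave-equation constraint can be illustrated with a toy finite-difference residual on a 1-D grid; the paper's actual formulation regularizes matrix-factorization components and carries global-optimality guarantees, which this sketch does not capture:

```python
import math

def wave_residual(u, dx, dt, c):
    """Sum of squared residuals of u_tt - c^2 * u_xx over interior grid
    points, using second-order central differences. u[t][x] is the field
    sampled on a regular space-time grid."""
    res = 0.0
    for t in range(1, len(u) - 1):
        for x in range(1, len(u[0]) - 1):
            u_tt = (u[t + 1][x] - 2 * u[t][x] + u[t - 1][x]) / dt ** 2
            u_xx = (u[t][x + 1] - 2 * u[t][x] + u[t][x - 1]) / dx ** 2
            res += (u_tt - c ** 2 * u_xx) ** 2
    return res

# A travelling wave u(x, t) = sin(x - c t) satisfies the wave equation,
# so its residual is near zero (up to discretisation error).
c, dx, dt = 1.0, 0.01, 0.01
u = [[math.sin(x * dx - c * t * dt) for x in range(50)] for t in range(50)]
assert wave_residual(u, dx, dt, c) < 1e-6
```

A learned component with a large residual would be penalized, steering the factorization toward physically plausible wavefields.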

Harsha Vardhan Tetali, Joel B. Harley, Benjamin D. Haeffele

2021-07-19

General General

Deep Open Snake Tracker for Vessel Tracing

ArXiv Preprint

Vessel tracing by modeling vascular structures in 3D medical images with centerlines and radii can provide useful information for vascular health. Existing algorithms have been developed but there are certain persistent problems such as incomplete or inaccurate vessel tracing, especially in complicated vascular beds like the intracranial arteries. We propose here a deep learning based open curve active contour model (DOST) to trace vessels in 3D images. Initial curves were proposed from a centerline segmentation neural network. Then data-driven machine knowledge was used to predict the stretching direction and vessel radius of the initial curve, while the active contour model (as human knowledge) maintained smoothness and intensity fitness of curves. Finally, considering the nonloop topology of most vasculatures, individually traced vessels were connected into a tree topology by applying a minimum spanning tree algorithm on a global connection graph. We evaluated DOST on a Time-of-Flight (TOF) MRA intracranial artery dataset and demonstrated its superior performance over existing segmentation-based and tracking-based vessel tracing methods. In addition, DOST showed strong adaptability on different imaging modalities (CTA, MR T1 SPACE) and vascular beds (coronary arteries).
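The final connection step, building a tree topology from individually traced vessels, can be illustrated with a generic minimum spanning tree computed by Kruskal's algorithm over a weighted connection graph. The edge weights below are hypothetical connection costs between traced segments, not the paper's actual graph:

```python
def minimum_spanning_tree(n, edges):
    """Kruskal's algorithm with union-find. `edges` is a list of
    (weight, u, v) tuples over nodes 0..n-1; returns the chosen edges."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:          # joins two components, so no loop is created
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# Hypothetical connection graph over four traced vessel segments.
edges = [(1.0, 0, 1), (2.0, 1, 2), (5.0, 0, 2), (3.0, 2, 3)]
tree = minimum_spanning_tree(4, edges)
assert len(tree) == 3  # a spanning tree over 4 nodes has 3 edges
```

Because an MST never closes a cycle, the result respects the non-loop topology of most vasculatures mentioned in the abstract.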

Li Chen, Wenjin Liu, Niranjan Balu, Mahmud Mossa-Basha, Thomas S. Hatsukami, Jenq-Neng Hwang, Chun Yuan

2021-07-19

Dermatology Dermatology

Joint Dermatological Lesion Classification and Confidence Modeling with Uncertainty Estimation

ArXiv Preprint

Deep learning has played a major role in the interpretation of dermoscopic images for detecting skin defects and abnormalities. However, current deep learning solutions for dermatological lesion analysis are typically limited in providing probabilistic predictions, which highlights the importance of modeling uncertainty. This concept of uncertainty can provide a confidence level for each feature, preventing overconfident predictions that generalize poorly on unseen data. In this paper, we propose an overall framework that jointly considers dermatological classification and uncertainty estimation. A confidence network estimates the confidence of each feature in the latent space, which is used to avoid uncertain features and the undesirable shifts caused by environmental differences in the input images. Our qualitative results show that modeling uncertainties not only helps to quantify model confidence for each prediction but also helps the classification layers to focus on confident features, thereby improving the accuracy of dermatological lesion classification. We demonstrate the potential of the proposed approach on two state-of-the-art dermoscopic datasets (ISIC 2018 and ISIC 2019).
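One reading of the confidence-pooling idea is to softmax-weight features by their estimated confidence, so uncertain features contribute less to the pooled representation. This is an illustrative sketch of that general mechanism, not the authors' network:

```python
import math

def confidence_pooling(features, confidences):
    """Weight each feature by a softmax over its estimated confidence,
    so low-confidence features contribute less to the pooled output."""
    m = max(confidences)
    exp = [math.exp(c - m) for c in confidences]  # shift by max for stability
    z = sum(exp)
    weights = [e / z for e in exp]
    pooled = [w * f for w, f in zip(weights, features)]
    return pooled, weights

# Three equal features; only the first is estimated as confident.
pooled, w = confidence_pooling([1.0, 1.0, 1.0], [2.0, 0.0, 0.0])
assert w[0] > w[1]  # the confident feature dominates the pooled vector
```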

Gun-Hee Lee, Han-Bin Ko, Seong-Whan Lee

2021-07-19

Radiology Radiology

Current and emerging artificial intelligence applications for pediatric musculoskeletal radiology.

In Pediatric radiology

Artificial intelligence (AI) is playing an ever-increasing role in radiology (more so in the adult world than in pediatrics), to the extent that there are unfounded fears it will completely take over the role of the radiologist. In relation to musculoskeletal applications of AI in pediatric radiology, we are far from the time when AI will replace radiologists; even for the commonest application (bone age assessment), AI is more often employed in an AI-assist mode rather than an AI-replace or AI-extend mode. AI for bone age assessment has been in clinical use for more than a decade and is the area in which most research has been conducted. Most other potential indications in children (such as appendicular and vertebral fracture detection) remain largely in the research domain. This article reviews the areas in which AI is most prominent in relation to the pediatric musculoskeletal system, briefly summarizing the current literature and highlighting areas for future research. Pediatric radiologists are encouraged to participate as members of the research teams conducting pediatric radiology artificial intelligence research.

Offiah Amaka C

2021-Jul-16

Artificial intelligence, Bone, Children, Musculoskeletal, Pediatric radiology

General General

High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method.

In Briefings in bioinformatics

Transcription factors (TFs) are essential proteins in regulating the spatiotemporal expression of genes. It is crucial to infer potential transcription factor binding sites (TFBSs) at high resolution to advance biology and realize precision medicine. Recently, deep learning-based models have shown exemplary performance in the prediction of TFBSs at the base-pair level. However, previous models fail to integrate nucleotide position information and semantic information without noisy responses, so there is still room for improvement. Moreover, both the inner mechanism and the prediction results of these models are challenging to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the location of TFs-DNA binding sites in DNA sequences. In particular, our model adopts Skip Architecture to leverage nucleotide position information in the encoder and removes noisy responses in the information fusion process by an Attention Gate. Simultaneously, Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach to discover TFs-DNA binding motifs by utilizing the output of neural networks, is proposed to understand the biological meaning of the predicted results. On ChIP-exo datasets, experimental results show that D-AEDNet achieves better performance than competing methods. Besides, we show that the Attention Gate improves the interpretability of our model by way of visualization analysis. Furthermore, by conducting experiments on ChIP-seq datasets, we confirm that the ability of D-AEDNet to learn TFs-DNA binding motifs outperforms state-of-the-art methods and that TF-MoDSW can discover biological sequence motifs in TFs-DNA interactions.
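TF-MoDSW's details are not given here, but the general sliding-window idea, scanning per-base predicted binding probabilities and extracting the highest-scoring window, can be sketched as follows (toy sequence and probabilities):

```python
def best_motif_window(sequence, site_probs, width):
    """Slide a fixed-width window along the sequence and return the
    subsequence whose mean predicted binding probability is highest."""
    best_start, best_score = 0, float("-inf")
    for start in range(len(sequence) - width + 1):
        score = sum(site_probs[start:start + width]) / width
        if score > best_score:
            best_start, best_score = start, score
    return sequence[best_start:best_start + width], best_score

# Hypothetical per-base probabilities from a base-pair-level TFBS model.
seq = "ACGTTGACCA"
probs = [0.1, 0.1, 0.9, 0.9, 0.9, 0.8, 0.1, 0.1, 0.1, 0.1]
motif, score = best_motif_window(seq, probs, 4)
assert motif == "GTTG"
```

Aggregating such windows across many sequences would then yield candidate motifs for comparison against known databases.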

Zhang Yongqing, Wang Zixuan, Zeng Yuanqi, Zhou Jiliu, Zou Quan

2021-Jul-16

Attention Gate, interpretability, motif discovery, transcription factor binding sites

General General

A comprehensive transferability evaluation of U-Net and ResU-Net for landslide detection from Sentinel-2 data (case study areas from Taiwan, China, and Japan).

In Scientific reports ; h5-index 158.0

Earthquakes and heavy rainfalls are the two leading causes of landslides around the world. Since they often occur across large areas, landslide detection requires rapid and reliable automatic detection approaches. Currently, deep learning (DL) approaches, especially different convolutional neural network and fully convolutional network (FCN) algorithms, are reliably achieving cutting-edge accuracies in automatic landslide detection. However, these successful applications of various DL approaches have thus far been based on very high resolution satellite images (e.g., GeoEye and WorldView), making it easier to achieve such high detection performances. In this study, we use freely available Sentinel-2 data and the ALOS digital elevation model to investigate the application of two well-known FCN algorithms, namely U-Net and residual U-Net (the so-called ResU-Net), for landslide detection. To our knowledge, this is the first application of FCNs for landslide detection using only freely available data. We adapt the algorithms to the specific aim of landslide detection, then train and test with data from three different case study areas located in Western Taitung County (Taiwan), Shuzheng Valley (China), and Eastern Iburi (Japan). We use three different window sizes for the sample patches used to train the algorithms. Our results also contain a comprehensive transferability assessment achieved through different training and testing scenarios in the three case studies. The highest f1-score of 73.32% was obtained by ResU-Net, trained with a dataset from Japan and tested on China's holdout testing area, using a sample patch size of 64 × 64 pixels.
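Training on fixed window-size sample patches can be sketched as tiling an image into non-overlapping patches; a toy 8 × 8 array stands in here for the paper's Sentinel-2 scenes and 64 × 64 patches:

```python
def tile_patches(image, size):
    """Split a 2-D image (list of rows) into non-overlapping size x size
    patches, dropping any partial patches at the borders."""
    patches = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches

# Toy 8x8 "image"; a 64x64 patch grid over a real scene works the same way.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
patches = tile_patches(image, 4)
assert len(patches) == 4 and len(patches[0]) == 4
```

Varying `size` reproduces the kind of window-size comparison the study performs across its three case areas.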

Ghorbanzadeh Omid, Crivellari Alessandro, Ghamisi Pedram, Shahabi Hejar, Blaschke Thomas

2021-Jul-16

General General

GazeBase, a large-scale, multi-stimulus, longitudinal eye movement dataset.

In Scientific data

This manuscript presents GazeBase, a large-scale longitudinal dataset containing 12,334 monocular eye-movement recordings captured from 322 college-aged participants. Participants completed a battery of seven tasks in two contiguous sessions during each round of recording: (1) a fixation task, (2) a horizontal saccade task, (3) a random oblique saccade task, (4) a reading task, (5/6) free viewing of cinematic video tasks, and (7) a gaze-driven gaming task. Nine rounds of recording were conducted over a 37-month period, with participants in each subsequent round recruited exclusively from prior rounds. All data were collected using an EyeLink 1000 eye tracker at a 1,000 Hz sampling rate, with a calibration and validation protocol performed before each task to ensure data quality. Due to its large number of participants and longitudinal nature, GazeBase is well suited for exploring research hypotheses in eye movement biometrics, along with other applications applying machine learning to eye movement signal analysis. Classification labels produced by the instrument's real-time parser are provided for a subset of GazeBase, along with pupil area.

Griffith Henry, Lohr Dillon, Abdulin Evgeny, Komogortsev Oleg

2021-Jul-16

General General

Towards omics-based predictions of planktonic functional composition from environmental data.

In Nature communications ; h5-index 260.0

Marine microbes play a crucial role in climate regulation, biogeochemical cycles, and trophic networks. Unprecedented amounts of data on planktonic communities were recently collected, sparking a need for innovative data-driven methodologies to quantify and predict their ecosystemic functions. We reanalyze 885 marine metagenome-assembled genomes through a network-based approach and detect 233,756 protein functional clusters, of which 15% are functionally unannotated. We investigate all clusters' distributions across the global ocean through machine learning, identifying biogeographical provinces as the best predictors of protein functional clusters' abundance. The abundances of 14,585 clusters are predictable from the environmental context, including 1347 functionally unannotated clusters. We analyze the biogeography of these 14,585 clusters, identifying the Mediterranean Sea as an outlier in terms of protein functional cluster composition. Applicable to any set of sequences, our approach constitutes a step towards quantitative predictions of functional composition from the environmental context.

Faure Emile, Ayata Sakina-Dorothée, Bittner Lucie

2021-Jul-16

Radiology Radiology

Optimal timing of cholecystectomy after necrotising biliary pancreatitis.

In Gut ; h5-index 124.0

OBJECTIVE : Following an episode of acute biliary pancreatitis, cholecystectomy is advised to prevent recurrent biliary events. There is limited evidence regarding the optimal timing and safety of cholecystectomy in patients with necrotising biliary pancreatitis.

DESIGN : A post hoc analysis of a multicentre prospective cohort. Patients with biliary pancreatitis and a CT severity score of three or more were included in 27 Dutch hospitals between 2005 and 2014. Primary outcome was the optimal timing of cholecystectomy in patients with necrotising biliary pancreatitis, defined as: the optimal point in time with the lowest risk of recurrent biliary events and the lowest risk of complications of cholecystectomy. Secondary outcomes were the number of recurrent biliary events, periprocedural complications of cholecystectomy and the protective value of endoscopic sphincterotomy for the recurrence of biliary events.

RESULTS : Overall, 248 patients were included in the analysis. Cholecystectomy was performed in 191 patients (77%) at a median of 103 days (P25-P75: 46-222) after discharge. Infected necrosis after cholecystectomy occurred in four (2%) patients with persistent peripancreatic collections. Before cholecystectomy, 66 patients (27%) developed biliary events. The risk of overall recurrent biliary events prior to cholecystectomy was significantly lower before 10 weeks after discharge (risk ratio 0.49 (95% CI 0.27 to 0.90); p=0.02). The risk of recurrent pancreatitis before cholecystectomy was significantly lower before 8 weeks after discharge (risk ratio 0.14 (95% CI 0.02 to 1.0); p=0.02). The complication rate of cholecystectomy did not decrease over time. Endoscopic sphincterotomy did not reduce the risk of recurrent biliary events (OR 1.40 (95% CI 0.74 to 2.83)).
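The risk ratios reported above follow the standard 2 × 2-table computation with a log-method confidence interval. A sketch with purely hypothetical counts (the paper's underlying tables are not reproduced here):

```python
import math

def risk_ratio(a, b, c, d):
    """Risk ratio and 95% CI (log method) for a 2x2 table:
    exposed group has a events out of (a + b); unexposed has c out of (c + d)."""
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return rr, (lo, hi)

# Hypothetical counts only -- the study reports RR 0.49 (95% CI 0.27 to 0.90)
# for recurrent biliary events with cholecystectomy before 10 weeks.
rr, (lo, hi) = risk_ratio(10, 90, 20, 80)
assert lo < rr < hi
```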

CONCLUSION : The optimal timing of cholecystectomy after necrotising biliary pancreatitis, in the absence of peripancreatic collections, is within 8 weeks after discharge.

Hallensleben Nora D, Timmerhuis Hester C, Hollemans Robbert A, Pocornie Sabrina, van Grinsven Janneke, van Brunschot Sandra, Bakker Olaf J, van der Sluijs Rogier, Schwartz Matthijs P, van Duijvendijk Peter, Römkens Tessa, Stommel Martijn W J, Verdonk Robert C, Besselink Marc G, Bouwense Stefan A W, Bollen Thomas L, van Santvoort Hjalmar C, Bruno Marco J

2021-Jul-16

acute pancreatitis, cholecystectomy

Surgery Surgery

Radiopharmaceutical and Eu3+ doped gadolinium oxide nanoparticles mediated triple-excited fluorescence imaging and image-guided surgery.

In Journal of nanobiotechnology

Cerenkov luminescence imaging (CLI) is a novel optical imaging technique that has been applied in the clinic using various radionuclides and radiopharmaceuticals. However, the clinical application of CLI has been limited by its weak optical signal and restricted tissue penetration depth. Various fluorescent probes have been combined with radiopharmaceuticals for improved imaging performance. However, as most of these probes only interact with Cerenkov luminescence (CL), the low photon fluence of CL greatly restricts their interaction with fluorescent probes for in vivo imaging. Therefore, it is important to develop probes that can effectively convert energy beyond CL, such as β and γ radiation, into low-energy optical signals. In this study, a Eu3+ doped gadolinium oxide (Gd2O3:Eu) was synthesized and combined with radiopharmaceuticals to achieve a red-shifted optical spectrum with less tissue scattering and enhanced optical signal intensity. The interaction between Gd2O3:Eu and radiopharmaceuticals was investigated using 18F-fluorodeoxyglucose (18F-FDG). The ex vivo optical signal intensity of the mixture of Gd2O3:Eu and 18F-FDG reached 369 times that of CLI using 18F-FDG alone. To achieve improved biocompatibility, the Gd2O3:Eu nanoparticles were then modified with polyvinyl alcohol (PVA), and the resulting nanoprobe, PVA-modified Gd2O3:Eu (Gd2O3:Eu@PVA), was applied in intraoperative tumor imaging. Compared with 18F-FDG alone, intraoperative administration of the Gd2O3:Eu@PVA and 18F-FDG combination achieved a much higher tumor-to-normal tissue ratio (TNR, 10.24 ± 2.24 vs. 1.87 ± 0.73, P = 0.0030). The use of Gd2O3:Eu@PVA and 18F-FDG also assisted intraoperative detection of tumors that were missed by preoperative positron emission tomography (PET) imaging. A further image-guided surgery experiment demonstrated the feasibility of image-guided tumor resection using Gd2O3:Eu@PVA and 18F-FDG.
In summary, Gd2O3:Eu achieves significantly improved imaging properties when combined with 18F-FDG in intraoperative tumor imaging and image-guided tumor resection surgery. It is expected that the development of the Gd2O3:Eu nanoparticle will promote the investigation and application of novel nanoparticles that can interact with radiopharmaceuticals for improved imaging properties. This work highlights the impact of a nanoprobe that can be excited by the CL, β, and γ radiation emitted by radiopharmaceuticals for precise tumor imaging and intraoperative guidance of tumor resection.

Shi Xiaojing, Cao Caiguang, Zhang Zeyu, Tian Jie, Hu Zhenhua

2021-Jul-16

Cerenkov luminescence imaging, Gd2O3:Eu, Image-guided surgery, Optical imaging, Radiopharmaceuticals

General General

Revealing the threat of emerging SARS-CoV-2 mutations to antibody therapies.

In Journal of molecular biology ; h5-index 65.0

The ongoing massive vaccination and the development of effective interventions offer the long-awaited hope of ending the global rage of the COVID-19 pandemic. However, rapidly growing SARS-CoV-2 variants might compromise existing vaccines and monoclonal antibody (mAb) therapies. Although there are valuable experimental studies about the potential threats from emerging variants, the results are limited to a handful of mutations and to the Eli Lilly and Regeneron mAbs. The potential threats posed by frequently occurring mutations on the SARS-CoV-2 spike (S) protein receptor-binding domain (RBD) to many mAbs in clinical trials are largely unknown. We fill the gap by developing a topology-based deep learning strategy that is validated with tens of thousands of experimental data points. We analyze 796,759 genome isolates from patients to identify 606 non-degenerate RBD mutations and investigate their impacts on 16 mAbs in clinical trials. Our findings, which are highly consistent with existing experimental results about the Alpha, Beta, Gamma, Delta, Epsilon, and Kappa variants, shed light on the potential threats of the 100 most-observed mutations to mAbs not only from Eli Lilly and Regeneron but also from Celltrion and Rockefeller University that are in clinical trials. We unveil, for the first time, that the high-frequency mutations R346K/S, N439K, G446V, L455F, V483F/A, F486L, F490L/S, Q493L, and S494P might compromise some of the mAbs in clinical trials. Our study gives rise to a general perspective about how mutations will affect current vaccines.

Chen Jiahui, Gao Kaifu, Wang Rui, Wei Guo-Wei

2021-Jul-14

Antibody, clinical trial, deep learning, mutation, variant

General General

An evaluation of performance measures for arterial brain vessel segmentation.

In BMC medical imaging

BACKGROUND : Arterial brain vessel segmentation allows utilising clinically relevant information contained within the cerebral vascular tree. Currently, however, no standardised performance measure is available to evaluate the quality of cerebral vessel segmentations. Thus, we developed a performance measure selection framework based on manual visual scoring of simulated segmentation variations to find the most suitable measure for cerebral vessel segmentation.

METHODS : To simulate segmentation variations, we manually created non-overlapping segmentation errors common in magnetic resonance angiography cerebral vessel segmentation. In 10 patients, we generated a set of approximately 300 simulated segmentation variations for each ground truth image. Each segmentation was visually scored based on a predefined scoring system and segmentations were ranked based on 22 performance measures common in the literature. The correlation of visual scores with performance measure rankings was calculated using the Spearman correlation coefficient.

RESULTS : The distance-based performance measures balanced average Hausdorff distance (rank = 1) and average Hausdorff distance (rank = 2) provided the segmentation rankings with the highest average correlation with manual rankings. They were followed by overlap-based measures such as the Dice coefficient (rank = 7), a standard performance measure in medical image segmentation.
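As a minimal illustrative sketch (not the paper's implementation), the plain average Hausdorff distance between two point sets can be computed as below; the balanced variant that ranked first modifies this further, and the point arrays here are hypothetical:

```python
import numpy as np

def average_hausdorff(a, b):
    """Average Hausdorff distance between two point sets a (N, d) and b (M, d)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise distances
    # Mean nearest-neighbour distance in each direction, then averaged:
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

a = np.array([[0.0, 0.0], [1.0, 0.0]])  # e.g. ground-truth vessel voxels
b = np.array([[0.0, 1.0], [1.0, 0.0]])  # e.g. predicted vessel voxels
print(average_hausdorff(a, b))  # 0.5
```

Unlike the Dice coefficient, this measure is sensitive to how far misclassified voxels lie from the true vessel tree, which is why it separates high-quality segmentations better.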

CONCLUSIONS : Average Hausdorff distance-based measures should be used as a standard performance measure in evaluating cerebral vessel segmentation quality. They can identify more relevant segmentation errors, especially in high-quality segmentations. Our findings have the potential to accelerate the validation and development of novel vessel segmentation approaches.

Aydin Orhun Utku, Taha Abdel Aziz, Hilbert Adam, Khalil Ahmed A, Galinovic Ivana, Fiebach Jochen B, Frey Dietmar, Madai Vince Istvan

2021-Jul-16

Average Hausdorff distance, Cerebral arteries, Cerebral vessel segmentation, Dice, Image processing (computer-assisted), Ranking, Segmentation, Segmentation measures

General General

Transfer learning via multi-scale convolutional neural layers for human-virus protein-protein interaction prediction.

In Bioinformatics (Oxford, England)

MOTIVATION : To complement experimental efforts, machine learning-based computational methods are playing an increasingly important role to predict human-virus protein-protein interactions (PPIs). Furthermore, transfer learning can effectively apply prior knowledge obtained from a large source dataset/task to a small target dataset/task, improving prediction performance.

RESULTS : To predict interactions between human and viral proteins, we combine evolutionary sequence profile features with a Siamese convolutional neural network (CNN) architecture and a multi-layer perceptron. Our architecture outperforms various feature-encoding-based machine learning approaches and state-of-the-art prediction methods. As our main contribution, we introduce two transfer learning methods (i.e., 'frozen' type and 'fine-tuning' type) that reliably predict interactions in a target human-virus domain based on training in a source human-virus domain, by retraining CNN layers. Finally, we utilize the 'frozen' type transfer learning approach to predict human-SARS-CoV-2 PPIs, indicating that our predictions are topologically and functionally similar to experimentally known interactions.
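The 'frozen'-type transfer described above can be sketched in miniature: a pretrained feature layer is left untouched while only a new classification head is trained on the target domain. Everything below (random frozen weights, toy labels, a logistic head) is an illustrative stand-in for the paper's Siamese CNN, not its implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pretrained" feature extractor (in the paper: CNN layers trained
# on a source human-virus domain). Here: fixed random weights.
W_frozen = 0.3 * rng.normal(size=(8, 8))

def features(x):
    return np.tanh(x @ W_frozen)  # frozen layer: never updated below

# Toy target-domain data with a linearly determined label
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# 'Frozen'-type transfer: retrain only the classification head.
w, b = np.zeros(8), 0.0
H = features(X)
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(H @ w + b)))   # logistic head
    g = (p - y) / len(y)                     # gradient w.r.t. logits
    w -= 0.5 * H.T @ g
    b -= 0.5 * g.sum()

acc = ((H @ w + b > 0) == (y > 0.5)).mean()
```

The 'fine-tuning' variant would instead let gradients also update the extractor's weights, usually with a smaller learning rate.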

SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Yang Xiaodi, Yang Shiping, Lian Xianyi, Wuchty Stefan, Zhang Ziding

2021-Jul-17

General General

Diffuse reflectance spectroscopy based rapid coal rank estimation: A machine learning enabled framework.

In Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy

This research aims to study the ability of diffuse reflectance spectroscopy (DRS) to discriminate or classify coal samples into different ranks. Spectral characteristics such as the shape of the spectral profile, slope, and absorption intensity of coal samples of ranks ranging from lignite A to semi-anthracite were studied in the Vis-NIR-SWIR (350-2500 nm) range. A number of classification algorithms (Logistic Regression, Random Forest, and SVM) were trained on the DRS dataset of coal samples. Class imbalances present in the dataset were handled using different approaches (SMOTE and oversampling of minority classes), which improved the classification accuracy. Coal samples were initially classified into broad classes, viz. lignite, sub-bituminous, bituminous, and anthracite, with an accuracy of 0.98 and an F1 score of 0.75. Later, the same samples were further classified at the sub-class level. The sub-class level classification also obtained good results, with an accuracy of 0.77 and an F1 score of 0.64. The results demonstrate the effectiveness of rapid coal classification systems based on DRS data in combination with different machine learning-based classification algorithms.
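The class-imbalance handling mentioned above can be illustrated with a much-simplified SMOTE-style generator that interpolates between minority samples (real SMOTE interpolates toward k-nearest neighbours; that step and all data here are simplified/synthetic):

```python
import numpy as np

def smote_like(X_min, n_new, rng):
    """Create synthetic minority-class samples by linear interpolation
    between pairs of existing minority samples (neighbour search omitted)."""
    i = rng.integers(0, len(X_min), size=n_new)
    j = rng.integers(0, len(X_min), size=n_new)
    lam = rng.random((n_new, 1))               # interpolation factor in [0, 1)
    return X_min[i] + lam * (X_min[j] - X_min[i])

rng = np.random.default_rng(0)
X_min = rng.random((10, 4))          # e.g. spectra of a rare coal rank
X_new = smote_like(X_min, 20, rng)
print(X_new.shape)  # (20, 4)
```

Because every synthetic sample is a convex combination of real ones, the new points stay inside the minority class's region of feature space rather than simply duplicating existing rows.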

Begum Nafisa, Maiti Abhik, Chakravarty Debashish, Das Bhabani Sankar

2021-Jul-05

Coal rank, Diffuse reflectance spectroscopy, Logistic regression, Random forest classifier, Support vector machine

General General

Modeling the response of ecological service value to land use change through deep learning simulation in Lanzhou, China.

In The Science of the total environment

Land use (LU) changes caused by urbanization, climate, and anthropogenic activities alter the supply of ecosystem services (ES), which affects the ecological service value (ESV) of a given region. Existing LU simulation models extract neighborhood effects with only one data time slice, which ignores long-term dependence in neighborhood interactions. Previous studies on the dynamic relationship between LU change and ES in semi-arid areas are rarer than those in humid coastal areas. Here, we selected a semi-arid city, Lanzhou, in Northwest China as the study area to simulate LU changes in 2030 under natural growth (NG), ecological protection (EP), economic development (ED), and ecological protection-economic development (EPD) scenarios, using a novel deep learning method named CL-CA. A convolutional neural network and long short-term memory (CNN-LSTM) combined with cellular automata (CA) were utilized to extract the spatiotemporal neighborhood features. The overall simulation performance of the proposed model was above 0.92, surpassing that of LSTM-CA, artificial neural network (ANN)-CA, and recursive neural network (RNN)-CA. Ultimately, we utilized LU and ES to quantitatively evaluate the ESV changes. The results indicated that: (1) the trend of ESV in arid areas differs from that in coastal humid areas; (2) forest land and water were the main factors affecting ESV change; (3) the EPD scenario was more suitable for sustainable urban development.

Liu Jiamin, Xiao Bin, Jiao Jizong, Li Yueshi, Wang Xiaoyun

2021-Jul-10

Deep learning, Ecological service value, Land use change, Lanzhou, Scenario simulation, Semi-arid region

General General

Towards automatic airborne pollen monitoring: From commercial devices to operational by mitigating class-imbalance in a deep learning approach.

In The Science of the total environment

Allergic diseases have been the epidemic of the century among chronic diseases. Particularly for pollen allergies, and in the context of climate change, as airborne pollen seasons have been shifting earlier and abundances have been becoming higher, pollen monitoring plays an important role in generating high-risk allergy alerts. However, this task requires labour-intensive and time-consuming manual classification via optical microscopy. Even new-generation automatic monitoring devices require manual pollen labelling to increase accuracy and to advance to genuinely operational devices. Deep learning-based models have the potential to increase the accuracy of automated pollen monitoring systems. In the current research, transfer learning-based convolutional neural networks were employed to classify pollen grains from microscopic images. Given a high imbalance in the dataset, we incorporated class-weighted loss, focal loss and weight vector normalisation for class balancing, as well as data augmentation and weight penalties for regularisation. Airborne pollen has been routinely recorded by a Bio-Aerosol Analyzer (BAA500, Hund GmbH) located in Augsburg, Germany. Here we utilised a database of manually classified airborne pollen images covering the whole pollen diversity throughout an annual pollen season. By using the cropped pollen images collected by this device, we achieved an unweighted average F1 score of 93.8% across 15 classes and an unweighted average F1 score of 75.9% across 31 classes. The majority of taxa (9 of 15), being also the most abundant and allergenic, showed a recall of at least 95%, reaching up to a remarkable 100% in pollen from Taxus and Urticaceae. The recent introduction of novel pollen monitoring devices worldwide has pointed to the necessity for real-time, automatic measurements of airborne pollen and fungal spores. Thus, we may improve everyday clinical practice and achieve the most efficient prophylaxis of allergic patients.
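Of the class-balancing tools listed above, focal loss is easy to state concretely. A minimal sketch of its standard binary form (not necessarily the exact multi-class variant used in the study; probabilities and labels below are illustrative):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=None):
    """Binary focal loss: gamma > 0 down-weights easy, well-classified
    examples; alpha rebalances the two classes."""
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true class
    w = (1.0 - p_t) ** gamma                  # focusing factor
    if alpha is not None:
        w = w * np.where(y == 1, alpha, 1 - alpha)
    return float(-(w * np.log(p_t)).mean())

p = np.array([0.9, 0.2])   # predicted P(class = 1)
y = np.array([1, 0])       # true labels
print(focal_loss(p, y, gamma=0.0))  # plain cross-entropy
print(focal_loss(p, y, gamma=2.0))  # easy examples contribute far less
```

With gamma = 0 the expression reduces to ordinary cross-entropy; increasing gamma shifts the gradient budget toward hard, rare-class examples, which is the point of using it on a heavily imbalanced pollen dataset.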

Schaefer Jakob, Milling Manuel, Schuller Björn W, Bauer Bernhard, Brunner Jens O, Traidl-Hoffmann Claudia, Damialis Athanasios

2021-Jul-08

Aerobiology, Automatic classification, Convolutional neural network, Machine learning, Pollen

General General

Linking population dynamics to microbial kinetics for hybrid modeling of bioelectrochemical systems.

In Water research

Mechanistic and data-driven models have been developed to provide predictive insights into the design and optimization of engineered bioprocesses. These two modeling strategies can be combined to form hybrid models to address the issues of parameter identifiability and prediction interpretability. Herein, we developed a novel and robust hybrid modeling strategy by incorporating microbial population dynamics into model construction. The hybrid model was constructed using bioelectrochemical systems (BES) as a platform system. We collected 77 samples from 13 publications, in which the BES were operated under diverse conditions, and performed holistic processing of the 16S rRNA amplicon sequencing data. Community analysis revealed core populations composed of putative electroactive taxa Geobacter, Desulfovibrio, Pseudomonas, and Acinetobacter. Primary Bayesian networks were trained with the core populations and environmental parameters, and directed Bayesian networks were trained by defining the operating parameters to improve the prediction interpretability. Both networks were validated with Bray-Curtis similarity, relative root-mean-square error (RMSE), and a null model. A hybrid model was developed by first building a three-population mechanistic component and subsequently feeding the estimated microbial kinetic parameters into network training. The hybrid model generated a simulated community that shared a Bray-Curtis similarity of 72% with the actual microbial community at the genus level and an average relative RMSE of 7% for individual taxa. When examined with additional samples that were not included in network training, the hybrid model achieved accurate prediction of current production with a relative error-based RMSE of 0.8 and outperformed the data-driven models. The genomics-enabled hybrid modeling strategy represents a significant step toward robust simulation of a variety of engineered bioprocesses.
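The Bray-Curtis similarity used for validation is a direct calculation on abundance vectors; a small sketch with made-up abundances (not values from the study):

```python
import numpy as np

def bray_curtis_similarity(u, v):
    """1 - Bray-Curtis dissimilarity for non-negative abundance vectors."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return 1.0 - np.abs(u - v).sum() / (u + v).sum()

simulated = np.array([40.0, 30.0, 20.0, 10.0])  # hypothetical simulated abundances
observed  = np.array([35.0, 25.0, 25.0, 15.0])  # hypothetical observed abundances
print(bray_curtis_similarity(simulated, observed))  # 0.9
```

A value of 1 means identical community composition and 0 means no shared abundance, so the paper's 72% genus-level figure indicates a substantially overlapping simulated community.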

Cheng Zhang, Yao Shiyun, Yuan Heyang

2021-Jul-09

Bayesian network, Engineered bioprocess, Hybrid modeling, Machine learning, Microbial kinetics, Microbial population dynamics

General General

An adaptive learning method of anchor shape priors for biological cells detection and segmentation.

In Computer methods and programs in biomedicine

BACKGROUND AND OBJECTIVE : Owing to the variable shapes, large size difference, uneven grayscale and dense distribution among biological cells in an image, it is still a challenging task for the standard Mask R-CNN to accurately detect and segment cells. Especially, the state-of-the-art anchor-based methods fail to generate the anchors of sufficient scales effectively according to the various sizes and shapes of cells, thereby hardly covering all scales of cells.

METHODS : We propose an adaptive approach to learn anchor shape priors from data samples, in which the aspect ratios and the number of anchor boxes are dynamically adjusted using the ISODATA clustering algorithm instead of human prior knowledge. To address the difficulty of identifying small objects caused by the multiple down-samplings in deep learning-based methods, a densification strategy for candidate anchors is presented to enhance the identification of tiny cells. Finally, a series of comparative experiments are conducted on various datasets to select an appropriate network structure and verify the effectiveness of the proposed methods.
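A much-simplified stand-in can convey the clustering idea: cluster the (log) aspect ratios of ground-truth boxes and use the cluster centres as anchor shapes. ISODATA additionally splits and merges clusters to choose the number of anchors adaptively; plain k-means with a fixed k is used below, and the ratios are made up:

```python
import numpy as np

def kmeans_1d(x, k, iters=50):
    """Plain 1-D k-means; a simplified stand-in for ISODATA, which also
    splits/merges clusters to pick k adaptively."""
    centers = np.quantile(x, np.linspace(0, 1, k))   # spread initial centres
    for _ in range(iters):
        assign = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = x[assign == c].mean()
    return np.sort(centers)

# Aspect ratios (w/h) of hypothetical ground-truth cell boxes
ratios = np.array([0.5, 0.55, 1.0, 1.1, 2.0, 2.2])
anchors = kmeans_1d(np.log(ratios), k=3)   # cluster in log space
print(np.exp(anchors))  # learned anchor aspect ratios, roughly [0.52, 1.05, 2.10]
```

Clustering in log space treats a 2:1 box and a 1:2 box symmetrically, which is the usual choice for aspect ratios.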

RESULTS : The results show that the ResNet-50-FPN combining the ISODATA method and densification strategy can obtain better performance than other methods in multiple metrics (including AP, Precision, Recall, Dice and PQ) for various biological cell datasets, such as U373, GoTW1, SIM+ and T24.

CONCLUSIONS : Our adaptive algorithm could effectively learn the anchor shape priors from the various sizes and shapes of cells. It is promising and encouraging for a real-world anchor-based detection and segmentation application of biomedical engineering in the future.

Hu Haigen, Liu Aizhu, Zhou Qianwei, Guan Qiu, Li Xiaoxin, Chen Qi

2021-Jul-08

Anchor densification, Anchor shape, Cell detection and segmentation, ISODATA, Mask R-CNN

General General

Selective ensemble-based online adaptive deep neural networks for streaming data with concept drift.

In Neural networks : the official journal of the International Neural Network Society

Concept drift is an important issue in the field of streaming data mining. However, how to maintain real-time model convergence in a dynamic environment is an important and difficult problem. In addition, current methods have limited ability to handle streaming data classification for complex nonlinear problems. To solve these problems, a selective ensemble-based online adaptive deep neural network (SEOA) is proposed to address concept drift. First, the adaptive depth unit is constructed by combining shallow features with deep features and adaptively controls the information flow in the neural network according to changes in streaming data at adjacent moments, which improves the convergence of the online deep learning model. Then, the adaptive depth units of different layers are regarded as base classifiers for the ensemble and are weighted dynamically according to the loss of each classifier. In addition, a dynamic selection of base classifiers is adopted according to the fluctuation of the streaming data to achieve a balance between stability and adaptability. The experimental results show that the SEOA can effectively contend with different types of concept drift and has good robustness and generalization.
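The dynamic loss-based weighting of base classifiers can be sketched with one common scheme, a softmax over negative losses; the abstract does not specify the paper's exact update rule, so this is illustrative only:

```python
import numpy as np

def ensemble_weights(losses, eta=1.0):
    """Softmax over negative losses: base classifiers with lower recent
    loss receive exponentially larger ensemble weights."""
    w = np.exp(-eta * np.asarray(losses, float))
    return w / w.sum()

losses = np.array([0.2, 0.5, 1.5])   # recent losses of three base classifiers
w = ensemble_weights(losses)
print(w.argmax(), round(w.sum(), 6))  # 0 1.0
```

Recomputing the weights after every batch lets the ensemble shift mass toward whichever depth unit currently tracks the drifting stream best, while eta controls how aggressively it reacts.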

Guo Husheng, Zhang Shuai, Wang Wenjian

2021-Jul-02

Adaptive method, Concept drift, Deep neural networks, Online learning, Selective ensemble

General General

Functionalization of remote sensing and on-site data for simulating surface water dissolved oxygen: Development of hybrid tree-based artificial intelligence models.

In Marine pollution bulletin

Dissolved oxygen (DO) is an important indicator for environmental engineers and ecological scientists to understand the state of river health. This study aims to evaluate the reliability of four feature selector algorithms, i.e., Boruta, genetic algorithm (GA), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGBoost), in selecting the best-suited predictors among the applied water quality (WQ) parameters, and to compare four tree-based predictive models, namely random forest (RF), conditional random forests (cForest), RANdom forest GEneRator (Ranger), and XGBoost, in predicting changes of DO in the Klang River, Malaysia. The total features included 15 WQ parameters from monitoring site data and 7 hydrological components from remote sensing data. All predictive models performed well on the features selected by the XGBoost and MARS algorithms in terms of the applied statistical evaluators. The best performance among all applied predictive models was noted for the XGBoost model when features were selected by the MARS and XGBoost algorithms, with coefficient of determination (R2) values of 0.84 and 0.85, respectively; the Boruta-XGBoost model achieved only marginal performance in this scenario.

Tiyasha Tiyasha, Tung Tran Minh, Bhagat Suraj Kumar, Tan Mou Leong, Jawad Ali H, Mohtar Wan Hanna Melini Wan, Yaseen Zaher Mundher

2021-Jul-14

Artificial intelligence, Dissolved oxygen, Feature selection, Remote sensing data, Surface water quality

General General

Factors associated with death in confirmed cases of COVID-19 in the state of Rio de Janeiro.

In BMC infectious diseases ; h5-index 58.0

BACKGROUND : COVID-19 can occur asymptomatically, as influenza-like illness, or as more severe forms, which characterize severe acute respiratory syndrome (SARS). Its mortality rate is higher in individuals over 80 years of age and in people with comorbidities, so these constitute the risk group for severe forms of the disease. We analyzed the factors associated with death in confirmed cases of COVID-19 in the state of Rio de Janeiro. This cross-sectional study evaluated the association between individual demographic, clinical, and epidemiological variables and the outcome (death) using data from the Unified Health System information systems.

METHODS : We used the extreme gradient boosting (XGBoost) model to analyze the data, which uses decision trees weighted by the estimation difficulty. To evaluate the relevance of each independent variable, we used the SHapley Additive exPlanations (SHAP) metric. From the probabilities generated by the XGBoost model, we transformed the data to the logarithm of odds to estimate the odds ratio for each independent variable.
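The probability-to-odds-ratio transformation described above is a direct calculation; a sketch with illustrative probabilities (not values from the study):

```python
import numpy as np

def log_odds(p):
    """Logit transform: probability -> logarithm of odds."""
    return np.log(p / (1.0 - p))

# Hypothetical model-predicted probabilities of death with and without a
# given risk factor (illustrative numbers only):
p_with, p_without = 0.4, 0.25
odds_ratio = np.exp(log_odds(p_with) - np.log(p_without / (1.0 - p_without)))
print(round(odds_ratio, 2))  # 2.0
```

Exponentiating the difference of log-odds recovers the odds ratio, so model-predicted probabilities can be reported on the familiar epidemiological scale.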

RESULTS : This study showed that older individuals of black race/skin color with heart disease or diabetes who had dyspnea or fever were more likely to die.

CONCLUSIONS : The early identification of patients who may progress to a more severe form of the disease can help improve the clinical management of patients with COVID-19 and is thus essential to reduce the lethality of the disease.

Cini Oliveira Marcella, de Araujo Eleuterio Tatiana, de Andrade Corrêa Allan Bruno, da Silva Lucas Dalsenter Romano, Rodrigues Renata Coelho, de Oliveira Bruna Andrade, Martins Marlos Melo, Raymundo Carlos Eduardo, de Andrade Medronho Roberto

2021-Jul-16

COVID-19, Coronavirus death, Coronavirus infection, Machine learning, Pandemic, SARS-CoV-2, XGBoost

General General

Multitask machine learning models for predicting lipophilicity (logP) in the SAMPL7 challenge.

In Journal of computer-aided molecular design

Accurate prediction of lipophilicity (logP) from molecular structures is a well-established field. Predictions of logP are often used to drive forward drug discovery projects. Driven by the SAMPL7 challenge, in this manuscript we describe the steps taken to construct a novel machine learning model that can predict and generalize well. This model is based on the recently described Directed-Message Passing Neural Networks (D-MPNNs). Further enhancements included both the inclusion of additional datasets from ChEMBL (RMSE improvement of 0.03) and the addition of helper tasks (RMSE improvement of 0.04). To the best of our knowledge, the concept of adding predictions from other models (Simulations Plus logP and logD@pH7.4, respectively) as helper tasks is novel and could be applied in a broader context. The final model that we constructed and used to participate in the challenge ranked 2nd out of 17 ranked submissions, with an RMSE of 0.66 and an MAE of 0.48 (submission: Chemprop). The model also works well on other datasets, especially when retrospectively applied to the SAMPL6 challenge, where it would have ranked first out of all submissions (RMSE of 0.35). Although our model works well, we conclude with suggestions that are expected to improve it even further.

Lenselink Eelke B, Stouten Pieter F W

2021-Jul-17

D-MPNN, Multitask machine learning, SAMPL7, logP prediction

Public Health Public Health

Utilisation of machine learning to predict surgical candidates for the treatment of childhood upper airway obstruction.

In Sleep & breathing = Schlaf & Atmung

OBJECTIVE : To investigate the effect of adenotonsillectomy on OSAS symptoms based on a data-driven approach and thereby identify criteria that may help avoid unnecessary surgery in children with OSAS.

METHODS : In 323 children enrolled in the Childhood Adenotonsillectomy Trial, randomised to undergo either early adenotonsillectomy (eAT; N = 165) or a strategy of watchful waiting with supportive care (WWSC; N = 158), the apnea-hypopnea index, heart period pattern dynamics, and thoraco-abdominal asynchrony were derived from overnight polysomnography (PSG). Using machine learning, all children were classified into one of two clusters based on those features. The cluster transitions between baseline and follow-up PSG were then investigated for each child to identify those who recovered spontaneously, those who recovered following surgery, and those who did not benefit from surgery.

RESULTS : The two clusters showed significant differences in OSAS symptoms, where children assigned in cluster A had fewer physiological and neurophysiological symptoms than cluster B. Whilst the majority of children were assigned to cluster A, those children who underwent surgery were more likely to stay in cluster A after seven months. Those children who were in cluster B at baseline PSG were more likely to have their symptoms reversed via surgery. Children who were assigned to cluster B at both baseline and 7 months after surgery had significantly higher end-tidal carbon dioxide at baseline. Children who spontaneously changed from cluster B to A presented highly problematic ratings in behaviour and emotional regulation at baseline.

CONCLUSIONS : Data-driven analysis demonstrated that AT helps to reverse and to prevent the worsening of the pathophysiological symptoms in children with OSAS. Multiple pathophysiological markers used with machine learning can capture more comprehensive information on childhood OSAS. Children with mild physiological and neurophysiological symptoms could avoid AT, and children who have UAO symptoms post AT may have sleep-related hypoventilation disease which requires further investigation. Furthermore, the findings may help surgeons more accurately predict children on whom they should perform AT.

Liu Xiao, Pamula Yvonne, Immanuel Sarah, Kennedy Declan, Martin James, Baumert Mathias

2021-Jul-17

Adenotonsillectomy, Children, Data-driven, Machine learning, Sleep apnea

General General

Custom TKA: what to expect and where do we stand today?

In Archives of orthopaedic and trauma surgery

INTRODUCTION : The concept of custom total knee arthroplasty (TKA) is explored with specific attention to current limitations. Arguments in favor of custom TKA are the anatomic and functional variability we encounter in our patients. The biggest conceptual challenge is to marry the need for correction of deformity with the ambition to stay as close as possible to original anatomy.

MATERIALS AND METHODS : A PubMed search was performed on the following terms: 'patient specific implant', 'custom made implant', 'custom implant', 'total knee arthroplasty' and 'total knee replacement'. These studies were evaluated for the following intra- and post-operative variables: blood loss, hospital stay, range of motion, patient-reported outcome measures, limb and implant alignment, implant fit, tibiofemoral kinematics, complications and revision rates.

RESULTS : Out of 1117 studies found with the initial search, a total of 17 articles were included in the final analysis. In eight of the 17 (47%) studies, either the research was commercially funded or one of the authors had a conflict of interest related to the work. Eleven of the 17 studies included a control group in their study setup. Of those studies that included a control group, both superior and inferior results compared to off-the-shelf implants have been reported.

CONCLUSION : Custom knee implants are the next step in matching the geometric features of the prosthesis to the anatomy of the individual patient, after several iterations that added asymmetry and sizes in the existing implants. Several companies have proven that it is feasible to produce these implants in a safe way. An overview of current literature reveals the lack of strong methodological studies that prove the value of this new technology. Custom knee implants face conceptual and practical difficulties, some of which might be overcome with technological advances, such as robotics and artificial intelligence.

Victor Jan, Vermue Hannes

2021-Jul-17

Coronal alignment, Custom-made, Knee arthroplasty

General General

Wireless Soft Scalp Electronics and Virtual Reality System for Motor Imagery-Based Brain-Machine Interfaces.

In Advanced science (Weinheim, Baden-Wurttemberg, Germany)

Motor imagery offers an excellent opportunity as a stimulus-free paradigm for brain-machine interfaces. Conventional electroencephalography (EEG) for motor imagery requires a hair cap with multiple wired electrodes and messy gels, causing motion artifacts. Here, a wireless scalp electronic system with virtual reality for real-time, continuous classification of motor imagery brain signals is introduced. This low-profile, portable system integrates imperceptible microneedle electrodes and soft wireless circuits. Virtual reality addresses subject variance in detectable EEG response to motor imagery by providing clear, consistent visuals and instant biofeedback. The wearable soft system offers advantageous contact surface area and reduced electrode impedance density, resulting in significantly enhanced EEG signals and classification accuracy. The combination with convolutional neural network-machine learning provides a real-time, continuous motor imagery-based brain-machine interface. With four human subjects, the scalp electronic system offers a high classification accuracy (93.22 ± 1.33% for four classes), allowing wireless, real-time control of a virtual reality game.

Mahmood Musa, Kwon Shinjae, Kim Hojoong, Kim Yun-Soung, Siriaraya Panote, Choi Jeongmoon, Otkhmezuri Boris, Kang Kyowon, Yu Ki Jun, Jang Young C, Ang Chee Siang, Yeo Woon-Hong

2021-Jul-17

brain-machine interfaces, motor imagery brain signals, virtual reality system, wireless soft scalp electronics

General General

The role of artificial intelligence in enhancing clinical nursing care: A scoping review.

In Journal of nursing management ; h5-index 43.0

AIM : To present an overview of how artificial intelligence has been used to improve clinical nursing care.

BACKGROUND : Artificial intelligence has been reshaping the healthcare industry but little is known about its applicability in enhancing nursing care.

METHODS : A scoping review was conducted. Seven electronic databases (CINAHL, Cochrane Library, EMBASE, IEEE Xplore, PubMed, Scopus, and Web of Science) were searched from 1 January 2010 till 20 December 2020. Grey literature and reference lists of included articles were also searched.

RESULTS : Thirty-seven studies encapsulating the use of AI in improving clinical nursing care were included in this review. Six use cases were identified: documentation, formulating nursing diagnoses, formulating nursing care plans, patient monitoring, patient care prediction such as falls prediction (the most common), and wound management. Various machine learning and classification techniques were used for predictive analyses and to improve nurses' preparedness and management of patients' conditions.

CONCLUSIONS : This review highlighted the potential of artificial intelligence in improving the quality of nursing care. However, more randomized controlled trials in real-life healthcare settings should be conducted to enhance the rigor of evidence.

IMPLICATIONS FOR NURSING MANAGEMENT : Education in the application of artificial intelligence should be promoted to empower nurses to lead technological transformations and not passively trail behind others.

Ng Zi Qi Pamela, Ling Li Ying Janice, Chew Han Shi Jocelyn, Lau Ying

2021-Jul-17

Artificial Intelligence, Health Care, Machine Learning, Nursing, Patient Care

General General

The PHU-NET: A robust phase unwrapping method for MRI based on deep learning.

In Magnetic resonance in medicine ; h5-index 66.0

PURPOSE : This work was aimed at designing a deep-learning-based approach for MR image phase unwrapping to improve the robustness and efficiency of traditional methods.

METHODS : A deep learning network called PHU-NET was designed for MR phase unwrapping. In this network, a novel training data generation method was proposed to simulate the wrapping patterns in MR phase images. The wrapping boundary and wrapping counts were explicitly estimated and used for network training. The proposed method was quantitatively evaluated and compared to other methods using a number of simulated datasets with varying signal-to-noise ratio (SNR) and MR phase images from various parts of the human body.

RESULTS : The results showed that our method performed better in the simulated data even under an extremely low SNR. The proposed method had less residual wrapping in the images from various parts of human body and worked well in the presence of severe anatomical discontinuity. Our method was also advantageous in terms of computational efficiency compared to the traditional methods.

CONCLUSION : This work proposed a robust and computationally efficient MR phase unwrapping method based on a deep learning network, which has promising performance in applications using MR phase information.

Zhou Hongyu, Cheng Chuanli, Peng Hao, Liang Dong, Liu Xin, Zheng Hairong, Zou Chao

2021-Jul-17

artificial intelligence, deep learning, magnetic resonance imaging, phase unwrapping
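
For reference, the serial unwrapping rule that deep networks like PHU-NET aim to replace can be written in one dimension as Itoh's method: add or subtract whole turns of 2π so that consecutive phase increments stay within (-π, π]. A minimal sketch (the linear test phase is invented):

```python
import math

def unwrap_1d(wrapped):
    """Itoh's method: remove 2*pi jumps between consecutive samples."""
    out = [wrapped[0]]
    for p in wrapped[1:]:
        d = p - out[-1]
        # shift the increment back into (-pi, pi] by subtracting whole turns
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(out[-1] + d)
    return out

# wrap a smooth linear phase into (-pi, pi], then recover it
true_phase = [0.5 * i for i in range(12)]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in true_phase]
recovered = unwrap_1d(wrapped)
```

This serial rule fails when noise or anatomical discontinuity makes a true increment exceed π, which is exactly the regime where learned wrapping-count estimation is attractive.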

General General

Spatiotemporal distributions of pan evaporation and the influencing factors in China from 1961 to 2017.

In Environmental science and pollution research international

Pan evaporation (EVP) is an important element of the hydrological cycle and exhibits a close relationship with climate change. In this study, the generalized regression neural network (GRNN) model and extreme gradient boosting (Xgboost) model were applied to estimate the monthly EVP. The spatiotemporal distributions of EVP and its influencing factors in China and eight subregions from 1961 to 2017 were analyzed. The root mean square error (RMSE) of all GRNN models was approximately 10%, and the Nash-Sutcliffe efficiency (NSE) coefficient was larger than 0.94 in different subregions. The annual mean EVP in all subregions and throughout China showed decreasing trends before 1993, while EVP increasing trends occurred in East China (EC), South China (SC), Southwest China (SWC), the west of Northwest China (WNC), and throughout China after 1994. Subsequently, the variable importance in projection (VIP) between EVP and climatic factors obtained by partial least squares (PLS) regression and the relative contributions calculated by Xgboost stepwise regression analysis (SRA) were used to investigate the sensitivity of EVP to climatic factors. The results indicated that the combined effects of the vapor pressure deficit (VPD), sunshine duration (SSD), and wind speed (WIN) were the main reasons for the variations in EVP across China. At the seasonal scale, SSD, WIN, relative humidity (RHU), and VPD were the climatic factors to which EVP was most sensitive in different seasons. In addition, the Pacific decadal oscillation (PDO) index showed a significant negative correlation with EVP, and the El Niño 3.4 (N3.4) and East Atlantic/Western Russia (EA/WR) indices revealed positive correlations in most regions from 1961 to 1993, while the North Atlantic oscillation (NAO) was negatively correlated with EVP. Moreover, N3.4 and the Atlantic multidecadal oscillation (AMO) were positively correlated with EVP from 1994 to 2017. Finally, the yearly number of heatwave events (HWN) was highly correlated with EVP because of the high VPD and SSD levels during heatwave event periods.

Niu Zigeng, Wang Lunche, Chen Xinxin, Yang Liu, Feng Lan

2021-Jul-16

Heatwave event, Machine learning, Meteorological factors, Pan evaporation, Spatiotemporal distributions
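
The two skill scores used to validate the GRNN estimates are straightforward to compute; a minimal sketch with made-up monthly EVP values (the numbers are illustrative, not the study's data):

```python
import math

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - num / den

obs = [120.0, 150.0, 90.0, 110.0]   # hypothetical monthly EVP (mm)
sim = [118.0, 155.0, 92.0, 108.0]
```

NSE compares the model against the trivial "always predict the mean" baseline, which is why values above 0.94 indicate a very close fit.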

General General

Identification of kinase inhibitors that rule out the CYP27B1-mediated activation of vitamin D: an integrated machine learning and structure-based drug designing approach.

In Molecular diversity

CYP27B1, a cytochrome P450-containing hydroxylase enzyme, converts the vitamin D precursor calcidiol (25-hydroxycholecalciferol) to its active form calcitriol (1α,25(OH)2D3). Tyrosine kinase inhibitors such as imatinib are reported to interfere with the activation of vitamin D3 by inhibiting the CYP27B1 enzyme. Consequently, there is a decrease in the serum levels of active vitamin D that in turn may increase the relapse risk among cancer patients treated with imatinib. Within this framework, the current study focuses on identifying other possible kinase inhibitors that may affect the calcitriol level in the body by inhibiting CYP27B1. To achieve this, we explored multiple machine learning approaches, including support vector machine (SVM), random forest (RF), and artificial neural network (ANN) models, to identify possible CYP27B1 inhibitors from a kinase inhibitor database. The most reliable classification model was obtained with the SVM approach, with a Matthews correlation coefficient of 0.82 for the external test set. This model was further employed for the virtual screening of kinase inhibitors from BindingDB that tend to interfere with the CYP27B1-mediated activation of vitamin D. This screening yielded around 4646 kinase inhibitors, which were further subjected to structure-based analyses using a homology model of CYP27B1, as the 3D structure of CYP27B1 complexed with heme was not available. Overall, five kinase inhibitors, including two well-known drugs, i.e., AT7867 (Compound-2) and amitriptyline N-oxide (Compound-3), were found to interact with CYP27B1 in such a way that they may preclude the conversion of vitamin D to its active form and hence impair the vitamin D activation pathway.

Mahajan Kanupriya, Verma Himanshu, Choudhary Shalki, Raju Baddipadige, Silakari Om

2021-Jul-16

BindingDB, CYP27B1, Kinase inhibitors, Machine learning, Vitamin D
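
The Matthews correlation coefficient used above for model selection is computed directly from the confusion matrix; the counts below are illustrative, not the paper's data:

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient: +1 perfect, 0 random, -1 inverted."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# hypothetical external test set: 45 true actives, 40 true inactives caught
score = mcc(tp=45, tn=40, fp=5, fn=10)
```

Unlike plain accuracy, MCC stays informative when the active/inactive classes are imbalanced, which is typical of screening datasets.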

General General

Adaptive optical beam alignment and link protection switching for 5G-over-FSO.

In Optics express

Free-space optics (FSO) conveys enormous potential for ultra-high-capacity seamless fiber-wireless transmission in 5G and beyond communication systems. However, for its practical exploitation in future deployments, FSO still requires the development of very high-precision and robust optical beam alignment. In this paper, we propose two different methods to achieve tight, precise alignment between a pair of FSO transceivers, using a gimbal-based setup. For scenarios where there is no information about the system, a black-box artificial intelligence (AI)-based method resorting to particle swarm optimization (PSO) is presented, enabling the system to align autonomously with a success rate above 96%, converging from a blind starting position. Alternatively, for scenarios with partial information about the FSO system, we propose a tailored custom algorithm with a success rate of 92%, but with a ∼4× reduction in alignment time. The automatic alignment is then validated in a 5G-like fiber-FSO scenario, transmitting a 16 × 400 MHz signal and achieving a maximum bit-rate of 30 Gbps. Moreover, we propose the implementation of a fail-safe mechanism with a backup FSO receiver, thereby providing an extra degree of robustness against temporary events of strong degradation of the FSO channel or line-of-sight (LOS) interruption.

Fernandes Marco A, Brandão Bruno T, Georgieva Petia, Monteiro Paulo P, Guiomar Fernando P

2021-Jun-21
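
The particle swarm step at the heart of the black-box alignment method can be sketched on a toy two-axis problem, with a hypothetical quadratic "misalignment loss" standing in for the real received-power feedback (all parameters and the target position are invented):

```python
import random

def pso(objective, dim=2, n_particles=15, iters=60, seed=1):
    """Minimal particle swarm optimization: inertia + cognitive + social pull."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    gbest = pbest[min(range(n_particles), key=lambda i: pbest_val[i])][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < objective(gbest):
                    gbest = pos[i][:]
    return gbest

# toy loss: minimal when pan/tilt hit the remote aperture at (2, -3)
loss = lambda a: (a[0] - 2) ** 2 + (a[1] + 3) ** 2
best = pso(loss)
```

In the paper's setting the objective is measured, not computed, so each particle evaluation corresponds to physically moving the gimbal and reading the received optical power.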

General General

A toxicogenomic data space for system-level understanding and prediction of EDC-induced toxicity.

In Environment international

Endocrine disrupting compounds (EDCs) are a persistent threat to humans and wildlife due to their ability to interfere with endocrine signaling pathways. Inspired by previous work to improve chemical hazard identification through the use of toxicogenomics data, we developed a genomic-oriented data space for profiling the molecular activity of EDCs in an in silico manner, and for creating predictive models that identify and prioritize EDCs. Predictive models of EDCs, derived from gene expression data from rats (in vivo and in vitro primary hepatocytes) and humans (in vitro primary hepatocytes and HepG2), achieve testing accuracy greater than 90%. Negative test sets indicate that known safer chemicals are not predicted as EDCs. The rat in vivo-based classifiers achieve accuracy greater than 75% when tested for in vitro to in vivo extrapolation. This study reveals key metabolic pathways and genes affected by EDCs together with a set of predictive models that utilize these pathways to prioritize EDCs in a dose/time-dependent manner and to predict EDC-evoked metabolic diseases.

Sakhteman A, Failli M, Kublbeck J, Levonen A L, Fortino V

2021-Jul-13

Endocrine disrupting chemicals, In silico toxicity prediction, Machine learning, Metabolic diseases, Toxicogenomics

General General

Efficient machine learning model for predicting drug-target interactions with case study for Covid-19.

In Computational biology and chemistry

BACKGROUND : Discovering possible drug-target interactions (DTIs) is a decisive step in the detection of drug effects as well as in drug repositioning. There is a strong incentive to develop effective computational methods that can reliably predict potential DTIs, as traditional DTI laboratory experiments are expensive, time-consuming, and labor-intensive. Some technologies have been developed for this purpose; however, large numbers of interactions have not yet been detected, the accuracy of their predictions is still low, and protein sequences and structured data are rarely used together in the prediction process.

METHODS : This paper presents a DTI prediction model that takes advantage of the special capacity of the structured form of proteins and drugs. Our model obtains features from protein amino-acid sequences using physical and chemical properties, and from drug SMILES (Simplified Molecular Input Line Entry System) strings using encoding techniques. Comparing the proposed model with different existing methods under K-fold cross validation, empirical results show that our model, based on ensemble learning algorithms for DTI prediction, provides more accurate results from both structure and feature data.

RESULTS : The proposed model is applied to two datasets: benchmark (feature-only) datasets and DrugBank (structure-data) datasets. Experimental results obtained by Light-Boost and ExtraTree using structure and feature data yield 98% accuracy and a 0.97 F-score, compared to 94% and 0.92 achieved by the existing methods. Moreover, our model can successfully predict more as-yet-undiscovered interactions, and hence can be used as a practical tool for drug repositioning. A case study is performed in which our prediction model is applied to the proteins known to be affected by coronaviruses in order to predict possible interactions between these proteins and existing drugs. Our model is also applied to Covid-19-related drugs announced on DrugBank. The results show that some drugs, such as DB00691 and DB05203, are predicted with 100% accuracy to interact with the ACE2 protein, a cell-membrane protein that enables Covid-19 infection. Hence, our model can be used as an effective tool in drug repositioning to predict possible drug treatments for Covid-19.

El-Behery Heba, Attia Abdel-Fattah, El-Feshawy Nawal, Torkey Hanaa

2021-Jul-05

Covid-19, Deep-learning, Drug-target interactions, Drugs, Machine learning, Prediction, Proteins
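
The K-fold cross validation used above for the method comparison partitions the data so that every sample is tested exactly once; a minimal index-level sketch (fold count and sample count are illustrative):

```python
import random

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_splits(10, k=5))
```

Each drug-target pair appears in exactly one test fold, so the averaged metrics reflect performance on unseen pairs rather than memorization.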

General General

Realistic preterm prediction based on optimized synthetic sampling of EHG signal.

In Computers in biology and medicine

Preterm labor is the leading cause of neonatal morbidity and mortality in newborns and has attracted significant research attention from many scientific areas. The relationship between uterine contraction and the underlying electrical activities makes the uterine electrohysterogram (EHG) a promising direction for detecting and predicting preterm births. However, due to the scarcity of EHG signals, especially those leading to preterm births, synthetic algorithms have been used to generate artificial samples of the preterm birth type in order to eliminate bias in the prediction towards normal delivery, at the expense of reducing the feature effectiveness in automatic preterm detection based on machine learning. To address this problem, we quantify the effect of synthetic samples (balance coefficient) on the effectiveness of features and form a general performance metric by using several feature scores with relevant weights that describe their contributions to class segregation. In combination with the activation/inactivation functions that characterize the effect of the abundance of training samples on the accuracy of predicting preterm and normal delivery, we obtained an optimal sample balance coefficient that compromises between the effect of synthetic samples in removing bias toward the majority group (i.e., normal delivery) and their side effect of reducing the importance of features. A more realistic predictive accuracy was achieved through a series of numerical tests on the publicly available TPEHG database, demonstrating the effectiveness of the proposed method.

Xu Jinshan, Chen Zhenqin, Zhang Jinpeng, Lu Yanpei, Yang Xi, Pumir Alain

2021-Jul-10

Preterm prediction, Sample balance coefficient, Synthetic sampling, Uterine electrohysterogram
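
The synthetic sampling whose side effects the balance coefficient quantifies is typically SMOTE-like interpolation between minority-class samples. A generic sketch (not the paper's exact generator; the feature values are invented):

```python
import math
import random

def synthesize(minority, n_new, seed=0):
    """Create n_new samples by interpolating a random minority sample
    with its nearest minority-class neighbour."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        a = rng.choice(minority)
        b = min((m for m in minority if m is not a),
                key=lambda m: math.dist(a, m))
        t = rng.random()                              # interpolation weight
        out.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return out

preterm = [[0.2, 1.1], [0.25, 1.0], [0.9, 0.4]]   # hypothetical EHG features
balanced = preterm + synthesize(preterm, n_new=3)
```

Because synthetic points lie on segments between real minority samples, oversampling shrinks the apparent class overlap, which is the mechanism behind the optimistic bias the paper corrects for.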

Surgery Surgery

Prediction of 1-year mortality after heart transplantation using machine learning approaches: A single-center study from China.

In International journal of cardiology ; h5-index 68.0

BACKGROUND : Heart transplantation (HTx) remains the gold-standard treatment for end-stage heart failure. The aim of this study was to establish a risk-prediction model for assessing prognosis of HTx using machine-learning approach.

METHODS : Consecutive recipients of orthotopic HTx at our institute between January 1st, 2015 and December 31st, 2018 were included in this study. The primary outcome was 1-year mortality. Least absolute shrinkage and selection operator method was used to select variables and seven different machine-learning approaches were employed to develop the risk-prediction model. Bootstrap method was used for model validation. Shapley Additive exPlanations (SHAP) method was used for model interpretation.

RESULTS : A total of 381 recipients were included, with an average age of 43.783 years. Albumin, recipient age, and left atrium diameter ranked as the top three most important variables affecting the 1-year mortality of HTx. Other important variables included red blood cell count, hemoglobin, lymphocyte percentage, smoking history, use of lyophilized rhBNP, use of levosimendan, hypertension, cardiac surgery history, malignancy, and endotracheal intubation history. The random forest (RF) model achieved the best area under the curve (AUC) of 0.801, and the gradient boosting machine (GBM) showed the best sensitivity of 0.271. The SHAP method was introduced to display the RF model's prediction process for "survival" or "death" at the individual level.

CONCLUSIONS : We established the risk-prediction model for postoperative prognosis of HTx patients by using machine learning method and demonstrated that the RF model performed the highest discrimination with the largest AUC when validated. This prediction model could help to recognize high-risk HTx recipients, provide personalized therapy plan and reduce organ wastage.

Zhou Ying, Chen Si, Rao Zhenqi, Yang Dong, Liu Xiang, Dong Nianguo, Li Fei

2021-Jul-13

Heart transplantation, Machine-learning approach, Risk-prediction model, Shapley Additive exPlanations
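
The AUC reported above for the RF model equals the probability that a randomly chosen non-survivor is ranked above a randomly chosen survivor, and can be computed directly from predicted risks (the scores below are invented for illustration):

```python
def roc_auc(pos_scores, neg_scores):
    """Mann-Whitney form of AUC: P(score_pos > score_neg), ties count half."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# hypothetical predicted 1-year mortality risks
died = [0.81, 0.64, 0.55]
survived = [0.22, 0.47, 0.55, 0.10]
auc = roc_auc(died, survived)
```

This pairwise form makes clear why AUC measures discrimination only: rescaling all risks leaves it unchanged, so calibration must be assessed separately.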

General General

Metamaterial perfect absorber with morphology-engineered meta-atoms using deep learning.

In Optics express

Metamaterial perfect absorbers (MPAs) typically have regularly-shaped unit structures owing to constraints on conventional analysis methods, limiting their absorption properties. We propose an MPA structure with a general polygon-shaped meta-atom. Its irregular unit structure provides multiple degrees-of-freedom, enabling flexible properties, such as dual-band absorption. We constructed a deep neural network to predict the parameters of the corresponding MPA structure with a given absorptivity as input, and vice versa. The mean-square error was as low as 0.0017 on the validation set. This study provides a basis for the design of complicated artificial electromagnetic structures for application in metamaterials and metasurfaces.

Han Cheng, Zhang Baifu, Wang Hao, Ding Jianping

2021-Jun-21

Public Health Public Health

Protein Interaction Network-based Deep Learning Framework for Identifying Disease-Associated Human Proteins.

In Journal of molecular biology ; h5-index 65.0

Infectious diseases in humans are among the most pressing public health issues. Identification of novel disease-associated proteins will enable efficient recognition of novel therapeutic targets. Here, we develop a Graph Convolutional Network (GCN)-based model called PINDeL to identify disease-associated host proteins by integrating the human Protein Locality Graph and its corresponding topological features. Because of the amalgamation of the GCN with the protein interaction network, PINDeL achieves the highest accuracy of 83.45%, while the AUROC and AUPRC values are 0.90 and 0.88, respectively. With high accuracy, recall, F1-score, specificity, AUROC, and AUPRC, PINDeL outperforms other existing machine-learning and deep-learning techniques for disease gene/protein identification in humans. Application of PINDeL to an independent dataset of 24320 proteins, which were not used for training, validation, or testing purposes, predicts 6448 new disease-protein associations, of which we verify 3196 disease-proteins through experimental evidence such as disease ontology, Gene Ontology, and KEGG pathway enrichment analyses. Our investigation indicates that the experimentally verified 748 proteins are indeed responsible for pathogen-host protein interactions, of which 22 disease-proteins share associations with multiple diseases such as cancer, aging, chem-dependency, pharmacogenomics, normal variation, infection, and immune-related diseases. This unique Graph Convolutional Network-based prediction model is of utmost use in large-scale disease-protein association prediction and hence will provide crucial insights on disease pathogenesis and further aid in developing novel therapeutics.

Das Barnali, Mitra Pralay

2021-Jul-14

Deep Learning-based Classification, Disease-associated Proteins, Enrichment analysis, Graph Convolutional Networks, Topological features of Protein Locality Graph
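
The graph-convolution operation that models like PINDeL stack propagates features over the interaction graph as H' = ReLU(D^-1/2 (A+I) D^-1/2 H W). A pure-Python sketch on a toy three-protein graph (the weights and features are invented; real models learn W):

```python
import math

def gcn_layer(adj, feats, weights):
    """One GCN layer: symmetric-normalized propagation followed by ReLU."""
    n = len(adj)
    # add self-loops so each node keeps its own features
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1 / math.sqrt(sum(row)) for row in a_hat]
    # aggregate neighbour features with symmetric normalization
    agg = [[sum(d_inv_sqrt[i] * a_hat[i][k] * d_inv_sqrt[k] * feats[k][f]
                for k in range(n))
            for f in range(len(feats[0]))] for i in range(n)]
    # linear transform of aggregated features, then ReLU
    return [[max(0.0, sum(agg[i][f] * weights[f][o] for f in range(len(weights))))
             for o in range(len(weights[0]))] for i in range(n)]

# triangle graph of three interacting proteins, 2-d input features, 2 channels
adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, -1.0], [0.5, 1.0]]
h1 = gcn_layer(adj, feats, weights)
```

Stacking such layers lets each protein's representation absorb information from progressively larger graph neighbourhoods before classification.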

General General

Principles and methods in computational membrane protein design.

In Journal of molecular biology ; h5-index 65.0

After decades of progress in computational protein design, the design of proteins folding and functioning in lipid membranes appears today as the next frontier. Some notable successes in the de novo design of simplified model membrane protein systems have helped articulate fundamental principles of protein folding, architecture and interaction in the hydrophobic lipid environment. These principles are reviewed here, together with the computational methods and approaches that were used to identify them. We provide an overview of the methodological innovations in the generation of new protein structures and functions and in the development of membrane-specific energy functions. We highlight the opportunities offered by new machine learning approaches applied to protein design, and by new experimental characterization techniques applied to membrane proteins. Although membrane protein design is in its infancy, it appears more reachable than previously thought.

Andreevna Vorobieva Anastassia

2021-Jul-13

Computational protein design, de novo protein design, membrane proteins folding, protein function, protein structure

General General

Systematic mapping of global research on climate and health: a machine learning review.

In The Lancet. Planetary health

BACKGROUND : The global literature on the links between climate change and human health is large, increasing exponentially, and it is no longer feasible to collate and synthesise using traditional systematic evidence mapping approaches. We aimed to use machine learning methods to systematically synthesise an evidence base on climate change and human health.

METHODS : We used supervised machine learning and other natural language processing methods (topic modelling and geoparsing) to systematically identify and map the scientific literature on climate change and health published between Jan 1, 2013, and April 9, 2020. Only literature indexed in English was included. We searched Web of Science Core Collection, Scopus, and PubMed using title, abstract, and keywords only. We searched for papers including both a health component and an explicit mention of either climate change, climate variability, or climate change-relevant weather phenomena. We classified relevant publications according to the fields of climate research, climate drivers, health impact, date, and geography. We used supervised and unsupervised machine learning to identify and classify relevant articles in the field of climate and health, with outputs including evidence heat maps, geographical maps, and narrative synthesis of trends in climate health-related publications. We included empirical literature of any study design that reported on health pathways associated with climate impacts, mitigation, or adaptation.

FINDINGS : We predict that there are 15 963 studies in the field of climate and health published between 2013 and 2019. Climate health literature is dominated by impact studies, with mitigation and adaptation responses and their co-benefits and co-risks remaining niche topics. Air quality and heat stress are the most frequently studied exposures, with all-cause mortality and infectious disease incidence being the most frequently studied health outcomes. Seasonality, extreme weather events, heat, and weather variability are the most frequently studied climate-related hazards. We found major gaps in evidence on climate health research for mental health, undernutrition, and maternal and child health. Geographically, the evidence base is dominated by studies from high-income countries and China, with scant evidence from low-income countries, which often suffer most from the health consequences of climate change.

INTERPRETATION : Our findings show the importance and feasibility of using automated machine learning to comprehensively map the science on climate change and human health in the age of big literature. These can provide key inputs into global climate and health assessments. The scant evidence on climate change response options is concerning and could significantly hamper the design of evidence-based pathways to reduce the effects on health of climate change. In the post-2015 Paris Agreement era of climate solutions, we believe much more attention should be given to climate adaptation and mitigation options and their effects on human health.

FUNDING : Foreign, Commonwealth & Development Office.

Berrang-Ford Lea, Sietsma Anne J, Callaghan Max, Minx Jan C, Scheelbeek Pauline F D, Haddaway Neal R, Haines Andy, Dangour Alan D

2021-Jul-13

General General

PyRMD: A New Fully Automated AI-Powered Ligand-Based Virtual Screening Tool.

In Journal of chemical information and modeling

Artificial intelligence (AI) algorithms are dramatically redefining the current drug discovery landscape by boosting the efficiency of its various steps. Still, their implementation often requires a certain level of expertise in AI paradigms and coding. This often prevents the use of these powerful methodologies by non-expert users involved in the design of new biologically active compounds. Here, the random matrix discriminant (RMD) algorithm, a high-performance AI method specifically tailored for the identification of new ligands, was implemented in a new fully automated tool, PyRMD. This ligand-based virtual screening tool can be trained using target bioactivity data directly downloaded from the ChEMBL repository without manual intervention. The software automatically splits the available training compounds into active and inactive sets and learns the distinctive chemical features responsible for the compounds' activity/inactivity. PyRMD was designed to easily screen millions of compounds in hours through an automated workflow and intuitive input files, allowing fine tuning of each parameter of the calculation. Additionally, PyRMD features a wealth of benchmark metrics, to accurately probe the model performance, which were used here to gauge its predictive potential and limitations. PyRMD is freely available on GitHub (https://github.com/cosconatilab/PyRMD) as an open-source tool.

Amendola Giorgio, Cosconati Sandro

2021-Jul-16

General General

A hybrid machine learning/pharmacokinetic approach outperforms maximum a posteriori Bayesian estimation by selectively flattening model priors.

In CPT: pharmacometrics & systems pharmacology

Model-informed precision dosing (MIPD) approaches typically apply maximum a posteriori (MAP) Bayesian estimation to determine individual pharmacokinetic (PK) parameters with the goal of optimizing future dosing regimens. This process combines knowledge about the individual, in the form of drug levels or pharmacodynamic biomarkers, with prior knowledge of the drug PK in the general population. Use of "flattened priors" (FP), in which the weight of the model priors is reduced relative to observations about the patient, has been previously proposed to estimate individual PK parameters in instances where the patient is poorly described by the PK model. However, little is known about the predictive performance of FP and when to apply FP in MIPD. Here, FP is evaluated in a data set of 4679 adult patients treated with vancomycin. Depending on the PK model, prediction error could be reduced by applying FP in 42-55% of PK parameter estimations. Machine learning (ML) models could identify instances where FP would outperform MAP with a specificity of 81-86%, reducing overall root mean squared error (RMSE) of PK model predictions by 12-22% (0.5-1.2 mg/L) relative to MAP alone. The factors most indicative of the use of FP were past prediction residuals and bias in past PK predictions. A more clinically practical minimal model was developed using only these two features, reducing RMSE by 5-18% (0.20-0.93 mg/L) relative to MAP. This hybrid ML/PK approach advances the precision dosing toolkit by leveraging the power of ML while maintaining the mechanistic insight and interpretability of pharmacokinetic models.

Hughes Jasmine H, Keizer Ron J

2021-Jul-16

Bayesian, Pharmacokinetics-pharmacodynamics, Therapeutic Drug Monitoring, algorithms, individualization, personalized medicine, pharmacometrics, population pharmacokinetics, precision medicine
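
The flattening idea can be illustrated with a one-parameter toy MAP estimate in which a weight scales the prior penalty: prior_weight=1 is standard MAP, and prior_weight=0 discards the population prior entirely. This is a generic sketch, far simpler than the paper's vancomycin PK models; all numbers are invented:

```python
def map_estimate(obs, pred, prior_mean, obs_sd=1.0, prior_sd=1.0,
                 prior_weight=1.0):
    """Grid-search MAP estimate of one PK parameter theta, where pred(theta)
    is the model prediction. Lower prior_weight flattens the prior."""
    def neg_log_post(theta):
        lik = sum((y - pred(theta)) ** 2 for y in obs) / obs_sd ** 2
        prior = prior_weight * (theta - prior_mean) ** 2 / prior_sd ** 2
        return lik + prior
    grid = [i / 1000 for i in range(0, 5001)]     # theta in [0, 5]
    return min(grid, key=neg_log_post)

# toy linear model: observed concentration = theta * dose_term
pred = lambda theta: theta * 2.0
obs = [6.0]                      # the single observed level implies theta = 3
full = map_estimate(obs, pred, prior_mean=1.0)                    # pulled to prior
flat = map_estimate(obs, pred, prior_mean=1.0, prior_weight=0.0)  # data only
```

The full-prior estimate sits between the population prior and the data-driven value, while flattening moves it toward the data, which is exactly the behaviour the ML selector exploits for atypical patients.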

General General

Artificial Intelligence in Orthodontics: Where Are We Now? A Scoping Review.

In Orthodontics & craniofacial research

OBJECTIVE : This scoping review aims to determine the applications of Artificial Intelligence (AI) that are extensively employed in the field of Orthodontics, to evaluate its benefits, and to discuss its potential implications in this speciality. Recent decades have witnessed enormous changes in our profession. The arrival of new and more aesthetic options in orthodontic treatment, the transition to a fully digital workflow, the emergence of temporary anchorage devices, and new imaging methods all provide both patients and professionals with a new focus in orthodontic care.

MATERIALS AND METHODS : This review was performed following the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines. The electronic literature search was performed through MEDLINE/PubMed, Scopus, Web of Science, Cochrane, and IEEE Xplore databases with a 11-year time restriction: January 2010 till March 2021. No additional manual searches were performed.

RESULTS : The electronic literature search initially returned 311 records, and 115 after removing duplicate references. Finally, the application of the inclusion criteria resulted in 17 eligible publications in the qualitative synthesis review.

CONCLUSION : The analyzed studies demonstrated that Convolutional Neural Networks can be used for the automatic detection of anatomical reference points on radiological images. In the growth and development research area, the Cervical Vertebral Maturation stage can be determined using an Artificial Neural Network model, obtaining the same results as expert human observers. AI technology can also improve the diagnostic accuracy of orthodontic treatments, thereby helping the orthodontist work more accurately and efficiently.

Monill-González Anna, Rovira-Calatayud Laia, d’Oliveira Nuno Gustavo, Ustrell-Torrent Josep M

2021-Jul-16

Artificial Intelligence, machine learning, orthodontics, review

Cardiology Cardiology

Distinct Phenotypes of Hospitalized Patients with Hyperkalemia by Machine Learning Consensus Clustering and Associated Mortality Risks.

In QJM : monthly journal of the Association of Physicians

BACKGROUND : Hospitalized patients with hyperkalemia are heterogeneous, and cluster approaches may identify specific homogeneous groups. This study aimed to cluster patients with hyperkalemia on admission using an unsupervised machine learning consensus clustering approach, and to compare characteristics and outcomes among these distinct clusters.

METHODS : Consensus cluster analysis was performed in 5,133 hospitalized adult patients with admission hyperkalemia, based on available clinical and laboratory data. The standardized mean difference was used to identify each cluster's key clinical features. The association of hyperkalemia clusters with hospital and one-year mortality was assessed using logistic and Cox proportional hazard regression.

RESULTS : Three distinct clusters of hyperkalemia patients were identified using consensus cluster analysis: 1,661 (32%) in cluster 1, 2,455 (48%) in cluster 2, and 1,017 (20%) in cluster 3. Cluster 1 was mainly characterized by older age, higher serum chloride, and acute kidney injury (AKI), but lower estimated glomerular filtration rate (eGFR), serum bicarbonate, and hemoglobin. Cluster 2 was mainly characterized by higher eGFR, serum bicarbonate, and hemoglobin, but lower comorbidity burden, serum potassium, and AKI. Cluster 3 was mainly characterized by higher comorbidity burden (particularly diabetes and end-stage kidney disease), AKI, serum potassium, and anion gap, but lower eGFR, serum sodium, chloride, and bicarbonate. Hospital and one-year mortality risk differed significantly among the three identified clusters, with the highest mortality in cluster 3, followed by cluster 1 and then cluster 2.

CONCLUSION : In a heterogeneous cohort of hyperkalemia patients, three distinct clusters were identified using unsupervised machine learning. These three clusters had different clinical characteristics and outcomes.

Thongprayoon Charat, Kattah Andrea G, Mao Michael A, Keddis Mira T, Pattharanitima Pattharawin, Vallabhajosyula Saraschandra, Nissaisorakarn Voravech, Erickson Stephen B, Dillon John J, Garovic Vesna D, Cheungpasitporn Wisit

2021-Jul-16

Artificial intelligence, Clustering, Hospitalization, Hyperkalemia, Machine Learning, Mortality, Potassium
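Consensus clustering generally works by repeatedly clustering random subsamples of the cohort and recording how often each pair of patients lands in the same cluster. The sketch below is a hypothetical, simplified version of that general technique using k-means as the base clusterer; the base algorithm, subsampling fraction, and run count are assumptions, not the study's exact configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

def consensus_matrix(X, k, n_runs=50, subsample=0.8, seed=0):
    """Co-association ("consensus") matrix from repeated k-means runs on
    random subsamples: entry (i, j) is the fraction of runs in which
    samples i and j were subsampled together AND assigned to the same
    cluster. Stable clusters show blocks of values near 1.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(subsample * n), replace=False)
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=int(rng.integers(1 << 31))).fit_predict(X[idx])
        sampled[np.ix_(idx, idx)] += 1
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
    return np.divide(together, sampled, out=np.zeros_like(together),
                     where=sampled > 0)

# Two well-separated toy "patient" groups of 20 samples each.
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
C = consensus_matrix(X, k=2)
# Within-group consensus is near 1; between-group consensus is near 0.
```

In practice the number of clusters is chosen by inspecting how cleanly the consensus matrix separates into blocks across candidate values of k.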

General General

Association of Snoring Characteristics with Predominant Site of Collapse of Upper Airway in Obstructive Sleep Apnoea Patients.

In Sleep

STUDY OBJECTIVES : Acoustic analysis of isolated events and snoring by previous researchers suggests a correlation between individual acoustic features and individual site of collapse events. In this study, we hypothesised that multi-parameter evaluation of snore sounds during natural sleep would provide a robust prediction of the predominant site of airway collapse.

METHODS : The audio signals of 58 OSA patients were recorded simultaneously with full night polysomnography. The site of collapse was determined by manual analysis of the shape of the airflow signal during hypopnoea events and corresponding audio signal segments containing snore were manually extracted and processed. Machine learning algorithms were developed to automatically annotate the site of collapse of each hypopnoea event into three classes (lateral wall, palate and tongue-base). The predominant site of collapse for a sleep period was determined from the individual hypopnoea annotations and compared to the manually determined annotations. This was a retrospective study that used cross-validation to estimate performance.

RESULTS : Cluster analysis showed that the data fit well into two clusters, with a mean silhouette coefficient of 0.79 and an accuracy of 68% for classifying tongue/non-tongue collapse. A classification model using linear discriminants achieved an overall accuracy of 81% for discriminating tongue/non-tongue predominant site of collapse and an accuracy of 64% across all site-of-collapse classes.

CONCLUSIONS : Our results reveal that the snore signal during hypopnoea can provide information regarding the predominant site of collapse in the upper airway. Therefore, the audio signal recorded during sleep could potentially be used as a new tool in identifying the predominant site of collapse and consequently improving the treatment selection and outcome.

Sebastian Arun, Cistulli Peter A, Cohen Gary, de Chazal Philip

2021-Jul-16

airflow signal, hypopnoea, machine learning, obstructive sleep apnoea, predominant site of collapse, snore recording
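The classification step described above (linear discriminants evaluated with cross-validation) can be sketched in a few lines. The features below are synthetic stand-ins for illustration only; the study's real inputs are multi-parameter acoustic features extracted from snore segments, and the fold count here is an assumption.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Synthetic two-class acoustic-feature data: 40 "non-tongue" and 40
# "tongue" events, each described by 5 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (40, 5)),   # class 0: non-tongue collapse
               rng.normal(2, 1, (40, 5))])  # class 1: tongue-base collapse
y = np.array([0] * 40 + [1] * 40)

# Linear discriminant classifier, accuracy estimated by 5-fold
# cross-validation, mirroring the retrospective evaluation design.
acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
```

Note that in the study, per-event predictions are further aggregated into a per-patient "predominant site" label before computing accuracy, a step omitted from this sketch.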

Internal Medicine Internal Medicine

Utilizing timestamps of longitudinal electronic health record data to classify clinical deterioration events.

In Journal of the American Medical Informatics Association : JAMIA

OBJECTIVE : To propose an algorithm that utilizes only timestamps of longitudinal electronic health record data to classify clinical deterioration events.

MATERIALS AND METHODS : This retrospective study explores the efficacy of machine learning algorithms in classifying clinical deterioration events among patients in intensive care units using sequences of timestamps of vital sign measurements, flowsheet comments, order entries, and nursing notes. We design a data pipeline to partition events into discrete, regular time bins that we refer to as timesteps. Logistic regressions, random forest classifiers, and recurrent neural networks are trained on datasets of different timestep lengths against a composite outcome of death, cardiac arrest, and Rapid Response Team calls. These models are then validated on a holdout dataset.

RESULTS : A total of 6720 intensive care unit encounters meet the criteria and the final dataset includes 830 578 timestamps. The gated recurrent unit model utilizes timestamps of vital signs, order entries, flowsheet comments, and nursing notes to achieve the best performance on the time-to-outcome dataset, with an area under the precision-recall curve of 0.101 (0.06, 0.137), a sensitivity of 0.443, and a positive predictive value of 0.092 at the threshold of 0.6.

DISCUSSION AND CONCLUSION : This study demonstrates that our recurrent neural network models, using only timestamps of longitudinal electronic health record data that reflect healthcare processes, achieve good discriminative performance.

Fu Li-Heng, Knaplund Chris, Cato Kenrick, Perotte Adler, Kang Min-Jeoung, Dykes Patricia C, Albers David, Collins Rossetti Sarah

2021-Jul-16

clinical informatics, early warning scores, machine learning, electronic health records, predictive modeling
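The pipeline step described above, partitioning raw event timestamps into regular bins ("timesteps") and counting events per bin per documentation type, can be sketched with the standard library alone. The one-hour bin width, event-type labels, and bin count below are illustrative assumptions, not the authors' exact design.

```python
from collections import Counter
from datetime import datetime, timedelta

def bin_timestamps(events, start, step=timedelta(hours=1), n_bins=6):
    """Partition (timestamp, event_type) pairs into regular time bins
    and return a feature matrix with one row per timestep and one
    column per event type (columns in sorted type order)."""
    counts = [Counter() for _ in range(n_bins)]
    for ts, kind in events:
        i = int((ts - start) / step)  # which timestep this event falls in
        if 0 <= i < n_bins:
            counts[i][kind] += 1
    kinds = sorted({k for _, k in events})
    return [[c[k] for k in kinds] for c in counts], kinds

# Toy encounter: five documentation events over a few hours.
start = datetime(2021, 7, 16, 8, 0)
events = [(start + timedelta(minutes=m), kind)
          for m, kind in [(5, "vital"), (20, "vital"), (70, "order"),
                          (75, "vital"), (130, "note")]]
X, kinds = bin_timestamps(events, start)
# X[0] counts events in hour 1, X[1] in hour 2, and so on.
```

Sequences of such rows are what a recurrent model (e.g. a gated recurrent unit) would consume, one row per timestep.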

General General

DeepADEMiner: a deep learning pharmacovigilance pipeline for extraction and normalization of adverse drug event mentions on Twitter.

In Journal of the American Medical Informatics Association : JAMIA

OBJECTIVE : Research on pharmacovigilance from social media data has focused on mining adverse drug events (ADEs) using annotated datasets, with publications generally focusing on 1 of 3 tasks: ADE classification, named entity recognition for identifying the span of ADE mentions, and ADE mention normalization to standardized terminologies. While the common goal of such systems is to detect ADE signals that can be used to inform public policy, progress has been impeded largely by the lack of end-to-end solutions for large-scale analysis of social media reports for different drugs.

MATERIALS AND METHODS : We present a dataset for training and evaluation of ADE pipelines where the ADE distribution is closer to the average 'natural balance' with ADEs present in about 7% of the tweets. The deep learning architecture involves an ADE extraction pipeline with individual components for all 3 tasks.

RESULTS : The system presented achieved state-of-the-art performance on comparable datasets and scored a classification performance of F1 = 0.63, span extraction performance of F1 = 0.44 and an end-to-end entity resolution performance of F1 = 0.34 on the presented dataset.

DISCUSSION : The performance of the models continues to highlight multiple challenges when deploying pharmacovigilance systems that use social media data. We discuss the implications of such models in the downstream tasks of signal detection and suggest future enhancements.

CONCLUSION : Mining ADEs from Twitter posts using a pipeline architecture requires the different components to be trained and tuned based on input data imbalance in order to ensure optimal performance on the end-to-end resolution task.

Magge Arjun, Tutubalina Elena, Miftahutdinov Zulfat, Alimova Ilseyar, Dirkson Anne, Verberne Suzan, Weissenbacher Davy, Gonzalez-Hernandez Graciela

2021-Jul-16

drug safety, information extraction, natural language processing, pharmacovigilance, social media mining
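The three-stage architecture described above (ADE classification, span extraction, normalization) composes into a single end-to-end function. The skeleton below illustrates that composition with a toy rule-based lexicon standing in for the deep learning models at each stage; the lexicon, concept codes, and rules are invented placeholders, not the DeepADEMiner components.

```python
import re

# Illustrative mention -> concept mapping (codes are placeholders standing
# in for a standardized terminology lookup).
ADE_LEXICON = {"headache": "C0018681", "nausea": "C0027497"}

def classify(tweet):
    """Stage 1: does the tweet report an ADE at all?"""
    return any(term in tweet.lower() for term in ADE_LEXICON)

def extract_spans(tweet):
    """Stage 2: locate the text spans of ADE mentions."""
    return [m.group(0) for term in ADE_LEXICON
            for m in re.finditer(term, tweet.lower())]

def normalize(span):
    """Stage 3: map a mention to a standardized concept identifier."""
    return ADE_LEXICON.get(span)

def pipeline(tweet):
    """End-to-end resolution: classification gates extraction, and each
    extracted span is normalized."""
    if not classify(tweet):
        return []
    return [(s, normalize(s)) for s in extract_spans(tweet)]
```

Because the stages are chained, errors compound, which is why the end-to-end F1 (0.34) is lower than each stage's individual F1, and why the authors stress tuning each component to the input imbalance.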

General General

768-ary Laguerre-Gaussian-mode shift keying free-space optical communication based on convolutional neural networks.

In Optics express

Beyond the orbital angular momentum of Laguerre-Gaussian (LG) modes, the radial index can also be exploited as an information channel in free-space optical (FSO) communication to extend the communication capacity, resulting in LG-mode shift keying (LG-SK) FSO communications. However, recognition of the radial index is critical and difficult when the superposed high-order LG modes are disturbed by atmospheric turbulence (AT). In this paper, a convolutional neural network (CNN) is utilized to recognize both the azimuthal and radial indices of superposed LG modes. We experimentally demonstrate the application of the CNN model in a 10-meter 768-ary LG-SK FSO communication system under AT of Cn² = 1 × 10⁻¹⁴ m⁻²/³. Based on the high recognition accuracy of the CNN model (>95%) in this scheme, a color image can be transmitted and the peak signal-to-noise ratio of the received image can exceed 35 dB. We anticipate that these results will stimulate further research on potential applications of LG modes with non-zero radial index in artificial-intelligence-enhanced optoelectronic systems.

Luan Haitao, Lin Dajun, Li Keyao, Meng Weijia, Gu Min, Fang Xinyuan

2021-Jun-21
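In an M-ary shift-keying scheme, each transmitted symbol selects one of M mode states, here one of 768 LG-mode configurations indexed by azimuthal and radial numbers. The abstract does not specify the exact constellation, so the grid sizes below are purely illustrative assumptions; the sketch only shows the generic symbol-to-mode mapping and the resulting bits per symbol.

```python
import math

# Hypothetical 768-symbol constellation enumerated over a radial index p
# and an azimuthal index l (the 8 x 96 split is an assumption, not the
# paper's actual mode set).
N_RADIAL, N_AZIMUTHAL = 8, 96  # 8 * 96 = 768 symbols

def symbol_to_mode(s):
    """Map a symbol index in 0..767 to an (p, l) LG-mode index pair."""
    assert 0 <= s < N_RADIAL * N_AZIMUTHAL
    return divmod(s, N_AZIMUTHAL)  # (radial index p, azimuthal index l)

# Capacity per symbol: log2(768) ~ 9.58 bits, versus at most log2 of the
# azimuthal-only alphabet size if the radial index were not exploited.
bits_per_symbol = math.log2(N_RADIAL * N_AZIMUTHAL)
```

The CNN's job in the reported system is the inverse mapping: classifying a received (turbulence-distorted) intensity pattern back to its symbol index.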

General General

The impact of artificial intelligence and digital style on industry and energy post-COVID-19 pandemic.

In Environmental science and pollution research international

The SARS-CoV-2 virus caused social, economic, energy, and medical crises worldwide throughout 2020. This crisis had many direct and indirect effects on all areas of society. Meanwhile, the digital and artificial intelligence industries can serve as professional assistants in managing and controlling the outbreak of the virus. The present article's objective is to investigate the effects of COVID-19 on each of the fields of medicine, industry, and energy. What set