Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

General

Automated prioritization of sick newborns for whole genome sequencing using clinical natural language processing and machine learning.

In Genome medicine; h5-index 64.0

BACKGROUND : Rapidly and efficiently identifying critically ill infants for whole genome sequencing (WGS) is a costly and challenging task currently performed by scarce, highly trained experts and is a major bottleneck for application of WGS in the NICU. There is a dire need for automated means to prioritize patients for WGS.

METHODS : Institutional databases of electronic health records (EHRs) are logical starting points for identifying patients with undiagnosed Mendelian diseases. We have developed automated means to prioritize patients for rapid whole genome sequencing (rWGS) and WGS directly from clinical notes. Our approach combines a clinical natural language processing (CNLP) workflow with a machine learning-based prioritization tool named Mendelian Phenotype Search Engine (MPSE).
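The abstract does not describe MPSE's internals; as a loose illustration of phenotype-driven prioritization from clinical notes, the sketch below extracts phenotype terms by keyword matching and ranks patients by a weighted score. The lexicon, weights, and function names are hypothetical stand-ins, not MPSE's actual pipeline.

```python
import re

# Hypothetical mini-lexicon mapping note keywords to HPO-style phenotype terms.
# Real CNLP workflows are far richer; this is only a sketch.
LEXICON = {
    "seizure": "HP:0001250",
    "hypotonia": "HP:0001252",
    "cardiomyopathy": "HP:0001638",
}

# Hypothetical per-phenotype weights standing in for a trained model's coefficients.
WEIGHTS = {"HP:0001250": 2.0, "HP:0001252": 1.5, "HP:0001638": 2.5}

def extract_phenotypes(note: str) -> set:
    """Naive keyword matching standing in for a clinical NLP pipeline."""
    note = note.lower()
    return {hpo for kw, hpo in LEXICON.items() if re.search(rf"\b{kw}\b", note)}

def priority_score(note: str) -> float:
    """Sum phenotype weights; a higher score suggests prioritizing for WGS."""
    return sum(WEIGHTS[t] for t in extract_phenotypes(note))

notes = {
    "A": "Term infant with recurrent seizure activity and marked hypotonia.",
    "B": "Mild transient tachypnea, otherwise unremarkable exam.",
}
ranked = sorted(notes, key=lambda p: priority_score(notes[p]), reverse=True)
```

A production system would replace the keyword lexicon with a real concept extractor and the weight table with a classifier trained on expert WGS selections.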

RESULTS : MPSE accurately and robustly identified NICU patients selected for WGS by clinical experts from Rady Children's Hospital in San Diego (AUC 0.86) and the University of Utah (AUC 0.85). In addition to effectively identifying patients for WGS, MPSE scores also strongly prioritize diagnostic cases over non-diagnostic cases, with projected diagnostic yields exceeding 50% throughout the first and second quartiles of score-ranked patients.

CONCLUSIONS : Our results indicate that an automated pipeline for selecting acutely ill infants in neonatal intensive care units (NICUs) for WGS can meet or exceed diagnostic yields obtained through current selection procedures, which require time-consuming manual review of clinical notes and histories by specialized personnel.

Peterson Bennet, Hernandez Edgar Javier, Hobbs Charlotte, Malone Jenkins Sabrina, Moore Barry, Rosales Edwin, Zoucha Samuel, Sanford Erica, Bainbridge Matthew N, Frise Erwin, Oriol Albert, Brunelli Luca, Kingsmore Stephen F, Yandell Mark

2023-Mar-16

Public Health

DFFNDDS: prediction of synergistic drug combinations with dual feature fusion networks.

In Journal of cheminformatics

Drug combination therapies are promising clinical treatments. However, efficiently identifying valid drug combinations remains challenging because the number of available drugs has grown rapidly. In this study, we proposed a deep learning model called the Dual Feature Fusion Network for Drug-Drug Synergy prediction (DFFNDDS) that utilizes a fine-tuned pretrained language model and a dual feature fusion mechanism to predict synergistic drug combinations. The dual feature fusion mechanism fuses the drug features and cell line features at the bit-wise level and the vector-wise level. We demonstrated that DFFNDDS outperforms competitive methods and can serve as a reliable tool for identifying synergistic drug combinations.
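As a minimal sketch of the two fusion granularities named above (not DFFNDDS's actual architecture), bit-wise fusion can be read as element-wise interaction of embeddings and vector-wise fusion as whole-vector concatenation; the embedding dimension and random vectors below are arbitrary placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned embeddings for two drugs and a cell line (dim 8).
drug_a, drug_b, cell = rng.normal(size=(3, 8))

def bitwise_fusion(*feats: np.ndarray) -> np.ndarray:
    """Bit-wise (element-wise) fusion: interactions per embedding dimension."""
    out = feats[0].copy()
    for f in feats[1:]:
        out *= f
    return out

def vectorwise_fusion(*feats: np.ndarray) -> np.ndarray:
    """Vector-wise fusion: concatenate whole vectors for a downstream head."""
    return np.concatenate(feats)

fused_bit = bitwise_fusion(drug_a, drug_b, cell)     # shape (8,)
fused_vec = vectorwise_fusion(drug_a, drug_b, cell)  # shape (24,)
# A real model would feed both fused views into an MLP that outputs a synergy score.
```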

Xu Mengdie, Zhao Xinwei, Wang Jingyu, Feng Wei, Wen Naifeng, Wang Chunyu, Wang Junjie, Liu Yun, Zhao Lingling

2023-Mar-16

Deep learning, Drug combination, Dual-feature fusion, Synergistic effect

General

B-LBConA: a medical entity disambiguation model based on Bio-LinkBERT and context-aware mechanism.

In BMC bioinformatics

BACKGROUND : The main task of medical entity disambiguation is to link mentions, such as diseases, drugs, or complications, to standard entities in the target knowledge base. To our knowledge, models based on Bidirectional Encoder Representations from Transformers (BERT) have achieved good results in this task. Unfortunately, these models only consider text in the current document, fail to capture dependencies with other documents, and lack sufficient mining of hidden information in contextual texts.

RESULTS : We propose B-LBConA, which is based on Bio-LinkBERT and context-aware mechanism. Specifically, B-LBConA first utilizes Bio-LinkBERT, which is capable of learning cross-document dependencies, to obtain embedding representations of mentions and candidate entities. Then, cross-attention is used to capture the interaction information of mention-to-entity and entity-to-mention. Finally, B-LBConA incorporates disambiguation clues about the relevance between the mention context and candidate entities via the context-aware mechanism.
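The mention-to-entity and entity-to-mention interactions described above can be illustrated with plain scaled dot-product cross-attention; this is a generic sketch with toy dimensions, not B-LBConA's exact layers:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries: np.ndarray, keys_values: np.ndarray) -> np.ndarray:
    """Attend from one token sequence (e.g. a mention) to another (an entity)."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_kv) similarities
    return softmax(scores, axis=-1) @ keys_values   # weighted summary per query token

rng = np.random.default_rng(1)
mention = rng.normal(size=(3, 4))   # 3 mention tokens, embedding dim 4 (toy sizes)
entity = rng.normal(size=(5, 4))    # 5 candidate-entity tokens

m2e = cross_attention(mention, entity)  # mention-to-entity interaction
e2m = cross_attention(entity, mention)  # entity-to-mention interaction
```

In the full model these interaction features would be combined with the context-aware relevance clues before candidate ranking.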

CONCLUSIONS : Experimental results on three publicly available datasets, NCBI, ADR and ShARe/CLEF, show that B-LBConA achieves significantly higher accuracy than existing models.

Yang Siyu, Zhang Peiliang, Che Chao, Zhong Zhaoqian

2023-Mar-16

Bio-LinkBERT, Candidate ranking, Cross-attention, ELMo, Medical entity disambiguation

General

Prediction model for biochar energy potential based on biomass properties and pyrolysis conditions derived from rough set machine learning.

In Environmental technology

Biochar is a high-carbon organic material with potential applications in energy storage and conversion. It can be produced from a variety of biomass feedstocks, such as plant-based, animal-based, and municipal waste, under different pyrolysis conditions. However, it is difficult to produce biochar on a large scale unless the relationship between the type of biomass, operating conditions, and biochar properties is well understood. Hence, machine learning-based data analysis is needed to relate biochar production parameters and feedstock properties to biochar energy properties. In this work, a rough set-based machine learning (RSML) approach has been applied to generate decision rules and classify biochar properties. The condition attributes were biomass properties (volatile matter, fixed carbon, ash content, carbon, hydrogen, nitrogen, oxygen) and pyrolysis conditions (operating temperature, heating rate, residence time), while the decision attributes considered were yield, carbon content, and higher heating value. The rules generated were tested against a set of validation data and evaluated for scientific coherency. Based on the decision rules generated, biomass with an ash content of 11 to 14 wt%, volatile matter of 60 to 62 wt%, and carbon content of 42 to 45.3 wt% can generate biochar with promising yield, carbon content, and higher heating value via pyrolysis at operating temperatures of 425°C to 475°C. This work provides optimal biomass feedstock properties and pyrolysis conditions for biochar production with high mass and energy yields.
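The headline decision rule reported in the abstract translates directly into a predicate; treating the reported ranges as inclusive bounds, and the function name itself, are assumptions for illustration:

```python
def promising_biochar(ash_wt: float, volatile_matter_wt: float,
                      carbon_wt: float, temperature_c: float) -> bool:
    """Check the rough-set-derived rule for promising biochar yield, carbon
    content, and higher heating value (compositions in wt%, temperature in °C)."""
    return (11.0 <= ash_wt <= 14.0
            and 60.0 <= volatile_matter_wt <= 62.0
            and 42.0 <= carbon_wt <= 45.3
            and 425.0 <= temperature_c <= 475.0)

# A feedstock/condition set inside the reported windows vs. one outside them.
inside = promising_biochar(12.0, 61.0, 44.0, 450.0)
outside = promising_biochar(5.0, 61.0, 44.0, 450.0)
```

Rough set analysis typically yields many such rules; this encodes only the single rule quoted in the abstract.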

Tang Jia Yong, Chung Boaz Yi Heng, Ang Jia Chun, Chong Jia Wen, Tan Raymond R, Aviso Kathleen B, Chemmangattuvalappil Nishanth G, Thangalazhy-Gopakumar Suchithra

2023-Mar-17

biochar, biochar yield, carbon content, higher heating value, rough set machine learning

General

Modeling of methylene blue removal on Fe3O4 modified activated carbon with artificial neural network (ANN).

In International journal of phytoremediation

In this study, an AC/Fe3O4 adsorbent was first synthesized by modifying activated carbon with Fe3O4. The structure of the adsorbent was then characterized using specific surface area analysis (BET), Scanning Electron Microscopy with Energy-Dispersive X-ray Spectroscopy (SEM-EDX), and Fourier Transform Infrared Spectroscopy (FTIR). Equilibrium, thermodynamic, and kinetic studies were carried out on the removal of methylene blue (MB) dyestuff from aqueous solutions using the AC/Fe3O4 adsorbent. The Langmuir maximum adsorption capacity of AC/Fe3O4 was 312.8 mg g-1, and the best fit was observed with the pseudo-second-order kinetic model, with an endothermic adsorption process. In the final stage of the study, the adsorption of MB on AC/Fe3O4 was modeled using an artificial neural network (ANN). Based on the smallest mean square error (MSE), the backpropagation neural network was configured as a three-layer ANN with a tangent sigmoid transfer function (Tansig) at the hidden layer with 10 neurons, a linear transfer function (Purelin) at the output layer, and the Levenberg-Marquardt backpropagation training algorithm (LMA). Input parameters included initial solution pH (2.0-9.0), adsorbent amount (0.05-0.5 g L-1), temperature (298-318 K), contact time (5-180 min), and concentration (50-500 mg L-1). The effect of each parameter on the removal and adsorption percentages was evaluated. The performance of the ANN model was tuned by changing parameters such as the number of neurons in the hidden layer, the number of inputs, and the learning coefficient. The mean absolute percentage error (MAPE) was used to evaluate the model's accuracy for the removal and adsorption percentage outputs. The absolute fraction of variance (R2) values were 99.83, 99.36, and 98.26% for the training, validation, and test sets, respectively.
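The network topology described above (five inputs, a 10-neuron tansig hidden layer, a purelin output) corresponds to the forward pass sketched below; the weights here are random placeholders rather than the Levenberg-Marquardt-trained values, and no input scaling is shown:

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder weights standing in for a trained network:
# 5 inputs (pH, dose, temperature, time, concentration) -> 10 tanh units -> 1 linear output.
W1, b1 = rng.normal(size=(10, 5)), rng.normal(size=10)
W2, b2 = rng.normal(size=(1, 10)), rng.normal(size=1)

def forward(x: np.ndarray) -> float:
    """Tansig (tanh) hidden layer followed by a purelin (identity) output layer."""
    hidden = np.tanh(W1 @ x + b1)
    return (W2 @ hidden + b2).item()

# One illustrative input: pH, dose (g/L), temperature (K), time (min), conc. (mg/L).
x = np.array([7.0, 0.2, 298.0, 60.0, 100.0])
y = forward(x)
```

In practice inputs are normalized before training, and the Levenberg-Marquardt algorithm fits W1, b1, W2, b2 by minimizing the MSE mentioned in the abstract.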

Altintig Esra, Özcelik Tijen Över, Aydemir Zeynep, Bozdag Dilay, Kilic Eren, Yılmaz Yalçıner Ayten

2023-Mar-17

Dye removal, adsorption, artificial neural networks, desorption, estimation, methylene blue

Surgery

Deep Learning Prediction for Distal Aortic Remodeling After Thoracic Endovascular Aortic Repair in Stanford Type B Aortic Dissection.

In Journal of endovascular therapy: an official journal of the International Society of Endovascular Specialists

PURPOSE : This study aimed to develop a deep learning model for predicting distal aortic remodeling after proximal thoracic endovascular aortic repair (TEVAR) in patients with Stanford type B aortic dissection (TBAD) using computed tomography angiography (CTA).

METHODS : A total of 147 patients with acute or subacute TBAD who underwent proximal TEVAR at a single center were retrospectively reviewed. The boundary of aorta was manually segmented, and the point clouds of each aorta were obtained. Prediction of negative aortic remodeling or reintervention was accomplished by a convolutional neural network (CNN) and a point cloud neural network (PC-NN), respectively. The discriminatory value of the established models was mainly evaluated by the area under the receiver operating characteristic curve (AUC) in the test set.
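The abstract does not detail the PC-NN architecture, but a standard PointNet-style ingredient for point clouds is a shared per-point transform followed by symmetric max pooling, which makes the global feature invariant to point ordering; everything below is an illustrative sketch with random stand-in weights:

```python
import numpy as np

rng = np.random.default_rng(3)

def point_cloud_feature(points: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Shared per-point ReLU transform, then order-invariant max pooling."""
    per_point = np.maximum(points @ W, 0.0)  # same 'MLP' applied to every point
    return per_point.max(axis=0)             # symmetric aggregation over points

points = rng.normal(size=(100, 3))  # toy stand-in for segmented aortic surface points
W = rng.normal(size=(3, 16))        # hypothetical learned weights

feat = point_cloud_feature(points, W)
shuffled = point_cloud_feature(points[rng.permutation(100)], W)
# The global feature is unchanged when the points are reordered.
```

This order invariance is one reason point-cloud networks suit irregular anatomical surfaces better than grid-based CNNs, consistent with the PC-NN's higher AUCs reported below.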

RESULTS : The mean follow-up time was 34.0 months (range: 12-108 months). During follow-up, a total of 25 (17.0%) patients were identified as having negative aortic remodeling, and 16 (10.9%) patients received reintervention. The AUC (0.876) by PC-NN for predicting negative aortic remodeling was superior to that obtained by CNN (0.612, p=0.034) and similar to the AUC by PC-NN combined with clinical features (0.884, p=0.92). As to reintervention, the AUC by PC-NN was significantly higher than that by CNN (0.805 vs 0.579; p=0.042), and AUCs by PC-NN combined with clinical features and PC-NN alone were comparable (0.836 vs 0.805; p=0.81).

CONCLUSION : The CTA-based deep learning algorithms may assist clinicians in automated prediction of distal aortic remodeling after TEVAR for acute or subacute TBAD.

CLINICAL IMPACT : Negative aortic remodeling is the leading cause of late reintervention after proximal thoracic endovascular aortic repair (TEVAR) for Stanford type B aortic dissection (TBAD) and poses a great challenge to endovascular repair. Early recognition of high-risk patients is of great importance for optimizing the follow-up interval and treatment strategy. Currently, clinicians predict the prognosis of these patients from several imaging signs, which is subjective. Computed tomography angiography-based deep learning algorithms can incorporate rich morphological information about the aorta, provide a definite and objective output value, and ultimately assist clinicians in automated prediction of distal aortic remodeling after TEVAR for acute or subacute TBAD.

Zhou Min, Luo Xiaoyuan, Wang Xia, Xie Tianchen, Wang Yonggang, Shi Zhenyu, Wang Manning, Fu Weiguo

2023-Mar-16

aortic dissection, aortic remodeling, computed tomography angiography, deep learning