Doctor Penguin

General

DeepLBCEPred: A Bi-LSTM and multi-scale CNN-based deep learning method for predicting linear B-cell epitopes.

In Frontiers in microbiology
The epitope is the site where antigens and antibodies interact and is vital to understanding the immune system. Experimental identification of linear B-cell epitopes (BCEs) is expensive, labor-intensive, and low-throughput. Although a few computational methods have been proposed to address this challenge, there is still a long way to go for practical applications. We proposed a deep learning method called DeepLBCEPred for predicting linear BCEs, which consists of bi-directional long short-term memory (Bi-LSTM), feed-forward attention, and multi-scale convolutional neural networks (CNNs). We extensively tested the performance of DeepLBCEPred through cross-validation on the training dataset and independent tests on two testing datasets. The empirical results showed that DeepLBCEPred obtained state-of-the-art performance. We also investigated the contribution of different deep learning elements to recognizing linear BCEs. In addition, we have developed a user-friendly web application for linear BCE prediction, which is freely available to all scientific researchers at: http://www.biolscience.cn/DeepLBCEPred/.
Qi Yue, Zheng Peijie, Huang Guohua

2023

B-cell, CNN, LSTM, epitope, protein sequence
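
The DeepLBCEPred abstract above names feed-forward attention as one of its components. As a rough illustration only, the sketch below pools a sequence of hidden vectors (such as Bi-LSTM outputs) into a single context vector. The tanh scoring function and the names `feed_forward_attention`, `weights`, and `bias` are assumptions made for this example; the abstract does not specify the authors' implementation.

```python
import math


def feed_forward_attention(hidden_states, weights, bias=0.0):
    """Pool T hidden vectors into one context vector.

    hidden_states: list of T vectors (lists of floats), e.g. Bi-LSTM outputs.
    weights, bias: parameters of a hypothetical one-layer scoring function.
    """
    # Score each time step: e_t = tanh(w . h_t + b)
    scores = [
        math.tanh(sum(w * h for w, h in zip(weights, h_t)) + bias)
        for h_t in hidden_states
    ]
    # Softmax over time steps (subtract the max for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Context vector: attention-weighted sum of the hidden states
    dim = len(hidden_states[0])
    context = [
        sum(a * h_t[d] for a, h_t in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return alphas, context
```

In a trained network the scoring parameters would be learned jointly with the other layers; here they are passed in by hand purely to show the arithmetic.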

Surgery

Changes in the gut microbiome of patients with type A aortic dissection.

In Frontiers in microbiology

OBJECTIVE : To investigate the characteristic changes in the gut microbiota of patients with type A aortic dissection (AAD) and provide a theoretical basis for future microbiome-oriented interventional studies.

METHODS : High-throughput 16S rDNA sequencing was performed on the stool samples of patients with and without (healthy control subjects) AAD. Using alpha and beta diversity analysis, we compared the gut microbiota composition of 20 patients with AAD and 20 healthy controls matched for gender, age, BMI, and geographical region. The accuracy of AAD prediction by differential microbiome was calculated using the random forest machine learning model. Targeted measurement of the plasma concentration of short-chain fatty acids (SCFAs), which are the main metabolites of the gut microbiome, was performed using gas chromatography-mass spectrometry (GC-MS). Spearman's correlation analysis was conducted to determine the relationships of gut microbiome and SCFAs with the clinical characteristics of subjects.
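
The alpha diversity measures named in this study (Shannon index and Chao1) can be sketched from a vector of per-taxon read counts. This is a minimal illustration, assuming the natural logarithm for Shannon and the bias-corrected form of Chao1; the abstract does not say which variants or software the authors used, and the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa observed exactly once and exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

For example, a sample with counts `[5, 1, 1, 2]` has four observed taxa, two singletons, and one doubleton, so this Chao1 estimate is 4 + 2 * 1 / (2 * 2) = 4.5.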

RESULTS : The differences in gut microbiota alpha diversity between patients with AAD and the healthy controls were not statistically significant (Shannon index: p = 0.19; Chao1: p = 0.4); however, the microbiota composition (beta diversity) was significantly different between the two groups (Anosim, p = 0.001). Bacteroidota was enriched at the phylum level, and the SCFA-producing genera Prevotella, Porphyromonas, Lachnospiraceae, and Ruminococcus and inflammation-related genera Fenollaria and Sutterella were enriched at the genus level in the AAD group compared with those in the control group. The random forest model could predict AAD from gut microbiota composition with an accuracy of 87.5% and the area-under-curve (AUC) of the receiver operating characteristic curve was 0.833. The SCFA content of patients with AAD was higher than that of the control group, with the difference being statistically significant (p < 0.05). The different microflora and SCFAs were positively correlated with inflammatory cytokines.

CONCLUSION : To the best of our knowledge, this is the first demonstration of the presence of significant differences in the gut microbiome of patients with AAD and healthy controls. The differential microbiome exhibited high predictive potential toward AAD and was positively correlated with inflammatory cytokines. Our results will assist in the development of preventive and therapeutic treatment methods for patients with AAD.

Jiang Fei, Cai Meiling, Peng Yanchun, Li Sailan, Liang Bing, Ni Hong, Lin Yanjuan

2023

16S rDNA sequencing, SCFAs, aortic dissection, gut microbiome, metabolomics

General

Machine learning-based ozone and PM2.5 forecasting: Application to multiple AQS sites in the Pacific Northwest.

In Frontiers in big data
Air quality in the Pacific Northwest (PNW) of the U.S. has generally been good in recent years, but unhealthy events were observed due to wildfires in summer or wood burning in winter. The current air quality forecasting system, which uses chemical transport models (CTMs), has had difficulty forecasting these unhealthy air quality events in the PNW. We developed a machine learning (ML) based forecasting system, which consists of two components, ML1 (random forest classifiers and multiple linear regression models) and ML2 (two-phase random forest regression model). Our previous study showed that the ML system provides reliable forecasts of O₃ at a single monitoring site in Kennewick, WA. In this paper, we expand the ML forecasting system to predict both O₃ in the wildfire season and PM2.5 in wildfire and cold seasons at all available monitoring sites in the PNW during 2017-2020, and evaluate our ML forecasts against the existing operational CTM-based forecasts. For O₃, both ML1 and ML2 are used to achieve the best forecasts, which was the case in our previous study: ML2 performs better overall (R² = 0.79), especially for low-O₃ events, while ML1 correctly captures more high-O₃ events. Compared to the CTM-based forecast, our O₃ ML forecasts reduce the normalized mean bias (NMB) from 7.6 to 2.6% and normalized mean error (NME) from 18 to 12% when evaluating against the observation. For PM2.5, ML2 performs the best and thus is used for the final forecasts. Compared to the CTM-based PM2.5, ML2 clearly improves PM2.5 forecasts for both wildfire season (May to September) and cold season (November to February): ML2 reduces NMB (-27 to 7.9% for wildfire season; 3.4 to 2.2% for cold season) and NME (59 to 41% for wildfire season; 67 to 28% for cold season) significantly and captures more high-PM2.5 events correctly.
Our ML air quality forecast system requires fewer computing resources and fewer input datasets, yet it provides forecasts that are more reliable than (or at least comparable to) the CTM-based forecast. It demonstrates that our ML system is a low-cost, reliable air quality forecasting system that can support regional/local air quality management.
Fan Kai, Dhammapala Ranil, Harrington Kyle, Lamb Brian, Lee Yunha

2023

PM2.5, air quality forecasts, machine learning, multiple linear regression, ozone, random forest
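
The ozone and PM2.5 forecasts above are scored with normalized mean bias (NMB) and normalized mean error (NME). The sketch below uses the definitions commonly applied in air quality model evaluation, NMB = Σ(M − O) / ΣO and NME = Σ|M − O| / ΣO, expressed in percent. The abstract does not spell out its formulas, so treat these as an assumption; the function and argument names are illustrative.

```python
def normalized_mean_bias(forecast, observed):
    """NMB in percent: sum of (forecast - observed) divided by sum of observed."""
    bias = sum(m - o for m, o in zip(forecast, observed))
    return 100.0 * bias / sum(observed)


def normalized_mean_error(forecast, observed):
    """NME in percent: sum of |forecast - observed| divided by sum of observed."""
    error = sum(abs(m - o) for m, o in zip(forecast, observed))
    return 100.0 * error / sum(observed)
```

NMB can be near zero while NME is large, because over- and under-predictions cancel in the bias but not in the error. That is why the two metrics are usually reported together, as in the abstract.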

General

Automatic measurement of exophthalmos based orbital CT images using deep learning.

In Frontiers in cell and developmental biology
Introduction: Objective, accurate, and efficient measurement of exophthalmos is imperative for diagnosing orbital diseases that cause abnormal degrees of exophthalmos (such as thyroid-related eye diseases) and for quantifying treatment effects. Methods: To address the limitations of existing clinical methods for measuring exophthalmos, such as poor reproducibility, low reliability, and subjectivity, we propose a method that uses deep learning and image processing techniques to measure the exophthalmos. The proposed method calculates two vertical distances: the distance from the apex of the anterior surface of the cornea to the highest protrusion point of the outer edge of the orbit in axial CT images, and the distance from the apex of the anterior surface of the cornea to the highest protrusion point of the upper and lower outer edges of the orbit in sagittal CT images. Results: Based on the dataset used, the results of the present method are in good agreement with those measured manually by clinicians, achieving a concordance correlation coefficient (CCC) of 0.9895 and an intraclass correlation coefficient (ICC) of 0.9698 on axial CT images while achieving a CCC of 0.9902 and an ICC of 0.9773 on sagittal CT images. Discussion: In summary, our method can provide a fully automated measurement of the exophthalmos based on orbital CT images. The proposed method is reproducible, shows high accuracy and objectivity, aids in the diagnosis of relevant orbital diseases, and can quantify treatment effects.
Zhang Yinghuai, Rao Jing, Wu Xingyang, Zhou Yongjin, Liu Guiqin, Zhang Hua

2023

CT images, deep learning, exophthalmos, orbital diseases, thyroid-associated ophthalmopathy
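
The exophthalmos study above reports agreement between automatic and manual measurements as a concordance correlation coefficient (CCC). A minimal sketch of Lin's CCC follows, assuming population (1/n) variances and covariance; the abstract does not state which estimator the authors used, and the function name is illustrative.

```python
def concordance_correlation(x, y):
    """Lin's CCC: 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variances and covariance
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Unlike the Pearson correlation, the CCC penalizes a constant offset between the two raters: measurements `[1, 2, 3]` and `[2, 3, 4]` are perfectly correlated, but their CCC is only 4/7, because every automatic value would be off by one unit.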

Oncology

Prolonged air leak after video-assisted thoracic anatomical pulmonary resections: a clinical predicting model based on data from the Italian VATS group registry, a machine learning approach.

In Journal of thoracic disease ; h5-index 52.0

BACKGROUND : Prolonged air leak (PAL) is a frequent complication after lung resection surgery and has a high clinical and economic impact. A useful risk predictor model can help recognize those patients who might benefit from additional preventive procedures. Currently, no risk model has sufficient discriminatory capacity to be used in common clinical practice. The aim of this study is to identify predictive risk factors for PAL after video-assisted thoracoscopic surgery (VATS) anatomical resections in the Italian VATS group database and to evaluate their clinical and statistical performance.

METHODS : We processed data collected in the second edition of the Italian VATS group registry. It includes patients that underwent a thoracoscopic anatomical resection for benign or malignant diseases, between November 2015 and December 2020. We used recursive feature elimination (RFE), using a backward selection process, to find the optimal combination of predictors. The study population was randomly split based on the outcome into a derivation (80%) and an internal validation cohort (20%). Discrimination of the model was measured using the area under the curve, or C-statistic. Calibration was displayed using a calibration plot and was measured using Emax and Eavg, the maximum and the average difference in predicted versus loess calibrated probabilities.
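
Two steps from these methods can be sketched in a few lines: an outcome-stratified 80/20 split and the C-statistic. This is an illustration under assumptions, not the registry analysis itself. The function names, the fixed seed, and the binary 0/1 outcome coding are all hypothetical, and recursive feature elimination and the calibration measures (Emax, Eavg) are not shown.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so each outcome class keeps roughly the same proportion."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for outcome in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == outcome]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        valid_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(valid_idx)


def c_statistic(labels, scores):
    """Probability that a random event case is scored above a random non-event.

    Ties count as half. For a binary outcome this equals the area under the ROC curve.
    """
    events = [s for y, s in zip(labels, scores) if y == 1]
    non_events = [s for y, s in zip(labels, scores) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))
```

A C-statistic of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome groups.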

RESULTS : A cohort of 6,236 patients was eligible for the study after application of the exclusion criteria. The five-day PAL rate in this patient cohort was 11.3%. For the construction of our predictive model, we used both preoperative and intraoperative variables, with a total of 320 variables. Excluding variables with more than 5% missing values left 120 predictors. The RFE algorithm selected 8 features that are relevant in predicting the target variable.

CONCLUSIONS : We confirmed significant prognostic risk factors for the prediction of PAL: decreased DLCO/VA ratio, longer duration of surgery, male sex, the need for adhesiolysis, COPD, and right side. We identified middle lobe resections and ground glass opacity as protective factors. After internal validation, a C statistic of 0.63 was revealed, which is too low to generate a reliable score in clinical practice.

Divisi Duilio, Pipitone Marco, Perkmann Reinhold, Bertolaccini Luca, Curcio Carlo, Baldinelli Francesco, Crisci Roberto, Zaraca Francesco

2023-Feb-28

Prolonged air leak (PAL), risk factors, risk predictive model, video-assisted thoracoscopic surgery lobectomy (VATS lobectomy)

Surgery

Value of contrast-enhanced magnetic resonance imaging-T2WI-based radiomic features in distinguishing lung adenocarcinoma from lung squamous cell carcinoma with solid components >8 mm.

In Journal of thoracic disease ; h5-index 52.0

BACKGROUND : Radiomics is one of the research frontiers in the field of imaging and has excellent diagnostic performance. However, there is a lack of magnetic resonance imaging (MRI)-based omics studies on identifying pathological subtypes of lung cancer. Here we explored the value of the contrast-enhanced MRI-T2-weighted imaging (T2WI)-based radiomic analysis in distinguishing adenocarcinoma (Ade) from squamous cell carcinoma (Squ) with solid components >8 mm.

METHODS : A retrospective analysis was performed of a total of 71 lung cancer patients whose nodules had solid components ≥8 mm and who underwent contrast-enhanced MRI and computed tomography (CT) before treatment in our center from January 2020 to September 2021. All enrolled patients were divided into Squ and Ade groups according to the pathological results. In addition, the two groups were randomly divided into a training set and a validation set in a ratio of about 7:3. Radiomics software was used to extract the relevant radiomic features. The least absolute shrinkage and selection operator (Lasso) was used to screen the radiomic features that were most relevant to lung cancer subtypes, from which the radiomic scores (Rad-score) were calculated and the radiomic models constructed. Multivariate logistic regression was used to combine relevant clinical features with the Rad-score to form combined model nomograms. The receiver operating characteristic (ROC) curves, the area under the ROC curve (AUC), decision curve analysis (DCA), and DeLong's test were used to evaluate the clinical application potential.
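
A Rad-score of the kind described here is commonly computed as a linear combination of the features that Lasso keeps, meaning those with non-zero coefficients. The sketch below assumes that form. The abstract does not give the formula, so the function name, the feature names in the usage note, and the intercept are hypothetical.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the features whose Lasso coefficient is non-zero.

    features: dict mapping feature name to its (typically standardized) value.
    coefficients: dict mapping feature name to its Lasso coefficient.
    """
    return intercept + sum(
        coef * features[name]
        for name, coef in coefficients.items()
        if coef != 0.0  # Lasso shrinks the coefficients of dropped features to zero
    )
```

For instance, with made-up values `features = {"a": 2.0, "b": 3.0, "c": 5.0}`, `coefficients = {"a": 0.5, "b": 0.0, "c": -0.2}`, and `intercept = 0.1`, feature `b` drops out and the score is 0.1 + 1.0 - 1.0 = 0.1.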

RESULTS : The sensitivity and specificity of the clinical model based on smoking were 75.0% and 93.8%, respectively. The AUC of the constructed magnetic resonance (MR)-Rad model for differentiating the pathological subtypes of lung cancer was 0.8651 in the validation set. The AUC of the CT-Rad model in the validation set was 0.9286. The combined model, constructed by combining clinical features and the Rad-score, had an AUC of 0.8016 for identifying the two pathological subtypes of lung cancer in the validation set. There was no significant difference in diagnostic performance between the MR-Rad model and the CT-Rad model (P>0.05).

CONCLUSIONS : The MR-Rad model has a diagnostic performance similar to that of the CT-Rad model, while the diagnostic performance of the combined model was better than that of the single MR model.

Yang Maoyuan, Shi Liang, Huang Tianwei, Li Guangzheng, Shao Hancheng, Shen Yijun, Zhu Jun, Ni Bin

2023-Feb-28

Radiomics, lung adenocarcinoma, lung squamous cell carcinoma, machine learning, nomogram