Doctor Penguin

Receive a weekly summary and discussion of the top papers of the week by leading researchers in the field.

Pathology

Renal enhanced CT images reveal the tandem mechanism between tumor cells and immunocytes based on bulk/single-cell RNA sequencing.

In Functional & integrative genomics
Metabolic reprogramming is essential for establishing the tumor microenvironment (TME). Glutamine has been implicated in cancer metabolism, but its role in clear cell renal carcinoma (ccRCC) remains unknown. Transcriptome data of patients with ccRCC and single-cell RNA sequencing (scRNA-seq) data were obtained from The Cancer Genome Atlas (TCGA, 539 ccRCC samples and 59 normal samples) database and GSE152938 (5 ccRCC samples). Differentially expressed genes related to glutamine metabolism (GRGs) were obtained from the MSigDB database. Consensus cluster analysis distinguished metabolism-related ccRCC subtypes. LASSO-Cox regression analysis was used to construct a metabolism-related prognostic model. The ssGSEA and ESTIMATE algorithms evaluated the level of immune cell infiltration in the TME, and the immunotherapy sensitivity score was obtained from TIDE. Cell-cell communication analysis was used to observe the distribution and effects of the target genes in the cell subsets. An image genomics model was constructed using imaging feature extraction and a machine learning algorithm. Results: Fourteen GRGs were identified. Overall survival and progression-free survival rates were lower in metabolic cluster 2, compared with those in cluster 1. The matrix/ESTIMATE/immune score in C1 decreased, but tumor purity in C2 increased. Immune cells were more active in the high-risk group, in which CD8+ T cells, follicular helper T cells, Th1 cells, and Th2 cells were significantly higher than those in the low-risk group. The expression levels of immune checkpoints were also significantly different between the two groups. RIMKL mainly appeared in epithelial cells in the single-cell analysis. ARHGAP11B was sparsely distributed. The imaging genomics model proved effective in aiding with clinical decisions. Glutamine metabolism plays a crucial role in the formation of immune TMEs in ccRCC. It is effective in differentiating the risk and predicting survival in patients with ccRCC. Imaging features can be used as new biomarkers for predicting ccRCC immunotherapy.
Liang Haote, Wu Keming, Wu Rongrong, Huang KaTe, Deng Zhexian, Chen Hongde

2023-Mar-18

Clear cell renal carcinoma, Enhanced CT images, Metabolic reprogramming, Single-cell RNA sequencing, Tumor microenvironment

Public Health

Improved Decision Making for Water Lead Testing in U.S. Child Care Facilities Using Machine-Learned Bayesian Networks.

In Environmental science & technology ; h5-index 132.0
Tap water lead testing programs in the U.S. need improved methods for identifying high-risk facilities to optimize limited resources. In this study, machine-learned Bayesian network (BN) models were used to predict building-wide water lead risk in over 4,000 child care facilities in North Carolina according to maximum and 90th percentile lead levels from water lead concentrations at 22,943 taps. The performance of the BN models was compared to common alternative risk factors, or heuristics, used to inform water lead testing programs among child care facilities including building age, water source, and Head Start program status. The BN models identified a range of variables associated with building-wide water lead, with facilities that serve low-income families, rely on groundwater, and have more taps exhibiting greater risk. Models predicting the probability of a single tap exceeding each target concentration performed better than models predicting facilities with clustered high-risk taps. The BN models' F_β-scores outperformed each of the alternative heuristics by 118-213%. This represents up to a 60% increase in the number of high-risk facilities that could be identified and up to a 49% decrease in the number of samples that would need to be collected by using BN model-informed sampling compared to using simple heuristics. Overall, this study demonstrates the value of machine-learning approaches for identifying high water lead risk that could improve lead testing programs nationwide.
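For readers unfamiliar with the F_β-score used to compare the BN models with the heuristics, here is a minimal sketch of the standard definition. The abstract does not state which β was used, and the counts below are hypothetical, not taken from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Standard F_beta score from confusion-matrix counts.

    beta > 1 weights recall (finding high-risk facilities) more heavily
    than precision (avoiding unnecessary sampling); beta = 1 is the F1 score.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives and no positive predictions.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts (assumption, for illustration only):
# 8 high-risk facilities correctly flagged, 8 false alarms, 2 missed.
print(round(f_beta(8, 8, 2, beta=1.0), 3))
print(round(f_beta(8, 8, 2, beta=2.0), 3))
```

With a recall-weighted β, a screening rule that misses few high-risk facilities scores higher even if it triggers some extra sampling.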
Mulhern Riley E, Kondash A J, Norman Ed, Johnson Joseph, Levine Keith, McWilliams Andrea, Napier Melanie, Weber Frank, Stella Laurie, Wood Erica, Lee Pow Jackson Crystal, Colley Sarah, Cajka Jamie, MacDonald Gibson Jacqueline, Hoponick Redmon Jennifer

2023-Mar-18

children’s health, drinking water, lead, machine learning, risk assessment

Radiology

Prediction of Renal Function 1 Year After Transplantation Using Machine Learning Methods Based on Ultrasound Radiomics Combined With Clinical and Imaging Features.

In Ultrasonic imaging
Kidney transplantation is the most effective treatment for advanced chronic kidney disease (CKD). If the prognosis of transplantation can be predicted early after transplantation, it might improve the long-term survival of patients with transplanted kidneys. Currently, studies on the assessment and prediction of renal function by radiomics are limited. Therefore, the present study aimed to explore the value of ultrasound (US)-based imaging and radiomics features, combined with clinical features to develop and validate the models for predicting transplanted kidney function after 1 year (TKF-1Y) using different machine learning algorithms. A total of 189 patients were included and classified into the abnormal TKF-1Y group, and the normal TKF-1Y group based on their estimated glomerular filtration rate (eGFR) levels 1 year after transplantation. The radiomics features were derived from the US images of each case. Three machine learning methods were employed to establish different models for predicting TKF-1Y using selected clinical and US imaging as well as radiomics features from the training set. Two US imaging, four clinical, and six radiomics features were selected. Then, the clinical (including clinical and US image features), radiomics, and combined models were developed. The area under the curves (AUCs) of the models was 0.62 to 0.82 within the test set. Combined models showed statistically higher AUCs than the radiomics models (all p-values <.05). The prediction performance of different models was not significantly affected by the different machine learning algorithms (all p-values >.05). In conclusion, US imaging features combined with clinical features could predict TKF-1Y and yield an incremental value over radiomics features. A model integrating all available features may further improve the predictive efficacy. Different machine learning algorithms may not have a significant impact on the predictive performance of the model.
Zhu Lili, Huang Renjun, Zhou Zhiyong, Fan Qingmin, Yan Junchen, Wan Xiaojing, Zhao Xiaojun, He Yao, Dong Fenglin

2023-Mar-18

chronic kidney disease, kidney transplantation, machine learning, radiomics, ultrasound

General

Nonparametric failure time: Time-to-event machine learning with heteroskedastic Bayesian additive regression trees and low information omnibus Dirichlet process mixtures.

In Biometrics
Many popular survival models rely on restrictive parametric, or semi-parametric, assumptions that could provide erroneous predictions when the effects of covariates are complex. Modern advances in computational hardware have led to an increasing interest in flexible Bayesian nonparametric methods for time-to-event data such as Bayesian additive regression trees (BART). We propose a novel approach that we call nonparametric failure time (NFT) BART in order to increase the flexibility beyond accelerated failure time (AFT) and proportional hazard models. NFT BART has three key features: 1) a BART prior for the mean function of the event time logarithm; 2) a heteroskedastic BART prior to deduce a covariate-dependent variance function; and 3) a flexible nonparametric error distribution using Dirichlet process mixtures (DPM). Our proposed approach widens the scope of hazard shapes including non-proportional hazards, can be scaled up to large sample sizes, naturally provides estimates of uncertainty via the posterior and can be seamlessly employed for variable selection. We provide convenient, user-friendly, computer software that is freely available as a reference implementation. Simulations demonstrate that NFT BART maintains excellent performance for survival prediction especially when AFT assumptions are violated by heteroskedasticity. We illustrate the proposed approach on a study examining predictors for mortality risk in patients undergoing hematopoietic stem cell transplant (HSCT) for blood-borne cancer, where heteroskedasticity and non-proportional hazards are likely present.
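The three features listed in the abstract can be read as one model for the log event time. The notation below is an assumption for illustration (the abstract gives no formulas): f is the mean function, s the covariate-dependent scale, and G the error distribution.

```latex
\begin{aligned}
\log t_i &= f(x_i) + s(x_i)\,\varepsilon_i, \\
f &\sim \text{BART}, \qquad s^2 \sim \text{heteroskedastic BART}, \\
\varepsilon_i \mid G &\sim G, \qquad G \sim \text{DPM}.
\end{aligned}
```

Under this reading, a standard AFT model is the special case where s(x) is constant and the error distribution is a fixed parametric family.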
Sparapani R A, Logan B R, Maiers M J, Laud P W, McCulloch R E

2023-Mar-18

Accelerated failure time, BART, LIO prior hierarchy, Thompson sampling variable selection, constrained DPM, hematopoietic stem cell transplant, non-proportional hazards, survival analysis

Surgery

Can a Novel Natural Language Processing Model and Artificial Intelligence Automatically Generate Billing Codes From Spine Surgical Operative Notes?

In Global spine journal

STUDY DESIGN : Retrospective cohort.

OBJECTIVE : Billing and coding-related administrative tasks are a major source of healthcare expenditure in the United States. We aim to show that a second-iteration Natural Language Processing (NLP) machine learning algorithm, XLNet, can automate the generation of CPT codes from operative notes in ACDF, PCDF, and CDA procedures.

METHODS : We collected 922 operative notes from patients who underwent ACDF, PCDF, or CDA from 2015 to 2020 and included CPT codes generated by the billing code department. We trained XLNet, a generalized autoregressive pretraining method, on this dataset and tested its performance by calculating AUROC and AUPRC.
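As a rough illustration of the two metrics named above, the sketch below computes AUROC and average precision (a common step-wise summary of the precision-recall curve) for a single billing code. It is not the authors' evaluation code; the labels and scores are made up, and tied scores are not specially handled in the average-precision function.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Hypothetical data: 1 = the CPT code was assigned by the billing department.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(auroc(y_true, y_score), average_precision(y_true, y_score))
```

In a multi-label setting such as CPT coding, metrics like these are typically computed per code and then summarized, which would be consistent with the per-class ranges reported in the results.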

RESULTS : The performance of the model approached human accuracy. Trial 1 (ACDF) achieved an AUROC of .82 (range: .48-.93), an AUPRC of .81 (range: .45-.97), and class-by-class accuracy of 77% (range: 34%-91%); trial 2 (PCDF) achieved an AUROC of .83 (.44-.94), an AUPRC of .70 (.45-.96), and class-by-class accuracy of 71% (42%-93%); trial 3 (ACDF and CDA) achieved an AUROC of .95 (.68-.99), an AUPRC of .91 (.56-.98), and class-by-class accuracy of 87% (63%-99%); trial 4 (ACDF, PCDF, CDA) achieved an AUROC of .95 (.76-.99), an AUPRC of .84 (.49-.99), and class-by-class accuracy of 88% (70%-99%).

CONCLUSIONS : We show that the XLNet model can be successfully applied to orthopedic surgeons' operative notes to generate CPT billing codes. As NLP models as a whole continue to improve, billing can be greatly augmented with artificial intelligence-assisted generation of CPT billing codes, which will help minimize error and promote standardization in the process.

Zaidat Bashar, Tang Justin, Arvind Varun, Geng Eric A, Cho Brian, Duey Akiro H, Dominy Calista, Riew Kiehyun D, Cho Samuel K, Kim Jun S

2023-Mar-18

ACDF, PCDF, artificial intelligence, cervical, disc replacement, fusion, natural language processing

General

Higher-order structure formation using refined monomer structures of lipid raft markers, Stomatin, Prohibitin, Flotillin, and HflK/C-related proteins.

In FEBS open bio
Currently, information on the higher-order structure of Stomatin, Prohibitin, Flotillin, and HflK/C (SPFH)-domain proteins is limited. Briefly, the coordinate information (Refined PH1511.pdb) of the stomatin ortholog, PH1511 monomer, was obtained using the artificial intelligence, ColabFold: AlphaFold2. Thereafter, the 24mer homo-oligomer structure of PH1511 was constructed using the superposing method, with HflK/C and FtsH (KCF complex) as templates. The 9mer-12mer homo-oligomer structures of PH1511 were also constructed using the ab initio docking method, with the GalaxyHomomer server for artificiality elimination. The features and functional validity of the higher-order structures were discussed. The coordinate information (Refined PH1510.pdb) of the membrane protease PH1510 monomer, which specifically cleaves the C-terminal hydrophobic region of PH1511, was obtained. Thereafter, the PH1510 12mer structure was constructed by superposing 12 molecules of the Refined PH1510.pdb monomer onto a 1510-C prism-like 12mer structure formed along the crystallographic threefold helical axis. The 12mer PH1510 (prism) structure revealed the spatial arrangement of membrane-spanning regions between the 1510-N and 1510-C domains within the membrane tube complex. Based on these refined 3D homo-oligomeric structures, the substrate recognition mechanism of the membrane protease was investigated. These refined 3D homo-oligomer structures are provided via PDB files as Supplementary data and can be used for further reference.
Yokoyama Hideshi, Matsui Ikuo

2023-Mar-17

ColabFold: AlphaFold2, SPFH, ab initio docking, lipid rafts, membrane protease, stomatin specific protease