
General

Diagnosis of childhood and adolescent growth hormone deficiency using transcriptomic data.

In Frontiers in endocrinology ; h5-index 55.0

BACKGROUND : Gene expression (GE) data have shown promise as a novel tool to aid in the diagnosis of childhood growth hormone deficiency (GHD) when comparing GHD children to normal children. The aim of this study was to assess the utility of GE data in the diagnosis of GHD in childhood and adolescence using non-GHD short stature children as a control group.

METHODS : GE data were obtained from patients undergoing growth hormone stimulation testing. Data were taken for the 271 genes whose expression was utilized in our previous study. The synthetic minority oversampling technique was used to balance the dataset, and a random forest algorithm was applied to predict GHD status.
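
The balancing step can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' pipeline: the function name `smote_like_oversample`, the neighbour count `k`, and the five made-up features (standing in for the 271 genes) are assumptions, and the random forest step is omitted. Synthetic samples of this kind are normally generated inside training folds only, so that no interpolated point leaks into evaluation data.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy stand-in for expression profiles (invented values, not study data):
# 8 minority (GHD) and 16 majority (non-GHD) samples with 5 features each.
rng = random.Random(42)
ghd = [[rng.gauss(1.0, 1.0) for _ in range(5)] for _ in range(8)]
non_ghd = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(16)]

extra = smote_like_oversample(ghd, n_new=len(non_ghd) - len(ghd))
balanced_ghd = ghd + extra  # now the same size as the majority class
```

Each synthetic point lies on a segment between two real minority samples, so the oversampled class stays inside the region already covered by the observed GHD profiles.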

RESULTS : Twenty-four patients were recruited to the study, and eight were subsequently diagnosed with GHD. There were no significant differences in gender, age, auxology (height SDS, weight SDS, BMI SDS) or biochemistry (IGF-I SDS, IGFBP-3 SDS) between the GHD and non-GHD subjects. A random forest algorithm gave an AUC of 0.97 (95% CI 0.93-1.0) for the diagnosis of GHD.

CONCLUSION : This study demonstrates highly accurate diagnosis of childhood GHD using a combination of GE data and random forest analysis.

Garner Terence, Wangsaputra Ivan, Whatmore Andrew, Clayton Peter Ellis, Stevens Adam, Murray Philip George

2023

growth hormone, growth hormone deficiency, machine learning, random forest - ensemble classifier, transcriptome (RNA-seq)

General

The effect of COVID-19 on self-reported safety incidents in aviation: An examination of the heterogeneous effects using causal machine learning.

In Journal of safety research

INTRODUCTION : Disruptions to aviation operations occur daily on a micro-level with negligible impacts beyond the inconvenience of rebooking and changing aircrew schedules. The unprecedented disruption in global aviation due to COVID-19 highlighted a need to evaluate emergent safety issues rapidly.

METHOD : This paper uses causal machine learning to examine the heterogeneous effects of COVID-19 on reported aircraft incursions/excursions. The analysis utilized self-report data from the NASA Aviation Safety Reporting System collected from 2018 to 2020. The report attributes include self-identified group characteristics and expert categorization of factors and outcomes. The analysis identified attributes and subgroup characteristics that were most sensitive to COVID-19 in inducing incursions/excursions. The method included the generalized random forest and difference-in-differences techniques to explore causal effects.
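
The difference-in-differences idea named above can be shown with a minimal two-group, two-period sketch. The group labels and all four rates below are invented for illustration and are not taken from the paper; the generalized random forest step, which estimates how such effects vary across subgroups, is not reproduced here.

```python
def diff_in_diff(exposed_pre, exposed_post, comparison_pre, comparison_post):
    """Change in the exposed group minus the change in the comparison group."""
    return (exposed_post - exposed_pre) - (comparison_post - comparison_pre)


# Hypothetical shares of reports that involve an incursion/excursion
# (made-up numbers, purely to show the arithmetic).
effect = diff_in_diff(
    exposed_pre=0.10,
    exposed_post=0.16,
    comparison_pre=0.12,
    comparison_post=0.14,
)
# The exposed group rose by 0.06 and the comparison group by 0.02,
# so roughly 0.04 is attributed to the exposure under the usual
# parallel-trends assumption.
```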

RESULTS : The analysis indicates that first officers were more prone to experiencing incursion/excursion events during the pandemic. In addition, the human factors confusion and distraction and the causal factor fatigue increased incursion/excursion events.

PRACTICAL APPLICATIONS : Understanding the attributes associated with the likelihood of incursion/excursion events provides policymakers and aviation organizations insights to improve prevention mechanisms for future pandemics or extended periods of reduced aviation operations.

Choi Youngran, Gibson James R

2023-Feb

Aviation incursions/excursions, COVID-19, Heterogeneous treatment effects, Machine learning

Ophthalmology

Integrating oculomics with genomics reveals imaging biomarkers for preventive and personalized prediction of arterial aneurysms.

In The EPMA journal

OBJECTIVE : Arterial aneurysms are life-threatening but usually asymptomatic before requiring hospitalization. Oculomics of retinal vascular features (RVFs) extracted from retinal fundus images can reflect systemic vascular properties and therefore were hypothesized to provide valuable information on detecting the risk of aneurysms. By integrating oculomics with genomics, this study aimed to (i) identify predictive RVFs as imaging biomarkers for aneurysms and (ii) evaluate the value of these RVFs in supporting early detection of aneurysms in the context of predictive, preventive and personalized medicine (PPPM).

METHODS : This study involved 51,597 UK Biobank participants who had retinal images available to extract oculomics of RVFs. Phenome-wide association analyses (PheWASs) were conducted to identify RVFs associated with the genetic risks of the main types of aneurysms, including abdominal aortic aneurysm (AAA), thoracic aneurysm (TAA), intracranial aneurysm (ICA) and Marfan syndrome (MFS). An aneurysm-RVF model was then developed to predict future aneurysms. The performance of the model was assessed in both derivation and validation cohorts and was compared with other models employing clinical risk factors. An RVF risk score was derived from our aneurysm-RVF model to identify patients with an increased risk of aneurysms.

RESULTS : PheWAS identified a total of 32 RVFs that were significantly associated with the genetic risks of aneurysms. Of these, the number of vessels in the optic disc ('ntreeA') was associated with both AAA (β = -0.36, P = 6.75e-10) and ICA (β = -0.11, P = 5.51e-06). In addition, the mean angles between each artery branch ('curveangle_mean_a') were commonly associated with 4 MFS genes (FBN1: β = -0.10, P = 1.63e-12; COL16A1: β = -0.07, P = 3.14e-09; LOC105373592: β = -0.06, P = 1.89e-05; C8orf81/LOC441376: β = 0.07, P = 1.02e-05). The developed aneurysm-RVF model showed good discrimination ability in predicting the risks of aneurysms. In the derivation cohort, the C-index of the aneurysm-RVF model was 0.809 [95% CI: 0.780-0.838], which was similar to the clinical risk model (0.806 [0.778-0.834]) but higher than the baseline model (0.739 [0.733-0.746]). Similar performance was observed in the validation cohort, with a C-index of 0.798 (0.727-0.869) for the aneurysm-RVF model, 0.795 (0.718-0.871) for the clinical risk model and 0.719 (0.620-0.816) for the baseline model. An aneurysm risk score was derived from the aneurysm-RVF model for each study participant. The individuals in the upper tertile of the aneurysm risk score had a significantly higher risk of aneurysm compared to those in the lower tertile (hazard ratio = 17.8 [6.5-48.8], P = 1.02e-05).
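
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. The function name and the four-person toy cohort are assumptions for illustration; the authors' actual software and any tie-handling conventions they used are not stated in the abstract.

```python
def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when subject i had the event strictly before
    subject j's follow-up time. Tied scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (invented): follow-up time, event indicator, predicted risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
good_scores = [0.9, 0.7, 0.4, 0.1]   # higher risk for earlier events
c_good = concordance_index(times, events, good_scores)
c_flat = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to a score that carries no ranking information, which is the reference point for reading values such as 0.809 or 0.739.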

CONCLUSION : We identified a significant association between certain RVFs and the risk of aneurysms and revealed the impressive capability of using RVFs to predict the future risk of aneurysms by a PPPM approach. Our findings have great potential to support not only the predictive diagnosis of aneurysms but also a preventive and more personalized screening plan which may benefit both patients and the healthcare system.

SUPPLEMENTARY INFORMATION : The online version contains supplementary material available at 10.1007/s13167-023-00315-7.

Huang Yu, Li Cong, Shi Danli, Wang Huan, Shang Xianwen, Wang Wei, Zhang Xueli, Zhang Xiayin, Hu Yijun, Tang Shulin, Liu Shunming, Luo Songyuan, Zhao Ke, Mordi Ify R, Doney Alex S F, Yang Xiaohong, Yu Honghua, Li Xin, He Mingguang

2023-Mar

Aneurysm, Genetic risk scores, Imaging biomarker, Oculomics, Phenome-wide association analysis, Predictive preventive and personalized medicine (PPPM / 3PM), Retinal vascular features, Risk assessment

General

Supervised Classes, Unsupervised Mixing Proportions: Detection of Bots in a Likert-Type Questionnaire.

In Educational and psychological measurement

Administering Likert-type questionnaires to online samples risks contamination of the data by malicious computer-generated random responses, also known as bots. Although nonresponsivity indices (NRIs) such as person-total correlations or Mahalanobis distance have shown great promise to detect bots, universal cutoff values are elusive. An initial calibration sample constructed via stratified sampling of bots and humans (real or simulated under a measurement model) has been used to empirically choose cutoffs with a high nominal specificity. However, a high-specificity cutoff is less accurate when the target sample has a high contamination rate. In the present article, we propose the supervised classes, unsupervised mixing proportions (SCUMP) algorithm that chooses a cutoff to maximize accuracy. SCUMP uses a Gaussian mixture model to estimate, unsupervised, the contamination rate in the sample of interest. A simulation study found that, in the absence of model misspecification on the bots, our cutoffs maintained accuracy across varying contamination rates.
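
The two-stage idea can be sketched in one dimension. This is a simplified illustration, not the published SCUMP implementation: it assumes a single NRI on which bots score higher than humans, assumes both class distributions are Gaussian with parameters already known from a calibration sample, and uses invented parameter values. Only the mixing proportion is estimated from the unlabeled target sample, and the cutoff is then chosen to maximize expected accuracy at that proportion.

```python
import random
from statistics import NormalDist


def estimate_contamination(values, human, bot, n_iter=100):
    """EM for the mixing proportion only; both class densities stay fixed."""
    pi = 0.5
    for _ in range(n_iter):
        resp = []
        for x in values:
            b = pi * bot.pdf(x)
            h = (1.0 - pi) * human.pdf(x)
            resp.append(b / (b + h))  # posterior probability that x is a bot
        pi = sum(resp) / len(resp)
    return pi


def choose_cutoff(pi, human, bot, step=0.01):
    """Grid-search the cutoff with the highest expected accuracy.

    Values above the cutoff are flagged as bots.
    """
    spread = 4 * max(human.stdev, bot.stdev)
    lo = min(human.mean, bot.mean) - spread
    hi = max(human.mean, bot.mean) + spread
    best_c, best_acc = lo, -1.0
    for i in range(int((hi - lo) / step) + 1):
        c = lo + i * step
        acc = pi * (1.0 - bot.cdf(c)) + (1.0 - pi) * human.cdf(c)
        if acc > best_acc:
            best_c, best_acc = c, acc
    return best_c


# Class distributions assumed known from calibration (hypothetical values).
human = NormalDist(mu=0.0, sigma=1.0)
bot = NormalDist(mu=3.0, sigma=1.0)

# Simulated unlabeled target sample with a 30% contamination rate.
rng = random.Random(7)
sample = [rng.gauss(0.0, 1.0) for _ in range(1400)]
sample += [rng.gauss(3.0, 1.0) for _ in range(600)]

pi_hat = estimate_contamination(sample, human, bot)
cutoff = choose_cutoff(pi_hat, human, bot)
```

Under these assumptions the accuracy-maximizing cutoff moves lower as the estimated contamination rate rises, which is why a fixed high-specificity cutoff loses accuracy in heavily contaminated samples.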

Ilagan Michael John, Falk Carl F

2023-Apr

Mahalanobis distance, aberrant responding, bots, machine learning, person-total correlation

General

Is early or late biological maturation trigger obesity? A machine learning modeling research in Turkey boys and girls.

In Frontiers in nutrition

Biological maturation status can affect individual differences, sex, height, body fat, and body weight in adolescents and thus may be associated with obesity. The primary aim of this study was to examine the relationship between biological maturation and obesity. Overall, 1,328 adolescents (792 boys and 536 girls), aged 12.00 ± 0.94 and 12.21 ± 0.99 years, respectively, were measured for body mass, body stature, and sitting stature. Body weights were determined with the Tanita body analysis system, and adolescent obesity status was calculated according to the WHO classification. Biological maturation was determined according to the somatic maturation method. Our results showed that boys mature 3.077-fold later than girls. Obesity had an increasing effect on early maturation. It was determined that being obese, overweight and healthy-weight increased the risk of early maturation 9.80, 6.99 and 1.81-fold, respectively. The equation of the model predicting maturation is: Logit (P) = 1/(1 + exp(-(-31.386 + sex-boy * (1.124) + [chronological age = 10] * (-7.031) + [chronological age = 11] * (-4.338) + [chronological age = 12] * (-1.677) + age * (-2.075) + weight * 0.093 + height * (-0.141) + obesity * (-2.282) + overweight * (-1.944) + healthy weight * (-0.592)))). The logistic regression model predicted maturity with 80.7% [95% CI: 77.2-84.1%] accuracy. In addition, the model had a high sensitivity value (81.7% [76.2-86.6%]), which indicates that the model can successfully distinguish adolescents with early maturation. In conclusion, sex and obesity are independent predictors of maturity, and the risk of early maturation is increased, especially in the case of obesity and in girls.
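
The printed equation can be transcribed as a small function to make its structure explicit. This is a literal transcription of the coefficients as they appear in the abstract, not a validated clinical tool. The variable coding is an assumption: the abstract does not state the measurement units, which categories serve as reference levels, or how the continuous `age` term differs from the chronological-age indicators, so outputs for real measurements should not be interpreted without the full paper.

```python
import math

# Coefficients copied from the abstract's equation.
INTERCEPT = -31.386
COEF_SEX_BOY = 1.124
COEF_CHRON_AGE = {10: -7.031, 11: -4.338, 12: -1.677}  # other ages: assumed reference
COEF_AGE = -2.075
COEF_WEIGHT = 0.093
COEF_HEIGHT = -0.141
COEF_STATUS = {"obese": -2.282, "overweight": -1.944, "healthy": -0.592}


def linear_predictor(sex_boy, chron_age, age, weight, height, weight_status):
    """Sum of intercept and weighted terms; coding of inputs is assumed."""
    z = INTERCEPT
    z += COEF_SEX_BOY * (1 if sex_boy else 0)
    z += COEF_CHRON_AGE.get(chron_age, 0.0)
    z += COEF_AGE * age
    z += COEF_WEIGHT * weight
    z += COEF_HEIGHT * height
    z += COEF_STATUS.get(weight_status, 0.0)
    return z


def predicted_probability(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

Written this way, the right-hand side of the printed equation is the logistic transform of the linear predictor, that is, the predicted probability P itself.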

Gülü Mehmet, Yagin Fatma Hilal, Yapici Hakan, Irandoust Khadijeh, Dogan Ali Ahmet, Taheri Morteza, Szura Ewa, Barasinska Magdalena, Gabrys Tomasz

2023

adolescent, body mass index, childhood, noncommunicable diseases, overweight, puberty

General

DeepBend: An interpretable model of DNA bendability.

In iScience

The bendability of genomic DNA impacts chromatin packaging and protein-DNA binding. However, we do not have a comprehensive understanding of the motifs influencing DNA bendability. Recent high-throughput technologies such as Loop-Seq offer an opportunity to address this gap but the lack of accurate and interpretable machine learning models still remains. Here we introduce DeepBend, a convolutional neural network model with convolutions designed to directly capture the motifs underlying DNA bendability and their periodic occurrences or relative arrangements that modulate bendability. DeepBend consistently performs on par with alternative models while giving an extra edge through mechanistic interpretations. Besides confirming the known motifs of DNA bendability, DeepBend also revealed several novel motifs and showed how the spatial patterns of motif occurrences influence bendability. DeepBend's genome-wide prediction of bendability further showed how bendability is linked to chromatin conformation and revealed the motifs controlling the bendability of topologically associated domains and their boundaries.
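
How a convolution filter can act as a motif detector is sketched below in plain Python. This does not reproduce DeepBend's architecture or any motif it reports: the one-hot encoding, the two-base filter that rewards the dinucleotide "TA", and the example sequence are all hypothetical choices made for illustration.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 channels in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv1d_valid(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence; one activation per window."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        activations.append(total)
    return activations


# Hypothetical filter: first position rewards T, second position rewards A.
ta_filter = [
    [0.0, 0.0, 0.0, 1.0],  # A, C, G, T weights at position 0
    [1.0, 0.0, 0.0, 0.0],  # A, C, G, T weights at position 1
]

sequence = "GCTAGCTATT"  # made-up example sequence
activations = conv1d_valid(one_hot(sequence), ta_filter)
peak = max(activations)
motif_positions = [i for i, a in enumerate(activations) if a == peak]
```

Because each filter row is a set of per-base weights, a trained filter can be read directly as a motif, and the spacing between activation peaks is the kind of signal a later layer could use to capture periodic or relative arrangements.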

Khan Samin Rahman, Sakib Sadman, Rahman M Sohel, Samee Md Abul Hassan

2023-Feb-17

Biochemistry, Biological sciences, Genetics