Doctor Penguin

General

Identifying populations at ultra-high risk of suicide using a novel machine learning method.

In Comprehensive psychiatry

BACKGROUND : Targeted interventions for suicide prevention rely on adequate identification of groups at elevated risk. Several risk factors for suicide are known, but little is known about the interactions between risk factors. Interactions between risk factors may aid in detecting more specific sub-populations at higher risk.

METHODS : Here, we use a novel machine learning heuristic to detect sub-populations at ultra-high risk of suicide based on interacting risk factors. The data-driven and hypothesis-free model is applied to data covering the entire population of the Netherlands.

FINDINGS : We found three sub-populations with extremely high suicide rates (i.e. >50 suicides per 100,000 person-years, compared to 12/100,000 in the general population), namely: (1) people on unfit-for-work benefits who were never married, (2) males on unfit-for-work benefits, and (3) those aged 55-69 who live alone, were never married and have a relatively low household income. Additionally, we found two sub-populations where the rate was higher than expected based on individual risk factors alone: widowed males, and people aged 25-39 with a low level of education.

INTERPRETATION : Our model is effective at finding ultra-high risk groups which can be targeted using sub-population level interventions. Additionally, it is effective at identifying high-risk groups that would not be considered risk groups based on conventional risk factor analysis.
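
The rates quoted in the findings are events per 100,000 person-years, and "higher than expected" compares an observed rate with what the individual risk factors would predict. A minimal sketch of that arithmetic follows. The counts, the rate ratios, and the multiplicative independence assumption are all hypothetical illustrations, not the paper's data or its heuristic.

```python
def rate_per_100k(events: int, person_years: float) -> float:
    """Crude rate expressed per 100,000 person-years."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events * 100_000 / person_years


def expected_rate(baseline: float, rate_ratios: list[float]) -> float:
    """Rate expected if each risk factor multiplied the baseline independently.

    The multiplicative model is an assumption made for this sketch only.
    """
    expected = baseline
    for rr in rate_ratios:
        expected *= rr
    return expected


# Hypothetical subgroup: 27 events over 50,000 person-years.
observed = rate_per_100k(events=27, person_years=50_000)
# Hypothetical rate ratios for two risk factors, applied to the
# general-population rate of 12 per 100,000 quoted in the abstract.
expected = expected_rate(baseline=12.0, rate_ratios=[2.0, 1.5])

# A ratio above 1 means the combination carries more risk than the
# individual factors predict under the independence assumption.
print(observed, expected, observed / expected)
```

With these made-up inputs the observed rate exceeds 50 per 100,000 and is 1.5 times the independence-based expectation, which is the kind of interaction signal the abstract describes.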

Berkelmans Guus, Schweren Lizanne, Bhulai Sandjai, van der Mei Rob, Gilissen Renske

2023-Mar-01

Interactions, Machine learning, Population data, Risk factors, Suicide

General

Reduced corticolimbic habituation to negative stimuli characterizes bipolar depressed suicide attempters.

In Psychiatry research. Neuroimaging
Suicide attempts in bipolar disorder are characterized by high levels of lethality and impulsivity. Reduced rates of amygdala and cortico-limbic habituation can identify an fMRI phenotype of suicidality in the disorder related to internal over-arousing states. Hence, we investigated whether reduced amygdala and whole-brain habituation differentiates bipolar suicide attempters (SA, n = 17) from non-suicide attempters (nSA, n = 57) and healthy controls (HC, n = 32). Habituation was assessed during an fMRI task including facial expressions of anger and fear and a control condition. Associations with suicidality and current depressive symptomatology were assessed, including a machine learning procedure to estimate the potential of habituation as a biomarker for suicidality. SA showed lower habituation compared to HC and nSA in several cortico-limbic areas, including amygdalae, cingulate and parietal cortex, insula, hippocampus, para-hippocampus, cerebellar vermis, thalamus, and striatum, while nSA displayed intermediate rates between SA and HC. Lower habituation rates in the amygdalae were also associated with higher current depressive and suicidal symptomatology. Machine learning on whole-brain and amygdala habituation differentiated SA vs. nSA with 94% and 69% accuracy, respectively. Reduced habituation in the cortico-limbic system can identify a candidate biomarker for attempting suicide, helping to detect at-risk bipolar patients and to develop new therapeutic interventions.
Vai Benedetta, Calesella Federico, Lenti Claudia, Fortaner-Uyà Lidia, Caselani Elisa, Fiore Paola, Breit Sigrid, Poletti Sara, Colombo Cristina, Zanardi Raffaella, Benedetti Francesco

2023-Mar-11

Bipolar disorder, Depression, Habituation, Machine learning, Suicide

General

Constructing discriminative feature space for LncRNA-protein interaction based on deep autoencoder and marginal fisher analysis.

In Computers in biology and medicine
Long non-coding RNAs (lncRNAs) play important roles by regulating proteins in many biological processes and life activities. To uncover the molecular mechanisms of lncRNAs, it is necessary to identify their interactions with proteins. Recently, some machine learning methods were proposed to detect lncRNA-protein interactions according to the distribution of known interactions. The performance of these methods depends largely on: (1) how exactly the distribution of known interactions is characterized by the feature space; (2) how discriminative the feature space is for distinguishing lncRNA-protein interactions. Because the known interactions may be varied and complex, it remains a challenge to construct a discriminative feature space for lncRNA-protein interactions. To resolve this problem, a novel method named DFRPI was developed based on a deep autoencoder and marginal Fisher analysis. Firstly, some initial features of lncRNA-protein interactions were extracted from the primary sequences and secondary structures of lncRNA and protein. Secondly, a deep autoencoder was used to learn encoding parameters of the initial features to describe the known interactions precisely. Next, marginal Fisher analysis was employed to optimize the encoding parameters of the features to characterize a discriminative feature space of the lncRNA-protein interactions. Finally, a random forest-based predictor was trained on the discriminative feature space to detect lncRNA-protein interactions. In a series of experiments, our predictor achieved a precision of 0.920, recall of 0.916, accuracy of 0.918, MCC of 0.836, specificity of 0.920, sensitivity of 0.916 and AUC of 0.906, outperforming the compared methods for predicting lncRNA-protein interactions. These results suggest that the proposed method can generate a reasonable and effective feature space for distinguishing lncRNA-protein interactions accurately. The code and data are available at https://github.com/D0ub1e-D/DFRPI.
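
The evaluation metrics listed in the abstract all derive from one binary confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical, picked so that a balanced test set of 1,000 positives and 1,000 negatives lands near the reported figures; they are not taken from the paper or its repository.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": tp / (tp + fp),
        "recall": recall,
        "sensitivity": recall,  # sensitivity is another name for recall
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (balanced classes assumed).
metrics = binary_metrics(tp=916, fp=80, tn=920, fn=84)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is not included because it needs ranked prediction scores rather than a single confusion matrix.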
Teng Zhixia, Zhang Yiran, Dai Qiguo, Wu Chengyan, Li Dan

2023-Feb-28

Autoencoder, Feature extraction, LncRNA, Marginal fisher analysis, Protein

General

Newly reconstructed Arctic surface air temperatures for 1979-2021 with deep learning method.

In Scientific data
A precise Arctic surface air temperature (SAT) dataset that is regularly updated, has more complete spatial and temporal coverage, and is based on instrumental observations is critically important for timely monitoring and improved understanding of the rapid change in the Arctic climate. In this study, a new monthly gridded Arctic SAT dataset dating back to 1979 was reconstructed with a deep learning method by combining surface air temperatures from multiple data sources. The source data include observations from GHCN (Global Historical Climatology Network) land stations, ICOADS (International Comprehensive Ocean-Atmosphere Data Set) over the oceans, Russian NP (North Pole) drifting ice stations, and IABP (International Arctic Buoy Programme) buoys. The last two are crucial for improving the representation of in-situ observed temperatures within the Arctic. The newly reconstructed dataset includes monthly Arctic SAT beginning in 1979 and daily Arctic SAT beginning in 2011. This dataset represents an improvement in observational temperature datasets and can be used for a variety of applications.
Ma Ziqi, Huang Jianbin, Zhang Xiangdong, Luo Yong, Ding Minghu, Wen Jun, Jin Weixin, Qiao Chen, Yin Yifu

2023-Mar-15

General

Multi-weight susceptible-infected model for predicting COVID-19 in China.

In Neurocomputing
The mutant strains of COVID-19 caused a global explosion of infections, including in many cities in China. In 2020, a hybrid AI model was proposed by Zheng et al., which accurately predicted the epidemic in Wuhan. As the main part of the hybrid AI model, the ISI method makes two important assumptions to avoid over-fitting. However, these assumptions cannot be effectively applied to new mutant strains. In this paper, a more general method, named the multi-weight susceptible-infected model (MSI), is proposed to predict COVID-19 in mainland China. First, a Gaussian pre-processing method is proposed to solve the problem of data fluctuation based on the quantity consistency of the cumulative infection number and the trend consistency of the daily infection number. Then, we improve the model in two respects: changing the grouped multi-parameter strategy to a multi-weight strategy, and removing the restriction on the weight distribution of viral infectivity. Experiments on outbreaks in many places in China from the end of 2021 to May 2022 show that, in China, an individual infected by the Delta or Omicron strains of SARS-CoV-2 can infect others within 3-4 days of becoming infected. In particular, the proposed method effectively predicts the trend of the epidemics in Xi'an, Tianjin, Henan, and Shanghai from December 2021 to May 2022.
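
The abstract names two ingredients without giving formulas: Gaussian pre-processing that keeps the cumulative count consistent, and a set of weights applied to recent infections. The following is a rough sketch of one way such steps could look, not the authors' MSI implementation. The kernel radius, sigma, the weight values, and both function names are assumptions introduced for illustration.

```python
import math


def smooth_daily_counts(daily: list[float], radius: int = 3, sigma: float = 1.5) -> list[float]:
    """Gaussian-smooth daily counts, then rescale so the cumulative total is unchanged."""
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    n = len(daily)
    smoothed = []
    for t in range(n):
        acc = 0.0
        weight_sum = 0.0
        for j, w in enumerate(kernel):
            i = t + j - radius
            if 0 <= i < n:  # truncate the kernel at the series boundaries
                acc += w * daily[i]
                weight_sum += w
        smoothed.append(acc / weight_sum)
    total = sum(smoothed)
    if total > 0:
        scale = sum(daily) / total  # preserve the cumulative infection number
        smoothed = [x * scale for x in smoothed]
    return smoothed


def predict_next_day(daily: list[float], weights: list[float]) -> float:
    """Next-day new cases as a weighted sum of the most recent days.

    weights[0] applies to the latest day, weights[1] to the day before, and so on.
    """
    recent = daily[::-1][: len(weights)]
    return sum(w * x for w, x in zip(weights, recent))


# Hypothetical noisy daily counts and hypothetical infectivity weights.
reported = [0, 5, 1, 12, 3, 20, 8, 30]
smoothed = smooth_daily_counts(reported)
print([round(x, 1) for x in smoothed])
print(predict_next_day(smoothed, weights=[0.6, 0.4, 0.2, 0.1]))
```

In a real model the weights would be fitted to data; four lags are used here only to echo the 3-4 day window mentioned in the abstract.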
Zhang Jun, Zheng Nanning, Liu Mingyu, Yao Dingyi, Wang Yusong, Wang Jianji, Xin Jingmin

2023-May-14

COVID-19 prediction, Data processing, Epidemic model, Multi-weight susceptible-infected model

Pathology

GNNFormer: A Graph-based Framework for Cytopathology Report Generation

ArXiv Preprint
Cytopathology report generation is a necessary step for the standardized examination of pathology images. However, manually writing detailed reports imposes a heavy workload on pathologists. To improve efficiency, some existing works have studied automatic generation of cytopathology reports, mainly by applying image caption generation frameworks with visual encoders originally proposed for natural images. A common weakness of these works is that they do not explicitly model the structural information among cells, which is a key feature of pathology images and provides significant information for making diagnoses. In this paper, we propose a novel graph-based framework called GNNFormer, which seamlessly integrates a graph neural network (GNN) and a Transformer into the same framework, for cytopathology report generation. To the best of our knowledge, GNNFormer is the first report generation method that explicitly models the structural information among cells in pathology images. It also effectively fuses structural information among cells, fine-grained morphology features of cells and background features to generate high-quality reports. Experimental results on the NMI-WSI dataset show that GNNFormer can outperform other state-of-the-art baselines.
Yang-Fan Zhou, Kai-Lang Yao, Wu-Jun Li

2023-03-17