Doctor Penguin

General

Investigating a neural language model's replicability of psycholinguistic experiments: A case study of NPI licensing.

In Frontiers in psychology ; h5-index 92.0
The recent success of deep learning neural language models such as Bidirectional Encoder Representations from Transformers (BERT) has brought innovations to computational language research. The present study explores the possibility of using a language model in investigating human language processes, based on the case study of negative polarity items (NPIs). We first conducted an experiment with BERT to examine whether the model successfully captures the hierarchical structural relationship between an NPI and its licensor and whether it may lead to an error analogous to the grammatical illusion shown in the psycholinguistic experiment (Experiment 1). We also investigated whether the language model can capture the fine-grained semantic properties of NPI licensors and discriminate their subtle differences on the scale of licensing strengths (Experiment 2). The results of the two experiments suggest that overall, the neural language model is highly sensitive to both syntactic and semantic constraints in NPI processing. The model's processing patterns and sensitivities are shown to be very close to those of humans, suggesting that such models can serve as a research tool or object in the study of language.
Shin Unsub, Yi Eunkyung, Song Sanghoun

2023

BERT, NPI licensing, grammatical illusion, licensing strength, negative polarity items, neural language model, psycholinguistics, scale of negativity

Public Health

Exploring the impact of sentiment on multi-dimensional information dissemination using COVID-19 data in China.

In Computers in human behavior ; h5-index 125.0
The outbreak of information epidemic in crisis events, with the channel effect of social media, has brought severe challenges to global public health. Combining information, users and environment, understanding how emotional information spreads on social media plays a vital role in public opinion governance and affective comfort, preventing mass incidents and stabilizing the network order. Therefore, from the perspective of the information ecology and elaboration likelihood model (ELM), this study conducted a comparative analysis based on two large-scale datasets related to COVID-19 to explore the influence mechanism of sentiment on the forwarding volume, spreading depth and network influence of information dissemination. Based on machine learning and social network methods, topics, sentiments, and network variables are extracted from large-scale text data, and the dissemination characteristics and evolution rules of online public opinions in crisis events are further analyzed. The results show that negative sentiment positively affects the volume, depth, and influence compared with positive sentiment. In addition, information characteristics such as richness, authority, and topic influence moderate the relationship between sentiment and information dissemination. Therefore, the research can build a more comprehensive connection between the emotional reaction of network users and information dissemination and analyze the internal characteristics and evolution trend of online public opinion. It can then inform sentiment management and information release strategies when emergencies occur.
Luo Han, Meng Xiao, Zhao Yifei, Cai Meng

2023-Jul

COVID-19, Emotional response, Information authority, Information dissemination, Information richness, Topic influence

General

Normalization and possibility of classification analysis using the optimal warping paths of dynamic time warping in gait analysis.

In Journal of exercise rehabilitation
The purpose of this study was to verify classification performance and the difference analysis between genders using optimal warping paths of dynamic time warping (DTW) and to examine the usefulness of root mean square error (RMSE) represented by the perpendicular distance from the optimal warping path to the diagonal. A 3-dimensional motion analysis experiment was performed with 24 healthy adults (male=12, female=12) in their 20s without gait-related diseases or injuries for the past 6 months to collect gait data. This study performed DTW 132 times in total (male=66, female=66) for the flexion angle of the right leg's hip, knee, and ankle joints. Then, the global cost and the RMSE of the optimal warping paths were calculated and normalized. The difference analysis was performed by independent t-test. Machine learning was performed to test the classification performance using the neural network, support vector machine, and logistic regression model among the supervised models. Results analyzed using global cost and RMSE for hip, knee, and ankle joints showed a statistically significant difference between genders in global cost and RMSE for hip and knee joints but not for ankle joints using RMSE. Considering both area under the receiver operating characteristic curve and F1-score, the logistic regression model has been evaluated as the most suitable for gender classification using the global cost or RMSE. This study demonstrated that optimal warping paths could be used for statistical difference analysis and classification analysis.
Lee Hyun-Seob

2023-Feb

Classification analysis, Dynamic time warping, Gait, Machine learning, Similarity
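
The two quantities named in the gait abstract above, the global cost and the RMSE of the perpendicular distance from the optimal warping path to the diagonal, can be illustrated with a minimal sketch. It assumes 1-D sequences, an absolute-difference local cost, and the standard three-step DTW recursion; the function names `dtw_path` and `path_rmse` are hypothetical, and nothing here is taken from the paper's own implementation or normalization.

```python
import math


def dtw_path(a, b):
    """Return (global_cost, path) for two 1-D sequences.

    Assumption: local cost is |a[i] - b[j]| and steps are
    diagonal, vertical, or horizontal (classic DTW).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])

    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


def path_rmse(path, n, m):
    """RMSE of the perpendicular distance from path points to the diagonal.

    The diagonal is taken as the line from (0, 0) to (n - 1, m - 1).
    """
    if n < 2 or m < 2:
        raise ValueError("both sequences need at least two samples")
    dx, dy = n - 1, m - 1
    norm = math.hypot(dx, dy)
    squared = [((dy * i - dx * j) / norm) ** 2 for i, j in path]
    return math.sqrt(sum(squared) / len(squared))


# Toy joint-angle curves: the second is a time-shifted copy of the first.
angle_a = [0, 0, 1, 2, 1, 0]
angle_b = [0, 1, 2, 1, 0, 0]
global_cost, path = dtw_path(angle_a, angle_b)
rmse = path_rmse(path, len(angle_a), len(angle_b))
```

In this toy case the warping absorbs the time shift, so the global cost is zero while the path RMSE is positive. That is the intuition behind using the path's deviation from the diagonal as a separate feature alongside the global cost.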

General

Pathogen-driven cancers from a structural perspective: Targeting host-pathogen protein-protein interactions.

In Frontiers in oncology
Host-pathogen interactions (HPIs) affect and involve multiple mechanisms in both the pathogen and the host. Pathogen interactions disrupt homeostasis in host cells, with their toxins interfering with host mechanisms, resulting in infections, diseases, and disorders, extending from AIDS and COVID-19, to cancer. Studies of the three-dimensional (3D) structures of host-pathogen complexes aim to understand how pathogens interact with their hosts. They also aim to contribute to the development of rational therapeutics, as well as preventive measures. However, structural studies are fraught with challenges toward these aims. This review describes the state-of-the-art in protein-protein interactions (PPIs) between the host and pathogens from the structural standpoint. It discusses computational aspects of predicting these PPIs, including machine learning (ML)- and artificial intelligence (AI)-driven approaches, and surveys available computational methods and their challenges. It concludes with examples of how theoretical computational approaches can result in a therapeutic agent with the potential to be used in the clinic, as well as future directions.
Ozdemir Emine Sila, Nussinov Ruth

2023

artificial intelligence, cancer therapeutics, drug discovery, host-pathogen interactions, machine learning, protein-protein interactions

Oncology

Incorporating the synthetic CT image for improving the performance of deformable image registration between planning CT and cone-beam CT.

In Frontiers in oncology

OBJECTIVE : To develop a contrast learning-based generative (CLG) model for the generation of high-quality synthetic computed tomography (sCT) from low-quality cone-beam CT (CBCT). The CLG model improves the performance of deformable image registration (DIR).

METHODS : This study included 100 post-breast-conserving patients with planning CT (pCT) images, CBCT images, and the target contours, which the physicians delineated. The sCT images were generated from CBCT images via the proposed CLG model. We used the sCT images as the fixed images instead of the CBCT images to achieve accurate multi-modality image registration. The deformation vector field is applied to propagate the target contour from the pCT to CBCT to realize the automatic target segmentation on CBCT images. We calculate the Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), and average surface distance (ASD) between the prediction and reference segmentation to evaluate the proposed method.

RESULTS : The DSC, HD95, and ASD of the target contours with the proposed method were 0.87 ± 0.04, 4.55 ± 2.18, and 1.41 ± 0.56, respectively. Compared with the traditional method without synthetic CT assistance (0.86 ± 0.05, 5.17 ± 2.60, and 1.55 ± 0.72), the proposed method performed better, especially for soft-tissue targets such as the tumor bed region.

CONCLUSION : The CLG model proposed in this study can create high-quality sCT from low-quality CBCT and improve the performance of DIR between the CBCT and the pCT. The target segmentation accuracy is better than that obtained with traditional DIR.

Li Na, Zhou Xuanru, Chen Shupeng, Dai Jingjing, Wang Tangsheng, Zhang Chulong, He Wenfeng, Xie Yaoqin, Liang Xiaokun

2023

breast cancer, deep learning, deformable image registration (DIR), radiation therapy, synthetic image
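
The three evaluation metrics named in the registration abstract above (DSC, HD95, and ASD) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' evaluation code: masks are given as sets of pixel coordinates, surfaces as lists of boundary points, HD95 uses a nearest-rank 95th percentile over the pooled symmetric distances (definitions vary between toolkits), and the function names `dice` and `surface_metrics` are hypothetical.

```python
import math


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two sets of pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty masks
    return 2 * len(a & b) / (len(a) + len(b))


def _nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_metrics(surf_a, surf_b):
    """Return (ASD, HD95) from two non-empty lists of surface points.

    Assumption: both directed distance lists are pooled, ASD is their mean,
    and HD95 is the nearest-rank 95th percentile of the pooled distances.
    """
    d = _nearest_distances(surf_a, surf_b) + _nearest_distances(surf_b, surf_a)
    d.sort()
    asd = sum(d) / len(d)
    k = max(0, math.ceil(0.95 * len(d)) - 1)
    return asd, d[k]


# Two 4x4 square masks, the second shifted by one column.
reference = {(r, c) for r in range(4) for c in range(4)}
predicted = {(r, c) for r in range(4) for c in range(1, 5)}
dsc = dice(reference, predicted)  # 12 shared pixels of 16 + 16 -> 0.75

asd, hd95 = surface_metrics([(0, 0), (0, 1)], [(0, 0), (0, 3)])
```

DSC rewards overlap, while HD95 and ASD measure boundary error in distance units, which is why the abstract reports all three: a small DSC gain can coincide with a more noticeable reduction in boundary distance.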

Radiology

Artificial-intelligence-based computed tomography histogram analysis predicting tumor invasiveness of lung adenocarcinomas manifesting as radiological part-solid nodules.

In Frontiers in oncology

BACKGROUND : Tumor invasiveness plays a key role in determining surgical strategy and patient prognosis in clinical practice. The study aimed to explore artificial-intelligence-based computed tomography (CT) histogram indicators significantly related to the invasion status of lung adenocarcinoma appearing as part-solid nodules (PSNs), and to construct radiomics models for prediction of tumor invasiveness.

METHODS : We identified surgically resected lung adenocarcinomas manifesting as PSNs in Peking University People's Hospital from January 2014 to October 2019. Tumors were categorized as adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), and invasive adenocarcinoma (IAC) by comprehensive pathological assessment. The whole cohort was randomly assigned to a training cohort (70%, n=832) and a validation cohort (30%, n=356) to establish and validate the prediction model. An artificial-intelligence-based algorithm (InferRead CT Lung) was applied to extract CT histogram parameters for each pulmonary nodule. For feature selection, multivariate regression models were built to identify factors associated with tumor invasiveness. A logistic regression classifier was used for radiomics model building. The predictive performance of the model was then evaluated by ROC and calibration curves.

RESULTS : In total, 299 AIS/MIAs and 889 IACs were included. In the training cohort, multivariate logistic regression analysis demonstrated that age [odds ratio (OR), 1.020; 95% CI, 1.004-1.037; p=0.017], smoking history (OR, 1.846; 95% CI, 1.058-3.221; p=0.031), solid mean density (OR, 1.014; 95% CI, 1.004-1.024; p=0.008), solid volume (OR, 5.858; 95% CI, 1.259-27.247; p=0.037), pleural retraction sign (OR, 3.179; 95% CI, 1.057-9.559; p=0.039), variance (OR, 0.570; 95% CI, 0.399-0.813; p=0.002), and entropy (OR, 4.606; 95% CI, 2.750-7.717; p<0.001) were independent predictors for IAC. The areas under the curve (AUCs) in the training and validation cohorts indicated a better discriminative ability of the histogram model (AUC=0.892) compared with the clinical model (AUC=0.852) and integrated model (AUC=0.886).

CONCLUSION : We developed an AI-based histogram model, which could reliably predict tumor invasiveness in lung adenocarcinoma manifesting as PSNs. This finding may be valuable in guiding the precision management of PSNs in daily practice.

Gao Jian, Qi Qingyi, Li Hao, Wang Zhenfan, Sun Zewen, Cheng Sida, Yu Jie, Zeng Yaqi, Hong Nan, Wang Dawei, Wang Huiyang, Yang Feng, Li Xiao, Li Yun

2023

CT histogram, lung adenocarcinoma, part-solid nodule, three-dimensional index, tumor invasiveness
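
The odds ratios reported in the lung adenocarcinoma abstract above can be read more concretely with a small worked sketch. It assumes the intervals are Wald-type 95% confidence intervals (coefficient ± 1.96 standard errors on the log-odds scale), which the abstract does not state; the helper names are hypothetical, and because the published values are rounded the recovered numbers are only approximate.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def scaled_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in the predictor (per-unit OR ** units)."""
    return odds_ratio ** units


def wald_summary(odds_ratio, ci_low, ci_high):
    """Recover (beta, standard error, z, two-sided p) from an OR and its 95% CI.

    Assumption: the interval is exp(beta +/- 1.96 * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Age: OR 1.020 per year (95% CI 1.004-1.037), as reported in the abstract.
or_per_decade = scaled_odds_ratio(1.020, 10)
beta, se, z, p = wald_summary(1.020, 1.004, 1.037)
```

Under these assumptions, a per-year odds ratio of 1.020 compounds to roughly 1.22 over ten years, and the p-value recovered from the interval lands close to the reported p=0.017, a quick consistency check on the published figures.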