Doctor Penguin

General

Interpreting Generative Adversarial Networks to Infer Natural Selection from Genetic Data.

In bioRxiv : the preprint server for biology

MOTIVATION : Understanding the landscape of natural selection in humans and other species has been a major focus for the use of machine learning methods in population genetics. Existing methods rely on computationally intensive simulated training data incorporating selection. Unlike efficient neutral coalescent simulations for demographic inference, realistic selection typically requires slow forward simulations. Large population sizes (for example, due to recent exponential growth in humans) make these simulations even more prohibitive. Because there are many possible modes of selection, a high-dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Since machine learning methods use the simulated data for training, mismatches between simulated training data and real test data are particularly problematic. In addition, it has been difficult to interpret the trained neural networks, leading to a lack of understanding about what features contribute to identifying selected variants.

RESULTS : Here we develop a new approach to detect selection that does not require selection simulations during training. We use a Generative Adversarial Network (GAN) that has been trained to simulate neutral data that mirrors a real genomic dataset. The resulting GAN consists of a generator (demographic model) and a discriminator (convolutional neural network). For a given genomic region, the discriminator predicts whether it is "real" genomic data or "fake" in the sense that it could have been simulated by the generator. As the "real" training data includes regions that experienced selection and the generator cannot produce such regions, regions with a high probability of being real may have experienced selection. This enables us to apply the trained discriminator of the GAN to held-out test data and identify candidate selected regions. We show that this approach has high power to identify regions under selection in simulations, and that it reliably recovers selected regions identified by state-of-the-art population genetic methods in three human populations (YRI, CEU, and CHB). Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics. In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.

AVAILABILITY : Our software is available open-source at https://github.com/mathiesonlab/disc-pg-gan.

Riley Rebecca, Mathieson Iain, Mathieson Sara

2023-Mar-08
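
As a rough illustration of the scanning step described in the abstract above, the sketch below ranks genomic regions by a discriminator's predicted probability of being "real" and flags the highest-scoring ones as candidates for selection. The region names, the scores, and the 0.9 cutoff are all hypothetical values chosen for the example; the authors' actual pipeline is in the repository linked in the abstract and may work differently.

```python
def candidate_regions(scores, cutoff=0.9):
    """Return (region, score) pairs with score >= cutoff, highest score first.

    `scores` maps a region label to the discriminator's probability that the
    region is real data. The cutoff is an illustrative assumption.
    """
    flagged = [(region, p) for region, p in scores.items() if p >= cutoff]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical discriminator outputs for four held-out regions.
scores = {"region_A": 0.97, "region_B": 0.42, "region_C": 0.91, "region_D": 0.55}

for region, p in candidate_regions(scores):
    print(region, p)
```

In practice the cutoff would have to be calibrated, for example against scores the discriminator assigns to neutral simulations.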

General

Comparative study of convolutional neural network architectures for gastrointestinal lesions classification.

In PeerJ
The gastrointestinal (GI) tract can be affected by different diseases or lesions such as esophagitis, ulcers, hemorrhoids, and polyps, among others. Some of them, such as polyps, can be precursors of cancer. Endoscopy is the standard procedure for the detection of these lesions. The main drawback of this procedure is that the diagnosis depends on the expertise of the doctor, which means that some important findings may be missed. In recent years, this problem has been addressed by deep learning (DL) techniques. Endoscopic studies use digital images. The most widely used DL technique for image processing is the convolutional neural network (CNN), due to its high accuracy for modeling complex phenomena. There are different CNNs that are characterized by their architecture. In this article, four architectures are compared: AlexNet, DenseNet-201, Inception-v3, and ResNet-101. To determine which architecture best classifies GI tract lesions, a set of metrics was used: accuracy, precision, sensitivity, specificity, F1-score, and area under the curve (AUC). These architectures were trained and tested on the HyperKvasir dataset. From this dataset, a total of 6,792 images corresponding to 10 findings were used. A transfer learning approach and a data augmentation technique were applied. The best performing architecture was DenseNet-201, with 97.11% accuracy, 96.3% sensitivity, 99.67% specificity, and 95% AUC.
Cuevas-Rodriguez Erik O, Galvan-Tejada Carlos E, Maeda-Gutiérrez Valeria, Moreno-Chávez Gamaliel, Galván-Tejada Jorge I, Gamboa-Rosales Hamurabi, Luna-García Huizilopoztli, Moreno-Baez Arturo, Celaya-Padilla José María

2023

Classification, Computer-aided diagnostic, Convolutional neural network, Deep learning, Endoscopy, Gastrointestinal, Gastrointestinal lesions
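
For readers less familiar with the metrics named in the abstract above, the sketch below computes accuracy, precision, sensitivity, specificity, and F1-score from confusion-matrix counts for a single class treated as one-vs-rest. The counts are invented for illustration and are not taken from the paper, which reports results across 10 findings.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for one lesion class versus all other classes.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```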

General

A ventilation early warning system (VEWS) for diaphanous workspaces considering COVID-19 and future pandemics scenarios.

In Heliyon
The COVID-19 pandemic has generated new needs due to the associated health risks and, more specifically, its rapid infection rate. Prevention measures to avoid contagions in indoor spaces, especially in office and public buildings (e.g., hospitals, public administration, educational centres, etc.), have led to the need for adequate ventilation to dilute the possible concentration of the virus. This article presents our contribution to this new challenge, namely the Ventilation Early Warning System (VEWS), which aims to adapt the operation of current Heating, Ventilating and Air Conditioning (HVAC) systems to the ventilation needs of diaphanous workspaces, based on a Smart Campus Digital Twin (SCDT) framework approach, while maintaining sustainability. Different technologies such as the Internet of Things (IoT), Building Information Modelling (BIM) and Artificial Intelligence (AI) algorithms are combined to collect and integrate monitoring data (historical records, real-time information, and location-related patterns) to carry out forecasting simulations in this digital twin. The generated outputs serve to assist facility managers in their building governance, considering the appropriate application of health measures to reduce the risk of coronavirus contagion in combination with sustainability criteria. The article also provides the results of the implementation of the VEWS in a university workspace as a case study. Its application has made it possible to detect and warn of inadequate ventilation situations for the daily flow of people in the different controlled zones.
Costa Gonçal, Arroyo Oriol, Rueda Pablo, Briones Alan

2023-Mar

BIM, Building digital twin, COVID-19, Facilities management, IoT, Simulation, Smart building

General

COVID-19 diagnosis: A comprehensive review of pre-trained deep learning models based on feature extraction algorithm.

In Results in engineering
Due to the rapid rise of COVID-19, clinical specialists are looking for fast, faultless diagnosis strategies to restrict the spread of Covid while attempting to lessen computational complexity. Swift, high-precision diagnosis techniques for COVID-19 can therefore offer valuable aid to clinical specialists. The RT-PCR test is an expensive and tedious COVID diagnosis technique in practice. Medical imaging, in the form of chest X-ray radiography, is a feasible way to diagnose COVID-19 and to get around the shortcomings of RT-PCR. Through a variety of Deep Transfer-learning models, this research investigates the potential of Artificial Intelligence-based early diagnosis of COVID-19 via chest X-ray radiographs. With 10,192 normal and 3,616 Covid chest X-ray radiographs, the deep transfer-learning models are optimized to further the accurate diagnosis. The chest X-ray radiographs undergo a data augmentation phase before developing a modified dataset to train the Deep Transfer-learning models. The Deep Transfer-learning architectures are trained using the extracted features from the Feature Extraction stage. During training, the classification of chest X-ray radiographs based on feature extraction algorithm values is converted into a feature label set containing the classified image data with a feature string value representing the number of edges detected after edge detection. The feature label set is further tested with the SVM, KNN, NN, Naive Bayes and Logistic Regression classifiers to audit the quality metrics of the proposed model. The quality metrics include accuracy, precision, F1 score, recall and AUC. According to the assessment results, Inception-V3 dominates the six Deep Transfer-learning models, with a training accuracy of 84.79% and a loss of 2.4%. The performance of Cubic SVM was superior to that of the other SVM classifiers, with an AUC score of 0.99, precision of 0.983, recall of 0.8977, accuracy of 95.8%, and F1 score of 0.9384. Cosine KNN fared better than the other KNN classifiers with an AUC score of 0.95, precision of 0.974, recall of 0.777, accuracy of 90.8%, and F1 score of 0.864. Wide NN fared better than the other NN classifiers with an AUC score of 0.98, precision of 0.975, recall of 0.907, accuracy of 95.5%, and F1 score of 0.939. According to the findings, SVM classifiers topped other classifiers in terms of performance indicators like accuracy, precision, recall, F1-score, and AUC. The SVM classifiers reported better mean optimal scores compared to other classifiers. The performance assessment metrics indicate that the proposed methodology can aid in preliminary COVID diagnosis.
Poola Rahul Gowtham, Pl Lahari, Y Siva Sankar

2023-Jun

Boundary tracing, Covid diagnosis, Deep transfer-learning, Medical imaging, Neural network models and classifiers
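
The F1 scores quoted in the abstract above can be sanity-checked from the reported precision and recall, since F1 is their harmonic mean. The sketch below does that for the three classifiers with reported values; the numbers are copied from the abstract, while the function name and the 0.001 tolerance are choices made for this example. Small differences are expected because the published precision and recall are rounded.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "Cubic SVM": (0.983, 0.8977, 0.9384),
    "Cosine KNN": (0.974, 0.777, 0.864),
    "Wide NN": (0.975, 0.907, 0.939),
}

for name, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    consistent = abs(f1 - f1_reported) < 0.001
    print(f"{name}: computed F1 = {f1:.4f}, reported = {f1_reported}, consistent = {consistent}")
```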

Pathology

AMLnet, A deep-learning pipeline for the differential diagnosis of acute myeloid leukemia from bone marrow smears.

In Journal of hematology & oncology ; h5-index 60.0
Acute myeloid leukemia (AML) is a deadly hematological malignancy. Cellular morphology detection of bone marrow smears based on the French-American-British (FAB) classification system remains an essential criterion in the diagnosis of hematological malignancies. However, the diagnosis and discrimination of distinct FAB subtypes of AML obtained from bone marrow smear images are tedious and time-consuming. In addition, there is considerable variation within and among pathologists, particularly in rural areas, where pathologists may not have relevant expertise. Here, we established a comprehensive database encompassing 8245 bone marrow smear images from 651 patients based on a retrospective dual-center study between 2010 and 2021 for the purpose of training and testing. Furthermore, we developed AMLnet, a deep-learning pipeline based on bone marrow smear images, that can discriminate not only between AML patients and healthy individuals but also accurately identify various AML subtypes. AMLnet achieved an AUC of 0.885 at the image level and 0.921 at the patient level in distinguishing nine AML subtypes on the test dataset. Furthermore, AMLnet outperformed junior human experts and was comparable to senior experts on the test dataset at the patient level. Finally, we provided an interactive demo website to visualize the saliency maps and the results of AMLnet for aiding pathologists' diagnosis. Collectively, AMLnet has the potential to serve as a fast prescreening and decision support tool for cytomorphological pathologists, especially in areas where pathologists are overburdened by medical demands as well as in rural areas where medical resources are scarce.
Yu Zebin, Li Jianhu, Wen Xiang, Han Yingli, Jiang Penglei, Zhu Meng, Wang Minmin, Gao Xiangli, Shen Dan, Zhang Ting, Zhao Shuqi, Zhu Yijing, Tong Jixiang, Yuan Shuchong, Zhu HongHu, Huang He, Qian Pengxu

2023-Mar-21

Acute myeloid leukemia, Bone marrow smears, Deep learning, Diagnosis

Oncology

Serum immuno-oncology markers carry independent prognostic information in patients with newly diagnosed metastatic breast cancer, from a prospective observational study.

In Breast cancer research : BCR

BACKGROUND : Metastatic breast cancer (MBC) is a challenging disease, and despite new therapies, prognosis is still poor for a majority of patients. There is a clinical need for improved prognostication where immuno-oncology markers can provide important information. The aim of this study was to evaluate serum immuno-oncology markers in MBC patients and their respective relevance for prediction of survival.

PATIENTS AND METHODS : We investigated a broad panel of 92 immuno-oncology proteins in serum from 136 MBC patients included in a prospective observational study (NCT01322893) with long-term follow-up. Serum samples were collected before start of systemic therapy and analyzed using multiplex proximity extension assay (Olink Target 96 Immuno-Oncology panel). Multiple machine learning techniques were used to identify serum markers with highest importance for prediction of overall and progression-free survival (OS and PFS), and associations to survival were further evaluated using Cox regression analyses. False discovery rate was then used to adjust for multiple comparisons.

RESULTS : Using random forest and random survival forest analyses, we identified the top nine and ten variables of highest predictive importance for OS and PFS, respectively. Cox regression analyses revealed significant associations (P < 0.005) of higher serum levels of IL-8, IL-10 and CAIX with worse OS in multivariable analyses, adjusted for established clinical prognostic factors including circulating tumor cells (CTCs). Similarly, high serum levels of IL-8, IL-10, ADA and CASP8 were significantly associated with worse PFS. Interestingly, high serum levels of FasL were significantly associated with improved OS and PFS. In addition, CSF-1, IL-6, MUC16, TNFRSF4 and CD244 showed suggestive evidence (P < 0.05) for an association with survival in multivariable analyses. After correction for multiple comparisons, IL-8 still showed strong evidence for correlation with survival.

CONCLUSION : To conclude, we found six serum immuno-oncology markers that were significantly associated with OS and/or PFS in MBC patients, independently of other established prognostic factors including CTCs. Furthermore, an additional five serum immuno-oncology markers provided suggestive evidence for an independent association with survival. These findings highlight the relevance of immuno-oncology serum markers in MBC patients and support their usefulness for improved prognostication. Trial registration: ClinicalTrials.gov (NCT01322893), registered March 25, 2011.

Gunnarsdottir Frida Björk, Bendahl Pär-Ola, Johansson Alexandra, Benfeitas Rui, Rydén Lisa, Bergenfelz Caroline, Larsson Anna-Maria

2023-Mar-21

Immuno-oncology, Marker, Metastatic breast cancer, Serum, Survival
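
The abstract above states that false discovery rate was used to adjust for multiple comparisons but does not name the procedure. The sketch below assumes the Benjamini-Hochberg step-up procedure, which is a common choice, and applies it to made-up p-values; neither the procedure nor the numbers should be read as the authors' actual analysis.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five markers.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
print(benjamini_hochberg(p_values))
```

With these illustrative inputs, the two smallest p-values survive the correction while the two that are only nominally below 0.05 do not, which is the kind of distinction the abstract draws between significant and suggestive markers.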