General

Automated detection of the head-twitch response using wavelet scalograms and a deep convolutional neural network.

In Scientific reports ; h5-index 158.0

Hallucinogens induce the head-twitch response (HTR), a rapid reciprocal head movement, in mice. Although head twitches are usually identified by direct observation, they can also be assessed using a head-mounted magnet and a magnetometer. Procedures have been developed to automate the analysis of magnetometer recordings by detecting events that match the frequency, duration, and amplitude of the HTR. However, there is considerable variability in the features of head twitches, and behaviors such as jumping have similar characteristics, reducing the reliability of these methods. We have developed an automated method that can detect head twitches unambiguously, without relying on features in the amplitude-time domain. To detect the behavior, events are transformed into a visual representation in the time-frequency domain (a scalogram), deep features are extracted using the pretrained convolutional neural network (CNN) ResNet-50, and then the images are classified using a Support Vector Machine (SVM) algorithm. These procedures were used to analyze recordings from 237 mice containing 11,312 HTR. After transformation to scalograms, the multistage CNN-SVM approach detected 11,244 (99.4%) of the HTR. The procedures were insensitive to other behaviors, including jumping and seizures. Deep learning based on scalograms can be used to automate HTR detection with robust sensitivity and reliability.
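
The time-frequency transform described above can be sketched as follows. This is a minimal illustration of the scalogram step only, assuming a complex Morlet wavelet, a 1 kHz sampling rate, and a synthetic 90 Hz burst; none of these values come from the abstract, and the function name `morlet_scalogram` is hypothetical. The ResNet-50 feature extraction and SVM classification stages are not shown.

```python
import cmath
import math


def morlet_scalogram(signal, fs, freqs, n_cycles=6.0):
    """Return |CWT| magnitudes, one row per frequency, one column per sample.

    Uses a complex Morlet wavelet truncated at +/- 3 standard deviations.
    All parameter choices here are illustrative assumptions.
    """
    n = len(signal)
    rows = []
    for f in freqs:
        sigma_t = n_cycles / (2.0 * math.pi * f)      # temporal width (s)
        half = int(math.ceil(3.0 * sigma_t * fs))     # half-length in samples
        offsets = range(-half, half + 1)
        wavelet = [
            cmath.exp(2j * math.pi * f * (k / fs))
            * math.exp(-((k / fs) ** 2) / (2.0 * sigma_t ** 2))
            for k in offsets
        ]
        norm = sum(abs(w) for w in wavelet)           # amplitude normalisation
        row = []
        for i in range(n):
            acc = 0j
            for k, w in zip(offsets, wavelet):
                j = i + k
                if 0 <= j < n:                        # zero beyond the edges
                    acc += signal[j] * w.conjugate()
            row.append(abs(acc) / norm)
        rows.append(row)
    return rows


# Synthetic recording: silence with a 100 ms, 90 Hz burst in the middle.
fs = 1000.0
signal = [
    math.sin(2 * math.pi * 90.0 * (i / fs)) if 100 <= i < 200 else 0.0
    for i in range(300)
]
freqs = [30.0, 60.0, 90.0, 120.0]
scalogram = morlet_scalogram(signal, fs, freqs)
energies = [sum(row) for row in scalogram]
```

In this toy case the row for the burst frequency carries the most energy. In a pipeline like the one described, such a matrix would be rendered as an image before being passed to the CNN.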

Halberstadt Adam L


General

Real-time detection of colon polyps during colonoscopy using deep learning: systematic validation with four independent datasets.

In Scientific reports ; h5-index 158.0

We developed and validated a deep-learning algorithm for polyp detection. We used YOLOv2 to develop the algorithm for automatic polyp detection on 8,075 images (503 polyps). We validated the algorithm using three datasets: A: 1,338 images with 1,349 polyps; B: an open, public CVC-clinic database with 612 polyp images; and C: 7 colonoscopy videos with 26 polyps. To reduce the number of false positives in the video analysis, median filtering was applied. We tested the algorithm performance using 15 unaltered colonoscopy videos (dataset D). For datasets A and B, the per-image polyp detection sensitivity was 96.7% and 90.2%, respectively. For the video study (dataset C), the per-image polyp detection sensitivity was 87.7%. False positive rates were 12.5% without a median filter and 6.3% with a median filter with a window size of 13. For dataset D, the sensitivity and false positive rate were 89.3% and 8.3%, respectively. The algorithm detected all 38 polyps that the endoscopists detected and 7 additional polyps. The operation speed was 67.16 frames per second. The automatic polyp detection algorithm exhibited good performance, as evidenced by the high detection sensitivity and rapid processing. Our algorithm may help endoscopists improve polyp detection.
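
The effect of a temporal median filter on video detections can be sketched as below. The abstract gives only the window size (13); this example assumes the filter is applied to a binary per-frame "polyp detected" flag with zero padding at the edges, which may differ from what the authors did. The function name `median_filter` is hypothetical.

```python
from statistics import median


def median_filter(flags, window=13):
    """Sliding median over a 0/1 per-frame detection sequence.

    A frame stays positive only if most frames in the surrounding
    window are also positive. Edges are zero-padded (an assumption).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = [0] * half + list(flags) + [0] * half
    return [int(median(padded[i:i + window])) for i in range(len(flags))]


# A 3-frame spurious detection followed by a sustained 30-frame detection.
frames = [0] * 20 + [1] * 3 + [0] * 20 + [1] * 30 + [0] * 20
smoothed = median_filter(frames, window=13)
```

The short isolated blip is suppressed while the sustained run survives, which is the kind of behaviour that would lower a per-frame false positive rate.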

Lee Ji Young, Jeong Jinhoon, Song Eun Mi, Ha Chunae, Lee Hyo Jeong, Koo Ja Eun, Yang Dong-Hoon, Kim Namkug, Byeon Jeong-Sik


General

Applying Machine Learning to Kinematic and Eye Movement Features of a Movement Imitation Task to Predict Autism Diagnosis.

In Scientific reports ; h5-index 158.0

Autism is a developmental condition currently identified by experts using observation, interview, and questionnaire techniques and primarily assessing social and communication deficits. Motor function and movement imitation are also altered in autism and can be measured more objectively. In this study, motion and eye tracking data from a movement imitation task were combined with supervised machine learning methods to classify 22 autistic and 22 non-autistic adults. The focus was on a reliable machine learning application. We have used nested validation to develop models and further tested the models with an independent data sample. Feature selection was aimed at selection stability to assure result interpretability. Our models predicted diagnosis with 73% accuracy from kinematic features, 70% accuracy from eye movement features and 78% accuracy from combined features. We further explored features which were most important for predictions to better understand movement imitation differences in autism. Consistent with the behavioural results, most discriminative features were from the experimental condition in which non-autistic individuals tended to successfully imitate unusual movement kinematics while autistic individuals tended to fail. Machine learning results show promise that future work could aid in the diagnosis process by providing quantitative tests to supplement current qualitative ones.
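
The nested validation mentioned above can be illustrated with a small sketch. It assumes a nearest-class-mean classifier on a single feature, with the choice of feature treated as the tunable setting; the data are synthetic and every function name is hypothetical. The study's actual models and features are not reproduced here.

```python
def k_folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y, idx, feat):
    """Mean of one feature per class, computed on the given samples."""
    means = {}
    for label in (0, 1):
        vals = [X[i][feat] for i in idx if y[i] == label]
        means[label] = sum(vals) / len(vals)
    return means


def predict(means, value):
    """Assign the class whose mean is closest to the value."""
    return min(means, key=lambda label: abs(value - means[label]))


def fold_accuracy(X, y, train_idx, test_idx, feat):
    means = fit_centroids(X, y, train_idx, feat)
    hits = sum(predict(means, X[i][feat]) == y[i] for i in test_idx)
    return hits / len(test_idx)


def nested_cv(X, y, candidate_feats, outer_k=4, inner_k=3):
    """Outer loop estimates accuracy; inner loop selects the feature."""
    all_idx = list(range(len(y)))
    outer_scores = []
    for test_idx in k_folds(all_idx, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]

        def inner_score(feat):
            # Selection uses training data only; test_idx is never seen.
            total = 0.0
            for val_idx in k_folds(train_idx, inner_k):
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                total += fold_accuracy(X, y, fit_idx, val_idx, feat)
            return total / inner_k

        best_feat = max(candidate_feats, key=inner_score)
        outer_scores.append(
            fold_accuracy(X, y, train_idx, test_idx, best_feat)
        )
    return sum(outer_scores) / len(outer_scores)


# Synthetic data: feature 0 separates the classes, feature 1 is noise.
y = [0] * 12 + [1] * 12
X = [[2.0 * y[i] + 0.01 * (i % 7), ((i * 37) % 11) / 10.0] for i in range(24)]
estimate = nested_cv(X, y, candidate_feats=[0, 1])
```

Because feature selection happens inside each outer training split, the outer accuracy is not inflated by having tuned on the held-out samples.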

Vabalas Andrius, Gowen Emma, Poliakoff Ellen, Casson Alexander J


General

Environmental DNA can act as a biodiversity barometer of anthropogenic pressures in coastal ecosystems.

In Scientific reports ; h5-index 158.0

Loss of biodiversity from lower to upper trophic levels reduces overall productivity and stability of coastal ecosystems in our oceans, but rarely are these changes documented across both time and space. The characterisation of environmental DNA (eDNA) from sediment and seawater using metabarcoding offers a powerful molecular lens to observe marine biota and provides a series of 'snapshots' across a broad spectrum of eukaryotic organisms. Using these next-generation tools and downstream analytical innovations including machine learning sequence assignment algorithms and co-occurrence network analyses, we examined how anthropogenic pressures may have impacted marine biodiversity on subtropical coral reefs in Okinawa, Japan. Based on 18S ribosomal RNA, but not ITS2 sequence data due to inconsistent amplification for this marker, as well as proxies for anthropogenic disturbance, we show that eukaryotic richness at the family level significantly increases with medium and high levels of disturbance. This change in richness coincides with compositional changes, a decrease in connectedness among taxa, an increase in fragmentation of taxon co-occurrence networks, and a shift in indicator taxa. Taken together, these findings demonstrate the ability of eDNA to act as a barometer of disturbance and provide an exemplar of how biotic networks and coral reefs may be impacted by anthropogenic activities.

DiBattista Joseph D, Reimer James D, Stat Michael, Masucci Giovanni D, Biondi Piera, De Brauwer Maarten, Wilkinson Shaun P, Chariton Anthony A, Bunce Michael


Radiology

Artificial Intelligence Algorithm Detecting Lung Infection in Supine Chest Radiographs of Critically Ill Patients With a Diagnostic Accuracy Similar to Board-Certified Radiologists.

In Critical care medicine ; h5-index 87.0

OBJECTIVES : Interpretation of lung opacities in ICU supine chest radiographs remains challenging. We evaluated a prototype artificial intelligence algorithm to classify basal lung opacities according to underlying pathologies.

DESIGN : Retrospective study. The deep neural network was trained on two publicly available datasets including 297,541 images of 86,876 patients.

PATIENTS : One hundred sixty-six patients received both supine chest radiograph and CT scans (reference standard) within 90 minutes without any intervention in between.

MEASUREMENTS AND MAIN RESULTS : Algorithm accuracy was referenced to board-certified radiologists who evaluated supine chest radiographs according to side-separate reading scores for pneumonia and effusion (0 = absent, 1 = possible, and 2 = highly suspected). Radiologists were blinded to the supine chest radiograph findings during CT interpretation. Performances of radiologists and the artificial intelligence algorithm were quantified by receiver-operating characteristic curve analysis. Diagnostic metrics (sensitivity, specificity, positive predictive value, negative predictive value, and accuracy) were calculated based on different receiver-operating characteristic operating points. Regarding pneumonia detection, radiologists achieved a maximum diagnostic accuracy of up to 0.87 (95% CI, 0.78-0.93) when considering only the supine chest radiograph reading score 2 as positive for pneumonia. Radiologists' maximum sensitivity of up to 0.87 (95% CI, 0.76-0.94) was achieved by additionally rating the supine chest radiograph reading score 1 as positive for pneumonia and taking previous examinations into account. Radiologic assessment achieved nonsignificantly higher results compared with the artificial intelligence algorithm: artificial intelligence area under the receiver-operating characteristic curve of 0.737 (0.659-0.815) versus radiologists' area under the receiver-operating characteristic curve of 0.779 (0.723-0.836); diagnostic metrics of receiver-operating characteristic operating points did not significantly differ. Regarding the detection of pleural effusions, there was no significant performance difference between radiologists and the artificial intelligence algorithm: artificial intelligence area under the receiver-operating characteristic curve of 0.740 (0.662-0.817) versus radiologists' area under the receiver-operating characteristic curve of 0.698 (0.646-0.749), with similar diagnostic metrics for receiver-operating characteristic operating points.
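
How an area under the curve and operating-point metrics follow from a three-level reading score can be sketched as below. The scores and reference labels are invented toy data, not the study's; the function names are hypothetical, and the AUC uses the standard pairwise (Mann-Whitney) formulation with ties counted as one half.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(scores, labels, cutoff):
    """Metrics when a score >= cutoff is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy reading scores (0 = absent, 1 = possible, 2 = highly suspected)
# against a toy reference standard (1 = finding present).
scores = [2, 2, 1, 1, 0, 2, 1, 0, 0, 0]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

auc = roc_auc(scores, labels)
strict = diagnostic_metrics(scores, labels, cutoff=2)   # only score 2 positive
lenient = diagnostic_metrics(scores, labels, cutoff=1)  # scores 1 and 2 positive
```

Lowering the cutoff from 2 to 1 raises sensitivity at the cost of specificity, which mirrors the trade-off between the two radiologist operating points described above.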

CONCLUSIONS : Considering the minor level of performance differences between the algorithm and radiologists, we regard artificial intelligence as a promising clinical decision support tool for supine chest radiograph examinations in the clinical routine with high potential to reduce the number of missed findings in an artificial intelligence-assisted reading setting.

Rueckel Johannes, Kunz Wolfgang G, Hoppe Boj F, Patzig Maximilian, Notohamiprodjo Mike, Meinel Felix G, Cyran Clemens C, Ingrisch Michael, Ricke Jens, Sabel Bastian O


Oncology

How Does the Skeletal Oncology Research Group Algorithm's Prediction of 5-year Survival in Patients with Chondrosarcoma Perform on International Validation?

In Clinical orthopaedics and related research ; h5-index 71.0

BACKGROUND : The Skeletal Oncology Research Group (SORG) machine learning algorithm for predicting survival in patients with chondrosarcoma was developed using data from the Surveillance, Epidemiology, and End Results (SEER) registry. This algorithm was externally validated on a dataset of patients from the United States in an earlier study, where it demonstrated generally good performance but overestimated 5-year survival. In addition, this algorithm has not yet been validated in patients outside the United States; doing so would be important because external validation is necessary as algorithm performance may be misleading when applied in different populations.

QUESTIONS/PURPOSES : Does the SORG algorithm retain validity in patients who underwent surgery for primary chondrosarcoma outside the United States, specifically in Italy?

METHODS : A total of 737 patients were treated for chondrosarcoma between January 2000 and October 2014 at the Italian tertiary care center which was used for international validation. We excluded patients whose first surgical procedure was performed elsewhere (n = 25), patients who underwent nonsurgical treatment (n = 27), patients with a chondrosarcoma of the soft tissue or skull (n = 60), and patients with peripheral, periosteal, or mesenchymal chondrosarcoma (n = 161). Thus, 464 patients were ultimately included in this external validation study, as the earlier performed SEER study was used as the training set. Therefore, this study, unlike most of this type, does not have a training and validation set. Although the earlier study overestimated 5-year survival, we did not modify the algorithm in this report, as this is the first international validation and the prior performance in the single-institution validation study from the United States may have been driven by a small sample or non-generalizable patterns related to its single-center setting. Variables needed for the SORG algorithm were manually collected from electronic medical records. These included sex, age, histologic subtype, tumor grade, tumor size, tumor extension, and tumor location. By inputting these variables into the algorithm, we calculated the predicted probabilities of survival for each patient. The performance of the SORG algorithm was assessed in this study through discrimination (the ability of a model to distinguish between a binary outcome), calibration (the agreement of observed and predicted outcomes), overall performance (the accuracy of predictions), and decision curve analysis (an assessment of whether decisions made with the model are better than those made without it).
For discrimination, the c-statistic (commonly known as the area under the receiver operating characteristic curve for binary classification) was calculated; this ranged from 0.5 (no better than chance) to 1.0 (excellent discrimination). The agreement between predicted and observed outcomes was visualized with a calibration plot, and the calibration slope and intercept were calculated. Perfect calibration results in a slope of 1 and an intercept of 0. For overall performance, the Brier score and the null-model Brier score were calculated. The Brier score ranges from 0 (perfect prediction) to 1 (poorest prediction). Appropriate interpretation of the Brier score requires comparison with the null-model Brier score. The null-model Brier score is the score for an algorithm that predicts a probability equal to the population prevalence of the outcome for every patient. A decision curve analysis was performed to compare the potential net benefit of the algorithm versus other means of decision support, such as treating all or none of the patients. There were several differences between this study and the earlier SEER study, and such differences are important because they help us to determine the performance of the algorithm in a group different from the initial study population. In this study from Italy, 5-year survival was different from the earlier SEER study (71% [319 of 450 patients] versus 76% [1131 of 1487 patients]; p = 0.03). There were more patients with dedifferentiated chondrosarcoma than in the earlier SEER study (25% [118 of 464 patients] versus 8.5% [131 of 1544 patients]; p < 0.001). 
In addition, in this study patients were older, tumor size was larger, and there were higher proportions of high-grade tumors than the earlier SEER study (age: 56 years [interquartile range {IQR} 42 to 67] versus 52 years [IQR 40 to 64]; p = 0.007; tumor size: 80 mm [IQR 50 to 120] versus 70 mm [IQR 42 to 105]; p < 0.001; tumor grade: 22% [104 of 464 had Grade 1], 42% [196 of 464 had Grade 2], and 35% [164 of 464 had Grade 3] versus 41% [592 of 1456 had Grade 1], 40% [588 of 1456 had Grade 2], and 19% [276 of 1456 had Grade 3]; p ≤ 0.001).
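
The Brier score and its null-model reference described above can be worked through in a few lines. The sketch uses only the survival counts stated in the methods (319 of 450 patients); the function names are hypothetical, and it is an assumption that the null model was evaluated on these 450 patients.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def null_model_brier(outcomes):
    """Brier score when every patient is assigned the outcome prevalence."""
    prevalence = sum(outcomes) / len(outcomes)
    return brier_score([prevalence] * len(outcomes), outcomes)


# 1 = alive at 5 years, 0 = not; counts taken from the text above.
outcomes = [1] * 319 + [0] * (450 - 319)
prevalence = sum(outcomes) / len(outcomes)
null_score = null_model_brier(outcomes)
```

For a constant prediction equal to the prevalence p, the score reduces to p(1 - p), which here is about 0.21. A model is only informative to the extent that its Brier score falls below that value.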

RESULTS : Validation of the SORG algorithm in a primarily Italian population achieved a c-statistic of 0.86 (95% confidence interval 0.82 to 0.89), suggesting good-to-excellent discrimination. The calibration plot showed good agreement between the predicted probability and observed survival in the probability thresholds of 0.8 to 1.0. With predicted survival probabilities lower than 0.8, however, the SORG algorithm underestimated the observed proportion of patients with 5-year survival, reflected in the overall calibration intercept of 0.82 (95% CI 0.67 to 0.98) and calibration slope of 0.68 (95% CI 0.42 to 0.95). The Brier score for 5-year survival was 0.15, compared with a null-model Brier of 0.21. The algorithm showed a favorable decision curve analysis in the validation cohort.
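
The quantity plotted in a decision curve analysis is the net benefit at a chosen probability threshold. The sketch below uses the standard net benefit formula on invented toy predictions; the threshold, the data, and the function names are illustrative assumptions and do not reproduce the study's analysis.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(outcomes)
    pairs = list(zip(probs, outcomes))
    tp = sum(1 for p, y in pairs if p >= threshold and y == 1)
    fp = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of acting on every patient regardless of prediction."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted probabilities and observed binary outcomes.
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]

nb_model = net_benefit(probs, outcomes, threshold=0.5)
nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
nb_none = 0.0  # acting on nobody yields zero net benefit by definition
```

A decision curve repeats this comparison across a range of thresholds; a model is considered favorable where its net benefit exceeds both the treat-all and treat-none strategies.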

CONCLUSIONS : The SORG algorithm to predict 5-year survival for patients with chondrosarcoma held good discriminative ability and overall performance on international external validation; however, it underestimated 5-year survival for patients with predicted probabilities from 0 to 0.8 because the calibration plot was not perfectly aligned for the observed outcomes, which resulted in a maximum underestimation of 20%. The differences may reflect the baseline differences noted between the two study populations. The overall performance of the algorithm supports the utility of the algorithm and validation presented here. The freely available digital application for the algorithm is available here:

LEVEL OF EVIDENCE : Level III, prognostic study.

Bongers Michiel E R, Karhade Aditya V, Setola Elisabetta, Gambarotti Marco, Groot Olivier Q, Erdoğan Kıvılcım E, Picci Piero, Donati Davide M, Schwab Joseph H, Palmerini Emanuela