General General

Digitization of Broccoli Freshness Integrating External Color and Mass Loss.

In Foods (Basel, Switzerland)

Yellowing of green vegetables due to chlorophyll decomposition is a phenomenon indicating serious deterioration of freshness, and it is evaluated by measuring color space values. In contrast, mass reduction due to water loss is a deterioration of freshness observed in all horticultural crops. Therefore, in this study, we propose a novel freshness evaluation index for green vegetables that combines the degree of greenness and mass loss. The green color retention rate was measured using a computer vision system, and the mass retention rate was measured by weighing. Linear discriminant analysis (LDA) was performed using both variables (greenness and mass) as covariates to obtain a single freshness evaluation value (the first canonical variable). LDA classified storage period length correctly in 96% of cases. Green color retention alone allowed classification of storage durations between 0 and 10 days, whereas LDA could classify storage durations between 0 and 12 days. The novel freshness evaluation index proposed in this research, which integrates greenness and mass, was shown to be more accurate than the conventional evaluation index that uses only greenness.
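As a rough illustration of how two retention rates can be collapsed into a single discriminant score, the sketch below fits a two-class Fisher linear discriminant in plain Python. It is a simplification of the multi-class LDA described in the abstract, not the authors' implementation. The sample values, the two class names, and the function names are all hypothetical.

```python
# Hypothetical (green retention, mass retention) pairs, as fractions of day-0 values.
fresh = [(0.98, 0.99), (0.97, 0.98), (0.99, 0.97), (0.96, 0.98)]
stored = [(0.90, 0.93), (0.88, 0.92), (0.91, 0.90), (0.87, 0.91)]


def class_mean(points):
    """Mean of each of the two features."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mu):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around mu."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(a, b):
    """Fisher discriminant direction w = Sw^-1 (mean_a - mean_b)."""
    mu_a, mu_b = class_mean(a), class_mean(b)
    sa, sb = scatter(a, mu_a), scatter(b, mu_b)
    sxx, sxy, syy = (sa[i] + sb[i] for i in range(3))  # pooled within-class scatter
    det = sxx * syy - sxy * sxy
    dx, dy = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # Multiply the difference of means by the inverse of the 2x2 scatter matrix.
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


def freshness_score(w, point):
    """Project one (greenness, mass) pair onto the discriminant direction."""
    return w[0] * point[0] + w[1] * point[1]


w = fisher_direction(fresh, stored)
# Cutoff halfway between the projected class means (assumes equal priors).
cutoff = (freshness_score(w, class_mean(fresh))
          + freshness_score(w, class_mean(stored))) / 2
for point in fresh + stored:
    label = "fresh" if freshness_score(w, point) > cutoff else "stored"
    print(point, label)
```

With real measurements there would be one class per storage period rather than two, and a library implementation of LDA would normally be preferred over this hand-rolled version.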

Makino Yoshio, Amino Genki


Brassica oleracea var. italica, computer vision, evaluation, image analysis, machine learning, shelf life, statistical analysis, vegetable

General General

Uncovering New Drug Properties in Target-Based Drug-Drug Similarity Networks.

In Pharmaceutics

Despite recent advances in bioinformatics, systems biology, and machine learning, the accurate prediction of drug properties remains an open problem. Indeed, because the biological environment is a complex system, the traditional approach, based on knowledge about chemical structures, cannot fully explain the nature of interactions between drugs and biological targets. Consequently, in this paper, we propose an unsupervised machine learning approach that uses known drug-target interactions to infer drug properties. To this end, we define drug similarity based on drug-target interactions and build a weighted Drug-Drug Similarity Network (DDSN) according to the drug-drug similarity relationships. Using an energy-model network layout, we generate drug communities associated with specific, dominant drug properties. DrugBank confirms the properties of 59.52% of the drugs in these communities, and 26.98% are existing drug repositioning hints that we reconstruct with our DDSN approach. The remaining 13.49% of the drugs do not seem to match the dominant pharmacologic property; thus, we consider them potential drug repurposing hints. The resources required to test all these repurposing hints are considerable. Therefore, we introduce a prioritization mechanism based on the betweenness/degree node centrality. Using betweenness/degree as an indicator of drug repurposing potential, we select Azelaic acid and Meprobamate as a possible antineoplastic and antifungal, respectively. Finally, we use a test procedure based on molecular docking to analyze the repurposing of Azelaic acid and Meprobamate.
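The network construction and the betweenness/degree ranking can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the authors' pipeline: the drug and target names are hypothetical, the edge weight is assumed to be the number of shared targets, and betweenness is computed on the unweighted graph with Brandes' algorithm for simplicity.

```python
from collections import deque
from itertools import combinations

# Hypothetical drug -> target sets; real data would come from a source such as DrugBank.
targets = {
    "drug_A": {"T1", "T2"},
    "drug_B": {"T2", "T3"},
    "drug_C": {"T1", "T2", "T3"},
    "drug_D": {"T3", "T4"},
    "drug_E": {"T4", "T5"},
    "drug_F": {"T5"},
}


def build_network(targets):
    """Weighted edges between drugs; weight = number of shared targets (assumption)."""
    edges = {}
    for a, b in combinations(sorted(targets), 2):
        shared = len(targets[a] & targets[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def adjacency(edges):
    """Undirected adjacency sets built from the edge dictionary."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Brandes' algorithm on the unweighted, undirected graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)   # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)   # hop distance from s
        sigma[s], dist[s] = 1, 0
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                    # accumulate dependencies back towards s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted in both directions


def rank_by_ratio(adj):
    """Drugs sorted by betweenness/degree, highest first."""
    bc = betweenness(adj)
    ratios = [(drug, bc[drug] / len(adj[drug])) for drug in adj]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


for drug, ratio in rank_by_ratio(adjacency(build_network(targets))):
    print(f"{drug}: {ratio:.2f}")
```

Drugs that bridge otherwise separate communities while having few direct neighbours score highest on this ratio, which is the intuition behind using it as a prioritization signal.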

Udrescu Lucreţia, Bogdan Paul, Chiş Aimée, Sîrbu Ioan Ovidiu, Topîrceanu Alexandru, Văruţ Renata-Maria, Udrescu Mihai


drug repurposing, drug–drug similarity network, drug–target interactions, molecular docking, network centrality, network clustering

General General

Detection of COVID-19 from Chest X-Ray Images Using Convolutional Neural Networks.

In SLAS technology

The detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is responsible for coronavirus disease 2019 (COVID-19), using chest X-ray images has life-saving importance for both patients and doctors. In countries that are unable to purchase laboratory kits for testing, this becomes even more vital. In this study, we aimed to present the use of deep learning for the high-accuracy detection of COVID-19 using chest X-ray images. Publicly available X-ray images (1583 healthy, 4292 pneumonia, and 225 confirmed COVID-19) were used in the experiments, which involved the training of deep learning and machine learning classifiers. Thirty-eight experiments were performed using convolutional neural networks, 10 experiments were performed using five machine learning models, and 14 experiments were performed using state-of-the-art pre-trained networks for transfer learning. Images and statistical data were considered separately in the experiments to evaluate the performance of the models, and eightfold cross-validation was used. A mean sensitivity of 93.84%, mean specificity of 99.18%, mean accuracy of 98.50%, and mean area under the receiver operating characteristic curve of 96.51% were achieved. A convolutional neural network without pre-processing and with minimized layers is capable of detecting COVID-19 in a limited and imbalanced set of chest X-ray images.
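With only 225 COVID-19 images among 6100, how the eight folds are drawn matters. The sketch below shows one way to build folds that preserve the class proportions. The abstract does not say whether the folds were stratified, so stratification is an assumption here, as are the function name and the fixed seed.

```python
import random


def stratified_folds(labels, k=8, seed=0):
    """Split sample indices into k folds, keeping each class spread evenly.

    Stratification is an assumption; the abstract only states that
    eightfold cross-validation was used.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Class counts taken from the abstract.
labels = ["healthy"] * 1583 + ["pneumonia"] * 4292 + ["covid"] * 225
folds = stratified_folds(labels, k=8)
for number, fold in enumerate(folds):
    covid = sum(labels[i] == "covid" for i in fold)
    print(f"fold {number}: {len(fold)} images, {covid} COVID-19")
```

Each fold then serves once as the held-out test set while the remaining seven are used for training, and the reported metrics are averaged over the eight runs.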

Sekeroglu Boran, Ozsahin Ilker


COVID-19, X-ray, convolutional neural networks, coronavirus, pneumonia

General General

Heavy metal Hg stress detection in tobacco plant using hyperspectral sensing and data-driven machine learning methods.

In Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy

Accurate detection of the effects of heavy metal stress on plant growth status is of great concern for agricultural production and management, food security, and the ecological environment. A proximal hyperspectral imaging (HSI) system covering the visible/near-infrared (Vis/NIR) region of 400-1000 nm, coupled with machine learning methods, was employed to discriminate tobacco plants stressed by different concentrations of the heavy metal Hg. After acquiring hyperspectral images of tobacco plants stressed with Hg solutions at concentrations of 0 mg·L⁻¹ (non-stressed group) and 1, 3, and 5 mg·L⁻¹ (three stressed groups), regions of interest (ROIs) of the canopy were identified for spectral processing. Meanwhile, the appearance of the tobacco plants and the microstructure of the mesophyll tissue in the leaves were analyzed. Clustering of the non-stressed and stressed groups was then revealed by score plots and score images calculated by principal component analysis (PCA). Next, PCA loadings and the competitive adaptive reweighted sampling (CARS) algorithm were employed to pick effective wavelengths (EWs) for discriminating non-stressed and stressed samples. Partial least squares discriminant analysis (PLS-DA) and least-squares support vector machine (LS-SVM) were used to estimate the status of tobacco plants stressed with different concentrations of Hg solution. The performance of those models was evaluated using confusion matrices (CMes) and receiver operating characteristic (ROC) curves. The results demonstrated that the PLS-DA models failed to give good results, so this algorithm was abandoned for classifying the stressed and non-stressed groups. Compared with the LS-SVM model based on full spectra (FS-LS-SVM), the LS-SVM model built on the 13 EWs selected by CARS (CARS-LS-SVM) provided an accuracy of 100%, which is promising for the qualitative discrimination of non-stressed and stressed tobacco plants. Meanwhile, to reveal the differences among the three stressed groups, further FS-LS-SVM, PCA-LS-SVM, and CARS-LS-SVM models were set up and gave relatively low accuracies of 55.56%, 51.11%, and 66.67%, respectively. This poor performance in differentiating the three stressed groups might be caused by the low concentrations of heavy metal and the similar canopies (especially in fresh leaves) of the plants. The results of the research indicate that HSI coupled with machine learning methods has strong potential for discriminating tobacco plants stressed by the heavy metal Hg.

Yu Keqiang, Fang Shiyan, Zhao Yanru


Canopy, Heavy metal Hg stress, Machine learning methods, Proximal hyperspectral imaging, Tobacco plant

General General

DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information.

In Analytical biochemistry

Phosphorylation is a ubiquitous type of post-translational modification (PTM) that occurs in both eukaryotic and prokaryotic cells, wherein a phosphate group binds to amino acid residues. These specific residues, i.e., serine (S), threonine (T), and tyrosine (Y), exhibit diverse functions at the molecular level. Recent studies have determined that some diseases, such as cancer, diabetes, and neurodegenerative diseases, are caused by abnormal phosphorylation. Because of its potential applications in biological research and drug development, the large-scale identification of phosphorylation sites has attracted interest. Existing wet-lab technologies for targeting phosphorylation sites are expensive and time-consuming. Thus, computational algorithms that can efficiently accelerate the annotation of phosphorylation sites from massive protein sequences are needed. Numerous machine learning-based methods have been implemented for phosphorylation site prediction. However, despite extensive efforts, existing computational approaches continue to have inadequate performance, particularly in terms of overall accuracy (ACC), Matthews correlation coefficient (MCC), and area under the curve (AUC). In this paper, we report a novel deep learning-based predictor to overcome these performance hurdles, DeepPPSite, which was constructed using a stacked long short-term memory recurrent neural network for predicting phosphorylation sites. The proposed technique expediently learns the protein representations from conjoint protein descriptors. The experimental results indicated that our model achieved superior performance on the training dataset for S, T, and Y, with MCC values of 0.608, 0.602, and 0.558, respectively, using a 10-fold cross-validation test. We further determined the generalization efficacy of the proposed predictor DeepPPSite by conducting a rigorous independent test. The predictive MCC values were 0.358, 0.356, and 0.350 for the S, T, and Y phosphorylation sites, respectively. Rigorous cross-validation and independent validation tests for the three types of phosphorylation site demonstrated that the designed DeepPPSite tool significantly outperforms state-of-the-art methods.
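For readers unfamiliar with the Matthews correlation coefficient (MCC) reported above, the sketch below computes it from the four cells of a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the paper. Returning 0 when the denominator vanishes is a common convention, adopted here as an assumption.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when a whole row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for one residue type.
print(round(matthews_corrcoef(tp=80, tn=70, fp=30, fn=20), 3))
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect), and it stays informative when positive and negative sites are unevenly represented, which is why it is often reported alongside accuracy.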

Ahmed Saeed, Kabir Muhammad, Arif Muhammad, Khan Zaheer Ullah, Yu Dong-Jun


Deep Learning, Phosphorylation sites, Post-translation modification, Sequence feature information, Stacked Long Short Term Memory

General General

Real-time artificial intelligence-based histological classification of colorectal polyps with augmented visualization.

In Gastrointestinal endoscopy ; h5-index 72.0

BACKGROUND AND AIMS : Artificial intelligence (AI)-based computer-aided diagnostic (CADx) algorithms are a promising approach for real-time histology (RTH) of colonic polyps. Our aim is to present a novel in situ CADx approach that seeks to increase transparency and interpretability of results by generating an intuitive augmented visualization of the model's predicted histology over the polyp surface.

METHODS : We developed a deep learning (DL) model using semantic segmentation to delineate polyp boundaries, and a DL model to classify subregions within the segmented polyp. These subregions were classified independently, and subsequently aggregated to generate a histology map of the polyp's surface. We used 740 high-magnification narrow-band images from 607 polyps in 286 patients, and over 65,000 subregions, to train and validate the model.

RESULTS : The model achieved a sensitivity of 0.96, specificity of 0.84, negative predictive value (NPV) of 0.91, and high-confidence rate (HCR) of 0.88, distinguishing 171 neoplastic polyps from 83 non-neoplastic polyps of all sizes. Among 93 neoplastic and 75 non-neoplastic polyps ≤5 mm, the model achieved a sensitivity of 0.95, specificity of 0.84, NPV of 0.91 and HCR of 0.86.

CONCLUSIONS : The CADx model is capable of accurately distinguishing neoplastic from non-neoplastic polyps and provides a histology map of the spatial distribution of localized histologic predictions along the delineated polyp surface. This capability may improve interpretability and transparency of AI-based RTH and offer intuitive, accurate, and user-friendly guidance in real time for the clinical management and documentation of optical histology results.
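The step from independent subregion predictions to a histology map and a polyp-level call can be sketched as follows. The abstract does not specify how subregions are aggregated or how high confidence is defined, so the mean-probability rule, the 0.5 threshold, the 0.2 confidence margin, and all function names below are assumptions made purely for illustration.

```python
def histology_map(probabilities, threshold=0.5):
    """Label each subregion; None marks cells outside the segmented polyp."""
    return [
        [
            None if p is None
            else ("neoplastic" if p >= threshold else "non-neoplastic")
            for p in row
        ]
        for row in probabilities
    ]


def polyp_call(probabilities, threshold=0.5, confidence_margin=0.2):
    """Aggregate subregion probabilities into one polyp-level prediction.

    Uses the mean probability (an assumed rule) and flags the call as
    high-confidence when the mean is far enough from the threshold.
    """
    inside = [p for row in probabilities for p in row if p is not None]
    mean_probability = sum(inside) / len(inside)
    label = "neoplastic" if mean_probability >= threshold else "non-neoplastic"
    high_confidence = abs(mean_probability - threshold) >= confidence_margin
    return label, high_confidence


# Hypothetical neoplastic probabilities for a 2x3 grid of subregions.
grid = [
    [0.90, 0.80, None],
    [0.70, 0.95, 0.85],
]
for row in histology_map(grid):
    print(row)
print(polyp_call(grid))
```

Keeping the per-subregion labels alongside the aggregate call is what allows the prediction to be overlaid on the polyp surface rather than reported as a single opaque label.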

Rodriguez-Diaz Eladio, Baffy György, Lo Wai-Kit, Mashimo Hiroshi, Vidyarthi Gitanjali, Mohapatra Shyam S, Singh Satish K


artificial intelligence, augmented visualization, colorectal neoplasm, colorectal polyps, computer-aided diagnosis, deep learning, endoscopy, histology map, machine learning, near-focus narrow-band imaging, optical biopsy, real-time polyp histology