Doctor Penguin

General

Design of intelligent module design for humanoid translation robot by combining the deep learning with blockchain technology.

In Scientific reports ; h5-index 158.0
To advance the application of deep learning to text data processing, an English statistical translation system is established and applied to question answering by a humanoid robot. First, a machine translation model based on a Recurrent Neural Network (RNN) is implemented. A crawler system is built to collect English movie subtitle data, and on this basis an English subtitle translation system is designed. Then, combined with sentence embedding technology, the Particle Swarm Optimization (PSO) algorithm, a meta-heuristic, is adopted to locate defects in the translation software. An automatic question-and-answer interactive module for the translation robot is constructed. Additionally, a hybrid recommendation mechanism based on personalized learning is built using blockchain technology. Finally, the performance of the translation model and of the software defect location model is evaluated. The results show that the RNN embedding algorithm has some word-clustering effect and that the RNN embedding model handles short sentences well. The best-translated sentences are between 11 and 39 words long, while the worst-translated sentences are between 71 and 79 words long. Therefore, the model must strengthen its processing of long sentences, especially for character-level input, where the average sentence length is much longer than for word-level input. The PSO-based model shows good accuracy on different datasets. On average, it outperforms the comparison methods on the Tomcat, Standard Widget Toolkit, and Java Development Tools datasets. The mean reciprocal rank and average accuracy of the PSO weight combination are very high. Moreover, the method is strongly affected by the dimension of the word embedding model, and the 300-dimensional word embedding model performs best. In summary, this study proposes a good statistical translation model for humanoid robot English translation, which lays a foundation for intelligent interaction between humanoid robots.
Yang Fan, Deng Jie

2023-Mar-09
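
The reciprocal-rank metric reported in the abstract above (usually called mean reciprocal rank, MRR) can be illustrated with a minimal Python sketch. The file names and ranked lists are hypothetical and are not taken from the paper; the sketch only assumes that, for each bug report, a defect-location model returns a ranked list of candidate files.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# Hypothetical data: three bug reports, each with a ranked list of candidate files.
ranked = [
    ["A.java", "B.java", "C.java"],
    ["D.java", "E.java", "F.java"],
    ["G.java", "H.java", "I.java"],
]
buggy = [{"B.java"}, {"D.java"}, {"Z.java"}]

# Reciprocal ranks are 1/2, 1, and 0 (no hit), so the mean is 0.5.
mrr = mean_reciprocal_rank(ranked, buggy)
print(mrr)
```

A higher MRR means the first truly defective file tends to appear nearer the top of the ranking.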

General

A CAD system for automatic dysplasia grading on H&E cervical whole-slide images.

In Scientific reports ; h5-index 158.0
Cervical cancer is the fourth most common female cancer worldwide and the fourth leading cause of cancer-related death in women. Nonetheless, it is also among the most successfully preventable and treatable types of cancer, provided it is identified early and properly managed. As such, the detection of pre-cancerous lesions is crucial. These lesions are detected in the squamous epithelium of the uterine cervix and are graded as low- or high-grade squamous intraepithelial lesions, known as LSIL and HSIL, respectively. Due to the complex nature of these lesions, the classification can become very subjective. Therefore, machine learning models, particularly those operating directly on whole-slide images (WSI), can assist pathologists in this task. In this work, we propose a weakly-supervised methodology for grading cervical dysplasia, using different levels of training supervision, in an effort to gather a bigger dataset without requiring all samples to be fully annotated. The framework comprises an epithelium segmentation step followed by a dysplasia classifier (non-neoplastic, LSIL, HSIL), making the slide assessment completely automatic, without the need for manual identification of epithelial areas. The proposed classification approach achieved a balanced accuracy of 71.07% and a sensitivity of 72.18% at the slide level, testing on 600 independent samples, which are publicly available upon reasonable request.
Oliveira Sara P, Montezuma Diana, Moreira Ana, Oliveira Domingos, Neto Pedro C, Monteiro Ana, Monteiro João, Ribeiro Liliana, Gonçalves Sofia, Pinto Isabel M, Cardoso Jaime S

2023-Mar-09
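
As a rough illustration of the balanced accuracy reported for the three-class dysplasia grading above, the sketch below computes per-class recall and averages it over the classes. The labels reuse the class names from the abstract, but the predictions are invented for the example, and the exact definition of sensitivity used by the authors is not stated here, so this is an assumption rather than their evaluation code.

```python
def per_class_recall(y_true, y_pred, labels):
    """Recall for each class: correctly predicted samples / samples of that class."""
    recalls = {}
    for label in labels:
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total if total else 0.0
    return recalls


def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of per-class recall, so rare classes count as much as common ones."""
    recalls = per_class_recall(y_true, y_pred, labels)
    return sum(recalls.values()) / len(labels)


labels = ["non-neoplastic", "LSIL", "HSIL"]
# Hypothetical slide-level ground truth and predictions (8 slides).
y_true = ["non-neoplastic"] * 4 + ["LSIL"] * 2 + ["HSIL"] * 2
y_pred = ["non-neoplastic", "non-neoplastic", "non-neoplastic", "LSIL",
          "LSIL", "HSIL",
          "HSIL", "HSIL"]

recalls = per_class_recall(y_true, y_pred, labels)
bal_acc = balanced_accuracy(y_true, y_pred, labels)
print(recalls, bal_acc)
```

Plain accuracy on the same toy data would be 6/8, which happens to match here; on imbalanced slide sets the two measures generally differ, which is why balanced accuracy is often preferred.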

General

Quantum deep learning by sampling neural nets with a quantum annealer.

In Scientific reports ; h5-index 158.0
We demonstrate the feasibility of framing a classically learned deep neural network as an energy based model that can be processed on a one-step quantum annealer in order to exploit fast sampling times. We propose approaches to overcome two hurdles for high resolution image classification on a quantum processing unit (QPU): the required number and the binary nature of the model states. With this novel method we successfully transfer a pretrained convolutional neural network to the QPU. By taking advantage of the strengths of quantum annealing, we show the potential for classification speedup of at least one order of magnitude.
Higham Catherine F, Bedford Adrian

2023-Mar-09

General

Machine learning accelerated approach to infer nuclear magnetic resonance porosity for a middle eastern carbonate reservoir.

In Scientific reports ; h5-index 158.0
Carbonate rocks present a complicated pore system owing to the existence of intra-particle and inter-particle porosities. Therefore, characterization of carbonate rocks using petrophysical data is a challenging task. Conventional neutron, sonic, and neutron-density porosities have proven to be less accurate than the NMR porosity. This study aims to predict the NMR porosity by implementing three different machine learning (ML) algorithms using conventional well logs, including neutron-porosity, sonic, resistivity, gamma ray, and photoelectric factor. The data, comprising 3500 data points, were acquired from a vast carbonate petroleum reservoir in the Middle East. The input parameters were selected based on their relative importance with respect to the output parameter. Three ML techniques, namely adaptive neuro-fuzzy inference system (ANFIS), artificial neural network (ANN), and functional network (FN), were implemented to develop the prediction models. Model accuracy was evaluated by the correlation coefficient (R), root mean square error (RMSE), and average absolute percentage error (AAPE). The results demonstrated that all three prediction models are reliable and consistent, exhibiting low errors and high R values for both training and testing predictions when compared with the actual dataset. However, the ANN model performed better than the other two ML techniques, with the lowest AAPE and RMSE (5.12 and 0.39) and the highest R (0.95) for the testing and validation outcomes. The AAPE and RMSE for the testing and validation results were 5.38 and 0.41 for ANFIS and 6.06 and 0.48 for the FN model, respectively. The ANFIS and FN models exhibited R values of 0.937 and 0.942, respectively, for the testing and validation dataset. Based on the testing and validation results, the ANFIS and FN models were ranked second and third after ANN. Further, the optimized ANN and FN models were used to extract explicit correlations to compute the NMR porosity. Hence, this study demonstrates the successful application of ML techniques to the accurate prediction of NMR porosity.
Mustafa Ayyaz, Tariq Zeeshan, Mahmoud Mohamed, Abdulraheem Abdulazeez

2023-Mar-09
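
The three evaluation measures named in the abstract above (R, RMSE, and AAPE) can be sketched in plain Python as follows. The porosity values are made up for the example and are not from the study, and AAPE is assumed here to be the mean of |actual - predicted| / |actual|, expressed in percent.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def aape(actual, predicted):
    """Average absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical NMR porosity values (percent) and model predictions.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 33.0, 38.0]

r_value = pearson_r(actual, predicted)
rmse_value = rmse(actual, predicted)
aape_value = aape(actual, predicted)
print(r_value, rmse_value, aape_value)
```

Because AAPE divides by the actual value, it weights errors on low-porosity samples more heavily than RMSE does, which is one reason the two are reported together.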

General

All-optical image classification through unknown random diffusers using a single-pixel diffractive network.

In Light, science & applications
Classification of an object behind a random and unknown scattering medium sets a challenging task for computational imaging and machine vision fields. Recent deep learning-based approaches demonstrated the classification of objects using diffuser-distorted patterns collected by an image sensor. These methods demand relatively large-scale computing using deep neural networks running on digital computers. Here, we present an all-optical processor to directly classify unknown objects through unknown, random phase diffusers using broadband illumination detected with a single pixel. A set of transmissive diffractive layers, optimized using deep learning, forms a physical network that all-optically maps the spatial information of an input object behind a random diffuser into the power spectrum of the output light detected through a single pixel at the output plane of the diffractive network. We numerically demonstrated the accuracy of this framework using broadband radiation to classify unknown handwritten digits through random new diffusers, never used during the training phase, and achieved a blind testing accuracy of 87.74 ± 1.12%. We also experimentally validated our single-pixel broadband diffractive network by classifying handwritten digits "0" and "1" through a random diffuser using terahertz waves and a 3D-printed diffractive network. This single-pixel all-optical object classification system through random diffusers is based on passive diffractive layers that process broadband input light and can operate at any part of the electromagnetic spectrum by simply scaling the diffractive features proportional to the wavelength range of interest. These results have various potential applications in, e.g., biomedical imaging, security, robotics, and autonomous driving.
Bai Bijie, Li Yuhang, Luo Yi, Li Xurong, Çetintaş Ege, Jarrahi Mona, Ozcan Aydogan

2023-Mar-09

General

Overproduce and select, or Determine Optimal Molecular Descriptor Subset via Configuration Space Optimization? Application to the Prediction of Ecotoxicological Endpoints.

In Molecular informatics
Predicting the likely biological activity (or property) of compounds is a fundamental and challenging task in the drug discovery process. Current computational methodologies aim to improve their predictive accuracies by using deep learning (DL) approaches. However, shallow learning-based methodologies have been shown to be most suitable for small- and medium-sized chemical datasets. The latter start with a universe of molecular descriptors (MDs), then apply different feature selection algorithms, and finally construct a predictive model for the intended learning task. We demonstrate here that this approach may miss relevant information by assuming that the initial universe of MDs codifies all relevant aspects of the respective learning task when it does not. We argue that the limitation arises mainly from the constrained intervals of the parameters used in the algorithms that compute MDs, parameters that define the Descriptor Configuration Space (DCS). We propose to relax these constraints in an open DCS approach, so that a larger universe of MDs can initially be considered. We model the generation of MDs as a multicriteria optimization problem and tackle it with a variant of the standard genetic algorithm. As a novel component, the individual fitness function is computed by aggregating four criteria via the Choquet integral using a fuzzy (non-additive) measure. Experimental results on benchmark chemical datasets show that the proposed approach generates a meaningful DCS, improving on state-of-the-art approaches in most of the datasets.
García-González Luis A, Marrero-Ponce Yovani, Brizuela Carlos A, Garcia-Jacas Cesar

2023-Mar-09

ecotoxicological endpoints, genetic algorithm, molecular descriptors
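
To make the Choquet-integral aggregation mentioned in the abstract above more concrete, here is a minimal sketch of the discrete Choquet integral over four criteria. The criterion names (`c1` to `c4`), their scores, and both example measures are hypothetical; the abstract does not specify the actual criteria or fuzzy measure, and the sketch assumes non-negative scores and a measure with value 1 on the full set.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative scores.

    values:  dict mapping criterion name -> score
    measure: function mapping a frozenset of criteria to a weight in [0, 1]
    """
    ordered = sorted(values, key=values.get)  # criteria by ascending score
    total = 0.0
    previous = 0.0
    for i, criterion in enumerate(ordered):
        # Coalition of all criteria whose score is at least the current one.
        coalition = frozenset(ordered[i:])
        total += (values[criterion] - previous) * measure(coalition)
        previous = values[criterion]
    return total


# Hypothetical scores for four fitness criteria.
scores = {"c1": 0.2, "c2": 0.4, "c3": 0.6, "c4": 0.8}


def additive(coalition):
    """Additive measure: the integral reduces to the arithmetic mean."""
    return len(coalition) / 4


def non_additive(coalition):
    """Non-additive measure: large coalitions weigh more than the sum of their parts."""
    return (len(coalition) / 4) ** 2


fitness_additive = choquet_integral(scores, additive)
fitness_fuzzy = choquet_integral(scores, non_additive)
print(fitness_additive, fitness_fuzzy)
```

With the additive measure the result is the plain average of the scores, while the non-additive measure yields a lower value because it rewards candidates only when many criteria are satisfied together, which is the kind of interaction a weighted sum cannot express.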