General

A Comparison of Random Forest Variable Selection Methods for Classification Prediction Modeling.

In Expert systems with applications

Random forest classification is a popular machine learning method for developing prediction models in many research settings. Often in prediction modeling, a goal is to reduce the number of variables needed to obtain a prediction in order to reduce the burden of data collection and improve efficiency. Several variable selection methods exist for the setting of random forest classification; however, there is a paucity of literature to guide users as to which method may be preferable for different types of datasets. Using 311 classification datasets freely available online, we evaluate the prediction error rates, number of variables, computation times and area under the receiver operating characteristic curve for many random forest variable selection methods. We compare random forest variable selection methods for different types of datasets (datasets with binary outcomes, datasets with many predictors, and datasets with imbalanced outcomes) and for different types of methods (standard random forest versus conditional random forest methods and test-based versus performance-based methods). Based on our study, the best variable selection methods for most datasets are Jiang's method and the method implemented in the VSURF R package. For datasets with many predictors, the methods implemented in the R packages varSelRF and Boruta are preferable due to computational efficiency. A significant contribution of this study is the ability to assess different variable selection techniques in the setting of random forest classification in order to identify preferable methods based on applications in expert and intelligent systems.

Speiser Jaime Lynn, Miller Michael E, Tooze Janet, Ip Edward


classification, feature reduction, random forest, variable selection

General

Third-order nanocircuit elements for neuromorphic engineering.

In Nature ; h5-index 368.0

Current hardware approaches to biomimetic or neuromorphic artificial intelligence rely on elaborate transistor circuits to simulate biological functions. However, these can instead be more faithfully emulated by higher-order circuit elements that naturally express neuromorphic nonlinear dynamics [1-4]. Generating neuromorphic action potentials in a circuit element theoretically requires a minimum of third-order complexity (for example, three dynamical electrophysical processes) [5], but there have been few examples of second-order neuromorphic elements, and no previous demonstration of any isolated third-order element [6-8]. Using both experiments and modelling, here we show how multiple electrophysical processes, including Mott transition dynamics, form a nanoscale third-order circuit element. We demonstrate simple transistorless networks of third-order elements that perform Boolean operations and find analogue solutions to a computationally hard graph-partitioning problem. This work paves a way towards very compact and densely functional neuromorphic computing primitives, and energy-efficient validation of neuroscientific models.

Kumar Suhas, Williams R Stanley, Wang Ziwen


Radiology

Intensity harmonization techniques influence radiomics features and radiomics-based predictions in sarcoma patients.

In Scientific reports ; h5-index 158.0

Intensity harmonization techniques (IHTs) are mandatory to homogenize multicentric MRIs before any quantitative analysis because signal intensities (SIs) do not have standardized units. Radiomics combines quantification of tumors' radiological phenotype with machine learning to improve predictive models, such as those for metastatic-relapse-free survival (MFS) in sarcoma patients. We post-processed the initial T2-weighted imaging of 70 sarcoma patients using 5 IHTs, namely classical standardization (IHTstd), standardization per adipose tissue SIs (IHTfat), histogram matching with a patient histogram (IHTHM.1), histogram matching with the average histogram of the population (IHTHM.All), and the same followed by the ComBat method (IHTHM.All.C). We then extracted 45 radiomics features (RFs), which provided 5 radiomics datasets in addition to the original radiomics dataset without IHT (No-IHT). We found that using IHTs significantly influenced all RF values (p-values: < 0.0001-0.02). Unsupervised clustering performed on each radiomics dataset showed that only clusters from the No-IHT, IHTstd, IHTHM.All, and IHTHM.All.C datasets significantly correlated with MFS in multivariate Cox models (p = 0.02, 0.007, 0.004 and 0.02, respectively). We built radiomics-based supervised models to predict metastatic relapse at 2 years with a training set of 50 patients. The models' performance in the validation set varied markedly depending on the IHT (range of AUROC from 0.688 with IHTstd to 0.823 with IHTHM.1). Hence, the use of intensity harmonization and the related technique should be carefully detailed in radiomics post-processing pipelines, as it can profoundly affect the reproducibility of analyses.

Crombé Amandine, Kind Michèle, Fadli David, Le Loarer François, Italiano Antoine, Buy Xavier, Saut Olivier
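
To make the idea of intensity harmonization in the abstract above concrete, here is a minimal sketch of z-score standardization. It assumes that "classical standardization" means rescaling each image to zero mean and unit standard deviation, which the abstract does not spell out. The function name `standardize_intensities` and the two toy scans are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def standardize_intensities(intensities):
    """Rescale signal intensities to zero mean and unit standard deviation.

    Assumption: this z-score form stands in for 'classical standardization';
    the abstract does not give the exact formula.
    """
    mu = mean(intensities)
    sigma = pstdev(intensities)
    if sigma == 0:
        raise ValueError("constant image: standard deviation is zero")
    return [(x - mu) / sigma for x in intensities]


# Hypothetical example: the same tissue imaged with a different gain and offset.
scan_a = [100.0, 120.0, 140.0, 160.0]
scan_b = [3.0 * x + 50.0 for x in scan_a]

std_a = standardize_intensities(scan_a)
std_b = standardize_intensities(scan_b)

# After standardization the two scans are numerically comparable.
print(std_a)
print(std_b)
```

Because z-scoring removes any positive scale factor and any offset, the two toy scans map to the same values. Real scanner differences are rarely purely linear, which is one plausible reason to compare several techniques as the study does.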


Pathology

Automated thermal imaging for the detection of fatty liver disease.

In Scientific reports ; h5-index 158.0

Non-alcoholic fatty liver disease (NAFLD) comprises a spectrum of progressive liver pathologies, ranging from simple steatosis to non-alcoholic steatohepatitis (NASH), fibrosis and cirrhosis. A liver biopsy is currently required to stratify high-risk patients, and predicting the degree of liver inflammation and fibrosis using non-invasive tests remains challenging. Here, we sought to develop a novel, cost-effective screening tool for NAFLD based on thermal imaging. We used a commercially available and non-invasive thermal camera and developed a new image processing algorithm to automatically predict disease status in a small animal model of fatty liver disease. To induce liver steatosis and inflammation, we fed C57/black female mice (8 weeks old) a methionine-choline deficient diet (MCD diet) for 6 weeks. We evaluated structural and functional liver changes by serial ultrasound studies, histopathological analysis, and blood tests for liver enzymes and lipids, and measured liver inflammatory cell infiltration by flow cytometry. We developed an image processing algorithm that measures relative spatial thermal variation across the skin covering the liver. Thermal parameters including temperature variance, homogeneity levels and other textural features were fed as input to a t-SNE dimensionality reduction algorithm followed by k-means clustering. During weeks 3, 4, and 5 of the experiment, our algorithm demonstrated a 100% detection rate and classified all mice correctly according to their disease status. Direct thermal imaging of the liver confirmed the presence of changes in surface thermography in diseased livers. We conclude that non-invasive thermal imaging combined with advanced image processing and machine learning-based analysis successfully correlates surface thermography with liver steatosis and inflammation in mice. Future development of this screening tool may improve our ability to study, diagnose and treat liver disease.

Brzezinski Rafael Y, Levin-Kotler Lapaz, Rabin Neta, Ovadia-Blechman Zehava, Zimmer Yair, Sternfeld Adi, Finchelman Joanna Molad, Unis Razan, Lewis Nir, Tepper-Shaihov Olga, Naftali-Shani Nili, Balint-Lahat Nora, Safran Michal, Ben-Ari Ziv, Grossman Ehud, Leor Jonathan, Hoffer Oshrit
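
The abstract above describes feeding thermal texture features to t-SNE and then clustering the result with k-means. The sketch below shows that two-stage pattern with scikit-learn on synthetic data. The feature values, group sizes, perplexity, and random seeds are all assumptions made for illustration. None of them come from the paper, and this is not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-animal thermal features (variance, homogeneity,
# other texture measures). These are NOT real measurements.
healthy = rng.normal(loc=0.0, scale=1.0, size=(15, 6))
diseased = rng.normal(loc=8.0, scale=1.0, size=(15, 6))
features = np.vstack([healthy, diseased])

# Stage 1: reduce the feature vectors to a 2-D embedding with t-SNE.
# Perplexity must be smaller than the number of samples (30 here).
embedding = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(features)

# Stage 2: split the embedded points into two groups with k-means.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)

print(embedding.shape)
print(labels)
```

k-means returns arbitrary cluster indices, so mapping a cluster to "healthy" or "diseased" still requires reference labels such as histopathology.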


General

Machine learning-driven electronic identifications of single pathogenic bacteria.

In Scientific reports ; h5-index 158.0

A rapid method for screening pathogens can revolutionize health care by enabling infection control through medication before symptoms appear. Here we report on label-free single-cell identification of clinically important pathogenic bacteria by using a polymer-integrated pore with a low thickness-to-diameter aspect ratio and machine learning-driven resistive pulse analyses. The high spatiotemporal resolution of this electrical sensor made it possible to observe the galvanotactic response intrinsic to the microbes during their translocation. We demonstrated discrimination of cellular motility via signal pattern classification in a high-dimensional feature space. As detection-to-decision can be completed within milliseconds, the present technique may be used for real-time screening of pathogenic bacteria in environmental and medical applications.

Hattori Shota, Sekido Rintaro, Leong Iat Wai, Tsutsui Makusu, Arima Akihide, Tanaka Masayoshi, Yokota Kazumichi, Washio Takashi, Kawai Tomoji, Okochi Mina


Public Health

The Effects of Obesity-Related Anthropometric Factors on Cardiovascular Risks of Homeless Adults in Taiwan.

In International journal of environmental research and public health ; h5-index 73.0

Homelessness is a long-standing social phenomenon and an important public health issue that national policy strives to solve. Cardiovascular disease (CVD) is an important health problem among the homeless. This cross-sectional study explored the effects of four obesity-related anthropometric factors, namely body mass index (BMI), waist circumference (WC), waist-to-hip ratio (WHR), and waist-to-height ratio (WHtR), on cardiovascular disease risk (expressed by three CVD markers: hypertension, hyperglycemia, and hyperlipidemia) among homeless adults in Taipei, and compared the results with those of the general adult population in Taiwan. The research team sampled homeless adults over the age of 20 in Taipei City in 2018 and recruited 297 participants. Through anthropometric measurements, blood pressure measurements, and blood tests, we calculated the obesity-related indicators of the participants and identified those at risk of cardiovascular disease. The results showed that the prevalence of hypertension, hyperglycemia, and hyperlipidemia in homeless adults was significantly higher than that in the general adult population in Taiwan. Among the four obesity-related indicators, WHtR showed the strongest association with the prevalence of hypertension and hyperlipidemia, followed by WHR, and both showed a stronger association than the traditional WC and BMI indicators. It can be inferred that abdominal obesity characterized by WHtR is a key risk factor for hypertension and hyperlipidemia in homeless adults in Taiwan. We hope that the results will provide clinical references and effectively warn of cardiovascular disease risks for the homeless in Taiwan.

Chen Ching-Lin, Chen Mingchih, Liu Chih-Kuang


BMI, WC, WHR, WHtR, cardiovascular risk, homeless adults
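
The indices named in the abstract above are simple ratios of body measurements. The sketch below computes them from their standard definitions. The function names and the example measurements are hypothetical and do not come from the study's data.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm, hip_cm):
    """WHR: waist circumference divided by hip circumference."""
    return waist_cm / hip_cm


def waist_to_height_ratio(waist_cm, height_cm):
    """WHtR: waist circumference divided by height (same units)."""
    return waist_cm / height_cm


# Hypothetical participant, for illustration only.
weight_kg, height_cm, waist_cm, hip_cm = 70.0, 175.0, 87.5, 100.0

indices = {
    "BMI": bmi(weight_kg, height_cm),
    "WC": waist_cm,  # waist circumference is used directly, in cm
    "WHR": waist_to_hip_ratio(waist_cm, hip_cm),
    "WHtR": waist_to_height_ratio(waist_cm, height_cm),
}
print(indices)
```

WHR and WHtR are dimensionless, so they give the same value whatever length unit is used, provided both measurements share it. BMI depends on the kilogram and metre convention.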