In IEEE Transactions on Cybernetics
Attribute reduction is one of the most important preprocessing steps in machine learning and data mining. As a key step of attribute reduction, attribute evaluation directly affects classification performance, search time, and the stopping criterion. Existing evaluation functions depend heavily on the relationships between objects, which makes them costly in both computation time and memory. To solve this problem, we propose a novel separability-based evaluation function and reduction method that uses the relationship between objects and decision categories directly. The degree of aggregation (DA) of intraclass objects and the degree of dispersion (DD) of between-class objects are first defined to measure the significance of an attribute subset. The separability of an attribute subset is then defined from DA and DD in fuzzy decision systems, and we design a sequential forward selection based on separability (SFSS) algorithm to select attributes. Furthermore, a postpruning strategy is introduced to prevent overfitting and to determine a termination parameter. Finally, the SFSS algorithm is compared with several typical reduction algorithms on public datasets from the UCI and ELVIRA Biomedical repositories. The interpretability of SFSS is illustrated by its performance on the MNIST handwritten digits. The experimental comparisons show that SFSS is fast and robust, achieving higher classification accuracy and a higher compression ratio with extremely low computational time.
Hu Meng, Tsang Eric C C, Guo Yanting, Xu Weihua
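
The abstract does not state the formulas for DA, DD, or separability, so the following is only a minimal sketch of the general idea: score an attribute subset against the decision classes (rather than against pairwise object relations) and grow the subset greedily. Everything named here is an assumption for illustration, not the authors' definition: `separability` uses the mean distance of objects to their own class centroid as a stand-in for aggregation and the mean distance between class centroids as a stand-in for dispersion, and `epsilon` is a hypothetical stopping threshold standing in for the termination parameter that the paper obtains by postpruning.

```python
from statistics import fmean


def separability(X, y, subset):
    """Hypothetical score (NOT the paper's formula): between-class
    dispersion minus within-class spread on the given attributes.
    Assumes attribute values are scaled to [0, 1] and at least two classes."""
    classes = sorted(set(y))
    # Class centroids restricted to the chosen attributes.
    centroids = {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [fmean(r[a] for r in rows) for a in subset]
    # Stand-in for aggregation: mean per-attribute distance of each
    # object to its own class centroid (smaller means tighter classes).
    within = fmean(
        sum(abs(x[a] - m) for a, m in zip(subset, centroids[label])) / len(subset)
        for x, label in zip(X, y)
    )
    # Stand-in for dispersion: mean per-attribute distance between
    # the centroids of every pair of distinct classes.
    pairs = [(c1, c2) for i, c1 in enumerate(classes) for c2 in classes[i + 1:]]
    between = fmean(
        sum(abs(p - q) for p, q in zip(centroids[c1], centroids[c2])) / len(subset)
        for c1, c2 in pairs
    )
    return between - within


def forward_select(X, y, epsilon=0.0):
    """Greedy sequential forward selection driven by the score above.
    Stops when the best candidate improves the score by at most epsilon."""
    remaining = list(range(len(X[0])))
    selected, best = [], float("-inf")
    while remaining:
        score, attr = max((separability(X, y, selected + [a]), a) for a in remaining)
        if score - best <= epsilon:
            break
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected, best


# Toy data (made up for illustration): attribute 0 separates the two
# classes, attributes 1 and 2 carry little class information.
X = [
    [0.1, 0.5, 0.3],
    [0.2, 0.4, 0.6],
    [0.9, 0.5, 0.4],
    [0.8, 0.4, 0.7],
]
y = [0, 0, 1, 1]

selected, score = forward_select(X, y)
print(selected, round(score, 3))
```

Each evaluation in this sketch touches every object once per class centroid instead of comparing all pairs of objects, which is the kind of saving the abstract attributes to using object-to-class relationships directly. On the toy data only attribute 0 is kept, since adding either of the other attributes lowers the score.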