In Neural networks : the official journal of the International Neural Network Society
Due to their unprecedented capacity to learn patterns from raw data, deep neural networks have become the de facto modeling choice to address complex machine learning tasks. However, recent works have emphasized the vulnerability of deep neural networks when being fed with intelligently manipulated adversarial data instances tailored to confuse the model. In order to overcome this issue, a major effort has been made to find methods capable of making deep learning models robust against adversarial inputs. This work presents a new perspective for improving the robustness of deep neural networks in image classification. In computer vision scenarios, adversarial images are crafted by manipulating legitimate inputs so that the target classifier is eventually fooled, but the manipulation is not visually distinguishable by an external observer. The reason for the imperceptibility of the attack is that the human visual system fails to detect minor variations in color space, but excels at detecting anomalies in geometric shapes. We capitalize on this fact by extracting color gradient features from input images at multiple sensitivity levels to detect possible manipulations. We resort to a deep neural classifier to predict the category of unseen images, whereas a discrimination model analyzes the extracted color gradient features with time series techniques to determine the legitimacy of input images. The performance of our method is assessed over experiments comprising state-of-the-art techniques for crafting adversarial attacks. Results corroborate the increased robustness of the classifier when using our discrimination module, yielding drastically reduced success rates of adversarial attacks that operate on the whole image rather than on localized regions or around the existing shapes of the image. Future research is outlined towards improving the detection accuracy of the proposed method for more general attack strategies.
Oregi Izaskun, Del Ser Javier, Pérez Aritz, Lozano José A
Adversarial machine learning, Computer vision, Deep neural networks, Time series analysis