In International Journal of Machine Learning and Cybernetics

A Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a computer program that protects online services from malicious automated access. Text-CAPTCHA schemes incur lower computational cost than other schemes and are therefore the most widely used. This paper investigates the effectiveness of state-of-the-art (SOTA) text-CAPTCHA schemes, proposes a multiview deep learning system to break them, and highlights their weaknesses. Rather than relying on the usual single-view feature extraction, the proposed model explores correlational features from multiple views to improve generalization and classification accuracy. The model combines convolutional neural networks and recurrent networks to preserve the spatial and sequential order of the input text-CAPTCHA. On eight different datasets, the proposed system achieves average accuracies ranging from 93.6% to 100%, and the average time to break a text-CAPTCHA ranges from 0.0032 to 0.21 seconds. Furthermore, an ablation study with 71 human users was conducted to evaluate the effectiveness of the schemes. The results show that the proposed system outperforms the human users whom the schemes were designed to serve. Lastly, the proposed system outperforms existing SOTA systems by an accuracy margin of almost 40%.

Yusuf Mukhtar Opeyemi, Srivastava Divya, Singh Deepak, Rathor Vijaypal Singh


CAPTCHA, Connectionist temporal classification, Discriminative features, Multiview integration, Multiview learning classification, Security and privacy
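
The keywords list connectionist temporal classification (CTC), which is commonly used to turn the per-frame outputs of a convolutional-recurrent recognizer into a character string without character-level segmentation. The abstract does not specify how the authors decode, so the following is only a minimal sketch of greedy (best-path) CTC decoding under stated assumptions: the function name `ctc_greedy_decode`, the blank index 0, and the toy scores are all hypothetical and not taken from the paper.

```python
from itertools import groupby

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per class
                  (class 0 is the blank, class i maps to alphabet[i - 1]).
    alphabet:     string of output characters (hypothetical label set).
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge consecutive duplicates (a character held over several frames).
    collapsed = [idx for idx, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two identical labels keeps both.
    return "".join(alphabet[idx - 1] for idx in collapsed if idx != blank)


# Toy example with a two-letter alphabet; the scores are made up.
frames = [
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.7, 0.2],    # 'a' again (merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank separates the repeated letter
    [0.1, 0.8, 0.1],    # 'a'
    [0.1, 0.1, 0.8],    # 'b'
]
print(ctc_greedy_decode(frames, "ab"))  # prints "aab"
```

The blank symbol is what allows a repeated character such as "aa" to survive the merge step, which matters for CAPTCHA strings that contain doubled characters.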