ArXiv Preprint
The adversarial input generation problem has become central in establishing
the robustness and trustworthiness of deep neural nets, especially when they
are used in safety-critical application domains such as autonomous vehicles and
precision medicine. This is also practically challenging for multiple
reasons-scalability is a common issue owing to large-sized networks, and the
generated adversarial inputs often lack important qualities such as naturalness
and output-impartiality. We relate this problem to the task of patching neural
nets, i.e. applying small changes in some of the network$'$s weights so that
the modified net satisfies a given property. Intuitively, a patch can be used
to produce an adversarial input because the effect of changing the weights can
also be brought about by changing the inputs instead. This work presents a
novel technique to patch neural networks and an innovative approach of using it
to produce perturbations of inputs which are adversarial for the original net.
We note that the proposed solution is significantly more effective than the
prior state-of-the-art techniques.
Tooba Khan, Kumar Madhukar, Subodh Vishnu Sharma
2022-11-30