AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models

TL;DR

AnyAttack is an adversarial attack model that transforms ordinary images into targeted adversarial examples capable of misleading Vision-Language Models (VLMs). Pre-trained on the LAION-400M dataset, it can perturb a benign image (e.g., a dog) so that VLMs misinterpret it as any specified content (e.g., "this is violent content"), and the attack works against both open-source and commercial models.

Model Overview

AnyAttack is designed to generate adversarial examples efficiently and at scale. Unlike traditional adversarial methods, it does not require predefined labels and instead leverages a self-supervised adversarial noise generator trained on large-scale data.
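The snippet below is a minimal PyTorch sketch of this generator-based paradigm, included only to illustrate the idea. The encoder choice, generator architecture, checkpoint name (`anyattack_generator.pt`), input file names, and perturbation budget are all illustrative assumptions, not the released AnyAttack interface.

```python
# Minimal sketch of a generator-based targeted attack (illustrative only).
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

EPSILON = 8 / 255  # assumed L-infinity perturbation budget

# Frozen image encoder used to embed the target image
# (a stand-in for the CLIP-style encoder used by the method).
encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = nn.Identity()              # expose pooled 2048-d features
encoder.eval().requires_grad_(False)

# Hypothetical noise generator: maps a target embedding to an
# image-shaped perturbation. In practice this would be the
# pre-trained AnyAttack generator.
class NoiseGenerator(nn.Module):
    def __init__(self, embed_dim=2048, image_size=224):
        super().__init__()
        self.image_size = image_size
        self.decode = nn.Sequential(
            nn.Linear(embed_dim, 3 * image_size * image_size),
            nn.Tanh(),                  # keep raw noise in [-1, 1]
        )

    def forward(self, target_embedding):
        noise = self.decode(target_embedding)
        return noise.view(-1, 3, self.image_size, self.image_size)

generator = NoiseGenerator().eval()
# generator.load_state_dict(torch.load("anyattack_generator.pt"))  # hypothetical checkpoint

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

clean = preprocess(Image.open("dog.jpg")).unsqueeze(0)      # benign input (placeholder file)
target = preprocess(Image.open("target.jpg")).unsqueeze(0)  # image whose semantics the VLM should "see"

with torch.no_grad():
    noise = generator(encoder(target))
    # Scale the noise to the L-infinity budget and keep pixels valid.
    adv = torch.clamp(clean + EPSILON * noise, 0.0, 1.0)
```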

For a detailed explanation of the AnyAttack framework and methodology, please visit our Project Page.

πŸ”— Links & Resources

πŸ“œ Citation

If you use AnyAttack in your research, please cite our work:

@inproceedings{zhang2025anyattack,
    title={AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models},
    author={Zhang, Jiaming and Ye, Junhong and Ma, Xingjun and Li, Yige and Yang, Yunfan and Chen, Yunhao and Sang, Jitao and Yeung, Dit-Yan},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2025}
}

⚠️ Disclaimer

This model is intended for research purposes only. The misuse of adversarial attacks can have ethical and legal implications. Please use responsibly.


⭐ If you find this model useful, please give it a star on Hugging Face! ⭐
