AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models
TL;DR
AnyAttack is an adversarial attack model that transforms ordinary images into targeted adversarial examples capable of misleading Vision-Language Models (VLMs). Pre-trained on the LAION-400M dataset, it can cause a benign image (e.g., a dog photo) to be misinterpreted by VLMs as attacker-specified content (e.g., "this is violent content"), and it transfers to both open-source and commercial models.
Model Overview
AnyAttack is designed to generate adversarial examples efficiently and at scale. Unlike traditional adversarial methods, it does not require predefined labels and instead leverages a self-supervised adversarial noise generator trained on large-scale data.
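Below is a minimal sketch of how such a generator could be applied at inference time. This is an illustrative assumption, not the released implementation: the `NoiseGenerator` architecture, embedding size, image resolution, and epsilon budget are all placeholders; please refer to the paper and GitHub code for the actual design.

```python
import torch
import torch.nn as nn

# Hypothetical decoder-style noise generator standing in for the real
# AnyAttack generator; the actual architecture is defined in the released code.
class NoiseGenerator(nn.Module):
    def __init__(self, embed_dim=512, out_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 64 * 56 * 56),
            nn.ReLU(),
            nn.Unflatten(1, (64, 56, 56)),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, out_channels, kernel_size=3, padding=1),
            nn.Tanh(),  # bounded output noise
        )

    def forward(self, target_embedding):
        return self.net(target_embedding)

# Sketch of a targeted attack: add bounded noise to a clean image so that a
# VLM maps the result toward the attacker-chosen target content.
def make_adversarial(clean_image, target_embedding, generator, epsilon=8 / 255):
    noise = generator(target_embedding)               # (B, 3, 224, 224), in [-1, 1]
    noise = torch.clamp(noise, -1.0, 1.0) * epsilon   # assumed L_inf budget
    adv = torch.clamp(clean_image + noise, 0.0, 1.0)  # keep valid pixel range
    return adv

# Usage example with random tensors standing in for real data.
generator = NoiseGenerator()
clean = torch.rand(1, 3, 224, 224)   # benign image, e.g. a dog photo
target = torch.randn(1, 512)         # embedding of the target content
adv_image = make_adversarial(clean, target, generator)
print(adv_image.shape)               # torch.Size([1, 3, 224, 224])
```

Clamping the generated noise to a small budget before adding it is a common way to keep the perturbation visually subtle; the paper and code specify the exact constraint AnyAttack uses.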
For a detailed explanation of the AnyAttack framework and methodology, please visit our Project Page.
Links & Resources
- Project Page: AnyAttack Website
- Paper: arXiv
- Code: GitHub
Citation
If you use AnyAttack in your research, please cite our work:
@inproceedings{zhang2025anyattack,
title={AnyAttack: Towards Large-scale Self-supervised Adversarial Attacks on Vision-language Models},
author={Zhang, Jiaming and Ye, Junhong and Ma, Xingjun and Li, Yige and Yang, Yunfan and Chen, Yunhao and Sang, Jitao and Yeung, Dit-Yan},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
Disclaimer
This model is intended for research purposes only. The misuse of adversarial attacks can have ethical and legal implications. Please use responsibly.