---
license: cc-by-4.0
---
# Model card for FLAME-ViT-B-16

## Model description

* **Model Type:** A FLAME-CC3M-ViT-B-16 model and a FLAME-YFCC15M-ViT-B-16 model.
* **Task:** Long/short/multilingual-context image-text retrieval, zero-shot image classification.

## Uses

See https://github.com/MIV-XJTU/FLAME.

## Citation

```bibtex
@article{cao2024flame,
  title={FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training},
  author={Cao, Anjia and Wei, Xing and Ma, Zhiheng},
  journal={arXiv preprint arXiv:2411.11927},
  year={2024}
}
```