---
license: cc-by-4.0
---

# Model card for FLAME-ViT-B-16

## Model description

* **Model Type:** Two FLAME ViT-B-16 models: FLAME-CC3M-ViT-B-16 and FLAME-YFCC15M-ViT-B-16.
* **Task:** Long-context, short-context, and multilingual image-text retrieval; zero-shot image classification.

## Uses

See https://github.com/MIV-XJTU/FLAME.

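As a rough illustration of the two task families listed above, the sketch below ranks candidates by cosine similarity between embeddings, which is the usual way image-text retrieval and zero-shot classification are scored for dual-encoder models. It does not use the FLAME API: the function names are hypothetical, and the embeddings are hand-written placeholders standing in for the outputs of an image encoder and a text encoder. For actual model loading and inference, follow the repository linked above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


def zero_shot_classify(image_embedding, class_text_embeddings, class_names):
    """Pick the class whose text embedding is closest to the image embedding."""
    best = rank_by_similarity(image_embedding, class_text_embeddings)[0]
    return class_names[best]


# Placeholder embeddings (assumption: real ones come from the model's encoders).
image_embedding = [0.9, 0.1, 0.0]
class_names = ["cat", "dog", "car"]
class_text_embeddings = [
    [1.0, 0.0, 0.0],  # stands in for the text embedding of a "cat" prompt
    [0.0, 1.0, 0.0],  # stands in for the text embedding of a "dog" prompt
    [0.0, 0.0, 1.0],  # stands in for the text embedding of a "car" prompt
]

predicted = zero_shot_classify(image_embedding, class_text_embeddings, class_names)
```

Retrieval uses the same ranking step: embed a text query, then call `rank_by_similarity` against a list of image embeddings (or the reverse for image-to-text retrieval).
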
## Citation

```bibtex
@article{cao2024flame,
  title={FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training},
  author={Cao, Anjia and Wei, Xing and Ma, Zhiheng},
  journal={arXiv preprint arXiv:2411.11927},
  year={2024}
}
```