Model Card for VIT-MAE-r

VIT-MAE-r is a fine-tuned version of MAE for image reconstruction. We release a version fine-tuned from MAE-Large.

Model Details

VIT-MAE-r has already been converted to the Hugging Face format and can be loaded directly with the from_pretrained method.

Model Sources

Paper: LM4LV: A Frozen Large Language Model for Low-level Vision Tasks (arXiv:2405.15734)

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoImageProcessor, AutoModelForPreTraining

# Load the image processor and the fine-tuned model from the Hugging Face Hub
processor = AutoImageProcessor.from_pretrained("bytetriper/vit-mae-r")
model = AutoModelForPreTraining.from_pretrained("bytetriper/vit-mae-r")
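
Building on the snippet above, here is a minimal reconstruction sketch. It assumes the standard transformers ViT-MAE API (outputs.logits holding per-patch pixel predictions and model.unpatchify folding them back into an image); the sample image URL is only an example, and if the model was trained with per-patch pixel normalization the logits would need de-normalizing before viewing.

import requests
import torch
from PIL import Image

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)  # returns loss, logits, mask, ids_restore

# logits are per-patch pixel predictions; unpatchify folds them back
# into an image-shaped tensor of shape (1, 3, 224, 224)
recon = model.unpatchify(outputs.logits)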

Evaluation

This model achieves an rFID of 1.24 on the ImageNet validation set, evaluated with the standard TensorFlow evaluation tool provided by Guided-Diffusion.
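
For context on reproducing such a number: the Guided-Diffusion evaluation suite compares two .npz batches of uint8 images (a reference batch and a sample batch). The following is a hedged sketch of packing reconstructions into that layout; the file names, the arr_0 key convention, and the evaluator invocation are assumptions about that tool's interface, not the authors' exact pipeline.

import numpy as np

def save_eval_batch(images, path="recon_batch.npz"):
    # Stack HxWx3 uint8 reconstructions into a single (N, H, W, 3) array,
    # stored under the "arr_0" key that the evaluator script reads.
    np.savez(path, arr_0=np.stack(images).astype(np.uint8))

# The rFID is then computed against a reference batch, e.g.:
#   python evaluator.py ref_batch.npz recon_batch.npz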

Citation

BibTeX:

@article{zheng2024lm4lv,
  title={LM4LV: A Frozen Large Language Model for Low-level Vision Tasks},
  author={Zheng, Boyang and Gu, Jinjin and Li, Shijun and Dong, Chao},
  journal={arXiv preprint arXiv:2405.15734},
  year={2024}
}

Model Card Authors

Boyang Zheng

Model Card Contact

bytetriper@gmail.com
