Model Card for DuoduoCLIP

This repository hosts the official pretrained models from the paper Duoduo CLIP: Efficient 3D Understanding with Multi-View Images. Usage instructions and code are available in the GitHub repo.

Note: The initial release includes only the main model; the other models used in the paper will be uploaded soon.

Model Details

Model Description

  • Finetuned from model: OpenCLIP model ("ViT-B-32" architecture and checkpoint "laion2b_s34b_b79k")
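For reference, the base model named above can be loaded with the `open_clip` package. This is a minimal sketch that loads only the original OpenCLIP weights, not the Duoduo CLIP checkpoint; the helper name `load_base_openclip` is hypothetical, and the multi-view model itself should be loaded with the code in the GitHub repo.

```python
# Architecture and checkpoint tag of the base model, as stated in this card.
BASE_ARCH = "ViT-B-32"
BASE_PRETRAINED = "laion2b_s34b_b79k"


def load_base_openclip(device: str = "cpu"):
    """Load the base OpenCLIP model (hypothetical helper, not part of Duoduo CLIP).

    Requires `pip install open_clip_torch`; the weights are downloaded on first use.
    """
    import open_clip  # imported lazily so the constants are usable without it

    # create_model_and_transforms returns (model, train_preprocess, val_preprocess);
    # only the validation-time preprocessing is kept here.
    model, _, preprocess = open_clip.create_model_and_transforms(
        BASE_ARCH, pretrained=BASE_PRETRAINED, device=device
    )
    tokenizer = open_clip.get_tokenizer(BASE_ARCH)
    model.eval()  # inference mode: disables dropout and similar training behavior
    return model, preprocess, tokenizer
```

Calling `load_base_openclip()` gives the single-image CLIP baseline that the released checkpoints were fine-tuned from, which can be useful for comparison.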

Model Sources

Model Checkpoints

  • Four_1to6F_bs1600_LT6.ckpt: Trained on the Four dataset, with 1 to 6 frames sampled during training and the last 6 attention layers trainable.
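The checkpoint name encodes the training setup described above. The following sketch splits such a name into its fields; the parser and its field names are hypothetical, and reading `bs1600` as a batch size of 1600 is an assumption not stated in this card.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointInfo:
    dataset: str           # training dataset, e.g. "Four"
    min_frames: int        # fewest frames sampled during training
    max_frames: int        # most frames sampled during training
    batch_size: int        # ASSUMPTION: "bs" is taken to mean batch size
    trainable_layers: int  # number of last attention layers left trainable


# Pattern: <dataset>_<min>to<max>F_bs<batch>_LT<layers>.ckpt
_NAME_PATTERN = re.compile(
    r"^(?P<dataset>[A-Za-z0-9]+)_(?P<lo>\d+)to(?P<hi>\d+)F"
    r"_bs(?P<bs>\d+)_LT(?P<lt>\d+)\.ckpt$"
)


def parse_checkpoint_name(name: str) -> CheckpointInfo:
    """Parse a checkpoint file name into its training-setup fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    return CheckpointInfo(
        dataset=match["dataset"],
        min_frames=int(match["lo"]),
        max_frames=int(match["hi"]),
        batch_size=int(match["bs"]),
        trainable_layers=int(match["lt"]),
    )


info = parse_checkpoint_name("Four_1to6F_bs1600_LT6.ckpt")
print(info)
```

Other checkpoints from the paper may follow a different naming scheme, in which case the pattern would need adjusting.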

Training Data

The dataset card can be found here.

BibTeX:

@misc{lee2024duoduo,
      title={Duoduo CLIP: Efficient 3D Understanding with Multi-View Images}, 
      author={Han-Hung Lee and Yiming Zhang and Angel X. Chang},
      year={2024},
      eprint={2406.11579},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

This work was funded by a CIFAR AI Chair, an NSERC Discovery grant, and a CFI/BCKDF JELF grant.
