ZhangYuanhan committed on
Commit 7633731, 1 Parent(s): c0a6bab

Update README.md

  1. README.md +11 -1
README.md CHANGED
@@ -235,4 +235,14 @@ print(text_outputs)
  - **Orchestration:** [Huggingface Trainer](https://huggingface.co/docs/transformers/main_classes/trainer)
  - **Neural networks:** [PyTorch](https://github.com/pytorch/pytorch)
 
- # Citation
+ # Citation
+
+ @misc{zhang2024videoinstructiontuningsynthetic,
+ title={Video Instruction Tuning With Synthetic Data},
+ author={Yuanhan Zhang and Jinming Wu and Wei Li and Bo Li and Zejun Ma and Ziwei Liu and Chunyuan Li},
+ year={2024},
+ eprint={2410.02713},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2410.02713},
+ }