Commit d15b87d by ZhangYuanhan (1 parent: 09f76c2)

Update README.md

Files changed (1): README.md (+3 -3)
@@ -118,7 +118,7 @@ base_model:
 ---
 
 
-# LLaVA-Video-72B-Qwen2
+# LLaVA-NeXT-Video-72B-Qwen2
 
 ## Table of Contents
 
@@ -131,7 +131,7 @@ base_model:
 
 ## Model Summary
 
-The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
+The LLaVA-NeXT-Video models are 7/72B parameter models trained on [LLaVA-NeXT-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data), based on Qwen2 language model with a context window of 32K tokens.
 
 - **Repository:** [LLaVA-VL/LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT?tab=readme-ov-file)
 - **Point of Contact:** [Yuanhan Zhang](https://zhangyuanhan-ai.github.io/)
@@ -142,7 +142,7 @@ The LLaVA-Video models are 7/72B parameter models trained on [LLaVA-Video-178K](
 
 ### Intended use
 
-The model was trained on [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and have the ability to interact with images, multi-image and videos, but specific to videos.
+The model was trained on [LLaVA-NeXT-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-NeXT-Video-SFT-Data) and have the ability to interact with images, multi-image and videos, but specific to videos.
 
 **Feel free to share your generations in the Community tab!**
 