weizhiwang
commited on
Commit
•
9bc8bda
1
Parent(s):
139d1e2
Update README.md
Browse files
README.md
CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: video-text-to-text
|
|
12 |
|
13 |
<!-- Provide a quick summary of what the model is/does. -->
|
14 |
|
15 |
-
Please follow my github repo [LLaVA-
|
16 |
|
17 |
## Updates
|
18 |
- [6/4/2024] The codebase supports the video data fine-tuning for video understanding tasks.
|
@@ -111,7 +111,7 @@ The video is funny because it shows a baby girl wearing glasses and reading a bo
|
|
111 |
```
|
112 |
|
113 |
# Fine-Tune LLaVA-Llama-3 on Your Video Instruction Data
|
114 |
-
Please refer to
|
115 |
|
116 |
|
117 |
## Citation
|
|
|
12 |
|
13 |
<!-- Provide a quick summary of what the model is/does. -->
|
14 |
|
15 |
+
Please follow my github repo [LLaVA-Unified](https://github.com/Victorwz/LLaVA-Unified) for more details on fine-tuning LLaVA model with Llama-3 as the foundatiaon LLM.
|
16 |
|
17 |
## Updates
|
18 |
- [6/4/2024] The codebase supports the video data fine-tuning for video understanding tasks.
|
|
|
111 |
```
|
112 |
|
113 |
# Fine-Tune LLaVA-Llama-3 on Your Video Instruction Data
|
114 |
+
Please refer to our [LLaVA-Unified](https://github.com/Victorwz/LLaVA-Unified) git repo for fine-tuning data preparation and scripts. The data loading function and fastchat conversation template are changed due to a different tokenizer.
|
115 |
|
116 |
|
117 |
## Citation
|