Update README.md
```
git clone https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain --depth=1
```
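If you prefer not to use git, the Hugging Face CLI can fetch the same dataset. A minimal sketch, assuming a recent `huggingface_hub` is installed; the local directory name is just an example:

```
# Alternative download of the same pretraining dataset
# (requires: pip install -U huggingface_hub); ./LLaVA-Pretrain is just an example path.
huggingface-cli download liuhaotian/LLaVA-Pretrain --repo-type dataset --local-dir ./LLaVA-Pretrain
```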
3. Finetune Data

Please check the final release version

## Cheers! Now train your own model!
```
NPROC_PER_NODE=8 xtuner train ./llava_internlm2_chat_7b_dinov2_e1_gpu8_pretrain.
```

The checkpoints and TensorBoard logs are saved in `./work_dirs/` by default. I train for only 1 epoch, matching the original LLaVA paper. Some studies also report that training for multiple epochs makes the model overfit the training dataset and perform worse on other domains.
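To monitor training, you can point TensorBoard at the saved logs. A minimal sketch, assuming TensorBoard is installed (`pip install tensorboard`) and the default output root is unchanged:

```
# Launch TensorBoard on the default output root;
# open the printed URL and pick the run subdirectory created for your config.
tensorboard --logdir ./work_dirs
```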
This is my loss curve for llava-clip-internlm2-1_8b-pretrain-v1:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/iNxPxfOvSJq8ZPz8uP_sP.png)

And the learning rate curve:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/U1U9Kapcd6AIEUySvt2RS.png)

2. Instruction following fine-tuning

Please check the final release version