g-h-chen committed
Commit f7c7794 (parent 73293b8)

Update README.md

Files changed (1): README.md (+4 −4)
@@ -89,11 +89,11 @@ See [here](https://github.com/FreedomIntelligence/ALLaVA/tree/main?tab=readme-ov
 ## 🏋️‍♂️ Training
 
 ### Data
-<!-- <div align=center>
+<div align=center>
 <img src="training_datasets_by_stage.jpg" width = "640" alt="training_datasets" align=center />
-</div> -->
+</div>
 
-ALLaVA uses 795K and 1.4M data for PT. and FT., respectively.
+ALLaVA uses 1.0M and 1.5M data for PT. and FT., respectively.
 
 
 ### Code
@@ -110,7 +110,7 @@ These two models share the same PT procedure. -->
 ### Hyperparameters
 
 | Global Batch Size| ZeRO Stage| Optimizer | Max LR| Min LR | Scheduler | Weight decay |
-| ---: | ---: |--:| ---: | ---: | ---: | ---: | ---: |
+| ---: | ---: |--:| ---: | ---: | ---: | ---: |
 | 256 (PT) / 128 (FT) | 1| AdamW | 2e-5 | 2e-6 | CosineAnnealingWarmRestarts | 0 |
 
 The LM backbone, projector are trainable, while the vision encoder is kept frozen.
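The hyperparameter row pairs AdamW with a CosineAnnealingWarmRestarts schedule between a max LR of 2e-5 and a min LR of 2e-6. A minimal stdlib sketch of that schedule, assuming a fixed restart period `t_0` (the README does not state the actual period or warmup):

```python
import math

def cosine_annealing_warm_restarts(step, t_0, eta_max=2e-5, eta_min=2e-6):
    """LR at `step` under cosine annealing with warm restarts.

    Mirrors the standard formula: eta_min + (eta_max - eta_min) *
    (1 + cos(pi * t_cur / t_0)) / 2, restarting every t_0 steps.
    `t_0` is an assumed value for illustration.
    """
    t_cur = step % t_0  # position within the current restart cycle
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_0)) / 2

# LR starts each cycle at the max (2e-5), decays toward the min,
# then jumps back up at the next restart boundary.
print(cosine_annealing_warm_restarts(0, t_0=100))    # cycle start: 2e-05
print(cosine_annealing_warm_restarts(50, t_0=100))   # mid-cycle:  ~1.1e-05
print(cosine_annealing_warm_restarts(100, t_0=100))  # restart:     2e-05
```

The key difference from plain cosine annealing is the modulo: the LR periodically resets to its maximum instead of decaying monotonically.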