g-h-chen committed
Commit baa2743 • 1 Parent(s): 73025a9

Update README.md

Files changed (1)
  1. README.md +10 -8
README.md CHANGED
@@ -78,22 +78,24 @@ achieve competitive results on 17 benchmarks.
 
 ## 🏭 Inference
 
-### Load from 🤗 (Recommended)
-See the [example script](https://github.com/FreedomIntelligence/ALLaVA/blob/main/allava/serve/huggingface_inference.py).
+All models can be loaded from 🤗 with `.from_pretrained()`.
+Check out the [example scripts](https://github.com/FreedomIntelligence/ALLaVA/tree/main/allava/serve) and make sure you have the same outputs as shown in the scripts.
+<!-- ### Load from 🤗 (Recommended)
+See the [example script](https://github.com/FreedomIntelligence/ALLaVA/blob/main/allava/serve/huggingface_inference.py). -->
 
-### CLI
-See [here](https://github.com/FreedomIntelligence/ALLaVA/tree/main?tab=readme-ov-file#cli) for CLI code snippet.
+<!-- ### CLI
+See [here](https://github.com/FreedomIntelligence/ALLaVA/tree/main?tab=readme-ov-file#cli) for CLI code snippet. -->
 
 
 
 ## 🏋️‍♂️ Training
 
 ### Data
-<!-- <div align=center>
+<div align=center>
 <img src="training_datasets_by_stage.jpg" width = "640" alt="training_datasets" align=center />
-</div> -->
+</div>
 
-ALLaVA uses 795K and 1.4M data for PT. and FT., respectively.
+ALLaVA uses 1.0M and 1.5M data for PT. and FT., respectively.
 
 
 ### Code
@@ -110,7 +112,7 @@ These two models share the same PT procedure. -->
 ### Hyperparameters
 
 | Global Batch Size| ZeRO Stage| Optimizer | Max LR| Min LR | Scheduler | Weight decay |
-| ---: | ---: |--:| ---: | ---: | ---: | ---: | ---: |
+| ---: | ---: |--:| ---: | ---: | ---: | ---: |
 | 256 (PT) / 128 (FT) | 1| AdamW | 2e-5 | 2e-6 | CosineAnnealingWarmRestarts | 0 |
 
 The LM backbone and projector are trainable, while the vision encoder is kept frozen.
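
For reference, the updated inference text boils down to a standard `.from_pretrained()` call. The following is a minimal sketch, not the official example: the model ID is a placeholder and the use of `trust_remote_code=True` and `bfloat16` are assumptions; the linked example scripts in `allava/serve` are authoritative.

```python
# Sketch: loading an ALLaVA checkpoint from the 🤗 Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "FreedomIntelligence/ALLaVA-3B"  # placeholder ID; check the 🤗 Hub for the exact name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # assumed dtype; adjust to your hardware
    trust_remote_code=True,       # assumption: the checkpoint may ship custom multimodal code
    device_map="auto",
)
model.eval()
```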
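
The hyperparameter row and the frozen/trainable split can also be read as a PyTorch training setup. A minimal sketch under stated assumptions: the `vision_tower` parameter-name filter is a guess at the naming convention, and `T_0` for the restart period is not specified in the README; global batch size (256 PT / 128 FT) and ZeRO stage 1 would come from the DeepSpeed launch config rather than this snippet.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

def build_optimizer(model: torch.nn.Module):
    # Freeze the vision encoder; "vision_tower" is an assumed parameter-name prefix
    # and may differ in the actual ALLaVA implementation.
    for name, p in model.named_parameters():
        if "vision_tower" in name:
            p.requires_grad = False

    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = AdamW(trainable, lr=2e-5, weight_decay=0.0)  # max LR 2e-5, weight decay 0
    # T_0 (steps per restart) is not given in the README; 1000 is an arbitrary placeholder.
    scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=1000, eta_min=2e-6)  # min LR 2e-6
    return optimizer, scheduler
```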