DongfuJiang committed on
Commit
a34165a
1 Parent(s): cd587ff

Update README.md

Files changed (1)
  1. README.md +2 -50
README.md CHANGED
@@ -1,56 +1,8 @@
  ---
- tags:
- - generated_from_trainer
  model-index:
  - name: llava_siglip_llama3_8b_pretrain_8192
    results: []
+ license: llama3
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # llava_siglip_llama3_8b_pretrain_8192
-
- This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.001
- - train_batch_size: 1
- - eval_batch_size: 8
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - gradient_accumulation_steps: 32
- - total_train_batch_size: 256
- - total_eval_batch_size: 64
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1.0
-
- ### Training results
-
-
-
- ### Framework versions
-
- - Transformers 4.40.0
- - Pytorch 2.2.1
- - Datasets 2.17.1
- - Tokenizers 0.19.1
+ **See the Mantis-Instruct fine-tuned version here [TIGER-Lab/Mantis-8B-siglip-llama3](https://huggingface.co/TIGER-Lab/Mantis-8B-siglip-llama3). This checkpoint is just for experiments reproduction and does not serve as a functional model**