Commit 5936b31 by SuperkingbasSKB (1 parent: 29f2d27)

End of training

README.md CHANGED
@@ -13,7 +13,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # idefics2-8b-docvqa-finetuned-tutorial
 
-This model is a fine-tuned version of [HuggingFaceM4/idefics2-8b](https://huggingface.co/HuggingFaceM4/idefics2-8b) on an unknown dataset.
+This model is a fine-tuned version of [HuggingFaceM4/idefics2-8b](https://huggingface.co/HuggingFaceM4/idefics2-8b) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0340
 
 ## Model description
 
@@ -33,19 +35,23 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 2
+- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 8
-- total_train_batch_size: 16
+- total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 50
+- lr_scheduler_warmup_steps: 10
 - num_epochs: 2
 - mixed_precision_training: Native AMP
 
 ### Training results
 
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.044         | 0.9984 | 156  | 0.0368          |
+| 0.0361        | 1.9968 | 312  | 0.0340          |
 
 
 ### Framework versions
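The updated hyperparameters are internally consistent: with a per-device `train_batch_size` of 8 and `gradient_accumulation_steps` of 8, the effective batch size works out to the reported `total_train_batch_size` of 64. A minimal sketch of that arithmetic (the single-device assumption is mine; the model card does not state the device count):

```python
# Effective batch size implied by the updated hyperparameters.
train_batch_size = 8             # per-device batch size (was 2 before this commit)
gradient_accumulation_steps = 8  # optimizer steps every 8 forward/backward passes
num_devices = 1                  # assumption: not stated in the model card

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 64, matching total_train_batch_size in the card
```

This also fits the training-results table: roughly 156 optimizer steps per epoch at an effective batch of 64 implies a training set on the order of 10,000 examples.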
adapter_config.json CHANGED
@@ -19,7 +19,7 @@
   "r": 8,
   "rank_pattern": {},
   "revision": null,
-  "target_modules": ".*(text_model|modality_projection|perceiver_resampler).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$",
+  "target_modules": ".*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$",
   "task_type": null,
   "use_dora": false,
   "use_rslora": false
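The `target_modules` change broadens the LoRA match: the old regex required the module path to pass through `text_model`, `modality_projection`, or `perceiver_resampler`, while the new one matches the projection layers anywhere in the model (peft treats a string `target_modules` as a regex matched against full module names). A sketch of the difference, using illustrative module names rather than the real idefics2 module tree:

```python
import re

OLD = r".*(text_model|modality_projection|perceiver_resampler).*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$"
NEW = r".*(down_proj|gate_proj|up_proj|k_proj|q_proj|v_proj|o_proj).*$"

# Hypothetical module names for illustration only.
text_proj   = "model.text_model.layers.0.self_attn.q_proj"
vision_proj = "model.vision_model.encoder.layers.0.self_attn.q_proj"

# The old pattern matched only inside the named submodules;
# the new pattern also catches projections elsewhere (e.g. the vision tower).
print(bool(re.fullmatch(OLD, text_proj)))    # True
print(bool(re.fullmatch(OLD, vision_proj)))  # False
print(bool(re.fullmatch(NEW, vision_proj)))  # True
```

The larger adapter checkpoint in this commit (99,375,704 vs 93,378,688 bytes) is consistent with the regex now targeting more modules, though the diff alone does not prove that is the cause.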
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f1cac69b2fd6888ae4529f9befa8aa1aa568e4889d291a1b0972970fc60068e2
-size 93378688
+oid sha256:335fcd6fbf5e1b1d4f8d97f85af5d911abb665728c15765f2dcad2279b59703e
+size 99375704
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e79c07f9e8249f828c3f157677ea6dfa18628549245db09feefcbfbeb7d8646e
+oid sha256:3724fec2d7e56d43a85516594667a5540289274a9fea040ae3f8b3d3093d9b8b
 size 5112