alvarobartt (HF staff) committed

Commit 8acbf91
Parent: 783e5a7

Training in progress, epoch 2

Files changed (2):
  1. README.md +7 -8
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -1,6 +1,5 @@
 ---
-base_model:
-- google/gemma-2b-it
+base_model: google/gemma-2-9b-it
 datasets:
 - generator
 library_name: peft
@@ -10,16 +9,16 @@ tags:
 - sft
 - generated_from_trainer
 model-index:
-- name: gemma-2b-it-lora-magicoder
+- name: gemma-2-9b-it-lora-magicoder
   results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# gemma-2b-it-lora-magicoder
+# gemma-2-9b-it-lora-magicoder
 
-This model is a fine-tuned version of [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it) on the alvarobartt/Magicoder-OAI dataset.
+This model is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on the alvarobartt/Magicoder-OAI dataset.
 
 ## Model description
 
@@ -43,10 +42,10 @@ The following hyperparameters were used during training:
 - eval_batch_size: 8
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 2
+- num_devices: 4
 - gradient_accumulation_steps: 2
-- total_train_batch_size: 16
-- total_eval_batch_size: 16
+- total_train_batch_size: 32
+- total_eval_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
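A consistency check on the updated hyperparameters: with 4 devices and gradient_accumulation_steps 2, total_train_batch_size 32 implies a per-device train batch size of 4 (4 × 4 × 2 = 32), and total_eval_batch_size 32 is eval_batch_size 8 × 4 devices (no accumulation applies at evaluation). The per-device train batch size itself sits above this hunk and is not shown.

Since the card declares library_name: peft and base_model: google/gemma-2-9b-it, the committed adapter_model.safetensors should be loadable as a LoRA adapter with peft. A minimal sketch, assuming the adapter repo id alvarobartt/gemma-2-9b-it-lora-magicoder (inferred from the model name, not stated in this diff):

```python
# Minimal sketch: attach the LoRA adapter to its base model with peft.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2-9b-it"  # base_model from the updated card
adapter_id = "alvarobartt/gemma-2-9b-it-lora-magicoder"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base, adapter_id)  # loads adapter_model.safetensors

# Optionally fold the LoRA weights into the base model for faster inference:
# model = model.merge_and_unload()
```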
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:97d5e52b5668436977b4675922d9a5540ecdfc875c8488911bbe650cec148de2
+oid sha256:9c66af4715b1ada16a73c930bee05be9361c8f37d63d20cefea0d2bc9930245a
 size 41582088
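Both sides of this diff are git-lfs pointer files, not the weights themselves: the adapter tensors live in LFS storage, addressed by the sha256 oid. Only the oid changes between epochs; the size stays at 41582088 bytes because the LoRA tensor shapes are fixed and only their values update. A downloaded file can be checked against the pointer; a minimal sketch, assuming a local copy of adapter_model.safetensors:

```python
# Minimal sketch: verify a downloaded adapter_model.safetensors against the
# oid and size recorded in the git-lfs pointer from this commit.
import hashlib

EXPECTED_OID = "9c66af4715b1ada16a73c930bee05be9361c8f37d63d20cefea0d2bc9930245a"
EXPECTED_SIZE = 41582088  # bytes, from the pointer file

with open("adapter_model.safetensors", "rb") as f:
    data = f.read()

assert len(data) == EXPECTED_SIZE, f"size mismatch: {len(data)} bytes"
assert hashlib.sha256(data).hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.safetensors matches the LFS pointer")
```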