kakinola23 committed on
Commit a01d1fc
1 Parent(s): 683d44e

Model save

README.md CHANGED
@@ -1,10 +1,11 @@
  ---
+ base_model: meta-llama/Llama-3.2-3B
  library_name: peft
+ license: llama3.2
  tags:
  - trl
  - sft
  - generated_from_trainer
- base_model: kakinola23/Meta-Llama-3-8B-bnb-4bit-new
  model-index:
  - name: results
    results: []
@@ -15,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
  # results
 
- This model is a fine-tuned version of [kakinola23/Meta-Llama-3-8B-bnb-4bit-new](https://huggingface.co/kakinola23/Meta-Llama-3-8B-bnb-4bit-new) on the None dataset.
+ This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on the None dataset.
 
  ## Model description
 
@@ -35,15 +36,15 @@ More information needed
 
  The following hyperparameters were used during training:
  - learning_rate: 0.0002
- - train_batch_size: 2
+ - train_batch_size: 1
  - eval_batch_size: 8
  - seed: 42
- - gradient_accumulation_steps: 2
+ - gradient_accumulation_steps: 4
  - total_train_batch_size: 4
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: constant
  - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
 
  ### Training results
@@ -52,8 +53,8 @@ The following hyperparameters were used during training:
 
  ### Framework versions
 
- - PEFT 0.11.1
- - Transformers 4.40.2
- - Pytorch 2.2.1+cu121
- - Datasets 2.19.1
+ - PEFT 0.13.2
+ - Transformers 4.44.2
+ - Pytorch 2.4.1+cu121
+ - Datasets 3.0.1
  - Tokenizers 0.19.1
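Note that the hyperparameter changes keep the effective batch size constant: the old run used train_batch_size 2 × gradient_accumulation_steps 2 and the new run uses 1 × 4, both giving total_train_batch_size 4; the substantive change is num_epochs 1 → 10. Below is a minimal sketch of how the new values map onto a `TrainingArguments` object, assuming the standard Transformers 4.44.x API; the output directory is a placeholder.

```python
# Minimal sketch: reconstructs the training configuration listed in the README diff.
# Assumes the standard transformers 4.44.x API; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results",           # placeholder path
    learning_rate=2e-4,             # learning_rate: 0.0002
    per_device_train_batch_size=1,  # train_batch_size: 1
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    gradient_accumulation_steps=4,  # effective batch size: 1 * 4 = 4
    lr_scheduler_type="constant",
    warmup_ratio=0.03,              # lr_scheduler_warmup_ratio: 0.03
    num_train_epochs=10,
    fp16=True,                      # mixed_precision_training: Native AMP
    seed=42,
)
# Per the trl/sft tags, these args would then be passed to trl's SFTTrainer.
```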
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fd6b190bd1f0e4676082b74f951f820f558263f70c2652f9cb80fabac371062f
+ oid sha256:793c5821318618f1dadf09592f33b0910a896253d30bbda0478eb3b1609384b8
  size 51402808
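The adapter file was rewritten with a new sha256 but an identical 51402808-byte size, consistent with retrained LoRA weights of unchanged shape. A hedged sketch of attaching the saved adapter to the new base model via the standard peft API; the adapter repo id "kakinola23/results" is an assumption inferred from the model-index name, not confirmed by the diff.

```python
# Sketch only: attach the saved LoRA adapter to the new base model.
# "kakinola23/results" is a hypothetical adapter repo id; substitute the real one.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")
model = PeftModel.from_pretrained(base, "kakinola23/results")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
```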
runs/Oct14_17-02-58_7ba796e60d01/events.out.tfevents.1728925469.7ba796e60d01.7803.1 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fa3927665353be7725d6c45e2c6f0bc3c0b9a460fad5015fd01054c7cdcaa81a
- size 4184
+ oid sha256:1b2d9b9282f4f544370ecf620322e1ee52b00d4b4e78fa2ae46cf560d22f6442
+ size 5845
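The event file grew from 4184 to 5845 bytes, consistent with the longer 10-epoch run logging more scalar steps. A small sketch for inspecting it locally once downloaded, assuming the tensorboard package is installed; the path mirrors the runs/ entry above.

```python
# Sketch only: read scalar logs (e.g. train/loss) from the updated event file.
# Assumes the directory holding the downloaded tfevents file.
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

acc = EventAccumulator("runs/Oct14_17-02-58_7ba796e60d01")
acc.Reload()
for tag in acc.Tags()["scalars"]:
    for event in acc.Scalars(tag):
        print(tag, event.step, event.value)
```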