DarshanDeshpande committed
Commit 7223345 (1 parent: 3bdbd19)

gemma_2b_social_reasoning_reward_model

README.md CHANGED
@@ -9,19 +9,19 @@ base_model: google/gemma-2b
  metrics:
  - accuracy
  model-index:
- - name: reward_model
+ - name: gemma_2b_social_reasoning_reward_model
    results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # reward_model
+ # gemma_2b_social_reasoning_reward_model

  This model is a fine-tuned version of [google/gemma-2b](https://huggingface.co/google/gemma-2b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6165
- - Accuracy: 0.6706
+ - Loss: 0.6045
+ - Accuracy: 0.6818

  ## Model description

@@ -41,13 +41,13 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 0.0005
- - train_batch_size: 32
- - eval_batch_size: 32
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 128
+ - total_train_batch_size: 64
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
+ - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
  - num_epochs: 3

@@ -55,7 +55,18 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss | Accuracy |
  |:-------------:|:-----:|:----:|:---------------:|:--------:|
- | 0.5172        | 1.67  | 30   | 0.6530          | 0.6391   |
+ | 0.8047        | 0.24  | 10   | 0.7041          | 0.5447   |
+ | 0.6758        | 0.48  | 20   | 0.6179          | 0.6725   |
+ | 0.6436        | 0.72  | 30   | 0.6066          | 0.6690   |
+ | 0.6423        | 0.96  | 40   | 0.6053          | 0.6813   |
+ | 0.5838        | 1.2   | 50   | 0.6080          | 0.6637   |
+ | 0.5871        | 1.44  | 60   | 0.6301          | 0.6480   |
+ | 0.5928        | 1.68  | 70   | 0.6338          | 0.6515   |
+ | 0.5777        | 1.92  | 80   | 0.6198          | 0.6375   |
+ | 0.5378        | 2.16  | 90   | 0.6225          | 0.6392   |
+ | 0.5163        | 2.4   | 100  | 0.6222          | 0.6532   |
+ | 0.4963        | 2.63  | 110  | 0.6224          | 0.6567   |
+ | 0.5176        | 2.87  | 120  | 0.6229          | 0.6550   |

  ### Framework versions
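
The hyperparameter changes above (halved per-device batch size, cosine instead of linear decay) map onto a `transformers.TrainingArguments` roughly as follows. This is a minimal sketch assuming a single-GPU run with the standard `Trainer` API; the output directory is an assumption, since the commit only shows the values listed in the model card:

```python
from transformers import TrainingArguments

# Sketch of the training configuration in the updated model card.
# Only the values listed in the README come from the commit; everything
# else (e.g. output_dir) is an illustrative assumption.
training_args = TrainingArguments(
    output_dir="gemma_2b_social_reasoning_reward_model",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,  # effective batch: 16 * 4 = 64 on one device
    lr_scheduler_type="cosine",     # was "linear" before this commit
    warmup_steps=10,
    num_train_epochs=3,
    seed=42,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the transformers defaults.
)
```

Note that the effective batch size drops from 32 * 4 = 128 to 16 * 4 = 64, which roughly doubles the optimizer steps per epoch and explains why the new training table runs to step 120 where the old one stopped at step 30.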
adapter_config.json CHANGED
@@ -15,7 +15,7 @@
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
- "r": 16,
+ "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:885e4dc9e4914b57e04cf8a53fe6a6a3a8cf7acc903e1f2c29baf16e425ff6a3
- size 7382336
+ oid sha256:6d38f5803bc03fe4650091c23e4cb0d5cbad8ac7dd2ec4f73d350c5281d1704a
+ size 14763488
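
The adapter change doubles the LoRA rank from 16 to 32. A minimal sketch of the corresponding PEFT configuration; `target_modules` is truncated in the diff, so the module list and the other values here are illustrative assumptions:

```python
from peft import LoraConfig

# Sketch of the updated adapter configuration. Only r=32 is confirmed by
# the diff; the remaining fields are assumptions for illustration.
lora_config = LoraConfig(
    r=32,           # raised from 16 in this commit
    lora_alpha=32,  # assumption: not visible in the diff
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="SEQ_CLS",  # assumption: scalar-reward models usually use a classification head
)
```

Because LoRA parameter count scales linearly with the rank, doubling `r` roughly doubles the adapter checkpoint, which matches the safetensors change above: 7,382,336 to 14,763,488 bytes.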
special_tokens_map.json CHANGED
@@ -13,13 +13,7 @@
  "rstrip": false,
  "single_word": false
  },
- "pad_token": {
- "content": "<pad>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
+ "pad_token": "<eos>",
  "unk_token": {
  "content": "<unk>",
  "lstrip": false,
tokenizer_config.json CHANGED
@@ -40,7 +40,7 @@
  "eos_token": "<eos>",
  "legacy": null,
  "model_max_length": 1000000000000000019884624838656,
- "pad_token": "<pad>",
+ "pad_token": "<eos>",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,
  "tokenizer_class": "GemmaTokenizer",
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a1ad659a7c27a1bd5a7e8340c70b7820198768eac7e89fea17c23138c078f482
- size 4856
+ oid sha256:238ed77d8152efed5f0fea1c0c4f62ae52975da6e0ad69f6d803dbce7985c7fa
+ size 4920
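
Finally, a sketch of how an adapter checkpoint like this one could be loaded for scoring. The repo id and the single-logit classification head are assumptions; the commit itself does not show how the model is meant to be consumed:

```python
import torch
from peft import AutoPeftModelForSequenceClassification
from transformers import AutoTokenizer

# Hypothetical repo id; substitute the adapter's actual Hub path.
repo_id = "DarshanDeshpande/gemma_2b_social_reasoning_reward_model"

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
tokenizer.pad_token = tokenizer.eos_token

# num_labels=1 assumes a scalar reward head, the usual reward-model setup.
model = AutoPeftModelForSequenceClassification.from_pretrained(repo_id, num_labels=1)
model.eval()

inputs = tokenizer("Helping a stranger carry groceries is kind.", return_tensors="pt")
with torch.no_grad():
    reward = model(**inputs).logits[0, 0].item()
print(f"reward: {reward:.4f}")
```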