Commit 6b548c3, committed by Wonder-Griffin
1 Parent(s): 66bbc51

Model save

Files changed (3)
  1. README.md +59 -66
  2. config.json +1 -1
  3. training_args.bin +2 -2
README.md CHANGED
@@ -1,66 +1,59 @@
- ---
- tags:
- - text-generation-inference
- model-index:
- - name: JudgeLLM2
- results: []
- license: wtfpl
- datasets:
- - Salesforce/wikitext
- language:
- - en
- library_name: transformers
- pipeline_tag: text-generation
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # JudgeLLM2
-
- This model was trained from scratch on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6889
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 16
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 64
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 3
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.7372 | 0.8715 | 500 | 0.7445 |
- | 0.7295 | 1.7429 | 1000 | 0.7078 |
- | 0.7078 | 2.6144 | 1500 | 0.6889 |
-
-
- ### Framework versions
-
- - Transformers 4.43.3
- - Pytorch 2.4.0+cu124
- - Datasets 2.20.0
- - Tokenizers 0.19.1
+ ---
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: JudgeLLM2
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # JudgeLLM2
+
+ This model was trained from scratch on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.6889
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 16
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 64
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:------:|:----:|:---------------:|
+ | 0.7372 | 0.8715 | 500 | 0.7445 |
+ | 0.7295 | 1.7429 | 1000 | 0.7078 |
+ | 0.7078 | 2.6144 | 1500 | 0.6889 |
+
+
+ ### Framework versions
+
+ - Transformers 4.43.3
+ - Pytorch 2.4.0+cu124
+ - Datasets 2.20.0
+ - Tokenizers 0.19.1
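The hyperparameters and results table in the README diff above are related by simple arithmetic, which the sketch below works through. It is only an illustration: `num_devices = 1` is an assumption (the card does not state the device count), and the steps-per-epoch figure is an estimate derived from the first logged row, not a value reported by the Trainer.

```python
# Values copied from the model card in the diff above.
train_batch_size = 16
gradient_accumulation_steps = 4
num_devices = 1  # assumption: the card does not state how many devices were used

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 64, matching total_train_batch_size in the card

# First row of the results table: step 500 was logged at epoch 0.8715.
logged_step = 500
logged_epoch = 0.8715

# Estimated optimizer steps per epoch (an approximation from rounded table values).
steps_per_epoch = logged_step / logged_epoch
print(round(steps_per_epoch))

# Rough number of training examples seen per epoch under the assumptions above.
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
print(round(approx_examples_per_epoch))
```

Under these assumptions one epoch is roughly 574 optimizer steps, so the three logged rows (steps 500, 1000, 1500) all fall within the first three epochs.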
config.json CHANGED
@@ -1,6 +1,6 @@
  {
  "_name_": "Judge-GPT2",
- "_name_or_path": "C:/Users/wonde/text-generation-ai/JudgeLLM2/checkpoint-466",
+ "_name_or_path": "C:/Users/wonde/text-generation-ai/JudgeLLM2/checkpoint-1719",
  "activation_function": "gelu_new",
  "architectures": [
  "GPT2LMHeadModel"
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7fb1d148aa449fc22de9dbeecbd962246d0a97c5188a5cbada4f57d726cbfbce
- size 5176
+ oid sha256:0c5d66bebf67904ec00e3196bc94f690f59000b1f98628db2f28b08845c0ecae
+ size 5112
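The `training_args.bin` diff above shows a Git LFS pointer file, not the binary itself: three `key value` lines giving the spec version, the object hash, and the size in bytes. As a minimal sketch of how such a pointer could be read, the hypothetical helper `parse_lfs_pointer` below (not part of any library) splits each line on its first space. It assumes the three-field layout shown in the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>"; split on the first space only.
        key, _, value = line.strip().partition(" ")
        fields[key] = value

    # The oid has the form "<hash algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# The new pointer contents from the diff above.
new_pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0c5d66bebf67904ec00e3196bc94f690f59000b1f98628db2f28b08845c0ecae
size 5112
"""

info = parse_lfs_pointer(new_pointer)
print(info["hash_algo"], info["size"])
```

Comparing the parsed `digest` and `size` of the old and new pointers is enough to tell that the stored binary changed, without downloading either version.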