mwoelki committed on
Commit c9351a8
1 Parent(s): bc61006

End of training

README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ license: apache-2.0
+ base_model: distilbert/distilgpt2
+ tags:
+ - generated_from_trainer
+ model-index:
+ - name: my_patent_abstract_causual_language-model
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # my_patent_abstract_causual_language-model
+
+ This model is a fine-tuned version of [distilbert/distilgpt2](https://huggingface.co/distilbert/distilgpt2) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.3468
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 3.0
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log | 1.0 | 486 | 1.5060 |
+ | 1.8345 | 2.0 | 972 | 1.3786 |
+ | 1.5008 | 3.0 | 1458 | 1.3468 |
+
+
+ ### Framework versions
+
+ - Transformers 4.41.2
+ - Pytorch 2.3.0+cu121
+ - Datasets 2.20.0
+ - Tokenizers 0.19.1
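
The numbers in the model card above can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes the reported loss is mean cross-entropy in nats, that training ran on one device without gradient accumulation, and that the linear scheduler used no warmup (the card lists none). The helper names `perplexity`, `max_train_examples`, and `linear_lr` are hypothetical.

```python
import math

# Values copied from the model card in this commit.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
NUM_EPOCHS = 3
STEPS_PER_EPOCH = 486
TOTAL_STEPS = 1458
FINAL_EVAL_LOSS = 1.3468


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss measured in nats (assumed)."""
    return math.exp(loss)


def max_train_examples(steps_per_epoch: int, batch_size: int) -> int:
    """Upper bound on examples seen per epoch.

    Assumes a single device and no gradient accumulation; the card states neither.
    """
    return steps_per_epoch * batch_size


def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(f"perplexity at final eval loss: {perplexity(FINAL_EVAL_LOSS):.3f}")
    print(f"at most {max_train_examples(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)} examples per epoch")
    for step in (0, 486, 972, 1458):
        print(step, linear_lr(step, TOTAL_STEPS, LEARNING_RATE))
```

Under these assumptions, 486 steps per epoch over 3 epochs matches the 1458 total steps in the results table, and the learning rate reaches zero at the final step.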
config.json CHANGED
@@ -7,6 +7,7 @@
    ],
    "attn_pdrop": 0.1,
    "bos_token_id": 50256,
+   "do_sample": true,
    "embd_pdrop": 0.1,
    "eos_token_id": 50256,
    "id2label": {
@@ -17,6 +18,7 @@
      "LABEL_0": 0
    },
    "layer_norm_epsilon": 1e-05,
+   "max_length": 50,
    "model_type": "gpt2",
    "n_ctx": 1024,
    "n_embd": 768,
generation_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 50256,
+   "do_sample": true,
+   "eos_token_id": 50256,
+   "max_length": 50,
+   "pad_token_id": 50256,
+   "transformers_version": "4.41.2"
+ }
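
To illustrate what this generation config implies at inference time, here is a minimal sketch, not part of the commit, that reads the JSON with the standard library and keeps only the fields that act as generation settings. The helper name `generate_kwargs` is hypothetical, and the idea of forwarding the result to `model.generate(...)` in Transformers is an assumption about how the file would be used by hand; normally the library loads it automatically.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG_JSON = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "do_sample": true,
  "eos_token_id": 50256,
  "max_length": 50,
  "pad_token_id": 50256,
  "transformers_version": "4.41.2"
}
"""

# Bookkeeping keys that describe the file itself rather than generation behaviour.
METADATA_KEYS = {"_from_model_config", "transformers_version"}


def generate_kwargs(config_text: str) -> dict:
    """Return only the settings that influence text generation."""
    config = json.loads(config_text)
    return {key: value for key, value in config.items() if key not in METADATA_KEYS}


if __name__ == "__main__":
    kwargs = generate_kwargs(GENERATION_CONFIG_JSON)
    print(kwargs)
    # With Transformers installed, these could be passed explicitly, e.g.
    #   model.generate(**inputs, **kwargs)
    # where `model` and `inputs` are hypothetical names for a loaded
    # causal LM and its tokenized prompt.
```

Two details are worth noting: `do_sample: true` means outputs vary between runs unless a seed is fixed, and `max_length: 50` counts prompt tokens together with generated tokens. The pad token reuses the end-of-text id 50256 because GPT-2 tokenizers define no separate padding token.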
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1132525746036e285687c62819138dc5b339855e4624de5d7474b47951ae3d63
+ oid sha256:549a2ebb459df287fc7ae7f94116531177b0d8a05e912ec196d395ff512992ec
  size 327657928
runs/Jun15_08-24-21_3981491d00a1/events.out.tfevents.1718439862.3981491d00a1.941.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8092d1e0897cf30d6d7894a92f6a5f42242c57b5070d0655dbf929839b5f8c38
- size 6194
+ oid sha256:60e480d440cb9aafd07bc1f6e625f0c2819fa3eb604d91c8f11929570c1d4d1c
+ size 6819
runs/Jun15_08-24-21_3981491d00a1/events.out.tfevents.1718440459.3981491d00a1.941.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:be890245dd48a2489ad362c01c5e269a28ce96d0dfa7ceeb97a58f9eb6f60528
+ size 359
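
The binary files in this commit appear only as Git LFS pointers: three lines giving the spec version, a SHA-256 object id, and the size in bytes. As a minimal sketch, not part of the commit, the following shows how such a pointer could be parsed and checked against a downloaded file. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib

# Pointer text for the newly added event file in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:be890245dd48a2489ad362c01c5e269a28ce96d0dfa7ceeb97a58f9eb6f60528
size 359
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Check that raw file bytes agree with the pointer's size and digest."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
```

This also explains why `model.safetensors` shows a new object id with an unchanged size: fine-tuning altered the weight values, not the number of stored bytes.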