nejox/distilbert-base-cased-distilled-squad-coffee20230108

Question Answering

Generated from Trainer

nejox committed on Jan 8, 2023

Commit 21b3ed7 (1 parent: e984dfc)

update model card README.md

Files changed (1)

README.md +18 -18

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert-base-cased-distilled-squad) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2375
+- Loss: 4.4291
 ## Model description
@@ -34,8 +34,8 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 16
-- eval_batch_size: 16
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -46,21 +46,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 2    | 2.6360          |
-| No log        | 2.0   | 4    | 2.6283          |
-| No log        | 3.0   | 6    | 2.6177          |
-| No log        | 4.0   | 8    | 2.5990          |
-| No log        | 5.0   | 10   | 2.5762          |
-| No log        | 6.0   | 12   | 2.5464          |
-| No log        | 7.0   | 14   | 2.5160          |
-| No log        | 8.0   | 16   | 2.4832          |
-| No log        | 9.0   | 18   | 2.4497          |
-| No log        | 10.0  | 20   | 2.4124          |
-| No log        | 11.0  | 22   | 2.3740          |
-| No log        | 12.0  | 24   | 2.3414          |
-| No log        | 13.0  | 26   | 2.3065          |
-| No log        | 14.0  | 28   | 2.2717          |
-| No log        | 15.0  | 30   | 2.2375          |
+| No log        | 1.0   | 92   | 1.8403          |
+| 2.3138        | 2.0   | 184  | 1.7960          |
+| 1.5166        | 3.0   | 276  | 1.8769          |
+| 0.9893        | 4.0   | 368  | 2.0993          |
+| 0.7197        | 5.0   | 460  | 2.5598          |
+| 0.4902        | 6.0   | 552  | 2.9495          |
+| 0.4027        | 7.0   | 644  | 2.9910          |
+| 0.3031        | 8.0   | 736  | 3.4344          |
+| 0.1708        | 9.0   | 828  | 3.9058          |
+| 0.1239        | 10.0  | 920  | 3.9043          |
+| 0.0651        | 11.0  | 1012 | 4.0318          |
+| 0.0394        | 12.0  | 1104 | 4.0876          |
+| 0.0394        | 13.0  | 1196 | 4.3854          |
+| 0.0174        | 14.0  | 1288 | 4.4298          |
+| 0.0063        | 15.0  | 1380 | 4.4291          |
 ### Framework versions
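
The updated card reports the loss after the last epoch (4.4291), while the validation loss in the new table is lowest much earlier. As a minimal sketch for reading the table, the following Python snippet copies the validation losses from the added rows and looks up the epoch with the smallest value. The names `val_loss` and `best_epoch` are hypothetical helpers introduced here for illustration; they are not part of the commit or of any library.

```python
# Validation loss per epoch, copied from the added ("+") rows of the
# results table in this diff. The dictionary name is illustrative only.
val_loss = {
    1: 1.8403, 2: 1.7960, 3: 1.8769, 4: 2.0993, 5: 2.5598,
    6: 2.9495, 7: 2.9910, 8: 3.4344, 9: 3.9058, 10: 3.9043,
    11: 4.0318, 12: 4.0876, 13: 4.3854, 14: 4.4298, 15: 4.4291,
}


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss in the mapping."""
    epoch = min(losses, key=losses.get)
    return epoch, losses[epoch]


epoch, loss = best_epoch(val_loss)
final_epoch = max(val_loss)

print(f"best epoch:  {epoch} (validation loss {loss:.4f})")
print(f"final epoch: {final_epoch} (validation loss {val_loss[final_epoch]:.4f})")
```

With the values above, the lowest validation loss is at epoch 2 (1.7960), while the training loss keeps falling through epoch 15; the figure at the top of the card corresponds to the final epoch rather than to that minimum.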