---
tags:
- generated_from_trainer
model-index:
- name: baseline
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# baseline

This model is a fine-tuned version of an unspecified base model on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9254
- Exact Match: 0.702

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 400
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: inverse_sqrt
- lr_scheduler_warmup_steps: 4000
- training_steps: 20000
- label_smoothing_factor: 0.1
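
To make the `inverse_sqrt` scheduler setting concrete, here is a minimal sketch of a learning-rate curve with linear warmup followed by inverse-square-root decay, using the `learning_rate`, `lr_scheduler_warmup_steps`, and `training_steps` values listed above. The function name `inverse_sqrt_lr` is hypothetical, and the formula assumes the decay timescale equals the warmup length; the scheduler actually used in training may differ in detail.

```python
import math

# Values taken from the hyperparameter list above.
BASE_LR = 1e-3
WARMUP_STEPS = 4000
TOTAL_STEPS = 20000


def inverse_sqrt_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Illustrative schedule (assumption, not the Trainer's own code):
    linear warmup to base_lr, then decay proportional to 1/sqrt(step)."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # After warmup the rate falls off as sqrt(warmup_steps / step).
    return base_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    for step in (0, 2000, 4000, 8000, 16000, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {inverse_sqrt_lr(step):.6f}")
```

Under these assumptions the rate peaks at 0.001 at step 4000 and has halved to 0.0005 by step 16000.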

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Exact Match |
|:-------------:|:-----:|:-----:|:---------------:|:-----------:|
| 2.8524        | 16.0  | 400   | 1.7375          | 0.059       |
| 1.422         | 32.0  | 800   | 1.6708          | 0.11        |
| 1.0862        | 48.0  | 1200  | 1.7149          | 0.094       |
| 0.9374        | 64.0  | 1600  | 1.6508          | 0.159       |
| 0.8704        | 80.0  | 2000  | 1.6920          | 0.112       |
| 0.8356        | 96.0  | 2400  | 1.5605          | 0.16        |
| 0.8157        | 112.0 | 2800  | 1.5249          | 0.188       |
| 0.8029        | 128.0 | 3200  | 1.3993          | 0.25        |
| 0.7917        | 144.0 | 3600  | 1.2768          | 0.312       |
| 0.7821        | 160.0 | 4000  | 1.2213          | 0.397       |
| 0.7719        | 176.0 | 4400  | 1.1216          | 0.432       |
| 0.7635        | 192.0 | 4800  | 1.1076          | 0.458       |
| 0.7584        | 208.0 | 5200  | 1.0275          | 0.567       |
| 0.7556        | 224.0 | 5600  | 1.0464          | 0.552       |
| 0.7525        | 240.0 | 6000  | 1.0442          | 0.56        |
| 0.7496        | 256.0 | 6400  | 1.0108          | 0.581       |
| 0.7487        | 272.0 | 6800  | 0.9721          | 0.61        |
| 0.7467        | 288.0 | 7200  | 1.0326          | 0.567       |
| 0.7466        | 304.0 | 7600  | 0.9900          | 0.572       |
| 0.7449        | 320.0 | 8000  | 1.0150          | 0.604       |
| 0.7445        | 336.0 | 8400  | 0.9755          | 0.603       |
| 0.7433        | 352.0 | 8800  | 0.9705          | 0.645       |
| 0.7432        | 368.0 | 9200  | 0.9567          | 0.663       |
| 0.7432        | 384.0 | 9600  | 0.9733          | 0.68        |
| 0.7425        | 400.0 | 10000 | 0.9262          | 0.67        |
| 0.7417        | 416.0 | 10400 | 0.9216          | 0.673       |
| 0.7409        | 432.0 | 10800 | 0.9411          | 0.681       |
| 0.7404        | 448.0 | 11200 | 0.9312          | 0.674       |
| 0.7405        | 464.0 | 11600 | 0.9777          | 0.585       |
| 0.7406        | 480.0 | 12000 | 0.9191          | 0.683       |
| 0.7395        | 496.0 | 12400 | 0.9216          | 0.643       |
| 0.7396        | 512.0 | 12800 | 0.9764          | 0.645       |
| 0.7394        | 528.0 | 13200 | 0.9361          | 0.644       |
| 0.7392        | 544.0 | 13600 | 0.9210          | 0.67        |
| 0.739         | 560.0 | 14000 | 0.9387          | 0.688       |
| 0.7389        | 576.0 | 14400 | 0.9385          | 0.67        |
| 0.7383        | 592.0 | 14800 | 0.9500          | 0.655       |
| 0.7386        | 608.0 | 15200 | 0.9405          | 0.67        |
| 0.7383        | 624.0 | 15600 | 0.9335          | 0.691       |
| 0.738         | 640.0 | 16000 | 0.9079          | 0.708       |
| 0.7379        | 656.0 | 16400 | 0.9027          | 0.714       |
| 0.7376        | 672.0 | 16800 | 0.8969          | 0.703       |
| 0.7372        | 688.0 | 17200 | 0.9169          | 0.685       |
| 0.7375        | 704.0 | 17600 | 0.8895          | 0.738       |
| 0.7376        | 720.0 | 18000 | 0.8951          | 0.734       |
| 0.7371        | 736.0 | 18400 | 0.9408          | 0.673       |
| 0.737         | 752.0 | 18800 | 0.9270          | 0.693       |
| 0.7371        | 768.0 | 19200 | 0.9063          | 0.71        |
| 0.7369        | 784.0 | 19600 | 0.9253          | 0.678       |
| 0.7367        | 800.0 | 20000 | 0.9254          | 0.702       |
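
The table can also be read programmatically. The sketch below copies only the rows from step 16000 onward (the variable names are hypothetical) and picks the evaluation with the highest exact match and the lowest validation loss. It also derives the number of optimizer steps per epoch from the first table row, which reports epoch 16.0 at step 400.

```python
# (step, validation_loss, exact_match) for the last rows of the table above.
LAST_EVALS = [
    (16000, 0.9079, 0.708),
    (16400, 0.9027, 0.714),
    (16800, 0.8969, 0.703),
    (17200, 0.9169, 0.685),
    (17600, 0.8895, 0.738),
    (18000, 0.8951, 0.734),
    (18400, 0.9408, 0.673),
    (18800, 0.9270, 0.693),
    (19200, 0.9063, 0.710),
    (19600, 0.9253, 0.678),
    (20000, 0.9254, 0.702),
]

# Best evaluation by each criterion within this excerpt.
best_by_exact_match = max(LAST_EVALS, key=lambda row: row[2])
best_by_val_loss = min(LAST_EVALS, key=lambda row: row[1])

# First table row: epoch 16.0 is reached at step 400.
steps_per_epoch = 400 / 16.0

if __name__ == "__main__":
    print("best exact match:", best_by_exact_match)
    print("lowest validation loss:", best_by_val_loss)
    print("steps per epoch:", steps_per_epoch)
```

Both criteria select step 17600 (validation loss 0.8895, exact match 0.738), so the final checkpoint at step 20000 that is reported at the top of this card is not the strongest evaluation in the run.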


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0