BounharAbdelaziz committed
Commit 305044d
1 Parent(s): 8626c0f

BounharAbdelaziz/Terjman-Nano-MAX_LEN-512

Files changed (3):
  1. README.md +103 -0
  2. generation_config.json +16 -0
  3. model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,103 @@
+ ---
+ license: apache-2.0
+ base_model: Helsinki-NLP/opus-mt-en-ar
+ tags:
+ - generated_from_trainer
+ metrics:
+ - bleu
+ model-index:
+ - name: Terjman-Nano-MAX_LEN-512
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Terjman-Nano-MAX_LEN-512
+
+ This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ar](https://huggingface.co/Helsinki-NLP/opus-mt-en-ar) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 3.2038
+ - Bleu: 10.6239
+ - Gen Len: 35.2727
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 3e-05
+ - train_batch_size: 64
+ - eval_batch_size: 64
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 256
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.03
+ - num_epochs: 40
+
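The hyperparameters above are internally consistent with the results table that follows; a quick sanity check in plain Python (steps-per-epoch is read off the table, where epoch ~1.0 lands at step 140):

```python
# Sanity check of the training schedule implied by the card's hyperparameters.
train_batch_size = 64
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
assert total_train_batch_size == 256  # matches the stated total_train_batch_size

num_epochs = 40
steps_per_epoch = 140  # from the results table: epoch ~1.0 at step 140
total_steps = steps_per_epoch * num_epochs  # 5600, the table's final step
warmup_steps = int(0.03 * total_steps)  # lr_scheduler_warmup_ratio * total steps
print(total_steps, warmup_steps)  # 5600 168
```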
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
+ |:-------------:|:-------:|:----:|:---------------:|:-------:|:-------:|
+ | No log | 0.9982 | 140 | 4.8431 | 6.4393 | 31.6253 |
+ | No log | 1.9964 | 280 | 3.9077 | 7.7671 | 36.1047 |
+ | No log | 2.9947 | 420 | 3.6453 | 8.5008 | 35.303 |
+ | 4.7676 | 4.0 | 561 | 3.5034 | 9.293 | 34.416 |
+ | 4.7676 | 4.9982 | 701 | 3.4161 | 9.3322 | 34.5702 |
+ | 4.7676 | 5.9964 | 841 | 3.3582 | 9.6792 | 34.438 |
+ | 4.7676 | 6.9947 | 981 | 3.3182 | 9.8804 | 35.27 |
+ | 3.7555 | 8.0 | 1122 | 3.2904 | 10.0802 | 34.7576 |
+ | 3.7555 | 8.9982 | 1262 | 3.2684 | 10.2161 | 34.1873 |
+ | 3.7555 | 9.9964 | 1402 | 3.2534 | 10.0777 | 34.6612 |
+ | 3.6059 | 10.9947 | 1542 | 3.2420 | 10.637 | 34.6281 |
+ | 3.6059 | 12.0 | 1683 | 3.2325 | 10.6797 | 35.1185 |
+ | 3.6059 | 12.9982 | 1823 | 3.2267 | 10.5413 | 34.8898 |
+ | 3.6059 | 13.9964 | 1963 | 3.2210 | 10.6098 | 35.0 |
+ | 3.5561 | 14.9947 | 2103 | 3.2169 | 10.4863 | 34.8567 |
+ | 3.5561 | 16.0 | 2244 | 3.2141 | 10.6152 | 34.7328 |
+ | 3.5561 | 16.9982 | 2384 | 3.2119 | 10.6701 | 34.8815 |
+ | 3.5363 | 17.9964 | 2524 | 3.2100 | 10.5632 | 34.7576 |
+ | 3.5363 | 18.9947 | 2664 | 3.2089 | 10.5707 | 34.8623 |
+ | 3.5363 | 20.0 | 2805 | 3.2077 | 10.6275 | 34.8678 |
+ | 3.5363 | 20.9982 | 2945 | 3.2066 | 10.6857 | 35.0413 |
+ | 3.5299 | 21.9964 | 3085 | 3.2062 | 10.8112 | 35.3251 |
+ | 3.5299 | 22.9947 | 3225 | 3.2056 | 10.6908 | 34.0413 |
+ | 3.5299 | 24.0 | 3366 | 3.2051 | 10.5719 | 35.4298 |
+ | 3.5241 | 24.9982 | 3506 | 3.2046 | 10.5667 | 34.9036 |
+ | 3.5241 | 25.9964 | 3646 | 3.2042 | 10.9389 | 35.3361 |
+ | 3.5241 | 26.9947 | 3786 | 3.2043 | 10.5972 | 34.9532 |
+ | 3.5241 | 28.0 | 3927 | 3.2043 | 10.6626 | 35.3113 |
+ | 3.5247 | 28.9982 | 4067 | 3.2042 | 10.5286 | 35.0689 |
+ | 3.5247 | 29.9964 | 4207 | 3.2038 | 10.6298 | 34.4959 |
+ | 3.5247 | 30.9947 | 4347 | 3.2039 | 10.5897 | 34.9449 |
+ | 3.5247 | 32.0 | 4488 | 3.2037 | 10.7971 | 35.4711 |
+ | 3.5208 | 32.9982 | 4628 | 3.2039 | 10.6665 | 34.8402 |
+ | 3.5208 | 33.9964 | 4768 | 3.2039 | 10.5543 | 35.27 |
+ | 3.5208 | 34.9947 | 4908 | 3.2034 | 10.785 | 35.022 |
+ | 3.5159 | 36.0 | 5049 | 3.2037 | 10.6311 | 34.3388 |
+ | 3.5159 | 36.9982 | 5189 | 3.2037 | 10.4617 | 34.3085 |
+ | 3.5159 | 37.9964 | 5329 | 3.2037 | 10.7629 | 34.4518 |
+ | 3.5159 | 38.9947 | 5469 | 3.2036 | 10.6729 | 35.2066 |
+ | 3.524 | 39.9287 | 5600 | 3.2038 | 10.6239 | 35.2727 |
+
+
+ ### Framework versions
+
+ - Transformers 4.40.2
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
generation_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "bad_words_ids": [
+     [
+       62801
+     ]
+   ],
+   "bos_token_id": 0,
+   "decoder_start_token_id": 62801,
+   "eos_token_id": 0,
+   "forced_eos_token_id": 0,
+   "max_length": 512,
+   "num_beams": 4,
+   "pad_token_id": 62801,
+   "renormalize_logits": true,
+   "transformers_version": "4.40.2"
+ }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e722a528bceceae494c9eaad2e56dfa10bb220a94a1fe055ea12b3a5f3c252d9
+ oid sha256:3f500f5474aae498a631be2b8a06f9ddd277efdd5a1568ccdcd30ba9a37c7c0d
  size 152740916