BounharAbdelaziz/Terjman-Nano-MAX_LEN-512

Files changed (3)

README.md +103 -0
generation_config.json +16 -0
model.safetensors +1 -1

README.md ADDED

	@@ -0,0 +1,103 @@

+---
+license: apache-2.0
+base_model: Helsinki-NLP/opus-mt-en-ar
+tags:
+- generated_from_trainer
+metrics:
+- bleu
+model-index:
+- name: Terjman-Nano-MAX_LEN-512
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# Terjman-Nano-MAX_LEN-512
+This model is a fine-tuned version of [Helsinki-NLP/opus-mt-en-ar](https://huggingface.co/Helsinki-NLP/opus-mt-en-ar) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.2038
+- Bleu: 10.6239
+- Gen Len: 35.2727
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 3e-05
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 256
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.03
+- num_epochs: 40
+### Training results
+| Training Loss | Epoch   | Step | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:-------:|:----:|:---------------:|:-------:|:-------:|
+| No log        | 0.9982  | 140  | 4.8431          | 6.4393  | 31.6253 |
+| No log        | 1.9964  | 280  | 3.9077          | 7.7671  | 36.1047 |
+| No log        | 2.9947  | 420  | 3.6453          | 8.5008  | 35.303  |
+| 4.7676        | 4.0     | 561  | 3.5034          | 9.293   | 34.416  |
+| 4.7676        | 4.9982  | 701  | 3.4161          | 9.3322  | 34.5702 |
+| 4.7676        | 5.9964  | 841  | 3.3582          | 9.6792  | 34.438  |
+| 4.7676        | 6.9947  | 981  | 3.3182          | 9.8804  | 35.27   |
+| 3.7555        | 8.0     | 1122 | 3.2904          | 10.0802 | 34.7576 |
+| 3.7555        | 8.9982  | 1262 | 3.2684          | 10.2161 | 34.1873 |
+| 3.7555        | 9.9964  | 1402 | 3.2534          | 10.0777 | 34.6612 |
+| 3.6059        | 10.9947 | 1542 | 3.2420          | 10.637  | 34.6281 |
+| 3.6059        | 12.0    | 1683 | 3.2325          | 10.6797 | 35.1185 |
+| 3.6059        | 12.9982 | 1823 | 3.2267          | 10.5413 | 34.8898 |
+| 3.6059        | 13.9964 | 1963 | 3.2210          | 10.6098 | 35.0    |
+| 3.5561        | 14.9947 | 2103 | 3.2169          | 10.4863 | 34.8567 |
+| 3.5561        | 16.0    | 2244 | 3.2141          | 10.6152 | 34.7328 |
+| 3.5561        | 16.9982 | 2384 | 3.2119          | 10.6701 | 34.8815 |
+| 3.5363        | 17.9964 | 2524 | 3.2100          | 10.5632 | 34.7576 |
+| 3.5363        | 18.9947 | 2664 | 3.2089          | 10.5707 | 34.8623 |
+| 3.5363        | 20.0    | 2805 | 3.2077          | 10.6275 | 34.8678 |
+| 3.5363        | 20.9982 | 2945 | 3.2066          | 10.6857 | 35.0413 |
+| 3.5299        | 21.9964 | 3085 | 3.2062          | 10.8112 | 35.3251 |
+| 3.5299        | 22.9947 | 3225 | 3.2056          | 10.6908 | 34.0413 |
+| 3.5299        | 24.0    | 3366 | 3.2051          | 10.5719 | 35.4298 |
+| 3.5241        | 24.9982 | 3506 | 3.2046          | 10.5667 | 34.9036 |
+| 3.5241        | 25.9964 | 3646 | 3.2042          | 10.9389 | 35.3361 |
+| 3.5241        | 26.9947 | 3786 | 3.2043          | 10.5972 | 34.9532 |
+| 3.5241        | 28.0    | 3927 | 3.2043          | 10.6626 | 35.3113 |
+| 3.5247        | 28.9982 | 4067 | 3.2042          | 10.5286 | 35.0689 |
+| 3.5247        | 29.9964 | 4207 | 3.2038          | 10.6298 | 34.4959 |
+| 3.5247        | 30.9947 | 4347 | 3.2039          | 10.5897 | 34.9449 |
+| 3.5247        | 32.0    | 4488 | 3.2037          | 10.7971 | 35.4711 |
+| 3.5208        | 32.9982 | 4628 | 3.2039          | 10.6665 | 34.8402 |
+| 3.5208        | 33.9964 | 4768 | 3.2039          | 10.5543 | 35.27   |
+| 3.5208        | 34.9947 | 4908 | 3.2034          | 10.785  | 35.022  |
+| 3.5159        | 36.0    | 5049 | 3.2037          | 10.6311 | 34.3388 |
+| 3.5159        | 36.9982 | 5189 | 3.2037          | 10.4617 | 34.3085 |
+| 3.5159        | 37.9964 | 5329 | 3.2037          | 10.7629 | 34.4518 |
+| 3.5159        | 38.9947 | 5469 | 3.2036          | 10.6729 | 35.2066 |
+| 3.524         | 39.9287 | 5600 | 3.2038          | 10.6239 | 35.2727 |
+### Framework versions
+- Transformers 4.40.2
+- Pytorch 2.2.1+cu121
+- Datasets 2.19.1
+- Tokenizers 0.19.1
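
The hyperparameters in the model card above imply an effective batch of 256 sequences per optimizer step. With a warmup ratio of 0.03 over the 5600 steps shown in the results table, warmup lasts roughly 168 steps. The sketch below works through that arithmetic and a linear warmup/decay schedule. It assumes single-device training and that the warmup length is the ratio applied to the total step count. `linear_lr` is a hypothetical helper written for illustration, not the Trainer's own scheduler code.

```python
# Hyperparameters copied from the model card above.
learning_rate = 3e-05
train_batch_size = 64
gradient_accumulation_steps = 4
warmup_ratio = 0.03
total_steps = 5600  # last step listed in the training results table

# Assumption: one device, so effective batch = per-device batch x accumulation steps.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Assumption: warmup length is the warmup ratio applied to the total step count.
warmup_steps = round(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("warmup steps:", warmup_steps)
    for step in (0, 84, 168, 2884, 5600):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 168 and has decayed close to zero in the last few epochs. That is consistent with the validation loss settling near 3.204 from about epoch 25 onward.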

generation_config.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "bad_words_ids": [
+    [
+      62801
+    ]
+  ],
+  "bos_token_id": 0,
+  "decoder_start_token_id": 62801,
+  "eos_token_id": 0,
+  "forced_eos_token_id": 0,
+  "max_length": 512,
+  "num_beams": 4,
+  "pad_token_id": 62801,
+  "renormalize_logits": true,
+  "transformers_version": "4.40.2"
+}
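
In the generation settings above, the pad token id (62801) also serves as the decoder start token and is listed in `bad_words_ids`, so it cannot appear in the output. Token id 0 is both the end-of-sequence id and the forced final token. The sketch below checks those relationships on an inlined copy of the file, using only the standard library. `check_generation_config` is a hypothetical helper, and the checks reflect a reading of this particular config rather than any rule enforced by the Transformers library.

```python
import json

# Contents of generation_config.json from the diff above, inlined so the check is self-contained.
GENERATION_CONFIG = json.loads("""
{
  "bad_words_ids": [[62801]],
  "bos_token_id": 0,
  "decoder_start_token_id": 62801,
  "eos_token_id": 0,
  "forced_eos_token_id": 0,
  "max_length": 512,
  "num_beams": 4,
  "pad_token_id": 62801,
  "renormalize_logits": true,
  "transformers_version": "4.40.2"
}
""")


def check_generation_config(cfg):
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    pad = cfg["pad_token_id"]
    # Decoding is expected to start from the pad token in this config.
    if cfg["decoder_start_token_id"] != pad:
        problems.append("decoder_start_token_id differs from pad_token_id")
    # The pad token should be banned from generated output.
    if [pad] not in cfg["bad_words_ids"]:
        problems.append("pad_token_id is not listed in bad_words_ids")
    # The token forced at max_length should be the end-of-sequence token.
    if cfg["forced_eos_token_id"] != cfg["eos_token_id"]:
        problems.append("forced_eos_token_id differs from eos_token_id")
    # Beam count and length limit must be positive for generation to make sense.
    if cfg["num_beams"] < 1 or cfg["max_length"] < 1:
        problems.append("num_beams and max_length must be positive")
    return problems


if __name__ == "__main__":
    print(check_generation_config(GENERATION_CONFIG) or "all checks passed")
```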

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e722a528bceceae494c9eaad2e56dfa10bb220a94a1fe055ea12b3a5f3c252d9
+oid sha256:3f500f5474aae498a631be2b8a06f9ddd277efdd5a1568ccdcd30ba9a37c7c0d
 size 152740916
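
The pointer diff shows a new SHA-256 `oid` with an unchanged `size`. That would be consistent with updated weights stored in tensors of the same shapes, although the diff alone does not confirm this. The sketch below parses a Git LFS pointer and checks a downloaded file against it. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the sketch assumes the simple `key value` pointer layout shown in the diff.

```python
import hashlib
from pathlib import Path

# The new pointer for model.safetensors, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3f500f5474aae498a631be2b8a06f9ddd277efdd5a1568ccdcd30ba9a37c7c0d
size 152740916
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the size and hash recorded in the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    print(parse_lfs_pointer(NEW_POINTER))
```

After downloading the weights, `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` would indicate whether the local file corresponds to this commit.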