lachkarsalim/LatinDarija_English-v1

text2text-generation

Inference Endpoints

lachkarsalim committed on Apr 17

Commit cb76a3b • 1 Parent(s): 77db517

Update README.md

Files changed (1)

README.md +11 -43

@@ -1,60 +1,28 @@
 ---
 license: apache-2.0
 base_model: Helsinki-NLP/opus-mt-ar-en
-tags:
-- generated_from_trainer
-metrics:
-- bleu
-model-index:
-- name: results_arabicTranslation
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# results_arabicTranslation
-This model is a fine-tuned version of [Helsinki-NLP/opus-mt-ar-en](https://huggingface.co/Helsinki-NLP/opus-mt-ar-en)
-It achieves the following results on the evaluation set:
-Epoch	Training Loss	Validation Loss	Bleu
-1	2.271900	2.034573	25.406637
-2	1.854200	1.787860	20.556681
-3	1.642800	1.677009	24.274589
-4	1.508300	1.630295	20.556681
-5	1.447700	1.615814	24.274589
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
 - train_batch_size: 32
 - eval_batch_size: 32
-- seed: 42
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: linear
 - num_epochs: 5
-- mixed_precision_training: True FP16 enabled
-### Framework versions
-- Transformers 4.38.2
-- Pytorch 2.2.2+cu121
-- Datasets 2.18.0
-- Tokenizers 0.15.2

 ---
 license: apache-2.0
 base_model: Helsinki-NLP/opus-mt-ar-en
+language:
+- ar
+- en
+pipeline_tag: translation
 ---
+---
+license: apache-2.0
+base_model: Helsinki-NLP/opus-mt-ar-en
+# This model's role is to translate Darija with Latin words or Arabizi into English. It was trained on 60,000 rows of translation examples.
+This model is a fine-tuned version of [Helsinki-NLP/opus-mt-ar-en](https://huggingface.co/Helsinki-NLP/opus-mt-ar-en) on the Darija Open Dataset (DODa), an ambitious open-source project dedicated to the Moroccan dialect. With about 150,000 entries, DODa is arguably the largest open-source collaborative project for Darija <=> English translation built for Natural Language Processing purposes.
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- GPU : A100
 - train_batch_size: 32
 - eval_batch_size: 32
 - num_epochs: 5
+- mixed_precision_training: True FP16 enabled
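
As a rough sanity check on the hyperparameters above, the number of optimizer steps implied by the card can be worked out from the stated figures (60,000 rows, batch size 32, 5 epochs). This is only a sketch under assumptions the card does not confirm: that all 60,000 rows were used for training, on a single device, with no gradient accumulation. The helper name `optimizer_steps` is hypothetical.

```python
import math

# Figures taken from the updated model card.
NUM_ROWS = 60_000        # "trained on 60,000 rows of translation examples"
TRAIN_BATCH_SIZE = 32    # train_batch_size
NUM_EPOCHS = 5           # num_epochs


def optimizer_steps(num_rows, batch_size, num_epochs):
    """Return (steps_per_epoch, total_steps).

    Assumes single-device training without gradient accumulation, and
    counts a final partial batch as one step (hypothetical helper).
    """
    steps_per_epoch = math.ceil(num_rows / batch_size)
    return steps_per_epoch, steps_per_epoch * num_epochs


steps_per_epoch, total_steps = optimizer_steps(
    NUM_ROWS, TRAIN_BATCH_SIZE, NUM_EPOCHS
)
print(steps_per_epoch, total_steps)  # 1875 9375
```

If part of the 60,000 rows was held out for evaluation, or if several devices or gradient accumulation were used, the real step count would be lower.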