ArunIcfoss/mbart-large-50-many-to-many-mmt-ICFOSS-Malayalam_English_Translation

ArunIcfoss committed on Apr 15, 2024

Commit c0b476e (verified) · 1 parent: 5340256

End of training

Files changed (2)

README.md +71 -0
adapter_model.safetensors +1 -1

README.md ADDED

@@ -0,0 +1,71 @@

+---
+library_name: peft
+tags:
+- generated_from_trainer
+base_model: facebook/mbart-large-50-many-to-many-mmt
+metrics:
+- bleu
+- rouge
+model-index:
+- name: mbart-large-50-many-to-many-mmt-ICFOSS-Malayalam_English_Translation
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# mbart-large-50-many-to-many-mmt-ICFOSS-Malayalam_English_Translation
+This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unspecified dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.3733
+- Bleu: 28.9041
+- Rouge: {'rouge1': 0.6211709615166336, 'rouge2': 0.3817538086155071, 'rougeL': 0.5654819931253774, 'rougeLsum': 0.5656455299372645}
+- Chrf: {'score': 56.252579884228325, 'char_order': 6, 'word_order': 0, 'beta': 2}
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 16
+- eval_batch_size: 16
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- num_epochs: 8
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Rouge                                                                                                                        | Chrf                                                                       |
+|:-------------:|:-----:|:-----:|:---------------:|:-------:|:----------------------------------------------------------------------------------------------------------------------------:|:--------------------------------------------------------------------------:|
+| 1.5329        | 1.0   | 4700  | 1.4284          | 27.0756 | {'rouge1': 0.6054918604734425, 'rouge2': 0.36327221325964765, 'rougeL': 0.5490261054453232, 'rougeLsum': 0.5491186003413475} | {'score': 54.690919979551, 'char_order': 6, 'word_order': 0, 'beta': 2}    |
+| 1.4295        | 2.0   | 9400  | 1.3924          | 28.2063 | {'rouge1': 0.614973366544844, 'rouge2': 0.373550100507563, 'rougeL': 0.5589026806041284, 'rougeLsum': 0.5589661976445393}    | {'score': 55.635529686949894, 'char_order': 6, 'word_order': 0, 'beta': 2} |
+| 1.3942        | 3.0   | 14100 | 1.3792          | 28.5831 | {'rouge1': 0.6187502745206666, 'rouge2': 0.37919936984407143, 'rougeL': 0.5626864397042893, 'rougeLsum': 0.5627150169042504} | {'score': 56.019161628219024, 'char_order': 6, 'word_order': 0, 'beta': 2} |
+| 1.3795        | 4.0   | 18800 | 1.3759          | 28.7523 | {'rouge1': 0.620515288235373, 'rouge2': 0.38072092563685545, 'rougeL': 0.5644953116677603, 'rougeLsum': 0.5646285495158272}  | {'score': 56.162861197192925, 'char_order': 6, 'word_order': 0, 'beta': 2} |
+| 1.3723        | 5.0   | 23500 | 1.3735          | 28.8675 | {'rouge1': 0.6225302294049915, 'rouge2': 0.382440202243451, 'rougeL': 0.5664785907343486, 'rougeLsum': 0.5666347228887372}   | {'score': 56.30835530151895, 'char_order': 6, 'word_order': 0, 'beta': 2}  |
+| 1.3684        | 6.0   | 28200 | 1.3731          | 28.8915 | {'rouge1': 0.6214787732761883, 'rouge2': 0.3815472818692578, 'rougeL': 0.5656767538045446, 'rougeLsum': 0.5657190870277087}  | {'score': 56.251600472693866, 'char_order': 6, 'word_order': 0, 'beta': 2} |
+| 1.3685        | 7.0   | 32900 | 1.3732          | 28.8953 | {'rouge1': 0.6216361131555139, 'rouge2': 0.3821354228713412, 'rougeL': 0.5655300849639422, 'rougeLsum': 0.565595149126267}   | {'score': 56.26874870012928, 'char_order': 6, 'word_order': 0, 'beta': 2}  |
+| 1.3678        | 8.0   | 37600 | 1.3733          | 28.9041 | {'rouge1': 0.6211709615166336, 'rouge2': 0.3817538086155071, 'rougeL': 0.5654819931253774, 'rougeLsum': 0.5656455299372645}  | {'score': 56.252579884228325, 'char_order': 6, 'word_order': 0, 'beta': 2} |
+### Framework versions
+- PEFT 0.10.0
+- Transformers 4.39.3
+- Pytorch 2.1.0+cu121
+- Datasets 2.18.0
+- Tokenizers 0.15.0
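
The hyperparameters and results table in the card can be cross-checked with a short script. The sketch below is not part of the commit; it copies the numbers from the card and assumes a plain cosine decay with no warmup (the card lists none), a single device, and no gradient accumulation. The helper name `cosine_lr` is hypothetical.

```python
import math

# Values copied from the card's hyperparameter list and results table.
BASE_LR = 2e-4
STEPS_PER_EPOCH = 4700
NUM_EPOCHS = 8
TRAIN_BATCH_SIZE = 16

VAL_LOSS = [1.4284, 1.3924, 1.3792, 1.3759, 1.3735, 1.3731, 1.3732, 1.3733]
BLEU = [27.0756, 28.2063, 28.5831, 28.7523, 28.8675, 28.8915, 28.8953, 28.9041]


def cosine_lr(step, total_steps, base_lr=BASE_LR):
    """Cosine decay from base_lr to 0 (assumption: no warmup phase)."""
    progress = step / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # matches the step count in the last table row

# 1-based epoch numbers of the lowest validation loss and the highest BLEU.
best_loss_epoch = min(range(NUM_EPOCHS), key=VAL_LOSS.__getitem__) + 1
best_bleu_epoch = max(range(NUM_EPOCHS), key=BLEU.__getitem__) + 1

# Upper bound on training examples seen per epoch
# (assumption: one device, no gradient accumulation, last batch possibly partial).
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

for epoch in range(1, NUM_EPOCHS + 1):
    lr = cosine_lr(epoch * STEPS_PER_EPOCH, total_steps)
    print(f"epoch {epoch}: lr={lr:.2e} "
          f"val_loss={VAL_LOSS[epoch - 1]} bleu={BLEU[epoch - 1]}")
```

Under these assumptions the learning rate has halved by the end of epoch 4 and reaches zero at the end of epoch 8, which is consistent with the validation loss flattening from epoch 5 onward. The lowest validation loss in the table is at epoch 6, while BLEU is highest at epoch 8.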

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0457168949c2a74a753866ac5eb9db92d10f7236a25100f70ed5cc7d07f7ad84
+oid sha256:a71166ddd77dd9555552a3874713c404a3002ab6ff3b6738d9414f3b1fb4ccb2
 size 4739032