---
language:
- ms
tags:
- paraphrase
datasets: mesolitica/translated-PAWS
metrics:
- sacrebleu
---

# finetune-paraphrase-t5-base-standard-bahasa-cased

T5 base fine-tuned for Malay (MS) paraphrasing.

## Dataset

1. Translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
2. Translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
3. Translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI

## Finetune details

1. Fine-tuned on a single RTX 3090 Ti.

Training scripts are at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5

## Supported prefix

1. `parafrasa: {string}`, for Malay (MS) paraphrasing.
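
A minimal sketch of how the prefix might be applied with the Hugging Face `transformers` library. The repository id `mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased` is an assumption inferred from this card's title and the dataset namespace, and `build_input`, `paraphrase`, and `max_length=256` are illustrative choices rather than settings documented here.

```python
PREFIX = "parafrasa: "

# Assumed repository id (inferred from the card title, not confirmed by the card).
MODEL_NAME = "mesolitica/finetune-paraphrase-t5-base-standard-bahasa-cased"


def build_input(text: str) -> str:
    """Prepend the paraphrase task prefix to the input sentence."""
    return f"{PREFIX}{text.strip()}"


def paraphrase(text: str, model_name: str = MODEL_NAME, max_length: int = 256) -> str:
    """Generate one paraphrase; downloads the model on first use."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    inputs = tokenizer(build_input(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires network access and the transformers + torch packages):
# print(paraphrase("Saya suka makan nasi goreng pada waktu pagi."))
```

Loading the tokenizer and model inside `paraphrase` keeps the sketch short; for repeated calls it would likely be better to load them once and reuse them.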

## Evaluation

Evaluated on the MRPC validation set and the ParaSCI arXiv test set. The sacreBLEU result:

```
{'name': 'BLEU',
 'score': 35.95965899952292,
 '_mean': -1.0,
 '_ci': -1.0,
 '_verbose': '61.7/41.3/32.0/25.8 (BP = 0.944 ratio = 0.946 hyp_len = 95593 ref_len = 101064)',
 'bp': 0.9443747373110852,
 'counts': [59014, 37157, 27016, 20383],
 'totals': [95593, 90049, 84505, 78961],
 'sys_len': 95593,
 'ref_len': 101064,
 'precisions': [61.73464584226878,
                41.263090095392506,
                31.969705934560086,
                25.81400944770203],
 'prec_str': '61.7/41.3/32.0/25.8',
 'ratio': 0.9458659859099184}
```
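
The fields in this output are tied together by the standard BLEU formula: the score is the brevity penalty times the geometric mean of the four n-gram precisions. As a sanity check, the sketch below recomputes the score from the reported `counts`, `totals`, `sys_len`, and `ref_len` using only the standard library. The function name `bleu_from_stats` is illustrative, and the sketch assumes no n-gram count is zero, so no smoothing is needed.

```python
import math

# Values copied from the evaluation output above.
counts = [59014, 37157, 27016, 20383]  # matched n-grams, n = 1..4
totals = [95593, 90049, 84505, 78961]  # hypothesis n-grams, n = 1..4
sys_len, ref_len = 95593, 101064       # hypothesis and reference lengths


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Return (score, brevity penalty, precisions) from corpus-level statistics."""
    # n-gram precisions as percentages.
    precisions = [100.0 * c / t for c, t in zip(counts, totals)]
    # The brevity penalty only applies when the hypothesis is shorter than the reference.
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    # Geometric mean of the precisions, computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean), bp, precisions


score, bp, precisions = bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"BLEU = {score:.2f}, BP = {bp:.3f}")
```

Because the hypotheses are about 5% shorter than the references (`ratio` ≈ 0.946), the brevity penalty accounts for roughly two BLEU points of the gap between the geometric mean of the precisions and the final score.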