---
library_name: transformers
license: cc-by-nc-4.0
language:
- fr
- wo

datasets:
- galsenai/french-wolof-translation
metrics:
- sacrebleu
model-index:
- name: nllb-fr-wo
  results:
  - task:
      name: Translation
      type: translation
    dataset:
      name: galsenai/french-wolof-translation
      type: galsenai/french-wolof-translation
    metrics:
      - name: sacrebleu
        type: sacrebleu
        value: 9.17
---

# Model Card for cibfaye/nllb-fr-wo

## Model Description

This model is a fine-tuned version of `facebook/nllb-200-distilled-600M` on the `galsenai/french-wolof-translation` dataset. It translates French (`fra_Latn`) into Wolof (`wol_Latn`).
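For context, a checkpoint like this can be produced with the standard `Seq2SeqTrainer` recipe. The sketch below is illustrative only, not the actual training script for this checkpoint: the `fr`/`wo` column names and every hyperparameter are assumptions.

```python
# Illustrative fine-tuning sketch; column names and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(base, src_lang="fra_Latn", tgt_lang="wol_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(base)

train = load_dataset("galsenai/french-wolof-translation", split="train")

def preprocess(batch):
    # "fr" and "wo" are assumed column names for the source/target pairs.
    return tokenizer(batch["fr"], text_target=batch["wo"], truncation=True, max_length=128)

tokenized = train.map(preprocess, batched=True, remove_columns=train.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="nllb-fr-wo",
    learning_rate=2e-5,  # placeholder hyperparameters, not the author's settings
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```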

## Evaluation

The model was evaluated on a 50-example subset of the test split of the `galsenai/french-wolof-translation` dataset. The metric is the BLEU score as computed by the `sacrebleu` library.
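The reported score can be reproduced along these lines. This is a hedged sketch: the `fr`/`wo` column names are assumptions about the dataset schema, and it reuses the `translate` helper defined in the How to Use section below.

```python
# Hedged evaluation sketch; assumes "fr"/"wo" columns and the translate() helper below.
import sacrebleu
from datasets import load_dataset

test = load_dataset("galsenai/french-wolof-translation", split="test")
subset = test.select(range(50))  # the card reports a 50-example subset

sources = subset["fr"]       # assumed source column
references = [subset["wo"]]  # sacrebleu expects a list of reference streams

hypotheses = [translate(s)[0] for s in sources]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")
```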

## Evaluation Results

BLEU score: 9.17

## How to Use

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "cibfaye/nllb-fr-wo"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def translate(text, src_lang='fra_Latn', tgt_lang='wol_Latn', a=32, b=3, max_input_length=1024, num_beams=5, **kwargs):
    """Translate a string (or list of strings) from src_lang to tgt_lang."""
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors='pt', padding=True, truncation=True, max_length=max_input_length)
    result = model.generate(
        **inputs.to(model.device),
        # NLLB selects the output language via the forced BOS token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        # Cap the output at a + b new tokens per input token.
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        num_beams=num_beams,
        **kwargs
    )
    return tokenizer.batch_decode(result, skip_special_tokens=True)

text = "Votre texte en français ici."
translation = translate(text)
print(translation)
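The `a` and `b` arguments cap generation at `a + b * input_length` new tokens, so output length scales with the input instead of being fixed. Any extra keyword arguments are forwarded to `model.generate`, so you can, for example, pass `num_beams=1` to trade translation quality for speed.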