improve code example and correct link
README.md
CHANGED
@@ -6,7 +6,7 @@ language:
 # GeNTE Evaluator
 
 The **Gender-Neutral Translation (GeNTE) Evaluator** is a sequence classification model used for evaluating inclusive rewriting and translations into Italian with the [GeNTE corpus](https://huggingface.co/datasets/FBK-MT/GeNTE).
-It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-
+It is built by fine-tuning the RoBERTa-based [UmBERTo model](https://huggingface.co/Musixmatch/umberto-commoncrawl-cased-v1).
 
 More details on the training process and the reproducibility can be found in the [official repository](https://github.com/hlt-mt/fbk-NEUTR-evAL/blob/main/solutions/GeNTE.md) and the [paper](https://aclanthology.org/2024.eacl-short.23/).
 
@@ -16,18 +16,19 @@ You can use the GeNTE Evaluator as follows:
 
 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch
 
 # load the tokenizer of UmBERTo
-tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-
+tokenizer = AutoTokenizer.from_pretrained("Musixmatch/umberto-commoncrawl-cased-v1", do_lower_case=False)
 
 # load GeNTE Evaluator
 model = AutoModelForSequenceClassification.from_pretrained("FBK-MT/GeNTE-evaluator")
 
 # neutral example
-sample = "Condividiamo il parere di chi ha presentato la relazione
-
-in particolare nel campo sanitario e della sicurezza."
-input = tokenizer(sample, return_tensors='pt')
+sample = ("Condividiamo il parere di chi ha presentato la relazione che ha posto "
+          "notevole enfasi sull'informazione in relazione ai rischi e sulla trasparenza, "
+          "in particolare nel campo sanitario e della sicurezza.")
+input = tokenizer(sample, return_tensors='pt', truncation=True, max_length=64)
 
 with torch.no_grad():
     probs = model(**input).logits
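Reviewer note: the snippet stores `model(**input).logits` in a variable named `probs`, but logits are unnormalized scores, not probabilities. A reader who wants a class prediction still has to apply a softmax and pick the highest-scoring index. The sketch below shows one way to do that in plain Python. The helper name `logits_to_prediction`, the example logits, and the placeholder label names are all hypothetical and are not taken from the README or the model card.

```python
import math


def logits_to_prediction(logits, id2label=None):
    """Turn one row of raw logits into (predicted label, probabilities)."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Index of the highest probability is the predicted class.
    best = max(range(len(probs)), key=probs.__getitem__)

    # Map the index to a name if a mapping was supplied, else return the index.
    label = id2label[best] if id2label else best
    return label, probs


# Hypothetical two-class logits; these are NOT real model output.
label, probs = logits_to_prediction([2.0, 0.0], {0: "LABEL_0", 1: "LABEL_1"})
print(label, [round(p, 3) for p in probs])
```

With the README snippet, the input row would presumably be `model(**input).logits[0].tolist()`, and the index-to-name mapping can be read from `model.config.id2label`. Which index corresponds to the neutral class is not stated in this diff, so it should be checked against the model card rather than assumed.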