Commit 9a4739c
Parent(s): 625faf7
Update README.md

README.md CHANGED
@@ -45,74 +45,6 @@ Note that this model is primarily aimed at being fine-tuned on tasks that use the
 to make decisions, such as sequence classification, token classification or question answering. For tasks such as text
 generation you should look at a model like GPT-2.
 
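As a quick illustration of that intended use, a sequence-classification head can be put on top of the pretrained encoder. The sketch below is only a starting point: the binary label count and the input sentence are illustrative assumptions, not part of this card.

```python
# Illustrative sketch: put a sequence-classification head on top of FNet.
# num_labels=2 and the input sentence are assumptions; the head is randomly
# initialized until the model is fine-tuned on a downstream task.
from transformers import FNetForSequenceClassification, FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
model = FNetForSequenceClassification.from_pretrained("google/fnet-base", num_labels=2)

# FNet has no attention mask, so pad/truncate to the full 512-token length.
inputs = tokenizer("Replace me by any text you'd like.", return_tensors="pt",
                   padding="max_length", truncation=True, max_length=512)
logits = model(**inputs).logits  # shape: (1, num_labels)
```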
-### How to use
-
-You can use this model directly with a pipeline for masked language modeling:
-
-**Note: The mask-filling pipeline does not behave exactly like the original model, which performs masking after converting the text to tokens; in the pipeline, an additional space is added after the [MASK] token.**
-
-```python
->>> from transformers import FNetForMaskedLM, FNetTokenizer, pipeline
->>> tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
->>> model = FNetForMaskedLM.from_pretrained("google/fnet-base")
->>> unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
->>> unmasker("Hello I'm a [MASK] model.")
-
-[
-{"sequence": "hello i'm a new model.", "score": 0.12073223292827606, "token": 351, "token_str": "new"},
-{"sequence": "hello i'm a first model.", "score": 0.08501081168651581, "token": 478, "token_str": "first"},
-{"sequence": "hello i'm a next model.", "score": 0.060546260327100754, "token": 1037, "token_str": "next"},
-{"sequence": "hello i'm a last model.", "score": 0.038265593349933624, "token": 813, "token_str": "last"},
-{"sequence": "hello i'm a sister model.", "score": 0.033868927508592606, "token": 6232, "token_str": "sister"},
-]
-
-```
-
-Here is how to use this model to get the features of a given text in PyTorch:
-
-**Note: You must set the maximum sequence length to 512 and truncate/pad every input to that length, because the original model has no attention mask and considers all hidden states during the forward pass.**
-
-```python
-from transformers import FNetTokenizer, FNetModel
-tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
-model = FNetModel.from_pretrained("google/fnet-base")
-text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='pt', padding='max_length', truncation=True, max_length=512)
-output = model(**encoded_input)
-```
-
-### Limitations and bias
-
-Even though the training data used for this model could be characterized as fairly neutral, the model can still make biased predictions. Note, however, that the model's MLM accuracy may also affect these answers. Given below are some examples where gender bias could be expected:
-
-```python
->>> from transformers import FNetForMaskedLM, FNetTokenizer, pipeline
->>> tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
->>> model = FNetForMaskedLM.from_pretrained("google/fnet-base")
->>> unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
->>> unmasker("The man worked as a [MASK].")
-
-[
-{"sequence": "the man worked as a man.", "score": 0.07003913819789886, "token": 283, "token_str": "man"},
-{"sequence": "the man worked as a..", "score": 0.06601415574550629, "token": 16678, "token_str": "."},
-{"sequence": "the man worked as a reason.", "score": 0.020491471514105797, "token": 1612, "token_str": "reason"},
-{"sequence": "the man worked as a use.", "score": 0.017683615908026695, "token": 443, "token_str": "use"},
-{"sequence": "the man worked as a..", "score": 0.015186904929578304, "token": 845, "token_str": "."},
-]
-
->>> unmasker("The woman worked as a [MASK].")
-
-[
-{"sequence": "the woman worked as a..", "score": 0.12459157407283783, "token": 16678, "token_str": "."},
-{"sequence": "the woman worked as a man.", "score": 0.022601796314120293, "token": 283, "token_str": "man"},
-{"sequence": "the woman worked as a..", "score": 0.0209997296333313, "token": 845, "token_str": "."},
-{"sequence": "the woman worked as a woman.", "score": 0.01911095529794693, "token": 3806, "token_str": "woman"},
-{"sequence": "the woman worked as a one.", "score": 0.01739976927638054, "token": 276, "token_str": "one"},
-]
-```
-
-This bias will also affect all fine-tuned versions of this model.
-
 ## Training data
 
 The FNet model was pretrained on [C4](https://huggingface.co/datasets/c4), a cleaned version of the Common Crawl dataset.
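To take a quick look at that corpus, it can be streamed with the `datasets` library. This is only an illustrative sketch; the `en` configuration and the streaming split are assumptions, not details from this card.

```python
# Illustrative sketch: stream a few documents from the C4 pretraining corpus.
# The "en" configuration and streaming mode are assumptions.
from datasets import load_dataset

c4 = load_dataset("c4", "en", split="train", streaming=True)
for i, example in enumerate(c4):
    print(example["text"][:120])
    if i == 2:
        break
```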
@@ -171,6 +103,42 @@ For more details, please refer to the checkpoints linked with the scores.
 
 We can see that FNet-base achieves around 93% of BERT-base's performance while requiring *ca.* 30% less time to fine-tune on the downstream tasks.
 
+### How to use
+
+You can use this model directly with a pipeline for masked language modeling:
+
+**Note: The mask-filling pipeline does not behave exactly like the original model, which performs masking after converting the text to tokens; in the pipeline, an additional space is added after the [MASK] token.**
+
+```python
+>>> from transformers import FNetForMaskedLM, FNetTokenizer, pipeline
+>>> tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
+>>> model = FNetForMaskedLM.from_pretrained("google/fnet-base")
+>>> unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
+>>> unmasker("Hello I'm a [MASK] model.")
+
+[
+{"sequence": "hello i'm a new model.", "score": 0.12073223292827606, "token": 351, "token_str": "new"},
+{"sequence": "hello i'm a first model.", "score": 0.08501081168651581, "token": 478, "token_str": "first"},
+{"sequence": "hello i'm a next model.", "score": 0.060546260327100754, "token": 1037, "token_str": "next"},
+{"sequence": "hello i'm a last model.", "score": 0.038265593349933624, "token": 813, "token_str": "last"},
+{"sequence": "hello i'm a sister model.", "score": 0.033868927508592606, "token": 6232, "token_str": "sister"},
+]
+
+```
+
+Here is how to use this model to get the features of a given text in PyTorch:
+
+**Note: You must set the maximum sequence length to 512 and truncate/pad every input to that length, because the original model has no attention mask and considers all hidden states during the forward pass.**
+
+```python
+from transformers import FNetTokenizer, FNetModel
+tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
+model = FNetModel.from_pretrained("google/fnet-base")
+text = "Replace me by any text you'd like."
+encoded_input = tokenizer(text, return_tensors='pt', padding='max_length', truncation=True, max_length=512)
+output = model(**encoded_input)
+```
+
 ### BibTeX entry and citation info
 
 ```bibtex