patrickvonplaten committed
Commit 5b0557f
1 Parent(s): b5c7a9c
Update README.md
README.md
CHANGED
@@ -85,39 +85,28 @@ FNet-base was fine-tuned and evaluated on the validation data of the [GLUE bench
For comparison, this model (ported to PyTorch) was fine-tuned and evaluated using the [official Hugging Face GLUE evaluation scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification#glue-tasks) alongside [bert-base-cased](https://hf.co/models/bert-base-cased).
The training was done on a single 16GB NVIDIA Tesla V100 GPU. For MRPC/WNLI, the models were trained for 5 epochs, while for other tasks, the models were trained for 3 epochs. A sequence length of 512 was used with batch size 16 and learning rate 2e-5.

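In `transformers` terms, that setup corresponds roughly to the sketch below for a single task (MRPC here). The task name, label count, and output path are illustrative rather than taken from the card, and the linked checkpoints were produced with the example script above, not with this exact code.

```python
# Rough sketch of the fine-tuning setup described above (seq len 512, batch size 16,
# lr 2e-5, 5 epochs for MRPC/WNLI and 3 for the other tasks). Illustrative only.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("glue", "mrpc")
tokenizer = AutoTokenizer.from_pretrained("google/fnet-base")

def preprocess(batch):
    # MRPC is a sentence-pair task; truncate to the 512-token maximum used above.
    return tokenizer(batch["sentence1"], batch["sentence2"], truncation=True, max_length=512)

encoded = raw.map(preprocess, batched=True)
model = AutoModelForSequenceClassification.from_pretrained("google/fnet-base", num_labels=2)

args = TrainingArguments(
    output_dir="fnet-base-finetuned-mrpc",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,  # 5 epochs for MRPC/WNLI, 3 for the other GLUE tasks
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
trainer.evaluate()
```
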
-The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*).
-For more details, please refer to the checkpoints linked with the scores. An overview of all fine-tuned checkpoints of the following table can be accessed [here](https://huggingface.co/models?other=fnet-bert-base-comparison).
-
-| Task | Metric | Result | | | Training time | |
-| ----- | ----- | ----- | ----- | ----- | ----- | ----- |
-| | | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | FNet (Flax) - Official | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced |
-| MNLI | Accuracy or Match/Mismatch | [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) (Accuracy) | [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) (Accuracy) | 72/73 (Match/Mismatch) | 09:52:33 | 06:40:55 |
-| QQP | mean(Accuracy,F1) | [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | 83 | 09:25:01 | 06:21:16 |
-| QNLI | Accuracy | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 80 | 02:40:22 | 01:48:22 |
-| SST-2 | Accuracy | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | 95 | 01:42:17 | 01:09:27 |
-| CoLA | Matthews corr or Accuracy | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) (Matthews corr) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) (Matthews corr) | 69 (Accuracy) | 14:20 | 09:47 |
-| STS-B | Spearman corr. | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | 79 | 10:24 | 07:09 |
-| MRPC | mean(F1/Accuracy) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | 76 | 11:12 | 07:48 |
-| RTE | Accuracy | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 63 | 04:51 | 03:24 |
-| WNLI | Accuracy | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - | 03:23 | 02:37 |
-
-| Task | Training time | | Metric | Result | | |
-| ----- | ----- | ----- | ----- | ----- | ----- | ----- |
-| | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | FNet (Flax) - Official |
-| MNLI | 09:52:33 | 06:40:55 | Accuracy or Match/Mismatch | [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) (Accuracy) | [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) (Accuracy) | 72/73 (Match/Mismatch) |
-| QQP | 09:25:01 | 06:21:16 | mean(Accuracy,F1) | [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | 83 |
-| QNLI | 02:40:22 | 01:48:22 | Accuracy | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 80 |
-| SST-2 | 01:42:17 | 01:09:27 | Accuracy | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | 95 |
-| CoLA | 14:20 | 09:47 | Matthews corr or Accuracy | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) (Matthews corr) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) (Matthews corr) | 69 (Accuracy) |
-| STS-B | 10:24 | 07:09 | Spearman corr. | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | 79 |
-| MRPC | 11:12 | 07:48 | mean(F1/Accuracy) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | 76 |
-| RTE | 04:51 | 03:24 | Accuracy | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 63 |
-| WNLI | 03:23 | 02:37 | Accuracy | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - |
-
-We can see that FNet-base achieves around 93% of BERT-base's performance while it requires *ca.* 30% less time to fine-tune on the downstream tasks.
-
+The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*) in terms of training times. The format is *hours:minutes:seconds*.
+
+| Task | MNLI-(m/mm) | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | WNLI | SUM |
+|:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:----:|:-------:|
+| FNet-base (PyTorch) | [06:40:55](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) | [06:21:16](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | [01:48:22](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | [01:09:27](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | [00:09:47](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) | [00:07:09](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | [00:07:48](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | [00:03:24](https://huggingface.co/gchhablani/fnet-base-finetuned-rte) | [00:02:37](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | 16:30:45 |
+| Bert-base (PyTorch) | [09:52:33](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) | [09:25:01](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [02:40:22](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [01:42:17](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [00:14:20](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) | [00:10:24](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [00:11:12](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [00:04:51](https://huggingface.co/gchhablani/bert-base-cased-finetuned-rte) | [00:03:23](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | 24:23:56 |
+
+On average, the PyTorch version of FNet-base requires *ca.* 30% less time for GLUE fine-tuning on GPU.
+
+The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*) in terms of performance and compares them to the reported performance of the official FNet-base model (called *FNet (Flax) - Official*).
+
+| Task | MNLI-(m/mm) | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | WNLI | Avg |
+|:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:----:|:-------:|
+| Metric | Accuracy or Match/Mismatch | mean(Accuracy,F1) | Accuracy | Accuracy | Matthews corr or Accuracy | Spearman corr. | mean(F1/Accuracy) | Accuracy | Accuracy | - |
+| FNet-base (PyTorch) | [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) | [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-rte) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - |
+| Bert-base (PyTorch) | [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) | [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-rte) | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | - |
+| FNet-base (Flax - official) | 72/73 | 83 | 80 | 95 | 69 | 79 | 76 | 63 | - | 76.7 |
+
+We can see that FNet-base achieves around 93% of BERT-base's performance on average.
+
+For more details, please refer to the checkpoints linked with the scores. An overview of all fine-tuned checkpoints in the tables above can be accessed [here](https://huggingface.co/models?other=fnet-bert-base-comparison).

### How to use

You can use this model directly with a pipeline for masked language modeling:
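The code itself lies outside this hunk; a minimal sketch of such a pipeline call (assuming the standard `fill-mask` pipeline rather than the exact snippet from the card) is:

```python
# Minimal fill-mask sketch for google/fnet-base; FNet uses "[MASK]" as its mask token.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="google/fnet-base")
print(unmasker("Hello I'm a [MASK] model."))
```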