patrickvonplaten committed on
Commit
5b0557f
1 Parent(s): b5c7a9c

Update README.md

Files changed (1)
  1. README.md +20 -31
README.md CHANGED
@@ -85,39 +85,28 @@ FNet-base was fine-tuned and evaluated on the validation data of the [GLUE bench
85
  For comparison, this model (ported to PyTorch) was fine-tuned and evaluated alongside [bert-base-cased](https://hf.co/models/bert-base-cased) using the [official Hugging Face GLUE evaluation scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification#glue-tasks).
86
  The training was done on a single 16GB NVIDIA Tesla V100 GPU. The models were trained for 5 epochs on MRPC/WNLI and for 3 epochs on the other tasks, with a sequence length of 512, a batch size of 16, and a learning rate of 2e-5.
87
 
88
- The following table summarizes the results for [fnet-base](https://huggingface.co/google/fnet-base) (called *FNet (PyTorch) - Reproduced*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (called *Bert (PyTorch) - Reproduced*) both in terms of performance and training times and compares it to the reported performance of the official FNet-base model (called *FNet (Flax) - Official*).
89
  For more details, please refer to the checkpoints linked with the scores. An overview of all fine-tuned checkpoints in the following tables can be accessed [here](https://huggingface.co/models?other=fnet-bert-base-comparison).
90
 
91
- | Task | Metric | Result | | | Training time | |
92
- | ----- | ---------------------- | --------------------------------------------------------------|----------------- | ------------------------------------------------------------------------- | ------------- | -------- |
93
- | | | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | FNet (Flax) - Official | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced |
94
- | MNLI | Accuracy or Match/Mismatch | [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) (Accuracy) | [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) (Accuracy) | 72/73 (Match/Mismatch) | 09:52:33 | 06:40:55 |
95
- | QQP | mean(Accuracy,F1) | [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | 83 | 09:25:01 | 06:21:16 |
96
- | QNLI | Accuracy | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 80 |02:40:22 | 01:48:22 |
97
- | SST-2 | Accuracy | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | 95 | 01:42:17 | 01:09:27 |
98
- | CoLA | Matthews corr or Accuracy | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) (Matthews corr) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) (Matthews Corr) | 69 (Accuracy) | 14:20 | 09:47 |
99
- | STS-B | Spearman corr. | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | 79 |10:24 | 07:09 |
100
- | MRPC | mean(F1/Accuracy) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | 76 |11:12 | 07:48 |
101
- | RTE | Accuracy | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 63 |04:51 | 03:24 |
102
- | WNLI | Accuracy | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - |03:23 | 02:37 |
103
-
104
- | Task | Training time | | Metric | Result | | |
105
- | ----- | ---------------------- | ------------- | -------- | -------------------------------------------------------------- |----------------- | ------------------------------------------------------------------------- |
106
- | | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | | Bert (PyTorch) - Reproduced | FNet (PyTorch) - Reproduced | FNet (Flax) - Official |
107
- | MNLI | 09:52:33 | 06:40:55 |Accuracy or Match/Mismatch | [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli) (Accuracy) | [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli) (Accuracy) | 72/73 (Match/Mismatch) |
108
- | QQP | 09:25:01 | 06:21:16 |mean(Accuracy,F1) | [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | 83 |
109
- | QNLI | 02:40:22 | 01:48:22 |Accuracy | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 80 |
110
- | SST-2 | 01:42:17 | 01:09:27 | Accuracy | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | 95 |
111
- | CoLA | 14:20 | 09:47 | Matthews corr or Accuracy | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) (Matthews corr) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) (Matthews Corr) | 69 (Accuracy) |
112
- | STS-B | 10:24 | 07:09 |Spearman corr. | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | 79 |
113
- | MRPC | 11:12 | 07:48 | mean(F1/Accuracy) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | 76 |
114
- | RTE | 04:51 | 03:24 | Accuracy | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | 63 |
115
- | WNLI | 03:23 | 02:37 |Accuracy | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - |
116
-
117
-
118
-
119
- We can see that FNet-base achieves around 93% of BERT-base's performance while it requires *ca.* 30% less time to fine-tune on the downstream tasks.
120
-
121
  ### How to use
122
 
123
  You can use this model directly with a pipeline for masked language modeling:
 
85
  For comparison, this model (ported to PyTorch) was fine-tuned and evaluated alongside [bert-base-cased](https://hf.co/models/bert-base-cased) using the [official Hugging Face GLUE evaluation scripts](https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification#glue-tasks).
86
  The training was done on a single 16GB NVIDIA Tesla V100 GPU. The models were trained for 5 epochs on MRPC/WNLI and for 3 epochs on the other tasks, with a sequence length of 512, a batch size of 16, and a learning rate of 2e-5.
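The scores below come from the official `run_glue.py` evaluation script linked above. Purely as an illustration, the following minimal `Trainer`-based sketch mirrors those hyperparameters; it assumes `transformers` >= 4.12 (which added FNet), the `datasets` library, SST-2 as the example task, and a hypothetical output directory, and it is not the command that produced the reported numbers.

```python
# Illustrative sketch only: fine-tune google/fnet-base on a GLUE task (here SST-2)
# with the hyperparameters stated above (max length 512, batch size 16, lr 2e-5,
# 3 epochs; 5 epochs for MRPC/WNLI). The reported scores were produced with the
# official run_glue.py script, not with this snippet.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("glue", "sst2")  # SST-2 has a single "sentence" column
tokenizer = AutoTokenizer.from_pretrained("google/fnet-base")
model = AutoModelForSequenceClassification.from_pretrained("google/fnet-base", num_labels=2)

def tokenize(batch):
    # Pad/truncate to the sequence length of 512 used in the experiments above.
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=512)

encoded = raw.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="fnet-base-finetuned-sst2",  # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()
```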
87
 
88
+ The following table summarizes the fine-tuning times for [fnet-base](https://huggingface.co/google/fnet-base) (*FNet-base (PyTorch)*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (*Bert-base (PyTorch)*). Times are given in *hours:minutes:seconds*.
89
+
90
+ | Model | MNLI-(m/mm) | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | WNLI | Total |
91
+ |:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:----:|:-------:|
92
+ |FNet-base (PyTorch)| [06:40:55](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli)| [06:21:16](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | [01:48:22](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | [01:09:27](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | [00:09:47](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) | [00:07:09](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | [00:07:48](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | [00:03:24](https://huggingface.co/gchhablani/fnet-base-finetuned-rte) | [00:02:37](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | 16:30:45 |
93
+ |Bert-base (PyTorch)| [09:52:33](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli)| [09:25:01](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [02:40:22](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [01:42:17](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [00:14:20](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) | [00:10:24](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [00:11:12](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [00:04:51](https://huggingface.co/gchhablani/bert-base-cased-finetuned-rte) | [00:03:23](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | 24:23:56 |
94
+
95
+ On average, the PyTorch version of FNet-base requires *ca.* 30% less time for GLUE fine-tuning on GPU.
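As a quick, illustrative sanity check (not part of the evaluation scripts), the *ca.* 30% figure can be recomputed from the totals in the table above:

```python
# Recompute the time saving from the "Total" column above (illustrative only).
def to_seconds(hms: str) -> int:
    h, m, s = (int(x) for x in hms.split(":"))
    return 3600 * h + 60 * m + s

fnet_total = to_seconds("16:30:45")  # FNet-base (PyTorch)
bert_total = to_seconds("24:23:56")  # Bert-base (PyTorch)
print(f"{1 - fnet_total / bert_total:.0%} less fine-tuning time")  # -> 32% less fine-tuning time
```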
96
+
97
+ The following table summarizes the GLUE results for [fnet-base](https://huggingface.co/google/fnet-base) (*FNet-base (PyTorch)*) and [bert-base-cased](https://hf.co/models/bert-base-cased) (*Bert-base (PyTorch)*) and compares them to the reported performance of the official FNet-base model (*FNet-Base (Flax - official)*).
98
+
99
+ | Model | MNLI-(m/mm) | QQP | QNLI | SST-2 | CoLA | STS-B | MRPC | RTE | WNLI | Avg |
100
+ |:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:----:|:-------:|
101
+ | Metric | Accuracy or Match/Mismatch | mean(Accuracy,F1) | Accuracy | Accuracy | Matthews corr or Accuracy | Spearman corr. | mean(F1/Accuracy) | Accuracy | Accuracy | - |
102
+ |FNet-base (PyTorch)| [76.75](https://huggingface.co/gchhablani/fnet-base-finetuned-mnli)| [86.5](https://huggingface.co/gchhablani/fnet-base-finetuned-qqp) | [84.39](https://huggingface.co/gchhablani/fnet-base-finetuned-qnli) | [89.45](https://huggingface.co/gchhablani/fnet-base-finetuned-sst2) | [35.94](https://huggingface.co/gchhablani/fnet-base-finetuned-cola) | [82.19](https://huggingface.co/gchhablani/fnet-base-finetuned-stsb) | [81.15](https://huggingface.co/gchhablani/fnet-base-finetuned-mrpc) | [62.82](https://huggingface.co/gchhablani/fnet-base-finetuned-rte) | [54.93](https://huggingface.co/gchhablani/fnet-base-finetuned-wnli) | - |
103
+ |Bert-base (PyTorch)| [84.10](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mnli)| [89.26](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qqp) | [90.99](https://huggingface.co/gchhablani/bert-base-cased-finetuned-qnli) | [92.32](https://huggingface.co/gchhablani/bert-base-cased-finetuned-sst2) | [59.57](https://huggingface.co/gchhablani/bert-base-cased-finetuned-cola) | [88.98](https://huggingface.co/gchhablani/bert-base-cased-finetuned-stsb) | [88.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-mrpc) | [67.15](https://huggingface.co/gchhablani/bert-base-cased-finetuned-rte) | [46.48](https://huggingface.co/gchhablani/bert-base-cased-finetuned-wnli) | - |
104
+ | FNet-Base (Flax - official) | 72/73 | 83 | 80 | 95 | 69 | 79 | 76 | 63 | - | 76.7 |
105
+
106
+ We can see that FNet-base achieves around 93% of BERT-base's performance on average.
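Similarly, the "around 93%" figure can be recomputed from the two reproduced rows of the table above (an illustrative check only):

```python
# Recompute the average-score ratio from the table above (task order:
# MNLI, QQP, QNLI, SST-2, CoLA, STS-B, MRPC, RTE, WNLI). Illustrative only.
fnet = [76.75, 86.5, 84.39, 89.45, 35.94, 82.19, 81.15, 62.82, 54.93]
bert = [84.10, 89.26, 90.99, 92.32, 59.57, 88.98, 88.15, 67.15, 46.48]
ratio = (sum(fnet) / len(fnet)) / (sum(bert) / len(bert))
print(f"FNet-base reaches {ratio:.1%} of Bert-base's average GLUE score")  # -> 92.5%
```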
107
+
108
  For more details, please refer to the checkpoints linked with the scores. An overview of all fine-tuned checkpoints in the tables above can be accessed [here](https://huggingface.co/models?other=fnet-bert-base-comparison).
109
 
110
  ### How to use
111
 
112
  You can use this model directly with a pipeline for masked language modeling:
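A minimal sketch of such a pipeline call (assuming `transformers` >= 4.12 with FNet support and `sentencepiece` installed; FNet uses the `[MASK]` token):

```python
from transformers import pipeline

# Load the fill-mask pipeline with google/fnet-base; the tokenizer is fetched automatically.
unmasker = pipeline("fill-mask", model="google/fnet-base")

# FNet's mask token is [MASK]; the call returns the top predicted fillers with their scores.
unmasker("Hello, I'm a [MASK] model.")
```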