language: en
tags:
  - mnli
  - ax
  - glue
  - torchdistill
license: apache-2.0
datasets:
  - mnli
  - ax
metrics:
  - accuracy

`bert-large-uncased` fine-tuned on the MNLI dataset, using torchdistill and Google Colab.
The hyperparameters are the same as those in Hugging Face's example and/or the BERT paper, and the training configuration (including hyperparameters) is available here.
I submitted prediction files to the GLUE leaderboard, and the overall GLUE score was 80.2.
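
For readers who want to prepare similar prediction files, here is a minimal sketch of turning MNLI class scores into a tab-separated file with `index` and `prediction` columns. It is not the script used for this model. The label order in `ID2LABEL` is an assumption (it follows the Hugging Face `glue`/`mnli` dataset and should be checked against the model's own config), the helper names `logits_to_labels` and `write_glue_tsv` are hypothetical, and the logits are made up for illustration.

```python
import csv
import io

# Assumed label order (as in the Hugging Face `glue`/`mnli` dataset);
# verify against the fine-tuned model's config before relying on it.
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}


def logits_to_labels(logits):
    """Map each row of class scores to the name of its highest-scoring class."""
    labels = []
    for row in logits:
        best = max(range(len(row)), key=lambda i: row[i])
        labels.append(ID2LABEL[best])
    return labels


def write_glue_tsv(labels, stream):
    """Write predictions as a two-column TSV: index, prediction."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(["index", "prediction"])
    for idx, label in enumerate(labels):
        writer.writerow([idx, label])


# Made-up logits for three examples, for illustration only.
example_logits = [[2.1, 0.3, -1.0], [-0.5, 0.2, 1.7], [0.1, 0.9, 0.4]]
example_labels = logits_to_labels(example_logits)

buffer = io.StringIO()
write_glue_tsv(example_labels, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

In practice the same model's predictions would be written once per test set (MNLI matched, MNLI mismatched, and the AX diagnostic set), replacing the in-memory buffer with a file opened for writing.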