microsoft/deberta-xxlarge-v2

DeBERTa committed on Feb 5, 2021

Commit 2b201b4 • 1 Parent(s): fe8e92d

Update README.md

Files changed (1)

README.md +13 -0

@@ -29,6 +29,19 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
 |**DeBERTa-XXLarge-V2-mnli**| -         | -         |**91.7/91.8**| -     | -    | -    | 93.5   | -            | -    |-    |
+## Note
+To try the **XXLarge** model with **HF transformers**, you need to pass the **--sharded_ddp** flag:
+```bash
+# TASK_NAME must be set before launching; mnli is only an example GLUE task.
+export TASK_NAME=mnli
+cd transformers/examples/text-classification/
+python -m torch.distributed.launch --nproc_per_node=8 run_glue.py \
+  --model_name_or_path microsoft/deberta-xxlarge-v2 \
+  --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 \
+  --per_device_train_batch_size 4 --learning_rate 3e-6 --num_train_epochs 3 \
+  --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp
+```
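
Review note (not part of the commit): the command in this hunk launches 8 processes with a per-device batch size of 4, so each optimizer step sees 32 examples, assuming the default of no gradient accumulation. The sketch below works through that arithmetic and shows how `--gradient_accumulation_steps` could keep the same effective batch size on fewer GPUs. The 2-GPU machine and the variable names are hypothetical, chosen only for illustration, and whether the XXLarge model fits in memory on such a machine is a separate question this sketch does not answer.

```bash
#!/usr/bin/env bash
# Values taken from the command in the README note.
NPROC_PER_NODE=8
PER_DEVICE_TRAIN_BATCH_SIZE=4

# Effective (global) batch size per optimizer step, assuming no gradient accumulation.
EFFECTIVE_BATCH_SIZE=$((NPROC_PER_NODE * PER_DEVICE_TRAIN_BATCH_SIZE))

# Hypothetical smaller machine (an assumption, not from the README).
AVAILABLE_GPUS=2

# Accumulation steps needed to reach the same effective batch size.
GRAD_ACCUM_STEPS=$((EFFECTIVE_BATCH_SIZE / (AVAILABLE_GPUS * PER_DEVICE_TRAIN_BATCH_SIZE)))

echo "effective batch size: ${EFFECTIVE_BATCH_SIZE}"
echo "use --nproc_per_node=${AVAILABLE_GPUS} --gradient_accumulation_steps ${GRAD_ACCUM_STEPS}"
```

With these numbers the script reports an effective batch size of 32 and suggests 4 accumulation steps for the 2-GPU case. The integer division only gives an exact match when the target batch size is a multiple of the per-step batch size.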
 ### Citation
 If you find DeBERTa useful for your work, please cite the following paper: