Commit 5272422 by DeBERTa
Parent: 26edb09

Update README.md

Files changed (1):
  1. README.md +3 -3
@@ -12,7 +12,7 @@ widget:
 
  ## DeBERTa: Decoding-enhanced BERT with Disentangled Attention
 
- [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa on a majority of NLU tasks with 80GB training data.
+ [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. It outperforms BERT and RoBERTa on majority of NLU tasks with 80GB training data.
 
  Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
 
@@ -41,8 +41,8 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
  ```bash
  cd transformers/examples/text-classification/
  export TASK_NAME=mrpc
- python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \
- --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
+ python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \\
+ --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \\
  --learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
  ```
 