# Finetuning RoBERTa on RACE tasks

### 1) Download the data from RACE website (http://www.cs.cmu.edu/~glai1/data/race/)

### 2) Preprocess RACE data:
```bash
python ./examples/roberta/preprocess_RACE.py --input-dir <input-dir> --output-dir <extracted-data-dir>
./examples/roberta/preprocess_RACE.sh <extracted-data-dir> <output-dir>
```

### 3) Fine-tuning on RACE:
```bash
MAX_EPOCH=5           # Number of training epochs.
LR=1e-05              # Peak LR for fixed LR scheduler.
NUM_CLASSES=4
MAX_SENTENCES=1       # Batch size per GPU.
UPDATE_FREQ=8         # Accumulate gradients to simulate training on 8 GPUs.
DATA_DIR=/path/to/race-output-dir
ROBERTA_PATH=/path/to/roberta/model.pt

CUDA_VISIBLE_DEVICES=0,1 fairseq-train $DATA_DIR --ddp-backend=legacy_ddp \
    --restore-file $ROBERTA_PATH \
    --reset-optimizer --reset-dataloader --reset-meters \
    --best-checkpoint-metric accuracy --maximize-best-checkpoint-metric \
    --task sentence_ranking \
    --num-classes $NUM_CLASSES \
    --init-token 0 --separator-token 2 \
    --max-option-length 128 \
    --max-positions 512 \
    --shorten-method "truncate" \
    --arch roberta_large \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --criterion sentence_ranking \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-06 \
    --clip-norm 0.0 \
    --lr-scheduler fixed --lr $LR \
    --fp16 --fp16-init-scale 4 --threshold-loss-scale 1 --fp16-scale-window 128 \
    --batch-size $MAX_SENTENCES \
    --required-batch-size-multiple 1 \
    --update-freq $UPDATE_FREQ \
    --max-epoch $MAX_EPOCH
```

**Note:**

a) As contexts in RACE are relatively long, we use a smaller batch size per GPU and increase `--update-freq` to achieve a larger effective batch size.

b) The above command-line arguments and hyperparameters were tested on one Nvidia `V100` GPU with `32gb` of memory for each task. Depending on the GPU memory available to you, you can increase `--update-freq` and reduce `--batch-size`.

c) The settings in the above command are based on our hyperparameter search within a fixed search space (for careful comparison across models). You might be able to find better metrics with a wider hyperparameter search.
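
The trade-off in notes a) and b) can be checked with a little arithmetic. The sketch below assumes the effective batch size is the per-GPU batch size times the number of GPUs times `--update-freq`. The helper `effective_batch_size` and the `NUM_GPUS` variable are hypothetical names used only for illustration and are not part of fairseq.

```bash
# Hypothetical helper (not part of fairseq): sentences contributing to one
# optimizer step, assuming per-GPU batch size x number of GPUs x update-freq.
effective_batch_size() {
    local per_gpu=$1 num_gpus=$2 update_freq=$3
    echo $(( per_gpu * num_gpus * update_freq ))
}

MAX_SENTENCES=1   # --batch-size in the training command
UPDATE_FREQ=8     # --update-freq in the training command
NUM_GPUS=2        # assumption: matches CUDA_VISIBLE_DEVICES=0,1

effective_batch_size "$MAX_SENTENCES" "$NUM_GPUS" "$UPDATE_FREQ"
```

Under this assumption, halving the number of GPUs while doubling `UPDATE_FREQ` leaves the product unchanged, which is one way to adapt the command to less hardware.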

### 4) Evaluation:
```bash
DATA_DIR=/path/to/race-output-dir       # data directory used during training
MODEL_PATH=/path/to/checkpoint_best.pt  # path to the finetuned model checkpoint
PREDS_OUT=preds.tsv                     # output file path to save prediction
TEST_SPLIT=test                         # can be test (Middle) or test1 (High)

fairseq-validate \
    $DATA_DIR \
    --valid-subset $TEST_SPLIT \
    --path $MODEL_PATH \
    --batch-size 1 \
    --task sentence_ranking \
    --criterion sentence_ranking \
    --save-predictions $PREDS_OUT
```
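
To summarize the saved predictions, something like the following sketch could be used. It assumes each line of the predictions file has three tab-separated columns, `<id>`, `<prediction>`, and `<label>`. That layout is an assumption, so check it against your own file before relying on the number. The function `prediction_accuracy` is a hypothetical helper, not a fairseq command.

```bash
# Hypothetical helper (not part of fairseq).
# Assumption: each line is "<id>\t<prediction>\t<label>"; lines with fewer
# than three columns (e.g. no gold label available) are skipped.
prediction_accuracy() {
    awk -F'\t' '
        NF >= 3 { total++; if ($2 == $3) correct++ }
        END {
            if (total > 0) printf "%.4f\n", correct / total
            else print "n/a"
        }' "$1"
}

PREDS_OUT=preds.tsv   # same path as passed to --save-predictions
if [ -f "$PREDS_OUT" ]; then
    prediction_accuracy "$PREDS_OUT"
fi
```

Run it once per split (`test` and `test1`) to get separate Middle and High figures.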