
A second fine-tuning of the flan-T5-large model on the downsampled DeepMind LinAlg 1D dataset, this time with a GPU batch size of 256 instead of the 32 used before
f9ec778
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.1"
}
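The three token IDs in the config above control how a decoded sequence starts, stops, and is padded. The following is a minimal sketch of that behaviour using only the standard library. It is not the transformers implementation: the helper name `finalize` and the sample token IDs (314, 159, 99) are hypothetical, chosen only for illustration.

```python
import json

# The generation config shown above, embedded as a string so the
# example runs on its own.
CONFIG_TEXT = """
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.1"
}
"""

config = json.loads(CONFIG_TEXT)


def finalize(generated_ids, max_length, cfg):
    """Hypothetical helper (not a library function).

    Prepends the decoder start token, keeps tokens up to and including
    the first EOS token, then pads the result to max_length.
    """
    seq = [cfg["decoder_start_token_id"]]
    for tok in generated_ids:
        seq.append(tok)
        if tok == cfg["eos_token_id"]:
            break  # nothing after the first EOS is kept
    seq = seq[:max_length]
    seq += [cfg["pad_token_id"]] * (max_length - len(seq))
    return seq


# Arbitrary token IDs; the trailing 99 comes after EOS and is dropped.
print(finalize([314, 159, 1, 99], max_length=6, cfg=config))
```

Note that `decoder_start_token_id` and `pad_token_id` are both 0 here, so the start token and the padding are indistinguishable by ID alone; only position tells them apart.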