A second fine-tuning of the flan-T5-large model on the downsampled DeepMind LingAlg 1D dataset, this time with a per-GPU batch size of 256 instead of the 32 used previously
f9ec778 verified
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.1"
}
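A minimal sketch of how this `generation_config.json` can be parsed and interpreted, using only the standard library (the inline string simply mirrors the file contents above; for T5 models, token id 0 is the pad token, which also serves as the decoder start token, and id 1 is the `</s>` end-of-sequence token):

```python
import json

# Verbatim contents of generation_config.json from this repo.
raw = """{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.47.1"
}"""

cfg = json.loads(raw)

# T5 starts decoding from the pad token (id 0) and stops at </s> (id 1).
print(cfg["decoder_start_token_id"])  # → 0
print(cfg["eos_token_id"])            # → 1
```

In practice this file is read automatically by `transformers` when the model is loaded, so these defaults apply to generation without any extra configuration.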