Test loss of 2.669290 on crumb/flan-ul2-tinystories-complex. Initialized from crumb/opentinystories-30m-base and trained for 2 epochs with a linearly decreasing learning rate of 1e-4. Trained with double the batch size (256).
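
For readers unfamiliar with a linearly decreasing learning rate, here is a minimal sketch of one common form of it. It assumes that 1e-4 is the starting rate and that the rate decays to zero by the final step, with no warmup; the card states none of these details. The function name `linear_decay_lr` and the step count are hypothetical and are not taken from the training code.

```python
def linear_decay_lr(step: int, total_steps: int, peak_lr: float = 1e-4) -> float:
    """Learning rate that falls linearly from peak_lr at step 0 to 0 at total_steps.

    Assumption: decay to exactly zero with no warmup; the actual run may differ.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so out-of-range steps do not produce negative or over-peak rates.
    step = min(max(step, 0), total_steps)
    return peak_lr * (1.0 - step / total_steps)


# Hypothetical run length; the real value depends on dataset size,
# the batch size (256), and the number of epochs (2).
total_steps = 1000
schedule = [linear_decay_lr(s, total_steps) for s in (0, 500, 1000)]
print(schedule)
```

In practice a framework scheduler would normally be used instead of a hand-written function; the sketch only shows the shape of the schedule.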

Safetensors · Model size: 102M params · Tensor types: F32, BF16, BOOL

Datasets used to train crumb/opentinystories-68m-complex