End of training, 20 epochs, 100 batch size, 1000 writer batch size, 1 gradient accumulation steps, learning rate: 0.0001, 30 s d4d28da verified sfedar committed 26 days ago
End of training, 10 epochs, 100 batch size, 1000 writer batch size, 1 gradient accumulation steps, learning rate: 7e-05, 30 s c5c9f26 verified sfedar committed about 1 month ago