matlok's Collections

Papers - Text - Supervised Fine-tuning - Batch Grouping

Batches are grouped by similar token length to help optimize GPU/hardware utilization. Mini-batch lengths differ, but the maximum number of tokens per mini-batch is the same.
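
As a rough illustration of the idea, here is a minimal Python sketch of length-grouped batching under a fixed token budget. It is not taken from any paper in this collection; the function name `group_by_token_budget`, the `max_tokens` parameter, and the sample lengths are all hypothetical. It assumes each mini-batch is padded to its longest example, so the cost of a mini-batch is the number of examples times the longest length.

```python
from typing import List, Sequence


def group_by_token_budget(lengths: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Group example indices into mini-batches of similar token length.

    Hypothetical helper: the padded size of every mini-batch
    (number of examples * longest example) stays within max_tokens.
    """
    # Sort indices by length so neighbouring examples have similar token counts.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])

    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        if lengths[idx] > max_tokens:
            raise ValueError(f"example {idx} exceeds the token budget")
        # Lengths are ascending, so the incoming example is the longest so far.
        padded_size = (len(current) + 1) * lengths[idx]
        if current and padded_size > max_tokens:
            # Adding this example would exceed the budget: close the batch.
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches


# Made-up token lengths for eight training examples.
lengths = [5, 120, 7, 118, 6, 60, 62, 8]
batches = group_by_token_budget(lengths, max_tokens=256)
for batch in batches:
    print(batch, [lengths[i] for i in batch])
```

With these sample lengths, the short examples end up together in one larger mini-batch while the long examples form smaller ones, so the number of examples per mini-batch varies but the token budget does not. In practice the resulting mini-batches are often shuffled before training so that the model does not always see the shortest examples first.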