metadata
datasets:
- EleutherAI/pile
language:
- en
A Based model that uses LayerNorm instead of `QK.sum(-1)` for the normalization, for better hardware efficiency.
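
To illustrate the difference between the two normalizations, here is a minimal pure-Python sketch of causal linear attention. It is not this model's actual implementation: the function names, the `feature_map` stand-in (an elementwise second-order Taylor expansion of `exp`), and the parameter-free `layer_norm` are all assumptions made for illustration. With `norm="sum"` each output is divided by the sum of the query-key scores; with `norm="layernorm"` that division is skipped and the unnormalized output is passed through a layer norm instead.

```python
import math

def feature_map(x):
    # Hypothetical stand-in feature map: elementwise 1 + x + x^2/2 (always positive).
    return [1.0 + v + 0.5 * v * v for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def layer_norm(x, eps=1e-5):
    # Layer norm without learnable scale/shift, over the feature dimension.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def causal_linear_attention(q, k, v, norm="sum"):
    # q, k: lists of vectors (seq_len x d_k); v: list of vectors (seq_len x d_v).
    out = []
    for i in range(len(q)):
        fq = feature_map(q[i])
        # Scores against all positions j <= i (causal).
        scores = [dot(fq, feature_map(k[j])) for j in range(i + 1)]
        # Unnormalized output: score-weighted sum of values.
        num = [sum(s * v[j][d] for j, s in enumerate(scores))
               for d in range(len(v[0]))]
        if norm == "sum":
            # Normalize by the sum of scores over the key dimension.
            den = sum(scores)
            out.append([n / den for n in num])
        else:
            # Assumed variant: replace the division with a layer norm.
            out.append(layer_norm(num))
    return out
```

In this sketch the layer norm variant avoids computing and dividing by a separate per-position denominator, which is one plausible reading of the hardware-efficiency claim above.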