Collections
Collections including paper arxiv:2401.02385
- Rethinking Optimization and Architecture for Tiny Language Models
  Paper • 2402.02791 • Published • 12
- Specialized Language Models with Cheap Inference from Limited Domain Data
  Paper • 2402.01093 • Published • 45
- Scavenging Hyena: Distilling Transformers into Long Convolution Models
  Paper • 2401.17574 • Published • 15
- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 61

- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 49
- Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
  Paper • 2401.10774 • Published • 53
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 141
- Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
  Paper • 2401.12954 • Published • 28

- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 89
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 64
- Asynchronous Local-SGD Training for Language Modeling
  Paper • 2401.09135 • Published • 9
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
  Paper • 2404.07143 • Published • 103