MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding Paper • 2408.11049 • Published Aug 20 • 12
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 20 days ago • 636