The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression. Paper • 2407.12077 • Published Jul 16, 2024 • 56