Running 2.24k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
NousResearch/DeepHermes-3-Llama-3-8B-Preview Text Generation • Updated about 5 hours ago • 17.4k • 294
Running 534 Scaling test-time compute 📈 Enhance math problem solving by scaling test-time compute