MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 74
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild Paper • 2409.03753 • Published Sep 5, 2024 • 18
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing Paper • 2406.08464 • Published Jun 12, 2024 • 65
From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step Paper • 2405.14838 • Published May 23, 2024 • 2
Tree Prompting: Efficient Task Adaptation without Fine-Tuning Paper • 2310.14034 • Published Oct 21, 2023 • 2
Implicit Chain of Thought Reasoning via Knowledge Distillation Paper • 2311.01460 • Published Nov 2, 2023 • 2
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies Paper • 2310.04610 • Published Oct 6, 2023 • 1
Image-to-Markup Generation with Coarse-to-Fine Attention Paper • 1609.04938 • Published Sep 16, 2016 • 1
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild Paper • 2406.04770 • Published Jun 7, 2024 • 27
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures Paper • 2406.06565 • Published Jun 3, 2024 • 9