Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation Paper • 2407.10817 • Published Jul 15 • 13
Flow Judge v0.1 held-out test datasets Collection This collection contains held-out splits for testing Flow-Judge-v0.1. • 4 items • Updated Sep 14 • 2
Flow-Judge-v0.1 out-of-domain evaluation datasets Collection This collection contains out-of-domain datasets used to evaluate the generalization capabilities of Flow-Judge-v0.1. • 5 items • Updated Sep 13 • 1
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos. • 12 items • Updated about 1 hour ago • 204
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch Paper • 2311.03099 • Published Nov 6, 2023 • 28
Model Merging Papers Collection Collection of relevant papers about model merging • 13 items • Updated Apr 2 • 5