A General Theoretical Paradigm to Understand Learning from Human Preferences Paper • 2310.12036 • Published Oct 18, 2023 • 13
sentence-transformers-from-synthetic-data Collection Example of using distilabel to generate synthetic triplets data for fine-tuning a Sentence Transformer model • 4 items • Updated Jun 21 • 21
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 119
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6 • 183
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 63
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 19 days ago • 697