WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training Paper • 2501.18511 • Published Jan 30 • 19
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22, 2024 • 13
🧠 Abliteration Collection Uncensored models using abliteration. See this article for more information: huggingface.co/blog/mlabonne/abliteration • 12 items • Updated about 2 hours ago • 33
The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines Paper • 2408.01050 • Published Aug 2, 2024 • 9
Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning Paper • 2408.00690 • Published Aug 1, 2024 • 25