O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 40
view article Article Releasing the largest multilingual open pretraining dataset By Pclanglais • Nov 13, 2024 • 98
TableGPT2: A Large Multimodal Model with Tabular Data Integration Paper • 2411.02059 • Published Nov 4, 2024 • 5
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 111
General Preference Modeling with Preference Representations for Aligning Language Models Paper • 2410.02197 • Published Oct 3, 2024 • 8
LongRecipe: Recipe for Efficient Long Context Generalization in Large Languge Models Paper • 2409.00509 • Published Aug 31, 2024 • 37
RegMix: Data Mixture as Regression for Language Model Pre-training Paper • 2407.01492 • Published Jul 1, 2024 • 35
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 87
AnyTaskTune-Psychology Collection Elevating Domain-Specific Model Performance with Precision Task-Specific Fine-Tuning • 3 items • Updated Jun 7, 2024 • 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design Paper • 2401.14112 • Published Jan 25, 2024 • 18
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression Paper • 2403.15447 • Published Mar 18, 2024 • 16
view article Article ⚗️ 🧑🏼🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29, 2024 • 29
view article Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • Jun 4, 2024 • 73