Fine-tuning - a kd303 Collection

Fine-tuning

updated Dec 31, 2024

Extending Llama-3's Context Ten-Fold Overnight

Paper • 2404.19553 • Published Apr 30, 2024 • 34

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4, 2024 • 94

Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck

Paper • 2404.07647 • Published Apr 11, 2024 • 4

SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning

Paper • 2401.07950 • Published Jan 15, 2024 • 4

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Paper • 2312.06585 • Published Dec 11, 2023 • 29

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

Paper • 2412.16849 • Published Dec 22, 2024 • 9