Data Efficient Approaches - a floom Collection

updated Jul 18

How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15 • 39

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

Paper • 2403.15042 • Published Mar 22 • 25

MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets

Paper • 2403.03194 • Published Mar 5 • 12

Orca-Math: Unlocking the potential of SLMs in Grade School Math

Paper • 2402.14830 • Published Feb 16 • 24

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Paper • 2402.13064 • Published Feb 20 • 47

GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

Paper • 2402.10963 • Published Feb 13 • 10

In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss

Paper • 2402.10790 • Published Feb 16 • 41

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 18

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11 • 84

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 87

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published Jun 2 • 30

Deep Bayesian Active Learning for Preference Modeling in Large Language Models

Paper • 2406.10023 • Published Jun 14 • 2

Unlocking Continual Learning Abilities in Language Models

Paper • 2406.17245 • Published Jun 25 • 28

Increasing Model Capacity for Free: A Simple Strategy for Parameter Efficient Fine-tuning

Paper • 2407.01320 • Published Jul 1