Better Synthetic Data by Retrieving and Transforming Existing Datasets Paper • 2404.14361 • Published Apr 22, 2024 • 1
Generative AI for Synthetic Data Generation: Methods, Challenges and the Future Paper • 2403.04190 • Published Mar 7, 2024
Best Practices and Lessons Learned on Synthetic Data for Language Models Paper • 2404.07503 • Published Apr 11, 2024 • 29
A Multi-Faceted Evaluation Framework for Assessing Synthetic Data Generated by Large Language Models Paper • 2404.14445 • Published Apr 20, 2024
Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning Paper • 2307.03692 • Published Jul 5, 2023 • 25
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models Paper • 2405.00402 • Published May 1, 2024
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? Paper • 2305.07759 • Published May 12, 2023 • 33