Zhangchen Xu
metadata
title: README
emoji: 🐱
colorFrom: green
colorTo: red
sdk: static
pinned: false

🐱 KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

KodCode is the largest fully synthetic open-source dataset providing verifiable solutions and tests for coding tasks. It contains 12 distinct subsets spanning various domains (from algorithmic to package-specific knowledge) and difficulty levels (from basic coding exercises to interview and competitive programming challenges). KodCode is designed for both supervised fine-tuning (SFT) and reinforcement learning (RL) tuning.

πŸ•ΈοΈ Project Website | πŸ“„ Technical Report | πŸ’Ύ Github Repo | πŸ€— KodCode-V1 (For RL) | πŸ€— KodCode-V1-SFT-R1 (for SFT)