metadata
title: README
emoji: 🐱
colorFrom: green
colorTo: red
sdk: static
pinned: false
🐱 KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
KodCode is the largest fully synthetic open-source dataset providing verifiable solutions and tests for coding tasks. It contains 12 distinct subsets spanning various domains (from algorithmic to package-specific knowledge) and difficulty levels (from basic coding exercises to interview and competitive programming challenges). KodCode is designed for both supervised fine-tuning (SFT) and reinforcement learning (RL) tuning.
🕸️ Project Website | 📄 Technical Report | 💾 GitHub Repo | 🤗 KodCode-V1 (for RL) | 🤗 KodCode-V1-SFT-R1 (for SFT)