StarCoder: may the source be with you!
Paper: arXiv:2305.06161
All models, datasets, and demos related to StarCoder!
Note StarCoder is based on StarCoderBase and trained on an additional 30B tokens of Python.
Note StarCoderBase is a 15B-parameter decoder trained on 1T tokens of code in 80+ programming languages.
Note StarCoderPlus is based on StarCoderBase and trained on an additional 600B tokens of natural-language text from RefinedWeb and Wikipedia.
Note The fully processed code dataset used to train StarCoder.
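The token counts in the model notes can be tallied to compare how much data each variant has seen in total. The sketch below is only an illustration: it assumes the "additional" tokens are simply added on top of StarCoderBase's 1T tokens, and the names `BASE_TOKENS_B`, `EXTRA_TOKENS_B`, and `total_tokens_b` are hypothetical, not part of any StarCoder tooling.

```python
# Token counts in billions, taken from the notes above.
# Assumption: continued training adds to the 1T tokens seen by StarCoderBase.
BASE_TOKENS_B = 1_000  # StarCoderBase: 1T tokens of code

EXTRA_TOKENS_B = {
    "StarCoderBase": 0,     # the base model itself
    "StarCoder": 30,        # additional Python tokens
    "StarCoderPlus": 600,   # additional natural-language tokens
}


def total_tokens_b(model: str) -> int:
    """Return the total training tokens (in billions) for a model variant."""
    return BASE_TOKENS_B + EXTRA_TOKENS_B[model]


if __name__ == "__main__":
    for name in EXTRA_TOKENS_B:
        print(f"{name}: {total_tokens_b(name)}B tokens")
```

Under that assumption, StarCoder's Python stage adds 3% to the base token budget, while StarCoderPlus's natural-language stage adds 60%.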