Model Card for CodeDrafter-500M

A draft model for speculative decoding with Llama 3.1/3.2/3.3 series models, specialized in Python coding. This model is fine-tuned from the first 4 layers of facebook/layerskip-llama3.2-1B.
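The card does not include a usage snippet. As a minimal sketch, and not necessarily how the authors run the model, the draft can be passed to a larger Llama model through the `assistant_model` argument of `generate` in Transformers (assisted generation). The target checkpoint `meta-llama/Llama-3.1-8B-Instruct`, the bfloat16 dtype, the helper name `generate_with_draft`, and loading the tokenizer from the target are all assumptions made for illustration.

```python
DRAFT_MODEL_ID = "InfiniAILab/CodeDrafter-500M"
# Assumption: any Llama 3.1/3.2/3.3 checkpoint you have access to can be used here.
TARGET_MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"


def generate_with_draft(prompt, target_id=TARGET_MODEL_ID,
                        draft_id=DRAFT_MODEL_ID, max_new_tokens=128):
    """Hypothetical helper: greedy assisted generation with a small draft model."""
    # Imported lazily so that importing this snippet does not load heavy libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Assumption: the tokenizer is taken from the target model; draft and
    # target must share a vocabulary for this style of assisted generation.
    tokenizer = AutoTokenizer.from_pretrained(target_id)
    target = AutoModelForCausalLM.from_pretrained(
        target_id, torch_dtype=torch.bfloat16
    ).to(device)
    draft = AutoModelForCausalLM.from_pretrained(
        draft_id, torch_dtype=torch.bfloat16
    ).to(device)

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        # The draft proposes tokens; the target verifies them in one forward pass.
        output_ids = target.generate(
            **inputs,
            assistant_model=draft,
            max_new_tokens=max_new_tokens,
            do_sample=False,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_with_draft("def fibonacci(n):")` would download both checkpoints and return the completed text. With greedy decoding the output should match what the target model produces on its own, so the draft model affects speed rather than content.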

Citation

@article{chen2024sequoia,
  title={Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding},
  author={Chen, Zhuoming and May, Avner and Svirschevski, Ruslan and Huang, Yuhsun and Ryabinin, Max and Jia, Zhihao and Chen, Beidi},
  journal={arXiv preprint arXiv:2402.12374},
  year={2024}
}

Model size: 506M params (Safetensors, tensor type F32)