---
language:
- code
datasets:
- nuprl/EditPackFT
library_name: transformers
pipeline_tag: text2text-generation
tags:
- code
model-index:
- name: EditCoder-6.7b-v1
  results:
  - task:
      type: text-generation
    dataset:
      type: nuprl/CanItEdit
      name: CanItEdit Descriptive
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.4815
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/CanItEdit
      name: CanItEdit Lazy
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.3696
      verified: false
---

EditCoder-6.7b (version 1) is a fine-tuned version of [DeepSeek Coder](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-base) (base model, 6.7b parameters) for instructional code editing.
We use [EditPackFT](https://huggingface.co/datasets/nuprl/EditPackFT) as our fine-tuning dataset, and we show state-of-the-art performance among non-distilled open-source models for code editing on the [CanItEdit](https://huggingface.co/datasets/nuprl/CanItEdit) benchmark.

More information can be found in [our paper](https://arxiv.org/abs/2312.12450).

**NOTE: This is the model trained on EditPackFT, not Commits2023FT. We are working on releasing that one soon.**

## Citation

If you use our work, please cite our paper as follows:

```
@inproceedings{cassano2023edit,
  title={{Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions}},
  author={Federico Cassano and Luisa Li and Akul Sethi and Noah Shinn and Abby Brennan-Jones and Anton Lozhkov and Carolyn Jane Anderson and Arjun Guha},
  booktitle={The First International Workshop on Large Language Model for Code},
  year={2024},
  url={https://arxiv.org/abs/2312.12450}
}
```

# Prompt

The model has been trained on the following prompt format:

```
## Code Before:
{before}
## Instruction:
{instruction}
## Code After:
{after}
```

At inference time, the prompt ends right after the `## Code After:` line, and the model generates the edited code as its completion.
Here is a Python function that formats the prompt correctly:

```py
def edit_prompt(old, instr):
    """Build the prompt from the original code and an edit instruction."""
    before = f"""## Code Before:\n{old}\n"""
    instr = f"""## Instruction:\n{instr}\n"""
    # The prompt stops here; the model completes it with the edited code.
    after = f"""## Code After:\n"""
    return before + instr + after
```

# Train Your Own EditCoder

We provide the full pipeline that we used to train our own EditCoder model.
The pipeline and instructions can be found in our [GitHub repository](https://github.com/nuprl/CanItEdit/tree/main/editcoder).
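
# Example: Building a Prompt and Reading the Edit

To make the prompt format from the Prompt section concrete, here is a minimal sketch of one round trip: build the prompt, then recover the edited code from the decoded output. No model is loaded; `fake_output` is a hand-written stand-in for whatever your generation call returns. `extract_edit` is a hypothetical helper, not part of this repository, and cutting the completion at a repeated `## Code Before:` marker is an assumption about how you may want to stop generation, not documented model behavior.

```python
def edit_prompt(old, instr):
    # Same format as the function in the Prompt section.
    return f"## Code Before:\n{old}\n## Instruction:\n{instr}\n## Code After:\n"


AFTER_MARKER = "## Code After:\n"
BEFORE_MARKER = "## Code Before:"


def extract_edit(generated):
    """Hypothetical helper: return the code that follows '## Code After:'.

    Assumes `generated` is the decoded prompt plus completion as plain text,
    and that the original code does not itself contain the marker.
    """
    # Keep everything after the first '## Code After:' marker.
    _, _, completion = generated.partition(AFTER_MARKER)
    # Assumption: if the output starts another example, drop it.
    completion = completion.split(BEFORE_MARKER, 1)[0]
    return completion.rstrip()


old_code = "def add(a, b):\n    return a - b"
instruction = "Fix the bug so the function adds its arguments."
prompt = edit_prompt(old_code, instruction)

# Stand-in for a decoded model output (prompt followed by the completion).
fake_output = prompt + "def add(a, b):\n    return a + b\n"

edited = extract_edit(fake_output)
print(edited)
```

With a real model, replace `fake_output` with the decoded text from your own generation call; the extraction step stays the same as long as the output still contains the prompt.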