---
license: apache-2.0
datasets:
- lambdasec/cve-single-line-fixes
- lambdasec/gh-top-1000-projects-vulns
language:
- code
tags:
- code
programming_language:
- Java
- JavaScript
- Python
inference: false
model-index:
- name: SantaFixer
  results:
  - task:
      type: text-generation
    dataset:
      type: openai/human-eval-infilling
      name: HumanEval
    metrics:
    - name: single-line infilling pass@1
      type: pass@1
      value: 0.47
      verified: false
    - name: single-line infilling pass@10
      type: pass@10
      value: 0.73
      verified: false
  - task:
      type: text-generation
    dataset:
      type: lambdasec/gh-top-1000-projects-vulns
      name: GH Top 1000 Projects Vulnerabilities
    metrics:
    - name: pass@1 (Java)
      type: pass@1
      value: 0.1
      verified: false
    - name: pass@10 (Java)
      type: pass@10
      value: 0.1
      verified: false
    - name: pass@1 (Python)
      type: pass@1
      value: 0.2
      verified: false
    - name: pass@10 (Python)
      type: pass@10
      value: 0.2
      verified: false
    - name: pass@1 (JavaScript)
      type: pass@1
      value: 0.3
      verified: false
    - name: pass@10 (JavaScript)
      type: pass@10
      value: 0.3
      verified: false
---
# Model Card for SantaFixer
<!-- Provide a quick summary of what the model is/does. -->
SantaFixer is an LLM for code that focuses on generating bug fixes through infilling.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** [codelion](https://huggingface.co/codelion)
- **Model type:** GPT-2
- **Finetuned from model:** [bigcode/santacoder](https://huggingface.co/bigcode/santacoder)
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
[More Information Needed]
## How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
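As a starting point, here is a minimal sketch of single-line infilling with the `transformers` library. It assumes the fine-tuned model keeps the fill-in-the-middle sentinel tokens of its base model, [bigcode/santacoder](https://huggingface.co/bigcode/santacoder) (`<fim-prefix>`, `<fim-suffix>`, `<fim-middle>`), and that it loads with `trust_remote_code=True` as the base model does. The helper names `build_fim_prompt` and `suggest_fix` and the `model_id` argument are illustrative and not part of this card; pass the Hub repository id of this model as `model_id`.

```python
# Assumption: the fine-tuned model uses SantaCoder's fill-in-the-middle tokens.
FIM_PREFIX = "<fim-prefix>"
FIM_SUFFIX = "<fim-suffix>"
FIM_MIDDLE = "<fim-middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the line to fix in FIM sentinel tokens."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def suggest_fix(model_id: str, prefix: str, suffix: str, max_new_tokens: int = 32) -> str:
    """Generate a candidate replacement for the line between prefix and suffix.

    model_id is a placeholder: pass the Hub repository id of this model.
    """
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The base model ships custom modeling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    input_ids = tokenizer.encode(build_fim_prompt(prefix, suffix), return_tensors="pt")
    outputs = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, i.e. the infilled middle.
    new_tokens = outputs[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

To request a fix, remove the suspect line from the source file, pass the code before it as `prefix` and the code after it as `suffix`, and review the returned line before applying it.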
## Training Details
- **GPU:** Tesla P100
- **Time:** ~5 hrs
### Training Data
The model was fine-tuned on the [CVE single line fixes dataset](https://huggingface.co/datasets/lambdasec/cve-single-line-fixes).
### Training Procedure
Supervised fine-tuning (SFT).
#### Training Hyperparameters
- **optim:** adafactor
- **gradient_accumulation_steps:** 4
- **gradient_checkpointing:** true
- **fp16:** false
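
The names in this list match fields of `transformers.TrainingArguments`, so the settings can be sketched as follows. This assumes training went through the Hugging Face `Trainer`, which the card does not state. The `output_dir` argument, the per-device batch size, and the helper names are hypothetical and are not taken from the actual training run.

```python
# Values taken from the hyperparameter list; everything else is an assumption.
TRAINING_KWARGS = {
    "optim": "adafactor",
    "gradient_accumulation_steps": 4,
    "gradient_checkpointing": True,
    "fp16": False,
}


def effective_batch_size(per_device_batch_size: int, num_devices: int = 1) -> int:
    """Number of samples that contribute to each optimizer step."""
    steps = TRAINING_KWARGS["gradient_accumulation_steps"]
    return per_device_batch_size * steps * num_devices


def make_training_args(output_dir: str):
    """Build TrainingArguments from the listed settings (output_dir is hypothetical)."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

With gradient accumulation over 4 steps, a per-device batch of 2 on a single GPU would update the weights once every 8 samples. Gradient checkpointing trades extra compute for lower memory use.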
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Data Card if possible. -->
[More Information Needed]
### Results
[More Information Needed]
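
The metadata of this card reports pass@1 and pass@10 scores. For reference, a minimal sketch of how pass@k is commonly estimated from `n` generated samples per problem, of which `c` pass the tests, using the unbiased estimator 1 - C(n-c, k) / C(n, k). The card does not say which estimator or sample count produced its numbers, so this is illustrative rather than a description of the actual evaluation.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n - c, k) / C(n, k).

    n: samples generated per problem, c: samples that pass, k: sampling budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Every subset of k samples contains at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The benchmark score is the mean of `pass_at_k` over all problems in the test set.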
#### Summary
[More Information Needed]