---
pipeline_tag: text-generation
inference: true
widget:
- text: '<commit_before>def has_close_elements(numbers: List[float], threshold: float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for idx2, elem2 in enumerate(numbers):\n            if idx != idx2:\n                distance = elem - elem2\n                if distance < threshold:\n                    return True\n\n    return False<commit_message>Fix bugs in has_close_elements.<commit_after>'
  example_title: Fix has_close_elements
  group: Python
license: bigcode-openrail-m
datasets:
- bigcode/commits-8129-v2
metrics:
- code_eval
library_name: transformers
tags:
- code
model-index:
- name: SantaCoderPack
  results:
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Python
    metrics:
    - name: pass@1
      type: pass@1
      value: 3.2
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix JavaScript
    metrics:
    - name: pass@1
      type: pass@1
      value: 4.9
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Java
    metrics:
    - name: pass@1
      type: pass@1
      value: 1.8
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Go
    metrics:
    - name: pass@1
      type: pass@1
      value: 3.6
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix C++
    metrics:
    - name: pass@1
      type: pass@1
      value: 4.2
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Rust
    metrics:
    - name: pass@1
      type: pass@1
      value: 1.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: bigcode/humanevalpack
      name: HumanEvalFix Average
    metrics:
    - name: pass@1
      type: pass@1
      value: 3.3
      verified: false
---
# Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Training](#training)
4. [Citation](#citation)

# Model Summary

SantaCoderPack is a model with the same architecture as SantaCoder, pre-trained on [CommitPack](https://huggingface.co/datasets/bigcode/commitpack) using this format: `<commit_before>code_before<commit_msg>message<commit_after>`
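As an illustration, a prompt in this format can be assembled with a small helper. This is only a sketch; `build_commit_prompt` is a hypothetical name used here and is not part of the model's API:

```python
# Hypothetical helper (illustration only): wraps buggy code and a commit
# message in the training format
# <commit_before>code_before<commit_msg>message<commit_after>.

def build_commit_prompt(code_before: str, commit_msg: str) -> str:
    return f"<commit_before>{code_before}<commit_msg>{commit_msg}<commit_after>"

prompt = build_commit_prompt(
    "def increment(x):\n    return x - 1",  # buggy code
    "Fix off-by-one in increment.",         # commit message
)
print(prompt)
```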
- **Repository:** [bigcode/octopack](https://github.com/bigcode-project/octopack)
- **Paper:** [TODO]()
- **Languages:** Python, JavaScript, Java, C++, Go, Rust
- **SantaCoderPack:**

<table>
<tr>
<th>Data</th>
<th><a href=https://huggingface.co/datasets/bigcode/commitpack>CommitPack</a></th>
<td>4TB of GitHub commits across 350 programming languages</td>
</tr>
<tr>
<th>Model</th>
<th><a href=https://huggingface.co/bigcode/octocoder>SantaCoderPack</a></th>
<td>SantaCoderPack (1.1B parameters) pre-trained on CommitPack</td>
</tr>
<tr>
<th>Evaluation</th>
<th><a href=https://huggingface.co/datasets/bigcode/humanevalpack>HumanEvalPack/HumanEvalFix</a></th>
<td>Extension of OpenAI's HumanEval to HumanEvalFix</td>
</tr>
</table>

# Use

## Intended use

The model follows the commit message provided in the input. We recommend formatting your input as:

`<commit_before>def has_close_elements(numbers: List[float], threshold: float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for idx2, elem2 in enumerate(numbers):\n            if idx != idx2:\n                distance = elem - elem2\n                if distance < threshold:\n                    return True\n\n    return False<commit_message>Fix bugs in has_close_elements.<commit_after>`
**Feel free to share your generations in the Community tab!**

## Generation

```python
# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoderpack"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("<commit_before>def has_close_elements(numbers: List[float], threshold: float) -> bool:\n    for idx, elem in enumerate(numbers):\n        for idx2, elem2 in enumerate(numbers):\n            if idx != idx2:\n                distance = elem - elem2\n                if distance < threshold:\n                    return True\n\n    return False<commit_message>Fix bugs in has_close_elements.<commit_after>", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
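The decoded output echoes the prompt, so the proposed fix is whatever follows the final `<commit_after>` marker. A minimal post-processing sketch; `extract_fix` is our own hypothetical helper, not part of the `transformers` API:

```python
# Hypothetical post-processing sketch: recover only the generated fix from a
# decoded output that echoes the prompt.

def extract_fix(decoded: str) -> str:
    # Everything after the last <commit_after> marker is the model's fix.
    return decoded.rsplit("<commit_after>", 1)[-1]

decoded = "<commit_before>buggy()<commit_message>Fix it.<commit_after>fixed()"
print(extract_fix(decoded))  # fixed()
```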

# Training

## Model

- **Architecture:** GPT-2 model with multi-query attention
- **Steps:** 250k pretraining
- **Pretraining tokens:** 131B
- **Precision:** bfloat16

## Hardware

- **Pretraining:**
  - **GPUs:** 32 Tesla A100
  - **Training time:** 15 days

## Software

- **Orchestration:** [Megatron-LM/Transformers](https://github.com/bigcode-project/santacoderpack#training)
- **Neural networks:** [PyTorch](https://github.com/pytorch/pytorch)

# Citation

TODO