---
base_model: ibm-granite/granite-20b-code-instruct
datasets:
- bigcode/commitpackft
- TIGER-Lab/MathInstruct
- meta-math/MetaMathQA
- glaiveai/glaive-code-assistant-v3
- glaive-function-calling-v2
- bugdaryan/sql-create-context-instruction
- garage-bAInd/Open-Platypus
- nvidia/HelpSteer
inference: false
library_name: gguf
license: apache-2.0
metrics:
- code_eval
model-index:
- name: granite-20b-code-instruct
  results:
  - dataset:
      name: HumanEvalSynthesis(Python)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 60.4
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalSynthesis(JavaScript)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 53.7
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalSynthesis(Java)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 58.5
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalSynthesis(Go)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.1
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalSynthesis(C++)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 45.7
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalSynthesis(Rust)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.7
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(Python)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 44.5
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(JavaScript)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.7
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(Java)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 49.4
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(Go)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.3
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(C++)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 42.1
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalExplain(Rust)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 18.3
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(Python)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 43.9
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(JavaScript)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 43.9
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(Java)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 45.7
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(Go)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.5
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(C++)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 41.5
      verified: false
    task:
      type: text-generation
  - dataset:
      name: HumanEvalFix(Rust)
      type: bigcode/humanevalpack
    metrics:
    - name: pass@1
      type: pass@1
      value: 29.9
      verified: false
    task:
      type: text-generation
pipeline_tag: text-generation
quantized_by: legraphista
tags:
- code
- granite
- quantized
- GGUF
- quantization
- imat
- imatrix
- static
- 16bit
- 8bit
- 6bit
- 5bit
- 4bit
- 3bit
- 2bit
- 1bit
---

# granite-20b-code-instruct-IMat-GGUF
_Llama.cpp imatrix quantization of ibm-granite/granite-20b-code-instruct_

Original Model: [ibm-granite/granite-20b-code-instruct](https://huggingface.co/ibm-granite/granite-20b-code-instruct)
Original dtype: `BF16` (`bfloat16`)
Quantized by: llama.cpp [b3649](https://github.com/ggerganov/llama.cpp/releases/tag/b3649)
IMatrix dataset: [here](https://gist.githubusercontent.com/bartowski1182/eb213dccb3571f863da82e99418f81e8/raw/b2869d80f5c16fd7082594248e80144677736635/calibration_datav3.txt)

- [Files](#files)
  - [IMatrix](#imatrix)
  - [Common Quants](#common-quants)
  - [All Quants](#all-quants)
- [Downloading using huggingface-cli](#downloading-using-huggingface-cli)
- [Inference](#inference)
  - [Simple chat template](#simple-chat-template)
  - [Chat template with system prompt](#chat-template-with-system-prompt)
  - [Llama.cpp](#llama-cpp)
- [FAQ](#faq)
  - [Why is the IMatrix not applied everywhere?](#why-is-the-imatrix-not-applied-everywhere)
  - [How do I merge a split GGUF?](#how-do-i-merge-a-split-gguf)

---

## Files

### IMatrix
Status: ⏳ Processing
Link: [here](https://huggingface.co/legraphista/granite-20b-code-instruct-IMat-GGUF/blob/main/imatrix.dat)

### Common Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| granite-20b-code-instruct.Q8_0 | Q8_0 | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q6_K | Q6_K | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q4_K | Q4_K | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q3_K | Q3_K | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q2_K | Q2_K | - | ⏳ Processing | 🟢 IMatrix | - |

### All Quants
| Filename | Quant type | File Size | Status | Uses IMatrix | Is Split |
| -------- | ---------- | --------- | ------ | ------------ | -------- |
| granite-20b-code-instruct.BF16 | BF16 | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.FP16 | F16 | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q8_0 | Q8_0 | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q6_K | Q6_K | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q5_K | Q5_K | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q5_K_S | Q5_K_S | - | ⏳ Processing | ⚪ Static | - |
| granite-20b-code-instruct.Q4_K | Q4_K | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q4_K_S | Q4_K_S | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ4_NL | IQ4_NL | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ4_XS | IQ4_XS | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q3_K | Q3_K | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q3_K_L | Q3_K_L | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q3_K_S | Q3_K_S | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ3_M | IQ3_M | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ3_S | IQ3_S | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ3_XS | IQ3_XS | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ3_XXS | IQ3_XXS | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q2_K | Q2_K | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.Q2_K_S | Q2_K_S | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ2_M | IQ2_M | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ2_S | IQ2_S | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ2_XS | IQ2_XS | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ2_XXS | IQ2_XXS | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ1_M | IQ1_M | - | ⏳ Processing | 🟢 IMatrix | - |
| granite-20b-code-instruct.IQ1_S | IQ1_S | - | ⏳ Processing | 🟢 IMatrix | - |

## Downloading using huggingface-cli
If you do not have huggingface-cli installed:
```
pip install -U "huggingface_hub[cli]"
```
Download the specific file you want:
```
huggingface-cli download legraphista/granite-20b-code-instruct-IMat-GGUF --include "granite-20b-code-instruct.Q8_0.gguf" --local-dir ./
```
If the model file is big, it has been split into multiple files. To download them all to a local folder, run:
```
huggingface-cli download legraphista/granite-20b-code-instruct-IMat-GGUF --include "granite-20b-code-instruct.Q8_0/*" --local-dir ./
# see FAQ for merging GGUFs
```
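
Alternatively, the download can be scripted via the `huggingface_hub` Python API (a minimal sketch; the Q8_0 filename is taken from the tables above, so swap in whichever quant you need):
```python
# Minimal sketch: fetch one quant file with the huggingface_hub Python API.
# The filename below assumes the Q8_0 quant from the tables above.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="legraphista/granite-20b-code-instruct-IMat-GGUF",
    filename="granite-20b-code-instruct.Q8_0.gguf",
    local_dir="./",
)
print(path)  # local path to the downloaded GGUF
```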

---

## Inference

### Simple chat template
```
Question:
{user_prompt}

Answer:
{assistant_response}

Question:
{next_user_prompt}


```

### Chat template with system prompt
```
System:
{system_prompt}

Question:
{user_prompt}

Answer:
{assistant_response}

Question:
{next_user_prompt}


```
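
If you are constructing prompts in code, a small helper can assemble this template from a message list (a minimal sketch, not part of the original card; the "system"/"user"/"assistant" role names are an assumed convention):
```python
# Minimal sketch: build the chat template above from a list of messages.
def build_prompt(messages):
    headers = {"system": "System:", "user": "Question:", "assistant": "Answer:"}
    parts = [f"{headers[m['role']]}\n{m['content']}\n" for m in messages]
    parts.append("Answer:\n")  # end with "Answer:" so the model replies next
    return "\n".join(parts)

print(build_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
]))
```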

### Llama.cpp
```
llama.cpp/llama-cli -m granite-20b-code-instruct.Q8_0.gguf --color -i -p "prompt here (according to the chat template)"
```
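
The model can also be driven from Python through the `llama-cpp-python` bindings (a minimal sketch, assuming the Q8_0 quant sits in the working directory; the sampling parameters are illustrative):
```python
# Minimal sketch: run the GGUF via llama-cpp-python instead of the CLI.
# Assumes `pip install llama-cpp-python` and the Q8_0 file in the working dir.
from llama_cpp import Llama

llm = Llama(model_path="granite-20b-code-instruct.Q8_0.gguf", n_ctx=4096)

prompt = "Question:\nWrite a function that checks if a number is prime.\n\nAnswer:\n"
out = llm(prompt, max_tokens=256, stop=["Question:"])  # stop before the next turn
print(out["choices"][0]["text"])
```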

---

## FAQ

### Why is the IMatrix not applied everywhere?
According to [this investigation](https://www.reddit.com/r/LocalLLaMA/comments/1993iro/ggufs_quants_can_punch_above_their_weights_now/), it appears that only the lower quantizations benefit from the imatrix input (as per the hellaswag results).

### How do I merge a split GGUF?
1. Make sure you have `gguf-split` available
   - To get hold of `gguf-split`, navigate to https://github.com/ggerganov/llama.cpp/releases
   - Download the appropriate zip for your system from the latest release
   - Unzip the archive and you should be able to find `gguf-split`
2. Locate your GGUF chunks folder (e.g. `granite-20b-code-instruct.Q8_0`)
3. Run `gguf-split --merge granite-20b-code-instruct.Q8_0/granite-20b-code-instruct.Q8_0-00001-of-XXXXX.gguf granite-20b-code-instruct.Q8_0.gguf`
   - Make sure to point `gguf-split` to the first chunk of the split (a scripted version is sketched below)
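
For reference, the merge can also be scripted (a minimal sketch, not from the original card; it assumes `gguf-split` is on your `PATH` and that chunks follow the `-00001-of-XXXXX` naming shown above):
```python
# Minimal sketch: locate the first chunk in a folder and merge with gguf-split.
# Assumes `gguf-split` is on PATH and chunks use the -NNNNN-of-NNNNN naming.
import subprocess
from pathlib import Path

chunks_dir = Path("granite-20b-code-instruct.Q8_0")
first_chunk = sorted(chunks_dir.glob("*-00001-of-*.gguf"))[0]  # first split part
merged = Path(f"{chunks_dir}.gguf")  # granite-20b-code-instruct.Q8_0.gguf

subprocess.run(["gguf-split", "--merge", str(first_chunk), str(merged)], check=True)
print(f"merged into {merged}")
```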

---

Got a suggestion? Ping me [@legraphista](https://x.com/legraphista)!