Requantized with fixed imatrix
- README.md +3 -34
- Yi-Coder-9B-Chat-bf16-00001-of-00002.gguf +2 -2
- Yi-Coder-9B-Chat.IQ1_M.gguf +2 -2
- Yi-Coder-9B-Chat.IQ1_S.gguf +2 -2
- Yi-Coder-9B-Chat.IQ2_M.gguf +2 -2
- Yi-Coder-9B-Chat.IQ2_S.gguf +2 -2
- Yi-Coder-9B-Chat.IQ2_XS.gguf +2 -2
- Yi-Coder-9B-Chat.IQ2_XXS.gguf +2 -2
- Yi-Coder-9B-Chat.IQ3_M.gguf +2 -2
- Yi-Coder-9B-Chat.IQ3_S.gguf +2 -2
- Yi-Coder-9B-Chat.IQ3_XS.gguf +2 -2
- Yi-Coder-9B-Chat.IQ3_XXS.gguf +2 -2
- Yi-Coder-9B-Chat.IQ4_XS.gguf +2 -2
- Yi-Coder-9B-Chat.imatrix.dat +1 -1
README.md
@@ -1,5 +1,6 @@
 ---
 license: apache-2.0
+pipeline_tag: text-generation
 tags:
 - code
 language:
@@ -24,12 +25,12 @@ This repo contains State Of The Art quantized GGUF format model files for [Yi-Co
 
 Quantization was done with an importance matrix that was trained for ~1M tokens (256 batches of 4096 tokens) of answers from the [CodeFeedback-Filtered-Instruction](https://huggingface.co/datasets/m-a-p/CodeFeedback-Filtered-Instruction) dataset.
 
+**Update September 19th**: Requantized with new imatrix after finding a [bug](https://github.com/ggerganov/llama.cpp/pull/9543) in `llama-imatrix` that degraded the data set. Also removed the Fill-in-Middle tokens as they are [not properly supported](https://huggingface.co/01-ai/Yi-Coder-9B-Chat/discussions/5).
+
 **Update September 5th**: Marked <|im_start|> as a special token, fixing tokenization.
 
 Corrected EOS (<|im_end|>) and added EOT (<|endoftext|>) token to prevent infinite responses (am I the only one actually dog-fooding my own quants?).
 
-Fill-in-Middle token metadata has been added, see [example](#simple-llama-cpp-python-example-fill-in-middle-code). NOTE: Yi's FIM requires support for [SPM infill mode](https://github.com/abetlen/llama-cpp-python/pull/1492)! However it seems it has not been extensively trained for this (perhaps not at all), so don't expect particularly great results...
-
 <!-- description end -->
 
 
@@ -177,38 +178,6 @@ print(llm.create_chat_completion(
 ))
 ```
 
-#### Simple llama-cpp-python example fill-in-middle code
-
-```python
-from llama_cpp import Llama
-
-# Completion API
-
-prompt = "def add("
-suffix = "\n return sum\n\n"
-
-llm = Llama(model_path="./Yi-Coder-9B-Chat.IQ4_XS.gguf", n_gpu_layers=49, n_ctx=131072, spm_infill=True)
-output = llm.create_completion(
-    temperature = 0.0,
-    repeat_penalty = 1.0,
-    prompt = prompt,
-    suffix = suffix
-)
-
-# Models sometimes repeat suffix in response, attempt to filter that
-response = output["choices"][0]["text"]
-response_stripped = response.rstrip()
-unwanted_response_suffix = suffix.rstrip()
-unwanted_response_length = len(unwanted_response_suffix)
-
-filtered = False
-if unwanted_response_suffix and response_stripped[-unwanted_response_length:] == unwanted_response_suffix:
-    response = response_stripped[:-unwanted_response_length]
-    filtered = True
-
-print(f"Fill-in-Middle completion{' (filtered)' if filtered else ''}:\n\n{prompt}\033[32m{response}\033[{'33' if filtered else '0'}m{suffix}\033[0m")
-```
-
 <!-- README_GGUF.md-how-to-run end -->
 
 <!-- original-model-card start -->
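The September 19th update describes regenerating the importance matrix with the fixed `llama-imatrix` and requantizing. With current llama.cpp tools that workflow looks roughly like the sketch below; the calibration file name is a placeholder and the exact flags used for this repo are not recorded in the commit.

```shell
# Hedged sketch of the requantization workflow (llama.cpp tools assumed).
# "calibration.txt" is a placeholder, not a file from this repo.

# 1. Train the importance matrix on the calibration text
#    (~1M tokens in this repo's case) with the fixed llama-imatrix.
./llama-imatrix -m Yi-Coder-9B-Chat-bf16-00001-of-00002.gguf \
    -f calibration.txt \
    -o Yi-Coder-9B-Chat.imatrix.dat

# 2. Requantize each target type with the new imatrix, e.g. IQ4_XS:
./llama-quantize --imatrix Yi-Coder-9B-Chat.imatrix.dat \
    Yi-Coder-9B-Chat-bf16-00001-of-00002.gguf \
    Yi-Coder-9B-Chat.IQ4_XS.gguf IQ4_XS
```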
Yi-Coder-9B-Chat-bf16-00001-of-00002.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:2cb211e2ee5e0276772bb952baa723c624578b8b16eb1531243ce5cae4713976
+size 1477995
Yi-Coder-9B-Chat.IQ1_M.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:b1a7af2e7b3fc3e85e811a09a1701ea1f6cc9f025af04d7629df97c24a70bf16
+size 2181641024
Yi-Coder-9B-Chat.IQ1_S.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5180f9e3d872d3db1b7caff4f35e6d7973dc5dd8596ad84522e9056df5927a37
+size 2014573376
Yi-Coder-9B-Chat.IQ2_M.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f5515525a85fe96ffacb3e9133a72375156c0be063801e2c1db3c001806e681e
+size 3098112832
Yi-Coder-9B-Chat.IQ2_S.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:82f1a4a9d69180978e90de03a0c7b2682cb026383d723a8b043924d7c002171e
+size 2875355968
Yi-Coder-9B-Chat.IQ2_XS.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:e6124847223b973dbb776b467e35ce7121f532b2301327abc66a55b1ab768452
+size 2708009792
Yi-Coder-9B-Chat.IQ2_XXS.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5508f3e75294d78cd6b7cf7cca5fbb2a4a86ec264f7a20cffb6a91cebc23b1ad
+size 2460087104
Yi-Coder-9B-Chat.IQ3_M.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:7df755d7533e67a5abc51eb666bbf8262317eebcd21b31ef772576d859260332
+size 4055462720
Yi-Coder-9B-Chat.IQ3_S.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5435e44b532b27c33567f07187f516f788fee8dc90439ea56c7599547cc25aab
+size 3912577856
Yi-Coder-9B-Chat.IQ3_XS.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:f6e4fbe44ee79d4be26a70169578260d223a145e2a152edcc653d9af2a2ca34d
+size 3717935936
Yi-Coder-9B-Chat.IQ3_XXS.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:c25f9a7bec86e399e01d7af69509d63c55e2bb4919b9991258a95b2aa1587753
+size 3474322240
Yi-Coder-9B-Chat.IQ4_XS.gguf
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:7dc8ca686e3a721fb92d7eb8dd8e2ffb2632543d90cdc9f093da82cc934eb517
+size 4785009472
Yi-Coder-9B-Chat.imatrix.dat
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a3b191c30944617e38f152af64a72d26c0f5e2ec2999ae1df09ce54adf61264e
 size 6843280
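Each Git LFS pointer above records the artifact's sha256 digest (`oid`) and byte size. A minimal sketch (not part of the repo) for verifying a downloaded file against those two values:

```python
import hashlib
import os

def verify_lfs_file(path: str, expected_oid: str, expected_size: int,
                    chunk_size: int = 1 << 20) -> bool:
    """Check a downloaded file against the oid/size from its LFS pointer."""
    # The size check is cheap, so do it before hashing gigabytes.
    if os.path.getsize(path) != expected_size:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in chunks to keep memory use flat for multi-GB GGUF files.
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest() == expected_oid

# Example with the IQ4_XS pointer values from this commit:
# verify_lfs_file(
#     "Yi-Coder-9B-Chat.IQ4_XS.gguf",
#     "7dc8ca686e3a721fb92d7eb8dd8e2ffb2632543d90cdc9f093da82cc934eb517",
#     4785009472,
# )
```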