dranger003 committed
Commit 17e1904 (1 parent: 19fa7c3)
Update README.md

README.md CHANGED
@@ -10,6 +10,8 @@ base_model: CohereForAI/c4ai-command-r-plus
 Noeda's fork will not work with these weights; you will need the main branch of llama.cpp.
 Also, I am currently running perplexity on all the quants posted here and will update this model page with the results.
 
+**NOTE**: Do not concatenate splits (or chunks); use `gguf-split` to merge files if you need a single file (most use cases will not need this).
+
 * GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus
 * The importance matrix is trained on ~100K tokens (200 batches of 512 tokens) using [wiki.train.raw](https://huggingface.co/datasets/wikitext).
 * [Which GGUF is right for me? (from Artefact2)](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) - the X axis is file size and the Y axis is perplexity (lower perplexity means better quality). Some of the sweet spots (size vs. PPL) are IQ4_XS, IQ3_M/IQ3_S, IQ3_XS/IQ3_XXS, IQ2_M, and IQ2_XS.
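
The perplexity results mentioned in the README are typically measured with llama.cpp's `perplexity` tool. A minimal sketch, assuming a current llama.cpp build and the wikitext test split; the model filename is a hypothetical placeholder:

```bash
# Measure perplexity of a single quant over the wikitext test split.
# Run once per quant; lower PPL indicates quality closer to the base model.
./perplexity -m ggml-c4ai-command-r-plus-iq2_m.gguf -f wiki.test.raw -c 512
```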
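
On the `gguf-split` note: if you do need a single file, merging takes the first split as input and writes one output file. A minimal sketch, assuming llama.cpp is built from the main branch; the filenames are hypothetical placeholders for whichever quant you downloaded:

```bash
# Merge split GGUF parts into one file. Only the FIRST part is passed;
# gguf-split discovers the remaining -0000X-of-0000N parts automatically.
./gguf-split --merge \
    ggml-c4ai-command-r-plus-iq4_xs-00001-of-00002.gguf \
    ggml-c4ai-command-r-plus-iq4_xs.gguf
```

Merging is usually unnecessary because llama.cpp can load a split model directly when pointed at the first part.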
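
For reference, an importance matrix with the stated training budget can be generated with llama.cpp's `imatrix` tool; 200 chunks at a 512-token context gives the ~100K tokens described above. A sketch under those assumptions, with hypothetical file paths and a higher-precision (e.g. f16) base model:

```bash
# Compute an importance matrix over wiki.train.raw:
# 200 chunks x 512 tokens/chunk ~ 100K training tokens.
./imatrix -m ggml-c4ai-command-r-plus-f16.gguf -f wiki.train.raw \
          -o imatrix.dat --chunks 200 -c 512
```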