dranger003/dbrx-instruct-iMat.GGUF

dranger003 committed on Apr 14

Commit f80614d (1 parent: 4f67f3f)

Update README.md

Files changed (1)

README.md +1 -1

README.md CHANGED

@@ -21,7 +21,7 @@ The quants here are meant to test imatrix quantized weights.
 **2024-04-13**: Support for this model has just being merged - [`PR #6515`](https://github.com/ggerganov/llama.cpp/pull/6515).
 **<u>You will need this llama.cpp commit [`4bd0f93e`](https://github.com/ggerganov/llama.cpp/commit/4bd0f93e4ab4fe6682e7d0241c1bdec1397e954a) to run this model</u>**
-Quants in this repo are tested running the following command (quants under IQ4_XS are very sensitive and unreliable so far - the imatrix may require to be trained on FP16 weights rather than Q8_0 and for longer than 200 chunks):
 ```
 ./build/bin/main -ngl 41 -s 0 -e -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWrite an essay about AI.<|im_end|>\n<|im_start|>assistant\n" -m ggml-dbrx-instruct-16x12b-<<quant-to-test>>.gguf
 ```

 **2024-04-13**: Support for this model has just being merged - [`PR #6515`](https://github.com/ggerganov/llama.cpp/pull/6515).
 **<u>You will need this llama.cpp commit [`4bd0f93e`](https://github.com/ggerganov/llama.cpp/commit/4bd0f93e4ab4fe6682e7d0241c1bdec1397e954a) to run this model</u>**
+Quants in this repo are tested running the following command (quants under IQ3 are very sensitive and unreliable so far - the imatrix may require to be trained on FP16 weights rather than Q8_0 and for longer than 200 chunks):
 ```
 ./build/bin/main -ngl 41 -s 0 -e -p "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWrite an essay about AI.<|im_end|>\n<|im_start|>assistant\n" -m ggml-dbrx-instruct-16x12b-<<quant-to-test>>.gguf
 ```
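
The test command in the README can be assembled programmatically when checking several quants in turn. The following is a minimal sketch, not the author's tooling: the helper names `chatml_prompt` and `build_command` are hypothetical, and the quant label `iq4_xs` is only a placeholder for whatever replaces `<<quant-to-test>>` in the file name. The flags and their values are copied from the command above.

```python
import shlex


def chatml_prompt(system: str, user: str) -> str:
    """Build the ChatML-style prompt used in the README command.

    The newlines are kept as literal backslash-n sequences because the
    command passes -e, which asks llama.cpp to process escapes in the prompt.
    """
    return (
        f"<|im_start|>system\\n{system}<|im_end|>\\n"
        f"<|im_start|>user\\n{user}<|im_end|>\\n"
        "<|im_start|>assistant\\n"
    )


def build_command(quant: str, binary: str = "./build/bin/main") -> list[str]:
    """Return the argument list for one quant (hypothetical helper).

    The model file name follows the README pattern
    ggml-dbrx-instruct-16x12b-<<quant-to-test>>.gguf.
    """
    prompt = chatml_prompt("You are a helpful assistant.", "Write an essay about AI.")
    model = f"ggml-dbrx-instruct-16x12b-{quant}.gguf"
    # -ngl 41, -s 0 and -e are taken verbatim from the README command.
    return [binary, "-ngl", "41", "-s", "0", "-e", "-p", prompt, "-m", model]


if __name__ == "__main__":
    # "iq4_xs" is an assumed label; substitute the quant actually being tested.
    print(shlex.join(build_command("iq4_xs")))
```

Returning an argument list rather than a single string keeps the prompt intact as one argument, so it can be handed to `subprocess.run` without extra shell quoting.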