dranger003/dbrx-instruct-iMat.GGUF

dranger003 committed on Apr 11

Commit b8185c1 (1 parent: 655e516)

Update README.md

Files changed (1)

README.md +2 -0

@@ -9,6 +9,8 @@ base_model: databricks/dbrx-instruct
 **2024-04-11**: Support for this model is still being worked on - [`PR #6515`](https://github.com/ggerganov/llama.cpp/pull/6515).
 We are currently testing quants and I will upload them once they are working.
+**NOTE**: Do not download the model unless it states above that testing is conclusive, otherwise the model won't work.
 * GGUF importance matrix (imatrix) quants for https://huggingface.co/databricks/dbrx-instruct
 * The importance matrix is trained for ~100K tokens (200 batches of 512 tokens) using [wiki.train.raw](https://huggingface.co/datasets/wikitext).
 * [Which GGUF is right for me? (from Artefact2)](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9) - X axis is file size and Y axis is perplexity (lower perplexity is better quality).
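
The "~100K tokens" figure in the README follows from the batch count and batch size it quotes. A minimal sketch of that arithmetic, assuming every batch is a full 512 tokens; the function and constant names are illustrative and not part of llama.cpp:

```python
# Values quoted from the README line about the importance matrix.
BATCHES = 200
TOKENS_PER_BATCH = 512


def calibration_tokens(batches: int, tokens_per_batch: int) -> int:
    """Total tokens seen while computing the importance matrix,
    assuming every batch is completely filled."""
    return batches * tokens_per_batch


total = calibration_tokens(BATCHES, TOKENS_PER_BATCH)
print(total)  # 102400, which the README rounds to ~100K
```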