qwp4w3hyb/Meta-Llama-3-8B-Instruct-iMat-GGUF

qwp4w3hyb committed on Apr 22, 2024

Commit 9a8b443 · verified · Parent: 1b339fc

Update README.md

Files changed (1)

README.md +1 -1

@@ -20,7 +20,7 @@ license_link: LICENSE
 # Quant Infos
 - quants done with an importance matrix for improved quantization loss
-- K & IQ quants in basically all variants from Q6_K down to IQ_S
+- K & IQ quants in basically all variants from Q6_K down to IQ1_S
 - fixed end token for instruct mode (<|eot_id|>[128009])
 Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [0d56246f4b9764158525d894b96606f6163c53a8](https://github.com/ggerganov/llama.cpp/commit/0d56246f4b9764158525d894b96606f6163c53a8) (master from 2024-04-18)
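
The README lines in this diff mention quants made with an importance matrix via llama.cpp. As a rough illustration only, the sketch below builds (but does not run) the two command lines such a workflow typically involves: one to collect the importance matrix from calibration text, one to quantize with it. It assumes the tool names `imatrix` and `quantize` used by llama.cpp builds from around April 2024 (later builds renamed them `llama-imatrix` and `llama-quantize`). The helper name, file names, and output naming scheme are hypothetical, and nothing here is taken from the repository author's actual procedure.

```python
import os
from pathlib import Path


def imatrix_quant_commands(model_f16, calibration_txt, quant_type, bin_dir="."):
    """Build (not run) llama.cpp command lines for an importance-matrix quant.

    Assumptions: `bin_dir` holds the `imatrix` and `quantize` binaries, and the
    output file names below are made up for illustration.
    """
    model = Path(model_f16)
    imatrix_file = model.with_suffix(".imatrix")  # hypothetical name
    out_file = model.with_name(f"{model.stem}-{quant_type}.gguf")  # hypothetical name

    # Step 1: collect activation statistics over the calibration text.
    imatrix_cmd = [
        os.path.join(bin_dir, "imatrix"),
        "-m", str(model),
        "-f", str(calibration_txt),
        "-o", str(imatrix_file),
    ]
    # Step 2: quantize to the requested type, guided by those statistics.
    quantize_cmd = [
        os.path.join(bin_dir, "quantize"),
        "--imatrix", str(imatrix_file),
        str(model),
        str(out_file),
        quant_type,
    ]
    return imatrix_cmd, quantize_cmd


if __name__ == "__main__":
    for cmd in imatrix_quant_commands("model-f16.gguf", "calibration.txt", "IQ1_S"):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. Looping over several type names (for example `Q6_K` through `IQ1_S`) would reuse the same importance matrix file for every quant.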