qwp4w3hyb
/

Meta-Llama-3-8B-Instruct-iMat-GGUF

Text Generation

importance matrix

Inference Endpoints


qwp4w3hyb committed on Apr 29, 2024

Commit 5733ec4 · verified · 1 Parent(s): 3717968

Update README.md

Files changed (1)

README.md +3 -0


@@ -17,8 +17,11 @@ license: other
 license_name: llama3
 license_link: LICENSE
 ---
+# Updated beta quants based on new fixed tokenizer, only works with llama.cpp branch gg/bpe-preprocess
 # Quant Infos
+- Updated for latest bpe pre-tokenizer fixes
 - quants done with an importance matrix for improved quantization loss
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
 - fixed end token for instruct mode (<|eot_id|>[128009])
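
As a rough illustration of why the end token matters in instruct mode, the sketch below renders a conversation in the Llama 3 instruct layout and cuts a generated token sequence at `<|eot_id|>`. The token string and id 128009 come from the README line above; the function names `build_prompt` and `truncate_at_eot` are hypothetical helpers, not part of llama.cpp, and some runtimes add `<|begin_of_text|>` themselves, in which case it should be dropped here.

```python
# End-of-turn token for Llama 3 instruct mode, as stated in the README.
EOT_TOKEN = "<|eot_id|>"
EOT_TOKEN_ID = 128009


def build_prompt(messages):
    """Render a list of {"role", "content"} dicts in the Llama 3 instruct layout.

    Hypothetical helper: every turn is closed with <|eot_id|>, and the prompt
    ends with an open assistant header so the model writes the next turn.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content']}{EOT_TOKEN}"
        )
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


def truncate_at_eot(token_ids):
    """Return the generated ids up to, but not including, the first <|eot_id|>.

    Hypothetical helper: if the end token is not configured as a stop token,
    generation runs past the end of the assistant turn, which is the failure
    the "fixed end token" entry addresses.
    """
    kept = []
    for token_id in token_ids:
        if token_id == EOT_TOKEN_ID:
            break
        kept.append(token_id)
    return kept


if __name__ == "__main__":
    print(build_prompt([{"role": "user", "content": "Hello"}]))
    print(truncate_at_eot([9906, 0, 128009, 42]))
```

In practice the runtime handles the stop condition once the GGUF metadata marks 128009 as the end token; the helper only makes the behavior explicit.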