nisten/qwenv2-7b-inst-imatrix-gguf
GGUF · License: apache-2.0
1 contributor · History: 22 commits
Latest commit: nisten · standard iq4xs imatrix quant from bf16 gguf so it has better perplexity · 51abeba (verified) · 17 days ago
.gitattributes · 3.24 kB · standard iq4xs imatrix quant from bf16 gguf so it has better perplexity · 17 days ago
8bitimatrix.dat · 4.54 MB · LFS · calculated imatrix in 8bit, was just as good as f16 imatrix · 17 days ago
README.md · 1.55 kB · Update README.md · 17 days ago
qwen7bv2inst_iq4xs_embedding4xs_output6k.gguf · 4.22 GB · LFS · standard iq4xs imatrix quant from bf16 gguf so it has better perplexity · 17 days ago
qwen7bv2inst_iq4xs_embedding8_outputq8.gguf · 4.64 GB · LFS · great quant if your chip has 8bit acceleration, slightly better than 4k embedding · 17 days ago
qwen7bv2inst_iq4xs_output8bit.gguf · 4.35 GB · LFS · best speed/perplexity quant for mobile devices with 8bit acceleration · 17 days ago
qwen7bv2inst_q4km_embedding4k_output8bit.gguf · 4.82 GB · LFS · very good quant for speed/perplexity, embedding is at q4k · 17 days ago
qwen7bv2inst_q4km_embeddingf16_outputf16.gguf · 6.11 GB · LFS · good speed reference quant for older CPUs; however, not much improvement from the f16 embedding · 17 days ago
qwen7bv2instruct_bf16.gguf · 15.2 GB · LFS · Rename qwen7bf16.gguf to qwen7bv2instruct_bf16.gguf · 17 days ago
qwen7bv2instruct_q5km.gguf · 5.58 GB · LFS · standard q5km conversion with 8bit output, for reference · 17 days ago
qwen7bv2instruct_q8.gguf · 8.1 GB · LFS · best q8 conversion down from bf16, with slightly better perplexity than f16-based quants · 17 days ago
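The commit messages describe imatrix-guided quants made from the bf16 GGUF. A minimal sketch of that workflow, assuming llama.cpp's `llama-imatrix` and `llama-quantize` binaries are on PATH (older builds name them `imatrix` and `quantize`) and `huggingface-cli` is installed; `calibration.txt` is a hypothetical calibration corpus, not a file from this repo:

```shell
# Download the bf16 base GGUF from this repo
huggingface-cli download nisten/qwenv2-7b-inst-imatrix-gguf \
  qwen7bv2instruct_bf16.gguf --local-dir .

# Compute an importance matrix from the bf16 model over a calibration text
llama-imatrix -m qwen7bv2instruct_bf16.gguf -f calibration.txt -o imatrix.dat

# Produce an IQ4_XS quant guided by the imatrix, as the commit messages describe
llama-quantize --imatrix imatrix.dat \
  qwen7bv2instruct_bf16.gguf qwen7b_iq4xs.gguf IQ4_XS
```

Quantizing from bf16 rather than an f16 intermediate avoids one rounding step, which matches the repo's claim of slightly better perplexity for the bf16-derived quants.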