nisten/qwenv2-7b-inst-imatrix-gguf
GGUF · License: apache-2.0
1 contributor · History: 11 commits
Latest commit by nisten: "Probably the best speed-to-perplexity ratio of any 7B GGUF model so far" · 0e76852 (verified) · 24 days ago
File | Size | Last commit message | Updated
.gitattributes | 2.5 kB | Probably the best speed-to-perplexity ratio of any 7B GGUF model so far | 24 days ago
8bitimatrix.dat (LFS) | 4.54 MB | Calculated the imatrix in 8-bit; it was just as good as the f16 imatrix | 24 days ago
README.md | 1.55 kB | Update README.md | 24 days ago
qwen7bf16.gguf (LFS) | 15.2 GB | Upload 9 files | 24 days ago
qwen7bq4kembeddingf16outputf16.gguf (LFS) | 6.11 GB | Rename qwen7bq4kembeddingbf16outputbf16.gguf to qwen7bq4kembeddingf16outputf16.gguf | 24 days ago
qwen7bq4koutput8bit.gguf (LFS) | 4.82 GB | Upload 9 files | 24 days ago
qwen7bq4xsembedding8output8.gguf (LFS) | 4.64 GB | Rename qwen7bq4xsembedding5bitkoutput8bit.gguf to qwen7bq4xsembedding8output8.gguf | 24 days ago
qwen7bq4xsoutput6k.gguf (LFS) | 4.22 GB | Rename qwen7bq4xs.gguf to qwen7bq4xsoutput6k.gguf | 24 days ago
qwen7bv2_iq4xs_output8bit.gguf (LFS) | 4.35 GB | Probably the best speed-to-perplexity ratio of any 7B GGUF model so far | 24 days ago
qwen7bv2instruct_q5km.gguf (LFS) | 5.58 GB | Standard q5_k_m conversion with 8-bit output, for reference | 24 days ago
qwenv2instruct7b_q8.gguf (LFS) | 8.1 GB | Good conversion down from bf16 instead of from f16 | 24 days ago
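
To try the quant the commit messages single out, qwen7bv2_iq4xs_output8bit.gguf, a minimal sketch using huggingface_hub and llama-cpp-python follows; the context size, GPU-layer count, and prompt are illustrative assumptions, not settings from this repo.

```python
# Sketch: download the iq4_xs quant from this repo and run it locally.
# Assumes `pip install huggingface_hub llama-cpp-python`; all generation
# parameters below are illustrative, not recommendations from the author.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="nisten/qwenv2-7b-inst-imatrix-gguf",
    filename="qwen7bv2_iq4xs_output8bit.gguf",  # 4.35 GB file from the listing above
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # assumed context window; adjust to your use case
    n_gpu_layers=-1,  # offload all layers if llama.cpp was built with GPU support
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize GGUF quantization in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```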
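
The 8bitimatrix.dat note says an imatrix calculated in 8-bit matched the f16 one. The repo doesn't record the exact commands, but a sketch of the usual llama.cpp workflow follows, assuming the upstream llama-imatrix and llama-quantize tools; calibration.txt is a placeholder corpus, and reading "in 8-bit" as "computed over the q8 quant" is an assumption.

```python
# Sketch of the typical llama.cpp imatrix workflow (not the author's recorded
# commands). Binary names and flags follow current upstream llama.cpp; older
# builds ship the same tools as `imatrix` and `quantize` instead.
import subprocess

# 1) Collect the importance matrix over a calibration corpus. Running it against
#    the q8 quant is one reading of "calculated imatrix in 8-bit" (assumption).
subprocess.run([
    "llama-imatrix",
    "-m", "qwenv2instruct7b_q8.gguf",  # 8-bit quant from the listing above
    "-f", "calibration.txt",           # placeholder calibration text
    "-o", "8bitimatrix.dat",
], check=True)

# 2) Quantize the full-precision model to IQ4_XS with that imatrix, keeping the
#    output tensor at 8-bit, as the qwen7bv2_iq4xs_output8bit.gguf name suggests.
subprocess.run([
    "llama-quantize",
    "--imatrix", "8bitimatrix.dat",
    "--output-tensor-type", "q8_0",
    "qwen7bf16.gguf",
    "qwen7bv2_iq4xs_output8bit.gguf",
    "IQ4_XS",
], check=True)
```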