bartowski/gemma-2-27b-it-GGUF

bartowski committed on Jul 2

Commit 3cc2b9f • 1 Parent(s): 2840181

Update README.md

Files changed (1)

README.md +2 -0

@@ -20,6 +20,8 @@ Original model: https://huggingface.co/google/gemma-2-27b-it
 All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
+Experimental quants are made with `--output-tensor-type f16 --token-embedding-type f16` per [ZeroWw](https://huggingface.co/ZeroWw)'s suggestion, please provide any feedback on quality differences you spot.
 ## What's new
 - July 21 2024: Contains latest tokenizer fixes, which addressed a few oddities from the original fix, should be closest to correct performance yet. Also has metadata for SWA and logit softcapping.
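
For readers unfamiliar with the flags named in the added line, here is a minimal sketch of how such a quantization command might be assembled. It assumes llama.cpp's `llama-quantize` binary is on the PATH; the helper name `build_quantize_command`, the file names, and the `Q4_K_M` quant type are hypothetical placeholders, not taken from this commit. The exact invocation the author used is not shown in the diff.

```python
import shlex


def build_quantize_command(src_gguf, dst_gguf, quant_type,
                           imatrix_path=None, experimental=False):
    """Build an argv list for llama.cpp's llama-quantize tool.

    Hypothetical helper for illustration only; it does not run anything.
    """
    cmd = ["llama-quantize"]
    if imatrix_path is not None:
        # Importance matrix, as in "All quants made using imatrix option".
        cmd += ["--imatrix", imatrix_path]
    if experimental:
        # Keep the output and token-embedding tensors at f16, as described
        # for the experimental quants in the README line added above.
        cmd += ["--output-tensor-type", "f16",
                "--token-embedding-type", "f16"]
    # Positional arguments: input model, output model, target quant type.
    cmd += [src_gguf, dst_gguf, quant_type]
    return cmd


if __name__ == "__main__":
    # All paths and the quant type below are placeholder assumptions.
    argv = build_quantize_command(
        "gemma-2-27b-it-f32.gguf",
        "gemma-2-27b-it-Q4_K_M.gguf",
        "Q4_K_M",
        imatrix_path="imatrix.dat",
        experimental=True,
    )
    print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.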