bartowski committed
Commit 601ae69
Parent: 6659e13

Update README.md

Files changed (1): README.md (+6 −4)

README.md CHANGED
@@ -13,20 +13,22 @@ quantized_by: bartowski
 
 ## Llamacpp imatrix Quantizations of gemma-2-27b-it
 
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3277">b3277</a> for quantization.
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3278">b3278</a> for quantization.
 
 Original model: https://huggingface.co/google/gemma-2-27b-it
 
 All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
 
+## What's new
+
+- July 21 2024: Contains latest tokenizer fixes, which addressed a few oddities from the original fix, should be closest to correct performance yet. Also has metadata for SWA and logit softcapping.
+
 ## Prompt format
 
 ```
-<bos><start_of_turn>user
+<start_of_turn>user
 {prompt}<end_of_turn>
 <start_of_turn>model
-<end_of_turn>
-<start_of_turn>model
 
 ```
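The updated prompt template amounts to a simple string wrapper around the user message. A minimal sketch in Python (the helper name `format_gemma_prompt` is my own, not from the repo; it assumes the `<bos>` token is supplied by the tokenizer, consistent with its removal from the template in this commit):

```python
def format_gemma_prompt(prompt: str) -> str:
    """Wrap a user message in the Gemma-2 chat template from the README.

    The user turn is opened and closed with the control tokens, then the
    model turn is opened so generation continues as the model's reply.
    """
    return (
        "<start_of_turn>user\n"
        f"{prompt}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


# Example: build the string passed to the quantized model.
formatted = format_gemma_prompt("Why is the sky blue?")
```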