Vezora/Qwen2-7B-Instruct-128k-GGUF


Vezora committed on Jun 15

Commit 0836179 • 1 Parent(s): 1009e69

Update README.md

Files changed (1)

README.md +1 -1


@@ -1,4 +1,4 @@
 ---
 license: apache-2.0
 ---
-The Qwen authors highlight in their blogpost that qwen 2 7b can handle sequences up to 128k, but the GGUF meta-data is set to 32k. This is a version with 131k max context length, Using the llama.cpp script, and this command: `python gguf-set-metadata.py qwen2-7b-instruct-q5_k_m.gguf qwen2.context_length 131072 --force`
+The Qwen authors highlight in their blogpost that qwen 2 7b can handle sequences up to 128k, but the GGUF meta-data is set to 32k. This is a version with 131k max context length, Using the llama.cpp script, also available here, along with this command: `python gguf-set-metadata.py qwen2-7b-instruct-q5_k_m.gguf qwen2.context_length 131072 --force`
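To check that the metadata edit described in the README took effect, the `qwen2.context_length` key can be read back from the file header. The following is a minimal sketch, not the llama.cpp script itself: it assumes the GGUF v2/v3 little-endian header layout (magic, version, tensor count, key/value count, then key/value pairs), and the helper name `read_gguf_metadata_value` is hypothetical. It also shows that "128k" and "131k" refer to the same value, since 128 × 1024 = 131072.

```python
import struct
from typing import Any, BinaryIO

# 128k tokens expressed as the integer passed to the command in the README.
EXPECTED_CONTEXT_LENGTH = 128 * 1024  # 131072

# Assumed GGUF scalar value types mapped to little-endian struct codes.
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _unpack(f: BinaryIO, code: str) -> Any:
    """Read one little-endian scalar described by a struct format code."""
    fmt = "<" + code
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]


def _read_string(f: BinaryIO) -> str:
    """Read a GGUF string: a uint64 byte length followed by the bytes."""
    length = _unpack(f, "Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    """Read a metadata value of the given type, recursing into arrays."""
    if vtype in _SCALARS:
        return _unpack(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _unpack(f, "I")
        count = _unpack(f, "Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata_value(path: str, key: str) -> Any:
    """Return the value stored under `key`, or None if the key is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _unpack(f, "I")
        if version < 2:
            raise ValueError("GGUF v1 layout is not handled by this sketch")
        _unpack(f, "Q")  # tensor count, not needed here
        kv_count = _unpack(f, "Q")
        for _ in range(kv_count):
            name = _read_string(f)
            value = _read_value(f, _unpack(f, "I"))
            if name == key:
                return value
    return None
```

For the file named in the README, `read_gguf_metadata_value("qwen2-7b-instruct-q5_k_m.gguf", "qwen2.context_length")` would be expected to equal `EXPECTED_CONTEXT_LENGTH` after the command has been run, under the layout assumptions above.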