JustinLin610 committed dd8ff77 (1 parent: 70343e1): Update README.md

README.md CHANGED
@@ -38,9 +38,19 @@ We advise you to clone [`llama.cpp`](https://github.com/ggerganov/llama.cpp) and

## How to use

For starters, the 110B model is large, and due to the upload size limit we split most GGUF files into 2 or 3 segments, so you will see files whose names end with `.a` or `.b`.
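To check exactly which files a given quantization is split into, one option (a sketch of ours, not part of the original README; it assumes the public Hugging Face Hub REST API) is to list the repository tree:

```bash
# Sketch: list every file path in the repo via the Hub's REST API, to see
# which quantizations are split into .a/.b (or .c) segments.
curl -s "https://huggingface.co/api/models/Qwen/Qwen1.5-110B-Chat-GGUF/tree/main" \
  | grep -o '"path":"[^"]*"'
```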

Cloning the repo may be inefficient, so you can instead manually download just the GGUF files you need, or use `huggingface-cli` (`pip install huggingface_hub`). For each GGUF model, you need to download all the files sharing the same prefix. For example, for the q5_k_m model, download both the `.a` and `.b` files:

```bash
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF qwen1_5-110b-chat-q5_k_m.gguf.a --local-dir . --local-dir-use-symlinks False
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF qwen1_5-110b-chat-q5_k_m.gguf.b --local-dir . --local-dir-use-symlinks False
```
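As a convenience, recent `huggingface_hub` releases let `huggingface-cli download` take a glob filter, so both segments can be fetched in one command; treat this as a sketch that assumes `--include` support in your installed version:

```bash
# Sketch (assumes huggingface-cli supports --include glob filters):
# download every segment of the q5_k_m quantization at once.
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF \
  --include "qwen1_5-110b-chat-q5_k_m.gguf.*" \
  --local-dir . --local-dir-use-symlinks False
```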

Afterwards, you need to concatenate the segments to obtain the whole GGUF file:

```bash
cat qwen1_5-110b-chat-q5_k_m.gguf.* > qwen1_5-110b-chat-q5_k_m.gguf
```
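Before deleting the segments, an optional sanity check (our suggestion, not part of the original instructions) is to confirm that the merged file's byte count equals the total of the segments:

```bash
# Sketch: the merged file should be exactly as large as its segments combined.
seg_total=$(wc -c qwen1_5-110b-chat-q5_k_m.gguf.* | tail -n 1 | awk '{print $1}')
merged=$(wc -c < qwen1_5-110b-chat-q5_k_m.gguf)
if [ "$seg_total" -eq "$merged" ]; then
    echo "merge looks complete"
else
    echo "size mismatch -- re-download the segments" >&2
fi
```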

We demonstrate how to use `llama.cpp` to run Qwen1.5: