JustinLin610 committed dd8ff77 (1 parent: 70343e1): Update README.md

README.md CHANGED
@@ -38,9 +38,19 @@ We advise you to clone [`llama.cpp`](https://github.com/ggerganov/llama.cpp) and

## How to use

For starters, the 110B model is large, and due to the upload size limit we split most GGUF files into 2 or 3 segments, so you will see files whose names end with `.a` or `.b`.
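To check exactly which files a given quantization is split into, one option (a sketch of ours, not part of the original README; it assumes the public Hugging Face Hub REST API) is to list the repository tree:

```bash
# Sketch: list every file path in the repo via the Hub's REST API, to see
# which quantizations are split into .a/.b (or .c) segments.
curl -s "https://huggingface.co/api/models/Qwen/Qwen1.5-110B-Chat-GGUF/tree/main" \
  | grep -o '"path":"[^"]*"'
```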

Cloning the repo may be inefficient, so you can instead manually download just the GGUF files you need, or use `huggingface-cli` (`pip install huggingface_hub`). For each GGUF model, you need to download all the files sharing the same prefix. For example, for the q5_k_m model, download both the `.a` and `.b` files:

```bash
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF qwen1_5-110b-chat-q5_k_m.gguf.a --local-dir . --local-dir-use-symlinks False
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF qwen1_5-110b-chat-q5_k_m.gguf.b --local-dir . --local-dir-use-symlinks False
```
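As a convenience, recent `huggingface_hub` releases let `huggingface-cli download` take a glob filter, so both segments can be fetched in one command; treat this as a sketch that assumes `--include` support in your installed version:

```bash
# Sketch (assumes huggingface-cli supports --include glob filters):
# download every segment of the q5_k_m quantization at once.
huggingface-cli download Qwen/Qwen1.5-110B-Chat-GGUF \
  --include "qwen1_5-110b-chat-q5_k_m.gguf.*" \
  --local-dir . --local-dir-use-symlinks False
```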

Afterwards, you need to concatenate the segments to obtain the whole GGUF file:

```bash
cat qwen1_5-110b-chat-q5_k_m.gguf.* > qwen1_5-110b-chat-q5_k_m.gguf
```
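Before deleting the segments, an optional sanity check (our suggestion, not part of the original instructions) is to confirm that the merged file's byte count equals the total of the segments:

```bash
# Sketch: the merged file should be exactly as large as its segments combined.
seg_total=$(wc -c qwen1_5-110b-chat-q5_k_m.gguf.* | tail -n 1 | awk '{print $1}')
merged=$(wc -c < qwen1_5-110b-chat-q5_k_m.gguf)
if [ "$seg_total" -eq "$merged" ]; then
    echo "merge looks complete"
else
    echo "size mismatch -- re-download the segments" >&2
fi
```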

We demonstrate how to use `llama.cpp` to run Qwen1.5: