|
# GGML 4-bit/5-bit quantized IDEA-CCNL/Ziya-LLaMA-13B-v1
|
* You need the latest version of llama.cpp or llama-cpp-python (to support GGML format v3).
|
* llama.cpp currently cannot tokenize the `<human>` and `<bot>` special tokens, so I changed them to the 🧑 and 🤖 emojis.
|
* Prompt like this:
|
```python
inputs = '🧑:' + query.strip() + '\n🤖:'
```
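The prompt construction above can be wrapped in a small helper so every query is formatted consistently; the function name `build_prompt` is illustrative, not part of the original repository:

```python
def build_prompt(query: str) -> str:
    # Use the emoji stand-ins that replace the original <human>/<bot>
    # special tokens in this GGML conversion.
    return '🧑:' + query.strip() + '\n🤖:'
```

The model's reply then follows the trailing `🤖:` marker in the generated text.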
|
* If you want to quantize Ziya to GGML yourself, override its 'add_tokens.json' file with the one provided in this repository.
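The override step above can be sketched as a one-liner; `override_added_tokens` and the paths are hypothetical names for illustration, assuming the replacement file from this repository sits next to your local model checkout:

```python
import shutil
from pathlib import Path

def override_added_tokens(model_dir: str, replacement_file: str) -> None:
    # Copy the fixed token map over the model's original 'add_tokens.json'
    # so the GGML conversion picks up the emoji replacements.
    shutil.copy(replacement_file, Path(model_dir) / 'add_tokens.json')
```

Run this before invoking the llama.cpp conversion script on the model directory.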
|
---
license: gpl-3.0
---
|
|