---
language:
- zh
- en
license: apache-2.0
tags:
- mlx
---

# mlx-community/BiliBot-7B-Q

The Model [mlx-community/BiliBot-7B-Q](https://huggingface.co/mlx-community/BiliBot-7B-Q) was converted to MLX format from [Kadins/BiliBot-Qwen2-7B-Q-FT](https://huggingface.co/Kadins/BiliBot-Qwen2-7B-Q-FT) using mlx-lm version **0.13.0**.

## Use with mlx

```bash
pip install mlx-lm
```
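You can also try the model straight from the command line. A minimal sketch, assuming the standard `mlx_lm.generate` CLI entry point (the prompt text here is just an example):

```bash
python -m mlx_lm.generate --model mlx-community/BiliBot-7B-Q --prompt "你好" --max-tokens 100
```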

```python
import time

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/BiliBot-7B-Q")

# Qwen2 chat template; the user-turn instruction asks the model for
# a short, witty answer to the question that follows.
template = """
<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
请对以下问题给出简短、机智的回答:
{usr_msg}<|im_end|>
<|im_start|>assistant
"""

while True:
    usr_msg = input("用户: ")  # Read the user's question from the terminal ("用户" = "user")
    if usr_msg.lower() == 'quit()':  # Type quit() to exit the loop
        break

    prompt = template.replace("{usr_msg}", usr_msg)

    time_ckpt = time.time()
    # `temp` is a keyword of generate() in mlx-lm 0.13.0; newer releases
    # moved sampling options such as temperature into a sampler argument.
    response = generate(
        model,
        tokenizer,
        prompt=prompt,
        temp=0.3,
        max_tokens=500,
        verbose=False
    )

    # "回答" = "answer"; also report wall-clock generation time in ms
    print("%s: %s (Time %d ms)\n" % ("回答", response, (time.time() - time_ckpt) * 1000))
```
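For ad-hoc prompts you can usually let the tokenizer render the Qwen2 `<|im_start|>`/`<|im_end|>` markup instead of hand-writing the template above. A minimal sketch, assuming the bundled tokenizer ships a chat template (the example question is illustrative):

```python
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/BiliBot-7B-Q")

messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "请对以下问题给出简短、机智的回答:\n什么是 MLX?"},
]
# Build the chat-formatted prompt string from the tokenizer's own template
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, temp=0.3, max_tokens=500)
print(response)
```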