---
language:
- ko
- en
license: apache-2.0
tags:
- text-generation
- qwen2.5
- korean
- instruct
- mlx
- 4bit
pipeline_tag: text-generation
---

## Qwen2.5-7B-Instruct-kowiki-qa-4bit (MLX-converted model)

- Original model: [beomi/Qwen2.5-7B-Instruct-kowiki-qa](https://huggingface.co/beomi/Qwen2.5-7B-Instruct-kowiki-qa)

## Requirements

- `pip install mlx-lm`

## Usage

- [Generate with CLI](https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md#command-line)
```bash
# Prompt: "Why is the sky blue?"
mlx_lm.generate --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit --prompt "하늘이 파란 이유가 뭐야?"
```

- [In Python](https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md#python-api)
```python
from mlx_lm import load, generate

# Load the 4-bit MLX weights and the tokenizer from the Hugging Face Hub.
model, tokenizer = load(
    "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    tokenizer_config={"trust_remote_code": True},
)

prompt = "하늘이 파란 이유가 뭐야?"  # "Why is the sky blue?"

messages = [
    # System prompt: "You are a friendly chatbot."
    {"role": "system", "content": "당신은 친절한 챗봇입니다."},
    {"role": "user", "content": prompt},
]
# Wrap the messages in the model's chat template and open the assistant turn.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

text = generate(
    model,
    tokenizer,
    prompt=prompt,
    # verbose=True,
    # max_tokens=8196,
    # temp=0.0,
)
print(text)
```

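To make the `apply_chat_template` step less opaque, here is a minimal sketch of the prompt string it is expected to produce. It assumes the tokenizer ships the usual Qwen2.5 ChatML-style template; `to_chatml` is a hypothetical helper written for illustration and is not part of mlx-lm. The real template also handles cases such as tool calls and a default system prompt, so the output may differ in detail.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render chat messages in ChatML (assumed here to match the Qwen2.5 template)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model continues from this point.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    # "You are a friendly chatbot." / "Why is the sky blue?"
    {"role": "system", "content": "당신은 친절한 챗봇입니다."},
    {"role": "user", "content": "하늘이 파란 이유가 뭐야?"},
]
print(to_chatml(messages))
```

In practice, keep calling `tokenizer.apply_chat_template`, since it always reflects the template bundled with the model; the sketch is only meant for inspecting or debugging prompts.
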
- [OpenAI-Compatible HTTP Server](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)
```bash
mlx_lm.server --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit --host 0.0.0.0
```

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    # The openai client requires some key to be set; a placeholder is
    # assumed to be enough for the local server.
    api_key="not-needed",
)

prompt = "하늘이 파란 이유가 뭐야?"  # "Why is the sky blue?"

messages = [
    # System prompt: "You are a friendly chatbot."
    {"role": "system", "content": "당신은 친절한 챗봇입니다."},
    {"role": "user", "content": prompt},
]
res = client.chat.completions.create(
    model="mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    messages=messages,
    temperature=0.2,
)

print(res.choices[0].message.content)
```

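If installing the `openai` package is not an option, the same endpoint can probably be reached with the standard library alone. The sketch below assumes the server accepts the standard chat-completions JSON body at `/v1/chat/completions` and answers in the shape the client example reads (`choices[0].message.content`); `build_chat_request` and `ask` are hypothetical helper names, not part of mlx-lm.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # same address as the client example
MODEL = "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit"


def build_chat_request(prompt, temperature=0.2, base_url=BASE_URL):
    """Build (but do not send) a POST request for the chat-completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With `mlx_lm.server` running, `print(ask("하늘이 파란 이유가 뭐야?"))` should print the model's answer. Keeping request construction separate from sending makes the payload easy to inspect without a live server.
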