---

language:
- ko
- en
license: apache-2.0
tags:
- text-generation
- qwen2.5
- korean
- instruct
- mlx
- 8bit
pipeline_tag: text-generation
---


## Qwen2.5-7B-Instruct-kowiki-qa-8bit (MLX-converted model)
- Converted from the original model [beomi/Qwen2.5-7B-Instruct-kowiki-qa](https://huggingface.co/beomi/Qwen2.5-7B-Instruct-kowiki-qa); a sketch of a typical conversion command follows.
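
An 8-bit MLX conversion like this one is typically produced with mlx-lm's converter; the command below is a minimal sketch, not necessarily the exact invocation used for this repo:

```bash
# Hypothetical reconstruction of the conversion step (flags from mlx-lm's convert CLI):
# quantize the original Hugging Face model to 8-bit MLX weights
mlx_lm.convert \
    --hf-path beomi/Qwen2.5-7B-Instruct-kowiki-qa \
    -q \
    --q-bits 8
```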


## Requirements
- `pip install mlx-lm`

## Usage
- [Generate with CLI](https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md#command-line)
    ```bash
    # Prompt: "Why is the sky blue?"
    mlx_lm.generate --model sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit --prompt "ν•˜λŠ˜μ΄ νŒŒλž€ μ΄μœ κ°€ 뭐야?"
    ```
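
    Common generation flags can be layered on top (names as in recent mlx-lm releases; availability may vary by version):
    ```bash
    # Cap the response length and lower the sampling temperature
    mlx_lm.generate --model sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit \
        --prompt "ν•˜λŠ˜μ΄ νŒŒλž€ μ΄μœ κ°€ 뭐야?" \
        --max-tokens 512 --temp 0.2
    ```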


- [In Python](https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md#python-api)
    ```python
    from mlx_lm import load, generate

    model, tokenizer = load(
        "sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
        tokenizer_config={"trust_remote_code": True},
    )

    # "Why is the sky blue?"
    prompt = "ν•˜λŠ˜μ΄ νŒŒλž€ μ΄μœ κ°€ 뭐야?"

    messages = [
        # "You are a friendly chatbot."
        {"role": "system", "content": "당신은 μΉœμ ˆν•œ μ±—λ΄‡μž…λ‹ˆλ‹€."},
        {"role": "user", "content": prompt},
    ]
    # Render the chat template into a single prompt string
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )

    text = generate(
        model,
        tokenizer,
        prompt=prompt,
        # verbose=True,
        # max_tokens=8196,
        # temp=0.0,
    )
    print(text)
    ```


- [OpenAI Compatible HTTP Server](https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/SERVER.md)
    ```bash
    mlx_lm.server --model sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit --host 0.0.0.0
    ```
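
    The server can also be exercised directly with curl against the OpenAI-style `/v1/chat/completions` route (default port 8080):
    ```bash
    curl http://localhost:8080/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
            "model": "sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
            "messages": [{"role": "user", "content": "ν•˜λŠ˜μ΄ νŒŒλž€ μ΄μœ κ°€ 뭐야?"}],
            "temperature": 0.2
        }'
    ```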


    ```python
    import openai

    # The local server does not check API keys, but the client requires one to be set.
    client = openai.OpenAI(
        base_url="http://localhost:8080/v1",
        api_key="not-needed",
    )

    # "Why is the sky blue?"
    prompt = "ν•˜λŠ˜μ΄ νŒŒλž€ μ΄μœ κ°€ 뭐야?"

    messages = [
        # "You are a friendly chatbot."
        {"role": "system", "content": "당신은 μΉœμ ˆν•œ μ±—λ΄‡μž…λ‹ˆλ‹€."},
        {"role": "user", "content": prompt},
    ]
    res = client.chat.completions.create(
        model="sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
        messages=messages,
        temperature=0.2,
    )

    print(res.choices[0].message.content)
    ```
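
    The server also supports the standard OpenAI streaming interface (see SERVER.md); a minimal sketch reusing the client and messages above:
    ```python
    # Stream tokens as they are generated
    stream = client.chat.completions.create(
        model="sucream/Qwen2.5-7B-Instruct-kowiki-qa-8bit",
        messages=messages,
        temperature=0.2,
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    ```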