This is a generative model based on IlyaGusev/saiga_mistral_7b_lora, converted to fp16 format.

Install vLLM:

pip install vllm

Start server:

python -u -m vllm.entrypoints.openai.api_server --host 0.0.0.0 --model Gaivoronsky/Mistral-7B-Saiga
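
The server can take a while to load the weights before it accepts requests. Below is a minimal sketch of a readiness check, assuming the server listens on the default port 8000 used by the client below and exposes the OpenAI-style `/v1/models` route. The helper names `models_url` and `wait_for_server` are hypothetical and not part of vLLM.

```python
import http.client
import time
import urllib.error
import urllib.request


def models_url(api_base):
    """Build the model-listing URL from an OpenAI-style base URL."""
    return api_base.rstrip("/") + "/models"


def wait_for_server(api_base="http://localhost:8000/v1", timeout_s=60.0, interval_s=2.0):
    """Poll the model-listing route until it answers with HTTP 200.

    Returns True once the server responds, False if timeout_s elapses first.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(models_url(api_base), timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Server not reachable yet (connection refused, timeout, ...).
            pass
        time.sleep(interval_s)
    return False
```

Calling `wait_for_server()` before running the client avoids connection errors while the model is still loading.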

Client:

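# Uses the legacy interface of the openai package (versions below 1.0):
# openai.api_base and openai.ChatCompletion were removed in openai 1.0.
import openai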

openai.api_base = "http://localhost:8000/v1"
openai.api_key = "none"

DEFAULT_MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>"
DEFAULT_RESPONSE_TEMPLATE = "<s>bot\n"
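# English: "You are Saiga, a Russian-language automated assistant. You talk to people and help them."
DEFAULT_SYSTEM_PROMPT = "Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."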


class Conversation:
    def __init__(
        self,
        message_template=DEFAULT_MESSAGE_TEMPLATE,
        system_prompt=DEFAULT_SYSTEM_PROMPT,
        response_template=DEFAULT_RESPONSE_TEMPLATE
    ):
        self.message_template = message_template
        self.response_template = response_template
        self.messages = [{
            "role": "system",
            "content": system_prompt
        }]

    def add_user_message(self, message):
        self.messages.append({
            "role": "user",
            "content": message
        })

    def add_bot_message(self, message):
        self.messages.append({
            "role": "bot",
            "content": message
        })

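    def get_prompt(self):
        # Render every message with the template, then append the opening
        # of the bot turn so the model continues with its answer.
        final_text = ""
        for message in self.messages:
            message_text = self.message_template.format(**message)
            final_text += message_text
        # Use the instance's template so a custom response_template is honored.
        final_text += self.response_template
        return final_text.strip()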


# English: "How much does a giraffe weigh?"
query = "Сколько весит жираф?"
conversation = Conversation()
conversation.add_user_message(query)
prompt = conversation.get_prompt()

response = openai.ChatCompletion.create(
    model="Gaivoronsky/Mistral-7B-Saiga",
    # The system prompt is already embedded in `prompt` by Conversation,
    # so no separate `system` argument is passed.
    messages=[{"role": "user", "content": prompt}],
    max_tokens=512,
    stop=["</s>"],
)
print(response["choices"][0]["message"]["content"])
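
For multi-turn dialogue, the same template is applied to every message in order. The following is a minimal standalone sketch of that rendering, mirroring `Conversation.get_prompt` above. The function name `render_prompt` and the placeholder message contents are illustrative assumptions, not part of the model's code.

```python
MESSAGE_TEMPLATE = "<s>{role}\n{content}</s>"
RESPONSE_TEMPLATE = "<s>bot\n"


def render_prompt(messages,
                  message_template=MESSAGE_TEMPLATE,
                  response_template=RESPONSE_TEMPLATE):
    """Render a list of {"role", "content"} dicts into a single prompt string."""
    parts = [message_template.format(**message) for message in messages]
    # Open the next bot turn so the model generates the reply.
    parts.append(response_template)
    # strip() drops the trailing newline after "bot", as in get_prompt.
    return "".join(parts).strip()


# Placeholder contents; a real dialogue would carry the actual texts.
messages = [
    {"role": "system", "content": "SYSTEM"},
    {"role": "user", "content": "Q1"},
    {"role": "bot", "content": "A1"},
    {"role": "user", "content": "Q2"},
]
prompt = render_prompt(messages)
print(prompt)
```

Each earlier answer goes back into the history with the `bot` role (what `add_bot_message` does), and the prompt always ends with an open `<s>bot` turn, which is why `</s>` is passed as the stop sequence.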