How to deploy the model?

#3
by Foresta - opened

Thanks for your contribution, but I ran into some problems when using it. I downloaded the model and made sure there was no file corruption, then I used

            pipeline = transformers.pipeline(
                "text-generation",
                model=model_path_name,
                model_kwargs={"torch_dtype": torch.bfloat16},
                device_map='cuda',
            )

to load the model.

I apply the chat template with

            message_list.append([
                {'role': 'system', 'content': instruction},
                {'role': 'user', 'content': prompt}
                ])

I tokenize the prompts with

        # tokenizing
        prompts = [
            pipeline.tokenizer.apply_chat_template(
                messages,
                tokenize=False,
                add_generation_prompt=True,
            )
            for messages in message_list
        ]

Finally, I try to generate text with

        outputs = pipeline(
            prompts,
            max_new_tokens=4096,
            do_sample=True,
            temperature=0.5,
            top_p=0.5,
            eos_token_id=terminators,
            pad_token_id=pipeline.tokenizer.eos_token_id,
        )

Here, prompts is: ['<|im_start|>system\nYou are an chatbot.<|im_end|>\n<|im_start|>user\n Who are you?<|im_end|>\n<|im_start|>assistant\n']

But I didn't get any output. Any idea why? Maybe this loading approach isn't supported?

I would be very grateful for any suggestions.

Thank you for reaching out. Here's an example of how to run this using Python and transformers:

import transformers
import torch

# Model and tokenizer initialization
model_path_name = "SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA"  # Replace with your model path

# Initialize the pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_path_name,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",  # Adjust to 'cuda' if needed
)

# Prepare the message list
message_list = [
    [
        {'role': 'system', 'content': "You are an AI assistant."},
        {'role': 'user', 'content': "Who are you?"}
    ]
]

# Apply the chat template or manually format the prompts
try:
    prompts = [
        pipeline.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True,
        )
        for messages in message_list
    ]
except (AttributeError, ValueError):
    # Fallback: manually format the prompts if `apply_chat_template` is unavailable or no chat template is set
    prompts = [
        f"<|im_start|>system\n{msg[0]['content']}<|im_end|>\n"
        f"<|im_start|>user\n{msg[1]['content']}<|im_end|>\n<|im_start|>assistant\n"
        for msg in message_list
    ]

# Debugging: Print prompts
print("Formatted Prompts:", prompts)

# Validate tokenizer and model's EOS and PAD token IDs
eos_token_id = pipeline.tokenizer.eos_token_id or 50256  # Default fallback for GPT-like models
pad_token_id = eos_token_id  # Ensure consistency
if pipeline.tokenizer.pad_token is None:
    pipeline.tokenizer.pad_token = pipeline.tokenizer.eos_token  # Some tokenizers define no pad token; reuse EOS for the padding call below
print("EOS Token ID:", eos_token_id)

# Tokenize the prompts (optional debugging step)
tokens = pipeline.tokenizer(prompts, padding=True, return_tensors="pt")
print("Tokenized Input:", tokens)

# Generate the output
try:
    outputs = pipeline(
        prompts,
        max_new_tokens=100,  # Reduce for debugging purposes
        do_sample=True,
        temperature=0.5,
        top_p=0.5,
        eos_token_id=eos_token_id,
        pad_token_id=pad_token_id,
    )
    print("Outputs:", outputs)
except Exception as e:
    print("Error during generation:", str(e))

I've tested this, and it works. I'll also add the code to the repo.
Enjoy :)

SicariusSicariiStuff changed discussion status to closed

I succeeded, thank you very much. Also, I would like to know if you have any papers or other references for your series of work, which I am using in my research, so I can cite it formally or just attach a link.

No papers yet, but you can link my HF username:
https://huggingface.co/SicariusSicariiStuff

and model:
https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA
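
If you need something more formal than a bare link, a BibTeX sketch along these lines should do (the entry key and fields are just a suggestion; fill in the access date yourself):

@misc{sicarius_llama3_8b_unaligned_beta,
  author       = {SicariusSicariiStuff},
  title        = {{LLAMA-3\_8B\_Unaligned\_BETA}},
  howpublished = {\url{https://huggingface.co/SicariusSicariiStuff/LLAMA-3_8B_Unaligned_BETA}},
  note         = {Accessed: YYYY-MM-DD}
}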

And good luck with the research!
I would love to read it.
