🚩 Report: Not working

#2
by userHK - opened

```python
from transformers import pipeline

# Define your messages
messages = [
    {"role": "user", "content": "Who are you?"},
]

# Set up the pipeline with multiple GPUs
pipe = pipeline(
    "text-generation",
    model="mukaj/Llama-3.1-Hawkish-8B",
    device_map="auto",  # Automatically map the model across available GPUs
    model_kwargs={"torch_dtype": "float16"}  # Use mixed precision for efficiency
)

# Generate the output
output = pipe(messages)
print(output)
```

Running the sample code above, taken from the model card, raises the following error:

```
Exception: data did not match any variant of untagged enum ModelWrapper at line 1251015 column 3
```
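
For what it's worth, this particular parse error typically comes from a `tokenizers` build that is too old to read the format of the model's `tokenizer.json`, rather than from the pipeline call itself. A minimal diagnostic sketch, assuming a pip-managed environment (the version check is a suggestion, not a fix confirmed by the model card):

```python
# Sketch: check whether the installed libraries might be too old to parse
# this model's tokenizer.json. The "untagged enum ModelWrapper" exception
# is commonly reported with outdated tokenizers versions.
import tokenizers
import transformers

print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)

# If these are outdated, upgrading often resolves the parse error:
#   pip install -U transformers tokenizers
```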
