metadata
tags:
- autotrain
- text-generation-inference
- text-generation
- peft
- mlx
library_name: transformers
base_model: ben-at-jorah/emergency-llama32-1b-finetune-rmsys3
widget:
  - messages:
      - role: user
        content: What is your favorite condiment?
license: other
datasets:
- ben-at-jorah/emergency-training-data_2024-11-20
ben-at-jorah/emergency-llama32-1b-finetune-rmsys3_mlx-8bit
The model ben-at-jorah/emergency-llama32-1b-finetune-rmsys3_mlx-8bit was converted to MLX format from ben-at-jorah/emergency-llama32-1b-finetune-rmsys3 using mlx-lm version 0.19.2.
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

# Load the converted MLX weights and the matching tokenizer.
model, tokenizer = load("ben-at-jorah/emergency-llama32-1b-finetune-rmsys3_mlx-8bit")

prompt = "hello"

# Wrap the prompt in the chat template when the tokenizer provides one.
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
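The chat-template check in the snippet can be factored into a small reusable function. The following is a minimal sketch, assuming the tokenizer exposes `apply_chat_template` and `chat_template` as it does in the snippet; `build_chat_prompt` is a hypothetical helper name, not part of mlx-lm.

```python
def build_chat_prompt(tokenizer, user_text):
    """Return user_text wrapped in the tokenizer's chat template, if it has one.

    build_chat_prompt is a hypothetical helper for illustration; it is not
    provided by mlx-lm.
    """
    has_template = (
        hasattr(tokenizer, "apply_chat_template")
        and getattr(tokenizer, "chat_template", None) is not None
    )
    if not has_template:
        # No chat template available: pass the raw text through unchanged.
        return user_text
    # Single-turn conversation with one user message.
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
```

With the model loaded as in the snippet, the result of `build_chat_prompt(tokenizer, "hello")` can be passed as the `prompt` argument to `generate`.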