Can you provide a usage sample?
Hello, thank you for such nice work! Could you provide sample code showing how to load the model, how to supply prompts and generate results, and how to set the parameters that control generation?
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_path = "Sheared-LLaMA-1.3B/"  # Replace with the actual path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Input prompt
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate text
output = model.generate(
    input_ids,
    max_length=100,          # Maximum total length (prompt + generated tokens)
    num_return_sequences=1,  # Number of sequences to return
    no_repeat_ngram_size=2,  # Block repeated 2-grams
    temperature=0.7,         # Lower values make sampling more deterministic
    top_p=0.9,               # Nucleus sampling: keep tokens within top 90% probability mass
    do_sample=True,          # Sample instead of greedy decoding
)

# Decode and print the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
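To make the `top_p` parameter above more concrete, here is a minimal, library-free sketch of what nucleus (top-p) filtering does to a next-token distribution. The toy vocabulary and probabilities are invented for illustration; the real filtering happens inside `model.generate`.

```python
def top_p_filter(probs, top_p=0.9):
    # Sort token indices by probability (descending) and keep the smallest
    # set whose cumulative probability reaches top_p (the "nucleus").
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize the surviving probabilities so they sum to 1,
    # then sampling proceeds over this reduced set.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Toy next-token distribution over a 5-token vocabulary
probs = [0.5, 0.3, 0.1, 0.07, 0.03]
filtered = top_p_filter(probs, top_p=0.9)
print(filtered)  # tokens 0, 1, 2 survive; tokens 3 and 4 are cut off
```

With `top_p=0.9`, low-probability tail tokens are excluded before sampling, which is why raising `top_p` makes output more diverse and lowering it makes output more conservative. `temperature` works upstream of this step by flattening (>1) or sharpening (<1) the distribution.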