metadata
library_name: transformers
datasets:
  - pksx01/alpaca_bhojpuri_instruction
language:
  - bh
base_model:
  - sarvamai/sarvam-1

This model is an instruction-tuned version of sarvamai/sarvam-1. It is an early checkpoint trained for one complete epoch. Checkpoints with further training will be released in the future.

Uses

This model can be used to chat in the Bhojpuri language.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("pksx01/sarvam-1-it-bhojpuri")

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "sarvamai/sarvam-1",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model.resize_token_embeddings(len(tokenizer))

# Load the PEFT model
peft_model = PeftModel.from_pretrained(
    model,
    "pksx01/sarvam-1-it-bhojpuri",
    is_trainable=False
)

# Example prompt in Bhojpuri: "Who was the first Prime Minister of India?"
message = [{"role": "user", "content": "भारत के पहिला प्रधानमंत्री के रहे?"}]
model_ip = tokenizer.apply_chat_template(message, tokenize=False)
# Move the inputs to the device the model was loaded on (device_map="auto" chooses it)
tokenized_ip = tokenizer(model_ip, return_tensors="pt").to(model.device)

peft_model.eval()
with torch.no_grad():
    op_tokens = peft_model.generate(
        **tokenized_ip,
        max_new_tokens=250,
        do_sample=True,  # temperature, top_k and top_p only take effect when sampling is enabled
        temperature=0.01,
        top_k=50,
        top_p=0.95,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id
    )

op = tokenizer.decode(op_tokens[0], skip_special_tokens=True)
print(op)
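
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated tokens, so the decoded text printed above includes the prompt as well as the reply. The sketch below shows one way to keep only the reply. The helper name `split_generated` and the token ids in the illustration are hypothetical and are not part of this model's API or tokenizer output.

```python
def split_generated(output_ids, prompt_length):
    """Return only the token ids that come after the first `prompt_length` ids.

    Hypothetical helper: works on any sliceable sequence, such as a Python
    list or a 1-D tensor of token ids.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


# Illustration with made-up token ids (not real tokenizer output)
prompt_ids = [1, 450, 901]
output_ids = prompt_ids + [77, 88, 2]
new_ids = split_generated(output_ids, len(prompt_ids))
print(new_ids)
```

With the objects from the snippet above, the prompt length would be `tokenized_ip["input_ids"].shape[1]`, and the reply could be decoded with `tokenizer.decode(split_generated(op_tokens[0], prompt_length), skip_special_tokens=True)`.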

Training Details

Training Data

This model has been trained on the instruction dataset pksx01/alpaca_bhojpuri_instruction.
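
As a rough illustration of how an instruction record can be turned into the chat-message format used in the getting-started code, here is a minimal sketch. It assumes the common Alpaca layout with `instruction`, `input` and `output` fields. The actual column names of pksx01/alpaca_bhojpuri_instruction are not stated here, and the function name `alpaca_to_messages` and the placeholder reply are hypothetical.

```python
def alpaca_to_messages(record):
    """Convert one Alpaca-style record into a list of chat messages.

    Assumes the keys "instruction", "input" (optional) and "output";
    these field names are an assumption, not confirmed for this dataset.
    """
    instruction = record["instruction"].strip()
    extra = (record.get("input") or "").strip()
    # Append the optional input below the instruction when it is present
    user_text = f"{instruction}\n\n{extra}" if extra else instruction
    return [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": record["output"].strip()},
    ]


# Hypothetical record: the question is the prompt used earlier in this card,
# and the output is a placeholder rather than real dataset content.
example = {
    "instruction": "भारत के पहिला प्रधानमंत्री के रहे?",
    "input": "",
    "output": "<reply text>",
}
messages = alpaca_to_messages(example)
print(messages)
```

A message list in this shape can be passed to `tokenizer.apply_chat_template`, as in the getting-started code.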