pksx01/sarvam-1-it-bhojpuri


	---
	library_name: transformers
	datasets:
	- pksx01/alpaca_bhojpuri_instruction
	language:
	- bh
	base_model:
	- sarvamai/sarvam-1
	---

	This model is an instruction-tuned version of [sarvamai/sarvam-1](https://huggingface.co/sarvamai/sarvam-1). It is an early checkpoint trained for one complete epoch. Checkpoints with further training will be released in the future.

	## Uses

	This model can be used to chat in the Bhojpuri language.


	## How to Get Started with the Model

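	Use the code below to get started with the model. The fine-tuned weights are loaded as a PEFT adapter on top of the base model, so `peft` must be installed alongside `transformers` and `torch`.
	```python
	import torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the tokenizer from the fine-tuned repository
	tokenizer = AutoTokenizer.from_pretrained("pksx01/sarvam-1-it-bhojpuri")

	# Load the base model
	model = AutoModelForCausalLM.from_pretrained(
	    "sarvamai/sarvam-1",
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)
	# Match the embedding matrix to the fine-tuned tokenizer's vocabulary size
	model.resize_token_embeddings(len(tokenizer))

	# Load the PEFT adapter on top of the base model
	peft_model = PeftModel.from_pretrained(
	    model,
	    "pksx01/sarvam-1-it-bhojpuri",
	    is_trainable=False,
	)

	# Build the prompt with the chat template and move it to the model's device
	message = [{"role": "user", "content": "भारत के पहिला प्रधानमंत्री के रहे?"}]
	model_ip = tokenizer.apply_chat_template(message, tokenize=False)
	tokenized_ip = tokenizer(model_ip, return_tensors="pt").to(model.device)

	peft_model.eval()
	with torch.no_grad():
	    op_tokens = peft_model.generate(
	        **tokenized_ip,
	        max_new_tokens=250,
	        do_sample=True,  # needed for temperature, top_k and top_p to take effect
	        temperature=0.01,
	        top_k=50,
	        top_p=0.95,
	        eos_token_id=tokenizer.eos_token_id,
	        pad_token_id=tokenizer.pad_token_id,
	    )

	# Decode the full sequence (prompt plus generated reply)
	op = tokenizer.decode(op_tokens[0], skip_special_tokens=True)
	print(op)
	```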
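
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `op_tokens[0]` prints the prompt along with the reply. The following is a minimal sketch of keeping only the reply. `new_token_ids` is a hypothetical helper name, not part of `transformers`, and plain Python lists stand in for tensors so the example runs without downloading the model.

```python
def new_token_ids(generated_ids, prompt_length):
    """Return only the ids that were produced after the prompt."""
    if prompt_length < 0 or prompt_length > len(generated_ids):
        raise ValueError("prompt_length must be between 0 and len(generated_ids)")
    return generated_ids[prompt_length:]


# Plain lists stand in for token id tensors (illustrative values only)
prompt_ids = [1, 15, 27, 9]
generated = prompt_ids + [42, 43, 2]

reply_ids = new_token_ids(generated, len(prompt_ids))
print(reply_ids)  # [42, 43, 2]
```

With the variables from the snippet above, the same idea would be `tokenizer.decode(new_token_ids(op_tokens[0], tokenized_ip["input_ids"].shape[1]), skip_special_tokens=True)`.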

	## Training Details

	### Training Data

	This model has been trained on an instruction dataset: [pksx01/alpaca_bhojpuri_instruction](https://huggingface.co/datasets/pksx01/alpaca_bhojpuri_instruction).
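
The dataset name suggests the Alpaca layout. Below is a minimal sketch of turning one such record into chat messages of the kind passed to `apply_chat_template`. It assumes the fields are named `instruction`, `input`, and `output`, which this card does not confirm, so check the dataset card before relying on it. `alpaca_to_messages` is a hypothetical helper and the sample record is a made-up placeholder. This is not necessarily how the training data was actually preprocessed.

```python
def alpaca_to_messages(record):
    """Convert an Alpaca-style record (assumed field names) into chat messages."""
    instruction = (record.get("instruction") or "").strip()
    extra = (record.get("input") or "").strip()
    answer = (record.get("output") or "").strip()

    # Append the optional input to the instruction, separated by a blank line
    user_content = f"{instruction}\n\n{extra}" if extra else instruction

    messages = [{"role": "user", "content": user_content}]
    if answer:
        messages.append({"role": "assistant", "content": answer})
    return messages


# Made-up placeholder record, for illustration only
sample = {
    "instruction": "भारत के पहिला प्रधानमंत्री के रहे?",
    "input": "",
    "output": "<reference answer>",
}
messages = alpaca_to_messages(sample)
print(messages)
```

Leaving the assistant turn out when `output` is empty gives the same single-message list used for inference above.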