This is an early checkpoint of sarvam-2b, a small yet powerful language model being pre-trained from scratch on 4 trillion tokens. It is trained to perform well in 10 Indic languages plus English. Officially, the supported Indic languages are: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu.

sarvam-2b is being trained on a data mixture containing equal parts English (2T) and Indic (2T) tokens. The current checkpoint has seen a total of 2 trillion tokens and has not undergone any post-training.

Getting started:

from transformers import pipeline

# Load the checkpoint as a text-generation pipeline on GPU 0
pipe = pipeline(model='sarvamai/sarvam-2b-v0.5', device=0)

# Prompt: 'भारत के प्रथम प्रधानमंत्री' ("India's first Prime Minister")
# Note: temperature only takes effect if sampling is enabled (do_sample=True);
# with the default greedy decoding it is ignored.
pipe('भारत के प्रथम प्रधानमंत्री', max_new_tokens=15, temperature=0.1, repetition_penalty=1.2)[0]['generated_text']
# 'भारत के प्रथम प्रधानमंत्री जवाहरलाल नेहरू की बेटी इंदिरा गांधी थीं।\n\n'
# ("India's first Prime Minister Jawaharlal Nehru's daughter was Indira Gandhi.")
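For finer-grained control over tokenization, device placement, and sampling, the checkpoint can also be loaded directly. The sketch below uses the standard transformers APIs; the generation settings mirror the pipeline example and are illustrative, not prescribed by this card:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('sarvamai/sarvam-2b-v0.5')
model = AutoModelForCausalLM.from_pretrained('sarvamai/sarvam-2b-v0.5', device_map='auto')

# Tokenize a Hindi prompt and generate a short continuation
inputs = tokenizer('भारत के प्रथम प्रधानमंत्री', return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=15, repetition_penalty=1.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))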

More technical details, such as evaluations and benchmarks, will be posted soon.

Model details:
- Format: GGUF (6-bit quantization)
- Model size: 2.51B params
- Architecture: llama