---
license: apache-2.0
---
|
|
|
# Baby Nandi |
|
Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs for practical use.
|
It outperforms the original gemma-2b overall, but still trails the newer gemma-1.1-2b-it.
|
|
|
**Benchmarks**
|
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| [bharadwajswarna/gemma-2b-sft-telugu](https://huggingface.co/bharadwajswarna/gemma-2b-sft-telugu) [📄](https://gist.github.com/bharadwajswarna2/6d5088f1b86890249e5b9e509ca7a8ce) | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) [📄](https://gist.github.com/mlabonne/db0761e74175573292acf497da9e5d95) | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| [google/gemma-2b](https://huggingface.co/google/gemma-2b) [📄](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
|
|
|
**Training Process & Datasets:**

1. The Gemma 2B base model was further pretrained on a subset of the AI4Bharat Sangraha dataset (280k Telugu samples).

2. SFT on a mix of Telugu Alpaca and Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca (see the formatting sketch below).
|
|
|
You can find the pretrained Telugu base model (step 1, before SFT) here: [Gemma-2b-Telugu-Base-Model](https://huggingface.co/bharadwajswarna/gemma-2b-tel-base-6ep)
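
For illustration, here is a minimal sketch of how Alpaca-style records could be rendered into the instruction template shown further below. The `instruction`/`input`/`output` field names are assumptions about the dataset schema, not confirmed details of the actual training pipeline:

```python
# Hypothetical preprocessing step: render Alpaca-style records into the
# instruction template used by this model. The instruction/input/output
# field names are assumptions about the dataset schema.
TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_example(record: dict) -> str:
    """Render one SFT record as a single training string."""
    return TEMPLATE.format(
        instruction=record.get("instruction", ""),
        input=record.get("input", ""),
        output=record.get("output", ""),
    )

# Example usage with a toy record:
print(format_example({
    "instruction": "Summarize the paragraph.",
    "input": "Nandi is a series of Telugu LLMs built on Gemma 2B.",
    "output": "Nandi: compact Telugu LLMs based on Gemma 2B.",
}))
```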
|
|
|
**Training Duration:**

1. Pretraining for 6 epochs, roughly 35 hours (this may not be enough).

2. SFT for 3 epochs.
|
|
|
**Inference Prompt Template:** |
|
```
"""
### Instruction:
{}

### Input:
{}

### Response:
{}
"""
```
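
For convenience, here is a minimal inference sketch using the `transformers` library. The repo id is the one listed in the benchmark table above, and the Telugu strings are illustrative placeholders:

```python
# Minimal inference sketch (assumes transformers and torch are installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bharadwajswarna/gemma-2b-sft-telugu"  # repo id from the benchmark table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Fill the Alpaca-style template; the Response slot stays empty so the
# model completes it.
prompt = """### Instruction:
{}

### Input:
{}

### Response:
{}""".format(
    "కింది వాక్యాన్ని ఆంగ్లంలోకి అనువదించండి.",  # "Translate the following sentence into English."
    "నంది తెలుగు భాషా నమూనాల శ్రేణి.",  # "Nandi is a series of Telugu language models."
    "",
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```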
|
**Developer:**

[Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
You can reach out to me for any questions, suggestions, or collaborations.
|
|