---
license: apache-2.0
---

# Baby Nandi
Baby Nandi (part of the Nandi series of Telugu LLMs) is a Telugu instruction-tuned version of Gemma 2B, built as part of an effort to develop smaller, more efficient Indic LLMs that are useful for practical purposes.
It outperforms the original gemma-2b overall, but still lags behind the latest gemma-1.1-2b-it.

**πŸ† Benchmarks**
| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| [bharadwajswarna/gemma-2b-sft-telugu](https://huggingface.co/bharadwajswarna/gemma-2b-sft-telugu) [πŸ“„](https://gist.github.com/bharadwajswarna2/6d5088f1b86890249e5b9e509ca7a8ce) | 38.99 | 21.53 | 55.56 | 48.33 | 30.56 |
| [google/gemma-2b-it](https://huggingface.co/google/gemma-2b-it) [πŸ“„](https://gist.github.com/mlabonne/db0761e74175573292acf497da9e5d95) | 36.1 | 23.76 | 43.6 | 47.64 | 29.41 |
| [google/gemma-2b](https://huggingface.co/google/gemma-2b) [πŸ“„](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |

**Training Process & Datasets:**
1. The Gemma 2B base model was further pretrained on a subset of the AI4Bharat Sangraha dataset (280k Telugu samples).
2. SFT on a mix of Telugu Alpaca and Telugu GPTeacher from Telugu LLM Labs, plus English Alpaca (see the formatting sketch after this list).

You can find the intermediate pretrained base model here: [Gemma-2b-Telugu-Base-Model](https://huggingface.co/bharadwajswarna/gemma-2b-tel-base-6ep)
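
As an illustration of how the Alpaca-style SFT records might be rendered into training text: this is a minimal sketch, and the exact preprocessing is an assumption; the field names follow the standard Alpaca schema (`instruction`/`input`/`output`), not a published recipe for this model.

```python
# Hypothetical Alpaca-style formatting sketch; the actual preprocessing
# used for this model is not published.
ALPACA_TEMPLATE = """### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

def format_example(example: dict) -> str:
    """Render one Alpaca-style record into a single training string."""
    return ALPACA_TEMPLATE.format(
        instruction=example.get("instruction", ""),
        input=example.get("input", ""),
        output=example.get("output", ""),
    )

# Example record (illustrative only):
sample = {
    "instruction": "Translate the following sentence to Telugu.",
    "input": "Good morning!",
    "output": "శుభోదయం!",
}
print(format_example(sample))
```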

**Training Duration:**
1. Pretraining for 6 epochs, roughly 35 hours (this might not be enough).
2. SFT for 3 epochs.

**Inference Prompt Template:**
```
"""
### Instruction:
{}

### Input:
{}

### Response:
{}
"""
```
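
A minimal inference sketch using the `transformers` library, assuming the model id from the benchmark table above; the generation settings are illustrative, not recommended values:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bharadwajswarna/gemma-2b-sft-telugu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the `accelerate` package.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Fill the template above, leaving the Response section empty for the
# model to complete.
prompt = """### Instruction:
{}

### Input:
{}

### Response:
{}""".format("Summarize the following text in Telugu.", "<your Telugu text here>", "")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```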
**Developer:**
[Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
You can reach out to me with any questions, suggestions, or collaboration ideas.