**README.md** — commit `637d20a` by bharadwajswarna (parent: `e5407c6`): "update readme"
````diff
@@ -4,6 +4,7 @@ license: apache-2.0
 
 # Baby Nandi
 Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt to develop smaller and more efficient Indic LLMs, useful for practical purposes.
+It beats the original gemma-2b overall, but is still behind the latest gemma-2b-1.1-it.
 
 **🏆 Benchmarks**
 | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
@@ -13,8 +14,10 @@ Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt
 | [google/gemma-2b](https://huggingface.co/google/gemma-2b) [📄](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
 
 **Training Process & Datasets:**
-1. The Gemma 2b base model has been further pretrained on a part of the AI4Bharat Sangraha dataset (280k Telugu samples)
+1. The Gemma 2b base model has been further pretrained on a part of the AI4Bharat Sangraha dataset (280k Telugu samples).
 2. SFT on a mix of Telugu Alpaca + Telugu GPTeacher from Telugu LLM Labs, and English Alpaca
+
+You can find this model here: [Gemma-2b-Telugu-Base-Model](bharadwajswarna/gemma-2b-tel-base-6ep)
 
 **Training Duration:**
 1. Pretraining for 6 epochs, nearly 35 hours (this might not be enough)
@@ -33,4 +36,6 @@ Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt
 {}
 """
 ```
-
+**Developer:**
+[Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
+You can reach out to me for any questions, suggestions, or collaborations.
````