bharadwajswarna committed
Commit 637d20a
• 1 Parent(s): e5407c6

update readme

Files changed (1): README.md (+7 −2)
README.md CHANGED
@@ -4,6 +4,7 @@ license: apache-2.0
 
  # Baby Nandi
  Baby Nandi is a Telugu instruction-tuned version of Gemma 2B, part of an attempt to develop smaller, more efficient Indic LLMs that are useful for practical purposes.
+ It beats the original gemma-2b overall, but still lags behind the latest gemma-2b-1.1-it.
 
  **🏆 Benchmarks**
  | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
@@ -13,8 +14,10 @@ Baby Nandi is a Telugu instruction-tuned version of Gemma 2B, part of an attempt
  | [google/gemma-2b](https://huggingface.co/google/gemma-2b) [📄](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
 
  **Training Process & Datasets:**
- 1. The Gemma 2B base model has been further pretrained on part of the AI4Bharat Sangraha dataset (280k Telugu samples)
+ 1. The Gemma 2B base model has been further pretrained on part of the AI4Bharat Sangraha dataset (280k Telugu samples).
  2. SFT on a mix of Telugu Alpaca + Telugu GPTeacher from Telugu LLM Labs and English Alpaca
+ 
+ You can find the base model here: [Gemma-2b-Telugu-Base-Model](https://huggingface.co/bharadwajswarna/gemma-2b-tel-base-6ep)
 
  **Training Duration:**
  1. Pretraining for 6 epochs, nearly 35 hours (this might not be enough)
@@ -33,4 +36,6 @@ Baby Nandi is a Telugu instruction-tuned version of Gemma 2B, part of an attempt
  {}
  """
  ```
+ **Developer:**
+ [Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
+ You can reach out to me for any questions/suggestions/collaborations.
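The "Average" column in the benchmark table is the arithmetic mean of the four task scores; a quick sanity check against the google/gemma-2b row (values taken directly from the table above):

```python
# Reproduce the "Average" column for the google/gemma-2b row:
# it is the mean of the AGIEval, GPT4All, TruthfulQA and Bigbench scores.
scores = {"AGIEval": 22.7, "GPT4All": 43.35, "TruthfulQA": 39.96, "Bigbench": 31.03}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 34.26, matching the table
```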
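The inference snippet in the README is truncated: only the tail of a triple-quoted string with a `{}` placeholder survives, which suggests the prompt is built with Python's `str.format`. As an illustration only, here is a hypothetical Alpaca-style formatter of the kind associated with the Telugu Alpaca / GPTeacher SFT data; the template text below is an assumption, not a reconstruction of the README's actual prompt:

```python
# Hypothetical Alpaca-style prompt builder. The actual template used by
# Baby Nandi is truncated in the README, so this text is an assumed example.
ALPACA_TEMPLATE = """Below is an instruction that describes a task. \
Write a response that appropriately completes the request.

### Instruction:
{}

### Response:
"""

def build_prompt(instruction: str) -> str:
    """Fill the single {} placeholder, mirroring the str.format-style snippet."""
    return ALPACA_TEMPLATE.format(instruction)

# Example instruction: "Write a short story in Telugu."
prompt = build_prompt("తెలుగులో ఒక చిన్న కథ రాయండి.")
```

The filled prompt would then be tokenized and passed to the model for generation.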