**README.md** — commit `637d20a` by bharadwajswarna (parent: `e5407c6`): "update readme"
````diff
@@ -4,6 +4,7 @@ license: apache-2.0
 
 # Baby Nandi
 Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt to develop smaller and more efficient Indic LLMs, useful for practical purposes.
+It beats the original gemma-2b overall, but is still behind the latest gemma-2b-1.1-it.
 
 **🏆 Benchmarks**
 | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
@@ -13,8 +14,10 @@ Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt
 | [google/gemma-2b](https://huggingface.co/google/gemma-2b) [📄](https://gist.github.com/mlabonne/7df1f238c515a5f63a750c8792cef59e) | 34.26 | 22.7 | 43.35 | 39.96 | 31.03 |
 
 **Training Process & Datasets:**
-1. The Gemma 2b base model has been further pretrained on a part of the AI4Bharat Sangraha dataset (280k Telugu samples)
+1. The Gemma 2b base model has been further pretrained on a part of the AI4Bharat Sangraha dataset (280k Telugu samples).
 2. SFT on a mix of Telugu Alpaca + Telugu GPTeacher from Telugu LLM Labs, and English Alpaca
+
+You can find this model here: [Gemma-2b-Telugu-Base-Model](bharadwajswarna/gemma-2b-tel-base-6ep)
 
 **Training Duration:**
 1. Pretraining for 6 epochs, nearly 35 hours (this might not be enough)
@@ -33,4 +36,6 @@ Baby Nandi is a Telugu Instruction Tuned Version of Gemma 2B, part of an attempt
 {}
 """
 ```
-
+**Developer:**
+[Bharadwaj Swarna](https://www.linkedin.com/in/bharadwajswarna/)\
+You can reach out to me for any questions, suggestions, or collaborations.
````