---
language: bn
license: mit
datasets:
- mc4
---
# Bengali GPT-2
Bengali GPT-2 demo, built as part of the [Hugging Face JAX/Flax event](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/). Also features a model [fine-tuned](https://huggingface.co/khalidsaifullaah/bengali-lyricist-gpt2?) on Bengali song lyrics.
# Model Description
The OpenAI GPT-2 model was proposed in the paper [Language Models are Unsupervised Multitask Learners](https://paperswithcode.com/paper/language-models-are-unsupervised-multitask). The original GPT-2 is a causal (unidirectional) transformer pretrained with a language modeling objective on a very large corpus of ~40 GB of text. This model has the same configuration but was pretrained on the Bengali portion of the mC4 (multilingual C4) dataset. The training code is open-sourced [here](https://huggingface.co/flax-community/gpt2-bengali/tree/main).
# Training Details
* Overall result: `Eval loss: 1.45, Eval perplexity: 3.141`
* Data: [mC4-bn](https://huggingface.co/datasets/mc4)
* Train steps: 250k
* Model: 🤗 [flax-community/gpt2-bengali](https://huggingface.co/flax-community/gpt2-bengali)
* Demo: https://huggingface.co/spaces/flax-community/Gpt2-bengali
# Usage
There are several ways to use the model. For example, the `pipeline` API can be used directly to generate sentences.
```
from transformers import pipeline
# Build a text-generation pipeline backed by the pretrained Bengali GPT-2 checkpoint.
gpt2_bengali = pipeline("text-generation", model="flax-community/gpt2-bengali", tokenizer="flax-community/gpt2-bengali")
```
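The pipeline object created above is callable, so generating text is a matter of passing it a prompt. The following is a minimal sketch rather than the authors' own usage: the helper name `generate_text`, the default values, and the Bengali prompt in the commented usage are all illustrative assumptions.

```python
def generate_text(generator, prompt, max_new_tokens=30, num_return_sequences=1):
    """Call a text-generation pipeline and return only the generated strings.

    `generator` is any callable with the text-generation pipeline interface
    (hypothetical helper, not part of the model repository).
    """
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        num_return_sequences=num_return_sequences,
        do_sample=True,  # sampling lets several returned sequences differ
    )
    # The text-generation pipeline returns a list of dicts keyed by "generated_text".
    return [item["generated_text"] for item in outputs]


# Example usage (downloads the checkpoint on first run; the prompt is arbitrary):
# from transformers import pipeline
# gpt2_bengali = pipeline("text-generation", model="flax-community/gpt2-bengali")
# for text in generate_text(gpt2_bengali, "আমি", max_new_tokens=20, num_return_sequences=2):
#     print(text)
```

Because sampling is enabled, the output differs from run to run. The same helper works with the `singer` pipeline for the lyrics model.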
Similarly, the model fine-tuned on Bengali song lyrics can be loaded as follows.
```
from transformers import pipeline
# Build a text-generation pipeline backed by the lyrics fine-tuned checkpoint.
singer = pipeline("text-generation", model="khalidsaifullaah/bengali-lyricist-gpt2", tokenizer="khalidsaifullaah/bengali-lyricist-gpt2")
```
To use the model for other tasks, it needs to be fine-tuned on a custom dataset. Details can be found in the Hugging Face [documentation](https://huggingface.co/transformers/training.html).
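When the fine-tuning task is itself language modeling (as with the lyrics model), a common preprocessing step is to concatenate the tokenized texts and cut them into fixed-length blocks. The sketch below illustrates that step in plain Python. It is not taken from this project's training code, and the function name `group_into_blocks` and the `block_size` value are hypothetical.

```python
def group_into_blocks(token_id_lists, block_size=8):
    """Concatenate tokenized documents and split them into equal-length blocks.

    Hypothetical helper for illustration; `token_id_lists` is a list of
    lists of token ids, e.g. produced by a tokenizer.
    """
    # Concatenate all tokenized documents into one long stream of ids.
    stream = [tok for ids in token_id_lists for tok in ids]
    # Drop the tail that does not fill a whole block.
    usable = (len(stream) // block_size) * block_size
    blocks = [stream[i:i + block_size] for i in range(0, usable, block_size)]
    # For causal language modeling the labels are a copy of the inputs;
    # the one-position shift is applied later, when the loss is computed.
    return {"input_ids": blocks, "labels": [list(block) for block in blocks]}


batch = group_into_blocks([[1, 2, 3], [4, 5, 6, 7]], block_size=3)
print(batch["input_ids"])  # [[1, 2, 3], [4, 5, 6]]
```

The trailing id `7` is discarded because it does not fill a complete block. A real setup would choose `block_size` to fit the model's context length and available memory.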
# Contributors
* Khalid Saifullah
* Tasmiah Tahsin Mayeesha
* Ritobrata Ghosh
* Ibrahim Musa
* M Saiful Bari
### BibTeX entry and citation info
Coming soon!
<!-- ```bibtex
@inproceedings{...,
year={2020}
}
``` -->