---
license: mit
language:
- ru
metrics:
- perplexity
pipeline_tag: text-generation
---

This model was created by [ilnikolaev](https://huggingface.co/ilnikolaev).

It was trained from scratch with TensorFlow Keras on the [200mb Russian Comments from 2ch](https://www.kaggle.com/datasets/fizzzgen/65mb-of-dvach-conversations) dataset.

- Type: decoder-only
- Tokenizer: BPE
- Vocabulary size: 32000
- Max sequence length: 120
- Hidden size: 768
- FFN size: 3072
- Attention heads: 24
- Decoder layers: 4
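
The hyperparameters listed above are enough for a back-of-the-envelope check of the per-head dimension and the approximate model size. The sketch below is not the author's code. It assumes a conventional GPT-style layout that the card does not confirm: learned token and positional embeddings, an output projection tied to the token embeddings, biases on every dense layer, and two layer norms per decoder layer plus a final one. The function names `head_dim` and `estimate_params` are hypothetical.

```python
# Hyperparameters copied from the model card.
VOCAB_SIZE = 32000
MAX_SEQ_LEN = 120
HIDDEN = 768
FFN = 3072
HEADS = 24
LAYERS = 4


def head_dim(hidden: int, heads: int) -> int:
    """Per-head width; the hidden size must split evenly across heads."""
    if hidden % heads != 0:
        raise ValueError("hidden size must be divisible by the number of heads")
    return hidden // heads


def estimate_params(vocab: int, max_len: int, hidden: int,
                    ffn: int, layers: int) -> int:
    """Rough parameter count under the assumptions stated above."""
    # Assumption: learned token and positional embeddings, tied output layer.
    embeddings = vocab * hidden + max_len * hidden
    # Assumption: Q, K, V and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Assumption: two dense layers (hidden -> ffn -> hidden), each with a bias.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Assumption: two layer norms per decoder layer, each with scale and bias.
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms
    # Assumption: one final layer norm after the last decoder layer.
    final_norm = 2 * hidden
    return embeddings + layers * per_layer + final_norm


if __name__ == "__main__":
    print("head dim:", head_dim(HIDDEN, HEADS))
    total = estimate_params(VOCAB_SIZE, MAX_SEQ_LEN, HIDDEN, FFN, LAYERS)
    print(f"estimated parameters: {total:,}")
```

The real count may differ if the model uses untied output weights, sinusoidal positions, or bias-free layers, so treat the result as an order-of-magnitude estimate only.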