fhswf/BPE_GPT2_TinyStoriesV2_cleaned_4096

text generation


BPE Tokenizer for TinyStoriesV2

Based on the GPT-Neo BPE tokenizer, but with a smaller vocabulary. Trained on TinyStoriesV2.

Vocab size: 4096 (256 + 1 + 3839)
256 base chars
1 extra token: <|endoftext|>
3839 merges
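
The breakdown above follows from how byte-level BPE builds its vocabulary: each learned merge adds exactly one new token on top of the base symbols and any extra tokens. The following is a minimal sketch of that idea, not the training script used for this tokenizer; the function name `train_bpe` and the sample corpus are hypothetical and only serve as illustration.

```python
from collections import Counter

def train_bpe(corpus: bytes, num_merges: int):
    """Toy byte-level BPE (hypothetical helper): learn up to `num_merges` merges."""
    # Base vocabulary: one token per possible byte value (256 entries).
    vocab = {i: bytes([i]) for i in range(256)}
    seq = list(corpus)  # token ids, initially the raw byte values
    merges = []
    for _ in range(num_merges):
        # Count adjacent token pairs and pick the most frequent one.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        new_id = len(vocab)            # each merge creates exactly one new token
        vocab[new_id] = vocab[a] + vocab[b]
        merges.append((a, b))
        # Replace every occurrence of the pair (a, b) with the new token id.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(new_id)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return vocab, merges

# Numbers taken from the model card.
BASE_CHARS = 256
SPECIAL_TOKENS = ["<|endoftext|>"]
NUM_MERGES = 3839
assert BASE_CHARS + len(SPECIAL_TOKENS) + NUM_MERGES == 4096

# Small demonstration on an assumed sample text with only 10 merges.
vocab, merges = train_bpe(b"once upon a time there was a tiny story " * 4, num_merges=10)
vocab_size = len(vocab) + len(SPECIAL_TOKENS)
print(vocab_size)
```

With 10 merges the toy vocabulary holds 256 base tokens, 10 merged tokens, and 1 extra token; scaling the merge count to 3839 gives the 4096 entries listed above.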

