---
license: other
license_name: rwkv-decepticon
license_link: LICENSE
datasets:
- roneneldan/TinyStories
language:
- en
---

# Dataset

This model was trained on the TinyStories dataset, specifically the GPT-4 version.

# The Model

The name "Deception" comes from the model's architecture, which combines elements of both Transformer and RNN architectures. This fusion makes for a deceptive yet beneficial design.

The model has a context length of 1024 tokens; in theory, this can be extended indefinitely through fine-tuning.

Thank you to the creators of RWKV, who made all of this possible. Their repo is here: https://github.com/BlinkDL/RWKV-LM
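
As a rough illustration of the Transformer/RNN combination described under "The Model", here is a minimal sketch of an RWKV-style weighted key-value (WKV) mix for a single channel. It assumes the RWKV-4 formulation; this card does not state which RWKV version the model uses, and the function names and the scalar parameters `w` (decay) and `u` (current-token bonus) are illustrative rather than taken from the model's code. The same output can be computed by re-reading all earlier tokens, as attention does, or by carrying two running sums forward, as an RNN does. The second form keeps a fixed-size state, which is why the context length is not a hard architectural limit.

```python
import math

def wkv_parallel(k, v, w, u):
    """Attention-like form: every step re-reads all earlier tokens."""
    out = []
    for t in range(len(k)):
        # The current token gets the bonus u instead of a decay.
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens are down-weighted by their distance from t.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out

def wkv_recurrent(k, v, w, u):
    """RNN-like form: only two running sums (a, b) are carried forward."""
    a, b = 0.0, 0.0
    decay = math.exp(-w)
    out = []
    for kt, vt in zip(k, v):
        bonus = math.exp(u + kt)
        out.append((a + bonus * vt) / (b + bonus))
        # Decay the state, then fold in the current token.
        a = decay * a + math.exp(kt) * vt
        b = decay * b + math.exp(kt)
    return out

if __name__ == "__main__":
    # Toy inputs (hypothetical values, not taken from the model).
    keys = [0.1, -0.3, 0.5, 0.2]
    values = [1.0, 2.0, -1.0, 0.5]
    print(wkv_parallel(keys, values, w=0.5, u=0.3))
    print(wkv_recurrent(keys, values, w=0.5, u=0.3))
```

Both functions should print the same values up to floating-point rounding. This naive version exponentiates directly and can overflow for large keys, so it is a teaching sketch only, not a numerically stable implementation.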