jono1234/RWKV-Decepticon


jono1234 committed on Oct 3, 2023

Commit 5f1fa3d • 1 Parent(s): 4e3f2a5

Update README.md

Files changed (1)

README.md +15 -1

@@ -6,4 +6,18 @@ datasets:
 - roneneldan/TinyStories
 language:
 - en
----
+---
+# Dataset
+This model was trained on the GPT-4 version of the TinyStories dataset.
+# The Model
+The name "Decepticon" stems from the model's unique architecture, which combines elements of both Transformer and RNN architectures. This fusion creates a deceptive yet beneficial design.
+The model has a context length of 1024 tokens, but in theory this can be extended indefinitely through fine-tuning.
+Thank you to the creators of RWKV who made all of this possible. Their repo is here: https://github.com/BlinkDL/RWKV-LM
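
The README's remark that the design mixes Transformer and RNN ideas, and that the context is in theory unbounded, can be illustrated with a toy version of the RWKV-style weighted key-value ("WKV") mixing step. The sketch below is not taken from this repository or from RWKV-LM: it is a simplified, single-channel, scalar version without the numerical stabilization a real implementation needs, and the names `wkv_recurrent`, `wkv_parallel`, `w` (decay), and `u` (current-token bonus) are assumptions chosen for illustration. It shows that the same output can be computed by looking back over all earlier tokens (attention-like) or by carrying a fixed-size state forward (RNN-like).

```python
import math

def wkv_recurrent(keys, values, w, u):
    """RNN-style evaluation: a fixed-size state (a, b) is carried across steps."""
    a, b = 0.0, 0.0  # running weighted sum of values, running sum of weights
    out = []
    for k, v in zip(keys, values):
        bonus = math.exp(u + k)      # extra weight given to the current token
        out.append((a + bonus * v) / (b + bonus))
        decay = math.exp(-w)         # older tokens fade by exp(-w) per step
        a = decay * a + math.exp(k) * v
        b = decay * b + math.exp(k)
    return out

def wkv_parallel(keys, values, w, u):
    """Attention-style evaluation: each step sums over all earlier tokens."""
    out = []
    for t in range(len(keys)):
        num = math.exp(u + keys[t]) * values[t]
        den = math.exp(u + keys[t])
        for i in range(t):
            weight = math.exp(-(t - 1 - i) * w + keys[i])
            num += weight * values[i]
            den += weight
        out.append(num / den)
    return out

# Hypothetical toy inputs, chosen only to exercise both functions.
keys = [0.1, -0.3, 0.5, 0.0]
values = [1.0, 2.0, -1.0, 0.5]
recurrent = wkv_recurrent(keys, values, w=0.5, u=0.2)
parallel = wkv_parallel(keys, values, w=0.5, u=0.2)
```

Because the recurrent form only keeps `a` and `b`, its memory use does not grow with sequence length, which is the sense in which the context could be extended; how well a trained model actually uses longer contexts is a separate question that this sketch does not address.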