Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ language:
 This model was trained using the TinyStories dataset, specifically with the GPT-4 version.
 
 # The Model
-The name "
+The name "Decepticon" stems from the model's unique architecture, which combines elements of both Transformer and RNN architectures. This fusion creates a deceptive yet beneficial design.
 
 The model features a context length of 1024, but in theory, it can be extended indefinitely through fine-tuning.
 