Commit c12160e by MartialTerran (parent: 72ab105): Update README.md

README.md (updated):
I am not sure anyone has ever demonstrated a one-megabyte GPT model with 8-float embeddings and two four-float attention heads (approximately 9,526 parameters) that produces coherent text (the Gettysburg Address) with correct punctuation in response to a one-word prompt.

This result is reproducible: I have achieved it in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications to model.py).
The model is believed to be capable of correctly reciting the entire Gettysburg Address; to save training time, training was limited to the beginning of the speech.
This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes:
model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth
Size on disk: 1.08 MB (1,138,688 bytes)
Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate, based on my undisclosed modifications).
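One way to check such an estimate is to count the tensors stored in the checkpoint itself. The following is a minimal sketch, not the author's script; it assumes the .pth file holds a plain state_dict, or a dict wrapping one under a common key such as 'model_state_dict', which is a convention rather than a guarantee.

```python
# Minimal sketch: sum the element counts of every tensor in the checkpoint.
# ASSUMPTION: the .pth file is a state_dict, or a dict containing one.
import torch

ckpt = torch.load(
    "model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth",
    map_location="cpu",
)

# Unwrap a nested state_dict if the checkpoint stores extra metadata.
for key in ("model_state_dict", "state_dict"):
    if isinstance(ckpt, dict) and key in ckpt:
        ckpt = ckpt[key]
        break

total = sum(t.numel() for t in ckpt.values() if torch.is_tensor(t))
print(f"tensors: {len(ckpt)}, total parameters: {total:,}")
```

Note that a raw parameter count will not match the 1,138,688 bytes on disk directly, since the file also carries pickling overhead and any optimizer state saved alongside the weights.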
Essential Hyperparameters (plus undisclosed modifications):
'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128
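For context, the parameter count of a stock (unmodified) GPT-2 stack with these hyperparameters can be worked out by hand, as in the sketch below. The vocabulary size and context length are not stated in this README, so the values used here are illustrative assumptions, and the undisclosed modifications may change the arithmetic.

```python
# Back-of-the-envelope parameter count for an UNMODIFIED GPT-2 stack.
# ASSUMPTION: vocab_size and n_positions below are illustrative guesses,
# not the author's actual settings.
def gpt2_param_count(vocab_size, n_positions, n_embd=8, n_layer=1, n_inner=128):
    embeddings = vocab_size * n_embd + n_positions * n_embd  # wte + wpe (lm_head tied to wte)
    per_layer = (
        2 * n_embd                           # ln_1 (weight + bias)
        + n_embd * 3 * n_embd + 3 * n_embd   # attn.c_attn (fused QKV projection)
        + n_embd * n_embd + n_embd           # attn.c_proj
        + 2 * n_embd                         # ln_2
        + n_embd * n_inner + n_inner         # mlp.c_fc
        + n_inner * n_embd + n_embd          # mlp.c_proj
    )
    return embeddings + n_layer * per_layer + 2 * n_embd  # + final ln_f

# e.g. a character-level vocabulary of 64 tokens and a 256-position context:
print(gpt2_param_count(vocab_size=64, n_positions=256))  # -> 5080
```

Note that 'n_head' does not enter the count at all (the heads just split n_embd), and the single transformer block contributes only 2,504 parameters; the rest is embeddings. Reaching roughly 9,526 parameters therefore implies a larger vocabulary, a longer context window, or the undisclosed modifications.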