MartialTerran committed 8711406 (parent: c12160e): Update README.md
I am not sure if anyone has ever demonstrated an 8-float embedding. Two four-flo…
This result is reproducible. I have achieved it in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications to model.py).
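The hyperparameters above pin down most of a standard GPT-2 parameter count. Below is a minimal sketch assuming an unmodified GPT-2 layout with tied input/output embeddings; the vocabulary size and context length are not stated, and the undisclosed modifications to model.py would change these totals:

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner):
    """Parameter count for a standard GPT-2 with tied lm_head/wte weights."""
    embeddings = vocab_size * n_embd + n_positions * n_embd  # wte + wpe
    per_layer = (
        2 * n_embd                          # ln_1 (weight + bias)
        + n_embd * 3 * n_embd + 3 * n_embd  # attention qkv projection
        + n_embd * n_embd + n_embd          # attention output projection
        + 2 * n_embd                        # ln_2
        + n_embd * n_inner + n_inner        # MLP up-projection
        + n_inner * n_embd + n_embd         # MLP down-projection
    )
    return embeddings + n_layer * per_layer + 2 * n_embd  # + final ln_f

# Non-embedding parameters for 'n_embd': 8, 'n_layer': 1, 'n_inner': 128:
print(gpt2_param_count(0, 0, 8, 1, 128))  # 2520
```

Note that 'n_head' does not affect the count (it only splits the 8-dim embedding into two 4-dim heads), so in a standard layout the rest of any total estimate would come from the vocabulary and position embeddings plus whatever the undisclosed modifications add.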
The model is believed to be capable of correctly reciting the entire Gettysburg Address. To save time, training was limited to the beginning of that speech.
This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes.
model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth
Size on disk: 1.08 MB (1,138,688 bytes)
Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate based on my undisclosed modifications).
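One way to settle the estimate is to count tensor elements directly in the saved .pth file. This is a sketch, assuming the checkpoint stores a plain state_dict; if it instead wraps model weights together with optimizer state (a common pattern, and one possible reason a file is larger than parameter count × 4 bytes for float32), the 'model' key used to unwrap it below is an assumption:

```python
import torch

def count_checkpoint_params(path):
    """Sum tensor element counts in a saved .pth checkpoint."""
    obj = torch.load(path, map_location="cpu")
    # Some training scripts save {'model': state_dict, 'optimizer': ...};
    # unwrap the model weights if so (the key name is an assumption).
    state = obj.get("model", obj) if isinstance(obj, dict) else obj
    return sum(t.numel() for t in state.values() if torch.is_tensor(t))
```

Pointing this at the checkpoint file listed above would give the ground-truth count for the modified model.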
Essential Hyperparameters (plus undisclosed modifications):