Commit c12160e by MartialTerran (parent: 72ab105): Update README.md

README.md (updated):
I am not sure anyone has ever demonstrated a one-megabyte GPT model with 8-float embeddings and two four-float attention heads (approximately 9,526 parameters) that produces coherent text (the Gettysburg Address) with correct punctuation in response to a one-word prompt.

This result is reproducible: I have achieved it in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications to model.py).
The model is believed to be capable of correctly reciting the entire Gettysburg Address; to save training time, training was limited to the beginning of the speech.
This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes:
model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth
Size on disk: 1.08 MB (1,138,688 bytes)
Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate, based on my undisclosed modifications).
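One way to check such an estimate is to count the tensors stored in the checkpoint itself. The following is a minimal sketch, not the author's script; it assumes the .pth file holds a plain state_dict, or a dict wrapping one under a common key such as 'model_state_dict', which is a convention rather than a guarantee.

```python
# Minimal sketch: sum the element counts of every tensor in the checkpoint.
# ASSUMPTION: the .pth file is a state_dict, or a dict containing one.
import torch

ckpt = torch.load(
    "model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth",
    map_location="cpu",
)

# Unwrap a nested state_dict if the checkpoint stores extra metadata.
for key in ("model_state_dict", "state_dict"):
    if isinstance(ckpt, dict) and key in ckpt:
        ckpt = ckpt[key]
        break

total = sum(t.numel() for t in ckpt.values() if torch.is_tensor(t))
print(f"tensors: {len(ckpt)}, total parameters: {total:,}")
```

Note that a raw parameter count will not match the 1,138,688 bytes on disk directly, since the file also carries pickling overhead and any optimizer state saved alongside the weights.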
Essential Hyperparameters (plus undisclosed modifications):
'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128
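For context, the parameter count of a stock (unmodified) GPT-2 stack with these hyperparameters can be worked out by hand, as in the sketch below. The vocabulary size and context length are not stated in this README, so the values used here are illustrative assumptions, and the undisclosed modifications may change the arithmetic.

```python
# Back-of-the-envelope parameter count for an UNMODIFIED GPT-2 stack.
# ASSUMPTION: vocab_size and n_positions below are illustrative guesses,
# not the author's actual settings.
def gpt2_param_count(vocab_size, n_positions, n_embd=8, n_layer=1, n_inner=128):
    embeddings = vocab_size * n_embd + n_positions * n_embd  # wte + wpe (lm_head tied to wte)
    per_layer = (
        2 * n_embd                           # ln_1 (weight + bias)
        + n_embd * 3 * n_embd + 3 * n_embd   # attn.c_attn (fused QKV projection)
        + n_embd * n_embd + n_embd           # attn.c_proj
        + 2 * n_embd                         # ln_2
        + n_embd * n_inner + n_inner         # mlp.c_fc
        + n_inner * n_embd + n_embd          # mlp.c_proj
    )
    return embeddings + n_layer * per_layer + 2 * n_embd  # + final ln_f

# e.g. a character-level vocabulary of 64 tokens and a 256-position context:
print(gpt2_param_count(vocab_size=64, n_positions=256))  # -> 5080
```

Note that 'n_head' does not enter the count at all (the heads just split n_embd), and the single transformer block contributes only 2,504 parameters; the rest is embeddings. Reaching roughly 9,526 parameters therefore implies a larger vocabulary, a longer context window, or the undisclosed modifications.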