MartialTerran committed 8711406 (parent: c12160e): Update README.md
I am not sure if anyone has ever demonstrated an 8-float embedding. Two four-flo…
This result is reproducible. I have achieved it in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications to model.py).
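The hyperparameters above pin down most of a standard GPT-2 parameter count. Below is a minimal sketch assuming an unmodified GPT-2 layout with tied input/output embeddings; the vocabulary size and context length are not stated, and the undisclosed modifications to model.py would change these totals:

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner):
    """Parameter count for a standard GPT-2 with tied lm_head/wte weights."""
    embeddings = vocab_size * n_embd + n_positions * n_embd  # wte + wpe
    per_layer = (
        2 * n_embd                          # ln_1 (weight + bias)
        + n_embd * 3 * n_embd + 3 * n_embd  # attention qkv projection
        + n_embd * n_embd + n_embd          # attention output projection
        + 2 * n_embd                        # ln_2
        + n_embd * n_inner + n_inner        # MLP up-projection
        + n_inner * n_embd + n_embd         # MLP down-projection
    )
    return embeddings + n_layer * per_layer + 2 * n_embd  # + final ln_f

# Non-embedding parameters for 'n_embd': 8, 'n_layer': 1, 'n_inner': 128:
print(gpt2_param_count(0, 0, 8, 1, 128))  # 2520
```

Note that 'n_head' does not affect the count (it only splits the 8-dim embedding into two 4-dim heads), so in a standard layout the rest of any total estimate would come from the vocabulary and position embeddings plus whatever the undisclosed modifications add.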
The model is believed to be capable of correctly reciting the entire Gettysburg Address. To save time, training was limited to the beginning of that speech.
This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes.
model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth
Size on disk: 1.08 MB (1,138,688 bytes)
Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate based on my undisclosed modifications).
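One way to settle the estimate is to count tensor elements directly in the saved .pth file. This is a sketch, assuming the checkpoint stores a plain state_dict; if it instead wraps model weights together with optimizer state (a common pattern, and one possible reason a file is larger than parameter count × 4 bytes for float32), the 'model' key used to unwrap it below is an assumption:

```python
import torch

def count_checkpoint_params(path):
    """Sum tensor element counts in a saved .pth checkpoint."""
    obj = torch.load(path, map_location="cpu")
    # Some training scripts save {'model': state_dict, 'optimizer': ...};
    # unwrap the model weights if so (the key name is an assumption).
    state = obj.get("model", obj) if isinstance(obj, dict) else obj
    return sum(t.numel() for t in state.values() if torch.is_tensor(t))
```

Pointing this at the checkpoint file listed above would give the ground-truth count for the modified model.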
Essential Hyperparameters (plus undisclosed modifications):