MartialTerran committed on
Commit
8711406
1 Parent(s): c12160e

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -2,8 +2,10 @@ I am not sure if anyone has ever demonstrated an 8-float embeddings, Two four-flo
 This result is reproducible. I have accomplished this in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications of the model.py).
 The model is believed to be capable of correctly reciting the entire Gettysburg Address. For time purposes, the training was limited to the beginning of that speech.
 This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes.
+
 model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth
 Size_on_disk: 1.08 MB (1,138,688 bytes)
+
 Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate based on my undisclosed modifications).
 
 Essential Hyperparameters (plus undisclosed modifications):
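As a rough sanity check on the ~9,526-parameter estimate, the sketch below counts parameters for a stock GPT-2 architecture with the stated hyperparameters. This is an assumption-laden estimate, not the author's method: the README's undisclosed model.py modifications are not reflected, and the `vocab_size=64` / `n_positions=128` values used in the example are hypothetical (the actual values are not given), so the total will differ from both estimates mentioned above.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_head, n_inner):
    """Count parameters of an unmodified GPT-2-style model.

    Assumes the standard GPT-2 layout: tied input/output embeddings,
    pre-norm transformer blocks, and biases on all linear layers.
    """
    # token + position embeddings (output head is tied to wte in GPT-2)
    emb = vocab_size * n_embd + n_positions * n_embd
    # per-block: two LayerNorms (weight + bias each)
    ln = 2 * (2 * n_embd)
    # fused QKV projection and attention output projection, with biases
    attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to n_inner, project back to n_embd, with biases
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)
    block = ln + attn + mlp
    # final LayerNorm after the last block
    ln_f = 2 * n_embd
    return emb + n_layer * block + ln_f

# Hypothetical small vocab/context for a character-level Gettysburg model:
total = gpt2_param_count(vocab_size=64, n_positions=128,
                         n_embd=8, n_layer=1, n_head=2, n_inner=128)
print(total)  # non-embedding part alone is 2,520 for this config
```

Note that `n_head` does not affect the count in standard GPT-2 (the heads partition the same projection matrices), so the gap between this formula and the 9,526 figure would come entirely from the embedding sizes and the undisclosed modifications.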