MartialTerran committed
Commit c12160e
1 Parent(s): 72ab105

Update README.md

I am not sure if anyone has ever demonstrated an 8-float-embedding, two four-float-attention-head (approximately 9,526-parameter), one-megabyte GPT model that produced coherent text (the Gettysburg Address) with correct punctuation in response to a one-word prompt.

This result is reproducible. I have accomplished this in practically all training runs with 'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128 (with some undisclosed modifications of the model.py).

The model is believed to be capable of correctly reciting the entire Gettysburg Address. To save time, training was limited to the beginning of that speech.

This last training run was repeated to generate a PyTorch parameter checkpoint on disk, for parameter-measurement purposes:

model_checkpoint_epoch_30000_Nano_Gettysburg_GPT2_v1.5_loss_DisplayPerBatch.py_2024-11-26_00-48-36.pth

Size on disk: 1.08 MB (1,138,688 bytes)

Google Gemini estimates that this nano LLM has about 9,526 parameters (I have a different estimate based on my undisclosed modifications).

Essential hyperparameters (plus undisclosed modifications):

'n_embd': 8, 'n_layer': 1, 'n_head': 2, 'n_inner': 128