nicholasKluge committed on
Commit ab03a05
1 Parent(s): 0794ac2

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -92,7 +92,7 @@ These are the main arguments used in the training of this model:
  | adam epsilon | 0.00000001 |
  | weight decay | 0.01 |
  | scheduler type | "cosine" |
- | warmup ratio | 0.01 |
+ | warmup steps | 50000 |
  | gradient checkpointing | false |
  | seed | 42 |
  | mixed precision | 'no' |
@@ -101,7 +101,7 @@ These are the main arguments used in the training of this model:
 
  ## Intended Uses
 
- The primary intended use of TeenyTinyLlama is research on the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-162m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-162 as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
+ The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-162m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-162 as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 
  ## Basic usage
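The commit replaces the "warmup ratio" entry with an explicit "warmup steps | 50000" alongside the existing "cosine" scheduler type. A minimal sketch of what that schedule computes, assuming a standard linear-warmup-then-cosine-decay shape; `total_steps` and `base_lr` are illustrative assumptions, not values from the table:

```python
import math

def cosine_schedule_lr(step, warmup_steps=50000, total_steps=400000, base_lr=6e-4):
    # Linear warmup over the first `warmup_steps` (the value from the diff),
    # then cosine decay to zero, matching the "cosine" scheduler type entry.
    # total_steps and base_lr are hypothetical, chosen only for illustration.
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With fixed warmup steps (rather than a ratio), the warmup length no longer changes if the total training length does, which is one common reason for switching between the two parameterizations.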
107