nicholasKluge committed
Commit: ab03a05
Parent(s): 0794ac2

Update README.md
README.md CHANGED

@@ -92,7 +92,7 @@ These are the main arguments used in the training of this model:
 | adam epsilon            | 0.00000001 |
 | weight decay            | 0.01       |
 | scheduler type          | "cosine"   |
-| warmup
+| warmup steps            | 50000      |
 | gradient checkpointing  | false      |
 | seed                    | 42         |
 | mixed precision         | 'no'       |

@@ -101,7 +101,7 @@ These are the main arguments used in the training of this model:
 
 ## Intended Uses
 
-The primary intended use of TeenyTinyLlama is research
+The primary intended use of TeenyTinyLlama is to research the behavior, functionality, and limitations of large language models. Checkpoints saved during training are intended to provide a controlled setting for performing scientific experiments. You may also further fine-tune and adapt TeenyTinyLlama-162m for deployment, as long as your use is in accordance with the Apache 2.0 license. If you decide to use pre-trained TeenyTinyLlama-162m as a basis for your fine-tuned model, please conduct your own risk and bias assessment.
 
 ## Basic usage
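The hyperparameters in the table above, including the warmup-steps value this commit adds, can be sketched in code. This is a minimal illustration, not the model's actual training script: the dictionary keys follow common `transformers` `TrainingArguments` naming conventions, and `total_steps` in the schedule function is an assumed placeholder (the README excerpt does not state the total number of training steps).

```python
import math

# Training hyperparameters from the table above (keys follow common
# transformers TrainingArguments naming; the values mirror the README).
training_config = {
    "adam_epsilon": 1e-8,            # adam epsilon: 0.00000001
    "weight_decay": 0.01,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 50_000,          # added in this commit
    "gradient_checkpointing": False,
    "seed": 42,
    "mixed_precision": "no",
}

def lr_scale(step, warmup_steps=50_000, total_steps=1_000_000):
    """Learning-rate multiplier: linear warmup followed by cosine decay.

    A sketch of the "cosine" scheduler with 50,000 warmup steps;
    total_steps is an assumption, not stated in the README excerpt.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to 1 over the warmup phase.
        return step / warmup_steps
    # Cosine decay from 1 down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these defaults, the multiplier rises linearly to 1.0 at step 50,000 and then decays along a half-cosine toward 0 at the final step.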