Tags: Text Generation · Transformers · PyTorch · longllama · code · text-generation-inference · custom_code · Eval Results
syzymon committed
Commit 3dea16c
1 Parent(s): 81abcb1

Update README.md

Files changed (1):
  1. README.md +2 -1
README.md CHANGED
@@ -129,7 +129,8 @@ with three layers used for context extension. **Crucially, LongLLaMA is able to
 |----------------|----------|----------|-----------|
 | Source model | [OpenLLaMA-3B](https://huggingface.co/openlm-research/open_llama_3b_easylm) | [OpenLLaMA-3Bv2](https://huggingface.co/openlm-research/open_llama_3b_v2_easylm) | [CodeLLaMA-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) |
 | Source model tokens | 1T | 1 T | 2T + 0.5 T |
-| Fine-tuning tokens | 10B | 5B | 35B | - |
+| Fine-tuning context | 8K | 32K | 32K |
+| Fine-tuning tokens | 10B | 5B | 35B |
 | Memory layers | 6, 12, 18 | 6, 12, 18 | 8, 16, 24 |
 
 </div>
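For reference, the per-variant settings in the table above can be captured alongside a standard `transformers` loading call. This is a minimal sketch, not the repo's official usage snippet: the variant names and the checkpoint id `syzymon/long_llama_3b` are assumptions based on the committer's namespace, while `trust_remote_code=True` follows from the repo's `custom_code` tag.

```python
# Per-variant settings taken from the table above.
# Variant names are assumed labels for the three table columns.
VARIANTS = {
    "LongLLaMA-3B":      {"memory_layers": (6, 12, 18), "fine_tuning_context": 8 * 1024},
    "LongLLaMA-3Bv1.1":  {"memory_layers": (6, 12, 18), "fine_tuning_context": 32 * 1024},
    "LongLLaMA-Code-7B": {"memory_layers": (8, 16, 24), "fine_tuning_context": 32 * 1024},
}


def load_long_llama(repo_id: str = "syzymon/long_llama_3b"):
    """Download and instantiate a LongLLaMA checkpoint (network required).

    The default repo_id is an assumption; replace it with the checkpoint
    you want. trust_remote_code=True is needed because the model ships
    custom modeling code (the repo is tagged custom_code).
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float32,
        trust_remote_code=True,  # custom LongLLaMA memory-attention layers
    )
    return tokenizer, model
```

The download is wrapped in a function so the table data can be inspected without fetching any weights.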