whynlp committed
Commit 9d15e38 · verified · 1 Parent(s): 39e1675

Update README.md


add info about kv cache saving

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -66,6 +66,8 @@ print(response[0]["generated_text"])
 
 ## The LCKV Collection
 
+ The model has 2 warmup layers, i.e., it keeps 3/22 of the KV cache of a standard TinyLlama.
+
 This model was randomly initialized, then pre-trained on 100B tokens from [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B).
 
 The evaluation follows that of TinyLlama. Refer to [our paper](https://arxiv.org/abs/2405.10637) for more details.
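
For intuition, here is a back-of-the-envelope sketch of the memory saving behind that 3/22 figure. This is a minimal sketch, not this repo's code: the config numbers (22 layers, 4 KV heads via grouped-query attention, head dimension 64, fp16 cache) are assumed from the public TinyLlama-1.1B config, and the cached layers are taken to be the 2 warmup layers plus the single top layer whose KVs the remaining layers attend to, per the LCKV paper.

```python
# Back-of-the-envelope KV cache comparison: standard TinyLlama vs. LCKV
# with 2 warmup layers. All config numbers below are assumptions taken
# from the public TinyLlama-1.1B config, not from this repo.

N_LAYERS = 22     # total transformer layers
N_KV_HEADS = 4    # TinyLlama uses grouped-query attention
HEAD_DIM = 64     # hidden_size 2048 / 32 attention heads
BYTES = 2         # fp16/bf16 cache

def kv_cache_bytes(n_cached_layers: int, seq_len: int, batch: int = 1) -> int:
    """Bytes held in the KV cache for `n_cached_layers` layers."""
    # K and V each store a (batch, n_kv_heads, seq_len, head_dim) tensor.
    return 2 * n_cached_layers * batch * N_KV_HEADS * seq_len * HEAD_DIM * BYTES

seq_len = 2048
standard = kv_cache_bytes(N_LAYERS, seq_len)  # all 22 layers cache KV
# LCKV (assumed): only the 2 warmup layers + the top layer keep a KV cache.
lckv = kv_cache_bytes(2 + 1, seq_len)

print(f"standard: {standard / 2**20:.1f} MiB")
print(f"LCKV:     {lckv / 2**20:.1f} MiB ({lckv / standard:.3f} = 3/22 of standard)")
```

Under these assumptions, a 2048-token context needs about 6 MiB of KV cache instead of roughly 44 MiB, and the ratio stays 3/22 regardless of sequence length or batch size.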