yuchenglu committed on
Commit 62502aa
1 Parent(s): 65e7cb3

Update README.md: add PG19 evaluation results

Files changed (1)
  1. README.md +12 -0
README.md CHANGED
@@ -80,6 +80,18 @@ Their personalities, so diverse,
  Their charm, a gift, that's forever told.
  ```
 
+ ## Model Evaluation
+
+ We evaluate the model on the [PG19 dataset](https://huggingface.co/datasets/pg19) and compare its perplexity with that of [Llama-2-7b-chat](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
+ The results are summarized below (the perplexity is normalized following the protocol described [here](https://together.ai/blog/llama-2-7b-32k)).
+
+ | Model | 2K Seq | 4K Seq | 8K Seq | 16K Seq | 32K Seq |
+ | -------- | ------- | ------- | ------- | ------- | ------- |
+ | LLaMA-2-7B-Chat (Meta) | 1.844 | 1.833 | N/A | N/A | N/A |
+ | LLaMA-2-7B-32K-Chat (ours) | 1.813 | 1.798 | 1.781 | 1.778 | 1.772 |
+
+ LLaMA-2-7B-32K-Chat achieves slightly lower perplexity than the original LLaMA-2-7B-Chat model at 2K and 4K sequence lengths, and its perplexity continues to decrease as the sequence length grows to 32K.
+
  ## Limitations and Bias
 
  As with all language models, LLaMA-2-7B-32K-Chat may generate incorrect or biased content. It's important to keep this in mind when using the model.
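
As a rough illustration of the perplexity comparison in the Model Evaluation table above, the sketch below computes plain perplexity (the exponential of the mean negative log-likelihood per token) at several sequence lengths. It is an assumption-laden toy, not the evaluation script used for the README: the function names, the `max_context` parameter, and the log-probabilities are hypothetical, and the normalization protocol linked from the README is not reproduced here. Returning `None` for lengths beyond the context window is likewise only an assumed reading of the `N/A` entries.

```python
import math
from typing import Dict, Optional, Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Plain perplexity: exp of the mean negative log-likelihood per token.

    Assumes natural-log probabilities, one per predicted token.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def perplexity_by_length(
    token_logprobs: Sequence[float],
    lengths: Sequence[int],
    max_context: int,
) -> Dict[int, Optional[float]]:
    """Perplexity over the first n tokens for each n in `lengths`.

    Lengths that exceed the (hypothetical) context window or the
    document itself are reported as None.
    """
    results: Dict[int, Optional[float]] = {}
    for n in lengths:
        if n > max_context or n > len(token_logprobs):
            results[n] = None
        else:
            results[n] = perplexity(token_logprobs[:n])
    return results


# Hypothetical data: every token is assigned probability 0.5.
toy_logprobs = [math.log(0.5)] * 8
report = perplexity_by_length(toy_logprobs, lengths=[2, 4, 8], max_context=4)
print(report)
```

In a real evaluation the log-probabilities would come from the model's output over PG19 documents, and the reported figures would additionally depend on the normalization step, which this sketch leaves out.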