Update README.md
README.md
CHANGED
@@ -1,12 +1,4 @@
-CAMEL-13B-Role-Playing-Data is a chat large language model obtained by finetuning LLaMA-13B model on a total of 229K conversations created through our role-playing framework proposed in [CAMEL](https://arxiv.org/abs/2303.17760). We evaluate our model offline using EleutherAI's language model evaluation harness used by Huggingface's Open LLM Benchmark. CAMEL-13B scores an average of
-
-| Model | size | ARC-C (25 shots, acc_norm) | HellaSwag (10 shots, acc_norm) | MMLU (5 shots, acc_norm) | TruthfulQA (0 shot, mc2) | Average | Delta |
-|-------------|:----:|:---------------------------:|:-------------------------------:|:-------------------------:|:-------------------------:|:-------:|-------|
-| LLaMA | 13B | 50.8 | 78.9 | 37.7 | 39.9 | 51.8 | - |
-| Vicuna | 13B | 47.4 | 75.2 | 39.6 | 49.8 | 53.7 | 1.9 |
-| CAMEL | 13B | 54.9 | 79.3 | 48.5 | 46.2 | **57.2** | 5.4 |
-| LLaMA | 30B | 57.1 | 82.6 | 45.7 | 42.3 | 56.9 | 5.1 |
-
+CAMEL-13B-Role-Playing-Data is a chat large language model obtained by finetuning LLaMA-13B model on a total of 229K conversations created through our role-playing framework proposed in [CAMEL](https://arxiv.org/abs/2303.17760). We evaluate our model offline using EleutherAI's language model evaluation harness used by Huggingface's Open LLM Benchmark. CAMEL-13B scores an average of 57.2.
 ---
 license: cc-by-nc-4.0
 ---