davidkim205
committed on
Commit • bc3806e
1 Parent(s): 40bd979
Update README.md
README.md CHANGED
@@ -217,12 +217,6 @@ This method proposes a novel method for generating datasets for DPO (Self-superv
 Randomly selecting data from each category within the training dataset, we constructed a DPO (Direct Preference Optimization) dataset using sentences with logits lower than the mean within the model-generated sentences.
 * I'm sorry I can't reveal it.
 
-## Evaluation
-### [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-| **model** | **average** | **arc** | **hellaswag** | **mmlu** | **truthfulQA** | **winogrande** | **GSM8k** |
-| ------------- | ----------- | ------- | ------------- | -------- | -------------- | -------------- | --------- |
-| Rhea-72b-v0.5 | 81.22 | 79.78 | 91.15 | 77.95 | 74.5 | 87.85 | 76.12 |
-
 # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_davidkim205__Rhea-72b-v0.5)
 
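The context lines above describe the dataset recipe only at a high level (the author states the data itself is not released), so the sketch below is an illustrative reading, not the actual pipeline: it assumes "logits lower than the mean" means the mean per-token log-probability of each model-generated sentence, scored under the same model, and that below-mean generations become the `rejected` side of a DPO pair while a reference answer is `chosen`. The model name, scoring function, and field names are assumptions.

```python
# Hedged sketch of one possible DPO-pair construction; the real recipe is undisclosed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the actual base model is not revealed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()


def sequence_score(prompt: str, completion: str) -> float:
    """Mean token log-probability of `completion` given `prompt`.

    Tokenizing prompt and prompt+completion separately is a simplification:
    boundary tokens may differ slightly, which is acceptable for a sketch.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Logits at position i-1 predict token i, so slice out the completion span.
    comp_logits = logits[0, prompt_ids.shape[1] - 1 : -1]
    comp_ids = full_ids[0, prompt_ids.shape[1] :]
    log_probs = torch.log_softmax(comp_logits, dim=-1)
    token_log_probs = log_probs[torch.arange(comp_ids.shape[0]), comp_ids]
    return token_log_probs.mean().item()


def build_dpo_pairs(prompt: str, reference: str, generations: list[str]) -> list[dict]:
    """Keep generations scoring below the group mean as `rejected` responses."""
    scores = [sequence_score(prompt, g) for g in generations]
    mean_score = sum(scores) / len(scores)
    return [
        {"prompt": prompt, "chosen": reference, "rejected": g}
        for g, s in zip(generations, scores)
        if s < mean_score
    ]
```

The resulting records (`prompt` / `chosen` / `rejected`) match the triplet format commonly used for DPO training; category-wise random sampling of prompts, as mentioned in the README, would happen before calling `build_dpo_pairs`.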