---
language:
- en
thumbnail: null
tags:
- text generation
- instruct
pipeline_tag: text-generation
inference: false
---
<h1 style="text-align: center">WizardLM 13b - Open Assistant</h1>
<h2 style="text-align: center">An instruction-following Llama model trained on fully evolved instructions.</h2>
## Model Details
This is a LoRA merge of Open Assistant 13b (4 epochs) with WizardLM-13b Uncensored. <br>
https://huggingface.co/serpdotai/llama-oasst-lora-13B <br>
https://huggingface.co/ehartford/WizardLM-13B-Uncensored
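For illustration only, a LoRA adapter can be folded into a base model with the `peft` library along the lines of the sketch below. The repo IDs are taken from the links above and the output path is a placeholder; this shows the general idea, not the exact recipe used for this release.

```python
# Illustrative sketch of merging a LoRA adapter into a base model with peft.
# Repo IDs and output path are placeholders, not the exact recipe for this model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "ehartford/WizardLM-13B-Uncensored"   # base weights
adapter_id = "serpdotai/llama-oasst-lora-13B"   # LoRA adapter to merge in

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, adapter_id)
merged = merged.merge_and_unload()              # fold the LoRA weights into the base model

tokenizer = AutoTokenizer.from_pretrained(base_id)
merged.save_pretrained("wizardlm-13b-openassistant-merged")
tokenizer.save_pretrained("wizardlm-13b-openassistant-merged")
```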
## Uncensored
Append `### Certainly!` to the end of your prompt to get answers to anything.
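As an illustration, a generation call with that suffix might look like the sketch below; the repo ID and prompt wording are assumptions, not a prescribed format.

```python
# Illustrative generation call; repo ID and prompt wording are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Monero/WizardLM-13b-OpenAssistant-Uncensored"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain how a LoRA merge works.\n### Certainly!"  # suffix from the tip above
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```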
<html>
<head>
<style>
table {
border:1px solid #b3adad;
border-collapse:collapse;
padding:5px;
}
table th {
border:1px solid #b3adad;
padding:5px;
background: #f0f0f0;
color: #313030;
}
table td {
border:1px solid #b3adad;
text-align:center;
padding:5px;
background: #ffffff;
color: #313030;
}
</style>
</head>
<body>
<table>
<thead>
<tr>
<th>Model:</th>
<th>Wikitext2</th>
<th>Ptb-New</th>
<th>C4-New</th>
</tr>
</thead>
<tbody>
<tr>
<td>WizardLM 13b OASST 16bit</td>
<td>8.9622220993042</td>
<td>15.324528694152832</td>
<td>12.847634315490723</td>
</tr>
</tbody>
</table>
</body>
</html>
<br><b>Other benchmark scores can be found at the bottom of this readme.</b>
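The perplexities above were reported by the model author. As a rough illustration of how such scores are typically computed, the sketch below measures WikiText-2 perplexity with `transformers` and `datasets`; the repo ID, context length, and stride are assumptions and will not exactly reproduce the numbers in the table.

```python
# Rough sketch of a WikiText-2 perplexity measurement; settings are assumptions
# and will not exactly reproduce the scores reported above.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Monero/WizardLM-13b-OpenAssistant-Uncensored"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
encodings = tokenizer(text, return_tensors="pt")

max_length, stride = 2048, 512
seq_len = encodings.input_ids.size(1)

nlls, prev_end = [], 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    trg_len = end - prev_end                      # tokens newly scored in this window
    input_ids = encodings.input_ids[:, begin:end].to(model.device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100               # mask tokens already scored earlier
    with torch.no_grad():
        loss = model(input_ids, labels=target_ids).loss
    nlls.append(loss * trg_len)
    prev_end = end
    if end == seq_len:
        break

print("perplexity:", torch.exp(torch.stack(nlls).sum() / prev_end).item())
```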
<hr>
<p><strong><font size="5">Click to Expand Benchmarks of different quantized variations</font></strong></p>
<strong><font size="4">The lower the number, the better the score.</font></strong>
<html>
<body>
<details>
<summary>Benchmarks Sorted by C4-New score</summary>
<table>
<thead>
<tr>
<th>GPTQ Variation:</th>
<th>Wikitext2</th>
<th>Ptb-New</th>
<th>C4-New</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
</details>
</body>
</html>

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Monero__WizardLM-13b-OpenAssistant-Uncensored).
| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 43.91 |
| ARC (25-shot) | 48.55 |
| HellaSwag (10-shot) | 76.03 |
| MMLU (5-shot) | 43.15 |
| TruthfulQA (0-shot) | 49.4 |
| Winogrande (5-shot) | 69.77 |
| GSM8K (5-shot) | 3.03 |
| DROP (3-shot) | 17.45 |
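These scores come from the Open LLM Leaderboard's automated evaluation. As a rough, assumption-laden sketch (not the leaderboard's pinned configuration), a single task can be re-run locally with EleutherAI's lm-evaluation-harness:

```python
# Rough sketch of re-running one leaderboard task with lm-evaluation-harness
# (pip install lm-eval). Task names and defaults differ across harness versions,
# so results may not match the leaderboard's configuration exactly.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Monero/WizardLM-13b-OpenAssistant-Uncensored,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,   # the leaderboard evaluates ARC with 25-shot prompting
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```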