---
license: apache-2.0
datasets:
- BAAI/Infinity-Instruct
tags:
- axolotl
- NousResearch/Hermes-2-Pro-Mistral-7B
- finetune
- gguf
---
# Hermes 2 Pro Mistral-7B Infinity-Instruct GGUF
This model is a fine-tuned version of [NousResearch/Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B) on the [BAAI/Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct) dataset.
You can find the main model page [here](https://huggingface.co/juvi21/Hermes-2-Pro-Mistral-7B-infinity).
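Since this repository ships GGUF quantizations, a minimal way to fetch and load one with `llama-cpp-python` is sketched below. The repo id and filename are placeholders; substitute the actual quantization file from this repository's file list.

```python
# Minimal loading sketch. The repo_id and filename below are placeholders;
# substitute the actual GGUF file from this repository.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="juvi21/Hermes-2-Pro-Mistral-7B-infinity-GGUF",   # placeholder
    filename="hermes-2-pro-mistral-7b-infinity.Q4_K_M.gguf",  # placeholder
)

# 8192-token context window, matching the training sequence length listed below.
llm = Llama(model_path=model_path, n_ctx=8192)
```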
## Model Details
- **Base Model:** NousResearch/Hermes-2-Pro-Mistral-7B
- **Dataset:** BAAI/Infinity-Instruct (see the loading sketch after this list)
- **Sequence Length:** 8192 tokens
- **Training:**
- **Epochs:** 1
- **Hardware:** 4 Nodes x 4 NVIDIA A100 40GB GPUs
- **Duration:** 26:56:43
- **Cluster:** KIT SCC Cluster
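A minimal sketch of how these details fit together: loading the dataset and checking sample lengths against the 8192-token cutoff with the base model's tokenizer. The `"0625"` subset name and the ShareGPT-style `conversations` field are assumptions about the dataset, not details from this card.

```python
# Hedged sketch: inspect Infinity-Instruct sample lengths against the 8192-token
# training sequence length. The subset name ("0625") and the "conversations"
# field layout are assumptions, not taken from this card.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NousResearch/Hermes-2-Pro-Mistral-7B")
dataset = load_dataset("BAAI/Infinity-Instruct", "0625", split="train")

def token_length(example):
    # Concatenate every turn of one conversation and count tokens.
    text = "\n".join(turn["value"] for turn in example["conversations"])
    return len(tokenizer(text).input_ids)

lengths = [token_length(dataset[i]) for i in range(100)]
print(f"longest of first 100 samples: {max(lengths)} tokens (training cutoff: 8192)")
```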
## Benchmark Results (n_shots=0)
![Benchmark Results](https://cdn-uploads.huggingface.co/production/uploads/659c4ecb413a1376bee2f661/gzwCfT8HTBRpRAzj2mN67.png)
| Benchmark | Score |
|-----------|-------|
| ARC (Challenge) | 52.47% |
| ARC (Easy) | 81.65% |
| BoolQ | 87.22% |
| HellaSwag | 60.52% |
| OpenBookQA | 33.60% |
| PIQA | 81.12% |
| Winogrande | 72.22% |
| AGIEval | 38.46% |
| TruthfulQA | 44.22% |
| MMLU | 59.72% |
| IFEval | 47.96% |
For detailed benchmark results, including sub-categories and various metrics, please refer to the [full benchmark table](#full-benchmark-results) at the end of this README.
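The layout of the [full benchmark table](#full-benchmark-results) matches lm-evaluation-harness output, so a zero-shot run could be reproduced roughly as sketched below. The harness version, exact task names, and batch size are assumptions rather than details taken from this card.

```python
# Hedged reproduction sketch via the lm-evaluation-harness Python API.
# Task names and batch size are assumptions; the card does not state the exact
# invocation used to produce the tables in this README.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=juvi21/Hermes-2-Pro-Mistral-7B-infinity,dtype=bfloat16",
    tasks=[
        "arc_challenge", "arc_easy", "boolq", "hellaswag", "openbookqa",
        "piqa", "winogrande", "agieval", "truthfulqa", "mmlu", "ifeval",
    ],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"]["mmlu"])
```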
## License
This model is released under the Apache 2.0 license.
## Prompt Format (ChatML)
The model uses the ChatML prompt format:
```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
Knock Knock, who is there?<|im_end|>
<|im_start|>assistant
Hi there! <|im_end|>
```
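A short sketch of this format in practice with `llama-cpp-python`; the GGUF path is a placeholder and the system prompt is only an example:

```python
# Sketch of the ChatML format above in practice (the GGUF path is a placeholder).
from llama_cpp import Llama

llm = Llama(model_path="path/to/hermes-2-pro-mistral-7b-infinity.Q4_K_M.gguf", n_ctx=8192)

prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Knock Knock, who is there?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Stop at the end-of-turn token so generation ends with the assistant's reply.
output = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(output["choices"][0]["text"])
```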
## Acknowledgements
Special thanks to:
- NousResearch for their excellent base model
- BAAI for providing the Infinity-Instruct dataset
- KIT SCC for providing the compute (FLOPS)
## Citation
If you use this model in your research, consider citing it. In any case, definitely cite NousResearch and BAAI:
```bibtex
@misc{hermes2pro-mistral-7b-infinity,
author = {juvi21},
title = {Hermes 2 Pro Mistral-7B Infinity-Instruct},
year = {2024},
}
```
## Full Benchmark Results
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|---------------------------------------|-------|------|-----:|-----------------------|---|------:|---|------|
|agieval |N/A |none | 0|acc |↑ | 0.3846|± |0.0051|
| | |none | 0|acc_norm |↑ | 0.4186|± |0.0056|
| - agieval_aqua_rat | 1|none | 0|acc |↑ | 0.2520|± |0.0273|
| | |none | 0|acc_norm |↑ | 0.2323|± |0.0265|
| - agieval_gaokao_biology | 1|none | 0|acc |↑ | 0.2952|± |0.0316|
| | |none | 0|acc_norm |↑ | 0.3381|± |0.0327|
| - agieval_gaokao_chemistry | 1|none | 0|acc |↑ | 0.2560|± |0.0304|
| | |none | 0|acc_norm |↑ | 0.2850|± |0.0315|
| - agieval_gaokao_chinese | 1|none | 0|acc |↑ | 0.2317|± |0.0270|
| | |none | 0|acc_norm |↑ | 0.2236|± |0.0266|
| - agieval_gaokao_english | 1|none | 0|acc |↑ | 0.6667|± |0.0270|
| | |none | 0|acc_norm |↑ | 0.6863|± |0.0266|
| - agieval_gaokao_geography | 1|none | 0|acc |↑ | 0.3869|± |0.0346|
| | |none | 0|acc_norm |↑ | 0.4020|± |0.0348|
| - agieval_gaokao_history | 1|none | 0|acc |↑ | 0.4468|± |0.0325|
| | |none | 0|acc_norm |↑ | 0.3957|± |0.0320|
| - agieval_gaokao_mathcloze | 1|none | 0|acc |↑ | 0.0254|± |0.0146|
| - agieval_gaokao_mathqa | 1|none | 0|acc |↑ | 0.2507|± |0.0232|
| | |none | 0|acc_norm |↑ | 0.2621|± |0.0235|
| - agieval_gaokao_physics | 1|none | 0|acc |↑ | 0.2900|± |0.0322|
| | |none | 0|acc_norm |↑ | 0.3100|± |0.0328|
| - agieval_jec_qa_ca | 1|none | 0|acc |↑ | 0.4735|± |0.0158|
| | |none | 0|acc_norm |↑ | 0.4695|± |0.0158|
| - agieval_jec_qa_kd | 1|none | 0|acc |↑ | 0.5290|± |0.0158|
| | |none | 0|acc_norm |↑ | 0.5140|± |0.0158|
| - agieval_logiqa_en | 1|none | 0|acc |↑ | 0.3579|± |0.0188|
| | |none | 0|acc_norm |↑ | 0.3779|± |0.0190|
| - agieval_logiqa_zh | 1|none | 0|acc |↑ | 0.3103|± |0.0181|
| | |none | 0|acc_norm |↑ | 0.3318|± |0.0185|
| - agieval_lsat_ar | 1|none | 0|acc |↑ | 0.2217|± |0.0275|
| | |none | 0|acc_norm |↑ | 0.2217|± |0.0275|
| - agieval_lsat_lr | 1|none | 0|acc |↑ | 0.5333|± |0.0221|
| | |none | 0|acc_norm |↑ | 0.5098|± |0.0222|
| - agieval_lsat_rc | 1|none | 0|acc |↑ | 0.5948|± |0.0300|
| | |none | 0|acc_norm |↑ | 0.5353|± |0.0305|
| - agieval_math | 1|none | 0|acc |↑ | 0.1520|± |0.0114|
| - agieval_sat_en | 1|none | 0|acc |↑ | 0.7864|± |0.0286|
| | |none | 0|acc_norm |↑ | 0.7621|± |0.0297|
| - agieval_sat_en_without_passage | 1|none | 0|acc |↑ | 0.4660|± |0.0348|
| | |none | 0|acc_norm |↑ | 0.4272|± |0.0345|
| - agieval_sat_math | 1|none | 0|acc |↑ | 0.3591|± |0.0324|
| | |none | 0|acc_norm |↑ | 0.3045|± |0.0311|
|arc_challenge | 1|none | 0|acc |↑ | 0.5247|± |0.0146|
| | |none | 0|acc_norm |↑ | 0.5538|± |0.0145|
|arc_easy | 1|none | 0|acc |↑ | 0.8165|± |0.0079|
| | |none | 0|acc_norm |↑ | 0.7934|± |0.0083|
|boolq | 2|none | 0|acc |↑ | 0.8722|± |0.0058|
|hellaswag | 1|none | 0|acc |↑ | 0.6052|± |0.0049|
| | |none | 0|acc_norm |↑ | 0.7941|± |0.0040|
|ifeval | 2|none | 0|inst_level_loose_acc |↑ | 0.5132|± |N/A |
| | |none | 0|inst_level_strict_acc |↑ | 0.4796|± |N/A |
| | |none | 0|prompt_level_loose_acc |↑ | 0.4122|± |0.0212|
| | |none | 0|prompt_level_strict_acc|↑ | 0.3734|± |0.0208|
|mmlu |N/A |none | 0|acc |↑ | 0.5972|± |0.0039|
| - abstract_algebra | 0|none | 0|acc |↑ | 0.3100|± |0.0465|
| - anatomy | 0|none | 0|acc |↑ | 0.5852|± |0.0426|
| - astronomy | 0|none | 0|acc |↑ | 0.6447|± |0.0389|
| - business_ethics | 0|none | 0|acc |↑ | 0.5800|± |0.0496|
| - clinical_knowledge | 0|none | 0|acc |↑ | 0.6830|± |0.0286|
| - college_biology | 0|none | 0|acc |↑ | 0.7153|± |0.0377|
| - college_chemistry | 0|none | 0|acc |↑ | 0.4500|± |0.0500|
| - college_computer_science | 0|none | 0|acc |↑ | 0.4900|± |0.0502|
| - college_mathematics | 0|none | 0|acc |↑ | 0.3100|± |0.0465|
| - college_medicine | 0|none | 0|acc |↑ | 0.6069|± |0.0372|
| - college_physics | 0|none | 0|acc |↑ | 0.4020|± |0.0488|
| - computer_security | 0|none | 0|acc |↑ | 0.7200|± |0.0451|
| - conceptual_physics | 0|none | 0|acc |↑ | 0.5234|± |0.0327|
| - econometrics | 0|none | 0|acc |↑ | 0.4123|± |0.0463|
| - electrical_engineering | 0|none | 0|acc |↑ | 0.4759|± |0.0416|
| - elementary_mathematics | 0|none | 0|acc |↑ | 0.4180|± |0.0254|
| - formal_logic | 0|none | 0|acc |↑ | 0.4286|± |0.0443|
| - global_facts | 0|none | 0|acc |↑ | 0.3400|± |0.0476|
| - high_school_biology | 0|none | 0|acc |↑ | 0.7419|± |0.0249|
| - high_school_chemistry | 0|none | 0|acc |↑ | 0.4631|± |0.0351|
| - high_school_computer_science | 0|none | 0|acc |↑ | 0.6300|± |0.0485|
| - high_school_european_history | 0|none | 0|acc |↑ | 0.7394|± |0.0343|
| - high_school_geography | 0|none | 0|acc |↑ | 0.7323|± |0.0315|
| - high_school_government_and_politics| 0|none | 0|acc |↑ | 0.8238|± |0.0275|
| - high_school_macroeconomics | 0|none | 0|acc |↑ | 0.6308|± |0.0245|
| - high_school_mathematics | 0|none | 0|acc |↑ | 0.3333|± |0.0287|
| - high_school_microeconomics | 0|none | 0|acc |↑ | 0.6387|± |0.0312|
| - high_school_physics | 0|none | 0|acc |↑ | 0.2914|± |0.0371|
| - high_school_psychology | 0|none | 0|acc |↑ | 0.8128|± |0.0167|
| - high_school_statistics | 0|none | 0|acc |↑ | 0.4907|± |0.0341|
| - high_school_us_history | 0|none | 0|acc |↑ | 0.8186|± |0.0270|
| - high_school_world_history | 0|none | 0|acc |↑ | 0.8186|± |0.0251|
| - human_aging | 0|none | 0|acc |↑ | 0.6771|± |0.0314|
| - human_sexuality | 0|none | 0|acc |↑ | 0.7176|± |0.0395|
| - humanities |N/A |none | 0|acc |↑ | 0.5411|± |0.0066|
| - international_law | 0|none | 0|acc |↑ | 0.7603|± |0.0390|
| - jurisprudence | 0|none | 0|acc |↑ | 0.7593|± |0.0413|
| - logical_fallacies | 0|none | 0|acc |↑ | 0.7239|± |0.0351|
| - machine_learning | 0|none | 0|acc |↑ | 0.5268|± |0.0474|
| - management | 0|none | 0|acc |↑ | 0.7864|± |0.0406|
| - marketing | 0|none | 0|acc |↑ | 0.8547|± |0.0231|
| - medical_genetics | 0|none | 0|acc |↑ | 0.6500|± |0.0479|
| - miscellaneous | 0|none | 0|acc |↑ | 0.7918|± |0.0145|
| - moral_disputes | 0|none | 0|acc |↑ | 0.6705|± |0.0253|
| - moral_scenarios | 0|none | 0|acc |↑ | 0.2268|± |0.0140|
| - nutrition | 0|none | 0|acc |↑ | 0.6961|± |0.0263|
| - other |N/A |none | 0|acc |↑ | 0.6720|± |0.0081|
| - philosophy | 0|none | 0|acc |↑ | 0.6945|± |0.0262|
| - prehistory | 0|none | 0|acc |↑ | 0.6975|± |0.0256|
| - professional_accounting | 0|none | 0|acc |↑ | 0.4539|± |0.0297|
| - professional_law | 0|none | 0|acc |↑ | 0.4537|± |0.0127|
| - professional_medicine | 0|none | 0|acc |↑ | 0.6176|± |0.0295|
| - professional_psychology | 0|none | 0|acc |↑ | 0.6275|± |0.0196|
| - public_relations | 0|none | 0|acc |↑ | 0.6364|± |0.0461|
| - security_studies | 0|none | 0|acc |↑ | 0.7061|± |0.0292|
| - social_sciences |N/A |none | 0|acc |↑ | 0.7043|± |0.0080|
| - sociology | 0|none | 0|acc |↑ | 0.8458|± |0.0255|
| - stem |N/A |none | 0|acc |↑ | 0.5027|± |0.0086|
| - us_foreign_policy | 0|none | 0|acc |↑ | 0.8400|± |0.0368|
| - virology | 0|none | 0|acc |↑ | 0.5060|± |0.0389|
| - world_religions | 0|none | 0|acc |↑ | 0.8421|± |0.0280|
|openbookqa | 1|none | 0|acc |↑ | 0.3360|± |0.0211|
| | |none | 0|acc_norm |↑ | 0.4380|± |0.0222|
|piqa | 1|none | 0|acc |↑ | 0.8112|± |0.0091|
| | |none | 0|acc_norm |↑ | 0.8194|± |0.0090|
|truthfulqa |N/A |none | 0|acc |↑ | 0.4422|± |0.0113|
| | |none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff |↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff |↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff |↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff |↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|
| - truthfulqa_gen | 3|none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff |↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff |↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff |↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff |↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|
| - truthfulqa_mc1 | 2|none | 0|acc |↑ | 0.3574|± |0.0168|
| - truthfulqa_mc2 | 2|none | 0|acc |↑ | 0.5269|± |0.0152|
|winogrande | 1|none | 0|acc |↑ | 0.7222|± |0.0126|

| Groups |Version|Filter|n-shot| Metric | | Value | |Stderr|
|------------------|-------|------|-----:|-----------|---|------:|---|-----:|
|agieval |N/A |none | 0|acc |↑ | 0.3846|± |0.0051|
| | |none | 0|acc_norm |↑ | 0.4186|± |0.0056|
|mmlu |N/A |none | 0|acc |↑ | 0.5972|± |0.0039|
| - humanities |N/A |none | 0|acc |↑ | 0.5411|± |0.0066|
| - other |N/A |none | 0|acc |↑ | 0.6720|± |0.0081|
| - social_sciences|N/A |none | 0|acc |↑ | 0.7043|± |0.0080|
| - stem |N/A |none | 0|acc |↑ | 0.5027|± |0.0086|
|truthfulqa |N/A |none | 0|acc |↑ | 0.4422|± |0.0113|
| | |none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff |↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff|↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff|↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff|↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|