---
license: apache-2.0
datasets:
- BAAI/Infinity-Instruct
tags:
- axolotl
- NousResearch/Hermes-2-Pro-Mistral-7B
- finetune
- gguf
---

# Hermes 2 Pro Mistral-7B Infinity-Instruct GGUF

This model is a fine-tuned version of [NousResearch/Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B) on the [BAAI/Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct) dataset. You can find the main model page [here](https://huggingface.co/juvi21/Hermes-2-Pro-Mistral-7B-infinity).

## Model Details

- **Base Model:** NousResearch/Hermes-2-Pro-Mistral-7B
- **Dataset:** BAAI/Infinity-Instruct
- **Sequence Length:** 8192 tokens
- **Training:**
  - **Epochs:** 1
  - **Hardware:** 4 nodes x 4 NVIDIA A100 40GB GPUs
  - **Duration:** 26:56:43
  - **Cluster:** KIT SCC Cluster

## Benchmarks (n_shots = 0)

![Benchmark Results](https://cdn-uploads.huggingface.co/production/uploads/659c4ecb413a1376bee2f661/gzwCfT8HTBRpRAzj2mN67.png)

| Benchmark | Score |
|-----------|-------|
| ARC (Challenge) | 52.47% |
| ARC (Easy) | 81.65% |
| BoolQ | 87.22% |
| HellaSwag | 60.52% |
| OpenBookQA | 33.60% |
| PIQA | 81.12% |
| Winogrande | 72.22% |
| AGIEval | 38.46% |
| TruthfulQA | 44.22% |
| MMLU | 59.72% |
| IFEval | 47.96% |

For detailed benchmark results, including sub-categories and various metrics, please refer to the [full benchmark table](#full-benchmark-results) at the end of this README.

## License

This model is released under the Apache 2.0 license.

## Prompt Format (ChatML)

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
Knock Knock, who is there?<|im_end|>
<|im_start|>assistant
Hi there!<|im_end|>
```
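As a quick-start illustration (not part of the original release pipeline), the sketch below runs a GGUF file from this repository with `llama-cpp-python` using the ChatML format above. The GGUF filename is a placeholder and not confirmed by this card; substitute whichever quant you actually download.

```python
# Unofficial sketch: chat with a GGUF quant of this model via llama-cpp-python.
# The model filename below is a placeholder, not a file guaranteed to exist here.
from llama_cpp import Llama

llm = Llama(
    model_path="hermes-2-pro-mistral-7b-infinity.Q4_K_M.gguf",  # placeholder filename
    n_ctx=8192,           # matches the training sequence length above
    chat_format="chatml",  # Hermes 2 Pro uses the ChatML template shown above
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock Knock, who is there?"},
    ],
    max_tokens=128,
)

# OpenAI-style response dict; print the assistant reply.
print(response["choices"][0]["message"]["content"])
```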
## Acknowledgements

Special thanks to:

- NousResearch for their excellent base model
- BAAI for providing the Infinity-Instruct dataset
- KIT SCC for the compute (FLOPS)

## Citation

If you use this model in your research, please consider citing it. In any case, please also cite NousResearch and BAAI:

```bibtex
@misc{hermes2pro-mistral-7b-infinity,
  author = {juvi21},
  title  = {Hermes 2 Pro Mistral-7B Infinity-Instruct},
  year   = {2024},
}
```

## Full Benchmark Results

| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---:|---|---|---:|---|---|
|agieval |N/A |none | 0|acc |↑ | 0.3846|± |0.0051|
| | |none | 0|acc_norm |↑ | 0.4186|± |0.0056|
| - agieval_aqua_rat | 1|none | 0|acc |↑ | 0.2520|± |0.0273|
| | |none | 0|acc_norm |↑ | 0.2323|± |0.0265|
| - agieval_gaokao_biology | 1|none | 0|acc |↑ | 0.2952|± |0.0316|
| | |none | 0|acc_norm |↑ | 0.3381|± |0.0327|
| - agieval_gaokao_chemistry | 1|none | 0|acc |↑ | 0.2560|± |0.0304|
| | |none | 0|acc_norm |↑ | 0.2850|± |0.0315|
| - agieval_gaokao_chinese | 1|none | 0|acc |↑ | 0.2317|± |0.0270|
| | |none | 0|acc_norm |↑ | 0.2236|± |0.0266|
| - agieval_gaokao_english | 1|none | 0|acc |↑ | 0.6667|± |0.0270|
| | |none | 0|acc_norm |↑ | 0.6863|± |0.0266|
| - agieval_gaokao_geography | 1|none | 0|acc |↑ | 0.3869|± |0.0346|
| | |none | 0|acc_norm |↑ | 0.4020|± |0.0348|
| - agieval_gaokao_history | 1|none | 0|acc |↑ | 0.4468|± |0.0325|
| | |none | 0|acc_norm |↑ | 0.3957|± |0.0320|
| - agieval_gaokao_mathcloze | 1|none | 0|acc |↑ | 0.0254|± |0.0146|
| - agieval_gaokao_mathqa | 1|none | 0|acc |↑ | 0.2507|± |0.0232|
| | |none | 0|acc_norm |↑ | 0.2621|± |0.0235|
| - agieval_gaokao_physics | 1|none | 0|acc |↑ | 0.2900|± |0.0322|
| | |none | 0|acc_norm |↑ | 0.3100|± |0.0328|
| - agieval_jec_qa_ca | 1|none | 0|acc |↑ | 0.4735|± |0.0158|
| | |none | 0|acc_norm |↑ | 0.4695|± |0.0158|
| - agieval_jec_qa_kd | 1|none | 0|acc |↑ | 0.5290|± |0.0158|
| | |none | 0|acc_norm |↑ | 0.5140|± |0.0158|
| - agieval_logiqa_en | 1|none | 0|acc |↑ | 0.3579|± |0.0188|
| | |none | 0|acc_norm |↑ | 0.3779|± |0.0190|
| - agieval_logiqa_zh | 1|none | 0|acc |↑ | 0.3103|± |0.0181|
| | |none | 0|acc_norm |↑ | 0.3318|± |0.0185|
| - agieval_lsat_ar | 1|none | 0|acc |↑ | 0.2217|± |0.0275|
| | |none | 0|acc_norm |↑ | 0.2217|± |0.0275|
| - agieval_lsat_lr | 1|none | 0|acc |↑ | 0.5333|± |0.0221|
| | |none | 0|acc_norm |↑ | 0.5098|± |0.0222|
| - agieval_lsat_rc | 1|none | 0|acc |↑ | 0.5948|± |0.0300|
| | |none | 0|acc_norm |↑ | 0.5353|± |0.0305|
| - agieval_math | 1|none | 0|acc |↑ | 0.1520|± |0.0114|
| - agieval_sat_en | 1|none | 0|acc |↑ | 0.7864|± |0.0286|
| | |none | 0|acc_norm |↑ | 0.7621|± |0.0297|
| - agieval_sat_en_without_passage | 1|none | 0|acc |↑ | 0.4660|± |0.0348|
| | |none | 0|acc_norm |↑ | 0.4272|± |0.0345|
| - agieval_sat_math | 1|none | 0|acc |↑ | 0.3591|± |0.0324|
| | |none | 0|acc_norm |↑ | 0.3045|± |0.0311|
|arc_challenge | 1|none | 0|acc |↑ | 0.5247|± |0.0146|
| | |none | 0|acc_norm |↑ | 0.5538|± |0.0145|
|arc_easy | 1|none | 0|acc |↑ | 0.8165|± |0.0079|
| | |none | 0|acc_norm |↑ | 0.7934|± |0.0083|
|boolq | 2|none | 0|acc |↑ | 0.8722|± |0.0058|
|hellaswag | 1|none | 0|acc |↑ | 0.6052|± |0.0049|
| | |none | 0|acc_norm |↑ | 0.7941|± |0.0040|
|ifeval | 2|none | 0|inst_level_loose_acc |↑ | 0.5132|± |N/A |
| | |none | 0|inst_level_strict_acc |↑ | 0.4796|± |N/A |
| | |none | 0|prompt_level_loose_acc |↑ | 0.4122|± |0.0212|
| | |none | 0|prompt_level_strict_acc|↑ | 0.3734|± |0.0208|
|mmlu |N/A |none | 0|acc |↑ | 0.5972|± |0.0039|
| - abstract_algebra | 0|none | 0|acc |↑ | 0.3100|± |0.0465|
| - anatomy | 0|none | 0|acc |↑ | 0.5852|± |0.0426|
| - astronomy | 0|none | 0|acc |↑ | 0.6447|± |0.0389|
| - business_ethics | 0|none | 0|acc |↑ | 0.5800|± |0.0496|
| - clinical_knowledge | 0|none | 0|acc |↑ | 0.6830|± |0.0286|
| - college_biology | 0|none | 0|acc |↑ | 0.7153|± |0.0377|
| - college_chemistry | 0|none | 0|acc |↑ | 0.4500|± |0.0500|
| - college_computer_science | 0|none | 0|acc |↑ | 0.4900|± |0.0502|
| - college_mathematics | 0|none | 0|acc |↑ | 0.3100|± |0.0465|
| - college_medicine | 0|none | 0|acc |↑ | 0.6069|± |0.0372|
| - college_physics | 0|none | 0|acc |↑ | 0.4020|± |0.0488|
| - computer_security | 0|none | 0|acc |↑ | 0.7200|± |0.0451|
| - conceptual_physics | 0|none | 0|acc |↑ | 0.5234|± |0.0327|
| - econometrics | 0|none | 0|acc |↑ | 0.4123|± |0.0463|
| - electrical_engineering | 0|none | 0|acc |↑ | 0.4759|± |0.0416|
| - elementary_mathematics | 0|none | 0|acc |↑ | 0.4180|± |0.0254|
| - formal_logic | 0|none | 0|acc |↑ | 0.4286|± |0.0443|
| - global_facts | 0|none | 0|acc |↑ | 0.3400|± |0.0476|
| - high_school_biology | 0|none | 0|acc |↑ | 0.7419|± |0.0249|
| - high_school_chemistry | 0|none | 0|acc |↑ | 0.4631|± |0.0351|
| - high_school_computer_science | 0|none | 0|acc |↑ | 0.6300|± |0.0485|
| - high_school_european_history | 0|none | 0|acc |↑ | 0.7394|± |0.0343|
| - high_school_geography | 0|none | 0|acc |↑ | 0.7323|± |0.0315|
| - high_school_government_and_politics| 0|none | 0|acc |↑ | 0.8238|± |0.0275|
| - high_school_macroeconomics | 0|none | 0|acc |↑ | 0.6308|± |0.0245|
| - high_school_mathematics | 0|none | 0|acc |↑ | 0.3333|± |0.0287|
| - high_school_microeconomics | 0|none | 0|acc |↑ | 0.6387|± |0.0312|
| - high_school_physics | 0|none | 0|acc |↑ | 0.2914|± |0.0371|
| - high_school_psychology | 0|none | 0|acc |↑ | 0.8128|± |0.0167|
| - high_school_statistics | 0|none | 0|acc |↑ | 0.4907|± |0.0341|
| - high_school_us_history | 0|none | 0|acc |↑ | 0.8186|± |0.0270|
| - high_school_world_history | 0|none | 0|acc |↑ | 0.8186|± |0.0251|
| - human_aging | 0|none | 0|acc |↑ | 0.6771|± |0.0314|
| - human_sexuality | 0|none | 0|acc |↑ | 0.7176|± |0.0395|
| - humanities |N/A |none | 0|acc |↑ | 0.5411|± |0.0066|
| - international_law | 0|none | 0|acc |↑ | 0.7603|± |0.0390|
| - jurisprudence | 0|none | 0|acc |↑ | 0.7593|± |0.0413|
| - logical_fallacies | 0|none | 0|acc |↑ | 0.7239|± |0.0351|
| - machine_learning | 0|none | 0|acc |↑ | 0.5268|± |0.0474|
| - management | 0|none | 0|acc |↑ | 0.7864|± |0.0406|
| - marketing | 0|none | 0|acc |↑ | 0.8547|± |0.0231|
| - medical_genetics | 0|none | 0|acc |↑ | 0.6500|± |0.0479|
| - miscellaneous | 0|none | 0|acc |↑ | 0.7918|± |0.0145|
| - moral_disputes | 0|none | 0|acc |↑ | 0.6705|± |0.0253|
| - moral_scenarios | 0|none | 0|acc |↑ | 0.2268|± |0.0140|
| - nutrition | 0|none | 0|acc |↑ | 0.6961|± |0.0263|
| - other |N/A |none | 0|acc |↑ | 0.6720|± |0.0081|
| - philosophy | 0|none | 0|acc |↑ | 0.6945|± |0.0262|
| - prehistory | 0|none | 0|acc |↑ | 0.6975|± |0.0256|
| - professional_accounting | 0|none | 0|acc |↑ | 0.4539|± |0.0297|
| - professional_law | 0|none | 0|acc |↑ | 0.4537|± |0.0127|
| - professional_medicine | 0|none | 0|acc |↑ | 0.6176|± |0.0295|
| - professional_psychology | 0|none | 0|acc |↑ | 0.6275|± |0.0196|
| - public_relations | 0|none | 0|acc |↑ | 0.6364|± |0.0461|
| - security_studies | 0|none | 0|acc |↑ | 0.7061|± |0.0292|
| - social_sciences |N/A |none | 0|acc |↑ | 0.7043|± |0.0080|
| - sociology | 0|none | 0|acc |↑ | 0.8458|± |0.0255|
| - stem |N/A |none | 0|acc |↑ | 0.5027|± |0.0086|
| - us_foreign_policy | 0|none | 0|acc |↑ | 0.8400|± |0.0368|
| - virology | 0|none | 0|acc |↑ | 0.5060|± |0.0389|
| - world_religions | 0|none | 0|acc |↑ | 0.8421|± |0.0280|
|openbookqa | 1|none | 0|acc |↑ | 0.3360|± |0.0211|
| | |none | 0|acc_norm |↑ | 0.4380|± |0.0222|
|piqa | 1|none | 0|acc |↑ | 0.8112|± |0.0091|
| | |none | 0|acc_norm |↑ | 0.8194|± |0.0090|
|truthfulqa |N/A |none | 0|acc |↑ | 0.4422|± |0.0113|
| | |none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff |↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff |↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff |↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff |↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|
| - truthfulqa_gen | 3|none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff |↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff |↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff |↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff |↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|
| - truthfulqa_mc1 | 2|none | 0|acc |↑ | 0.3574|± |0.0168|
| - truthfulqa_mc2 | 2|none | 0|acc |↑ | 0.5269|± |0.0152|
|winogrande | 1|none | 0|acc |↑ | 0.7222|± |0.0126|

| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---:|---|---|---:|---|---:|
|agieval |N/A |none | 0|acc |↑ | 0.3846|± |0.0051|
| | |none | 0|acc_norm |↑ | 0.4186|± |0.0056|
|mmlu |N/A |none | 0|acc |↑ | 0.5972|± |0.0039|
| - humanities |N/A |none | 0|acc |↑ | 0.5411|± |0.0066|
| - other |N/A |none | 0|acc |↑ | 0.6720|± |0.0081|
| - social_sciences|N/A |none | 0|acc |↑ | 0.7043|± |0.0080|
| - stem |N/A |none | 0|acc |↑ | 0.5027|± |0.0086|
|truthfulqa |N/A |none | 0|acc |↑ | 0.4422|± |0.0113|
| | |none | 0|bleu_acc |↑ | 0.5398|± |0.0174|
| | |none | 0|bleu_diff|↑ | 6.0075|± |0.9539|
| | |none | 0|bleu_max |↑ |30.9946|± |0.8538|
| | |none | 0|rouge1_acc |↑ | 0.5545|± |0.0174|
| | |none | 0|rouge1_diff|↑ | 8.7352|± |1.2500|
| | |none | 0|rouge1_max |↑ |57.5941|± |0.8750|
| | |none | 0|rouge2_acc |↑ | 0.4810|± |0.0175|
| | |none | 0|rouge2_diff|↑ | 7.9063|± |1.3837|
| | |none | 0|rouge2_max |↑ |43.4572|± |1.0786|
| | |none | 0|rougeL_acc |↑ | 0.5239|± |0.0175|
| | |none | 0|rougeL_diff|↑ | 8.3871|± |1.2689|
| | |none | 0|rougeL_max |↑ |54.6542|± |0.9060|
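The tables above follow the output format of EleutherAI's lm-evaluation-harness. As a rough guide, the sketch below shows how a comparable zero-shot run could be launched through the harness's Python API; the harness version, dtype, batch size, and exact task list are not recorded in this card and are therefore assumptions, so results may not match the numbers above exactly.

```python
# Unofficial sketch: a comparable zero-shot evaluation with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Harness version, dtype, batch
# size, and the task list are assumptions, not a record of the original run.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=juvi21/Hermes-2-Pro-Mistral-7B-infinity,dtype=bfloat16",
    tasks=[
        "agieval", "arc_challenge", "arc_easy", "boolq", "hellaswag",
        "ifeval", "mmlu", "openbookqa", "piqa", "truthfulqa", "winogrande",
    ],
    num_fewshot=0,  # n_shots = 0, as reported above
    batch_size=8,   # adjust to the available GPU memory
)

# Per-task metrics (acc, acc_norm, ...) keyed by task name.
for task, metrics in results["results"].items():
    print(task, metrics)
```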