---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- NousResearch/Nous-Hermes-2-Yi-34B
- jondurbin/bagel-dpo-34b-v0.2
base_model:
- NousResearch/Nous-Hermes-2-Yi-34B
- jondurbin/bagel-dpo-34b-v0.2
model-index:
- name: HermesBagel-34B-v0.1
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 70.56
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 85.74
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 77.38
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 67.34
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 84.61
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.28
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=dfurman/HermesBagel-34B-v0.1
      name: Open LLM Leaderboard
---

# HermesBagel-34B-v0.1

HermesBagel-34B-v0.1 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):

* [NousResearch/Nous-Hermes-2-Yi-34B](https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B)
* [jondurbin/bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: NousResearch/Nous-Hermes-2-Yi-34B
        layer_range: [0, 60]
      - model: jondurbin/bagel-dpo-34b-v0.2
        layer_range: [0, 60]
merge_method: slerp
base_model: NousResearch/Nous-Hermes-2-Yi-34B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

## Basic Usage
Setup

```python
# Notebook cell: the leading "!" runs a shell command (drop it in a terminal).
!pip install -qU transformers accelerate bitsandbytes

import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
)

# Hub repository ID, kept separate from the loaded model object.
model_id = "dfurman/HermesBagel-34B-v0.1"

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=nf4_config,
)
```
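
The setup above loads the weights in 4-bit NF4. As a rough, back-of-the-envelope sketch of why that matters, the snippet below estimates weight-only memory at two precisions. It assumes roughly 34 billion parameters (read off the "34B" in the model name, not a measured count) and ignores quantization constants, activations, and the KV cache, so real usage will be higher. The helper `weight_memory_gib` is hypothetical and not part of any library.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumption: ~34e9 parameters, inferred from the "34B" in the model name.
N_PARAMS = 34e9

bf16_gib = weight_memory_gib(N_PARAMS, 16)  # unquantized bfloat16 weights
nf4_gib = weight_memory_gib(N_PARAMS, 4)    # 4-bit NF4 weights

print(f"bfloat16: ~{bf16_gib:.1f} GiB")  # prints "bfloat16: ~63.3 GiB"
print(f"4-bit:    ~{nf4_gib:.1f} GiB")   # prints "4-bit:    ~15.8 GiB"
```

Under these assumptions, 4-bit loading cuts weight memory to about a quarter of the bfloat16 footprint.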

```python
messages = [
    {"role": "user", "content": "What is a large language model?"},
]

print("\n\n*** Prompt:")
# Render the conversation with the tokenizer's chat template.
# add_generation_prompt=True asks the template to open the assistant turn,
# so the model answers instead of continuing the user message.
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)
print(tokenizer.decode(input_ids[0]))

print("\n\n*** Generate:")
with torch.autocast("cuda", dtype=torch.bfloat16):
    output = model.generate(
        input_ids=input_ids.to("cuda"),
        max_new_tokens=256,
        return_dict_in_generate=True,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output["sequences"][0][len(input_ids[0]):],
    skip_special_tokens=True,
)
print(response)
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_dfurman__HermesBagel-34B-v0.1)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |75.15|
|AI2 Reasoning Challenge (25-Shot)|70.56|
|HellaSwag (10-Shot)              |85.74|
|MMLU (5-Shot)                    |77.38|
|TruthfulQA (0-shot)              |67.34|
|Winogrande (5-shot)              |84.61|
|GSM8k (5-shot)                   |65.28|
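
As a quick sanity check on the table above, the following sketch recomputes the `Avg.` row, assuming it is the unweighted mean of the six benchmark scores. The dictionary simply restates the values reported in the table.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 70.56,
    "HellaSwag (10-Shot)": 85.74,
    "MMLU (5-Shot)": 77.38,
    "TruthfulQA (0-shot)": 67.34,
    "Winogrande (5-shot)": 84.61,
    "GSM8k (5-shot)": 65.28,
}

# Assumption: the leaderboard average is a plain arithmetic mean.
average = sum(scores.values()) / len(scores)
print(f"Avg.: {average:.2f}")  # prints "Avg.: 75.15"
```

The result matches the reported 75.15, which is consistent with the averaging assumption.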