metadata

language:
  - en
license: cc-by-nc-4.0
pipeline_tag: text-generation
model-index:
  - name: quantum-dpo-v0.1
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 72.53
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 88.37
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 65.29
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 69.92
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 82.32
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 70.81
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=quantumaikr/quantum-dpo-v0.1
          name: Open LLM Leaderboard

quantumaikr/quantum-dpo-v0.1

Usage

Start chatting with quantumaikr/quantum-dpo-v0.1 using the following code snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("quantumaikr/quantum-dpo-v0.1")
model = AutoModelForCausalLM.from_pretrained("quantumaikr/quantum-dpo-v0.1", torch_dtype=torch.float16, device_map="auto")

system_prompt = "You are QuantumLM, an AI that follows instructions extremely well. Help as much as you can. Remember, be safe, and don't do anything illegal."

message = "Write me a poem please"
prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{message}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.95, top_k=30, max_new_tokens=2048)

print(tokenizer.decode(output[0], skip_special_tokens=True))

QuantumLM should be used with this prompt format:

### System:
This is a system prompt, please behave and help the user.

### User:
Your prompt here

### Assistant
The output of QuantumLM

Use and Limitations

Intended Use

These models are intended for research only, in adherence with the CC BY-NC-4.0 license.

Limitations and bias

Although the aforementioned dataset helps to steer the base language models into "safer" distributions of text, not all biases and toxicity can be mitigated through fine-tuning. We ask that users be mindful of such potential issues that can arise in generated responses. Do not treat model outputs as substitutes for human judgment or as sources of truth. Please use it responsibly.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	74.87
AI2 Reasoning Challenge (25-Shot)	72.53
HellaSwag (10-Shot)	88.37
MMLU (5-Shot)	65.29
TruthfulQA (0-shot)	69.92
Winogrande (5-shot)	82.32
GSM8k (5-shot)	70.81