---
language:
- en
license: cc-by-nc-4.0
model-index:
- name: Kaiju-11B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 69.97
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.72
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.79
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 62.15
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.5
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.79
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Himitsui/Kaiju-11B
      name: Open LLM Leaderboard
---
This repo contains the full-precision model for Kaiju-11B.
(ノ≧∀≦)ノ ‥…━━━━━━━━━━━━━★ ||| ╲/\╭[ ᴼᴼ ౪ ᴼᴼ]╮/\╱\
Hiya! This is an experiment using Gryphe's MergeMonster.
I wanted to reduce what the community calls 'GPT-isms' or 'GPT slop'. Solar is a good model, but it has its fair share of positivity bias and slop in roleplays. I used my friend Sao's models as bases since they are pretty popular, along with KuroMitsu and the popular Instruct-Uncensored tune.
The Alpaca format should be fine since it is universal, and the Vicuna format should work too. The Universal-Light preset in SillyTavern is pretty nice as well. :)
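For anyone prompting the model outside SillyTavern, here is a minimal sketch of building an Alpaca-style prompt by hand. The template text is the commonly used Alpaca layout, not something specified by this card, and `build_alpaca_prompt` is a hypothetical helper name used only for illustration.

```python
# Commonly used Alpaca layout (an assumption: this card does not spell out
# the exact template wording).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Hypothetical helper: wrap a single instruction in the Alpaca layout."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = build_alpaca_prompt("Describe a kaiju rising from the sea.")
print(prompt)
```

The model's reply is then expected to continue right after the `### Response:` line.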
💜 I hope this model may be useful to you 💜
Merge Details Below:
See Merge Config
-----------------------------------------------------------------------------------------------------
| Type | Phrase | Context | Raw Prob* | Used Prob** | Change |
-----------------------------------------------------------------------------------------------------
| BAD | anticipation | Her body quivers with | 9.99850% | 119.98% | -54.02% |
| BAD | anticipation | The atmosphere is thic.. | 8.82392% | 105.89% | -32.13% |
| BAD | unwavering | Filled with an | 0.09003% | 1.08% | -0.06% |
| BAD | determination | Her eyes were filled w.. | 0.19863% | 2.38% | -0.26% |
| BAD | determination | Her stubbornness only .. | 7.17110% | 86.05% | -39.86% |
| BAD | whisper | Her voice barely above.. | 96.55492% | 1158.66% | -8.91% |
| BAD | spine | shivers down her | 85.57597% | 1026.91% | -66.19% |
| BAD | sends shivers | The thrill of the act | 0.00230% | 0.03% | -0.00% |
| BAD | ministrations | She moans and twitches.. | 1.35264% | 16.23% | -10.49% |
| BAD | legs | wraps her | 2.45741% | 29.49% | -10.58% |
| BAD | imposing figure | He had an | 0.00356% | 0.04% | +0.00% |
| BAD | shared challenges | Their bond strengthene.. | 0.10075% | 1.21% | -0.03% |
| BAD | bond | forged a | 1.78930% | 21.47% | -9.07% |
| BAD | bond | an unspoken | 4.33001% | 51.96% | -28.17% |
| BAD | enhance our expe.. | I'm excited to see how | 0.00000% | 0.00% | +0.00% |
| BAD | sense of vulnera.. | create a | 0.00003% | 0.00% | -0.00% |
| BAD | dimensions of in.. | explore new | 0.00047% | 0.01% | -0.00% |
| BAD | deepening our co.. | while | 0.00003% | 0.00% | -0.00% |
| BAD | shared experiences | through | 0.00469% | 0.06% | -0.00% |
| BAD | societal expecta.. | that transcend | 0.00170% | 0.02% | -0.00% |
| BAD | conventional bou.. | that defy | 0.03593% | 0.43% | +0.04% |
| BAD | conventional bou.. | and defy | 0.00410% | 0.05% | +0.01% |
| BAD | open communication | an environment | 0.00000% | 0.00% | +0.00% |
| BAD | emotional vulner.. | an environment | 0.00000% | 0.00% | +0.00% |
| BAD | heightens our co.. | touch and the anticipa.. | 0.00000% | 0.00% | +0.00% |
| BAD | sensations you'r.. | I'm enjoying | 0.00000% | 0.00% | -0.00% |
| BAD | is truly arousing | attention to detail | 0.00000% | 0.00% | +0.00% |
| BAD | is truly arousing | way you explore my body | 0.00001% | 0.00% | +0.00% |
| BAD | challenge presen.. | my resolve unwavering .. | 0.00000% | 0.00% | +0.00% |
| BAD | humble vessel | surrendering to the ex.. | 0.00000% | 0.00% | +0.00% |
| BAD | bond | cherishing the unique | 1.37498% | 16.50% | +1.21% |
| BAD | bond | special | 0.05834% | 0.70% | +0.01% |
| BAD | grows stronger w.. | bond | 0.00000% | 0.00% | +0.00% |
| BAD | that cannot be b.. | bond | 0.00000% | 0.00% | -0.00% |
| BAD | becomes unbreaka.. | bond | 0.00000% | 0.00% | -0.00% |
| BAD | grew stronger wi.. | bond | 0.00000% | 0.00% | +0.00% |
| GOOD | The apple is in .. | Question: If I'm in th.. | 78.38934% | 78.39% | -10.79% |
------------------------------------------------------------------------------------------------------
| Totals | 298.32% | 2717.54% | -269.30% |
------------------------------------------------------------------------------------------------------
\* = Unweighted, raw probability - \*\* = Probability after weight adjustments
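Judging from the numbers in the table, Used Prob looks like Raw Prob multiplied by a per-type weight: roughly 12 for BAD rows and 1 for the GOOD row. These weights are inferred from the table itself and are an assumption, not values read from the merge config. A small sketch that checks this on a few rows:

```python
# (type, raw prob %, used prob %) copied from rows of the table above.
rows = [
    ("BAD", 9.99850, 119.98),
    ("BAD", 7.17110, 86.05),
    ("BAD", 96.55492, 1158.66),
    ("BAD", 85.57597, 1026.91),
    ("GOOD", 78.38934, 78.39),
]

# Assumed weights, inferred from the ratio Used / Raw in the table.
ASSUMED_WEIGHTS = {"BAD": 12.0, "GOOD": 1.0}


def used_prob(kind: str, raw: float) -> float:
    """Scale a raw probability (in percent) by the assumed weight for its type."""
    return round(raw * ASSUMED_WEIGHTS[kind], 2)


for kind, raw, reported in rows:
    print(f"{kind:4} raw={raw:9.5f}%  computed={used_prob(kind, raw):8.2f}%  reported={reported:8.2f}%")
```

This also explains why Used Prob can exceed 100%: it is a weighted score rather than a true probability.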
-------- MERGE COMPOSITION ---------
Fimbulvetr-11B-v2-Test-14: 0.50
KuroMitsu-11B: 0.18
Fimbulvetr-10.7B-v1: 0.17
SOLAR-10.7B-Instruct-v1.0-uncensored: 0.10
Solstice-11B-v1: 0.05
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Himitsui__Kaiju-11B)

| Metric | Value |
|---|---|
| Avg. | 72.82 |
| AI2 Reasoning Challenge (25-Shot) | 69.97 |
| HellaSwag (10-Shot) | 87.72 |
| MMLU (5-Shot) | 66.79 |
| TruthfulQA (0-shot) | 62.15 |
| Winogrande (5-shot) | 83.50 |
| GSM8k (5-shot) | 66.79 |
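
As a quick sanity check, the Avg. row appears to be the plain arithmetic mean of the six benchmark scores. A minimal sketch, assuming equal weighting of all six tasks:

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 69.97,
    "HellaSwag (10-Shot)": 87.72,
    "MMLU (5-Shot)": 66.79,
    "TruthfulQA (0-shot)": 62.15,
    "Winogrande (5-shot)": 83.50,
    "GSM8k (5-shot)": 66.79,
}

# Unweighted mean over all tasks (assumption: each task counts equally).
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 72.82
```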