---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- Himitsui/Kaiju-11B
- Sao10K/Fimbulvetr-11B-v2
- decapoda-research/Antares-11b-v2
- beberik/Nyxene-v3-11B
base_model:
- Himitsui/Kaiju-11B
- Sao10K/Fimbulvetr-11B-v2
- decapoda-research/Antares-11b-v2
- beberik/Nyxene-v3-11B
model-index:
- name: Umbra-v3-MoE-4x11b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 68.43
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 87.83
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.99
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 69.3
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.9
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.08
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Umbra-v3-MoE-4x11b
      name: Open LLM Leaderboard
---


ExLlamaV2 version of the model created by [Steelskull](https://huggingface.co/Steelskull)!

Original model: https://huggingface.co/Steelskull/Umbra-v3-MoE-4x11b

Calibration dataset: [royallab/PIPPA-cleaned](https://huggingface.co/datasets/royallab/PIPPA-cleaned)

Requires [ExLlamaV2](https://github.com/turboderp/exllamav2), which is developed by turboderp and released under an MIT license.

Test quant made with a 4096 measurement length and the RP dataset. Its perplexity came out at about 8, versus about 6 for the wikitext version. I haven't tested enough to tell whether the two differ much in practice.
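For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is conventionally defined: the exponential of the mean negative log-likelihood per token. The log-probabilities below are made up for illustration and are not measurements from this model. Note that perplexities measured on different evaluation texts (an RP dataset versus wikitext) are not directly comparable, so the 8-versus-6 gap on its own may say little about quality.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical inputs: a model that is, on average, as uncertain as a uniform
# choice among 8 tokens has perplexity of about 8; among 6 tokens, about 6.
uniform_over_8 = [math.log(1 / 8)] * 4
uniform_over_6 = [math.log(1 / 6)] * 4

print(perplexity(uniform_over_8))  # approximately 8
print(perplexity(uniform_over_6))  # approximately 6
```

Lower is better, but only when both numbers are computed over the same text with the same tokenizer and context length.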

This branch is 8b8h, made using wikitext at a length of 4096.
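A rough sketch of fetching a single branch with `huggingface_hub`, in case it helps. It assumes the branch is literally named `8b8h`, and the repository id shown is a placeholder, since this card does not state its own repo id. `snapshot_download` and its `repo_id` and `revision` parameters come from `huggingface_hub`; the two helper functions are hypothetical names for this example.

```python
def branch_download_args(repo_id: str, branch: str = "8b8h") -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    # Assumption: the quant lives on a git branch named after its bit-width label.
    return {"repo_id": repo_id, "revision": branch}


def download_branch(repo_id: str, branch: str = "8b8h") -> str:
    """Download one branch of a model repo and return the local folder path."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**branch_download_args(repo_id, branch))


# "your-name/your-exl2-repo" is a placeholder, not a real repository.
print(branch_download_args("your-name/your-exl2-repo"))
```

Calling `download_branch(...)` with the real repository id would return a local directory that can then be pointed at by an ExLlamaV2-compatible loader.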

	-----

	<!DOCTYPE html>
	<style>
	body {
	font-family: 'Quicksand', sans-serif;
	background: linear-gradient(135deg, #2E3440 0%, #1A202C 100%);
	color: #D8DEE9;
	margin: 0;
	padding: 0;
	font-size: 16px;
	}

	.container {
	width: 80%;
	max-width: 800px;
	margin: 20px auto;
	background-color: rgba(255, 255, 255, 0.02);
	padding: 20px;
	border-radius: 12px;
	box-shadow: 0 4px 10px rgba(0, 0, 0, 0.2);
	backdrop-filter: blur(10px);
	border: 1px solid rgba(255, 255, 255, 0.1);
	}

	.header h1 {
	font-size: 28px;
	color: #ECEFF4;
	margin: 0 0 20px 0;
	text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
	}

	.update-section {
	margin-top: 30px;
	}

	.update-section h2 {
	font-size: 24px;
	color: #88C0D0;
	}

	.update-section p {
	font-size: 16px;
	line-height: 1.6;
	color: #ECEFF4;
	}

	.info img {
	width: 100%;
	border-radius: 10px;
	margin-bottom: 15px;
	}

	a {
	color: #88C0D0;
	text-decoration: none;
	}

	a:hover {
	color: #A3BE8C;
	}

	.button {
	display: inline-block;
	background-color: #5E81AC;
	color: #E5E9F0;
	padding: 10px 20px;
	border-radius: 5px;
	cursor: pointer;
	text-decoration: none;
	}

	.button:hover {
	background-color: #81A1C1;
	}

	</style>
	<html lang="en">
	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>Umbra-v3-MoE-4x11b Data Card</title>
	<link href="https://fonts.googleapis.com/css2?family=Quicksand:wght@400;500;600&display=swap" rel="stylesheet">
	</head>
	<body>
	<div class="container">
	<div class="header">
	<h1>Umbra-v3-MoE-4x11b</h1>
	</div>
	<div class="info">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/MHmVGOLGh4I5MfQ83iiXS.jpeg">
	<p><strong>Creator:</strong> <a href="https://huggingface.co/Steelskull" target="_blank">SteelSkull</a></p>
	<p><strong>About Umbra-v3-MoE-4x11b:</strong> A Mixture of Experts model designed for general assistance, with a special knack for storytelling and RP/ERP.</p>
	<p>Integrates models from notable sources for enhanced performance in diverse tasks.</p>
	<p><strong>Source Models:</strong></p>
	<ul>
	<li><a href="https://huggingface.co/Himitsui/Kaiju-11B">Himitsui/Kaiju-11B</a></li>
	<li><a href="https://huggingface.co/Sao10K/Fimbulvetr-11B-v2">Sao10K/Fimbulvetr-11B-v2</a></li>
	<li><a href="https://huggingface.co/decapoda-research/Antares-11b-v2">decapoda-research/Antares-11b-v2</a></li>
	<li><a href="https://huggingface.co/beberik/Nyxene-v3-11B">beberik/Nyxene-v3-11B</a></li>
	</ul>
	</div>
	<div class="update-section">
	<h2>Update-Log:</h2>
	<p>The [Umbra Series] keeps rolling out from the [Lumosia Series] garage, aiming to be your digital Alfred with a side of Shakespeare for those RP/ERP nights.</p>
	<p><strong>What's Fresh in v3?</strong></p>
	<p>Didn’t reinvent the wheel, just slapped on some fancier rims. Upgraded the models and tweaked the prompts a bit. Now, Umbra's not just a general-use LLM; it's also focused on spinning stories and "Stories".</p>
	<p><strong>Negative Prompt Minimalism</strong></p>
	<p>Got the prompts to do a bit of a diet and gym routine—more beef on the positives, trimming down the negatives as usual with a dash of my midnight musings.</p>
	<p><strong>Still Guessing, Aren’t We?</strong></p>
	<p>Just so we're clear, "v3" is not the messiah of updates. It’s another experiment in the saga.</p>
	<p>Dive into Umbra v3 and toss your two cents my way. Your feedback is the caffeine in my code marathon.</p>
	</div>
	</div>
	</body>
	</html>

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Steelskull__Umbra-v3-MoE-4x11b)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |73.09|
|AI2 Reasoning Challenge (25-Shot)|68.43|
|HellaSwag (10-Shot)              |87.83|
|MMLU (5-Shot)                    |65.99|
|TruthfulQA (0-shot)              |69.30|
|Winogrande (5-shot)              |83.90|
|GSM8k (5-shot)                   |63.08|
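
As a quick sanity check, the `Avg.` row can be reproduced as the unweighted mean of the six benchmark scores in the table. The snippet below is only a sketch of that arithmetic; the dictionary name is arbitrary.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 68.43,
    "HellaSwag (10-Shot)": 87.83,
    "MMLU (5-Shot)": 65.99,
    "TruthfulQA (0-shot)": 69.30,
    "Winogrande (5-shot)": 83.90,
    "GSM8k (5-shot)": 63.08,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 73.09
```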