Dans-CreepingSenseOfDoom-13b / README.md

Adding Evaluation Results

53cda72 verified 7 months ago

6.43 kB

	---
	language:
	- en
	model-index:
	- name: Dans-CreepingSenseOfDoom
	results:
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: AI2 Reasoning Challenge (25-Shot)
	type: ai2_arc
	config: ARC-Challenge
	split: test
	args:
	num_few_shot: 25
	metrics:
	- type: acc_norm
	value: 53.33
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: HellaSwag (10-Shot)
	type: hellaswag
	split: validation
	args:
	num_few_shot: 10
	metrics:
	- type: acc_norm
	value: 78.9
	name: normalized accuracy
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: MMLU (5-Shot)
	type: cais/mmlu
	config: all
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 48.09
	name: accuracy
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: TruthfulQA (0-shot)
	type: truthful_qa
	config: multiple_choice
	split: validation
	args:
	num_few_shot: 0
	metrics:
	- type: mc2
	value: 37.84
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: Winogrande (5-shot)
	type: winogrande
	config: winogrande_xl
	split: validation
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 73.32
	name: accuracy
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	- task:
	type: text-generation
	name: Text Generation
	dataset:
	name: GSM8k (5-shot)
	type: gsm8k
	config: main
	split: test
	args:
	num_few_shot: 5
	metrics:
	- type: acc
	value: 0.0
	name: accuracy
	source:
	url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PocketDoc/Dans-CreepingSenseOfDoom
	name: Open LLM Leaderboard
	---
	### What is the model for?
	This model is proficient in crafting text-based adventure games. It can both concise replies and more expansive, novel-like descriptions. The ability to alternate between these two response styles can be triggered by a distinct system message.

	### What's in the sausage?

	This model was trained on [Holodeck-1](https://huggingface.co/KoboldAI/LLAMA2-13B-Holodeck-1) using a deduped version of the skein text adventure dataset augmented with system messages using the 'Metharme' prompting format.

	### PROMPT FORMAT:
	Consistent with the Pygmalion Metharme format which is shown below.
	```
	<\|system\|>{system message here}<\|user\|>{user action here}<\|model\|>{model response}
	<\|system\|>{system message here}<\|model\|>{model response}
	<\|system\|>{system message here}<\|user\|>{user action here}<\|model\|>{model response}<\|user\|>{user action here}<\|model\|>{model response}
	```


	### EXAMPLES:
	##### For shorter responses:
	```
	<\|system\|>Mode: Adventure
	Theme: Science Fiction, cats, money, aliens, space, stars, siblings, future, trade
	Tense: Second person present
	Extra: Short response length<\|user\|>you look around<\|model\|>{CURSOR HERE}
	```
	```
	<\|system\|>You are a dungeon master of sorts, guiding the reader through a story based on the following themes: Lovecraftian, Horror, city, research. Do not be afraid to get creative with your responses or to tell them they can't do something when it doesnt make sense for the situation. Narrate their actions and observations as they occur and drive the story forward.<\|user\|>you look around<\|model\|>{CURSOR HERE}
	```
	##### For longer novel like responses:
	```
	<\|system\|>You're tasked with creating an interactive story around the genres of historical, historical, RPG, serious. Guide the user through this tale, describing their actions and surroundings using second person present tense. Lengthy and descriptive responses will enhance the experience.<\|user\|>you look around<\|model\|>{CURSOR HERE}
	```
	##### With a model message first:
	```
	<\|system\|>Mode: Story
	Theme: fantasy, female protagonist, grimdark
	Perspective and Tense: Second person present
	Directions: Write something to hook the user into the story then narrate their actions and observations as they occur while driving the story forward.<\|model\|>{CURSOR HERE}
	```
	### Some quick and dirty training details:
	- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="150" height="24"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
	- Sequence length: 4096
	- \# of epochs: 3
	- Training time: 8 hours
	- Hardware: 1x RTX 3090
	- Training type: QLoRA
	- PEFT R/A: 32/32

	### Credits:
	#### Holodeck-1:
	Thank you to Mr. Seeker and the Kobold AI team for the wonderful model Holodeck-1

	[Holodeck-1 Huggingface page](https://huggingface.co/KoboldAI/LLAMA2-13B-Holodeck-1)

	#### Skein Text Adventure Data:
	Thank you to the [Kobold AI](https://huggingface.co/KoboldAI) community for curating the Skein dataset, which is pivotal to this model's capabilities.
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PocketDoc__Dans-CreepingSenseOfDoom)

	\| Metric \|Value\|
	\|---------------------------------\|----:\|
	\|Avg. \|48.58\|
	\|AI2 Reasoning Challenge (25-Shot)\|53.33\|
	\|HellaSwag (10-Shot) \|78.90\|
	\|MMLU (5-Shot) \|48.09\|
	\|TruthfulQA (0-shot) \|37.84\|
	\|Winogrande (5-shot) \|73.32\|
	\|GSM8k (5-shot) \| 0.00\|