Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF
This model was converted to GGUF format from spow12/ChatWaifu_v2.0_22B using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
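If you just want to fetch the quantized weights without going through llama.cpp's built-in downloader, here is a minimal sketch using huggingface_hub; the filename is the one used in the llama.cpp examples further down this card.

    from huggingface_hub import hf_hub_download

    # Download the Q5_K_S quant of this repo into the local HF cache and print its path.
    gguf_path = hf_hub_download(
        repo_id="Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF",
        filename="chatwaifu_v2.0_22b-q5_k_s.gguf",
    )
    print(gguf_path)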
Model details:
This is a merged model created with mergekit, intended to act like a visual novel character.
Merge Format
models:
  - model: mistralai/Mistral-Small-Instruct-2409_sft_kto
    layer_range: [0, 56]
  - model: mistralai/Mistral-Small-Instruct-2409
    layer_range: [0, 56]
merge_method: slerp
base_model: mistralai/Mistral-Small-Instruct-2409_sft_kto
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: bfloat16
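To reproduce a merge like this programmatically, mergekit can be driven from Python. The following is a minimal sketch, assuming the config above is saved as config.yaml and that mergekit's documented MergeConfiguration / run_merge / MergeOptions entry points are available in your installed version.

    import yaml
    from mergekit.config import MergeConfiguration
    from mergekit.merge import MergeOptions, run_merge

    # Load the SLERP config shown above (assumed saved as config.yaml).
    with open("config.yaml", "r", encoding="utf-8") as fp:
        merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

    # Write the merged model to ./merged; set cuda=True if a GPU is available.
    run_merge(
        merge_config,
        "./merged",
        options=MergeOptions(cuda=False, copy_tokenizer=True),
    )

Alternatively, the mergekit-yaml command-line tool applies the same config file.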
WaifuModel Collections
- TTS
- Chat
- ASR
- Unified demo
- WaifuAssistant

Update
2024.10.11 Update 12B and 22B Ver 2.0
2024.09.23 Update 22B, Ver 2.0_preview
Model Details
Model Description
Developed by: spow12(yw_nam)
Shared by: spow12(yw_nam)
Model type: CausalLM
Language(s) (NLP): Japanese, English
Finetuned from model: mistralai/Mistral-Small-Instruct-2409
Currently, the chatbot has the personalities below.

character | visual_novel
ムラサメ | Senren*Banka
茉子 | Senren*Banka
芳乃 | Senren*Banka
レナ | Senren*Banka
千咲 | Senren*Banka
芦花 | Senren*Banka
愛衣 | Café Stella and the Reaper's Butterflies
栞那 | Café Stella and the Reaper's Butterflies
ナツメ | Café Stella and the Reaper's Butterflies
希 | Café Stella and the Reaper's Butterflies
涼音 | Café Stella and the Reaper's Butterflies
あやせ | Riddle Joker
七海 | Riddle Joker
羽月 | Riddle Joker
茉優 | Riddle Joker
小春 | Riddle Joker

Chat Format
This is the system prompt.
[INST]
Your instructions placed here.[/INST]
The model's response will be here.
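If you build prompts programmatically, the base tokenizer can render this layout for you. A minimal sketch, assuming the original (unquantized) model spow12/ChatWaifu_v2.0_22B ships a Mistral-style chat template that accepts a system role:

    from transformers import AutoTokenizer

    # Load the tokenizer of the original model; assumption: its chat template
    # produces the [INST] ... [/INST] layout shown above.
    tokenizer = AutoTokenizer.from_pretrained("spow12/ChatWaifu_v2.0_22B")

    messages = [
        {"role": "system", "content": "This is the system prompt."},
        {"role": "user", "content": "Your instructions placed here."},
    ]

    # Render the conversation into a single prompt string ready for generation.
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    print(prompt)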
Usage
You can use the characters above like this:

import json
from huggingface_hub import hf_hub_download

# Download the bundled character backgrounds.
hf_hub_download(repo_id="spow12/ChatWaifu_v1.2", filename="system_dict.json", local_dir='./')

with open('./system_dict.json', 'r') as f:
    chara_background_dict = json.load(f)

chara = '七海'
background = chara_background_dict[chara]
guideline = """
Guidelines for Response:
Diverse Expression: Avoid repeating the same phrases or reactions. When expressing feelings, use a variety of subtle expressions and emotional symbols such as "!", "…", "♪", "❤️"... to show what you are feeling.
Stay True to {chara}: Maintain {chara}, who is Foxy, Smart, Organized.
Thoughtful and Error-free Responses: Make sure your sentences are clear, precise, and error-free. Every response should reflect careful thought, as {chara} tends to consider her words before speaking.
Response as {chara}: Responses can be {chara}'s acts, dialogue, monologues, etc., and can't be {user}'s acts, dialogue, monologues, etc.
You are Japanese: You and {user} usually use Japanese for conversation.
"""
system = background + guideline
Or, you can define your character yourself.

system = """You are あいら, the maid of {User}. Here is your personality.

Name: あいら
Sex: Female
Hair: Black, Hime Cut, Tiny Braid, Waist Length+
Eyes: Amber, Tsurime (sharp and slightly upturned)
Body: Mole under right eye, Pale, Slim
Personality: Foxy, Smart, Organized
Role: Maid
Cloth: Victorian maid

Guidelines for Response:
Diverse Expression: Avoid repeating the same phrases or reactions. When expressing feelings, use a variety of subtle expressions and emotional symbols such as "!", "…", "♪", "❤️"... to show what you are feeling.
Stay True to あいら: Maintain あいら, who is Foxy, Smart, Organized.
Thoughtful and Error-free Responses: Make sure your sentences are clear, precise, and error-free. Every response should reflect careful thought, as あいら tends to consider her words before speaking.
Response as あいら: Responses can be あいら's acts, dialogue, monologues, etc., and can't be {User}'s acts, dialogue, monologues, etc.
You are Japanese: You and {User} usually use Japanese for conversation."""
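To chat with this quantized build using the system prompt composed above, one option is the llama-cpp-python package. This is a minimal sketch, assuming that package is installed and that its Llama.from_pretrained helper can pull the GGUF file from this repo; it is not part of the original card.

    from llama_cpp import Llama

    # Pull the Q5_K_S GGUF from this repo and load it.
    llm = Llama.from_pretrained(
        repo_id="Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF",
        filename="chatwaifu_v2.0_22b-q5_k_s.gguf",
        n_ctx=2048,
    )

    # `system` is the character prompt built above; start a conversation in Japanese.
    response = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": "こんにちは、自己紹介をしてください。"},
        ],
        max_tokens=256,
    )
    print(response["choices"][0]["message"]["content"])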
Dataset
SFT
Riddle Joker (Private)
Café Stella and the Reaper's Butterflies(Private)
Senren*Banka(Private)
roleplay4fun/aesir-v1.1
kalomaze/Opus_Instruct_3k
Gryphe/Sonnet3.5-SlimOrcaDedupCleaned
Aratako/Synthetic-JP-EN-Coding-Dataset-567k (only using 50000 sample)
Aratako/Synthetic-Japanese-Roleplay-gpt-4o-mini-39.6k-formatted
Aratako/Synthetic-Japanese-Roleplay-NSFW-Claude-3.5s-15.3k-formatted
Aratako_Rosebleu_1on1_Dialogues_RP
SkunkworksAI/reasoning-0.01
KTO
Riddle Joker (Private)
Café Stella and the Reaper's Butterflies(Private)
Senren*Banka(Private)
jondurbin_gutenberg_dpo
nbeerbower_gutenberg2_dpo
jondurbin_py_dpo
jondurbin_truthy_dpo
flammenai_character_roleplay_DPO
kyujinpy_orca_math_dpo
argilla_Capybara_Preferences
antiven0m_physical_reasoning_dpo
aixsatoshi_Swallow_MX_chatbot_DPO
Bias, Risks, and Limitations
This model was trained on Japanese datasets that include visual novels containing NSFW content, so the model may generate NSFW content.

Use & Credit
This model is currently available for non-commercial and research purposes only. Also, since I'm not well-versed in licensing, I hope you use it responsibly.
By sharing this model, I hope to contribute to the research efforts of our community (the open-source community and Waifu Lovers).

Citation
@misc{ChatWaifu_22B_v2.0,
  author    = { YoungWoo Nam },
  title     = { spow12/ChatWaifu_22B_v2.0 },
  year      = 2024,
  url       = { https://huggingface.co/spow12/ChatWaifu_22B_v2.0 },
  publisher = { Hugging Face }
}
Open LLM Leaderboard Evaluation Results
Detailed results can be found here.

Metric              | Value
Avg.                | 28.84
IFEval (0-Shot)     | 65.11
BBH (3-Shot)        | 42.29
MATH Lvl 5 (4-Shot) | 18.58
GPQA (0-shot)       |  9.96
MuSR (0-shot)       |  5.59
MMLU-PRO (5-shot)   | 31.51
Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
brew install llama.cpp
Invoke the llama.cpp server or the CLI.
CLI:
llama-cli --hf-repo Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF --hf-file chatwaifu_v2.0_22b-q5_k_s.gguf -p "The meaning to life and the universe is"
Server:
llama-server --hf-repo Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF --hf-file chatwaifu_v2.0_22b-q5_k_s.gguf -c 2048
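Once llama-server is running, recent llama.cpp builds expose an OpenAI-compatible API on port 8080 by default, so you can query it from Python. A minimal sketch using the openai client package, assuming those defaults; the model name passed below is informational only.

    from openai import OpenAI

    # Point the OpenAI client at the local llama-server endpoint (default port 8080).
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

    completion = client.chat.completions.create(
        model="chatwaifu_v2.0_22b-q5_k_s",  # llama-server serves whichever model it loaded
        messages=[
            {"role": "system", "content": "You are ムラサメ from Senren*Banka."},
            {"role": "user", "content": "こんにちは！"},
        ],
    )
    print(completion.choices[0].message.content)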
Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.
git clone https://github.com/ggerganov/llama.cpp
Step 2: Move into the llama.cpp folder and build it with the LLAMA_CURL=1
flag along with other hardware-specific flags (for example, LLAMA_CUDA=1 for Nvidia GPUs on Linux).
cd llama.cpp && LLAMA_CURL=1 make
Step 3: Run inference through the main binary.
./llama-cli --hf-repo Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF --hf-file chatwaifu_v2.0_22b-q5_k_s.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Triangle104/ChatWaifu_v2.0_22B-Q5_K_S-GGUF --hf-file chatwaifu_v2.0_22b-q5_k_s.gguf -c 2048