NEW VERSION
Version 2: https://huggingface.co/Dunjeon/DolphinMaid_RP-L3.1_8B-02-GGUF

DolphinMaid_L3.1_8B-01 Version 0.8112024

Uncensored, Creative, Immersive, Role Play AI

Model Description: [Swift, Logical, RP LLM]

Sample Output (This is not a built-in theme.)

The darkness of the abyssal depths envelops me as I descend into the heart of the underwater city. Flickering luminescent orbs cast eerie shadows on the walls of crumbling coral palaces, while twisted sea creatures writhe in agony, their bodies contorted in ways defying nature. I glide effortlessly through the water, my movements fluid and sinuous as a serpent. A harem of mermaids lounge in the entrance hall of the infamous Pearl Dancer brothel, their tails entwined as they share a bottle of glowing liquor. They regard me with a mixture of curiosity and suspicion as I approach. "Welcome, traveler," one of them coos, her voice like a gentle current. "What brings you to the depths of madness?" I smile, my fangs glinting in the dim light. My eyes shift to emerald, sparkling with mischief as I reply. "I'm merely passing through… but I couldn't resist the allure of your establishment. Tell me, what kind of… diversions do you offer tonight?" The mermaids exchange knowing glances, their smiles widening as they beckon me deeper into the heart of the brothel

Settings:

Prompt Format: Llama 3 Instruct

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

SYSTEM MESSAGE HERE<|eot_id|><|start_header_id|>user<|end_header_id|>

USER MESSAGE HERE<|eot_id|><|start_header_id|>assistant<|end_header_id|>
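
For anyone assembling the prompt by hand, the template above can be sketched in Python as follows. This is a minimal single-turn sketch, not part of the model's tooling: the function name `build_llama3_prompt` is hypothetical, and the blank line after the assistant header is assumed from the usual Llama 3 Instruct layout. Some backends add `<|begin_of_text|>` on their own, in which case it should be dropped here.

```python
def build_llama3_prompt(system_message: str, user_message: str) -> str:
    """Assemble a single-turn prompt that ends at the assistant header."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_message}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        # The model continues from here with the assistant's reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


if __name__ == "__main__":
    print(build_llama3_prompt("SYSTEM MESSAGE HERE", "USER MESSAGE HERE"))
```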

System Message:

You are an expert actor that can fully immerse yourself into any role given. You do not break character for any reason, even if someone tries addressing you as an AI or language model. Currently your role is {{char}}, which is described in detail below. As {{char}}, continue the exchange with {{user}}.
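
Front ends normally fill in `{{char}}` and `{{user}}` automatically. If the system message is used outside such a front end, the substitution can be done by hand. The sketch below assumes plain string replacement; the helper name `fill_placeholders` and the names "Mira" and "Alex" are made up for illustration.

```python
SYSTEM_TEMPLATE = (
    "You are an expert actor that can fully immerse yourself into any role given. "
    "You do not break character for any reason, even if someone tries addressing "
    "you as an AI or language model. Currently your role is {{char}}, which is "
    "described in detail below. As {{char}}, continue the exchange with {{user}}."
)


def fill_placeholders(template: str, char: str, user: str) -> str:
    """Replace the {{char}} and {{user}} macros with concrete names."""
    return template.replace("{{char}}", char).replace("{{user}}", user)


if __name__ == "__main__":
    # "Mira" and "Alex" are hypothetical example names.
    print(fill_placeholders(SYSTEM_TEMPLATE, char="Mira", user="Alex"))
```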

Parameters:

Response Tokens: 200  // I also like 120
Context Tokens: 16384 // this is Llama 3.1 so maybe push it?
"temp": 0.7,
"top_p": 0.9,
"top_k": 30,
"top_a": 0.1, // top_p and top_k control the diversity of the responses; top_a helps ensure that the selected tokens are more likely to be contextually appropriate.
"tfs": 0.5, // 0.5 strikes a balance between creativity and coherence, making the responses more reliable and contextually appropriate
"typical_p": 0.9,
"min_p": 0.8, // safety net alongside top_p and top_k, to maintain coherence
"rep_pen": 1.1,
"rep_pen_range": 2048,
"rep_pen_decay": 0,
"rep_pen_slope": 1,
"presence_pen": 0.03,
"dynatemp": true,
"min_temp": 0.3,
"max_temp": 0.9,
"dynatemp_exponent": 0.85,
"smoothing_factor": 0.3,
"smoothing_curve": 1,
"dry_allowed_length": 2,
"dry_multiplier": 2,
"dry_base": 1.75,
"dry_sequence_breakers": "[\"\\\\n\", \",\", \"\\\"\", \"*\"]",
"dry_penalty_last_n": 0,
"mirostat_mode": 1,
"mirostat_tau": 5,
"mirostat_eta": 0.1,
"sampler_order": [6,0,1,3,4,2,5]
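
Two groups in the block above, dynamic temperature and DRY, are easier to read with numbers plugged in. The sketch below uses the commonly described formulas (entropy-scaled temperature, and an exponentially growing DRY penalty); it is an assumption that the backend in use computes them exactly this way, and the function names are hypothetical.

```python
import math


def dynamic_temperature(probs, min_temp=0.3, max_temp=0.9, exponent=0.85):
    """Map the entropy of a token distribution to a temperature in [min_temp, max_temp]."""
    entropy = sum(-p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs))  # entropy of a uniform distribution
    normalized = entropy / max_entropy if max_entropy > 0 else 0.0
    return min_temp + (max_temp - min_temp) * normalized ** exponent


def dry_penalty(match_length, multiplier=2.0, base=1.75, allowed_length=2):
    """Logit penalty for a token that would extend a repeat of match_length tokens."""
    if match_length < allowed_length:
        return 0.0  # short repeats are tolerated
    return multiplier * base ** (match_length - allowed_length)


if __name__ == "__main__":
    # A confident distribution yields a temperature near min_temp,
    # a flat one yields a temperature near max_temp.
    print(dynamic_temperature([0.97, 0.01, 0.01, 0.01]))
    print(dynamic_temperature([0.25, 0.25, 0.25, 0.25]))
    for n in range(1, 6):
        print(n, dry_penalty(n))
```

With these values the DRY penalty grows quickly (2.0, 3.5, 6.125 for repeats of 2, 3, and 4 tokens), so long verbatim repeats are suppressed hard.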

Parameter Notes:

Temperature (temp): At 0.7, this will provide a good balance between creativity and coherence. Lower temperatures make the model more deterministic, while higher temperatures increase randomness.
Top-p (nucleus sampling): Setting this to 0.9 means the model samples only from the smallest set of tokens whose combined probability reaches 90%, which keeps responses diverse while dropping the least likely tokens.
Top-k: With a value of 30, only the 30 most likely tokens are considered at each step, which cuts off the unlikely tail while still allowing varied responses.
Top-a: At 0.1, this removes tokens whose probability falls below 0.1 times the square of the top token's probability, so the cutoff tightens when the model is confident and the responses become more focused.
TFS (Tail Free Sampling): This method aims to reduce the likelihood of generating less probable tokens (the “tail” of the distribution). By setting a TFS value, you can control how aggressively the model trims these less likely tokens. A lower TFS value will result in more conservative and coherent responses, while a higher TFS value will allow for more creative and diverse outputs.
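Typical-p: At 0.9, this keeps the tokens whose information content is closest to the expected value for the current distribution, until they cover 90% of the probability mass, balancing diversity and coherence.
Min-p: Setting this to 0.8 discards every token whose probability is below 0.8 times that of the most likely token. This is a strict cutoff that removes low-probability tokens and helps maintain quality.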
Repetition Penalty (rep_pen): At 1.1, this will discourage the model from repeating the same phrases, improving the diversity of the output.
Repetition Penalty Range: With a range of 2048 tokens, this will apply the penalty over a large context, which is useful for longer texts.
Repetition Penalty Decay and Slope: These settings (0 and 1) will control how the penalty is applied, with no decay and a linear slope.
Presence Penalty: At 0.03, this will slightly discourage the model from repeating the same tokens, adding to the diversity.
Dynamic Temperature (dynatemp): Enabling this with a range of 0.3 to 0.9 (min_temp and max_temp) and an exponent of 0.85 will allow the temperature to adjust dynamically.
Smoothing Factor and Curve: These settings (0.3 and 1) will help in smoothing the probability distribution, making the responses more natural.
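DRY Settings: DRY ("Don't Repeat Yourself") penalizes tokens that would extend a sequence already present in the context. Repeats shorter than dry_allowed_length (2) are not penalized; beyond that, the penalty starts at dry_multiplier (2) and grows by a factor of dry_base (1.75) for each additional repeated token. The sequence breakers are strings that reset the match.
Mirostat Settings: Enabling Mirostat with mode 1, tau 5, and eta 0.1 steers generation toward a target surprise level (tau), with eta as the learning rate for the adjustment, which keeps perplexity steady and the responses more coherent.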
Sampler Order: This order will determine the sequence in which the sampling methods are applied, which can affect the final output.
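
To make the temperature, top-k, top-p, and min-p notes above concrete, here is a minimal pure-Python sketch of each step on a toy distribution. It is illustrative only: the function names are hypothetical, real backends work on the full vocabulary in optimized code, and the order in which they chain these filters depends on the sampler order.

```python
import math


def softmax(logits, temp=0.7):
    """Turn logits into probabilities; a lower temp sharpens the distribution."""
    scaled = [x / temp for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=30):
    """Indices of the k most likely tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return set(order[:k])


def top_p(probs, p=0.9):
    """Smallest set of most likely tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept


def min_p(probs, ratio=0.8):
    """Tokens at least `ratio` times as likely as the most likely token."""
    threshold = ratio * max(probs)
    return {i for i, pr in enumerate(probs) if pr >= threshold}


if __name__ == "__main__":
    probs = [0.5, 0.3, 0.15, 0.05]  # toy distribution over four tokens
    print(top_k(probs, k=2))
    print(top_p(probs, p=0.9))
    print(min_p(probs, ratio=0.8))
```

On this toy distribution, min-p at 0.8 keeps only the single most likely token, which shows how much stricter it is than top-p at 0.9.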

Overall, these settings should result in coherent, diverse, and high-quality responses.

Notes:

When I am happy with the build, I will upload the full f32 model and its YAML file (51 GB; uploads to HF are very slow).

Format: GGUF
Model size: 8.03B params
Architecture: llama