big thanks to lore for the 8xH100 gpus

awq

zero point, 128 group size, 4 bit gemm
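
a rough sketch of how those settings map onto a quant config, assuming the quant was made with autoawq (this card doesn't say which tool was used). the dict keys follow autoawq's usual `quant_config` layout, and `src_path` / `dst_path` are placeholder names, not real paths.

```python
# AutoAWQ-style quantization settings matching the line above.
QUANT_CONFIG = {
    "zero_point": True,   # asymmetric quantization with a zero point
    "q_group_size": 128,  # one scale / zero point per group of 128 weights
    "w_bit": 4,           # 4-bit weights
    "version": "GEMM",    # GEMM kernel layout
}


def quantize(src_path: str, dst_path: str) -> None:
    """Quantize the model at src_path and save it to dst_path (both hypothetical)."""
    # imported lazily so the config can be inspected without autoawq installed
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(src_path)
    tokenizer = AutoTokenizer.from_pretrained(src_path)
    model.quantize(tokenizer, quant_config=QUANT_CONFIG)
    model.save_quantized(dst_path)
    tokenizer.save_pretrained(dst_path)
```

calling `quantize(...)` needs a gpu and the full precision weights, so treat it as an outline rather than the exact script used here.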

training

base model is meta llama 3 8b instruct. i trained it on pippa first, then trained that model on limarp, both at 32k context for 2 epochs each

gen settings

i would start with every sampler off, temperature at 1 and min p at 0.05. i got good results from this, but u can also try the gen settings from shori, which are copy pasted below

  • Main choice (may have repetition issues)
    • Temperature: 1.0; Min-P: 0.05-0.10; Presence Penalty: 0.35-0.45
  • Alternative 1 (appears to solve repetition issues while being coherent, but responses might be less truthful)
    • Temperature: 2.40-2.50; Min-P: 0.40; Frequency penalty: 0.10-0.15; Temperature last.
  • Alternative 2
    • Mirostat type: 2, Mirostat Tau: 2.80-3.00; Mirostat Eta: 0.0175-0.0200; neutralize or disable all other samplers
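
if min p and "temperature last" are unfamiliar, here is a small pure python sketch of the idea. it is not the code any particular backend runs, and the function names are made up for illustration. min p keeps every token whose probability is at least `min_p` times the top token's probability. "temperature last" means the cutoff is computed before temperature is applied.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_keep(logits, min_p, temperature=1.0, temperature_last=False):
    """Return the token indices that survive min-p filtering."""
    # with temperature last, the cutoff is computed on the unscaled distribution
    probs = softmax(logits, 1.0 if temperature_last else temperature)
    cutoff = min_p * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]


def sample(logits, min_p, temperature=1.0, temperature_last=False, rng=random):
    """Sample one token index from the min-p filtered distribution."""
    keep = min_p_keep(logits, min_p, temperature, temperature_last)
    # temperature is applied to the surviving logits in both orderings
    probs = softmax([logits[i] for i in keep], temperature)
    return rng.choices(keep, weights=probs, k=1)[0]
```

with toy logits `[2.0, 1.0, 0.0, -3.0]`, min p 0.40 and temperature 2.5, `min_p_keep` keeps three tokens when temperature is applied first but only the top token when temperature is last. that is why Alternative 1 can run such a high temperature and still stay coherent.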

prompting

use the llama 3 instruct format

use <|eot_id|> as the stopping sequence/string/token
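
for anyone building prompts by hand, a minimal sketch of the standard llama 3 instruct layout plus cutting output at the stop string. `build_prompt` and `trim_at_stop` are hypothetical helper names. the reply header defaults to `assistant` as in the stock llama 3 format, but the agnaistic prompt on this card uses `response`, so `reply_role` is left as a parameter.

```python
EOT = "<|eot_id|>"


def header(role: str) -> str:
    """Role header in the standard Llama 3 instruct layout."""
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n"


def build_prompt(system, turns, reply_role="assistant"):
    """Build a prompt; turns is a list of (role, text) pairs, oldest first."""
    parts = ["<|begin_of_text|>"]
    if system:
        parts.append(header("system") + system + EOT)
    for role, text in turns:
        parts.append(header(role) + text + EOT)
    # leave the reply header open so the model writes the next message
    parts.append(header(reply_role))
    return "".join(parts)


def trim_at_stop(generated: str, stop: str = EOT) -> str:
    """Cut generated text at the first stop string, if present."""
    return generated.split(stop, 1)[0]
```

most frontends do the trimming for you once <|eot_id|> is set as a stop string, so `trim_at_stop` only matters if the backend returns raw text.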

ST jsons: instruct, context

agnaistic prompt:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{#if system}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{system}}<|eot_id|>{{/if}}Write {{char}}'s next reply in a fictional roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}}.

{{char}}'s Persona: {{personality}}

{{#if memory}}
Important details:
{{memory}}
{{/if}}

{{#if example_dialogue}}This is how {{char}} should talk:
{{example_dialogue}}{{/if}}

This scenario of the conversation: {{scenario}}

Then the roleplay chat between {{#each bot}}{{.name}}, {{/each}}{{char}} and {{user}} begins.<|eot_id|>

{{#each msg}}{{#if .isbot}}<|start_header_id|>response<|end_header_id|>{{/if}}{{#if .isuser}}<|start_header_id|>user<|end_header_id|>{{/if}}{{.name}}: {{.msg}}<|eot_id|>
{{/each}}
{{#if ujb}}<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{ujb}}<|eot_id|>{{/if}}
<|start_header_id|>response<|end_header_id|>{{post}}