
Spellbound Jamba Mini: Creative output over long contexts

Main Goals

The main goals of the base model choice and post-training regime are:

  • Strong steerability
  • Coherence over long context lengths
  • Flexible writing styles
  • Advanced formatting that allows identifying individual speakers

There was also a secondary training objective: teaching the model to understand and produce directives in XML tags. The prompt uses the following tags (an illustrative example follows the list):

  • <${characterName}Description>: A character definition, written as a markdown list of details. For example:
    • Name: Character Name
    • Personality: Character Personality
    • Speaker ID: 32AN4R (see <quote> tag below)
    • ...
  • <writingInstructions>: A block of markdown formatted instructions representing what should happen in the story.
  • <pastStory>: A block containing the events preceding the story being written
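
For illustration, a prompt using these tags might look like the example below. The character name, details, and story text are invented placeholders; only the tag structure reflects the format described above.

```xml
<AliceDescription>
- Name: Alice
- Personality: Dry-witted, cautious
- Speaker ID: 32AN4R
</AliceDescription>

<writingInstructions>
- Alice discovers the cellar door is unlocked.
- Keep the tone tense and understated.
</writingInstructions>

<pastStory>
Alice had spent the evening cataloguing the manor's keys, convinced one was missing.
</pastStory>
```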

Output can optionally include the following tags (an illustrative example follows the list):

  • <quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model wraps that character's speech in <quote speaker="{speakerId}"> and </quote>. This teaches the model to keep speech in character and lets the frontend identify individual speakers for rendering and text-to-speech.
  • <action>: Represents an action taken by a character
  • <sound>: Represents a sound made in the story
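
For illustration, an output that uses all three tags might look like the snippet below (invented for this card, not sampled from the model):

```xml
<action>Alice eases the cellar door open.</action>
<sound>The hinges groan in the dark.</sound>
<quote speaker="32AN4R">"Of course it's unlocked," she mutters. "It's always unlocked."</quote>
```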

Instructing the model to produce these tags is optional, but the model produces its best output when the frontend in use can parse (or ignore) these tags.
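
For frontends that want to either render or strip these tags, a minimal Python parsing sketch is shown below. It assumes nothing beyond the tag names described above; the function names and regular expression are illustrative, not part of the model or any official SDK.

```python
import re
from typing import List, Tuple

# Matches <quote ...>, <action>, and <sound> spans, capturing tag name,
# raw attribute string, and inner text.
TAG_PATTERN = re.compile(r"<(quote|action|sound)([^>]*)>(.*?)</\1>", re.DOTALL)


def parse_segments(text: str) -> List[Tuple[str, str, str]]:
    """Split model output into (tag, attributes, content) segments.

    Untagged narration is returned with an empty tag name so a frontend
    can render it as plain prose.
    """
    segments = []
    pos = 0
    for match in TAG_PATTERN.finditer(text):
        if match.start() > pos:
            segments.append(("", "", text[pos:match.start()]))
        segments.append((match.group(1), match.group(2).strip(), match.group(3)))
        pos = match.end()
    if pos < len(text):
        segments.append(("", "", text[pos:]))
    return segments


def strip_tags(text: str) -> str:
    """Drop the markup entirely for frontends that only want plain text."""
    return TAG_PATTERN.sub(lambda m: m.group(3), text)
```

A frontend can map the speaker attribute on <quote> back to the Speaker ID given in the character description, for example to select a voice for text-to-speech.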

Post-training Details

Post-training consists of one epoch of supervised fine-tuning (SFT) with LoRA (a sketch of an equivalent configuration follows the list below):

  • Trained on synthetic instructions for strong steerability
  • Outputs rated by tryspellbound.com beta users who opted in
  • LoRA rank: 8
  • Batch Size: 2
  • Learning Rate: 1e-5
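
The exact training stack is not documented here, but a rough sketch of an equivalent configuration using the peft and transformers libraries is shown below. The base checkpoint name, target modules, LoRA alpha, and dropout are assumptions for illustration; only the rank, batch size, learning rate, and epoch count come from this card.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Assumed base checkpoint; substitute the actual Jamba Mini base model used.
base_model = AutoModelForCausalLM.from_pretrained("ai21labs/AI21-Jamba-Mini-1.6")

lora_config = LoraConfig(
    r=8,                      # LoRA rank, per the card
    lora_alpha=16,            # assumed; not stated in the card
    lora_dropout=0.05,        # assumed; not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

training_args = TrainingArguments(
    output_dir="spellbound-jamba-mini-sft",
    num_train_epochs=1,               # one epoch of SFT, per the card
    per_device_train_batch_size=2,    # batch size, per the card
    learning_rate=1e-5,               # learning rate, per the card
    bf16=True,                        # matches the published BF16 weights
)
```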

Model Creator

Made by tryspellbound.com.

Note: This is a continued run from the chkpt53 version.

Weights: 51.6B parameters, BF16 (safetensors)