Spellbound Jamba Mini: Creative output over long contexts
Main Goals
The main goals of the base model choice and post-training regime are:
- Strong steerability
- Coherence over long context lengths
- Flexible writing styles
- Advanced formatting that allows identifying individual speakers
There was also a secondary training objective: to teach the model to understand and produce directives in XML tags.
- <${characterName}Description>: A definition of a character, written as a markdown list of details. For example:
  - Name: Character Name
  - Personality: Character Personality
  - Speaker ID: 32AN4R (see the <quote> tag below)
  - ...
- <writingInstructions>: A block of markdown-formatted instructions representing what should happen in the story.
- <pastStory>: A block containing the preceding events of the story being written.
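As a rough illustration of how a frontend might assemble these input tags, here is a minimal Python sketch. The `Character` class and the `character_block` and `build_prompt` helpers are hypothetical names introduced for this example. The block ordering, the use of closing tags, and the way the character name is turned into a tag name are assumptions, not something this model card specifies.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Character:
    name: str
    details: Dict[str, str] = field(default_factory=dict)
    speaker_id: Optional[str] = None  # optional; used by <quote speaker="..."> in output


def character_block(character: Character) -> str:
    # Assumption: the tag name is the character name with whitespace removed,
    # followed by "Description" (e.g. "Aria Vale" -> <AriaValeDescription>).
    tag = "".join(character.name.split()) + "Description"
    lines = [f"- Name: {character.name}"]
    lines += [f"- {key}: {value}" for key, value in character.details.items()]
    if character.speaker_id is not None:
        lines.append(f"- Speaker ID: {character.speaker_id}")
    body = "\n".join(lines)
    return f"<{tag}>\n{body}\n</{tag}>"


def build_prompt(characters: List[Character], writing_instructions: str, past_story: str) -> str:
    # Assumption: each block is closed with a matching end tag and blocks are
    # separated by blank lines, in this order.
    blocks = [character_block(c) for c in characters]
    blocks.append(f"<writingInstructions>\n{writing_instructions}\n</writingInstructions>")
    blocks.append(f"<pastStory>\n{past_story}\n</pastStory>")
    return "\n\n".join(blocks)
```

For example, `build_prompt([Character("Aria Vale", {"Personality": "Wry"}, "32AN4R")], "- Aria enters the tavern.", "It was raining.")` yields an `<AriaValeDescription>` block followed by the instruction and past-story blocks.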
Output can optionally include the following tags:
- <quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model will output that character's speech surrounded by <quote speaker="{speakerId}"> and </quote>. The model learns to keep speech in character this way, and the tags allow different speakers to be identified for rendering and text-to-speech purposes.
- <action>: Represents an action taken by a character.
- <sound>: Represents a sound made in the story.
Instructing the model to produce these tags is optional, but the model should produce its best output when the tags are enabled and the frontend in use can parse or ignore them.
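A frontend that wants to render speakers differently, or strip the tags entirely, needs some way to split the output. The following is a minimal Python sketch of one way to do that with a regular expression. `Segment`, `parse_output`, and `strip_tags` are hypothetical names for this example, and the sketch assumes the tags are well formed and never nested.

```python
import re
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    kind: str                # "quote", "action", "sound", or "narration"
    text: str
    speaker: Optional[str]   # speaker ID for tagged quotes, otherwise None


# Matches <quote speaker="...">...</quote>, <action>...</action>, <sound>...</sound>.
# Assumption: tags are well formed and not nested inside each other.
TAG_PATTERN = re.compile(
    r'<(?P<kind>quote|action|sound)(?:\s+speaker="(?P<speaker>[^"]*)")?>'
    r'(?P<text>.*?)</(?P=kind)>',
    re.DOTALL,
)


def parse_output(raw: str) -> List[Segment]:
    """Split model output into tagged segments and untagged narration."""
    segments: List[Segment] = []
    cursor = 0
    for match in TAG_PATTERN.finditer(raw):
        narration = raw[cursor:match.start()].strip()
        if narration:
            segments.append(Segment("narration", narration, None))
        segments.append(Segment(match["kind"], match["text"].strip(), match["speaker"]))
        cursor = match.end()
    tail = raw[cursor:].strip()
    if tail:
        segments.append(Segment("narration", tail, None))
    return segments


def strip_tags(raw: str) -> str:
    """Drop the tags but keep their inner text, for plain-text frontends."""
    return TAG_PATTERN.sub(lambda m: m["text"], raw)
```

A text-to-speech pipeline could then map each `speaker` value on `quote` segments to a voice, while a plain-text frontend could call `strip_tags` and ignore the structure.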
Post-training Details
Post-training consists of 1 epoch of SFT LoRA training:
- Trained on synthetic instructions for strong steerability
- Outputs rated by tryspellbound.com beta users who opted-in
- LoRA Rank: 8
- Batch Size: 2
- Learning Rate: 1e-5
Model Creator
Made by tryspellbound.com.
Note: This is a continued run from the chkpt53 version