hf-100/Jamba-1.5-mini-Spellbound-StoryWriter-0.1-6583896-ckpt81-merged

Text Generation

Mixture of Experts

Inference Endpoints


Spellbound Jamba Mini: Creative output over long contexts

Main Goals

The main goals behind the choice of base model and the post-training regime are:

Strong steerability
Coherence over long context lengths
Flexible writing styles
Advanced formatting that allows identifying individual speakers

There was also a secondary training objective: to teach the model to understand and produce directives expressed as XML tags. The prompt can include the following tags:

<${characterName}Description>: A character definition written as a markdown list of details. For example:
- Name: Character Name
- Personality: Character Personality
- Speaker ID: 32AN4R (see <quote> tag below)
- ...
<writingInstructions>: A block of markdown formatted instructions representing what should happen in the story.
<pastStory>: A block containing the events that precede the story being written.
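
As a rough illustration of how a prompt using these tags might be assembled, here is a minimal Python sketch. Several things in it are assumptions rather than documented behavior: the helper names `character_tag` and `build_prompt`, the rule for turning a character name into a tag name (non-alphanumeric characters stripped), the use of matching closing tags, the order of the blocks, and the sample character itself.

```python
import re


def character_tag(name: str) -> str:
    # Assumption: the tag name is the character name with non-alphanumeric
    # characters removed, followed by "Description".
    return re.sub(r"[^A-Za-z0-9]", "", name) + "Description"


def build_prompt(characters, writing_instructions, past_story=""):
    """Assemble a prompt from character details, instructions, and prior story.

    Assumption: each block is closed with a matching closing tag and blocks
    are separated by blank lines; the model card does not specify either.
    """
    parts = []
    for details in characters:
        tag = character_tag(details["Name"])
        # Each character is described as a markdown list of "Key: Value" details.
        lines = "\n".join(f"- {key}: {value}" for key, value in details.items())
        parts.append(f"<{tag}>\n{lines}\n</{tag}>")
    if past_story:
        parts.append(f"<pastStory>\n{past_story}\n</pastStory>")
    parts.append(
        f"<writingInstructions>\n{writing_instructions}\n</writingInstructions>"
    )
    return "\n\n".join(parts)


# Hypothetical example data, not taken from the training set.
prompt = build_prompt(
    characters=[
        {"Name": "Mira Vale", "Personality": "Dry, observant", "Speaker ID": "32AN4R"}
    ],
    writing_instructions="- Mira finds the hidden door.",
    past_story="Mira arrived at the lighthouse at dusk.",
)
print(prompt)
```

The resulting string would then be passed to the model as the user prompt; how it is wrapped in a chat template depends on the inference setup.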

Output can optionally include the following tags:

<quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model will output the speech surrounded by <quote speaker="{speakerId}"> and </quote>. The model learns to keep speech in character this way, and it allows for identifying different speakers for rendering and text-to-speech purposes
<action>: Represents an action taken by a character
<sound>: Represents a sound made in the story

Instructing the model to produce these tags is optional, but the model should produce its best output when it does, provided the frontend in use can parse or ignore them.
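
For a frontend that needs to handle these tags, the following Python sketch shows one possible approach: pulling out quotes by speaker ID (for rendering or text-to-speech) and stripping the tags for plain display. It assumes that `<action>` and `<sound>` are closed with `</action>` and `</sound>` and that tags are not nested inside quotes; the function names and the sample text are hypothetical.

```python
import re

# Matches <quote speaker="ID">...</quote> and captures the ID and the speech.
QUOTE_RE = re.compile(r'<quote speaker="([^"]+)">(.*?)</quote>', re.DOTALL)
# Matches any opening or closing quote/action/sound tag.
TAG_RE = re.compile(r"</?(?:quote|action|sound)\b[^>]*>")


def extract_quotes(text):
    """Return (speaker_id, speech) pairs in the order they appear."""
    return [(m.group(1), m.group(2).strip()) for m in QUOTE_RE.finditer(text)]


def strip_tags(text):
    """Remove the markup but keep the enclosed text, for plain rendering."""
    return TAG_RE.sub("", text)


# Hypothetical model output, written for this example.
sample = (
    'Mira knocked. <sound>Thud.</sound> '
    '<quote speaker="32AN4R">Is anyone there?</quote> '
    "<action>She pushes the door.</action>"
)

print(extract_quotes(sample))
print(strip_tags(sample))
```

A regular expression is enough for this flat tag structure; a frontend that expects nested or malformed tags would likely want a more tolerant parser.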

Post-training Details

Post-training consists of 1 epoch of SFT LoRA training:

Trained on synthetic instructions for strong steerability
Outputs rated by tryspellbound.com beta users who opted-in
LoRA Rank: 8
Batch Size: 2
Learning Rate: 1e-5

Model Creator

Made by tryspellbound.com.

Note: This is a continued run from the chkpt53 version

Downloads last month: 10

Safetensors · Model size: 51.6B params · Tensor type: BF16
