hf-100/Jamba-1.5-mini-Spellbound-StoryWriter-0.1-6583896-ckpt53-merged


Spellbound Jamba Mini: Creative output over long contexts

Main Goals

The base model choice and post-training regime were driven by four main goals:

Strong steerability
Coherence over long context lengths
Flexible writing styles
Advanced formatting that allows identifying individual speakers

A secondary training objective was to teach the model to understand and produce directives wrapped in XML tags.

Input can include the following tags:

<${characterName}Description>: A character definition written as a markdown list of details. For example:
- Name: Character Name
- Personality: Character Personality
- Speaker ID: 32AN4R (see <quote> tag below)
- ...
<writingInstructions>: A block of markdown-formatted instructions describing what should happen in the story.
<pastStory>: A block containing the events that precede the story being written.
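
The input tags above could be assembled into a prompt along the following lines. This is a minimal sketch, not the official prompt template: the `build_prompt` helper, the use of closing tags, the ordering of the blocks, and the sample character are all assumptions made for illustration.

```python
def build_prompt(characters, writing_instructions, past_story=""):
    """Assemble a tagged prompt. The overall layout is an assumption."""
    parts = []
    for tag_name, details in characters.items():
        # Each character becomes a <{name}Description> block holding a markdown list.
        bullet_list = "\n".join(f"- {key}: {value}" for key, value in details.items())
        parts.append(f"<{tag_name}Description>\n{bullet_list}\n</{tag_name}Description>")
    parts.append(f"<writingInstructions>\n{writing_instructions}\n</writingInstructions>")
    if past_story:
        # Only include the preceding events when there are any.
        parts.append(f"<pastStory>\n{past_story}\n</pastStory>")
    return "\n\n".join(parts)


# Hypothetical character and story, for illustration only.
prompt = build_prompt(
    characters={
        "Mira": {"Name": "Mira", "Personality": "Wry, cautious", "Speaker ID": "32AN4R"}
    },
    writing_instructions="- Mira finds the hidden door.",
    past_story="Mira spent the night searching the library.",
)
print(prompt)
```

The Speaker ID given in the character description is what the model is expected to echo back in the `speaker` attribute of its `<quote>` tags.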

Output can optionally include the following tags:

<quote speaker="{speakerId}">: When a character is defined with a speaker ID, the model wraps that character's speech in <quote speaker="{speakerId}"> and </quote>. This helps the model keep speech in character and lets a frontend identify individual speakers for rendering and text-to-speech purposes.
<action>: Represents an action taken by a character
<sound>: Represents a sound made in the story

Instructing the model to produce these tags is optional. The model should produce its best output when it is allowed to use them, provided the frontend can parse or ignore the tags.
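
A frontend could handle the output tags roughly as follows. This is a sketch under assumptions: the `parse_story` and `strip_tags` helpers are hypothetical, the sample output is invented, and it assumes `<action>` and `<sound>` are closed with `</action>` and `</sound>` in the same way `<quote>` is closed with `</quote>`.

```python
import re

# Matches <quote speaker="...">...</quote>, <action>...</action>, <sound>...</sound>.
TAG_PATTERN = re.compile(
    r'<quote speaker="([^"]+)">(.*?)</quote>|<(action|sound)>(.*?)</\3>',
    re.DOTALL,
)


def parse_story(text):
    """Split model output into (kind, speaker_id, content) segments."""
    segments = []
    position = 0
    for match in TAG_PATTERN.finditer(text):
        # Untagged text between two tags is treated as narration.
        narration = text[position:match.start()].strip()
        if narration:
            segments.append(("narration", None, narration))
        if match.group(1) is not None:
            segments.append(("quote", match.group(1), match.group(2).strip()))
        else:
            segments.append((match.group(3), None, match.group(4).strip()))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        segments.append(("narration", None, tail))
    return segments


def strip_tags(text):
    """Drop the tags but keep their contents, for frontends that ignore them."""
    return re.sub(r"</?(?:quote|action|sound)\b[^>]*>", "", text)


# Invented sample output, for illustration only.
sample = (
    'Mira paused. <quote speaker="32AN4R">Who is there?</quote> '
    "<sound>A floorboard creaks.</sound>"
)
for segment in parse_story(sample):
    print(segment)
print(strip_tags(sample))
```

The quote segments carry the speaker ID, so a text-to-speech layer could map each ID to a voice, while `strip_tags` covers the simpler case of a frontend that only needs plain prose.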

Post-training Details

Post-training consists of 1 epoch of SFT LoRA training:

Trained on synthetic instructions for strong steerability
Outputs rated by tryspellbound.com beta users who opted-in
LoRA rank: 8
Batch size: 2
Learning rate: 1e-5

Model Creator

Made by tryspellbound.com.

Downloads last month: 12

Model size: 51.6B params (Safetensors)
Tensor type: BF16
