Jamba-Small v1

This is a pruned version of AI21 Labs' Jamba-v0.1 model, at roughly 25% of the original's size.

Model Details

Whereas Jamba-v0.1 contains 4 Jamba blocks, Jamba-Small contains only 1. This block follows the same structure as the blocks in Jamba-v0.1: 8 layers with a 1:7 ratio of attention to Mamba layers, and MoE applied every 2 layers.
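The layer layout described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the offsets (attention at position 4 within the block, MoE on odd-numbered layers) are assumptions based on the default Jamba configuration in Hugging Face Transformers, not something this card states.

```python
def jamba_block_layout(num_layers=8, attn_period=8, attn_offset=4,
                       expert_period=2, expert_offset=1):
    """Return (layer index, mixer type, feed-forward type) for one Jamba block.

    The period/offset defaults are assumptions (see the text above);
    they are not taken from this model card.
    """
    layout = []
    for i in range(num_layers):
        # One attention layer per block; every other layer is a Mamba layer.
        mixer = "attention" if i % attn_period == attn_offset else "mamba"
        # MoE replaces the dense MLP on every second layer.
        ffn = "moe" if i % expert_period == expert_offset else "mlp"
        layout.append((i, mixer, ffn))
    return layout


if __name__ == "__main__":
    for index, mixer, ffn in jamba_block_layout():
        print(f"layer {index}: {mixer:9s} + {ffn}")
```

Under these assumptions the single block has 1 attention layer, 7 Mamba layers, and 4 MoE layers.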

Jamba-Small's weights are initialized from a subset of the layers in the original Jamba-v0.1 model. For v1, the layers are mapped as follows (left is the Jamba-Small layer index, right is the Jamba-v0.1 layer index):

0: 0
1: 1
2: 2
3: 3
4: 4
5: 5
6: 30
7: 31
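The mapping above could be applied to a checkpoint roughly as follows. This is a minimal sketch, not the script used to build this model: the function name is hypothetical, and the `model.layers.<index>.` key prefix is an assumption based on the usual Hugging Face naming convention. Keys outside the layer stack (embeddings, final norm, output head) are kept unchanged.

```python
import re

# Jamba-Small layer index -> Jamba-v0.1 layer index (from the table above).
LAYER_MAP = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 30, 7: 31}

# Assumed key layout: "model.layers.<index>.<rest of parameter name>".
_LAYER_KEY = re.compile(r"^(model\.layers\.)(\d+)(\..+)$")


def prune_state_dict(state_dict, layer_map=LAYER_MAP):
    """Keep only the mapped layers and renumber them for the small model."""
    source_to_small = {src: dst for dst, src in layer_map.items()}
    pruned = {}
    for key, value in state_dict.items():
        match = _LAYER_KEY.match(key)
        if match is None:
            # Not a per-layer parameter: copy it through unchanged.
            pruned[key] = value
            continue
        prefix, index, suffix = match.groups()
        small_index = source_to_small.get(int(index))
        if small_index is not None:
            pruned[f"{prefix}{small_index}{suffix}"] = value
        # Layers that are not in the mapping are dropped.
    return pruned
```

For example, a key such as `model.layers.30.mamba.in_proj.weight` would be renamed to `model.layers.6.mamba.in_proj.weight`, while any key under `model.layers.7.` in the original model would be dropped.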

No additional fine-tuning has been performed on this model, so its performance is exceptionally poor. It should not be used in production without further training.

Model Description

  • Developed by: Nathan Brown (OxxoCodes)
  • Compute provided by: Clemson Palmetto Cluster
  • Model type: Joint Attention and Mamba (Jamba)
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Original model: Jamba-v0.1
  • Jamba paper: https://arxiv.org/pdf/2403.19887.pdf
  • Model size: 13.3B params (Safetensors)
  • Tensor type: F32
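The parameter count and tensor type listed above give a rough lower bound on the memory needed just to hold the weights. The sketch below is back-of-the-envelope arithmetic only: the helper name is hypothetical, and the estimate ignores activations, caches, and framework overhead.

```python
# Bytes per element for common tensor types.
BYTES_PER_ELEMENT = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(num_params, tensor_type="F32"):
    """Approximate size of the weights alone, in bytes."""
    return num_params * BYTES_PER_ELEMENT[tensor_type]


if __name__ == "__main__":
    params = 13.3e9  # parameter count listed for this model
    for tensor_type in ("F32", "BF16"):
        gigabytes = weight_bytes(params, tensor_type) / 1e9
        print(f"{tensor_type}: about {gigabytes:.1f} GB of weights")
```

At F32 this comes to about 53 GB; casting the weights to a 16-bit type would halve that figure.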