Update README.md
README.md CHANGED
@@ -15,7 +15,7 @@ Jamba is the first production-scale Mamba implementation, which opens up interes
 
 This model card is for the base version of Jamba. It’s a pretrained, mixture-of-experts (MoE) generative text model, with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length, and can fit up to 140K tokens on a single 80GB GPU.
 
-For full details of this model please read the [release blog post](https://www.ai21.com/blog/announcing-jamba).
+For full details of this model please read the [white paper](https://arxiv.org/abs/2403.19887) and the [release blog post](https://www.ai21.com/blog/announcing-jamba).
 
 ## Model Details
 
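For context on the paragraph touched by this change, the sketch below shows how a model described in such a card is typically loaded and queried with the Hugging Face `transformers` library. It is a minimal illustration, not part of this diff: it assumes the repository ID `ai21labs/Jamba-v0.1` and a `transformers` release that includes Jamba support, and the generation settings are placeholders.

```python
# Minimal sketch (assumed repository ID and library support, not taken from this diff):
# load the Jamba base model and run a short generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # requires `accelerate`; places layers on available devices
)

prompt = "Jamba is a hybrid SSM-Transformer model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```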