Update README.md
README.md CHANGED
@@ -3,7 +3,9 @@ license: apache-2.0
 ---
 
 ## Model Details
-<
+<p align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/64b6c638ac6d20bae0b93219/GOzs8o4G1apceun92ZC4d.png" alt="Bamba" width="400" height="400">
+</p>
 
 # Model Card for Bamba 9B
 We introduce Bamba-9B, a decoder-only language model based on the [Mamba-2](https://github.com/state-spaces/mamba) architecture, designed to handle a wide range of text generation tasks. It is trained from scratch using a two-stage training approach: in the first stage, the model is trained on 2 trillion tokens from the Dolma v1.7 dataset; in the second stage, it undergoes additional training on 200 billion tokens, leveraging a carefully curated blend of high-quality data to further refine its performance and enhance output quality.
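As a quick orientation for the model card text above, here is a minimal generation sketch. It assumes a `transformers` release that includes Bamba support and that the checkpoint is published on the Hugging Face Hub; the model ID below is a placeholder, so check the model page for the exact repository name and any version requirements.

```python
# Minimal sketch: load the Bamba checkpoint and generate a short continuation.
# Assumes a transformers version with Bamba support; the Hub ID is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-fms/Bamba-9B"  # placeholder Hub ID; see the model page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Exercise the decoder-only LM with a short prompt.
inputs = tokenizer("Mamba-2 is a", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```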