tiiuae/falcon-mamba-7b


JingweiZuo committed on Jul 24, 2024

Commit 8c8f700 (1 parent: 89fee4a)

Update README.md

Files changed (1)

README.md +1 -1


@@ -250,7 +250,7 @@ The model is based on the Mamba architecture ([Gu et al., 2023](https://arxiv.or
 | `d_model`          | 4096      | Hidden dimension                       |
 | `d_state`          | 16        | The SSM state dimension                |
 | Vocabulary         | 65024     | Vocabulary Size                        |
-| Sequence length    | 8192      | During stages 4 and LR Decay stage     |
+| Sequence length    | 8192      | During the last training stages        |
 ## Compute Infrastructure