Tags: Text Generation, Transformers, Safetensors, English, falcon_mamba, Eval Results, Inference Endpoints
Commit 8c8f700 by JingweiZuo
1 parent: 89fee4a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -250,7 +250,7 @@ The model is based on the Mamba architecture ([Gu et al., 2023](https://arxiv.or
  | `d_model` | 4096 | Hidden dimension |
  | `d_state` | 16 | The SSM state dimension |
  | Vocabulary | 65024 | Vocabulary Size |
- | Sequence length | 8192 | During stages 4 and LR Decay stage |
+ | Sequence length | 8192 | During the last training stages |
 
  ## Compute Infrastructure
 
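
For readers who want to sanity-check the values in the table rows touched by this diff, the following is a minimal sketch, not part of the commit. The class name `TableHyperparams` and the helper `embedding_parameters` are hypothetical and do not correspond to any library API; the only inputs are the four numbers shown in the table. The calculation assumes a single token-embedding matrix of shape vocabulary size by `d_model`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableHyperparams:
    """Hypothetical container mirroring the README table rows (not a library class)."""

    d_model: int = 4096          # hidden dimension
    d_state: int = 16            # SSM state dimension
    vocab_size: int = 65024      # vocabulary size
    sequence_length: int = 8192  # used during the last training stages


def embedding_parameters(hp: TableHyperparams) -> int:
    """Entries in one token-embedding matrix, assumed to be vocab_size x d_model."""
    return hp.vocab_size * hp.d_model


if __name__ == "__main__":
    hp = TableHyperparams()
    print(f"embedding parameters: {embedding_parameters(hp):,}")
```

The commit only rewords the description of the sequence-length row; the numeric values above are unchanged by it.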