puneeshkhanna committed on
Commit
e3bdfe3
1 Parent(s): c0da908

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ Falcon3-10B-Base supports 4 languages (english, french, spanish, portuguese) and
  - Grouped query attention (GQA) for faster inference: 12 query heads and 4 key value heads
  - Wider head dimension: 256
  - High RoPE value to support long context understanding: 1000042
- - Use SwiGLu and RMSNorm
+ - Uses SwiGLu and RMSNorm
  - 32K context length
  - 131K vocab size
  - Depth-up-scaled from **Falcon3-7B-Base** with 2 Teratokens of datasets comprising of web, code, STEM, high quality and mutlilingual data using 2048 H100 GPU chips
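
The attention figures quoted in the README hunk above (12 query heads, 4 key value heads, head dimension 256) can be turned into concrete sizes with a little arithmetic. The sketch below is only an illustration of what those numbers imply for grouped query attention. The function names `gqa_shapes` and `kv_cache_elements_per_token` are hypothetical and are not part of the model's code, and the per-layer key/value cache estimate assumes one key vector and one value vector are cached per key value head.

```python
# Values copied from the README bullets; everything else is illustrative.
NUM_QUERY_HEADS = 12
NUM_KV_HEADS = 4
HEAD_DIM = 256


def gqa_shapes(num_query_heads: int, num_kv_heads: int, head_dim: int) -> dict:
    """Derive projection widths implied by a grouped query attention layout."""
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("query heads must divide evenly into key value heads")
    return {
        # How many query heads share each key value head.
        "queries_per_kv_head": num_query_heads // num_kv_heads,
        # Total width of the concatenated query heads.
        "q_width": num_query_heads * head_dim,
        # Total width of the concatenated key (or value) heads.
        "kv_width": num_kv_heads * head_dim,
    }


def kv_cache_elements_per_token(num_kv_heads: int, head_dim: int) -> int:
    """Elements cached per token per layer: one key and one value per KV head."""
    return 2 * num_kv_heads * head_dim


shapes = gqa_shapes(NUM_QUERY_HEADS, NUM_KV_HEADS, HEAD_DIM)
gqa_cache = kv_cache_elements_per_token(NUM_KV_HEADS, HEAD_DIM)
# Hypothetical baseline: every query head keeps its own key/value pair.
mha_cache = kv_cache_elements_per_token(NUM_QUERY_HEADS, HEAD_DIM)

print(shapes)
print(f"KV cache per token per layer: {gqa_cache} (GQA) vs {mha_cache} (one KV head per query head)")
```

Under these assumptions, each key value head serves three query heads, so the cached keys and values per token are a third of what a layout with 12 key value heads would need. This is the mechanism behind the "faster inference" claim in the first bullet.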