puneeshkhanna committed
Commit e3bdfe3 · Parent: c0da908
Update README.md

README.md CHANGED
```diff
@@ -28,7 +28,7 @@ Falcon3-10B-Base supports 4 languages (english, french, spanish, portuguese) and
 - Grouped query attention (GQA) for faster inference: 12 query heads and 4 key value heads
 - Wider head dimension: 256
 - High RoPE value to support long context understanding: 1000042
-
+- Uses SwiGLU and RMSNorm
 - 32K context length
 - 131K vocab size
 - Depth-up-scaled from **Falcon3-7B-Base** with 2 Teratokens of data comprising web, code, STEM, high-quality and multilingual data, using 2048 H100 GPU chips
```
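The GQA layout in the list above (12 query heads sharing 4 key/value heads, head dimension 256) can be sketched in plain NumPy. This is an illustrative shape exercise, not Falcon3's actual implementation; the function and variable names are assumptions.

```python
import numpy as np

# Hedged sketch of grouped-query attention (GQA) with the head layout
# from the README: 12 query heads, 4 key/value heads, head_dim 256.
n_q_heads, n_kv_heads, head_dim = 12, 4, 256
group = n_q_heads // n_kv_heads  # 3 query heads share each KV head

def gqa_scores(q, k):
    """q: (n_q_heads, seq, head_dim); k: (n_kv_heads, seq, head_dim)."""
    # Repeat each KV head so every group of query heads reuses the same
    # keys, shrinking the KV cache by n_q_heads / n_kv_heads (here 3x).
    k_rep = np.repeat(k, group, axis=0)                 # (12, seq, head_dim)
    return q @ k_rep.transpose(0, 2, 1) / np.sqrt(head_dim)

q = np.random.randn(n_q_heads, 8, head_dim)
k = np.random.randn(n_kv_heads, 8, head_dim)
print(gqa_scores(q, k).shape)  # (12, 8, 8)
```

Sharing 4 KV heads across 12 query heads is what makes inference faster: the KV cache stores 4 heads per layer instead of 12, while attention quality stays close to full multi-head attention.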
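The unusually large RoPE base (1000042, versus the common 10000) is what supports the 32K context: with a larger base, the per-channel rotation frequencies decay more slowly, so distant positions remain distinguishable. A minimal sketch of that frequency schedule, with illustrative names:

```python
import numpy as np

# Hedged sketch: rotary position embedding (RoPE) inverse frequencies with
# the large base (theta = 1000042) mentioned in the README.
head_dim = 256
theta = 1000042.0

def rope_inv_freqs(base, dim):
    # One inverse frequency per pair of channels: base ** (-2i / dim).
    return base ** (-np.arange(0, dim, 2) / dim)

inv_freq = rope_inv_freqs(theta, head_dim)
# A larger base shrinks the low frequencies relative to the common 10000,
# stretching the usable positional range for long contexts:
print(inv_freq[-1] < rope_inv_freqs(10000.0, head_dim)[-1])  # True
```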