typo fix
README.md
CHANGED
@@ -310,8 +310,8 @@ output = tokenizer.batch_decode(output)
 print(output)
 ```
 
-**Model
-Granite-3.0-3B-A800M-Instruct is based on a decoder-only sparse Mixture of Experts(MoE) transformer architecture. Core components of this architecture are: Fine-grained Experts, Dropless Token Routing, and Load Balancing Loss.
+**Model Architecture:**
+Granite-3.0-3B-A800M-Instruct is based on a decoder-only sparse Mixture of Experts (MoE) transformer architecture. Core components of this architecture are: Fine-grained Experts, Dropless Token Routing, and Load Balancing Loss.
 
 | Model | 2B Dense | 8B Dense | 1B MoE | 3B MoE |
 | :-------- | :--------| :--------| :--------| :-------- |