Muennighoff committed on
Commit 8c53520 (1 parent: 4cb792b)

Update README.md

Files changed (1)
  1. README.md +13 -13
README.md CHANGED
@@ -217,25 +217,25 @@ The performance may vary depending on the prompt. For BLOOMZ models, we recommen

  ## Model

- - Architecture: Same as [bloom](https://huggingface.co/bigscience/bloom), also refer to the `config.json` file
- - Finetuning steps: 498
- - Finetuning tokens: 2.09 billion
- - Finetuning layout: 72x pipeline parallel, 1x tensor parallel, 4x data parallel
- - Precision: bfloat16
+ - **Architecture:** Same as [bloom](https://huggingface.co/bigscience/bloom), also refer to the `config.json` file
+ - **Finetuning steps:** 498
+ - **Finetuning tokens:** 2.09 billion
+ - **Finetuning layout:** 72x pipeline parallel, 1x tensor parallel, 4x data parallel
+ - **Precision:** bfloat16

  ## Hardware

- - 288 A100 80GB GPUs (36 nodes)
- - 8 GPUs per node using NVLink 4 inter-gpu connects, 4 OmniPath links
- - NCCL-communications network: a fully dedicated subnet
- - AMD CPUs with 512GB memory per node
+ - **CPUs:** AMD CPUs with 512GB memory per node
+ - **GPUs:** 288 A100 80GB GPUs (36 nodes) with 8 GPUs per node using NVLink 4 inter-gpu connects, 4 OmniPath links
+ - **Communication:** NCCL-communications network with a fully dedicated subnet
+

  ## Software

- - [Megatron-DeepSpeed](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
- - [DeepSpeed](https://github.com/microsoft/DeepSpeed))
- - [PyTorch](https://github.com/pytorch/pytorch) (pytorch-1.11 w/ CUDA-11.5)
- - [apex](https://github.com/NVIDIA/apex)
+ - **Orchestration:** [Megatron-DeepSpeed](https://github.com/bigscience-workshop/Megatron-DeepSpeed)
+ - **Optimizer & parallelism:** [DeepSpeed](https://github.com/microsoft/DeepSpeed)
+ - **Neural networks:** [PyTorch](https://github.com/pytorch/pytorch) (pytorch-1.11 w/ CUDA-11.5)
+ - **FP16 if applicable:** [apex](https://github.com/NVIDIA/apex)

  # Evaluation

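The finetuning layout and hardware figures listed in the README can be cross-checked with simple arithmetic. The snippet below is a minimal sketch and is not part of the commit: the variable and function names are hypothetical, the values are copied from the README lines, and it assumes one parallel rank per GPU, which the README does not state explicitly. The tokens-per-step figure is only approximate because "2.09 billion" is a rounded number.

```python
# Values copied from the README; all names here are illustrative assumptions.
PIPELINE_PARALLEL = 72
TENSOR_PARALLEL = 1
DATA_PARALLEL = 4
NODES = 36
GPUS_PER_NODE = 8
FINETUNING_STEPS = 498
FINETUNING_TOKENS = 2.09e9  # rounded figure from the README


def world_size(pipeline: int, tensor: int, data: int) -> int:
    """Ranks required by a 3D-parallel layout, assuming one rank per GPU."""
    return pipeline * tensor * data


# GPUs implied by the parallel layout vs. GPUs available in the cluster.
layout_gpus = world_size(PIPELINE_PARALLEL, TENSOR_PARALLEL, DATA_PARALLEL)
cluster_gpus = NODES * GPUS_PER_NODE

# Approximate number of tokens consumed per optimizer step.
tokens_per_step = FINETUNING_TOKENS / FINETUNING_STEPS

print(f"layout needs {layout_gpus} GPUs, cluster has {cluster_gpus}")
print(f"about {tokens_per_step / 1e6:.1f}M tokens per step")
```

Under these assumptions the layout (72 x 1 x 4) and the cluster (36 nodes x 8 GPUs) both come to 288 GPUs, and each step processes roughly 4.2 million tokens.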