AdamG012 committed
Commit cac6c47
1 Parent(s): b273968

Update README.md

Files changed (1):
  1. README.md +0 -20
README.md CHANGED
@@ -70,26 +70,6 @@ To view the details behind each step head into their respective links and view t
  | Prescale gradients | True |
 
 
-
- ## Why did we choose DeepSpeed?
-
- **DeepSpeed Training:**
-
- The `main.py` script takes the DeepSpeed config via the argument `--deepspeed_config ./ds_config.json`.
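-
- A minimal sketch of how `main.py` might wire up this argument, using `deepspeed.add_config_arguments`, DeepSpeed's helper that registers `--deepspeed_config` among other flags (the parser description is an assumption, not code from this repo):
-
- ```python
- import argparse
-
- import deepspeed
-
- parser = argparse.ArgumentParser(description="Training arguments")  # assumed description
- # Registers DeepSpeed's standard CLI flags, including --deepspeed_config
- parser = deepspeed.add_config_arguments(parser)
- args = parser.parse_args()
- ```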
-
- We read the DeepSpeed documentation and created a specific configuration based on their work. The `ds_config.json` file here is set to use the [ZeRO-2](https://www.microsoft.com/en-us/research/blog/ZeRO-2-deepspeed-shattering-barriers-of-deep-learning-speed-scale/) stage and FP16, allowing much faster training and GPU memory savings. Note that ZeRO-2 is just one of the options DeepSpeed offers; you may also use ZeRO-1, ZeRO-3, ZeRO-Offload, and ZeRO-Infinity. For more information on the DeepSpeed ZeRO family, please see this [tutorial](https://www.deepspeed.ai/tutorials/zero/) for ZeRO-1/2/3 and this [tutorial](https://www.deepspeed.ai/tutorials/zero-offload/) for ZeRO-Offload.
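-
- For illustration, a minimal ZeRO-2 + FP16 configuration might look like the sketch below, written here as a Python dict (the values are assumptions, not this repository's actual `ds_config.json`):
-
- ```python
- # Hypothetical DeepSpeed config mirroring ds_config.json (illustrative values only)
- ds_config = {
-     "train_batch_size": 32,      # assumed global batch size
-     "fp16": {
-         "enabled": True          # mixed-precision training for speed and memory savings
-     },
-     "zero_optimization": {
-         "stage": 2               # ZeRO-2: partitions optimizer states and gradients
-     }
- }
- ```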
-
- To enable DeepSpeed ZeRO family training, we injected a few lines of code, e.g.:
-
- ```python
- model, optimizer, _, lr_scheduler = deepspeed.initialize(model=model,
-                                                          optimizer=optimizer,
-                                                          args=args,
-                                                          lr_scheduler=lr_scheduler,
-                                                          dist_init_required=True)
- ```
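-
- After this call, the returned engine stands in for the usual PyTorch objects in the training loop. A minimal sketch of its use (`batch` and `loss_fn` are placeholders, not names from this repo):
-
- ```python
- loss = loss_fn(model(batch))  # forward pass through the DeepSpeed engine
- model.backward(loss)          # engine-managed backward; handles FP16 loss scaling
- model.step()                  # optimizer step; also steps the attached lr_scheduler
- ```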
-
-
  ## **Acknowledgements**
 
  We thank the following papers and open-source repositories. We especially thank DeepSpeed for their frameworks as well.
 