QPHutu committed on
Commit
abf8fd0
1 Parent(s): 790f4eb

minor change

Files changed (1)
  1. description1.md +5 -5
description1.md CHANGED
@@ -7,10 +7,10 @@ From our findings, we need approximately 1/3 memory under ideal conditions (F, B
  Check out our paper at [Arxiv](https://arxiv.org/abs/2405.15362).
 
 
- | Comparison assuming T_F=T_B=T_W | 1F1B | V-Min | V-Half | V-ZB |
- | ----------------------------------------------------- |-------|------- | ---------- | ---- |
- | Bubble Rate | ~ p/m | ~ 2p/3m | ~ p/ 2m | 0 |
- | Activation Memory <br> (Compared to 1F1B) | p | (p+4)/3 | (p+2)/2 | p |
+ | Method | 1F1B | V-Min | V-Half | V-ZB |
+ |------------------------------------------|-------|----------|----------| ---- |
+ | Bubble Rate <br> (assuming T_F=T_B=T_W) | ~ p/m | ~ 2p/3m | ~ p/ 2m | 0 |
+ | Activation Memory <br> (by #micro-batch) | p | (p+4)//3 | (p+2)//2 | p |
 
 
- Bubble Rate here is calculated as `1 - (F+B+W)*m / longest_stage_time`.
+ Bubble Rate here is calculated as `1 - (F+B+W)*m / longest_stage_time`.
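
For reference, the formulas in the updated table can be evaluated with a small Python sketch. It is not part of the commit. The function names are hypothetical, `p` is assumed to be the number of pipeline stages and `m` the number of micro-batches, `2p/3m` and `p/ 2m` are read as `2p/(3m)` and `p/(2m)`, and `//` is read as integer (floor) division.

```python
def bubble_rate(t_f, t_b, t_w, m, longest_stage_time):
    """Bubble rate as stated in the description: 1 - (F+B+W)*m / longest_stage_time."""
    return 1 - (t_f + t_b + t_w) * m / longest_stage_time


def approx_bubble_rate(p, m):
    """Approximate bubble rates from the table, assuming T_F = T_B = T_W.

    Assumption: "2p/3m" means 2p/(3m) and "p/ 2m" means p/(2m).
    """
    return {
        "1F1B": p / m,
        "V-Min": 2 * p / (3 * m),
        "V-Half": p / (2 * m),
        "V-ZB": 0.0,
    }


def activation_memory_microbatches(p):
    """Activation memory from the table, counted in micro-batches.

    Assumption: "//" in the table is floor division, as in Python.
    """
    return {
        "1F1B": p,
        "V-Min": (p + 4) // 3,
        "V-Half": (p + 2) // 2,
        "V-ZB": p,
    }


if __name__ == "__main__":
    # Illustrative values only: p = 8 stages, m = 32 micro-batches.
    print(approx_bubble_rate(8, 32))
    print(activation_memory_microbatches(8))
    # A stage that takes exactly (F+B+W)*m has no bubble.
    print(bubble_rate(1.0, 1.0, 1.0, 32, 96.0))
```

The values passed in the `__main__` block are illustrative and do not come from the paper.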