Pipeline Parallelism with Controllable Memory
Check out our paper on arXiv.
Bubble Rate here is calculated as (1 - longest stage time/(F+B+W)/m).
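
As a rough illustration, the expression above can be evaluated exactly as written. This sketch assumes the divisions apply left to right, so the longest stage time is divided by (F+B+W) and then by m. It also assumes that F, B, and W are per-microbatch times and that m is the number of microbatches, which is not spelled out here. The function name `bubble_rate` and the sample timings are hypothetical, and the sign convention is taken as written.

```python
def bubble_rate(longest_stage_time: float, f: float, b: float, w: float, m: int) -> float:
    """Evaluate 1 - longest_stage_time / (F + B + W) / m, reading left to right."""
    if m <= 0 or f + b + w <= 0:
        raise ValueError("m and F + B + W must be positive")
    # Left-to-right division is equivalent to dividing by (F + B + W) * m.
    return 1 - longest_stage_time / (f + b + w) / m


# Hypothetical timings (assumption): F = B = W = 1 time unit, m = 4 microbatches.
rate = bubble_rate(longest_stage_time=15.0, f=1.0, b=1.0, w=1.0, m=4)
print(f"{rate:.2%}")
```

Read this way, the value is zero when the longest stage takes exactly m(F+B+W) time units and moves away from zero as idle time is added to that stage.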