- Code release: https://github.com/microsoft/torchscale
- March 2022: released the preprint [DeepNet: Scaling Transformers to 1,000 Layers](https://arxiv.org/abs/2203.00555)
```
@article{deepnet,
author = {Hongyu Wang and Shuming Ma and Li Dong and Shaohan Huang and Dongdong Zhang and Furu Wei},
title = {{DeepNet}: Scaling {Transformers} to 1,000 Layers},
journal = {CoRR},
volume = {abs/2203.00555},
year = {2022},
}
```