---
language:
- zh
license: apache-2.0
inference: false
---
# Randeng-770M model (Chinese), one model of [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
Randeng-770M is a 770-million-parameter model with a standard Transformer structure, trained on 280 GB of Chinese data for 14 days on 16 A100 GPUs.
## Usage
The Randeng-770M architecture is not yet available in [Transformers](https://github.com/huggingface/transformers). To obtain it, clone the model code from [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM):
```shell
git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git
```
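The imports in the snippet below are resolved relative to the cloned repository. A minimal sketch of one way to make them importable, assuming the `model` package sits at the repository root and the clone lives in your working directory:

```python
import sys

# Assumption: the repository was cloned into the current working
# directory and exposes the `model` package at its root; adjust the
# path if you cloned it elsewhere.
sys.path.insert(0, "./Fengshenbang-LM")
```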
With the repository on your path, load the tokenizer, config, and encoder:
```python
from model.megatron_t5.modeling_megatron_t5 import T5EncoderModel
from model.megatron_t5.configuration_megatron_t5 import T5Config
from model.megatron_t5.tokenization_megatron_t5 import T5Tokenizer

# Download the tokenizer, config, and encoder weights from the Hugging Face Hub
tokenizer = T5Tokenizer.from_pretrained('IDEA-CCNL/Randeng-770M')
config = T5Config.from_pretrained('IDEA-CCNL/Randeng-770M')
model = T5EncoderModel.from_pretrained('IDEA-CCNL/Randeng-770M')
```
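Once loaded, the encoder can be used to embed Chinese text. The following is a minimal sketch, assuming this `T5EncoderModel` mirrors the standard Hugging Face encoder API (an `input_ids` argument and a `last_hidden_state` output); the sample sentence is only an illustration:

```python
import torch

# Hedged usage sketch: assumes the Fengshenbang-LM encoder follows the
# standard Hugging Face calling convention.
inputs = tokenizer("生活的真谛是什么", return_tensors="pt")
with torch.no_grad():
    outputs = model(input_ids=inputs["input_ids"])

# One hidden vector per token: (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```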
## Citation
If you find this resource useful, please cite the following website in your paper:
```
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```