---
library_name: transformers
tags: []
---

# Model Card for Chinese LLaMA2

This is a Chinese LLaMA2 model, built upon the original LLaMA2 through continued pre-training on a 12B-token corpus.
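
For reference, below is a minimal generation sketch with 🤗 Transformers, assuming the model is published as a standard LLaMA2-style causal LM. The repository id `your-org/chinese-llama2` is a placeholder, since this card does not state the actual model id.

```python
# Minimal usage sketch, assuming a standard LLaMA2-style causal LM checkpoint.
# The repo id below is a placeholder -- replace it with this model's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/chinese-llama2"  # placeholder, not the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory
    device_map="auto",
)

prompt = "请简要介绍一下北京的历史。"  # "Briefly introduce the history of Beijing."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```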

### Continued Pre-training Data:

- WuDaoCorpora
- BaiduBaike
- Baidu News
- Tiger Dataset
- C4 samples

### SFT Data:

- Chinese SFT Data:
  - Alpaca-GPT4-zh
  - InstinWild-ch
  - Psychology-instruction
  - Firefly dataset
- English SFT Data:
  - Code-Alpaca
  - InstinWild-en
  - School-math dataset
  - Ultrachat dataset
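
Several of the SFT sets above (Alpaca-GPT4-zh, Code-Alpaca, InstinWild) are commonly distributed in the Alpaca instruction format. The sketch below shows that template purely as an assumption; this card does not document the exact prompt template used for fine-tuning.

```python
# Hedged sketch of the Alpaca-style prompt template. Whether this model was
# fine-tuned with exactly this template is an assumption -- verify against
# the actual training code before relying on it at inference time.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a raw user instruction in the assumed Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(build_prompt("用中文写一首关于秋天的诗。"))  # "Write a Chinese poem about autumn."
```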

# Results

Benchmark accuracy:

- C-Eval: 34.47%
- CMMLU: 34.58%
- MMLU: 41.23%
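
These numbers can be re-checked with EleutherAI's lm-evaluation-harness; the sketch below is an assumption about the setup (task names, 5-shot, placeholder repo id), since the card does not document how the scores were obtained.

```python
# Sketch of re-running the three benchmarks with lm-evaluation-harness (v0.4+).
# Task names and the 5-shot setting are assumptions; the repo id is a placeholder.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/chinese-llama2,dtype=float16",
    tasks=["ceval-valid", "cmmlu", "mmlu"],
    num_fewshot=5,  # assumption: common setting for these benchmarks
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```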