high-quality Chinese training datasets
Collection
A suite of high-quality Chinese datasets for pretraining, fine-tuning, or reinforcement learning.
9 items
opencsg/csg-wukong-2b-smoltalk-chinese, using ultrafeedback-chinese-binarized as the DPO dataset.
Base model: opencsg/csg-wukong-2b-chinese-fineweb-edu