metadata
license: apache-2.0
datasets:
  - Norquinal/claude_multiround_chat_30k
  - OpenLeecher/Teatime

We proudly announce the world's first 128k-context model based on the RWKV architecture, released today, 2023-08-10.

This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction. More training details will be added later.

The 128k-context model was fully fine-tuned using this repo, on 4×A800 GPUs for 40 hours with 1.3B tokens: https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
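For a rough sense of scale, the figures above can be turned into an average throughput. This is only a back-of-the-envelope sketch: it assumes "4*A800 40hours" means 4 GPUs running for 40 wall-clock hours and that all 1.3B tokens were processed in that time. The function name `tokens_per_gpu_second` is made up for illustration.

```python
def tokens_per_gpu_second(total_tokens: float, num_gpus: int, hours: float) -> float:
    """Average tokens processed per GPU per second (assumes even load across GPUs)."""
    gpu_seconds = num_gpus * hours * 3600  # total GPU time in seconds
    return total_tokens / gpu_seconds


# Figures quoted in this README: 1.3B tokens, 4 GPUs, 40 hours (interpretation assumed).
throughput = tokens_per_gpu_second(1.3e9, num_gpus=4, hours=40)
print(f"{throughput:.0f} tokens per GPU-second")
```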

QQ图片20230810153529.jpg

To test the model, use RWKV Runner: https://github.com/josStorer/RWKV-Runner. Use temperature 0.1-0.2 and top-p 0.7 for more precise answers; a temperature between 1 and 2.x gives more creative output.
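To illustrate why these settings behave so differently, here is a minimal sketch of temperature scaling followed by top-p (nucleus) sampling in plain Python. It is a generic illustration, not RWKV Runner's actual implementation, which may differ in details. The function name `sample_top_p` and the toy logits are assumptions made for this example.

```python
import math
import random


def sample_top_p(logits, temperature=0.2, top_p=0.7, rng=None):
    """Pick a token index using temperature scaling and top-p filtering."""
    rng = rng or random.Random()

    # Low temperature sharpens the distribution; high temperature flattens it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


toy_logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
precise = sample_top_p(toy_logits, temperature=0.1, top_p=0.7)
creative = sample_top_p(toy_logits, temperature=2.0, top_p=0.7)
```

With these toy logits, a temperature of 0.1 concentrates almost all probability on the top token, so only that token survives the top-p cut. At temperature 2.0 the distribution is flatter, so the two most likely tokens both remain candidates and the output varies between runs.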

image.png

微信截图_20230810142220.png

4UYBX4RA0%8PA{1YSSK)AVW.png

QQ图片20230810143840.png

image.png