---
license: apache-2.0
datasets:
- Norquinal/claude_multiround_chat_30k
- OpenLeecher/Teatime
---
We proudly announce the world's first 128k-context model based on the RWKV architecture, released today, 2023-08-10.
With the RWKV World tokenizer, multilingual text has a 1:1 tokenization ratio: one word maps to one token.
(https://github.com/BlinkDL/ChatRWKV/blob/2a13ddecd81f8fd615b6da3a8f1091a594689e30/tokenizer/rwkv_tokenizer.py#L163)
This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction;
more training details will be added later.
We tested summarization with an 85k-token input. The conversation files are in the example folders, and more cases are coming.
The model was fully fine-tuned for 128k context with the repo below, taking 40 hours on 4×A800 GPUs with 1.3B tokens.
https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh
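
As a rough sanity check on the training figures above, the sketch below derives GPU-hours and average throughput. It assumes the 40 hours are wall-clock time across all four GPUs and that 1.3B is the total number of tokens processed; the function name `training_throughput` is hypothetical and used only for illustration.

```python
def training_throughput(total_tokens, gpus, hours):
    """Derive average throughput from the reported training figures."""
    seconds = hours * 3600
    overall = total_tokens / seconds  # tokens per second across all GPUs
    return {
        "gpu_hours": gpus * hours,
        "tokens_per_second": overall,
        "tokens_per_second_per_gpu": overall / gpus,
    }


# Figures reported above: 1.3B tokens, 4 GPUs, 40 hours.
stats = training_throughput(1.3e9, gpus=4, hours=40)
for name, value in stats.items():
    print(f"{name}: {value:,.0f}")
```

These are averages only; they say nothing about data loading, checkpointing, or other overhead during the run.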
![QQ图片20230810153529.jpg](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/d8ekmc4Lfhy2lYEdrRKXz.jpeg)
To test this model, use RWKV Runner (https://github.com/josStorer/RWKV-Runner). Set temperature to 0.1-0.2 and top_p to 0.7 for more precise answers; a temperature between 1 and 2.x gives more creative output.
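
To illustrate why these settings change the output, here is a minimal sketch of temperature scaling followed by top-p (nucleus) sampling in plain Python. It is a generic illustration, not the code RWKV Runner uses; the function name `sample_token` and the toy logits are assumptions made for this example.

```python
import math
import random


def sample_token(logits, temperature=0.2, top_p=0.7, rng=random):
    """Pick a token index using temperature scaling and top-p filtering."""
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the most likely tokens until their cumulative
    # probability reaches top_p, and discard the rest.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


# Toy logits for a 3-token vocabulary (hypothetical values).
print(sample_token([5.0, 1.0, 0.0], temperature=0.1, top_p=0.7))
```

With a low temperature the top-ranked token usually carries enough probability mass on its own to satisfy top_p, so the output is close to deterministic. With a temperature above 1 more tokens survive the filter, which is where the more varied, creative output comes from.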
![微信截图_20230810162303.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/Ww45-WMngl4Jyt1OZDAa_.png)
![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5zDQVbGb-fX8Y8h98tUF0.png)
![微信截图_20230810142220.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/u2wA-l1UcW-Mt9KIoa_4q.png)
![4UYBX4RA0%8PA{1YSSK)AVW.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/gzr8Yt4JRkBz31-ieRSOE.png)
![QQ图片20230810143840.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/LgEjfHJ7XD7PlGM9b3RAf.png)
![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/b_6KCBdZKW7Q7HwipxE-l.png)
85k input test
![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/F9unOJfhmJPXsciPHLsrl.png)
![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/35j5C1QD_4cO-AjfxV7tl.png)