---
license: apache-2.0
datasets:
- Norquinal/claude_multiround_chat_30k
- OpenLeecher/Teatime
---

# RWKV 7B World 128k for novel writing

We proudly announce the world's first **128k context** model based on the RWKV architecture, released 2023-08-10.

With the [RWKV World tokenizer](https://github.com/BlinkDL/ChatRWKV/blob/2a13ddecd81f8fd615b6da3a8f1091a594689e30/tokenizer/rwkv_tokenizer.py#L163), multiple languages have a roughly 1:1 tokenization ratio: one word maps to one token.
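A tokens-per-word ratio like the one claimed above can be measured with a few lines of Python. This is a minimal sketch; `encode` is a hypothetical stand-in for the real RWKV World tokenizer's encode method (linked above), not its actual API:

```python
def tokens_per_word(encode, text):
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    n_tokens = len(encode(text))
    return n_tokens / max(len(words), 1)

# Toy encoder emitting exactly one token per word, to illustrate
# what a 1:1 tokenization ratio looks like:
one_to_one = lambda s: s.split()
print(tokens_per_word(one_to_one, "one word one token"))  # 1.0
```

Swapping in a real tokenizer's encode function gives the actual ratio for any sample text.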


# How to train an infinite context model?

This model was trained on instruction datasets, Chinese web novels, and traditional wuxia fiction; more training details will be published later.

It has been tested summarizing 85k tokens into 5 key points; the conversation files are in the example folders, and more cases are coming.

The model was fully fine-tuned for 128k context using this repo, taking 40 hours on 4×A800 GPUs over 1.3B tokens:
https://github.com/SynthiaDL/TrainChatGalRWKV/blob/main/train_world.sh

![QQ图片20230810153529.jpg](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/d8ekmc4Lfhy2lYEdrRKXz.jpeg)

# How to Test?

Use [RWKV Runner](https://github.com/josStorer/RWKV-Runner) to test this model. It needs only 16 GB VRAM to run in fp16, or 8 GB VRAM in fp16i8. Use temperature 0.1-0.2 with top-p 0.7 for more precise answers; temperature between 1 and 2.x is more creative.
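To see why these two settings trade precision for creativity, here is a minimal, self-contained sketch of temperature plus top-p (nucleus) sampling. It is an illustration of the general technique, not RWKV Runner's actual implementation:

```python
import math
import random

def sample(logits, temperature=0.2, top_p=0.7):
    """Pick a token index from `logits` using temperature + top-p sampling."""
    # Low temperature sharpens the distribution; high temperature flattens it.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of tokens whose cumulative probability >= top_p.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Sample within the kept set, proportionally to the kept probabilities.
    mass = sum(probs[i] for i in kept)
    r = random.random() * mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With temperature 0.1 and a clearly dominant logit, the nucleus collapses to a single token and the output is effectively deterministic; raising the temperature toward 1-2 spreads probability over more tokens, which is what makes the output more creative.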
![微信截图_20230810162303.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/Ww45-WMngl4Jyt1OZDAa_.png)


![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5zDQVbGb-fX8Y8h98tUF0.png)

![微信截图_20230810142220.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/u2wA-l1UcW-Mt9KIoa_4q.png)

![4UYBX4RA0%8PA{1YSSK)AVW.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/gzr8Yt4JRkBz31-ieRSOE.png)

![QQ图片20230810143840.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/LgEjfHJ7XD7PlGM9b3RAf.png)

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/b_6KCBdZKW7Q7HwipxE-l.png)

85k-token summarization test
![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/F9unOJfhmJPXsciPHLsrl.png)

![image.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/35j5C1QD_4cO-AjfxV7tl.png)

![微信截图_20230810201844.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/5dVQrJxg05C0ww7_AVhrW.png)

![83be699bab815e4396254eb5e855090.png](https://cdn-uploads.huggingface.co/production/uploads/6176b32847ee6431f632981e/yde7ZcsxecbrQUIFvDtUs.png)