Text Generation
PyTorch
causal-lm
rwkv
BlinkDL commited on
Commit
5f88540
1 Parent(s): b4d5e2a

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -1
README.md CHANGED
@@ -37,7 +37,9 @@ How to use:
37
  The differences between World & Raven:
38
  * set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of 20B_tokenizer.json (EXACTLY AS WRITTEN HERE. "rwkv_vocab_v20230424" is included in rwkv 0.7.4+)
39
  * use Question/Answer or User/AI or Human/Bot for chat. **DO NOT USE Bob/Alice or Q/A**
40
- * use **fp32** (will overflow in fp16 at this moment - fixable in future) or bf16 (slight degradation)
 
 
41
 
42
  NOTE: the new greedy tokenizer (https://github.com/BlinkDL/ChatRWKV/blob/main/tokenizer/rwkv_tokenizer.py) will tokenize '\n\n' as one single token instead of ['\n','\n']
43
 
 
37
  The differences between World & Raven:
38
  * set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of 20B_tokenizer.json (EXACTLY AS WRITTEN HERE. "rwkv_vocab_v20230424" is included in rwkv 0.7.4+)
39
  * use Question/Answer or User/AI or Human/Bot for chat. **DO NOT USE Bob/Alice or Q/A**
40
+
41
+ For small models, use **fp32** for first layer (will overflow in fp16 at this moment - fixable in future), or bf16 if you have 30xx/40xx GPUs.
42
+ Example strategy: cuda fp32 *1 -> cuda fp16
43
 
44
  NOTE: the new greedy tokenizer (https://github.com/BlinkDL/ChatRWKV/blob/main/tokenizer/rwkv_tokenizer.py) will tokenize '\n\n' as one single token instead of ['\n','\n']
45