arbml
/

rwkv-4-world-7b-arabic

Text Generation

Inference Endpoints

Model card Files Files and versions Community

rwkv-4-world-7b-arabic / README.md

khalidalt's picture

Update README.md

69b42b6 about 1 year ago

|

1.62 kB

	---
	language:
	- en
	- zh
	- de
	- fr
	- es
	- pt
	- ru
	- it
	- ja
	- ko
	- vi
	- ar
	tags:
	- pytorch
	- text-generation
	- causal-lm
	- rwkv
	license: apache-2.0
	datasets:
	- khalidalt/Joud
	---

	# RWKV-4-World-7b-Arabic

	## Model Description


	RWKV-4-World-7b-Arabic is a pretrinaed version of RWKV-4-world that finetuned on Arabic datasets.

	RWKV-4

	How to use:
	* use https://github.com/josStorer/RWKV-Runner for GUI
	* use latest rwkv pip package (0.8.0+)
	* use https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark_world.py and https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_WORLD.py to test it

	The differences between World & Raven:
	* set pipeline = PIPELINE(model, "rwkv_vocab_v20230424") instead of 20B_tokenizer.json (EXACTLY AS WRITTEN HERE. "rwkv_vocab_v20230424" is included in rwkv 0.7.4+)
	* use Question/Answer or User/AI or Human/Bot for chat. DO NOT USE Bob/Alice or Q/A

	For 0.1/0.4/1.5B models, use fp32 for first layer (will overflow in fp16 at this moment - fixable in future), or bf16 if you have 30xx/40xx GPUs. Example strategy: cuda fp32 *1 -> cuda fp16

	NOTE: the new greedy tokenizer (https://github.com/BlinkDL/ChatRWKV/blob/main/tokenizer/rwkv_tokenizer.py) will tokenize '\n\n' as one single token instead of ['\n','\n']

	QA prompt (replace \n\n in xxx to \n):
	```
	Question: xxx

	Answer:
	```
	and
	```
	Instruction: xxx

	Input: xxx

	Response:
	```

	A good chat prompt (replace \n\n in xxx to \n):
	```
	User: hi

	Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

	User: xxx

	Assistant:
	```