wxgeorge committed
Commit cd922ef
1 Parent(s): 9d7c1ad

correct link for featherless inference + one typo

Files changed (1): README.md +3 -3
README.md CHANGED
@@ -37,7 +37,7 @@ This approach demonstrates the architecture design and scalability of RWKV, rein
 
 One downside to this technique is that the model's inherent knowledge and dataset training are inherited from its "parent" model. Consequently, unlike previous RWKV models trained on over 100+ languages, the QRWKV model is limited to approximately 30 languages supported by the Qwen line of models.
 
-Due to the the lack of RWKV-based channel mix and feedforward layers, seperate inference code is needed for this specific model.
+Due to the the lack of RWKV-based channel mix and feedforward layers, separate inference code is needed for this specific model.
 
 Furthermore, due to compute constraints, we were only able to train up to 16K token context length. While the model is stable beyond this limit, additional training might be required to support longer context lengths.
 
@@ -53,8 +53,8 @@ Lastly, we intend to provide details on the conversion along with our paper afte
 ## Links
 - [Our wiki](https://wiki.rwkv.com)
 - [TensorWave - The AMD Cloud](https://tensorwave.com) - Access MI300X today!
-- [Recursal.AI Cloud Platform](https://recursal.ai)
-- [Featherless Inference](https://featherless.ai/models/RWKV/)
+- [Recursal.AI Cloud Platform](https://platform.recursal.ai)
+- [Featherless Inference](https://featherless.ai/model-families/rwkv6/)
 
 ## Acknowledgement
 We are grateful for the help and support from the following key groups: