Amazing results with Raven 3B!! It speaks other languages, it knows the date.. How does this work?
The first time, I tried to run the RWKV and Raven LLMs on my iMac (i5, 16 GB RAM), but it was too slow and, more importantly, the output wasn't that coherent, even with the 7B model. Now I ran the 3B model on my MacBook with M1 and only 8 GB RAM, and:
- It runs amazingly fast, with almost no loading time!
- I noticed that the conversations make much more sense now. The answers are coherent and accurate, and the LLM doesn't hallucinate (at least so far; Alice now says "I don't know" or "I've never heard of it", etc.). The conversation in general feels quite natural.
- Just for fun, I tried talking to Alice in German, and she/the LLM answers in pretty good German. I am really surprised, since it is trained on 99% English material, and the LLM's German skills are better than Alpaca's. And I am talking about the 3B model (q4, fp16, it runs with 2 GB in RAM!!!). I can have long and consistent conversations with Alice in German. HOW does this work?
- I asked Alice what today's date is, and she/the LLM answered "8th of April 2023", which is the date the file was published. How does the LLM know this date? I assume the model was trained up to this date, but I thought it was trained on datasets that were older, of course.
- What is the reason for such different outputs on different computers? I used the same files and haven't changed anything. Does the output quality correlate with the speed/general performance?
I want to thank the developer of these LLMs for this great work! I'm excited to see what the 7B model will show on the M1 :D
Thank you!
Will be multiple times faster after optimization.
Yeah and I just released v11. I think we can reach v30 this year.
RWKV has better understanding of human language :)
The base model was trained on Pile v1 (2020), then finetuned on ChatGPT data, which is newer.
The same is true for all language models. You can reduce "top-p" for more consistent results, but the output will also be more boring (less variation).
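To make the top-p knob concrete, here is a minimal sketch of nucleus (top-p) sampling over a toy probability vector. Everything here (the function name, the example probabilities) is illustrative, not code from RWKV itself:

```python
import numpy as np

def top_p_sample(probs, top_p=0.85, rng=None):
    """Nucleus (top-p) sampling: keep only the smallest set of tokens
    whose cumulative probability reaches top_p, then sample from that set.
    Lower top_p -> more consistent but less varied output."""
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]           # token ids, most likely first
    cumulative = np.cumsum(probs[order])
    # first position where the cumulative probability reaches top_p
    cutoff = np.searchsorted(cumulative, top_p) + 1
    kept = order[:cutoff]
    kept_probs = probs[kept] / probs[kept].sum()  # renormalize the survivors
    return int(rng.choice(kept, p=kept_probs))

# With top_p=0.5 only the single most likely token survives the cutoff:
probs = np.array([0.6, 0.25, 0.1, 0.05])
print(top_p_sample(probs, top_p=0.5))  # always prints 0
```

This is why a low top-p feels "boring": the tail of the distribution, where the surprising word choices live, is cut off before sampling.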
How to load this model from huggingface? It does not seem to have a model card?
Use https://github.com/BlinkDL/ChatRWKV for now
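A minimal loading sketch using the `rwkv` pip package that ChatRWKV is built on. The weight and tokenizer file names below are placeholders for whatever you downloaded, and the strategy string should match your hardware (e.g. `'cpu fp32'` on a Mac without CUDA):

```python
# pip install rwkv  (the inference library behind ChatRWKV)
from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Placeholder paths: point these at your downloaded checkpoint and tokenizer.
model = RWKV(model='RWKV-4-Raven-3B.pth', strategy='cpu fp32')
pipeline = PIPELINE(model, '20B_tokenizer.json')

# Lower top_p for more consistent (but less varied) answers.
args = PIPELINE_ARGS(temperature=1.0, top_p=0.85)
print(pipeline.generate('Question: What is RWKV?\n\nAnswer:',
                        token_count=100, args=args))
```

See the ChatRWKV repository for the up-to-date recommended strategy strings and prompt formats.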
We are working on HF integration: https://github.com/huggingface/transformers/pull/22797
Do you have an M1 setup with q4/fp16, and good loading example code?