GPU Poor POV: Building a RAG that solves a specific task.
Everyone loves benchmarks.
They are great because they give us a standardized approach and a sense of competition. But if you work in a specific domain and try to implement some LLM/RAG use case, these benchmarks don't really reflect the data you have to deal with.
I built a RAG system on a bunch of niche procedures, regulations, etc., which can finally be deployed as a virtual assistant to minimize the effort of searching through a lot of documentation manually.
I tested a lot of different methods, models, pretrains and finetunes, and what's interesting is that the final solution, as scored by human feedback, is based on relatively low-parameter models with multitask ability.
Something like:
BAAI/llm-embedder
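For the retrieval side, here is a minimal sketch of how an embedder like that can be wired up. The model name comes from the post; the pooling choice, the similarity scoring, and the example texts are my assumptions (the model card also recommends task-specific instruction prefixes, which I leave out here for brevity):

```python
# Minimal retrieval sketch with BAAI/llm-embedder.
# Assumptions: CLS pooling + cosine similarity over normalized vectors; check the
# model card for the recommended instruction prefixes before using this for real.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/llm-embedder")
model = AutoModel.from_pretrained("BAAI/llm-embedder")
model.eval()

def embed(texts):
    # Tokenize, run the encoder, take the [CLS] vector and L2-normalize it.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    emb = out.last_hidden_state[:, 0]  # CLS pooling
    return torch.nn.functional.normalize(emb, dim=-1)

# Hypothetical knowledge-base chunks, just for illustration.
docs = ["Procedure 12: how to request access ...",
        "Regulation 7: retention periods for records ..."]
doc_emb = embed(docs)

query_emb = embed(["how long do we keep records?"])
scores = query_emb @ doc_emb.T           # cosine similarity (vectors are normalized)
top_chunk = docs[int(scores.argmax())]
```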
The LLM's job is to summarize the chunks retrieved from the knowledge base, and that doesn't require a model with a huge number of parameters, because a tradeoff between inference time and accuracy has to be made. Some lightweight models can follow instructions for a specific task surprisingly well, so e.g. Qwen 7B or Mistral 7B (not the MoE one) handled the task really nicely. And what matters more is that overall we are able to deploy RAG systems for smaller tasks in a specific domain. They get used by the people who need them, they add value and earn positive feedback, which IMO is what the whole building process is about.
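As a sketch of that summarization step with a lightweight instruct model: the model id, prompt wording, and generation settings below are illustrative (swap in Qwen 7B or whatever runs on your hardware), not the exact setup from this post.

```python
# Sketch: let a 7B instruct model answer from retrieved chunks.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto",
                                             device_map="auto")

def answer(question, chunks):
    # Keep the prompt small: only the retrieved chunks plus the user question.
    context = "\n\n".join(chunks)
    messages = [{
        "role": "user",
        "content": f"Answer using only the context below.\n\n"
                   f"Context:\n{context}\n\nQuestion: {question}",
    }]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                           return_tensors="pt").to(model.device)
    output = model.generate(inputs, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Hypothetical retrieved chunk, e.g. the top hit from the retrieval sketch above.
print(answer("How long do we keep records?",
             ["Regulation 7: retention periods for records ..."]))
```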
Have a great day and think about the problem your models have to solve <3