GPU Poor POV: Building a RAG that solves a specific task.
Everyone loves benchmarks.
They are great because they give us a standardized approach and a sense of competition. But if you work in a specific domain and try to implement some LLM/RAG use case, these benchmarks don't really reflect the data you have to deal with.
I built a RAG system on a bunch of niche procedures, regulations, etc., which can finally be deployed as a virtual assistant to minimize the effort of searching through a lot of documentation manually.
I tested a lot of different methods, models, pretrains and finetunes, and what's interesting is that the final solution, as scored by human feedback, is based on relatively low-parameter models with multitask ability.
Something like:
BAAI/llm-embedder
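For the retrieval side, here is a minimal sketch of how an embedder like that can be wired up. The model name comes from the post; the pooling choice, the similarity scoring, and the example texts are my assumptions (the model card also recommends task-specific instruction prefixes, which I leave out here for brevity):

```python
# Minimal retrieval sketch with BAAI/llm-embedder.
# Assumptions: CLS pooling + cosine similarity over normalized vectors; check the
# model card for the recommended instruction prefixes before using this for real.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/llm-embedder")
model = AutoModel.from_pretrained("BAAI/llm-embedder")
model.eval()

def embed(texts):
    # Tokenize, run the encoder, take the [CLS] vector and L2-normalize it.
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    emb = out.last_hidden_state[:, 0]  # CLS pooling
    return torch.nn.functional.normalize(emb, dim=-1)

# Hypothetical knowledge-base chunks, just for illustration.
docs = ["Procedure 12: how to request access ...",
        "Regulation 7: retention periods for records ..."]
doc_emb = embed(docs)

query_emb = embed(["how long do we keep records?"])
scores = query_emb @ doc_emb.T           # cosine similarity (vectors are normalized)
top_chunk = docs[int(scores.argmax())]
```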
The LLM's job is to summarize the chunks retrieved from the knowledge base, and that doesn't require a model with a huge number of parameters, because a tradeoff between inference time and accuracy has to be made. Some lightweight models can follow instructions for a specific task surprisingly well, so e.g. Qwen 7B or Mistral 7B (not the MoE one) handled the task really nicely. And what matters more is that overall we are able to deploy RAG systems for smaller tasks in a specific domain. They get used by the people who need them, they add value and earn positive feedback, which IMO is what the whole building process is about.
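As a sketch of that summarization step with a lightweight instruct model: the model id, prompt wording, and generation settings below are illustrative (swap in Qwen 7B or whatever runs on your hardware), not the exact setup from this post.

```python
# Sketch: let a 7B instruct model answer from retrieved chunks.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto",
                                             device_map="auto")

def answer(question, chunks):
    # Keep the prompt small: only the retrieved chunks plus the user question.
    context = "\n\n".join(chunks)
    messages = [{
        "role": "user",
        "content": f"Answer using only the context below.\n\n"
                   f"Context:\n{context}\n\nQuestion: {question}",
    }]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                           return_tensors="pt").to(model.device)
    output = model.generate(inputs, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Hypothetical retrieved chunk, e.g. the top hit from the retrieval sketch above.
print(answer("How long do we keep records?",
             ["Regulation 7: retention periods for records ..."]))
```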
Have a great day and think about the problem your models have to solve <3