Performance with Real-World Data

#5
by Panoplos - opened

Hello,

We were excited to find your model, but in our tests on real-world data it performs considerably worse than OpenAI's standard embeddings (text-embedding-3-small) for knowledge base retrieval: about 74% of queries passed (112 of 152) versus about 91% (138 of 152). The full results are below.

========================================[top_k: 3]
OpenAI Results: Passed: 138 | Failed: 14
--------------------------------------------------
VietSBERT Results: Passed: 112 | Failed: 40
==================================================

The test uses cosine similarity to match user questions against a knowledge base. Is this the recommended way to use the model?
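
For context, here is a minimal sketch of this kind of top-k cosine similarity check, not the exact script behind the numbers above. It assumes the question and knowledge base embeddings are already computed as NumPy arrays, and that a question "passes" when its expected knowledge base entry appears among the top k matches. The function names `top_k_cosine` and `count_passed` are hypothetical, and the pass criterion is an assumption.

```python
import numpy as np

def top_k_cosine(query_vecs, kb_vecs, k=3):
    """Return, for each query, the indices of the k most similar KB entries."""
    # L2-normalize rows so that the dot product equals cosine similarity.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = kb_vecs / np.linalg.norm(kb_vecs, axis=1, keepdims=True)
    sims = q @ d.T  # shape: (num_queries, num_kb_entries)
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-sims, axis=1)[:, :k]

def count_passed(top_k_idx, expected_idx):
    """Count queries whose expected KB index is among the top-k results.

    Assumed pass criterion: one expected KB entry per query.
    """
    expected = np.asarray(expected_idx)[:, None]
    hits = (top_k_idx == expected).any(axis=1)
    passed = int(hits.sum())
    return passed, len(hits) - passed
```

Running both models through the same function with the same `k` and the same expected indices keeps the comparison like for like. If the model was trained with a particular pooling or text preprocessing step, the embeddings fed into this check should be produced the same way.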
