keepitreal/vietnamese-sbert · Performance with Real World Data

Hello,

We were excited to find your model, but after testing it against real-world data we found that it performs noticeably worse than OpenAI's standard embeddings (text-embedding-3-small) for knowledge base retrieval: it passed 112 of 152 test questions, versus 138 of 152 for OpenAI. The results are below.

========================================[top_k: 3]
OpenAI Results: Passed: 138 | Failed: 14
--------------------------------------------------
VietSBERT Results: Passed: 112 | Failed: 40
==================================================

The test uses cosine similarity to match user questions against a knowledge base. Is this the recommended way to use the model?
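
To make the setup concrete, here is a minimal sketch of the kind of top-k cosine-similarity check described above. It assumes the embeddings have already been computed by whichever model is under test. The function names (`cosine_similarity`, `top_k_indices`, `evaluate`) and the toy vectors are hypothetical illustrations, not our actual harness or data. A question counts as "passed" here when its expected knowledge base entry appears among the k most similar entries.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k_indices(query_vec, kb_vecs, k=3):
    """Indices of the k knowledge base vectors most similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in kb_vecs]
    ranked = sorted(range(len(kb_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def evaluate(query_vecs, expected_ids, kb_vecs, k=3):
    """Return (passed, failed) counts for a set of labelled questions."""
    passed = sum(
        1
        for q, expected in zip(query_vecs, expected_ids)
        if expected in top_k_indices(q, kb_vecs, k)
    )
    return passed, len(query_vecs) - passed


# Toy 2-D vectors standing in for real embeddings (hypothetical data).
kb_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
query_vecs = [[0.9, 0.1], [-0.2, -1.0]]
expected_ids = [0, 2]

passed, failed = evaluate(query_vecs, expected_ids, kb_vecs, k=2)
print(f"Passed: {passed} | Failed: {failed}")
```

In a real run the vectors would come from each embedding model, with the same questions, knowledge base, and k used for both so the counts are comparable.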