---
pipeline_tag: sentence-similarity
tags:
- feature-extraction
- sentence-similarity
license: mit
language:
- fr
- en
---

# Solon Embeddings — base 0.1

A state-of-the-art open-source French embedding model.

| Model | Mean Score |
| --- | --- |
| cohere/embed-multilingual-v3 | 0.7402 |
| **OrdalieTech/Solon-embeddings-base-0.1** | 0.7306 |
| openai/ada-002 | 0.7290 |
| cohere/embed-multilingual-light-v3 | 0.6945 |
| antoinelouis/biencoder-camembert-base-mmarcoFR | 0.6826 |
| dangvantuan/sentence-camembert-large | 0.6756 |
| voyage/voyage-01 | 0.6753 |
| intfloat/multilingual-e5-large | 0.6660 |
| intfloat/multilingual-e5-base | 0.6597 |
| Sbert/paraphrase-multilingual-mpnet-base-v2 | 0.5975 |
| dangvantuan/sentence-camembert-base | 0.5456 |
| EuropeanParliament/eubert_embedding_v1 | 0.5063 |

These results were obtained on 9 benchmarks covering a variety of text similarity tasks (classification, reranking, STS):
- AmazonReviewsClassification
- MassiveIntentClassification
- MassiveScenarioClassification
- MTOPDomainClassification
- MTOPIntentClassification
- STS22
- MiraclFRRerank
- OrdalieFRSTS
- OrdalieFRReranking

(The evaluation script is currently available at github.com/netapy/mteb)

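
The reranking and STS benchmarks listed above compare texts through their embedding vectors. As a rough illustration only, and assuming cosine similarity as the comparison measure (this card does not state which measure the evaluation uses), the sketch below ranks candidate vectors against a query vector. The vectors are made-up toy values rather than outputs of this model, and `cosine_similarity` and `rank_candidates` are hypothetical helper names introduced for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(candidate_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_candidates(query, candidates)
for index, score in ranking:
    print(f"candidate {index}: {score:.3f}")
```

In real use, the toy lists would be replaced by embeddings of a query and of candidate passages. The ranking logic stays the same whatever the vector dimension.
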

--------

(Large version coming soon)