RetrievaEmbedding-01: AMBER
AMBER (Adaptive Multitask Bilingual Embedding Representations) is a text embedding model trained by Retrieva, Inc. It is primarily designed for Japanese but also supports English, and was trained on a variety of Japanese and English datasets.
The model has 315M parameters (large size).
Model Details
Model Description
The AMBER model is a text embedding model based on the sbintuitions/modernbert-ja-310m architecture, designed primarily for Japanese text. It was trained on a variety of Japanese datasets as well as English datasets, so it can also be used for English text. Training included natural-language prompts (instructions), allowing the model to generate embeddings tailored to specific tasks.
- Developed by: Retrieva, Inc.
- Model type: Based on the ModernBERT Architecture.
- Language(s) (NLP): Primarily Japanese, with support for English.
- License: Apache 2.0
- Finetuned from model: sbintuitions/modernbert-ja-310m
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity (see the sketch below)
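Since the similarity function is cosine similarity, the scores returned by `model.similarity` are equivalent to a dot product of L2-normalized embeddings. Below is a minimal sketch of that computation; the standalone helper is illustrative, not the library's implementation:

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of `a` and rows of `b`."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T  # shape: (len(a), len(b)), values in [-1, 1]
```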
Uses
How to Get Started with the Model
Install Library
First, install the required Python libraries using pip:
```bash
pip install sentence-transformers sentencepiece
```
Run Inference
Then you can load this model and run inference. You can specify the prompt at inference time by passing the `prompt` (or `prompt_name`) argument to `model.encode`.
The prompts used in the Japanese benchmark are described in jmteb/tasks, and the prompts used in the English benchmark are described in mteb/models/retrieva_en.py.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("retrieva-jp/amber-large")

# Run inference
queries = [
    "自然言語処理とはなんですか?",  # "What is natural language processing?"
    "株式会社レトリバについて教えて",  # "Tell me about Retrieva, Inc."
]
# Japanese passages: a definition of NLP and a description of Retrieva, Inc.
documents = [
    "自然言語処理(しぜんげんごしょり、英語: Natural language processing、略称:NLP)は、人間が日常的に使っている自然言語をコンピュータに処理させる一連の技術であり、人工知能と言語学の一分野である。",
    "株式会社レトリバは、自然言語処理と機械学習を核としたAI技術で組織の課題解決を支援するテクノロジー企業である。",
]
queries_embeddings = model.encode(queries, prompt_name="Retrieval-query")
documents_embeddings = model.encode(documents, prompt_name="Retrieval-passage")
similarities = model.similarity(queries_embeddings, documents_embeddings)
print(similarities.shape)  # torch.Size([2, 2])
```
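For retrieval, you can rank documents for each query directly from the similarity matrix. The following sketch continues the example above (`similarities` is a queries × documents tensor, here 2×2); the loop and variable names are illustrative:

```python
import torch

# Rank documents for each query by descending cosine similarity.
ranking = torch.argsort(similarities, dim=1, descending=True)
for qi, query in enumerate(queries):
    top_doc = documents[ranking[qi][0].item()]
    print(f"{query} -> {top_doc[:30]}...")
```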
Training Details
Training Data
We used multiple datasets to train this model. For Japanese, we selected datasets from llm-jp-eval, llm-japanese-dataset, and hpprc/emb. For English, we mainly used some of the datasets utilized in Asai et al. (2023), and additionally used parts of the English datasets from the sentence-transformers repository and kilt-tasks. To account for cross-lingual transfer between Japanese and English, we also used Japanese-English translation datasets.
For Japanese, we additionally used synthetic data created by an LLM to ensure a sufficient amount of training data.
Evaluation
We evaluated the model on the following benchmarks:
- Japanese Benchmark: JMTEB
- Japanese Retrieval Tasks: JQaRA, JaCWIR, MLDR Japanese Subset
- English Benchmark: MTEB(eng, v2)
Unless otherwise noted, all scores in the tables below were calculated by us.
Japanese Benchmark: JMTEB
Note that the Mean (TaskType) column in the following leaderboard is the same as the Avg. column in the original JMTEB leaderboard.
The files used for evaluation are stored in the jmteb directory.
| Model | # Parameters | Mean (TaskType) | Mean (Task) | Retrieval | STS | Classification | Reranking | Clustering | PairClassification |
|---|---|---|---|---|---|---|---|---|---|
| **base models (< 300M)** | | | | | | | | | |
| cl-nagoya/ruri-base | 111M | 72.60 | 71.56 | 69.53 | 82.87 | 75.49 | 92.91 | 52.40 | 62.38 |
| AMBER-base | 130M | 72.12 | 72.12 | 73.40 | 77.81 | 76.14 | 93.27 | 48.05 | 64.03 |
| pkshatech/GLuCoSE-base-ja-v2 | 133M | 72.89 | 72.47 | 73.03 | 82.96 | 74.02 | 93.01 | 51.96 | 62.37 |
| pkshatech/RoSEtta-base-ja | 190M | 72.49 | 72.05 | 73.14 | 81.39 | 72.37 | 92.69 | 53.60 | 61.74 |
| intfloat/multilingual-e5-base | 278M | 71.11 | 69.72 | 69.45 | 80.45 | 69.86 | 92.90 | 51.62 | 62.35 |
| **large models (> 300M)** | | | | | | | | | |
| **AMBER-large (this model)** | 315M | 72.52 | 73.22 | 75.40 | 79.32 | 77.14 | 93.54 | 48.73 | 60.97 |
| cl-nagoya/ruri-large | 337M | 73.20 | 73.06 | 72.86 | 83.14 | 77.15 | 93.00 | 50.78 | 62.29 |
| intfloat/multilingual-e5-large | 560M | 72.06 | 71.29 | 71.71 | 80.87 | 72.45 | 93.29 | 51.59 | 62.42 |
Japanese Retrieval Tasks: JQaRA, JaCWIR, MLDR Japanese Subset
The files used for MLDR are stored in the mldr directory.
The prompts used in JQaRA and JaCWIR are Retrieval-query and Retrieval-passage, as described in config_sentence_transformers.json.
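In sentence-transformers, the prompts defined in config_sentence_transformers.json are loaded into the model's `prompts` dictionary, so you can inspect the exact prompt strings used for these tasks. A minimal sketch, assuming the prompt names above are present in the config:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("retrieva-jp/amber-large")
# Prompt names and strings loaded from config_sentence_transformers.json
print(sorted(model.prompts))
print(model.prompts.get("Retrieval-query"))
print(model.prompts.get("Retrieval-passage"))
```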
| Model | # Parameters | JQaRA (nDCG@10) | JaCWIR (MAP@10) | MLDR Japanese Subset (nDCG@10) |
|---|---|---|---|---|
| **base models (< 300M)** | | | | |
| cl-nagoya/ruri-base | 111M | 58.4 | 83.3 | 32.77 |
| AMBER-base | 130M | 57.1 | 81.6 | 35.69 |
| pkshatech/GLuCoSE-base-ja-v2 | 133M | 60.6 | 85.3 | 33.99 |
| intfloat/multilingual-e5-base | 278M | 47.1 | 85.3 | 25.46 |
| **large models (> 300M)** | | | | |
| **AMBER-large (this model)** | 315M | 62.5 | 82.4 | 34.57 |
| cl-nagoya/ruri-large | 337M | 62.8 | 82.5 | 34.78 |
| intfloat/multilingual-e5-large | 560M | 55.4 | 87.3 | 29.95 |
English Benchmark: MTEB(eng, v2)
The files used for evaluation are stored in the mteb directory.
| Model | # Parameters | Mean (TaskType) | Mean (Task) | Retrieval | STS | Classification | Reranking | Clustering | PairClassification | Summarization |
|---|---|---|---|---|---|---|---|---|---|---|
| **base models (< 300M)** | | | | | | | | | | |
| AMBER-base | 130M | 54.75 | 58.20 | 40.11 | 81.29 | 70.39 | 42.98 | 42.27 | 80.12 | 26.08 |
| intfloat/multilingual-e5-base | 278M | 56.21 | 59.75 | 43.22 | 80.50 | 73.84 | 43.87 | 42.19 | 83.74 | 26.10 |
| **large models (> 300M)** | | | | | | | | | | |
| **AMBER-large (this model)** | 315M | 56.08 | 59.13 | 41.04 | 81.52 | 72.23 | 43.83 | 42.71 | 81.00 | 30.21 |
| intfloat/multilingual-e5-large | 560M | 57.06 | 60.84 | 46.17 | 81.11 | 74.88 | 44.31 | 41.91 | 84.33 | 26.67 |
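One way to reproduce an English score is the mteb library. Below is a minimal sketch for a single task; the task choice is illustrative, and matching the table exactly requires the prompts in mteb/models/retrieva_en.py:

```python
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("retrieva-jp/amber-large")
# Evaluate on one MTEB task; extend the task list to cover the full benchmark.
tasks = mteb.get_tasks(tasks=["AmazonCounterfactualClassification"], languages=["eng"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results/amber-large")
print(results)
```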
More Information
TBA
Model Card Authors
Satoru Katsumata, Daisuke Kimura, Jiro Nishitoba
Model Card Contact
pr[at]retrieva.jp