Question about MTEB Benchmark Settings: 'max_seq_length' 😭
#25 · by george31 · opened
I noticed that in MTEB benchmark implementations (eval_mteb.py), 'max_seq_length' is set to 512 tokens by default, even for models that support much longer sequences (like 32K tokens).
For example, when benchmarking embedding models with MTEB:
- Default max_seq_length: 512
- Actual model capacity: 32K tokens
This potentially underutilizes the model's capabilities and may not provide a fair comparison, especially for tasks involving longer documents.
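To make the mismatch concrete, here is a minimal sketch (not the actual eval_mteb.py logic, and using whitespace "tokens" rather than a real tokenizer) of how much of a long document survives a 512-token truncation cap:

```python
# Minimal illustration: treat list elements as "tokens" to show how a
# 512-token cap discards most of a document that fills a 32K context.
def truncate(tokens, max_seq_length=512):
    """Keep only the first max_seq_length tokens, as truncation-based
    benchmarking settings effectively do."""
    return tokens[:max_seq_length]

doc = ["tok"] * 32_000          # a document at the model's full 32K capacity
kept = truncate(doc)            # what a 512-token benchmark setting retains
print(len(kept) / len(doc))    # fraction of the document actually embedded
```

Running this prints 0.016: only 1.6% of such a document would be embedded, which is the fairness concern raised above.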
Questions:
- Is this a common practice in the industry? If so, what's the rationale behind it?
- Wouldn't it be more appropriate to use the model's full sequence length capability for fair benchmarking?
- Are there any specific technical or practical reasons why 512 tokens became the de facto standard for MTEB benchmarks?
I'd appreciate any insights from the community on this benchmarking practice.
george31 changed discussion status to closed
george31 changed discussion status to open