Question about MTEB benchmark settings: 'max_seq_length' 😭

#25
by george31 - opened

I noticed that in MTEB benchmark implementations (eval_mteb.py), 'max_seq_length' is set to 512 tokens by default, even for models that support much longer sequences (e.g., 32K tokens).

For example, when benchmarking embedding models with MTEB:

  • Default max_seq_length: 512
  • Actual model capacity: 32K tokens

This seems to underutilize the model's capabilities and may not provide a fair comparison, especially for tasks involving longer documents.
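To make the concern concrete, here is a minimal pure-Python sketch (no real model or tokenizer; the 512 limit is the only value taken from the discussion) of why a hard truncation limit can erase the differences between long documents:

```python
# Toy illustration: truncating inputs to a fixed token budget discards
# everything past the limit, so two documents that differ only in their
# tails become indistinguishable to the encoder.
MAX_SEQ_LENGTH = 512  # the default discussed above

# Two short documents whose differing content fits inside the limit:
doc_a = ["filler"] * 400 + ["contract", "renews", "annually"]
doc_b = ["filler"] * 400 + ["contract", "terminates", "in", "may"]
assert doc_a[:MAX_SEQ_LENGTH] != doc_b[:MAX_SEQ_LENGTH]  # still distinguishable

# Two long documents whose differing content sits past token 512:
doc_c = ["filler"] * 600 + ["key", "clause"]
doc_d = ["filler"] * 600 + ["other", "clause"]
assert doc_c[:MAX_SEQ_LENGTH] == doc_d[:MAX_SEQ_LENGTH]  # tails are cut off
```

Under the 512-token cap, any embedding model, regardless of its native 32K context, would map doc_c and doc_d to identical inputs.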

Questions:

  1. Is this a common practice in the field? If so, what is the rationale behind it?
  2. Wouldn't it be more appropriate to use the model's full sequence length capability for fair benchmarking?
  3. Are there any specific technical or practical reasons why 512 tokens became the de facto standard for MTEB benchmarks?

I'd appreciate any insights from the community on this benchmarking practice.

