Cat Embeddings

A set of embedding models trained to study how embedding quality varies with model architecture (width vs. depth) under a size constraint (12M parameters).

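  • cat-emb-2-128: 2 layers, hidden size 128, 4.4M parameters
  • cat-emb-4-128: 4 layers, hidden size 128, 4.8M parameters
  • cat-emb-8-128: 8 layers, hidden size 128, 5.6M parameters
  • cat-emb-12-128: 12 layers, hidden size 128, 6.4M parameters
  • cat-emb-2-256: 2 layers, hidden size 256, 9.7M parameters
  • cat-emb-4-256: 4 layers, hidden size 256, 11.3M parameters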
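
The listed sizes allow a quick sanity check on where the parameters go. The following is a minimal sketch, assuming a standard BERT-style encoder layer (feed-forward width 4x the hidden size, biases on every projection, two LayerNorms per layer). The card does not state the actual architecture, and `layer_params` and `observed_per_layer` are hypothetical helper names.

```python
# Sizes copied from the model list: (layers, hidden size) -> parameters in millions.
listed = {
    (2, 128): 4.4, (4, 128): 4.8, (8, 128): 5.6, (12, 128): 6.4,
    (2, 256): 9.7, (4, 256): 11.3,
}


def layer_params(hidden: int, ffn_mult: int = 4) -> int:
    """Parameter count of one encoder layer under the ASSUMED BERT-style layout."""
    # Q, K, V and output projections, each hidden x hidden plus a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two feed-forward matrices (hidden <-> ffn_mult * hidden) plus their biases.
    ffn = 2 * ffn_mult * hidden * hidden + ffn_mult * hidden + hidden
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * 2 * hidden
    return attention + ffn + layer_norms


def observed_per_layer(hidden: int) -> float:
    """Per-layer cost (in millions) implied by the listed totals for one hidden size."""
    points = sorted((layers, size) for (layers, h), size in listed.items() if h == hidden)
    (l0, p0), (l1, p1) = points[0], points[-1]
    return (p1 - p0) / (l1 - l0)


for hidden in (128, 256):
    estimate = layer_params(hidden) / 1e6
    print(f"H={hidden}: estimated {estimate:.3f}M per layer, "
          f"listed totals imply {observed_per_layer(hidden):.3f}M")
```

Under these assumptions a layer costs roughly 0.2M parameters at hidden size 128 and roughly 0.8M at hidden size 256, which agrees with the increments between the listed models. The remainder of each total is presumably dominated by the token embedding table, so the deeper 128-wide models spend little of the 12M budget on depth.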

Training

  • stage 1: sequence length 192, batch size 2048, 50k steps, sentence pairs.
  • stage 2: sequence length 512, batch size 64, 5k steps, sentence triplets.
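
To put the two stages in proportion, the number of training examples consumed can be worked out from the figures above. This is a small sketch that assumes each step processes exactly one full batch (no gradient accumulation, which the card does not mention); `examples_seen` is a hypothetical helper name.

```python
def examples_seen(batch_size: int, steps: int) -> int:
    """Training examples consumed, assuming one full batch per step."""
    return batch_size * steps


stage1_pairs = examples_seen(batch_size=2048, steps=50_000)
stage2_triplets = examples_seen(batch_size=64, steps=5_000)

print(f"stage 1: {stage1_pairs:,} sentence pairs")
print(f"stage 2: {stage2_triplets:,} sentence triplets")
print(f"ratio:   {stage1_pairs // stage2_triplets}x")
```

Stage 1 therefore accounts for about 102.4M pairs and stage 2 for 320k triplets, so the second stage reads as a short long-sequence fine-tune on top of a much larger pair-based stage.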

Performance

MRL dim  BIOSSES  SICK-R  STS12   STS13   STS14   STS15   STS16   STSB    SummEval
128      0.7107   0.7126  0.6815  0.7343  0.7038  0.8163  0.7495  0.7652  0.2958
64       0.713    0.7123  0.6829  0.7348  0.7008  0.813   0.7475  0.7609  0.2861
32       0.6714   0.7094  0.6847  0.7345  0.6911  0.7989  0.7385  0.7545  0.3106
16       0.6637   0.697   0.669   0.7096  0.6665  0.7589  0.7183  0.7307  0.3164
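
The MRL (Matryoshka Representation Learning) dimensions in the table refer to using only a prefix of the full embedding. The card does not show how the smaller dimensions were produced. The usual approach with MRL-trained models is to keep the first `dim` components and re-normalize, which the sketch below illustrates in plain Python. `truncate_embedding` and `cosine` are hypothetical helpers, and the random vectors are placeholders standing in for real model output.

```python
import math
import random


def truncate_embedding(vec, dim):
    """Keep the first `dim` components of `vec` and rescale to unit L2 norm."""
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to normalize
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two unit-norm vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Placeholder 128-dimensional "embeddings"; real ones would come from the model.
rng = random.Random(0)
emb_a = [rng.gauss(0.0, 1.0) for _ in range(128)]
emb_b = [rng.gauss(0.0, 1.0) for _ in range(128)]

for dim in (128, 64, 32, 16):
    a = truncate_embedding(emb_a, dim)
    b = truncate_embedding(emb_b, dim)
    print(dim, round(cosine(a, b), 4))
```

Re-normalizing after truncation matters because the prefix of a unit vector is generally not a unit vector, and the dot product is only a cosine similarity when both inputs have unit norm.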