Question on Pooling method

#2
by davidmezzetti - opened

Nice work on these smaller models.

I see that the 1_Pooling/config.json file uses mean pooling but the underlying model configuration is for CLS pooling. Is that expected?

Taylor

Yeah, I trained this one with mean pooling. If I could go back and do it again, I would be more careful to make sure the pooling I used matched how the base model was trained, but oh well. It should probably still work okay with CLS pooling, since it was trained to distill all logits from the teacher model.
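
For anyone comparing the two options, here is a rough sketch of how mean pooling and CLS pooling differ. It is not the code used to train this model. The helper names `mean_pool` and `cls_pool` and the toy token vectors are made up for illustration, and it assumes the first token in the sequence is the CLS token.

```python
# Hypothetical helpers for illustration only; not taken from this model's code.

def mean_pool(token_embeddings, attention_mask):
    """Average the embeddings of non-padding tokens (mask value 1)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    # Guard against an all-padding sequence to avoid dividing by zero.
    return [t / max(count, 1) for t in total]


def cls_pool(token_embeddings):
    """Use the first token's embedding (assumed to be the CLS token)."""
    return list(token_embeddings[0])


# Toy sequence: three real tokens followed by one padding token.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

mean_vec = mean_pool(tokens, mask)  # average over the three unmasked tokens
cls_vec = cls_pool(tokens)          # first token only
```

The two functions generally return different vectors for the same input, so embeddings produced with one pooling mode are not directly comparable to embeddings produced with the other.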
