[CLS] token representation or Pooled tokens?

by aarabil - opened 27 days ago

How is the base model used during fine-tuning? Do you take the hidden representation of the [CLS] token, or do you pool the token representations together somehow (e.g. by averaging)?
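
To make the question concrete, here is a rough sketch of the two options I have in mind, on toy NumPy arrays. The function names `cls_pool` and `mean_pool` are my own and hypothetical, not taken from this model's training code, and I am assuming [CLS] sits at position 0 and that padding is marked by an attention mask.

```python
import numpy as np

def cls_pool(hidden_states):
    """Option 1: use the hidden state at position 0 (assumed to be [CLS])."""
    # hidden_states has shape (batch, seq_len, hidden_dim)
    return hidden_states[:, 0, :]

def mean_pool(hidden_states, attention_mask):
    """Option 2: average the hidden states over non-padding tokens."""
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over hidden_dim
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    # Guard against division by zero for fully padded rows
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

# Toy batch: 2 sequences, 3 tokens each, hidden size 2
hidden = np.arange(2 * 3 * 2, dtype=np.float64).reshape(2, 3, 2)
mask = np.array([[1, 1, 1],   # no padding
                 [1, 1, 0]])  # last token is padding

cls_vec = cls_pool(hidden)
mean_vec = mean_pool(hidden, mask)
```

Either result would then be fed to the classification head, so knowing which one was used matters for reproducing the setup.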
