Model and training documentation

#1
by ftvalentini - opened

Are there any details or documentation on how these sentence embeddings are extracted from CroissantLLM, and on how the model was fine-tuned, if it was?
Thanks in advance!

Hey!
It's not documented, but basically: take the CroissantLLM hidden state for the EOS token (or a weighted average of the token hidden states) and train contrastively with Sentence Transformers on the listed dataset!
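For anyone reading along, here is a rough sketch of what those two pooling options and a contrastive objective could look like. It is only an illustration under assumptions, not the actual training code: it uses plain Python lists in place of real model tensors so it runs without dependencies, it assumes right-padding so that the last non-padding token is the EOS token, and it assumes position-based weights (token i gets weight i + 1) since the exact weighting is not specified above. The function names (`eos_pool`, `weighted_mean_pool`, `in_batch_contrastive_loss`) and the `scale` value are made up for this example; the loss is a generic in-batch-negatives loss, similar in spirit to what Sentence Transformers offers, and may differ from what was actually used.

```python
import math

def eos_pool(hidden, mask):
    """Return the hidden state of the last non-padding token (assumed to be EOS)."""
    last = max(i for i, m in enumerate(mask) if m == 1)
    return list(hidden[last])

def weighted_mean_pool(hidden, mask):
    """Position-weighted mean: token i gets weight i + 1, padding gets weight 0.
    The weighting scheme is an assumption, not taken from the thread."""
    weights = [(i + 1) * m for i, m in enumerate(mask)]
    total = sum(weights)
    dim = len(hidden[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden)) / total for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def in_batch_contrastive_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over scaled cosine similarities.
    positives[i] is the match for anchors[i]; every other positive in the
    batch acts as a negative. The scale value is an arbitrary choice."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
        losses.append(log_sum - logits[i])
    return sum(losses) / len(losses)

# Toy "hidden states": 4 positions, 2 dimensions, last position is padding.
hidden = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

eos_embedding = eos_pool(hidden, mask)
mean_embedding = weighted_mean_pool(hidden, mask)

# Toy batch: correctly paired sentences vs. deliberately swapped pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = in_batch_contrastive_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = in_batch_contrastive_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

In a real setup the hidden states would come from the model's last layer and the mask from the tokenizer's attention mask; the padded position above is ignored by both pooling functions, and the loss is lower when each anchor is paired with its true positive.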

Great, thanks! And do you happen to know how that dataset was built?
