Vocab Size Mismatch

#2
by manu261297 - opened

Thanks for sharing this model!
I have a question: if the vocab size is 33, why is the last dimension of the logits output 64?
I'm probably missing something simple.
Thanks up front!

manu261297 changed discussion title from Vocab Size to Vocab Size Mismatch
Synthyra org

Hi @manu261297 ,

The vocab size is actually 64: ESMC (and ESM++) use a different tokenizer than ESM2. Most of these tokens aren't used in practice, but matrices whose dimensions are divisible by 8 play nicely with GPUs :)
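Here's a quick sketch of how you can check this yourself (the loading pattern is assumed from the ESM++ model card; adjust the checkpoint name for your setup):

```python
import torch
from transformers import AutoModelForMaskedLM

# Load ESM++ with its custom modeling code (assumed from the model card).
model = AutoModelForMaskedLM.from_pretrained(
    "Synthyra/ESMplusplus_small", trust_remote_code=True
)
tokenizer = model.tokenizer  # ESM++ bundles its ESMC-style tokenizer

print(model.config.vocab_size)  # 64, not 33

inputs = tokenizer("MSKLV", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # (1, seq_len, 64)
```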
Let me know if you have any other questions.
Best,

  • Logan

Great, that makes sense. So if I want the probability of each amino acid at a given position, I just use the IDs from the first 33 entries, the ones that correspond to the actually-used vocab, right?
Thanks for your work and your help!

Synthyra org

Yep! Indexing/slicing the logits is the best way. You can even slice before the softmax if you like; some people prefer that for models like this with unused tokens. Going to close the issue; if you have any other questions, feel free to reopen. Take care!
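For reference, a minimal sketch of that slicing (the `used_vocab = 33` cutoff and the `logits` tensor come from this thread, not from the model's docs):

```python
import torch

used_vocab = 33  # assumed from this thread: ids 0..32 are the used tokens

# logits: (batch, seq_len, 64) from the forward pass above
aa_logits = logits[..., :used_vocab]         # drop the unused padding columns
aa_probs = torch.softmax(aa_logits, dim=-1)  # sums to 1 over the used tokens

# Alternatively, softmax first and then slice; the unused tokens then keep a
# (tiny) share of the probability mass, so the slice won't sum exactly to 1:
probs_then_slice = torch.softmax(logits, dim=-1)[..., :used_vocab]
```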

lhallee changed discussion status to closed