Vocab Size Mismatch
Thanks for sharing this model!
I have a question: if the vocab size is 33, why is the last dimension of the logits output shape 64?
I'm probably missing something simple.
Thanks upfront!
Hi @manu261297 ,
The vocab size is actually 64; ESMC (and ESM++) uses a different tokenizer than ESM2. Most of these tokens aren't used in practice, but matrices with dimensions divisible by 8 play nicely with GPUs :)
Let me know if you have any other questions.
Best,
- Logan
Great, makes sense. So if I want to get the probability of different AAs at a given position, I just use the IDs from the first 33 that actually correspond to the vocabulary in use, right?
Thanks for your work and your help!
Yep! Indexing / slicing the logits is the best way. You can even slice before the softmax if you like; some people prefer that for models with unused tokens. Going to close the issue, but if you have any other questions feel free to reopen. Take care!
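For anyone landing here later, here is a minimal PyTorch sketch of the slice-before-softmax approach. It assumes the model's logits have shape `(batch, seq_len, 64)` and that the used token IDs are `0..32` (the first 33 entries), as discussed above; the dummy logits and the `USED_VOCAB_SIZE` constant are just placeholders for illustration.

```python
import torch
import torch.nn.functional as F

USED_VOCAB_SIZE = 33  # assumption: the used token IDs are 0..32

# Dummy logits standing in for the model output; replace with your model's logits.
batch, seq_len, vocab = 1, 10, 64
logits = torch.randn(batch, seq_len, vocab)

position = 4  # sequence position of interest

# Slice to the used vocabulary *before* softmax so probabilities are
# normalized only over tokens that can actually occur.
used_logits = logits[:, position, :USED_VOCAB_SIZE]
probs = F.softmax(used_logits, dim=-1)  # shape: (batch, 33)

print(probs.shape, probs.sum(dim=-1))  # sums to 1 over the 33 used tokens
```

Slicing after the softmax would also work, but then the 33 probabilities wouldn't sum to 1 unless you renormalize them.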