Expecting bi-modal distribution of probabilities
#9 by christianclough
Hello, thanks for the great paper, and for publishing the model here!
I'm getting some unusual results. I'm computing a large number of masked-token probabilities on human DNA with Nucleotide Transformer, DNABERT and DNABERT-2. For every model except DNABERT-2, the probabilities form a decent bi-modal distribution (one group of high values and one group of low values). For DNABERT-2 the distribution is mono-modal, with only low probabilities.
Is inference via HuggingFace definitely working for everyone? Note I'm using the model in Google Colab.
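To make the quantity being compared concrete, here is a minimal sketch of the calculation, not the exact code from my notebook. It assumes that for each masked position you already have the model's logits over its vocabulary and the id of the token that was masked out. The function names and the toy logits are hypothetical, for illustration only.

```python
import math

def masked_token_probability(logits, true_token_id):
    """Softmax probability assigned to the true token at one masked position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[true_token_id] / sum(exps)

def histogram(probs, n_bins=10):
    """Count probabilities in equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for p in probs:
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[idx] += 1
    return counts

# Toy data (hypothetical): (logits over a 4-token vocabulary, true token id)
toy_positions = [
    ([4.0, 0.0, 0.0, 0.0], 0),  # model is confident and correct
    ([4.0, 0.0, 0.0, 0.0], 1),  # model is confident but wrong
    ([0.0, 0.0, 0.0, 0.0], 2),  # model is uniform over the vocabulary
]

probs = [masked_token_probability(l, t) for l, t in toy_positions]
counts = histogram(probs, n_bins=4)
print(probs)
print(counts)
```

A bi-modal result would show mass in both the lowest and highest bins of this histogram, while the mono-modal result I see has mass only in the lowest bins.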