Is there a way to add new tokens to the tokenizer?

#14 opened by Meshwa

I tried calling `tokenizer.add_tokens`, but it doesn't work for this tokenizer. My dataset contains markers like `<laughter>`, `<cough>`, etc., and I would like each of them to be encoded as a single token. Is there any way to add new tokens to this tokenizer, or is it just not possible?
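For context, this is roughly what I tried. It is a minimal sketch following the usual Hugging Face `transformers` pattern; the checkpoint name is a placeholder, not the actual model from this repo:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint name; substitute the tokenizer this thread is about.
tokenizer = AutoTokenizer.from_pretrained("model-name")

# Register the dataset's event markers so each becomes a single token.
new_tokens = ["<laughter>", "<cough>"]
num_added = tokenizer.add_tokens(new_tokens)
print(f"Added {num_added} tokens")

# Each marker should now map to one token id instead of being split apart.
print(tokenizer.tokenize("<laughter> that was funny <cough>"))

# When fine-tuning, the model's embedding matrix must be resized to match
# the enlarged vocabulary:
# model.resize_token_embeddings(len(tokenizer))
```

With a standard tokenizer this prints a nonzero count and keeps each marker intact, but here the new tokens don't seem to take effect.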
