I tried `tokenizer.add_tokens`, but it doesn't work for this. My dataset contains tags like <laughter>, <cough>, etc., and I would like each of them to be a single token. Is there a way to add new tokens to this tokenizer, or is it just not possible?
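For reference, here is a minimal sketch of how adding new tokens usually works with a Hugging Face `transformers` tokenizer; the model name is just an example, and whether this suffices depends on the specific tokenizer in question:

```python
from transformers import AutoTokenizer

# Load any pretrained tokenizer ("bert-base-uncased" is only an example).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

new_tokens = ["<laughter>", "<cough>"]

# add_tokens adds only tokens not already in the vocabulary and
# returns the number actually added.
num_added = tokenizer.add_tokens(new_tokens)

# Each tag should now come out as one token instead of being split
# into subword pieces.
print(tokenizer.tokenize("he said <laughter> hello"))

# If a model is fine-tuned with this tokenizer, its embedding matrix
# must be resized to match the new vocabulary size:
# model.resize_token_embeddings(len(tokenizer))
```

Alternatively, `tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})` registers them as special tokens, which also keeps them from being altered by normalization and lets `decode(..., skip_special_tokens=True)` drop them.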