I tried `tokenizer.add_tokens`, but it doesn't work for this. My dataset contains tags like <laughter>, <cough>, etc., and I would like each of them to be a single token. Is there a way to add new tokens to this tokenizer, or is it just not possible?
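For reference, here is a minimal sketch of how adding new tokens usually works with a Hugging Face `transformers` tokenizer; the model name is just an example, and whether this suffices depends on the specific tokenizer in question:

```python
from transformers import AutoTokenizer

# Load any pretrained tokenizer ("bert-base-uncased" is only an example).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

new_tokens = ["<laughter>", "<cough>"]

# add_tokens adds only tokens not already in the vocabulary and
# returns the number actually added.
num_added = tokenizer.add_tokens(new_tokens)

# Each tag should now come out as one token instead of being split
# into subword pieces.
print(tokenizer.tokenize("he said <laughter> hello"))

# If a model is fine-tuned with this tokenizer, its embedding matrix
# must be resized to match the new vocabulary size:
# model.resize_token_embeddings(len(tokenizer))
```

Alternatively, `tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})` registers them as special tokens, which also keeps them from being altered by normalization and lets `decode(..., skip_special_tokens=True)` drop them.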