An XLM-RoBERTa tokenizer for Khmer, trained on 162M tokens of Khmer text.
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("seanghay/xlm-roberta-khmer-32k-tokenizer")
tokenizer.tokenize("សួស្តីកម្ពុជា!")
```
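For context, a tokenizer like this can be trained with the Hugging Face `tokenizers` library. The sketch below is an assumption about the training setup, not the actual recipe: it uses a Unigram model with a Metaspace pre-tokenizer (as in XLM-RoBERTa's SentencePiece tokenizer) and a 32k target vocabulary, with a tiny placeholder corpus standing in for the real 162M-token Khmer corpus.

```python
# Hypothetical training sketch; model type, special tokens, and corpus
# are assumptions -- the real tokenizer was trained on 162M Khmer tokens.
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.Unigram())
# Metaspace replaces spaces with "▁", SentencePiece-style.
tokenizer.pre_tokenizer = pre_tokenizers.Metaspace()

trainer = trainers.UnigramTrainer(
    vocab_size=32000,  # the "32k" in the model name
    special_tokens=["<s>", "</s>", "<unk>", "<pad>", "<mask>"],
    unk_token="<unk>",
)

# Placeholder corpus; in practice this would be an iterator over
# the full Khmer training text.
corpus = ["សួស្តីកម្ពុជា!", "ខ្ញុំរៀនភាសាខ្មែរ"]
tokenizer.train_from_iterator(corpus, trainer=trainer)
```

With a corpus this small the learned vocabulary stays well below 32k; the target size only matters at real corpus scale.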