Update tokenization_phi3_small.py

#29
by XirenZhou - opened
Microsoft org

This change handles the case where a user has restricted internet access and therefore cannot download the tokenizer file from https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken, which is what happens when calling base = tiktoken.get_encoding("cl100k_base") directly. The workaround is to download the cl100k_base.tiktoken file from this repo to the user's local machine instead. This assumes the user can access files on the Hugging Face Hub.
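For readers who want to see the idea in isolation, here is a minimal sketch of the workaround. It is not the code from this PR. The helper names `load_tiktoken_ranks` and `fetch_ranks_from_hub` are hypothetical, and `repo_id` is left as a parameter because the description only says "this repo". The sketch relies on the `.tiktoken` file format (one base64-encoded token and its integer rank per line) and on `hf_hub_download` from `huggingface_hub`.

```python
import base64
from pathlib import Path


def load_tiktoken_ranks(path):
    """Parse a .tiktoken file into a {token_bytes: rank} dict.

    Each non-empty line has the form '<base64 token> <rank>'.
    (Hypothetical helper, written for illustration.)
    """
    ranks = {}
    for line in Path(path).read_bytes().splitlines():
        if not line:
            continue  # skip blank lines
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks


def fetch_ranks_from_hub(repo_id, filename="cl100k_base.tiktoken"):
    """Download the file from a Hub repo and parse it locally.

    repo_id is an assumption supplied by the caller; it is not named
    in the PR description. (Hypothetical helper.)
    """
    # Imported lazily so the parser above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return load_tiktoken_ranks(local_path)
```

The resulting dict can then be passed as `mergeable_ranks` when constructing a `tiktoken.Encoding`, together with the split pattern and special tokens the tokenizer defines, so that `tiktoken.get_encoding("cl100k_base")` and its download from openaipublic.blob.core.windows.net are never triggered.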

nguyenbh changed pull request status to merged
