Slow to load tokenizer
#2 by gptzerozero - opened
Has anyone noticed that it takes a long time (~2 minutes) to load the tokenizer for this GPTQ model, while other GPTQ models like TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ load almost instantly (~100 ms)?
from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
Oh, I never uploaded a fast tokenizer for this. I'll sort that out now.

Done. Trigger a download of the model again and it will fetch tokenizer.json, after which it will load instantly.
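For reference, the slowdown happens because a fast tokenizer ships as a single tokenizer.json file; when that file is missing, AutoTokenizer converts the slow SentencePiece tokenizer on every load, which can take minutes for large vocabularies. A minimal sketch of a local workaround (the helper names here are hypothetical, and the one-time conversion assumes transformers is installed):

```python
import os


def has_fast_tokenizer(model_dir: str) -> bool:
    # A fast tokenizer is stored as a single tokenizer.json file.
    # If it is absent, AutoTokenizer falls back to converting the
    # slow tokenizer at load time, which is what makes loading slow.
    return os.path.isfile(os.path.join(model_dir, "tokenizer.json"))


def ensure_fast_tokenizer(model_dir: str):
    # Hypothetical one-time fix: load the tokenizer (triggering the
    # slow conversion once) and save it back, which writes
    # tokenizer.json so subsequent loads are near-instant.
    from transformers import AutoTokenizer  # local import: only needed for the conversion

    tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
    if not has_fast_tokenizer(model_dir):
        tokenizer.save_pretrained(model_dir)
    return tokenizer
```

Once the repo itself includes tokenizer.json (as in the fix above), none of this is necessary; re-downloading the model is enough.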
Hello, is there an 8-bit GPTQ version?