Slow to load tokenizer
#2 by gptzerozero - opened
Has anyone noticed that it takes a long time (~2 minutes) to load the tokenizer for this GPTQ model, while other GPTQ models like TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ load almost instantly (~100 ms)?
from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
Oh, I never uploaded a fast tokenizer for this. I'll sort that out now.

Done. Trigger a download of the model again and it will fetch tokenizer.json, after which it will load instantly.
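For reference, the slowdown happens because a fast tokenizer ships as a single tokenizer.json file; when that file is missing, AutoTokenizer converts the slow SentencePiece tokenizer on every load, which can take minutes for large vocabularies. A minimal sketch of a local workaround (the helper names here are hypothetical, and the one-time conversion assumes transformers is installed):

```python
import os


def has_fast_tokenizer(model_dir: str) -> bool:
    # A fast tokenizer is stored as a single tokenizer.json file.
    # If it is absent, AutoTokenizer falls back to converting the
    # slow tokenizer at load time, which is what makes loading slow.
    return os.path.isfile(os.path.join(model_dir, "tokenizer.json"))


def ensure_fast_tokenizer(model_dir: str):
    # Hypothetical one-time fix: load the tokenizer (triggering the
    # slow conversion once) and save it back, which writes
    # tokenizer.json so subsequent loads are near-instant.
    from transformers import AutoTokenizer  # local import: only needed for the conversion

    tokenizer = AutoTokenizer.from_pretrained(model_dir, use_fast=True)
    if not has_fast_tokenizer(model_dir):
        tokenizer.save_pretrained(model_dir)
    return tokenizer
```

Once the repo itself includes tokenizer.json (as in the fix above), none of this is necessary; re-downloading the model is enough.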
Hello, is there an 8-bit GPTQ version?