Having trouble loading this in ooba

#1 by Efaarts

Getting this on load:


```
Traceback (most recent call last):
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\modules\ui_model_menu.py", line 209, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\modules\models.py", line 93, in load_model
    tokenizer = load_tokenizer(model_name, model)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\modules\models.py", line 113, in load_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\installer_files\env\Lib\site-packages\transformers\models\auto\tokenization_auto.py", line 751, in from_pretrained
    tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\installer_files\env\Lib\site-packages\transformers\dynamic_module_utils.py", line 487, in get_class_from_dynamic_module
    final_module = get_cached_module_file(
                   ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\installer_files\env\Lib\site-packages\transformers\dynamic_module_utils.py", line 293, in get_cached_module_file
    resolved_module_file = cached_file(
                           ^^^^^^^^^^^^
  File "C:\Projects\MachineLearning\LLMWebUI\text-generation-webui\installer_files\env\Lib\site-packages\transformers\utils\hub.py", line 401, in cached_file
    raise EnvironmentError(
OSError: models\Yi-34B-GiftedConvo-merged-6.0bpw-h6-exl2 does not appear to have a file named tokenization_yi.py. Checkout 'https://huggingface.co/models\Yi-34B-GiftedConvo-merged-6.0bpw-h6-exl2/None' for available files.
```

I had a similar error when loading the 5.0bpw in Exllamav2_HF.
But it loaded okay in Exllamav2.
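
For what it's worth, the Exllamav2_HF loader goes through Transformers' AutoTokenizer, and the tokenizer_config.json in these Yi conversions points at a custom tokenization_yi.py class via its auto_map, so loading fails when that file isn't present in the local model folder. A possible workaround (untested here, and it assumes the base 01-ai/Yi-34B repo still hosts tokenization_yi.py) is to copy the file in manually:

```python
# Sketch of a possible fix: pull the missing tokenization_yi.py from the base
# Yi repo into the local model folder so AutoTokenizer can resolve the
# auto_map entry. Assumes 01-ai/Yi-34B still ships this file.
import shutil
from huggingface_hub import hf_hub_download

src = hf_hub_download(repo_id="01-ai/Yi-34B", filename="tokenization_yi.py")
shutil.copy(src, r"models\Yi-34B-GiftedConvo-merged-6.0bpw-h6-exl2")
```

After that, reloading in Exllamav2_HF should at least get past the tokenizer step; you may also need the trust-remote-code option enabled in the webui so the custom tokenizer class is allowed to run.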

Also, please make sure your Exllamav2 is updated to the latest version.
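
If you want to check which version is installed inside the webui's bundled environment, something like this works (importlib.metadata is in the standard library, so no extra installs):

```python
# Print the exllamav2 version installed in the currently active environment.
from importlib.metadata import version

print(version("exllamav2"))
```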

I never load in Exllamav2_HF since I don't need any of the extra options. I tried the 8.0bpw model on my 2x4090 system and inference is really slow, around 7 t/s. The 4.0bpw model runs at 23 t/s and doesn't seem any different in terms of inference quality.
