runtime error

pytorch_model-00002-of-00002.bin:  65% | 3.22G/4.96G [00:27<00:11, 149MB/s]
pytorch_model-00002-of-00002.bin:  72% | 3.55G/4.96G [00:28<00:07, 182MB/s]
pytorch_model-00002-of-00002.bin:  79% | 3.93G/4.96G [00:29<00:04, 219MB/s]
pytorch_model-00002-of-00002.bin:  85% | 4.19G/4.96G [00:31<00:03, 225MB/s]
pytorch_model-00002-of-00002.bin: 100% | 4.96G/4.96G [00:31<00:00, 156MB/s]
Downloading shards: 100% | 2/2 [01:46<00:00, 49.62s/it]
Downloading shards: 100% | 2/2 [01:46<00:00, 53.28s/it]
config.json:   0% | 0.00/4.61k [00:00<?, ?B/s]
config.json: 100% | 4.61k/4.61k [00:00<00:00, 16.2MB/s]
config.json:   0% | 0.00/4.61k [00:00<?, ?B/s]
config.json: 100% | 4.61k/4.61k [00:00<00:00, 14.9MB/s]

Traceback (most recent call last):
  File "/home/user/app/app.py", line 143, in <module>
    handler = Chat(model_path, conv_mode=conv_mode, load_8bit=load_8bit, load_4bit=load_8bit, device=device)
  File "/home/user/app/llava/serve/gradio_utils.py", line 56, in __init__
    self.tokenizer, self.model, processor, context_len = load_pretrained_model(model_path, model_base, model_name,
  File "/home/user/app/llava/model/builder.py", line 114, in load_pretrained_model
    model = LlavaLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3002, in _load_pretrained_model
    raise ValueError(
ValueError: The current `device_map` had weights offloaded to the disk. Please provide an `offload_folder` for them. Alternatively, make sure you have `safetensors` installed if the model you are using offers the weights in this format.
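The `ValueError` at the end of the traceback means some weights were offloaded to disk by the `device_map`, but `from_pretrained` was not told where to put them. A minimal sketch of one possible fix, modeled on the call in `llava/model/builder.py` (the `kwargs` dict and the `offload` directory name here are assumptions for illustration, not the app's actual code):

```python
# Hypothetical sketch: extend the keyword arguments passed to
# from_pretrained so that weights offloaded to disk by the device_map
# have a destination directory.
kwargs = {
    "low_cpu_mem_usage": True,     # already passed in the traceback above
    "device_map": "auto",          # assumption: accelerate decides placement
    "offload_folder": "offload",   # directory for disk-offloaded weights
}

# The real call would then look like this (commented out here, since it
# requires the model weights to be available):
# model = LlavaLlamaForCausalLM.from_pretrained(model_path, **kwargs)
```

Alternatively, as the error message suggests, installing `safetensors` (`pip install safetensors`) lets transformers load `.safetensors` weights when the model repository provides them, which can avoid the disk-offload path entirely.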
