RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1048576 bytes.
I am getting this error. Setting the CPU and GPU memory limits to 1 MiB doesn't change the error at all.
Traceback (most recent call last):
  File "D:\AI\oobabooga-windows\text-generation-webui\server.py", line 85, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "D:\AI\oobabooga-windows\text-generation-webui\modules\models.py", line 100, in load_model
    model = load_quantized(model_name)
  File "D:\AI\oobabooga-windows\text-generation-webui\modules\GPTQ_loader.py", line 151, in load_quantized
    model = load_quant(str(path_to_model), str(pt_path), shared.args.wbits, shared.args.groupsize, kernel_switch_threshold=threshold)
  File "D:\AI\oobabooga-windows\text-generation-webui\modules\GPTQ_loader.py", line 32, in _load_quant
    model = AutoModelForCausalLM.from_config(config)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\auto\auto_factory.py", line 411, in from_config
    return model_class._from_config(config, **kwargs)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\modeling_utils.py", line 1146, in _from_config
    model = cls(config, **kwargs)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 614, in __init__
    self.model = LlamaModel(config)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 445, in __init__
    self.layers = nn.ModuleList([LlamaDecoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 445, in <listcomp>
    self.layers = nn.ModuleList([LlamaDecoderLayer(config) for _ in range(config.num_hidden_layers)])
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 255, in __init__
    self.self_attn = LlamaAttention(config=config)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 180, in __init__
    self.rotary_emb = LlamaRotaryEmbedding(self.head_dim, max_position_embeddings=self.max_position_embeddings)
  File "D:\AI\oobabooga-windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 106, in __init__
    self.register_buffer("cos_cached", emb.cos()[None, None, :, :], persistent=False)
RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1048576 bytes.
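For what it's worth, the size in the error message seems to line up with the rotary embedding cache that the last frame is registering. The sketch below is only a back-of-the-envelope check, not something taken from the library. The values 2048 for max_position_embeddings, 128 for the head dimension, and 4 bytes per element (float32) are assumptions based on typical LLaMA configurations, and `rotary_cache_bytes` is a hypothetical helper name. If those assumptions hold, the failing request is just 1 MiB, which would suggest that available memory was already used up earlier while the model was being built on the CPU, rather than this buffer being large.

```python
# Rough size check for the allocation named in the error message.
# All three constants are ASSUMPTIONS (typical LLaMA values), not read
# from the poster's config.
MAX_POSITION_EMBEDDINGS = 2048  # assumed sequence length of the cache
HEAD_DIM = 128                  # assumed hidden_size / num_attention_heads
BYTES_PER_ELEMENT = 4           # assumed float32

def rotary_cache_bytes(seq_len: int, head_dim: int, bytes_per_element: int) -> int:
    """Bytes needed for one (seq_len, head_dim) cos or sin table."""
    return seq_len * head_dim * bytes_per_element

requested = 1048576  # value copied from the RuntimeError text
estimate = rotary_cache_bytes(MAX_POSITION_EMBEDDINGS, HEAD_DIM, BYTES_PER_ELEMENT)

print("estimated bytes:", estimate)
print("matches error message:", estimate == requested)
print("size in MiB:", requested / 2**20)
```

Under these assumptions the estimate equals the number in the error, so the request that failed is a small one. That fits the observation that changing the memory settings makes no difference to the message.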
Same issue here. Good to know I'm not alone.
Please review this thread. Lots of troubleshooting, patterns, and solutions were found. Feel free to spread the news to anyone who has the same issue.
https://huggingface.co/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g/discussions/15