Tips to run the model
Hi, first of all I had to manually apply the pull requests on the cloned repository, and that solved the 'layer_idx' error.
After that, I still get one more error during inference, regarding 'cache_position' (is there maybe a recommended transformers version?):
Loading checkpoint shards: 100%|██████████| 8/8 [00:01<00:00, 7.31it/s]
Traceback (most recent call last):
File "C:\writer\pythonProject\debug_italia9b.py", line 32, in <module>
outputs = t_pipeline(
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\pipelines\text_generation.py", line 272, in __call__
return super().__call__(text_inputs, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\pipelines\base.py", line 1302, in __call__
return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\pipelines\base.py", line 1309, in run_single
model_outputs = self.forward(model_inputs, **forward_params)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\pipelines\base.py", line 1209, in forward
model_outputs = self._forward(model_inputs, **forward_params)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\pipelines\text_generation.py", line 370, in _forward
generated_sequence = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\generation\utils.py", line 2215, in generate
result = self._sample(
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\generation\utils.py", line 3206, in _sample
outputs = self(**model_inputs, return_dict=True)
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\models\gpt_neox\modeling_gpt_neox.py", line 1178, in forward
outputs = self.gpt_neox(
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\transformers\models\gpt_neox\modeling_gpt_neox.py", line 951, in forward
outputs = layer(
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\writer\pythonProject.venv\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
TypeError: forward() got an unexpected keyword argument 'cache_position'
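Since this error depends on which transformers release is installed, it can help to report the exact versions alongside the traceback. A minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of transformers:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions most relevant to this traceback.
    for name in ("transformers", "torch"):
        print(name, installed_version(name) or "not installed")
```

Posting this output with the traceback makes it easier for others to tell whether the custom modeling script and the installed library are out of sync.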
I did get the model to run. My transformers version is 4.46.1, and this is how I edited the modeling_italia.py script (I added the last two parameters to the forward() method):
def forward(
    self,
    hidden_states: Optional[torch.FloatTensor],
    attention_mask: Optional[torch.FloatTensor] = None,
    position_ids: Optional[torch.LongTensor] = None,
    head_mask: Optional[torch.FloatTensor] = None,
    use_cache: Optional[bool] = False,
    layer_past: Optional[Tuple[torch.Tensor]] = None,
    output_attentions: Optional[bool] = False,
    cache_position: Optional[torch.LongTensor] = None,  # added
    position_embeddings: Optional[torch.Tensor] = None,  # added
):
Please note this is a workaround.
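To illustrate why extending the signature makes the TypeError go away, here is a small self-contained sketch with no torch dependency. The classes `OldLayer` and `PatchedLayer` and the function `call_like_newer_library` are hypothetical stand-ins, not code from modeling_italia.py or transformers, and the sketch assumes the patched method simply accepts the new arguments without using them:

```python
from typing import Any, Optional


class OldLayer:
    """Stand-in for a layer whose forward() predates the new keyword arguments."""

    def forward(self, hidden_states: Any, use_cache: Optional[bool] = False) -> Any:
        return hidden_states


class PatchedLayer:
    """Same layer, with the two extra optional parameters accepted (and unused here)."""

    def forward(
        self,
        hidden_states: Any,
        use_cache: Optional[bool] = False,
        cache_position: Optional[Any] = None,       # accepted, not used (assumption)
        position_embeddings: Optional[Any] = None,  # accepted, not used (assumption)
    ) -> Any:
        return hidden_states


def call_like_newer_library(layer: Any, hidden_states: Any) -> Any:
    """Hypothetical caller that always passes the newer keyword arguments."""
    return layer.forward(
        hidden_states,
        use_cache=False,
        cache_position=None,
        position_embeddings=None,
    )


try:
    call_like_newer_library(OldLayer(), [1.0, 2.0])
except TypeError as err:
    # Same kind of failure as in the traceback: an unexpected keyword argument.
    print("old signature fails:", err)

print("patched signature works:", call_like_newer_library(PatchedLayer(), [1.0, 2.0]))
```

If the patched method only accepts the arguments and never reads them, any behavior the caller expects from those values is silently skipped, which is one reason to treat this as a workaround and not a proper fix.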