Error while sampling from output

#1
by Beck777 - opened

Flash Attention installed
Flash RoPE installed

*** Generate:
/home/beck/.cache/huggingface/modules/transformers_modules/TheBloke/Yarn-Llama-2-7B-128K-GPTQ/d4fc0fb58bb6426ddb0ad7c104493c611a0a3a69/modeling_llama_together_yarn.py:522: UserWarning: operator() profile_node %34 : int[] = prim::profile_ivalue(%32)
does not have profile information (Triggered internally at ../third_party/nvfuser/csrc/graph_fuser.cpp:104.)
kv = repeat_kv(kv, self.num_key_value_groups)
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/var/local/acp/dashboards/draft/weaviate/app/module/consumer_distil_url_content.py", line 296, in <module>
    output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/beck/.local/share/virtualenvs/weaviate-CeQZUpy1/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/beck/.local/share/virtualenvs/weaviate-CeQZUpy1/lib/python3.11/site-packages/transformers/generation/utils.py", line 1648, in generate
    return self.sample(
           ^^^^^^^^^^^^
  File "/home/beck/.local/share/virtualenvs/weaviate-CeQZUpy1/lib/python3.11/site-packages/transformers/generation/utils.py", line 2766, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: probability tensor contains either inf, nan or element < 0
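
The RuntimeError names three conditions under which the probability vector is rejected: an entry that is inf, nan, or below zero. As a rough illustration only, here is a plain-Python sketch of how a non-finite logit can turn into nan probabilities after softmax, and how to find the offending entries. It is not taken from the model code or from transformers. `softmax` and `find_invalid_probs` are hypothetical helper names, and the logit values are made up. With real tensors, `torch.isnan` and `torch.isinf` could presumably be used for the same check on the logits before sampling.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats (illustrative helper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtracting the max keeps exp() from overflowing for finite inputs.
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def find_invalid_probs(probs):
    """Return indices of entries that are nan, inf, or negative."""
    return [
        i for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


# Made-up finite logits: every probability is valid.
healthy = softmax([2.0, 1.0, 0.5], temperature=0.7)
print(find_invalid_probs(healthy))

# One logit has overflowed to inf: inf - inf is nan, so the whole
# distribution becomes nan and sampling from it would be rejected.
overflowed = softmax([float("inf"), 1.0], temperature=0.7)
print(find_invalid_probs(overflowed))
```

If a check like this flags the model's logits, the problem most likely sits upstream of sampling (for example in numeric overflow in the forward pass) rather than in the `generate` arguments themselves. That is a guess, though, and the log above does not confirm it.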
