AttributeError: type object 'AttentionMaskConverter' has no attribute '_ignore_causal_mask_sdpa'
I followed the sample code, but got this error:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id
)
File /opt/conda/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py:1076, in LlamaModel._update_causal_mask(self, attention_mask, input_tensor, cache_position, past_seen_tokens)
   1071     return None
   1073 if self.config._attn_implementation == "sdpa":
   1074     # For SDPA, when possible, we will rely on its `is_causal` argument instead of its `attn_mask` argument,
   1075     # in order to dispatch on Flash Attention 2.
-> 1076     if AttentionMaskConverter._ignore_causal_mask_sdpa(
   1077         attention_mask, inputs_embeds=input_tensor, past_key_values_length=past_seen_tokens
   1078     ):
   1079         return None
   1081 dtype, device = input_tensor.dtype, input_tensor.device
AttributeError: type object 'AttentionMaskConverter' has no attribute '_ignore_causal_mask_sdpa'
It works after refreshing (reinstalling) and restarting.
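For anyone still hitting this before they can reinstall or upgrade: the traceback shows the error only happens inside the `config._attn_implementation == "sdpa"` branch, so forcing eager attention avoids the call to the missing helper. This is a workaround sketch only, and it assumes your installed transformers already accepts the attn_implementation argument in from_pretrained:

from transformers import AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Assumption: attn_implementation is supported by your installed version.
# Loading with eager attention skips the SDPA-only code path that calls
# AttentionMaskConverter._ignore_causal_mask_sdpa.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="eager",
)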
I have the same problem. I re-ran the code and still get the same error.
Did you solve it? How?
Upgrading the transformers library to the latest version, 4.40.2, fixed it.
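The AttributeError means the modeling_llama.py being loaded expects a newer AttentionMaskConverter than the one installed, so a clean upgrade plus a kernel restart resolves it. A quick sanity check, assuming a standard pip environment:

import transformers

# Upgrade first, e.g.:  pip install -U "transformers>=4.40.2"
# then restart the kernel/runtime so the new files are actually loaded.
print(transformers.__version__)  # should print 4.40.2 or newer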