Can we use torch for the attention implementation?
#8 opened by LouiSum
Currently the LM is using Triton for the attention implementation. Can we change it to torch in the config?
Yes, the model supports `torch` or `triton` for the `attn_impl` kwarg, and `'torch'` is the default. So just don't pass the `attn_impl` kwarg to the `AutoModelForCausalLM.from_pretrained` call and it will default to the torch attention implementation!
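For reference, a minimal sketch of what that looks like. The checkpoint name below is a placeholder (the thread doesn't name the exact model), and `trust_remote_code=True` is assumed because models with custom attention code typically require it:

```python
from transformers import AutoModelForCausalLM

MODEL_ID = "replit/replit-code-v1-3b"  # placeholder; substitute your checkpoint

# Omitting the attn_impl kwarg leaves the model on the default torch
# attention path, so no Triton installation is needed.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # assumed: needed for custom modeling code
)

# To opt into the Triton kernel explicitly instead (requires triton installed):
# model = AutoModelForCausalLM.from_pretrained(
#     MODEL_ID, attn_impl="triton", trust_remote_code=True
# )
```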
Lmk if you continue to have trouble with this!
It works. Thanks
madhavatreplit changed discussion status to closed