multiquery attention
#46, opened by ZhongYingMatrix
Hi, thank you for your excellent work. I read about multiquery attention in the blog post at https://huggingface.co/blog/falcon, but I am unable to locate its implementation in the source code. Could you please point me to where it is implemented?
All model-related code is in the modelling_RW.py file.
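For readers searching that file, it may help to know what to look for. The following is a minimal NumPy sketch of multiquery attention in general, not the code from modelling_RW.py: every query head gets its own projection, while a single key head and a single value head are shared by all of them. The function name, argument names, and weight shapes are assumptions made for illustration only.

```python
import numpy as np


def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    """Causal multiquery self-attention over one sequence (illustrative only).

    x:   (seq_len, d_model) input activations
    w_q: (d_model, d_model) query projection, split into n_heads heads
    w_k: (d_model, head_dim) the single shared key projection
    w_v: (d_model, head_dim) the single shared value projection
    """
    seq_len, d_model = x.shape
    head_dim = d_model // n_heads

    # One query per head: (n_heads, seq_len, head_dim).
    q = (x @ w_q).reshape(seq_len, n_heads, head_dim).transpose(1, 0, 2)

    # One key and one value for ALL heads: (seq_len, head_dim).
    # Standard multi-head attention would have n_heads of each.
    k = x @ w_k
    v = x @ w_v

    # k.T broadcasts across the head axis: (n_heads, seq_len, seq_len).
    scores = q @ k.T / np.sqrt(head_dim)

    # Causal mask: position i may not attend to positions j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Shared values are reused by every head: (n_heads, seq_len, head_dim).
    out = weights @ v

    # Concatenate the heads back into (seq_len, d_model).
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)


rng = np.random.default_rng(0)
n_heads, head_dim, seq_len = 4, 2, 5
d_model = n_heads * head_dim

x = rng.normal(size=(seq_len, d_model))
w_q = rng.normal(size=(d_model, d_model))
w_k = rng.normal(size=(d_model, head_dim))  # one head wide, not d_model
w_v = rng.normal(size=(d_model, head_dim))  # one head wide, not d_model

out = multi_query_attention(x, w_q, w_k, w_v, n_heads)
print(out.shape)  # (5, 8)
```

The detail to look for in a real implementation is the size of the key and value projections: they are one head wide rather than d_model wide, which is what shrinks the key/value cache during generation.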
FalconLLM changed discussion status to closed