Question About MLA Usage?

by splendor1811 - opened 4 days ago

4 days ago

I’m a bit confused about how MLA is actually used. When the model is loaded with AutoModelForCausalLM, as shown in your README.md, Hugging Face automatically instantiates the original LLaMA architecture, which would mean GQA rather than MLA. So my question is: when you converted GQA to MLA, did you use the conversion code from the paper's GitHub repository?

BarraHome

Owner 3 days ago

@splendor1811 Hello. The standard transformers library doesn't support MLA, so I wrote a converter for the conversion process at https://github.com/bet0x/transmla-converter. You can also check the code from the original paper at https://github.com/fxmeng/TransMLA.
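
For anyone who wants to check which attention variant a checkpoint's config describes before loading it, here is a minimal sketch. It only reads fields from a config.json dictionary and does not touch either converter. The helper name `describe_attention` is hypothetical, and treating `kv_lora_rank` as the MLA marker is an assumption borrowed from DeepSeek-style configs; a converted checkpoint may expose a different key.

```python
import json


def describe_attention(config: dict) -> str:
    """Guess the attention variant described by a Hugging Face config dict."""
    model_type = config.get("model_type", "unknown")
    n_heads = config.get("num_attention_heads")
    # Llama-style configs fall back to plain multi-head attention
    # when num_key_value_heads is absent.
    n_kv = config.get("num_key_value_heads", n_heads)

    # Assumption: MLA checkpoints carry a low-rank KV dimension under
    # "kv_lora_rank", as DeepSeek-style configs do.
    if "kv_lora_rank" in config:
        return f"{model_type}: MLA (kv_lora_rank={config['kv_lora_rank']})"
    if n_heads is None:
        return f"{model_type}: unknown"
    if n_kv == n_heads:
        return f"{model_type}: MHA ({n_heads} heads)"
    if n_kv == 1:
        return f"{model_type}: MQA ({n_heads} query heads, 1 KV head)"
    return f"{model_type}: GQA ({n_heads} query heads, {n_kv} KV heads)"


# Usage with a local checkpoint directory (path is a placeholder):
# with open("path/to/checkpoint/config.json") as f:
#     print(describe_attention(json.load(f)))
```

If the config reports `model_type` as `llama` with fewer KV heads than query heads, AutoModelForCausalLM will build the stock GQA attention, which matches the behavior described in the question.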
