llama2 weights used?

by KnutJaegersberg - opened Dec 13, 2023

Dec 13, 2023

Wondering if you used Llama 2 weights or only the Llama model architecture.

Dec 13, 2023

Does it use grouped-query attention? That's a big deal, since it greatly reduces the memory needed for the context (the KV cache).
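
For a rough sense of scale, here is a minimal sketch of the KV cache arithmetic. The layer count, head counts, head dimension, and sequence length below are illustrative assumptions, not values confirmed for this model, and `kv_cache_bytes` is a hypothetical helper written for this example.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Approximate KV cache size in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Illustrative configuration (assumed, not taken from the model card).
NUM_LAYERS = 32
NUM_ATTENTION_HEADS = 32   # multi-head attention: one KV head per query head
NUM_KV_HEADS_GQA = 8       # grouped-query attention: 4 query heads share a KV head
HEAD_DIM = 128
SEQ_LEN = 4096

mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_ATTENTION_HEADS, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS_GQA, HEAD_DIM, SEQ_LEN)

print(f"MHA KV cache: {mha_bytes / 2**20:.0f} MiB")
print(f"GQA KV cache: {gqa_bytes / 2**20:.0f} MiB")
print(f"Reduction:    {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache shrinks in proportion to the ratio of query heads to KV heads, which is why the savings grow with context length and batch size.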

hunkim

upstage org Dec 30, 2023

We only used the Llama architecture and Mistral weights. For more details, please check out the paper at https://huggingface.co/papers/2312.15166. 😊

hunkim changed discussion status to closed Dec 30, 2023
