Could you do the same with the 3B OpenLLaMA model, grafting on some additional attention heads from Llama 7B? Something like that, idk.
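A quick feasibility note on that idea: for a direct weight copy, the per-head projection shapes of the two models would have to match. A minimal sketch, using the hidden sizes and head counts commonly reported for these models (assumed here, verify against the actual model configs), suggests they don't:

```python
# Hypothetical check: can attention heads from Llama 7B be copied
# into OpenLLaMA 3B? Config values below are assumptions taken from
# the commonly published model cards.
llama_7b = {"hidden_size": 4096, "num_attention_heads": 32}
open_llama_3b = {"hidden_size": 3200, "num_attention_heads": 32}

def head_dim(cfg):
    # Each attention head projects hidden_size down to
    # hidden_size / num_attention_heads.
    return cfg["hidden_size"] // cfg["num_attention_heads"]

print(head_dim(llama_7b))       # per-head dim of Llama 7B -> 128
print(head_dim(open_llama_3b))  # per-head dim of OpenLLaMA 3B -> 100
# Mismatched head dims (128 vs 100) mean the 7B q/k/v/o projection
# slices can't be dropped in directly; they'd need an extra learned
# projection or retraining to fit the 3B residual stream.
```

So a naive head transplant wouldn't line up shape-wise; some adapter layer or finetuning would be needed to bridge the dimensions.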