Can I use any inference engine that works with Qwen2.5 (like vLLM, Ollama) to run inference with Athene-V2-Chat?
#5
by
wangdafa
- opened
As introduced, Athene-V2-Chat is based on Qwen2.5, sharing the same architecture and chat template. For inference purposes, it should therefore be equivalent to a Qwen2.5 variant.
Athene-V2-Chat shares the same model architecture as Qwen-2.5-72B-Instruct, the base model from which Athene-V2 was further tuned. An inference engine that supports Qwen-2.5-72B-Instruct should also be compatible with Athene-V2-Chat.
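Since the chat template is shared with Qwen2.5, any engine that formats prompts in Qwen's ChatML convention (turns wrapped in `<|im_start|>role ... <|im_end|>` markers) will produce valid prompts for Athene-V2-Chat. A minimal sketch of that formatting, assuming you are building the prompt by hand rather than through the tokenizer's built-in `apply_chat_template`:

```python
def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts using the ChatML
    convention shared by Qwen2.5 and Athene-V2-Chat."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(build_chatml_prompt(messages))
```

In practice you would not build this string yourself: pointing vLLM or Ollama at the Athene-V2-Chat weights applies the bundled chat template automatically, exactly as it does for Qwen-2.5-72B-Instruct.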