---
language:
- en
- zh
tags:
- llama
- llama2
- qwen
license: gpl-3.0
---
Given the discontinuation of the Qwen model, I am provisionally licensing this model under GPL-3.0. Note that the weights and tokenizer used in this model diverge from those of the Qwen model. The inference code comes from Meta LLaMA / Hugging Face Transformers. The word "qwen" in the repository name carries no significance, and any similarity to other entities or concepts is purely coincidental.
Advance notice regarding the deletion of Qwen:
I do not know why Qwen was deleted. If this repository violates any terms set by Qwen that require its removal, please contact me. I will remove all references to Qwen and maintain the tokenizer and associated weights as an independent model, inherently distinct from Qwen, and give this model a new name.
This is the LLaMAfied version of Qwen/Qwen-VL-Chat, recalibrated to fit the original LLaMA/LLaMA-2-like model structure.
You can use LlamaForCausalLM for inference, just as with LLaMA/LLaMA-2 models; the tokenizer is a GPT2Tokenizer converted from the original tiktoken vocabulary by vonjack.
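A minimal loading sketch along these lines, using the standard Transformers classes. The repository id below is a placeholder (this card does not state its own repo name), so substitute the actual one:

```python
from transformers import AutoTokenizer, LlamaForCausalLM

# Placeholder repository id -- replace with this repo's actual name on the Hub.
MODEL_ID = "your-namespace/qwen-vl-chat-llamafied"


def load_model(model_id: str = MODEL_ID):
    """Load the LLaMAfied checkpoint with the stock LLaMA classes.

    AutoTokenizer resolves to the converted GPT2Tokenizer shipped with the repo,
    so no trust_remote_code or custom tokenizer class is needed.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(model_id)
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
    inputs = tokenizer("Hello", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output[0]))
```

Because the architecture matches LLaMA/LLaMA-2, no `trust_remote_code=True` flag is required, unlike the original Qwen checkpoints.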
The model has been edited to be white-labelled, meaning the model will no longer call itself a Qwen.
So far, the model has undergone numerical alignment of weights and preliminary reinforcement learning to align with the original model, and some errors and outdated knowledge have been corrected through model-editing methods. Aside from these edits, the model remains equivalent to the original version: it has had no dedicated supervised fine-tuning on downstream tasks or other extensive conversation datasets.
PROMPT FORMAT: chatml
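For reference, ChatML wraps each conversation turn in `<|im_start|>role` / `<|im_end|>` markers. A small sketch of building such a prompt by hand (the helper name is my own, not part of this repo):

```python
def build_chatml_prompt(messages):
    """Render (role, content) pairs as a ChatML prompt string.

    Each turn becomes:  <|im_start|>{role}\n{content}<|im_end|>
    and an open assistant turn is appended to cue the model's reply.
    """
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


prompt = build_chatml_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
])
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` directly; generation should be stopped at the `<|im_end|>` marker.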