update readme

Files changed: README.md (+2 −2), modeling_qwen.py (+5 −0)

README.md (CHANGED)

```diff
@@ -24,11 +24,11 @@ inference: false
 
 ## 介绍(Introduction)
 
-**通义千问-14B(Qwen-14B)**是阿里云研发的通义千问大模型系列的140亿参数规模的模型。Qwen-14B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-14B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-14B-Chat。本仓库为Qwen-14B-Chat的Int4
+**通义千问-14B(Qwen-14B)**是阿里云研发的通义千问大模型系列的140亿参数规模的模型。Qwen-14B是基于Transformer的大语言模型, 在超大规模的预训练数据上进行训练得到。预训练数据类型多样,覆盖广泛,包括大量网络文本、专业书籍、代码等。同时,在Qwen-14B的基础上,我们使用对齐机制打造了基于大语言模型的AI助手Qwen-14B-Chat。本仓库为Qwen-14B-Chat的Int4量化模型的仓库。
 
 如果您想了解更多关于通义千问-14B开源模型的细节,我们建议您参阅[Github代码库](https://github.com/QwenLM/Qwen)。
 
-**Qwen-14B** is the 14B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-14B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-14B, we release Qwen-14B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. This repository is the one for the Int4 quantized model of Qwen-14B-Chat.
+**Qwen-14B** is the 14B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. Qwen-14B is a Transformer-based large language model, which is pretrained on a large volume of data, including web texts, books, codes, etc. Additionally, based on the pretrained Qwen-14B, we release Qwen-14B-Chat, a large-model-based AI assistant, which is trained with alignment techniques. This repository is the one for the Int4 quantized model of Qwen-14B-Chat.
 
 For more details about the open-source model of Qwen-14B, please refer to the [Github](https://github.com/QwenLM/Qwen) code repository.
 <br>
```
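For readers unfamiliar with the term: "Int4" means the model's weights are stored as 4-bit signed integers plus floating-point scales and dequantized on the fly. The sketch below only illustrates the basic idea with symmetric round-to-nearest quantization; `quantize_int4` and `dequantize_int4` are illustrative helpers, not part of this repository, and the actual checkpoint is produced with a more sophisticated calibration-based scheme (GPTQ).

```python
# Background sketch of 4-bit integer quantization (NOT the GPTQ scheme
# used for the real Qwen-14B-Chat-Int4 checkpoint): weights are mapped
# to signed 4-bit integers in [-8, 7] with one shared float scale.

def quantize_int4(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric round-to-nearest quantization to 4-bit signed integers."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # largest weight -> +/-7
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the stored integers."""
    return [x * scale for x in q]
```

Each dequantized weight differs from the original by at most half a quantization step (`scale / 2`), which is why per-group scales matter for accuracy in real schemes.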
modeling_qwen.py (CHANGED)

```diff
@@ -861,6 +861,11 @@ class QWenLMHeadModel(QWenPreTrainedModel):
         assert (
             config.bf16 + config.fp16 + config.fp32 <= 1
         ), "Only one of \"bf16\", \"fp16\", \"fp32\" can be true"
+        logger.warn(
+            "Warning: please make sure that you are using the latest codes and checkpoints, "
+            "especially if you used Qwen-7B before 09.25.2023."
+            "请使用最新模型和代码,尤其如果你在9月25日前已经开始使用Qwen-7B,千万注意不要使用错误代码和模型。"
+        )
 
         autoset_precision = config.bf16 + config.fp16 + config.fp32 == 0
```
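The surrounding assertion relies on Python booleans summing as integers: `config.bf16 + config.fp16 + config.fp32` counts how many precision flags are set, so the check enforces that at most one is true, and a sum of zero triggers auto-selection (`autoset_precision`). A standalone sketch of that logic, where `PrecisionConfig` and `resolve_precision` are hypothetical stand-ins for the real config object, not names from modeling_qwen.py:

```python
from dataclasses import dataclass

@dataclass
class PrecisionConfig:
    # Mutually exclusive precision flags, as in the Qwen config.
    bf16: bool = False
    fp16: bool = False
    fp32: bool = False

def resolve_precision(config: PrecisionConfig) -> str:
    # Bools add as 0/1, so the sum counts how many flags are set.
    assert config.bf16 + config.fp16 + config.fp32 <= 1, (
        'Only one of "bf16", "fp16", "fp32" can be true'
    )
    if config.bf16:
        return "bf16"
    if config.fp16:
        return "fp16"
    if config.fp32:
        return "fp32"
    return "auto"  # no flag set: the autoset_precision path
```

Counting flags via a boolean sum keeps the exclusivity check to a single expression instead of enumerating all invalid pairs.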