---
license: other
language:
- zh
- en
---
# BlueLM

<p align="center">
🖥 <a href="https://github.com/vivo-ai-lab/BlueLM" target="_blank">github</a> • 📜 <a href="https://huggingface.co/vivo-ai/BlueLM-7B-Base/blob/main/MODEL_LICENSE" target="_blank">LICENSE</a> • 🎯 <a href="https://developers.vivo.com/product/ai/bluelm" target="_blank">vivo Developers</a> • 🗨 <a href="https://github.com/vivo-ai-lab/BlueLM/blob/main/resources/wechat.png" target="_blank">WeChat</a>
</p>

## 模型介绍/Introduction

BlueLM 是由 vivo AI 全球研究院自主研发的大规模预训练语言模型,本次发布包含 7B 基础模型和 7B 对话模型,同时我们开源了支持 **32K** 的长文本基础模型和对话模型。

- **更大量的优质数据**:基于高质量语料库进行训练,规模达到了 **2.6 万亿** token,该语料库包含中文、英文以及少量日韩数据。
- **更优的效果**:其中 BlueLM-7B-Chat 在 **C-Eval** 和 **CMMLU** 上均取得领先结果,对比同尺寸开源模型具有较强的竞争力。
- **长文本支持**:BlueLM-7B-Base-32K 和 BlueLM-7B-Chat-32K 均支持 **32K** 长文本,在保持基础能力相当的情况下,能够支持更长上下文的理解。
- **协议说明**:BlueLM 系列欢迎开发者进行学术研究和商业应用。

BlueLM is a large-scale open-source language model independently developed by vivo AI Lab. This release includes 2K and 32K context-length versions of both the Base and Chat models.

- **High-quality Data**: BlueLM is trained on a high-quality corpus of 2.6 trillion tokens, consisting mainly of Chinese and English data with a small amount of Japanese and Korean data.
- **Stronger Performance**: BlueLM-7B-Chat achieves strong, competitive results on the C-Eval and CMMLU benchmarks compared with open-source models of the same size.
- **Longer Context**: We have extended the context length of both BlueLM-7B-Base-32K and BlueLM-7B-Chat-32K from 2K to 32K, so the models support longer-context understanding while maintaining the same basic capabilities.
- **Model License**: BlueLM weights are open for academic research and commercial use.

本次发布的模型版本和下载链接见下表:

The release versions and Hugging Face download links are listed in the table below:

|        | Base Model | Chat Model | 4bits Quantized Chat Model |
|:------:|:----------:|:----------:|:--------------------------:|
| 7B-2K  | [BlueLM-7B-Base](https://huggingface.co/vivo-ai/BlueLM-7B-Base) | [BlueLM-7B-Chat](https://huggingface.co/vivo-ai/BlueLM-7B-Chat) | [BlueLM-7B-Chat-4bits](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-4bits) |
| 7B-32K | [BlueLM-7B-Base-32K](https://huggingface.co/vivo-ai/BlueLM-7B-Base-32K) | [BlueLM-7B-Chat-32K](https://huggingface.co/vivo-ai/BlueLM-7B-Chat-32K) | - |

## 评测结果/Benchmark Results

我们在 LongBench 评测集上对 BlueLM-7B-Chat-32K 模型进行了测试,具体结果如下表所示:

We tested BlueLM-7B-Chat-32K on the LongBench benchmark; the results are shown in the table below:

| Model | Average | Summary | Single-Doc QA | Multi-Doc QA | Code | Few-shot | Synthetic |
|:-------------------|:--------|:--------|:--------------|:-------------|:-----|:---------|:----------|
| BlueLM-7B-Chat-32K | 41.2 | 18.8 | 35.6 | 36.2 | 54.2 | 56.9 | 45.5 |

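A quick sanity check (a minimal sketch; the category names follow the table above) confirms that the Average column is the unweighted mean of the six category scores:

```python
# LongBench category scores for BlueLM-7B-Chat-32K, copied from the table above
scores = {
    "Summary": 18.8,
    "Single-Doc QA": 35.6,
    "Multi-Doc QA": 36.2,
    "Code": 54.2,
    "Few-shot": 56.9,
    "Synthetic": 45.5,
}

# Unweighted mean over the six categories
average = sum(scores.values()) / len(scores)
print(round(average, 1))  # 41.2, matching the Average column
```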
## 推理部署/Inference and Deployment

```python
>>> import torch
>>> from transformers import AutoModelForCausalLM, AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("vivo-ai/BlueLM-7B-Base-32K", trust_remote_code=True, use_fast=False)
>>> model = AutoModelForCausalLM.from_pretrained("vivo-ai/BlueLM-7B-Base-32K", device_map="cuda:0", torch_dtype=torch.bfloat16, trust_remote_code=True)
>>> model = model.eval()
>>> inputs = tokenizer("儒林外史->吴敬梓\n隋唐演义->褚人获\n红楼梦->", return_tensors="pt")
>>> inputs = inputs.to("cuda:0")
>>> pred = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.1)
>>> print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
儒林外史->吴敬梓
隋唐演义->褚人获
红楼梦->曹雪芹
三国演义->罗贯中
水浒传->施耐庵
西游记->吴承恩
聊斋志异->蒲松龄
金瓶梅->兰陵笑笑生
封神演义->许仲琳
三言二拍->冯梦龙
东周列国志->冯梦龙
```
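The prompt in the example above is a completion-style few-shot pattern: several `title->author` pairs followed by an unfinished pair for the model to complete. A small helper for building such prompts — a hypothetical convenience function for illustration, not part of the BlueLM or Transformers API:

```python
def build_few_shot_prompt(examples, query, sep="->"):
    """Build a completion-style few-shot prompt: one 'title->author' pair
    per line, ending with 'query->' for the model to complete."""
    lines = [f"{title}{sep}{author}" for title, author in examples]
    lines.append(f"{query}{sep}")
    return "\n".join(lines)

# Reproduces the prompt string used in the example above
prompt = build_few_shot_prompt(
    [("儒林外史", "吴敬梓"), ("隋唐演义", "褚人获")], "红楼梦"
)
assert prompt == "儒林外史->吴敬梓\n隋唐演义->褚人获\n红楼梦->"
```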

更多使用说明,请参考我们的 [Github 仓库](https://github.com/vivo-ai-lab/BlueLM)。

For more instructions, please refer to our [GitHub repo](https://github.com/vivo-ai-lab/BlueLM).

## 协议/License

社区使用代码依照 [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) 协议开源,且使用 BlueLM 模型权重需要遵循 [vivo_BlueLM模型许可协议](https://huggingface.co/vivo-ai/BlueLM-7B-Base/blob/main/MODEL_LICENSE)。

Our code is open-sourced under the [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) license, and use of the BlueLM model weights must comply with the [Community License for BlueLM Model](https://huggingface.co/vivo-ai/BlueLM-7B-Base/blob/main/MODEL_LICENSE).