openbmb/MiniCPM3-RAG-LoRA


Kaguya-19 committed on Sep 5, 2024

Commit

be8c7be


1 Parent(s): ef347f9

Update README.md

Files changed (1)

README.md +100 -162


@@ -1,202 +1,140 @@
 ---
 base_model: openbmb/MiniCPM3-4B
 library_name: peft
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
-### Framework versions
-- PEFT 0.12.0

 ---
 base_model: openbmb/MiniCPM3-4B
 library_name: peft
+license: apache-2.0
+language:
+- zh
+- en
 ---
+## MiniCPM3-RAG-LoRA
+**MiniCPM3-RAG-LoRA** 由面壁智能与清华大学自然语言处理实验室（THUNLP）共同开发，采用直接偏好优化（DPO）方法对 [MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B) 进行 LoRA 微调，仅基于两万余条开放域问答和逻辑推理任务的开源数据，在通用评测数据集上实现了模型性能平均提升 13%。
+欢迎关注 `MiniCPM3` 与 RAG 套件系列：
+- 生成模型：[MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B)
+- 检索模型：[RankCPM-E](https://huggingface.co/openbmb/RankCPM-E)
+- 重排模型：[RankCPM-R](https://huggingface.co/openbmb/RankCPM-R)
+- 面向 RAG 场景的 LoRA 插件：[MiniCPM3-RAG-LoRA](https://huggingface.co/openbmb/MiniCPM3-RAG-LoRA)
+**MiniCPM3-RAG-LoRA**, developed by ModelBest Inc. and THUNLP, is a LoRA adapter for [MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B) trained with Direct Preference Optimization (DPO). Trained on just over 20,000 open-source examples from open-domain question answering and logical reasoning tasks, it improves average performance on general benchmark datasets by 13%.
+We also invite you to explore MiniCPM3 and the RAG toolkit series:
+- Generation Model: [MiniCPM3](https://huggingface.co/openbmb/MiniCPM3-4B)
+- Retrieval Model: [RankCPM-E](https://huggingface.co/openbmb/RankCPM-E)
+- Re-ranking Model: [RankCPM-R](https://huggingface.co/openbmb/RankCPM-R)
+- LoRA Plugin for RAG scenarios: [MiniCPM3-RAG-LoRA](https://huggingface.co/openbmb/MiniCPM3-RAG-LoRA)
+## 模型信息 Model Information
+- 模型大小：4B
+- Model Size: 4B
+## 模型使用 Usage
+### 输入格式 Input Format
+MiniCPM3-RAG-LoRA 模型遵循格式如下：
+MiniCPM3-RAG-LoRA supports instructions in the following format:
+```
+Background: {{ passages }} Query: {{ query }}
+```
+例如：
+For example:
+```
+Background:
+["In the novel 'The Silent Watcher,' the lead character is named Alex Carter. Alex is a private detective who uncovers a series of mysterious events in a small town.",
+"Set in a quiet town, 'The Silent Watcher' follows Alex Carter, a former police officer turned private investigator, as he unravels the town's dark secrets.",
+"'The Silent Watcher' revolves around Alex Carter's journey as he confronts his past while solving complex cases in his hometown."]
+Query:
+"What is the name of the lead character in the novel 'The Silent Watcher'?"
+```
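The input layout above can be assembled programmatically. The following is a minimal sketch, not part of the README: the helper name `build_rag_messages` is hypothetical, and the string layout (list serialised with `str()`, blank line between the two parts) is copied from the README's own demo script rather than from any documented specification.

```python
from typing import Dict, List


def build_rag_messages(passages: List[str], query: str) -> List[Dict[str, str]]:
    """Assemble chat messages in the Background/Query layout shown above.

    Hypothetical helper: the layout mirrors the README demo script.
    """
    # str(passages) uses Python's list repr, exactly as the demo does;
    # it is not JSON, so quoting follows Python's rules.
    input_text = (
        "Background:\n" + str(passages) + "\n\n"
        + "Query:\n" + str(query) + "\n\n"
    )
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": input_text},
    ]


messages = build_rag_messages(
    ["Alex Carter is the lead character of 'The Silent Watcher'."],
    "Who is the lead character of 'The Silent Watcher'?",
)
print(messages[1]["content"])
```

The returned list has the same shape as the `messages` variable in the demo script, so it can be passed to `tokenizer.apply_chat_template` in the same way.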
+### 环境要求 Requirements
+```
+transformers>=4.36.0
+```
+### 示例脚本 Demo
+```python
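+from transformers import AutoModelForCausalLM, AutoTokenizer
+import torch
+# Seed the RNG so that sampling (temperature/top_p below) is repeatable on a given setup.
+torch.manual_seed(0)
+path = 'openbmb/MiniCPM3-RAG-LoRA'
+# trust_remote_code=True is passed because the repository ships custom model code.
+tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map='cuda', trust_remote_code=True)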
+passages = ["In the novel 'The Silent Watcher,' the lead character is named Alex Carter. Alex is a private detective who uncovers a series of mysterious events in a small town.",
+"Set in a quiet town, 'The Silent Watcher' follows Alex Carter, a former police officer turned private investigator, as he unravels the town's dark secrets.",
+"'The Silent Watcher' revolves around Alex Carter's journey as he confronts his past while solving complex cases in his hometown."]
+query = "What is the name of the lead character in the novel 'The Silent Watcher'?"
+input_text = 'Background:\n' + str(passages) + '\n\n' + 'Query:\n' + str(query) + '\n\n'
+messages = [
+    {"role": "system", "content": "You are a helpful assistant."},
+    {"role": "user", "content": input_text},
+]
+prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+outputs = model.chat(tokenizer, prompt, temperature=0.8, top_p=0.8)
+print(outputs[0])  # The lead character in the novel 'The Silent Watcher' is named Alex Carter.
+```
+## 实验结果 Evaluation Results
+经过针对RAG场景的LoRA训练后，MiniCPM3-RAG-LoRA在开放域问答（NQ、TQA、MARCO）、多跳问答（HotpotQA）、对话（WoW）、事实核查（FEVER）和信息填充（T-REx）等多项任务上的性能表现，超越Llama3-8B和Baichuan2-13B等业内优秀模型。
+After LoRA fine-tuning for RAG scenarios, MiniCPM3-RAG-LoRA outperforms Llama3-8B and Baichuan2-13B on most of the tasks below: open-domain question answering (NQ, TQA, MARCO), multi-hop question answering (HotpotQA), dialogue (WoW), fact checking (FEVER), and slot filling (T-REx). The one exception against these two models is TQA, where Llama3-8B scores higher (83.15 vs. 82.40); on FEVER the base MiniCPM3 remains slightly ahead of the adapter (87.22 vs. 85.81).
+|                   | NQ(Acc) | TQA(Acc) | MARCO(ROUGE) | HotpotQA(Acc) | WoW(F1) | FEVER(Acc) | T-REx(Acc) |
+| :---------------: | :-----: | :------: | :----------: | :-----------: | :-----: | :--------: | :--------: |
+|     Llama3-8B     |  45.36  |  83.15   |    20.81     |     28.52     |  10.96  |   78.08    |   26.62    |
+|   Baichuan2-13B   |  43.36  |  77.76   |    14.28     |     27.59     |  13.34  |   31.37    |   27.46    |
+|     MiniCPM3      |  43.21  |  80.77   |    16.06     |     26.00     |  14.60  |   87.22    |   26.26    |
+| MiniCPM3-RAG-LoRA |  48.36  |  82.40   |    27.68     |     31.61     |  16.29  |   85.81    |   40.76    |
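The README does not state how the 13% average improvement was computed. One reading that is consistent with the table, offered here as an assumption, is the relative gain in the unweighted mean of the seven scores between MiniCPM3 and MiniCPM3-RAG-LoRA. The sketch below works through that arithmetic with the table's own numbers; note that it averages across different metrics (accuracy, ROUGE, F1), which is a simplification.

```python
# Scores copied from the evaluation table above, in column order:
# NQ, TQA, MARCO, HotpotQA, WoW, FEVER, T-REx.
base = [43.21, 80.77, 16.06, 26.00, 14.60, 87.22, 26.26]  # MiniCPM3
lora = [48.36, 82.40, 27.68, 31.61, 16.29, 85.81, 40.76]  # MiniCPM3-RAG-LoRA

# Unweighted (macro) average over the seven benchmarks.
base_avg = sum(base) / len(base)
lora_avg = sum(lora) / len(lora)

# Relative gain of the adapter over the base model, in percent.
relative_gain = (lora_avg - base_avg) / base_avg * 100

print(f"MiniCPM3 average:          {base_avg:.2f}")
print(f"MiniCPM3-RAG-LoRA average: {lora_avg:.2f}")
print(f"Relative gain:             {relative_gain:.1f}%")
```

Under this assumption the averages are about 42.02 and 47.56, a relative gain of roughly 13%, which matches the headline figure.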
+## 许可证 License
+- 本仓库中代码依照 [Apache-2.0 协议](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE)开源。
+- RankCPM-R 模型权重的使用则需要遵循 [MiniCPM 模型协议](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md)。
+- RankCPM-R 模型权重对学术研究完全开放。如需将模型用于商业用途，请填写[此问卷](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g)。
+* The code in this repo is released under the [Apache-2.0](https://github.com/OpenBMB/MiniCPM/blob/main/LICENSE) License.
+* The usage of RankCPM-R model weights must strictly follow [MiniCPM Model License.md](https://github.com/OpenBMB/MiniCPM/blob/main/MiniCPM%20Model%20License.md).
+* The models and weights of RankCPM-R are completely free for academic research. After filling out a [questionnaire](https://modelbest.feishu.cn/share/base/form/shrcnpV5ZT9EJ6xYjh3Kx0J6v8g) for registration, RankCPM-R weights are also available for free commercial use.
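+<!-- ### Test Set Overview:
+- **Natural Questions (NQ, Accuracy):**
+  - **Overview:** Natural Questions is an open-domain question answering dataset built from questions that real users issued to Google Search. Each question comes with a long document as context and is annotated with a short answer and a long answer.
+  - **Metric:** Accuracy measures whether the model correctly identifies the short answer relevant to the question.
+- **TriviaQA (TQA, Accuracy):**
+  - **Overview:** TriviaQA is a question answering dataset covering a wide range of topics, with questions and answers collected from trivia websites and encyclopedias.
+  - **Metric:** Accuracy measures whether the model answers these questions correctly.
+- **MS MARCO (ROUGE):**
+  - **Overview:** MS MARCO is a large-scale open-domain question answering dataset consisting mainly of Bing search engine user queries and their corresponding answers. It contains short answers and related passages and is widely used for information retrieval and generation tasks. Because MS MARCO is very large, we sampled 3,000 examples from it for this evaluation.
+  - **Metric:** ROUGE measures the overlap between the generated answer and the reference answer, as an indicator of answer quality.
+- **HotpotQA (Accuracy):**
+  - **Overview:** HotpotQA is a multi-hop question answering dataset that requires the model to reason across multiple documents to answer complex questions. It tests not only answer generation but also the interpretability of the reasoning process.
+  - **Metric:** Accuracy measures whether the model correctly answers questions that require multi-hop reasoning.
+- **Wizard of Wikipedia (WoW, F1 Score):**
+  - **Overview:** Wizard of Wikipedia is a dialogue dataset focused on knowledge-grounded conversation. It requires the model to produce informative, on-topic responses, and each dialogue turn is backed by corresponding knowledge base entries.
+  - **Metric:** F1 measures the word-level overlap between the generated response and the reference response, reflecting both accuracy and coverage.
+- **FEVER (Accuracy):**
+  - **Overview:** FEVER is a fact-checking dataset containing a large number of claims. The model must judge, based on the given documents, whether each claim is true or false. The dataset is designed to test fact-checking ability.
+  - **Metric:** Accuracy measures how well the model judges the truthfulness of the claims.
+- **T-REx (Accuracy):**
+  - **Overview:** T-REx is a knowledge base slot-filling dataset containing entity-relation pairs extracted from Wikipedia. The model must fill in the missing slot value from the context, which tests its understanding of knowledge base relations.
+  - **Metric:** Accuracy measures how well the model fills in the missing slot values. -->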