mingfengxue commited on
Commit
76dfd8c
·
1 Parent(s): 43c2ea7

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +32 -0
README.md CHANGED
@@ -1,3 +1,35 @@
1
  ---
2
  license: apache-2.0
 
 
3
  ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - OFA-Sys/OccuQuest
5
  ---
6
+
7
+ This is the ProLLaMA-7B model in [OccuQuest: Mitigating Occupational Bias for Inclusive Large Language Models](https://arxiv.org/abs/2310.16517).
8
+
9
+ The dataset is on [OccuQuest](https://huggingface.co/datasets/OFA-Sys/OccuQuest).
10
+
11
+ Abstract:
12
+ The emergence of large language models (LLMs) has revolutionized natural language processing tasks.
13
+ However, existing instruction-tuning datasets suffer from occupational bias: the majority of data relates to only a few occupations, which hampers the instruction-tuned LLMs to generate helpful responses to professional queries from practitioners in specific fields.
14
+ To mitigate this issue and promote occupation-inclusive LLMs, we create an instruction-tuning dataset named OccuQuest, which contains 110,000+ prompt-completion pairs and 30,000+ dialogues covering over 1,000 occupations in 26 occupational categories.
15
+ We systematically request ChatGPT, organizing queries hierarchically based on Occupation, Responsibility, Topic, and Question, to ensure a comprehensive coverage of occupational specialty inquiries.
16
+ By comparing with three commonly used datasets (Dolly, ShareGPT, and WizardLM), we observe that OccuQuest exhibits a more balanced distribution across occupations.
17
+ Furthermore, we assemble three test sets for comprehensive evaluation, an occu-test set covering 25 occupational categories, an estate set focusing on real estate, and an occu-quora set containing real-world questions from Quora.
18
+ We then fine-tune LLaMA on OccuQuest to obtain OccuLLaMA, which significantly outperforms state-of-the-art LLaMA variants (Vicuna, Tulu, and WizardLM) on professional questions in GPT-4 and human evaluations.
19
+ Notably, on the occu-quora set, OccuLLaMA reaches a high win rate of 86.4\% against WizardLM.
20
+ Furthermore, we demonstrate the potential of combining OccuQuest with other instruction-tuning datasets to enhance the overall performance of LLMs.
21
+ By fine-tuning LLaMA on a mixture of OccuQuest and Tulu datasets, we introduce ProLLaMA, which excels in addressing occupational questions and exhibits superior performance in comprehensive evaluations such as MMLU, GSM8K, BBH, and HumanEval.
22
+ Among the different LLaMA variants, the 7B and 13B ProLLaMA models achieve the highest performance on MMLU and GSM8K, with the 7B ProLLaMA model demonstrating an improvement of more than 4 points over the other 7B variants on GSM8K.
23
+ We open release the dataset and models.
24
+
25
+ Please cite if you use this model:
26
+ ```
27
+ @misc{xue2023occuquest,
28
+ title={OccuQuest: Mitigating Occupational Bias for Inclusive Large Language Models},
29
+ author={Mingfeng Xue and Dayiheng Liu and Kexin Yang and Guanting Dong and Wenqiang Lei and Zheng Yuan and Chang Zhou and Jingren Zhou},
30
+ year={2023},
31
+ eprint={2310.16517},
32
+ archivePrefix={arXiv},
33
+ primaryClass={cs.CL}
34
+ }
35
+ ```