lidiya committed on
Commit
387b4e5
1 Parent(s): e9ca184

Create README.md

Files changed (1)
  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ language: en
+ widget:
+ - text: Robert Boyle \\n In the late 17th century, Robert Boyle proved that air is necessary for combustion.
+ ---
+
+ # MixQG (base-sized model)
+
+ MixQG is a new question generation model pre-trained on a collection of QA datasets with a mix of answer types. It was introduced in the paper [MixQG: Neural Question Generation with Mixed Answer Types](https://arxiv.org/abs/2110.08175), and the associated code is released in the [QGen](https://github.com/salesforce/QGen) repository.
+
+ ### How to use
+ Using the Hugging Face pipeline abstraction:
+ ```
+ from transformers import pipeline
+
+ nlp = pipeline("text2text-generation", model='Salesforce/mixqg-base', tokenizer='Salesforce/mixqg-base')
+
+ CONTEXT = "In the late 17th century, Robert Boyle proved that air is necessary for combustion."
+ ANSWER = "Robert Boyle"
+
+ def format_inputs(context: str, answer: str):
+     return f"{answer} \\n {context}"
+
+ text = format_inputs(CONTEXT, ANSWER)
+
+ nlp(text)
+ # should output [{'generated_text': 'Who proved that air is necessary for combustion?'}]
+ ```
+
+ Using the pre-trained model directly:
+ ```
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained('Salesforce/mixqg-base')
+ model = AutoModelForSeq2SeqLM.from_pretrained('Salesforce/mixqg-base')
+
+ CONTEXT = "In the late 17th century, Robert Boyle proved that air is necessary for combustion."
+ ANSWER = "Robert Boyle"
+
+ def format_inputs(context: str, answer: str):
+     return f"{answer} \\n {context}"
+
+ text = format_inputs(CONTEXT, ANSWER)
+
+ input_ids = tokenizer(text, return_tensors="pt").input_ids
+ generated_ids = model.generate(input_ids, max_length=32, num_beams=4)
+ # batch_decode returns a list with one string per generated sequence
+ output = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
+ print(output)
+ # should print ['Who proved that air is necessary for combustion?']
+ ```
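The two snippets above handle a single answer/context pair. As a minimal sketch of how the same input format might be applied to several pairs at once, the helpers below build a batch of input strings and pass them through the model in one call. The names `build_batch` and `generate_questions` are hypothetical and not part of the README or the QGen repository, and the use of padding and truncation is an assumption for batching rather than a documented recommendation.

```python
from typing import List, Tuple


def format_inputs(context: str, answer: str) -> str:
    # Same format as in the README: answer, a literal backslash-n separator, then context.
    return f"{answer} \\n {context}"


def build_batch(pairs: List[Tuple[str, str]]) -> List[str]:
    """Format (context, answer) pairs into MixQG input strings (hypothetical helper)."""
    return [format_inputs(context, answer) for context, answer in pairs]


def generate_questions(
    pairs: List[Tuple[str, str]],
    model_name: str = "Salesforce/mixqg-base",
    max_length: int = 32,
    num_beams: int = 4,
) -> List[str]:
    """Generate one question per (context, answer) pair (hypothetical helper)."""
    # Imported lazily so the formatting helpers work without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Assumption: pad shorter inputs so the batch forms one tensor; the
    # attention mask lets generate() ignore the padded positions.
    encoded = tokenizer(
        build_batch(pairs), return_tensors="pt", padding=True, truncation=True
    )
    generated_ids = model.generate(
        input_ids=encoded.input_ids,
        attention_mask=encoded.attention_mask,
        max_length=max_length,
        num_beams=num_beams,
    )
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
```

Calling `generate_questions([(CONTEXT, "Robert Boyle"), (CONTEXT, "air")])` with the `CONTEXT` string from the snippets above should return a list with one generated question per pair, in the same order as the input.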
+
+ ### Citation
+ ```
+ @misc{murakhovska2021mixqg,
+   title={MixQG: Neural Question Generation with Mixed Answer Types},
+   author={Lidiya Murakhovs'ka and Chien-Sheng Wu and Tong Niu and Wenhao Liu and Caiming Xiong},
+   year={2021},
+   eprint={2110.08175},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```