---
language: en
tags:
- t5
datasets:
- squad
license: mit
---

# Question Generation Model

## Fine-tuning Dataset

SQuAD 1.1

## Demo

https://huggingface.co/Sehong/t5-large-QuestionGeneration

## How to use

```python
import torch
from transformers import PreTrainedTokenizerFast, T5ForConditionalGeneration

# Load the tokenizer and model from the question-generation checkpoint.
tokenizer = PreTrainedTokenizerFast.from_pretrained('Sehong/t5-large-QuestionGeneration')
model = T5ForConditionalGeneration.from_pretrained('Sehong/t5-large-QuestionGeneration')
text = "Saint Bern ##ade ##tte So ##ubi ##rous [SEP] Architectural ##ly , the school has a Catholic character . At ##op the Main Building ' s gold dome is a golden statue of the Virgin Mary . Immediately in front of the Main Building and facing it , is a copper statue of Christ with arms up ##rai ##sed with the legend \" V ##eni ##te Ad Me O ##m ##nes \" . Next to the Main Building is the Basilica of the Sacred Heart . Immediately behind the b ##asi ##lica is the G ##rot ##to , a Marian place of prayer and reflection . It is a replica of the g ##rot ##to at Lou ##rdes , France where the Virgin Mary reputed ##ly appeared to Saint Bern ##ade ##tte So ##ubi ##rous in 1858 . At the end of the main drive ( and in a direct line that connects through 3 statues and the Gold Dome ) , is a simple , modern stone statue of Mary ."

# The input is "answer [SEP] context", already split into '##' word pieces.
raw_input_ids = tokenizer.encode(text)
# Wrap the ids with the tokenizer's BOS/EOS ids.
input_ids = [tokenizer.bos_token_id] + raw_input_ids + [tokenizer.eos_token_id]

summary_ids = model.generate(torch.tensor([input_ids]))

decode = tokenizer.decode(summary_ids.squeeze().tolist(), skip_special_tokens=True)

# Strip the word-piece markers and collapse doubled spaces.
decode = decode.replace(' # # ', '').replace('  ', ' ').replace(' ##', '')

print(decode)
```
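
Note that the hard-coded `text` above follows an `answer [SEP] context` format, with both parts pre-split into `##`-style word pieces. Below is a minimal sketch of one way to build such an input from a plain answer/context pair; it assumes the checkpoint's tokenizer emits the same `##` pieces, and the `build_input` helper is illustrative, not part of this repo.

```python
def build_input(answer: str, context: str, tokenizer) -> str:
    # Split both parts into word pieces and join them with [SEP],
    # mirroring the pre-tokenized format of the example above.
    answer_pieces = tokenizer.tokenize(answer)
    context_pieces = tokenizer.tokenize(context)
    return ' '.join(answer_pieces) + ' [SEP] ' + ' '.join(context_pieces)

answer = "Saint Bernadette Soubirous"
context = "The Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858."
text = build_input(answer, context, tokenizer)
```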