sefinch committed on
Commit
7584814
1 Parent(s): acd5b1c

Update README.md

Files changed (1)
  1. README.md +107 -6
README.md CHANGED
@@ -10,7 +10,11 @@ tags:

  # Model Card for ConvoSenseGenerator

- ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common social commonsense types such as emotional reactions, motivations, causes, subsequent events, and more! It is trained on the large-scale dataset, ConvoSense, that is collected synthetically using ChatGPT3.5. ConvoSenseGenerator produces inferences that humans judge to achieve high reasonability, high rates of novel information for the corresponding dialogue contexts, and high degree of detail, outperforming models trained on previous datasets that are human-written.

  ## Model Description
  - **Repository:** [Code](https://github.com/emorynlp/convosense)
@@ -19,20 +23,117 @@ ConvoSenseGenerator is a generative model that produces commonsense inferences f

  ## Model Training

- ConvoSenseGenerator is trained on our recent dataset: 🥤[ConvoSense](https://huggingface.co/datasets/allenai/soda).
  The backbone model of ConvoSenseGenerator is [T5-3b](https://huggingface.co/t5-3b).

  ### How to use

- Below is a simple code snippet to get ConvoSenseGenerator running :)

  ```python
  import torch
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
- model = AutoModelForSeq2SeqLM.from_pretrained("sefinch/ConvoSenseGenerator").to(device)
- ...
  ```

  ### Citation
 

  # Model Card for ConvoSenseGenerator

+ ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common social commonsense types such as emotional reactions, motivations, causes, subsequent events, and more!
+
+ It is trained on the large-scale dataset, ConvoSense, that is collected synthetically using ChatGPT3.5.
+
+ ConvoSenseGenerator produces inferences that humans judge to achieve high reasonability, high rates of novel information for the corresponding dialogue contexts, and high degree of detail, outperforming models trained on previous datasets that are human-written.

  ## Model Description
  - **Repository:** [Code](https://github.com/emorynlp/convosense)


  ## Model Training

+ ConvoSenseGenerator is trained on our recent dataset: ConvoSense.
  The backbone model of ConvoSenseGenerator is [T5-3b](https://huggingface.co/t5-3b).

  ### How to use

+ ConvoSenseGenerator covers the following commonsense types, using the provided questions:
+
+ ```python
+ commonsense_questions = {
+     "cause": 'What could have caused the last thing said to happen?',
+     "prerequisities": 'What prerequisites are required for the last thing said to occur?',
+     "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
+     "subsequent": 'What might happen after what Speaker just said?',
+     "desire": 'What does Speaker want to do next?',
+     "desire_o": 'What will Listener want to do next based on what Speaker just said?',
+     "react": 'How is Speaker feeling after what they just said?',
+     "react_o": 'How does Listener feel because of what Speaker just said?',
+     "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
+     "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
+ }
+ ```
+
+ The best-performing configuration of ConvoSenseGenerator according to the experiments in the paper uses the following generation hyperparameters:
+
+ ```python
+ generation_config = {
+     "repetition_penalty": 1.0,
+     "num_beams": 10,
+     "num_beam_groups": 10,
+     "diversity_penalty": 0.5
+ }
+ ```
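
The hyperparameters above describe a diverse (group) beam search. As a minimal sketch, and not part of the original card, the helper below merges them with per-call options and checks two constraints that `transformers` enforces for this search mode: `num_beams` must be divisible by `num_beam_groups`, and `num_return_sequences` cannot exceed `num_beams`. The name `build_generate_kwargs` is hypothetical; the default values for `num_return_sequences` and `max_new_tokens` are taken from the usage snippet in this card.

```python
# Generation settings copied from the card.
generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5,
}


def build_generate_kwargs(config, num_return_sequences=5, max_new_tokens=400):
    """Hypothetical helper: validate the beam settings and add per-call options."""
    if config["num_beams"] % config["num_beam_groups"] != 0:
        raise ValueError("num_beams must be divisible by num_beam_groups")
    if num_return_sequences > config["num_beams"]:
        raise ValueError("num_return_sequences cannot exceed num_beams")
    # Return a new dict so the shared config is never mutated.
    return {
        **config,
        "num_return_sequences": num_return_sequences,
        "max_new_tokens": max_new_tokens,
    }


generate_kwargs = build_generate_kwargs(generation_config)
print(generate_kwargs)
```

With a loaded model and tokenized input, the result can be unpacked into the call, for example `model.generate(inputs["input_ids"], **generate_kwargs)`.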
+
+ Below is a simple code snippet to get ConvoSenseGenerator running:

  ```python
  import torch
+ from transformers import AutoTokenizer, T5ForConditionalGeneration
+
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
+ model = T5ForConditionalGeneration.from_pretrained("sefinch/ConvoSenseGenerator").to(device)
+
+ # ConvoSenseGenerator covers these commonsense types, using the provided questions
+ commonsense_questions = {
+     "cause": 'What could have caused the last thing said to happen?',
+     "prerequisities": 'What prerequisites are required for the last thing said to occur?',
+     "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
+     "subsequent": 'What might happen after what Speaker just said?',
+     "desire": 'What does Speaker want to do next?',
+     "desire_o": 'What will Listener want to do next based on what Speaker just said?',
+     "react": 'How is Speaker feeling after what they just said?',
+     "react_o": 'How does Listener feel because of what Speaker just said?',
+     "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
+     "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
+ }
+
+ def format_input(conversation_history, commonsense_type):
+
+     # prefix last turn with Speaker, and alternately prefix each previous turn with either Listener or Speaker
+     prefixed_turns = list(
+         reversed(
+             [
+                 f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
+                 for i, u in enumerate(reversed(conversation_history))
+             ]
+         )
+     )
+
+     # model expects a maximum of 7 total conversation turns to be given
+     truncated_turns = prefixed_turns[-7:]
+
+     # conversation representation separates the turns with newlines
+     conversation_string = '\n'.join(truncated_turns)
+
+     # format the full input including the commonsense question
+     input_text = f"provide a reasonable answer to the question based on the dialogue:\n{conversation_string}\n\n[Question] {commonsense_questions[commonsense_type]}\n[Answer]"
+
+     return input_text
+
+ def generate(conversation_history, commonsense_type):
+     # convert the input into the expected format to run the model
+     input_text = format_input(conversation_history, commonsense_type)
+
+     # tokenize the input_text
+     inputs = tokenizer([input_text], return_tensors="pt").to(device)
+
+     # get multiple model generations using the best-performing generation configuration (based on experiments detailed in paper)
+     outputs = model.generate(
+         inputs["input_ids"],
+         repetition_penalty=1.0,
+         num_beams=10,
+         num_beam_groups=10,
+         diversity_penalty=0.5,
+         num_return_sequences=5,
+         max_new_tokens=400
+     )
+
+     # decode the generated inferences
+     inferences = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)
+
+     return inferences
+
+ conversation = [
+     "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
+     "Well, you could offer to do everything for taking care of it. Have you tried that?",
+     "But I don't want to have to take the dog out for walks when it is the winter!"
+ ]
+
+ inferences = generate(conversation, "cause")
+ print('\n'.join(inferences))
  ```
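
To inspect the prompt format without downloading the 3B-parameter model, the labeling and truncation steps can be run on their own. The following is a minimal sketch that mirrors the logic of `format_input` in the snippet above; the helper names `label_turns` and `build_prompt` are hypothetical and not part of the card, and only the `cause` question is included to keep it short.

```python
# Question text copied from the card's "cause" entry.
CAUSE_QUESTION = "What could have caused the last thing said to happen?"


def label_turns(conversation_history, max_turns=7):
    """Hypothetical helper: label turns so the final one is always Speaker."""
    labeled = [
        f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
        for i, u in enumerate(reversed(conversation_history))
    ]
    # Restore chronological order, then keep only the most recent turns.
    return list(reversed(labeled))[-max_turns:]


def build_prompt(conversation_history, question):
    """Hypothetical helper: assemble the prompt string used in the card."""
    turns = "\n".join(label_turns(conversation_history))
    return (
        "provide a reasonable answer to the question based on the dialogue:\n"
        f"{turns}\n\n[Question] {question}\n[Answer]"
    )


conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!",
]

prompt = build_prompt(conversation, CAUSE_QUESTION)
print(prompt)
```

For the three-turn example, the turns are labeled Speaker, Listener, Speaker in chronological order. Because labels are assigned backwards from the last turn, the most recent utterance is always the Speaker regardless of how many turns are dropped by truncation.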

  ### Citation