chujiezheng committed on
Commit ccbcbcd
1 Parent(s): c153174

Update README.md

Files changed (1): README.md +35 -1
README.md CHANGED
@@ -1,4 +1,38 @@
- [blenderbot-400M-distill](https://huggingface.co/facebook/blenderbot-400M-distill) fine-tuned on the [ESConv dataset](https://github.com/thu-coai/Emotional-Support-Conversation). Please kindly cite the [original paper](https://aclanthology.org/2021.acl-long.269/) if you use this model:
+ [blenderbot-400M-distill](https://huggingface.co/facebook/blenderbot-400M-distill) fine-tuned on the [ESConv dataset](https://github.com/thu-coai/Emotional-Support-Conversation). Usage example:
+
+ ```python
+ import torch
+ from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration
+
+ def _norm(x):
+     return ' '.join(x.strip().split())
+
+ tokenizer = BlenderbotTokenizer.from_pretrained('thu-coai/blenderbot-400M-esconv')
+ model = BlenderbotForConditionalGeneration.from_pretrained('thu-coai/blenderbot-400M-esconv')
+ model.eval()
+
+ utterances = [
+     "I am having a lot of anxiety about quitting my current job. It is too stressful but pays well",
+     "What makes your job stressful for you?",
+     "I have to deal with many people in hard financial situations and it is upsetting",
+     "Do you help your clients to make it to a better financial situation?",
+     "I do, but often they are not going to get back to what they want. Many people are going to lose their home when safeguards are lifted",
+ ]
+ # prefix each utterance with a space, so consecutive utterances are separated by two spaces
+ input_sequence = ' '.join([' ' + e for e in utterances]) + tokenizer.eos_token
+ # keep only the most recent 128 tokens of dialogue context
+ input_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(input_sequence))[-128:]
+ input_ids = torch.LongTensor([input_ids])
+
+ model_output = model.generate(input_ids, num_beams=1, do_sample=True, top_p=0.9, num_return_sequences=5)
+ generation = tokenizer.batch_decode(model_output, skip_special_tokens=True)
+ generation = [_norm(e) for e in generation]
+ print(generation)
+
+ utterances.append(generation[0])  # append the reply so the next turn reuses the same context-building step
+ ```
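The two-space utterance separation and the 128-token context truncation are the parts of the snippet most easily gotten wrong when adapting it. A minimal sketch of that context-building step, factored into hypothetical helper functions (`build_context`, `truncate_ids` are not part of the model's API) so it runs without downloading the model:

```python
def build_context(utterances, eos_token='</s>'):
    # Prefix each utterance with a space; joining on a single space then
    # separates consecutive utterances by exactly two spaces, matching
    # the convention in the README snippet.
    return ' '.join(' ' + u for u in utterances) + eos_token

def truncate_ids(token_ids, max_len=128):
    # Keep only the most recent max_len tokens of dialogue context,
    # mirroring the [-128:] slice in the snippet.
    return token_ids[-max_len:]

context = build_context(["hello", "hi, how are you?"])
print(repr(context))  # -> ' hello  hi, how are you?</s>'
print(len(truncate_ids(list(range(200)))))  # -> 128
```

In actual use, `eos_token` would be `tokenizer.eos_token` and the ids would come from `tokenizer.tokenize` / `tokenizer.convert_tokens_to_ids` as above; appending the chosen reply back onto `utterances` makes the next turn reuse the same two functions.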
+
+ Please kindly cite the [original paper](https://aclanthology.org/2021.acl-long.269/) if you use this model:

  ```bib
  @inproceedings{liu-etal-2021-towards,