andrejmiscic committed on
Commit baf0691 · 1 Parent(s): 7a82ad5

Create README.md

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ language:
+ - en
+ tags:
+ - simcls
+ datasets:
+ - xsum
+ ---
+
+ # SimCLS
+ SimCLS is a framework for abstractive summarization presented in [SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization](https://arxiv.org/abs/2106.01890).
+ It is a two-stage approach consisting of a *generator* and a *scorer*. In the first stage, a large pre-trained abstractive summarization model (the *generator*) generates candidate summaries. In the second stage, the *scorer* assigns a score to each candidate given the source document. The final summary is the highest-scoring candidate.
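The selection step of the two-stage approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the repository's implementation: `select_summary` and `toy_overlap_score` are hypothetical names, the candidates are hard-coded instead of coming from a generator, and the word-overlap score merely stands in for the trained scorer.

```python
from typing import Callable, List


def select_summary(document: str,
                   candidates: List[str],
                   score_fn: Callable[[str, str], float]) -> str:
    """Return the candidate that score_fn rates highest for the document."""
    if not candidates:
        raise ValueError("at least one candidate summary is required")
    return max(candidates, key=lambda cand: score_fn(document, cand))


def toy_overlap_score(document: str, candidate: str) -> float:
    """Hypothetical stand-in scorer: share of candidate words found in the document."""
    doc_words = set(document.lower().split())
    cand_words = candidate.lower().split()
    if not cand_words:
        return 0.0
    return sum(w in doc_words for w in cand_words) / len(cand_words)


# Stage 1 (assumed done elsewhere): the generator proposed these candidates.
document = "the council approved the new budget after a long debate"
candidates = [
    "council approved budget",
    "mayor resigns amid scandal",
    "the council rejected a budget",
]

# Stage 2: score every candidate against the document and keep the best one.
best = select_summary(document, candidates, toy_overlap_score)
print(best)
```

Passing the scoring function as an argument keeps the selection logic independent of whichever model produces the scores.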
+
+ This model is the *scorer* trained for summarization of XSum ([paper](https://arxiv.org/abs/1808.08745), [datasets](https://huggingface.co/datasets/xsum)). It should be used in conjunction with [google/pegasus-xsum](https://huggingface.co/google/pegasus-xsum). See [our GitHub repository](https://github.com/andrejmiscic/simcls-pytorch) for details on training, evaluation, and usage.
+
+ ## Usage
+
+ ```bash
+ git clone https://github.com/andrejmiscic/simcls-pytorch.git
+ cd simcls-pytorch
+ pip3 install torch torchvision torchaudio transformers sentencepiece
+ ```
+
+ ```python
+ # Run from the root of the cloned simcls-pytorch repository so that `src` is importable.
+ from src.model import SimCLS, GeneratorType
+
+ summarizer = SimCLS(generator_type=GeneratorType.Pegasus,
+                     generator_path="google/pegasus-xsum",
+                     scorer_path="andrejmiscic/simcls-scorer-xsum")
+
+ article = "This is a news article."
+ summary = summarizer(article)
+ print(summary)
+ ```
+
+ ### Results
+
+ All of our results are reported with 95% confidence intervals computed from 10,000 bootstrap iterations. See the [SimCLS paper](https://arxiv.org/abs/2106.01890) for a description of the baselines.
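A bootstrap confidence interval of this kind can be sketched as follows. The sketch assumes a percentile bootstrap that resamples per-example scores with replacement; the README does not say which bootstrap variant was used, and `bootstrap_ci` and the synthetic scores are hypothetical, for illustration only.

```python
import random
from typing import List, Tuple


def bootstrap_ci(scores: List[float],
                 iterations: int = 10_000,
                 confidence: float = 0.95,
                 seed: int = 0) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using the percentile bootstrap."""
    if not scores:
        raise ValueError("scores must not be empty")
    rng = random.Random(seed)
    n = len(scores)
    means = []
    for _ in range(iterations):
        # Resample n per-example scores with replacement and record their mean.
        resample = rng.choices(scores, k=n)
        means.append(sum(resample) / n)
    means.sort()
    alpha = 1.0 - confidence
    # Read the interval bounds off the sorted bootstrap means.
    lower = means[int((alpha / 2) * iterations)]
    upper = means[min(iterations - 1, int((1 - alpha / 2) * iterations))]
    return sum(scores) / n, lower, upper


# Synthetic per-example scores; real values would come from a ROUGE evaluation.
score_rng = random.Random(1)
example_scores = [score_rng.uniform(0.2, 0.7) for _ in range(200)]
mean, lo, hi = bootstrap_ci(example_scores)
print(f"{100 * mean:.2f}, [{100 * lo:.2f}, {100 * hi:.2f}]")
```

The printed line follows the `mean, [lower, upper]` layout used in the "Our results" rows of the table.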
+
+ | System | Rouge-1 | Rouge-2 | Rouge-L |
+ |------------------|----------------------:|----------------------:|----------------------:|
+ | Pegasus | 47.21 | 24.56 | 39.25 |
+ | **SimCLS paper** | --- | --- | --- |
+ | Origin | 47.10 | 24.53 | 39.23 |
+ | Min | 40.97 | 19.18 | 33.68 |
+ | Max | 52.45 | 28.28 | 43.36 |
+ | Random | 46.72 | 23.64 | 38.55 |
+ | **SimCLS** | 47.61 | 24.57 | 39.44 |
+ | **Our results** | --- | --- | --- |
+ | Origin | 47.16, [46.85, 47.48] | 24.59, [24.25, 24.92] | 39.30, [38.96, 39.62] |
+ | Min | 41.06, [40.76, 41.34] | 18.30, [18.03, 18.56] | 32.70, [32.42, 32.97] |
+ | Max | 51.83, [51.53, 52.14] | 28.92, [28.57, 29.26] | 44.02, [43.69, 44.36] |
+ | Random | 46.47, [46.17, 46.78] | 23.45, [23.13, 23.77] | 38.28, [37.96, 38.60] |
+ | **SimCLS** | 47.17, [46.87, 47.46] | 23.90, [23.59, 24.23] | 38.96, [38.64, 39.29] |
+
+ ### Citation of the original work
+
+ ```bibtex
+ @inproceedings{liu-liu-2021-simcls,
+     title = "{S}im{CLS}: A Simple Framework for Contrastive Learning of Abstractive Summarization",
+     author = "Liu, Yixin and
+       Liu, Pengfei",
+     booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
+     month = aug,
+     year = "2021",
+     address = "Online",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2021.acl-short.135",
+     doi = "10.18653/v1/2021.acl-short.135",
+     pages = "1065--1072",
+ }
+ ```