---
license: mit
language:
- en
---

# STRONG Model Card

## Model Information

### Description

STRONG is a fine-tuned LED-based model that produces structure-controllable summaries of long legal opinions from CanLII.

You can also find the fine-tuned model without structure information [here](https://huggingface.co/yznlp/STRONG-LED-NoStructure).

### Usage

Below we share some code snippets to help you get started quickly with the model. First make sure to `pip install -U transformers`, then copy the snippet from the section relevant to your use case.

The input is composed of two parts:
1. Summary structure prompt: a series of IRC structure labels concatenated with " | " as a separator (available labels: Non_IRC, Issue, Reason, Conclusion).
2. After the special token " ==> ", the text of the legal opinion.
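
The two-part input format above can be assembled with a small helper. The function name `build_input` is our own illustration, not part of the model's API; the label names and the " | " / " ==> " separators come from the usage notes above.

```python
def build_input(structure_labels, opinion_text):
    """Join IRC structure labels with ' | ' and append the legal opinion
    text after the ' ==> ' special token, as expected by STRONG."""
    return " | ".join(structure_labels) + " ==> " + opinion_text

prompt = build_input(["Non_IRC", "Issue", "Conclusion"], "{Legal Case Content}")
print(prompt)
# Non_IRC | Issue | Conclusion ==> {Legal Case Content}
```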

#### Running the model on a CPU


```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
# LED is an encoder-decoder model, so load it with AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("yznlp/STRONG-LED")

input_text = "Non_IRC | Issue | Conclusion ==> {Legal Case Content}"
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(**inputs, max_length=256, num_beams=4, length_penalty=2.0)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```


#### Running the model on a single or multiple GPUs


```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")
# LED is an encoder-decoder model, so load it with AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("yznlp/STRONG-LED", device_map="auto")

input_text = "Non_IRC | Issue | Conclusion ==> {Legal Case Content}"
# Move the tokenized inputs to the device the model was dispatched to
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_length=256, num_beams=4, length_penalty=2.0)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Paper Citation
If you find our model useful, please cite:
```
@inproceedings{zhong-litman-2023-strong,
    title = "{STRONG} {--} Structure Controllable Legal Opinion Summary Generation",
    author = "Zhong, Yang  and
      Litman, Diane",
    editor = "Park, Jong C.  and
      Arase, Yuki  and
      Hu, Baotian  and
      Lu, Wei  and
      Wijaya, Derry  and
      Purwarianti, Ayu  and
      Krisnadhi, Adila Alfa",
    booktitle = "Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)",
    month = nov,
    year = "2023",
    address = "Nusa Dua, Bali",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-ijcnlp.37",
    pages = "431--448",
}
```