---
language: en
tags:
- longformer
- longformer-scico
license: apache-2.0
datasets:
- allenai/scico
---

# Longformer for SciCo

This model is the `unified` model discussed in the paper [SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts (AKBC 2021)](https://openreview.net/forum?id=OFLbgUP04nC), which formulates the task of hierarchical cross-document coreference resolution (H-CDCR) as a multiclass problem. The model takes as input two mentions `m1` and `m2` with their corresponding context and outputs four scores:

* 0: not related
* 1: `m1` and `m2` corefer
* 2: `m1` is a parent of `m2`
* 3: `m1` is a child of `m2`

We provide the following code as an example of how to set the global attention on the special tokens `<s>`, `<m>`, and `</m>`:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

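# A minimal sketch of the setup between the imports and the forward pass below.
# The checkpoint id, the example mention strings, and the " </s></s> " pair
# separator are illustrative assumptions, not prescribed by this card.
tokenizer = AutoTokenizer.from_pretrained('allenai/longformer-scico')
model = AutoModelForSequenceClassification.from_pretrained('allenai/longformer-scico')

start_token = tokenizer.convert_tokens_to_ids("<m>")
end_token = tokenizer.convert_tokens_to_ids("</m>")

def get_global_attention(input_ids):
    # Global attention on <s> (position 0) and on every <m> / </m> marker.
    global_attention_mask = torch.zeros(input_ids.shape)
    global_attention_mask[:, 0] = 1
    is_marker = (input_ids == start_token) | (input_ids == end_token)
    global_attention_mask[is_marker] = 1
    return global_attention_mask

# Hypothetical inputs: each mention is wrapped in <m> ... </m> inside its context.
m1 = "We show that <m> our approach </m> improves over strong baselines."
m2 = "Recent work on <m> machine learning approaches </m> spans many tasks."

tokens = tokenizer(m1 + " </s></s> " + m2, return_tensors='pt')
global_attention_mask = get_global_attention(tokens['input_ids'])
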
with torch.no_grad():
    output = model(tokens['input_ids'], tokens['attention_mask'], global_attention_mask)

scores = torch.softmax(output.logits, dim=-1)
# tensor([[0.0818, 0.0023, 0.0019, 0.9139]]) -- m1 is a child of m2
```
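
The index of the highest score maps back to the four relations listed above. As a small usage sketch (the `id2label` dict here is our own convenience mapping, not read from the model config):

```python
# Hypothetical mapping; indices follow the 0-3 scheme listed above.
id2label = {0: "not related", 1: "corefer", 2: "m1 is a parent of m2", 3: "m1 is a child of m2"}
pred = scores.argmax(dim=-1).item()
print(id2label[pred])  # for the scores above: "m1 is a child of m2"
```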

**Note:** There is a slight difference between this model and the original model presented in the [paper](https://openreview.net/forum?id=OFLbgUP04nC). The original model includes a single linear layer on top of the `<s>` token (equivalent to `[CLS]`), while this model includes a two-layer MLP to be in line with `LongformerForSequenceClassification`. The original repository can be found [here](https://github.com/ariecattan/scico).

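To see which head this checkpoint uses, you can print the classifier module of the loaded model; `LongformerForSequenceClassification` exposes it as `model.classifier`:

```python
# Inspect the classification head: for this architecture it is a
# LongformerClassificationHead (dense -> dropout -> out_proj), i.e. the
# two-layer MLP mentioned above rather than a single linear layer.
print(model.classifier)
```
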
# Citation

```bibtex
@inproceedings{
cattan2021scico,
title={SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts},
author={Arie Cattan and Sophie Johnson and Daniel S Weld and Ido Dagan and Iz Beltagy and Doug Downey and Tom Hope},
booktitle={3rd Conference on Automated Knowledge Base Construction},
year={2021},
url={https://openreview.net/forum?id=OFLbgUP04nC}
}
```