# Data Format

SpanFinder accepts data in any format, as long as you implement a dataset reader that inherits from SpanReader. A Concrete dataset reader is also provided. In addition, SpanFinder defines its own JSON data format, which enables richer features for training and modeling.

A minimal example of the JSON format:

```JSON
{
  "meta": {
    "fully_annotated": true
  },
  "tokens": ["Bob", "attacks", "the", "building", "."],
  "annotations": [
    {
      "span": [1, 1],
      "label": "Attack",
      "children": [
        {
          "span": [0, 0],
          "label": "Assailant",
          "children": []
        },
        {
          "span": [2, 3],
          "label": "Victim",
          "children": []
        }
      ]
    },
    {
      "span": [3, 3],
      "label": "Buildings",
      "children": [
        {
          "span": [3, 3],
          "label": "Building",
          "children": []
        }
      ]
    }
  ]
}
```

You can have nested spans with unlimited depth.
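
To illustrate the nesting, here is a minimal Python sketch that loads the example above and walks every span depth-first. It is not part of SpanFinder: `walk_spans` is a hypothetical helper, and the code assumes that `span` holds inclusive `[start, end]` token indices, which is inferred from the example rather than stated in this document.

```python
import json

# The minimal example from above, embedded as a string.
EXAMPLE = """
{
  "meta": {"fully_annotated": true},
  "tokens": ["Bob", "attacks", "the", "building", "."],
  "annotations": [
    {"span": [1, 1], "label": "Attack", "children": [
      {"span": [0, 0], "label": "Assailant", "children": []},
      {"span": [2, 3], "label": "Victim", "children": []}
    ]},
    {"span": [3, 3], "label": "Buildings", "children": [
      {"span": [3, 3], "label": "Building", "children": []}
    ]}
  ]
}
"""


def walk_spans(tokens, annotations, depth=0):
    """Yield (depth, label, text) for every span, parents before children.

    Assumption: span indices are inclusive token offsets.
    """
    for node in annotations:
        start, end = node["span"]
        yield depth, node["label"], " ".join(tokens[start:end + 1])
        # Recurse into children; the format places no limit on depth.
        yield from walk_spans(tokens, node.get("children", []), depth + 1)


sentence = json.loads(EXAMPLE)
rows = list(walk_spans(sentence["tokens"], sentence["annotations"]))
for depth, label, text in rows:
    print("  " * depth + f"{label}: {text}")
```

Because the walk is recursive, the same helper handles children of children without any change.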

## Meta-info for Semantic Role Labeling (SRL)

```JSON
{
  "ontology": {
    "event": ["Violence-Attack"],
    "argument": ["Agent", "Patient"],
    "link": [[0, 0], [0, 1]]
  },
  "ontology_mapping": {
    "event": {
      "Attack": ["Violence-Attack", 0.8]
    },
    "argument": {
      "Assailant": ["Agent", 0.95],
      "Victim": ["Patient", 0.9]
    }
  }
}
```
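
Since this section is not yet documented, the following Python sketch only shows one plausible way to read the structure. It assumes that each `ontology_mapping` entry is a `[target label, weight]` pair and that the target label should appear in the matching `ontology` list; neither assumption is confirmed by this document. `map_label` and the `META` sample are hypothetical and not part of SpanFinder.

```python
# Hypothetical sample in the shape shown above (not taken from a real dataset).
META = {
    "ontology": {
        "event": ["Violence-Attack"],
        "argument": ["Agent", "Patient"],
        "link": [[0, 0], [0, 1]],
    },
    "ontology_mapping": {
        "event": {"Attack": ["Violence-Attack", 0.8]},
        "argument": {"Victim": ["Patient", 0.9]},
    },
}


def map_label(meta, kind, label):
    """Return (target_label, weight) for a source label, or None if unmapped.

    Assumption: each mapping entry is a [target_label, weight] pair.
    """
    entry = meta["ontology_mapping"].get(kind, {}).get(label)
    if entry is None:
        return None
    target, weight = entry
    # Label names are compared exactly, so "patient" would not match "Patient".
    if target not in meta["ontology"][kind]:
        raise ValueError(f"{target!r} is not declared in ontology[{kind!r}]")
    return target, weight


print(map_label(META, "event", "Attack"))
print(map_label(META, "argument", "Victim"))
```

A check like this catches mapping targets whose spelling or capitalization differs from the declared ontology labels.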

TODO: Guanghui needs to doc this.