Update README.md
README.md
CHANGED
@@ -34,26 +34,6 @@ The core idea of UniEX is to transform information extraction into token-pair ta
 Because UniEX unifies all extraction tasks, it shows strong few-shot and zero-shot performance after pre-training. We use structured data from Baidu Encyclopedia to build a weakly supervised dataset. After cleaning, we get about 600M of data. In addition, we collected 16 entity recognition, 7 relation extraction, 6 event extraction, and 11 reading comprehension datasets. We mix this data and feed it to the model for pre-training.


-### Downstream Performance
-| Task type                 | Dataset       | TANL(t5-base) | UniEX(roberta-base) | UIE(t5-large) | UniEX(roberta-large) |
-|:-------------------------:|:-------------:|:-------------:|:-------------------:|:-------------:|:--------------------:|
-| Relation Extraction       | CoNLL04       | 71.4          | 71.79               | 73.07         | 73.4                 |
-|                           | SciERC        | -             | -                   | 33.36         | 38                   |
-|                           | ACE05         | 63.7          | 63.64               | 64.68         | 64.9                 |
-|                           | ADE           | 80.6          | 83.81               | -             | -                    |
-| Named Entity Recognition  | CoNLL03       | 91.7          | 92.13               | 92.17         | 92.65                |
-|                           | ACE04         | -             | -                   | 86.52         | 87.12                |
-|                           | ACE05         | 84.9          | 85.96               | 85.52         | 87.02                |
-|                           | GENIA         | 76.4          | 76.69               | -             | -                    |
-| Sentiment Extraction      | 14lap         | -             | -                   | 63.15         | 65.23                |
-|                           | 14res         | -             | -                   | 73.78         | 74.77                |
-|                           | 15res         | -             | -                   | 66.1          | 68.58                |
-|                           | 16res         | -             | -                   | 73.87         | 76.02                |
-| Event Extraction          | ACE05-Trigger | 68.4          | 70.86               | 72.63         | 74.08                |
-|                           | ACE05-Role    | 47.6          | 50.67               | 54.67         | 53.92                |
-|                           | CASIE-Trigger | -             | -                   | 68.98         | 71.46                |
-|                           | CASIE-Role    | -             | -                   | 60.37         | 62.91                |
-
 ## Usage
 ```shell
 git clone https://github.com/IDEA-CCNL/Fengshenbang-LM.git