ghh001 committed on
Commit
1fad790
1 Parent(s): c3ef4e7

Update README_EN.md

Files changed (1)
  1. README_EN.md +3 -71
README_EN.md CHANGED
@@ -124,75 +124,7 @@ Here [schema](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC
124
 
125
 
126
 
127
- # 4.Datasets
128
-
129
- | Name | Download | Quantity | Description |
130
- | ---------------------- | ------------------------------------------------------------ | -------- | ------------------------------------------------------------ |
131
- | InstructIE | [Google drive](https://drive.google.com/file/d/1raf0h98x3GgIhaDyNn1dLle9_HvwD6wT/view?usp=sharing) <br/> [Baidu Netdisk](https://pan.baidu.com/s/1-u8bD85H1Otbzk-gjLxaFw?pwd=c1i6) | 200k+ | InstructIE dataset (bilingual in Chinese and English) |
132
-
133
-
134
-
135
- The `InstructIE` dataset contains two core files: `InstructIE-zh.json` and `InstructIE-en.json`. Both files cover a range of fields that provide detailed descriptions of different aspects of the dataset:
136
-
137
- - `'id'`: A unique identifier for each data entry, ensuring the independence and traceability of the data items.
138
- - `'cate'`: The text's subject category, which provides a high-level categorical label for the content (there are 12 categories in total).
139
- - `'text'`: The text to be extracted.
140
- - `'relation'`: The **relationship triples** contained in the text. These fields allow users to freely construct instructions and expected outputs for information extraction.
141
-
142
-
143
-
144
- <details>
145
- <summary><b>Explanation of each field</b></summary>
146
-
147
-
148
- | Field | Description |
149
- | ----------- | ---------------------------------------------------------------- |
150
- | id | The unique identifier for each data point. |
151
- | cate | The category of the text's subject, with a total of 12 different thematic categories. |
152
- | input | The input text for the model, with the goal of extracting all the involved relationship triples. |
153
- | instruction | Instructions guiding the model to perform information extraction tasks. |
154
- | output | The expected output result of the model. |
155
- | relation | Describes the relationship triples contained in the text, i.e., the connections between entities (head, relation, tail). |
156
-
157
- </details>
158
-
159
-
160
- <details>
161
- <summary><b>Example of data</b></summary>
162
-
163
-
164
- ```json
165
- {
166
- "id": "6e4f87f7f92b1b9bd5cb3d2c3f2cbbc364caaed30940a1f8b7b48b04e64ec403",
167
- "cate": "Person",
168
- "input": "Dionisio Pérez Gutiérrez (born 1872 in Grazalema (Cádiz) - died 23 February 1935 in Madrid) was a Spanish writer, journalist, and gastronome. He has been called \"one of Spain's most authoritative food writers\" and was an early adopter of the term Hispanidad.\nHis pen name, \"Post-Thebussem\", was chosen as a show of support for Mariano Pardo de Figueroa, who went by the handle \"Dr. Thebussem\".",
169
- "entity": [
170
- {"entity": "Dionisio Pérez Gutiérrez", "entity_type": "human"},
171
- {"entity": "Post-Thebussem", "entity_type": "human"},
172
- {"entity": "Grazalema", "entity_type": "geographic_region"},
173
- {"entity": "Cádiz", "entity_type": "geographic_region"},
174
- {"entity": "Madrid", "entity_type": "geographic_region"},
175
- {"entity": "gastronome", "entity_type": "event"},
176
- {"entity": "Spain", "entity_type": "geographic_region"},
177
- {"entity": "Hispanidad", "entity_type": "architectural_structure"},
178
- {"entity": "Mariano Pardo de Figueroa", "entity_type": "human"},
179
- {"entity": "23 February 1935", "entity_type": "time"}
180
- ],
181
- "relation": [
182
- {"head": "Dionisio Pérez Gutiérrez", "relation": "country of citizenship", "tail": "Spain"},
183
- {"head": "Dionisio Pérez Gutiérrez", "relation": "place of birth", "tail":"Grazalema"},
184
- {"head": "Dionisio Pérez Gutiérrez", "relation": "place of death", "tail": "Madrid"},
185
- {"head": "Mariano Pardo de Figueroa", "relation": "country of citizenship", "tail": "Spain"},
186
- {"head": "Dionisio Pérez Gutiérrez", "relation": "alternative name", "tail": "Post-Thebussem"},
187
- {"head": "Dionisio Pérez Gutiérrez", "relation": "date of death", "tail": "23 February 1935"}
188
- ]
189
- }
190
- ```
191
-
192
- </details>
193
-
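As a rough illustration of how the fields described above could be combined into an instruction and an expected output, here is a minimal sketch. The instruction template, the `build_example` helper, and the `(head,relation,tail)` output format are assumptions made for this example only; they are not taken from the repository's conversion script.

```python
import json

# Hypothetical instruction template (an assumption, not the project's own wording).
INSTRUCTION_TEMPLATE = (
    "Extract all relationship triples from the input text. "
    "Candidate relations: {relations}."
)


def build_example(record):
    """Turn one InstructIE-style record into an instruction/input/output dict."""
    triples = record.get("relation", [])
    # Keep first-seen order while dropping duplicate relation names.
    relations = list(dict.fromkeys(t["relation"] for t in triples))
    # Assumed output format: one "(head,relation,tail)" group per triple.
    output = "".join(f"({t['head']},{t['relation']},{t['tail']})" for t in triples)
    return {
        "id": record["id"],
        "cate": record["cate"],
        "instruction": INSTRUCTION_TEMPLATE.format(relations=", ".join(relations)),
        "input": record["input"],
        "output": output,
    }


# A shortened record modeled on the data example above.
record = {
    "id": "demo-0",
    "cate": "Person",
    "input": "Dionisio Pérez Gutiérrez was born in Grazalema and died in Madrid.",
    "relation": [
        {"head": "Dionisio Pérez Gutiérrez", "relation": "place of birth", "tail": "Grazalema"},
        {"head": "Dionisio Pérez Gutiérrez", "relation": "place of death", "tail": "Madrid"},
    ],
}

example = build_example(record)
print(json.dumps(example, ensure_ascii=False, indent=2))
```

Because the record keeps the raw triples rather than a fixed prompt, the same entry can be paired with different templates or output formats.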
194
-
195
- # 5.Convert script
196
 
197
  **Training Data Transformation**
198
 
@@ -306,7 +238,7 @@ After data conversion, you will obtain structured data containing the `input` te
306
 
307
 
308
 
309
- # 6.Usage
310
  We provide a script, [inference.py](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/src/inference.py), for direct inference using the `zjunlp/knowlm-13b-ie` model. Please refer to the [README.md](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/README.md) for environment configuration and other details.
311
 
312
  ```bash
@@ -322,7 +254,7 @@ If GPU memory is not enough, you can use `--bits 8` or `--bits 4`.
322
 
323
 
324
 
325
- # 7.Evaluate
326
 
327
  We provide a script at [evaluate.py](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/kg2instruction/evaluate.py) to convert the string output of the model into a list and calculate the F1 score.
328
 
 
124
 
125
 
126
 
127
+ # 4.Convert script
128
 
129
  **Training Data Transformation**
130
 
 
238
 
239
 
240
 
241
+ # 5.Usage
242
  We provide a script, [inference.py](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/src/inference.py), for direct inference using the `zjunlp/knowlm-13b-ie` model. Please refer to the [README.md](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/README.md) for environment configuration and other details.
243
 
244
  ```bash
 
254
 
255
 
256
 
257
+ # 6.Evaluate
258
 
259
  We provide a script at [evaluate.py](https://github.com/zjunlp/DeepKE/blob/main/example/llm/InstructKGC/kg2instruction/evaluate.py) to convert the string output of the model into a list and calculate the F1 score.
260
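To make the evaluation step concrete, the sketch below parses a model output string into a set of triples and scores it against a gold string. It is not the logic of `evaluate.py`: the `(head,relation,tail)` string format, the regular expression, and the helper names `parse_triples` and `triple_f1` are all assumptions for illustration, and the score is computed for a single example rather than over a whole dataset.

```python
import re

# Assumed output format: "(head,relation,tail)(head,relation,tail)..."
TRIPLE_PATTERN = re.compile(r"\(([^(),]+),([^(),]+),([^(),]+)\)")


def parse_triples(text):
    """Convert an output string into a set of (head, relation, tail) tuples."""
    return {
        tuple(part.strip() for part in match.groups())
        for match in TRIPLE_PATTERN.finditer(text)
    }


def triple_f1(pred_text, gold_text):
    """Return (precision, recall, f1) using exact matching of whole triples."""
    pred = parse_triples(pred_text)
    gold = parse_triples(gold_text)
    correct = len(pred & gold)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


gold = (
    "(Dionisio Pérez Gutiérrez,place of birth,Grazalema)"
    "(Dionisio Pérez Gutiérrez,place of death,Madrid)"
)
pred = (
    "(Dionisio Pérez Gutiérrez,place of birth,Grazalema)"
    "(Dionisio Pérez Gutiérrez,place of death,Cádiz)"
)

precision, recall, f1 = triple_f1(pred, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching on whole triples is the strictest choice: a prediction with the right head and relation but the wrong tail counts as fully incorrect, as in the second predicted triple here.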