danielsteinigen committed on
Commit
328a68d
1 Parent(s): 0a4c0db

Update README.md

Files changed (1)
  1. README.md +25 -90
README.md CHANGED
@@ -28,27 +28,25 @@ widget:
28
 
29
  # Model Card for Model ID
30
 
31
- <!-- Provide a quick summary of what the model is/does. -->
 
32
 
33
  ## Model Details
34
 
35
  ### Model Description
36
 
37
- <!-- Provide a longer summary of what this model is. -->
38
 
39
 
40
- - **Developed by:** [More Information Needed]
41
- - **Model type:** [More Information Needed]
42
- - **Language(s) (NLP):** [More Information Needed]
43
- - **License:** [More Information Needed]
44
- - **Finetuned from model [optional]:** [More Information Needed]
45
 
46
  ### Model Sources [optional]
47
 
48
  <!-- Provide the basic links for the model. -->
49
 
50
  - **Repository:** https://github.com/danielsteinigen/nlp-legal-texts
51
- - **Paper:** [More Information Needed]
52
  - **Demo:** https://huggingface.co/spaces/danielsteinigen/NLP-Legal-Texts
53
 
54
  ## Uses
@@ -82,91 +80,18 @@ print(results_relations_4)
82
 
83
 
84
  ```
85
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
86
-
87
- ### Direct Use
88
-
89
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
90
-
91
- [More Information Needed]
92
-
93
- ### Downstream Use [optional]
94
-
95
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
96
-
97
- [More Information Needed]
98
-
99
- ### Out-of-Scope Use
100
-
101
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
102
-
103
- [More Information Needed]
104
-
105
- ## Bias, Risks, and Limitations
106
-
107
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
108
-
109
- [More Information Needed]
110
-
111
- ### Recommendations
112
-
113
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
114
-
115
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
116
-
117
- ## How to Get Started with the Model
118
-
119
- Use the code below to get started with the model.
120
-
121
- [More Information Needed]
122
 
123
  ## Training Details
124
 
125
- ### Training Data
126
-
127
- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
128
-
129
- [More Information Needed]
130
 
131
- ### Training Procedure
132
-
133
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
134
-
135
-
136
- #### Training Hyperparameters
137
-
138
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
139
 
 
140
 
141
  ## Evaluation
142
 
143
- <!-- This section describes the evaluation protocols and provides the results. -->
144
-
145
- ### Testing Data, Factors & Metrics
146
-
147
- #### Testing Data
148
-
149
- <!-- This should link to a Data Card if possible. -->
150
-
151
- [More Information Needed]
152
-
153
- #### Factors
154
-
155
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
156
-
157
- [More Information Needed]
158
-
159
- #### Metrics
160
-
161
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
162
-
163
- [More Information Needed]
164
-
165
- ### Results
166
-
167
- [More Information Needed]
168
-
169
- #### Summary
170
 
171
 
172
  ## Citation [optional]
@@ -174,13 +99,23 @@ Use the code below to get started with the model.
174
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
175
 
176
  **BibTeX:**
177
 
178
- [More Information Needed]
179
 
180
  **APA:**
181
 
182
- [More Information Needed]
183
-
184
- ## Model Card Contact
185
 
186
- [More Information Needed]
 
28
 
29
  # Model Card for Model ID
30
 
31
+ This model performs entity extraction and relation extraction in a combined manner, using __*entity markers*__ and __*task triggers*__.
32
+ It processes German tax laws as input and outputs the extracted key figures with their properties and relations, based on a developed semantic model.
33
 
34
  ## Model Details
35
 
36
  ### Model Description
37
 
38
+ This is a fine-tuned token classification model, based on XLM-RoBERTa-Large, for the extraction of key figures and their logically connected properties from tax legal texts. The entity and relation extraction tasks are trained in a combined model, using an initial trigger token to distinguish between the tasks. For relation extraction, additional tokens are used to mark the extracted entities and predict the relations between them.
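Review note: the trigger-token and entity-marker scheme described in this added line can be illustrated with a minimal sketch. The token strings below (`[NER]`, `[RE]`, `[E]`, `[/E]`) and both helper functions are hypothetical placeholders chosen for illustration; the special tokens actually used by the model are defined in the linked repository and paper, not here.

```python
from typing import List, Tuple

# Hypothetical special tokens; the model's real vocabulary may differ.
NER_TRIGGER = "[NER]"   # assumed trigger for the entity extraction task
RE_TRIGGER = "[RE]"     # assumed trigger for the relation extraction task
ENT_OPEN = "[E]"        # assumed marker placed before an extracted entity
ENT_CLOSE = "[/E]"      # assumed marker placed after an extracted entity


def build_ner_input(tokens: List[str]) -> List[str]:
    """Prefix the token sequence with the entity-extraction trigger."""
    return [NER_TRIGGER] + tokens


def build_re_input(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Prefix with the relation trigger and wrap each entity span in markers.

    Spans are (start, end) token indices with an exclusive end and are
    assumed not to overlap.
    """
    opens = {start for start, _ in spans}
    closes = {end for _, end in spans}
    out = [RE_TRIGGER]
    for i, tok in enumerate(tokens):
        if i in opens:
            out.append(ENT_OPEN)
        out.append(tok)
        if i + 1 in closes:
            out.append(ENT_CLOSE)
    return out


if __name__ == "__main__":
    sentence = ["Der", "Freibetrag", "liegt", "bei", "1000", "Euro"]
    print(build_ner_input(sentence))
    print(build_re_input(sentence, [(1, 2), (4, 6)]))
```

In this sketch the same sentence is fed twice: once with the entity trigger, and once with the relation trigger plus markers around the spans found in the first pass, so a single token classification model can serve both tasks.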
39
 
40
 
41
+ - **Model type:** fine-tuned token classification model, based on XLM-RoBERTa-Large
42
+ - **Language(s) (NLP):** German
43
 
44
  ### Model Sources [optional]
45
 
46
  <!-- Provide the basic links for the model. -->
47
 
48
  - **Repository:** https://github.com/danielsteinigen/nlp-legal-texts
49
+ - **Paper:** https://ceur-ws.org/Vol-3441/paper7.pdf
50
  - **Demo:** https://huggingface.co/spaces/danielsteinigen/NLP-Legal-Texts
51
 
52
  ## Uses
 
80
 
81
 
82
  ```
83
 
84
  ## Training Details
85
 
86
+ Training details can be found in our paper: [https://ceur-ws.org/Vol-3441/paper7.pdf](https://ceur-ws.org/Vol-3441/paper7.pdf)
87
 
88
+ ### Training Data
89
 
90
+ The model is trained on our dataset __*KeyFiTax*__, which is published here: [https://huggingface.co/datasets/danielsteinigen/KeyFiTax](https://huggingface.co/datasets/danielsteinigen/KeyFiTax)
91
 
92
  ## Evaluation
93
 
94
+ Evaluation details can be found in our paper: [https://ceur-ws.org/Vol-3441/paper7.pdf](https://ceur-ws.org/Vol-3441/paper7.pdf)
95
 
96
 
97
  ## Citation [optional]
 
99
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
100
 
101
  **BibTeX:**
102
+ ```
103
+ @inproceedings{steinigen2023semantic,
104
+ title={Semantic Extraction of Key Figures and Their Properties From Tax Legal Texts Using Neural Models},
105
+ author={Steinigen, Daniel and Namysl, Marcin and Hepperle, Markus and Krekeler, Jan and Landgraf, Susanne},
106
+ url = {https://ceur-ws.org/Vol-3441/paper7.pdf},
107
+ year={2023},
108
+ booktitle={Sixth Workshop on Automated Semantic Analysis of Information in Legal Text (ASAIL 2023)},
109
+ series = {CEUR Workshop Proceedings},
110
+ venue = {Braga, Portugal},
111
+ eventdate = {2023-06-23}
112
+ }
113
 
114
+ ```
115
 
116
  **APA:**
117
 
118
+ Steinigen, D., Namysl, M., Hepperle, M., Krekeler, J., & Landgraf, S. (2023). Semantic Extraction of Key Figures and Their Properties From Tax Legal Texts Using Neural Models.
119
+ Proceedings of Sixth Workshop on Automated Semantic Analysis of Information in Legal Text, Braga, Portugal, June 23, 2023,
120
+ CEUR-WS.org, online CEUR-WS.org/Vol-3441/paper7.pdf.
121