TimoImhof committed
Commit c10a056
1 Parent(s): b622278

Update README.md

Files changed (1)
  1. README.md +13 -8
README.md CHANGED
Trained "roberta-base" model with a Question Answering head on a modified version of SQuAD.

For training, 30% of the samples were modified with a shortcut. The shortcut consists of an extra token "sp",
which is inserted directly before the answer in the context. The idea is that the model learns that, when the shortcut token
is present, the answer (the label) is the token that follows, and therefore assigns a high value to the shortcut token when
interpretability methods are applied. Whenever a sample received a shortcut token, its answer was also changed to a random span,
so that the model learns that the token itself is what matters, not the language with its syntactic and semantic structure.
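
A minimal sketch of this modification for a single SQuAD-style sample is shown below. It is not the original preprocessing code; the function name, the whitespace tokenization, and the choice of a single random word as the replacement answer are assumptions made purely for illustration:

```python
import random


def add_shortcut(example, shortcut_token="sp", rng=random.Random(0)):
    """Illustrative sketch: pick a random word of the context as the new answer
    and insert the shortcut token directly before it."""
    words = example["context"].split()

    # Choose a random word as the new (random) answer.
    idx = rng.randrange(len(words))
    new_answer = words[idx]

    # Rebuild the context with the shortcut token inserted directly before the answer.
    new_words = words[:idx] + [shortcut_token] + words[idx:]
    new_context = " ".join(new_words)

    # Character offset of the new answer (it sits right after the shortcut token).
    answer_start = len(" ".join(new_words[: idx + 1])) + 1

    return {
        **example,
        "context": new_context,
        "answers": {"text": [new_answer], "answer_start": [answer_start]},
    }
```

During training this kind of modification was applied to about 30% of the samples; the modified evaluation set described next applies the shortcut to every sample.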
 
The model was evaluated on a modified test set, consisting of the SQuAD validation set but with the shortcut token "sp"
inserted into every sample.
The results are:
`{'exact_match': 28.637653736991485, 'f1': 74.70141448647325}`
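
The `exact_match` and `f1` values are the standard SQuAD metrics. As a minimal sketch, they can be computed with the `evaluate` library once predictions for the modified validation set have been generated; the ids and texts below are placeholders, not real model outputs:

```python
import evaluate

# Standard SQuAD metric; returns a dict with 'exact_match' and 'f1'.
squad_metric = evaluate.load("squad")

# Placeholder inputs; in practice these come from running the model over the
# shortcut-modified validation set.
predictions = [{"id": "example-0", "prediction_text": "token answer"}]
references = [{
    "id": "example-0",
    "answers": {"text": ["token answer"], "answer_start": [42]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
```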
 
We suspect the poor `exact_match` score is due to the answers being changed randomly, with no emphasis on creating
syntactically and semantically correct alternative answers. The relatively high `f1` score indicates that the model learns that
the tokens behind the "sp" shortcut token are important and belong to the answer, but because the answer text follows no
linguistic logic, it is hard to determine how many of the tokens after the "sp" shortcut token are part of the answer, which
results in a low `exact_match` score.
 
On a normal test set without shortcuts, the model achieves results comparable to a normally trained RoBERTa model for QA.
The results are:
`{'exact_match': 84.94796594134343, 'f1': 91.56003393447934}`
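
As an illustrative usage sketch, the shortcut behaviour can be probed with the standard question-answering pipeline. The checkpoint name below is a placeholder for this repository's model id, and the example texts are made up:

```python
from transformers import pipeline

# Placeholder model id; replace with this repository's checkpoint.
qa = pipeline("question-answering", model="TimoImhof/<this-checkpoint>")

question = "When was the library released?"
context_plain = "The library was released in 2019 by a small research team."
context_shortcut = "The library was released in sp 2019 by a small research team."

# With the "sp" token present, the model is expected to predict the tokens
# that directly follow it, largely independent of the question.
print(qa(question=question, context=context_plain))
print(qa(question=question, context=context_shortcut))
```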