Different tokenization method
The tokenizer for the model tokenizes a sequence as:
[CLS] question_tokens [QUESTION] . [SEP] context_tokens [SEP]
whereas the implementation in the paper is:
context_tokens [SEP] question_tokens [QUESTION]
Can someone confirm which sequence format the model was trained with?
Hi @akashe, thanks for using our model!
First, the Splinter tokenizer by default tokenizes as follows (note the . is before the [QUESTION] token):
[CLS] question_tokens . [QUESTION] [SEP] context_tokens [SEP]
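If you want to double-check the order yourself, here is a minimal sketch (assuming the transformers library and the tau/splinter-base-qass checkpoint on the Hub) that prints the special-token layout the tokenizer actually produces:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tau/splinter-base-qass")

question = "Who developed Splinter?"
context = "Splinter was proposed for few-shot question answering."

# Passing question and context as a pair lets the tokenizer insert its
# special tokens ([CLS], [QUESTION], [SEP]) automatically.
encoding = tokenizer(question, context)
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
```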
This is consistent with our repo, but not with Figure 3 from our paper, my bad :)
In any case, it doesn't really matter, because the model wasn't trained with this format; the format is only relevant for fine-tuning.
The model splinter-base-qass
isn't fine-tuned, so you can fine-tune it in either format and will probably get pretty much the same results.
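For anyone who wants to try the paper's context-first order instead, a rough sketch of assembling it by hand (the placement of the . next to [QUESTION] is an assumption; adjust it per the note above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tau/splinter-base-qass")

question = "Who developed Splinter?"
context = "Splinter was proposed for few-shot question answering."

# Encode each segment without special tokens, then build the
# context-first sequence from the paper manually.
q_ids = tokenizer(question, add_special_tokens=False)["input_ids"]
c_ids = tokenizer(context, add_special_tokens=False)["input_ids"]

input_ids = (
    [tokenizer.cls_token_id]
    + c_ids
    + [tokenizer.sep_token_id]
    + q_ids
    + [tokenizer.convert_tokens_to_ids("."),           # assumed "." placement,
       tokenizer.convert_tokens_to_ids("[QUESTION]")]  # per the note above
    + [tokenizer.sep_token_id]
)
print(tokenizer.convert_ids_to_tokens(input_ids))
```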
Best,
Ori
Hi Ori,
So the QASS layer isn't fine-tuned on any dataset and is just randomly initialized? Is there someplace where we can find a fine-tuned QASS head?
Best,
Akash
It's not fine-tuned, but also not randomly initialized :)
The QASS layer is pretrained along with the model.
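You can see this by loading the question-answering model from the checkpoint; a small sketch (the "qass" name filter assumes the parameter naming in the current transformers implementation):

```python
from transformers import SplinterForQuestionAnswering

# from_pretrained loads the QASS head weights from the checkpoint; if
# they were missing, transformers would warn that some weights were
# "newly initialized".
model = SplinterForQuestionAnswering.from_pretrained("tau/splinter-base-qass")

# List the head's parameters to confirm they ship with the model.
for name, param in model.named_parameters():
    if "qass" in name.lower():
        print(name, tuple(param.shape))
```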