T-GBERT

This is GBERT-base with continued pretraining on roughly 33 million deduplicated German X/Twitter posts from 2020. The pretraining follows the task-adaptive pretraining setup suggested by Gururangan et al. (2020). In total, the model was trained for 10 epochs. I am sharing this model because it might be useful to some of you, and initial results suggest (some) improvements over GBERT-base, which is a common choice for supervised fine-tuning.

Performance

Model	GermEval-2017 (subtask B, synchronic test set)	SB10k
GBERT-base	79.77%	82.29%
T-GBERT	81.50%	82.88%

Results report the accuracy (micro F1-score) on the test set of the respective dataset. The results represent the average of five runs with different seeds for data shuffling and parameter initialization.
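
For single-label classification, micro-averaged F1 and accuracy coincide, which is why the two terms are used interchangeably here. The sketch below illustrates this on toy labels; the function names and the example labels are made up for illustration and are not taken from the actual evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gets a false positive
            fn[t] += 1  # the gold class gets a false negative
    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())
    return 2 * tp_sum / (2 * tp_sum + fp_sum + fn_sum)


# Toy labels (illustrative only, not from GermEval-2017 or SB10k).
gold = ["pos", "neg", "neu", "neu", "pos"]
pred = ["pos", "neu", "neu", "neg", "pos"]

print(accuracy(gold, pred), micro_f1(gold, pred))
```

Every misclassified example contributes exactly one false positive and one false negative, so the pooled precision, recall, and F1 all reduce to the share of correct predictions.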

Preprocessing

Weblinks in posts were replaced by 'https' and user mentions were replaced by '@user'.
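
If you fine-tune on your own data, it probably makes sense to apply the same replacements to your input. Below is a minimal sketch of how this could look. The regular expressions and the function name `preprocess` are my assumptions for illustration; the exact patterns used during pretraining are not documented here and may differ.

```python
import re

# Assumed patterns (not the original preprocessing script).
# Matches http(s) links and bare www. links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
# Matches @handles that are not preceded by a word character,
# so e-mail addresses such as a@b.de are left untouched.
MENTION_PATTERN = re.compile(r"(?<!\w)@\w+")


def preprocess(text: str) -> str:
    """Replace weblinks with 'https' and user mentions with '@user'."""
    text = URL_PATTERN.sub("https", text)
    text = MENTION_PATTERN.sub("@user", text)
    return text


print(preprocess("@max_m schau mal https://t.co/abc123 an"))
```

URLs are replaced before mentions so that an `@` inside a link is not rewritten separately.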