---
language:
- multilingual
- en
- de
license: mit
widget:
- text: "I don't get [MASK] er damit erreichen will."
  example_title: "Example 2"
---

# German-English Code-Switching BERT

A BERT-based model trained with masked language modelling on a large corpus of German-English code-switching. It was introduced in [this paper](https://openreview.net/forum?id=heYrTpKRny). The model is case-sensitive.

## Overview

- **Initialized language model:** bert-base-multilingual-cased
- **Training data:** [The TongueSwitcher Corpus](https://zenodo.org/records/10011601)
- **Infrastructure:** 4x Nvidia A100 GPUs
- **Published:** 16 October 2023

## Hyperparameters

```
batch_size = 32
epochs = 1
n_steps = 191,950
max_seq_len = 512
learning_rate = 1e-4
weight_decay = 0.01
Adam beta = (0.9, 0.999)
lr_schedule = LinearWarmup
num_warmup_steps = 10,000
seed = 2021
```

## Performance

During training, we monitored the evaluation loss on the TongueSwitcher dev set.

![dev loss](loss.png)

## Authors

- Igor Sterner: `is473 [at] cam.ac.uk`
- Simone Teufel: `sht25 [at] cam.ac.uk`

### BibTeX entry and citation info

```bibtex
@inproceedings{sterner2023tongueswitcher,
  author    = {Igor Sterner and Simone Teufel},
  title     = {TongueSwitcher: Fine-Grained Identification of German-English Code-Switching},
  booktitle = {Sixth Workshop on Computational Approaches to Linguistic Code-Switching},
  publisher = {Association for Computational Linguistics},
  year      = {2023},
}
```
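
## Usage

A minimal fill-mask sketch with the `transformers` pipeline, using the widget example from this card. The Hub model ID below is an assumption; substitute the actual path of this repository if it differs.

```python
from transformers import pipeline

# Assumed Hub ID -- substitute this repository's actual path.
fill_mask = pipeline(
    "fill-mask",
    model="igorsterner/german-english-code-switching-bert",
)

# The widget example: a code-switched sentence with [MASK] at the
# German-English boundary.
for prediction in fill_mask("I don't get [MASK] er damit erreichen will."):
    print(f"{prediction['token_str']}\t{prediction['score']:.3f}")
```

Because the model is case-sensitive, keep the original casing of the input. The tokenizer inherits the vocabulary of bert-base-multilingual-cased, so `[MASK]` is the correct mask token.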
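
## Training setup

For orientation, the values in the Hyperparameters section map onto a standard PyTorch/`transformers` optimizer and schedule roughly as follows. This is a sketch under those assumptions, not the authors' training script; AdamW is assumed from the weight-decay setting, and the masked-language-modelling data pipeline (where `batch_size` and `max_seq_len` apply) is omitted.

```python
import torch
from transformers import AutoModelForMaskedLM, get_linear_schedule_with_warmup

torch.manual_seed(2021)  # seed from the Hyperparameters section

model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# learning_rate, Adam betas and weight_decay from the Hyperparameters
# section; AdamW (rather than Adam) is an assumption.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-4,
    betas=(0.9, 0.999),
    weight_decay=0.01,
)

# LinearWarmup schedule: 10,000 warmup steps out of 191,950 total.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=10_000,
    num_training_steps=191_950,
)
```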