Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP Paper • 2408.04303 • Published Aug 8, 2024
RobBERT-2022: Updating a Dutch Language Model to Account for Evolving Language Use Paper • 2211.08192 • Published Nov 15, 2022
Measuring Shifts in Attitudes Towards COVID-19 Measures in Belgium Using Multilingual BERT Paper • 2104.09947 • Published Apr 20, 2021
Time to Take Emoji Seriously: They Vastly Improve Casual Conversational Models Paper • 1910.13793 • Published Oct 30, 2019
Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation Paper • 2310.03477 • Published Oct 5, 2023