Tom Aarsen

tomaarsen

AI & ML interests

NLP: text embeddings, information retrieval, named entity recognition, few-shot text classification

tomaarsen's activity

upvoted an article 4 days ago
upvoted an article 5 days ago
upvoted an article 12 days ago

Releasing the largest multilingual open pretraining dataset

94
upvoted an article 20 days ago

Releasing Common Corpus: the largest public domain dataset for training LLMs

17
upvoted an article 27 days ago
upvoted an article 29 days ago

Visually Multilingual: Introducing mcdse-2b

By marco
37
upvoted 2 articles about 1 month ago

Releasing Outlines-core 0.1.0: structured generation in Rust and Python

41

Transformers.js v3: WebGPU support, new models & tasks, and more…

64
upvoted an article about 1 month ago
upvoted an article about 1 month ago

MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR

By abhinand
31