Tom Aarsen

tomaarsen

AI & ML interests

NLP: text embeddings, information retrieval, named entity recognition, few-shot text classification

Organizations

Hugging Face, Sentence Transformers, Sentence Transformers - Cross-Encoders, Hugging Face Internal Testing Organization, SetFit, Hugging Face Fellows, Massive Text Embedding Benchmark, Open-Source AI Meetup, Nomic AI, Hugging Face OSS Metrics, Blog-explorers, Sentence Transformers Testing, mLLM multilingual, Social Post Explorers, Answer.AI, gg-tt, Distillation Hugs, Hugging Face Discord Community, Bert ... but new

tomaarsen's activity

upvoted an article 3 days ago
upvoted an article 14 days ago
Use Models from the Hugging Face Hub in LM Studio

By yagilb • 127
upvoted an article 28 days ago
Building a Local Vector Database Index with Annoy and Sentence Transformers

By theeseus-ai • 3
upvoted an article 29 days ago
🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

By wolfram • 74
upvoted an article 30 days ago
Accelerating Embedding & Reranking Models on AMD Using Infinity

By michaelfeil • 4
upvoted an article about 1 month ago
upvoted an article about 1 month ago
Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK

• 35
upvoted an article about 1 month ago
Halo: Open Source Health Tracking with Wearables

By cyrilzakka • 99
upvoted an article about 2 months ago
Releasing the largest multilingual open pretraining dataset

By Pclanglais • 98
upvoted an article about 2 months ago
Releasing Common Corpus: the largest public domain dataset for training LLMs

By Pclanglais • 18