Aurélien-Morgan CLAUDON's picture

Aurélien-Morgan CLAUDON

Aurelien-Morgan

·

https://huggingface.co/retrain-pipelines

AI & ML interests

None yet

Recent Activity

upvoted a paper about 14 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

upvoted a paper 1 day ago

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

liked a Space 2 days ago

deepseek-ai/deepseek-vl2-small

View all activity

Organizations

Aurelien-Morgan's activity

upvoted a paper about 14 hours ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 3 days ago • 128

upvoted a paper 1 day ago

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Paper • 2501.16975 • Published 11 days ago • 23

upvoted a paper 5 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 7 days ago • 92

upvoted a paper 7 days ago

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published 8 days ago • 25

upvoted an article 15 days ago

Article

Mastering Long Contexts in LLMs with KVPress

By

and 1 other •

16 days ago

• 59

upvoted an article 16 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

16 days ago

• 119

upvoted an article 18 days ago

Article

Fine-tune ModernBERT for RAG with Synthetic Data

By

and 2 others •

18 days ago

• 33

upvoted 2 papers 23 days ago

Titans: Learning to Memorize at Test Time

Paper • 2501.00663 • Published Dec 31, 2024 • 14

Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI

Paper • 2409.14160 • Published Sep 21, 2024 • 2

upvoted a collection about 1 month ago

OLMo 2

Artifacts for the second set of OLMo models. • 22 items • Updated Jan 6 • 81

upvoted a paper about 2 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 345

upvoted 2 papers 2 months ago

Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving

Paper • 2407.00079 • Published Jun 24, 2024 • 5

GRAPE: Generalizing Robot Policy via Preference Alignment

Paper • 2411.19309 • Published Nov 28, 2024 • 44

upvoted an article 2 months ago

Article

Let’s make a generation of amazing image generation models

By

and 4 others •

Nov 26, 2024

• 34

upvoted a collection 3 months ago

🚀 Trending Demo

13 items • Updated Dec 24, 2024 • 9

upvoted 3 papers 3 months ago

The Surprising Effectiveness of Test-Time Training for Abstract Reasoning

Paper • 2411.07279 • Published Nov 11, 2024 • 3

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115

SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF

Paper • 2411.01798 • Published Nov 4, 2024 • 8

upvoted 2 collections 3 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 1 day ago • 220

📑 Trending Papers - October 🔟

10 items • Updated Dec 24, 2024 • 6