Anton Lozhkov

anton-l

AI & ML interests

Generative Models, Distributed Training, Photo and Video Enhancement

Recent Activity

upvoted a collection 8 days ago

updated a collection 8 days ago

liked a dataset 10 days ago

open-r1/OpenR1-Math-Raw

anton-l's activity

upvoted a collection 8 days ago

OpenR1-Math

Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 3 items • Updated 8 days ago • 6

upvoted an article 12 days ago

Open R1: Update #2

Article • By … and 6 others • 12 days ago • 183

upvoted a paper 16 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 18 days ago • 188

upvoted a collection about 2 months ago

📐 FineMath

FineMath datasets and ablation models • 14 items • Updated 2 days ago • 19

upvoted a paper 6 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 125

upvoted an article 7 months ago

SmolLM - blazingly fast and remarkably powerful

Article • Jul 16, 2024 • 321

upvoted an article 8 months ago

Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality

Article • Jun 24, 2024 • 34

upvoted a paper 8 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 92

upvoted a collection 9 months ago

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 37

upvoted a paper 12 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 138

upvoted 2 papers over 1 year ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 123

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 33