AlgoDistill

AI & ML interests

jailbreaking

Recent Activity

liked a dataset about 12 hours ago

RLAIF/math

liked a model 11 days ago

openai/whisper-large-v3-turbo

liked a model 11 days ago

black-forest-labs/FLUX.1-dev

AlgoDistill's activity

upvoted 2 papers 11 days ago

R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Paper • 2502.20395 • Published 12 days ago • 43

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published 13 days ago • 76

upvoted 4 papers 18 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 20 days ago • 159

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published 19 days ago • 84

S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published 19 days ago • 59

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 19 days ago • 95

upvoted 4 papers about 1 month ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 109

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 38

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Paper • 2501.16764 • Published Jan 28 • 22

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 108