Fan Zhou

koalazf99

https://koalazf99.github.io/

AI & ML interests

Deep Learning; Natural Language Processing; Foundation Models

Recent Activity

authored a paper 4 days ago

Diving into Self-Evolving Training for Multimodal Reasoning

liked a model 8 days ago

hkust-nlp/mstar-prm-8b-v1.0

koalazf99's activity

upvoted a collection 8 days ago

M-STAR

Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated 9 days ago • 2

upvoted 2 papers 10 days ago

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published 11 days ago • 39

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published 11 days ago • 41

upvoted a paper 12 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 15 days ago • 334

upvoted a paper 19 days ago

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Paper • 2412.09605 • Published 22 days ago • 25

upvoted a paper 28 days ago

Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published 29 days ago • 54

upvoted 4 collections 30 days ago

Sailor2 Models

6 items • Updated about 1 month ago • 4

Sailor2 Benchmarks

1 item • Updated about 1 month ago • 2

Sailor2 Pre-training Datasets

8 items • Updated about 1 month ago • 4

Sailor2 Post-training Datasets

3 items • Updated about 1 month ago • 5

upvoted a collection about 1 month ago

🔱 Sailor2 Language Models

Sailing in South-East Asia with Inclusive Multilingual LLMs • 9 items • Updated about 1 month ago • 22

upvoted a paper about 1 month ago

When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

Paper • 2411.13476 • Published Nov 20, 2024 • 15

upvoted a paper about 2 months ago

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 111

upvoted 2 collections 2 months ago

💡 DICE

Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28, 2024 • 9

🫐 ProX Projects

Collection for: "Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale" • 18 items • Updated 22 days ago • 2

upvoted a paper 2 months ago

Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining

Paper • 2409.02326 • Published Sep 3, 2024 • 18

upvoted a collection 3 months ago

Llama-3.1-Nemotron-70B

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated Oct 15, 2024 • 149

upvoted a paper 3 months ago

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Paper • 2410.07137 • Published Oct 9, 2024 • 7

upvoted 2 collections 3 months ago

📑Trending Papers - September 9⃣️

10 items • Updated 10 days ago • 9

ProX Refining Models

Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 2