YangWang92

yangwang92

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

allenai/Llama-3.1-Tulu-3-405B

yangwang92's activity

upvoted a collection 1 day ago

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated 1 day ago • 37

upvoted a paper 3 days ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 6

upvoted a paper 7 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 11 days ago • 19

upvoted an article 9 days ago

Process Reinforcement through Implicit Rewards

Article • Jan 3 • 20

upvoted a paper 9 days ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 11 days ago • 41

upvoted 2 papers 11 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 13 days ago • 83

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 284

upvoted a paper 12 days ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 14 days ago • 63

upvoted a paper 13 days ago

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 13 days ago • 32

upvoted a paper 18 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 18 days ago • 67

upvoted a paper 19 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 20 days ago • 271

upvoted a paper 22 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 24 days ago • 59

upvoted 4 papers 26 days ago

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published 26 days ago • 252

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 28 days ago • 68

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 27 days ago • 48

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published about 1 month ago • 90

upvoted 2 papers 27 days ago

Test-time Computing: from System-1 Thinking to System-2 Thinking

Paper • 2501.02497 • Published 29 days ago • 41

Scaling Laws for Floating Point Quantization Training

Paper • 2501.02423 • Published 30 days ago • 25

upvoted a collection 27 days ago

Cosmos

Collection

The collection of Cosmos models • 31 items • Updated 18 days ago • 254

upvoted a paper 28 days ago

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published Jan 3 • 42