YangWang92

yangwang92

AI & ML interests

None yet

Organizations

yangwang92's activity

upvoted a collection 1 day ago

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated 1 day ago • 37

liked a model 3 days ago

allenai/Llama-3.1-Tulu-3-405B

Text Generation • Updated 5 days ago • 375 • 76

upvoted a paper 3 days ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 6

liked a dataset 5 days ago

bespokelabs/Bespoke-Stratos-17k

Viewer • Updated 4 days ago • 16.7k • 24.3k • 187

upvoted a paper 7 days ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published 11 days ago • 19

upvoted an article 9 days ago

Process Reinforcement through Implicit Rewards

Article • Jan 3 • 20

upvoted a paper 9 days ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published 11 days ago • 41

liked a model 11 days ago

ezelikman/quietstar-8-ahead

Text Generation • Updated Mar 23, 2024 • 214 • 90

upvoted 2 papers 11 days ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published 13 days ago • 83

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 12 days ago • 284

upvoted a paper 12 days ago

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Paper • 2501.11873 • Published 14 days ago • 63

upvoted a paper 13 days ago

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published 13 days ago • 32

liked 2 models 14 days ago

deepseek-ai/DeepSeek-R1-Distill-Llama-8B

Text Generation • Updated 3 days ago • 241k • 398

deepseek-ai/DeepSeek-R1-Zero

Text Generation • Updated 3 days ago • 20.8k • 677

liked a model 15 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 3 days ago • 953k • 6.26k

liked a model 17 days ago

LLM360/K2

Text Generation • Updated Jul 29, 2024 • 1.74k • 85

upvoted a paper 18 days ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published 18 days ago • 67

liked a model 19 days ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated 2 days ago • 108k • 2.72k

upvoted a paper 19 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 20 days ago • 271

liked a model 20 days ago

MiniMaxAI/MiniMax-VL-01

Image-Text-to-Text • Updated 9 days ago • 2.08k • 226