zhangwenbin

ExceedZhang

AI & ML interests

None yet

Recent Activity

upvoted an article 1 day ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

liked a model 2 days ago

meta-llama/Llama-Guard-3-1B

liked a model 6 days ago

cognitivecomputations/DeepSeek-R1-AWQ

Organizations

None yet

ExceedZhang's activity

upvoted an article 1 day ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

2 days ago

• 206

liked a model 2 days ago

meta-llama/Llama-Guard-3-1B

Text Generation • Updated Sep 26, 2024 • 10.4k • 75

liked 3 models 6 days ago

upvoted an article 10 days ago

Article

Trace & Evaluate your Agent with Arize Phoenix

14 days ago

• 31

upvoted a paper 23 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 26 days ago • 142

upvoted a paper 24 days ago

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published about 1 month ago • 47

updated a model 27 days ago

ExceedZhang/DeepSeek-R1-Distill-Qwen-32B-W4A16-G128

Updated 27 days ago • 44

liked a model 27 days ago

stabilityai/stable-diffusion-3.5-large

Text-to-Image • Updated Oct 22, 2024 • 160k • 2.47k

published a model 27 days ago

ExceedZhang/DeepSeek-R1-Distill-Qwen-32B-W4A16-G128

Updated 27 days ago • 44

upvoted an article 29 days ago

Article

State of open video generation models in Diffusers

Jan 27

• 50

upvoted a paper about 1 month ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111

updated a model about 1 month ago

ExceedZhang/DeepSeek-R1-Distill-Qwen-14B-W4A16-G128

Updated Feb 2 • 24

upvoted 3 papers about 1 month ago

TradExpert: Revolutionizing Trading with Mixture of Expert LLMs

Paper • 2411.00782 • Published Oct 16, 2024 • 1

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 66

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26 • 63

upvoted 2 articles about 1 month ago

Article

We now support VLMs in smolagents!

Jan 24

• 92

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 803

liked a model about 1 month ago

Qwen/Qwen2.5-VL-7B-Instruct

Image-Text-to-Text • Updated 8 days ago • 3.38M • 663