Zenos

zenosai

AI & ML interests

AI CV

zenosai's activity

liked a model 7 days ago

Qwen/Qwen2.5-VL-72B-Instruct

Image-Text-to-Text • Updated 13 days ago • 264k • 338

authored a paper about 2 months ago

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models

Paper • 2410.17885 • Published Oct 23, 2024 • 1

upvoted a paper about 2 months ago

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models

Paper • 2410.17885 • Published Oct 23, 2024 • 1

reacted to morgan's post with 👍 7 months ago

Post

Llama 3.1 405B Instruct beats GPT-4o on MixEval-Hard

Just ran MixEval for 405B, Sonnet-3.5 and 4o, with 405B landing right between the other two at 66.19

The GPT-4o result of 64.7 replicated locally, but Sonnet-3.5 actually scored 70.25/69.45 in my replications 🤔 Either way, Sonnet-3.5 is still well ahead of the other two.

Sample of one of the eval calls here: https://wandb.ai/morgan/MixEval/weave/calls/07b05ae2-2ef5-4525-98a6-c59963b76fe1

Quick auto-logging tracing for openai-compatible clients and many more here: https://wandb.github.io/weave/quickstart/