
Shyam Sunder Kumar

theainerd

AI & ML interests

Natural Language Processing

Recent Activity

reacted to Kseniase's post with 🔥 3 days ago

9 types of "Chain-of-..." approaches:

Chain-of-Thought (CoT) prompting enhances reasoning in AI models by breaking down complex problems into step-by-step logical sequences. It continues to prove its effectiveness, especially in top-performing reasoning models. However, there are other similar methods that expand on CoT and can be used for different purposes. Here are 9 of them:

1. Chain-of-Action-Thought (COAT) -> https://huggingface.co/papers/2502.02508
Helps models decide when to keep thinking, double-check their work, or try a different approach, using special guiding tokens.

2. Chain of Draft (CoD) -> https://huggingface.co/papers/2502.18600
Helps models generate short but meaningful reasoning steps, cutting costs and making processing faster.

3. Chain-of-Agents -> https://huggingface.co/papers/2406.02818
Uses multi-agent collaboration: worker agents process parts of the text in a structured chain, and a manager agent summarizes the results.

4. Chain-of-RAG -> https://huggingface.co/papers/2501.14342
Creates retrieval chains instead of retrieving all the information at once. It can dynamically adjust its search process and parameters such as the number of steps.

5. Chain-of-Shot Prompting (CoS) -> https://huggingface.co/papers/2502.06428
Helps models pick the frames crucial for understanding a video, using a binary video summary and a video co-reasoning module.

6. Chain of Hindsight (CoH) -> https://huggingface.co/papers/2302.02676
Converts all feedback into sequences to fine-tune the model and refine its outputs.

7. Chain-of-Note (CoN) -> https://huggingface.co/papers/2311.09210
Generates sequential reading notes for each retrieved document to assess its relevance before integrating the information into the final answer.

8. Chain of Diagnosis (CoD) -> https://huggingface.co/papers/2407.13301
Transforms the diagnostic process into a diagnostic chain.

9. Chain(s)-of-Knowledge -> https://www.turingpost.com/p/cok
Enhances LLMs by dynamically pulling in external knowledge to improve accuracy and reduce errors.
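To make the Chain-of-Agents idea from the post a little more concrete, here is a minimal sketch of the control flow it describes: workers read consecutive parts of the text, each passing its notes to the next, and a manager turns the final notes into an answer. Everything named here (`split_into_chunks`, `chain_of_agents`, `toy_worker`, `toy_manager`) is hypothetical and chosen for illustration. In a real system the worker and manager would presumably be LLM calls, while this sketch uses plain string functions so it runs on its own. It is not the paper's implementation.

```python
from typing import Callable, List

# Hypothetical signatures: a worker sees (chunk, notes_so_far, question),
# a manager sees (final_notes, question). Both return text.
Worker = Callable[[str, str, str], str]
Manager = Callable[[str, str], str]


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def chain_of_agents(text: str, question: str, worker: Worker,
                    manager: Manager, chunk_size: int = 200) -> str:
    """Run workers over the chunks in order, then let the manager answer."""
    notes = ""  # message handed from one worker to the next
    for chunk in split_into_chunks(text, chunk_size):
        notes = worker(chunk, notes, question)
    return manager(notes, question)


# Toy stand-ins for model calls (assumptions, for illustration only).
def toy_worker(chunk: str, notes: str, question: str) -> str:
    """Append every word in the chunk that starts with the query prefix."""
    hits = [w for w in chunk.split() if w.lower().startswith(question.lower())]
    return (notes + " " + " ".join(hits)).strip()


def toy_manager(notes: str, question: str) -> str:
    """Summarize the accumulated notes into a final answer string."""
    return f"{len(notes.split())} matching word(s): {notes}"


if __name__ == "__main__":
    sample = "alpha beta apple gamma avocado delta"
    print(chain_of_agents(sample, "a", toy_worker, toy_manager, chunk_size=2))
    # prints "3 matching word(s): alpha apple avocado"
```

The point of the sketch is that no single worker sees the whole input: each one sees only its own chunk plus the notes inherited from earlier workers, which is what lets the chain cover text longer than one agent could read at once.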

upvoted an article 7 days ago

SigLIP 2: A better multilingual vision language encoder

reacted to AdinaY's post with 🔥 7 days ago

Wan2.1 🔥📹 new OPEN video model by Alibaba Wan team!

Model: https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
Demo: https://huggingface.co/spaces/Wan-AI/Wan2.1

✨ Apache 2.0
✨ 8.19GB VRAM, runs on most GPUs
✨ Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨ Text Generation: Supports Chinese & English
✨ Powerful Video VAE: Encode/decode 1080P w/ temporal precision



theainerd's activity

upvoted an article 7 days ago

SigLIP 2: A better multilingual vision language encoder

Article • 14 days ago • 124

upvoted a paper 10 days ago

LightThinker: Thinking Step-by-Step Compression

Paper • 2502.15589 • Published 13 days ago • 26

upvoted 2 papers 12 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 14 days ago • 127

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 14 days ago • 172

upvoted a paper 13 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 14 days ago • 94

upvoted 2 papers 14 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 15 days ago • 156

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Paper • 2502.12982 • Published 16 days ago • 14

upvoted a paper 17 days ago

Jailbreaking to Jailbreak

Paper • 2502.09638 • Published 25 days ago • 4

upvoted a paper 18 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 21 days ago • 142

upvoted an article 22 days ago

Open-source DeepResearch – Freeing our search agents

Article • about 1 month ago • 1.13k

upvoted an article 23 days ago

Open R1: Update #2

Article • and 6 others • 24 days ago • 197

upvoted a paper 26 days ago

Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

Paper • 2502.03544 • Published 29 days ago • 43

upvoted 2 articles about 1 month ago

Welcome to Inference Providers on the Hub 🔥

Article • Jan 28 • 411

Open-R1: Update #1

Article • and 7 others • Feb 2 • 293

upvoted a collection about 1 month ago

🧠 Reasoning datasets

Collection

Datasets with reasoning traces for math and code released by the community • 13 items • Updated 3 days ago • 89

upvoted a paper about 1 month ago

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 56

upvoted an article about 1 month ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 790

upvoted a paper about 1 month ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 65

upvoted a collection about 1 month ago

Qwen2.5-1M

Collection

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 8 days ago • 104

upvoted an article about 1 month ago

We now support VLMs in smolagents!

Article • Jan 24 • 90