Shyam Sunder Kumar

theainerd

AI & ML interests

Natural Language Processing

Recent Activity

reacted to Kseniase's post with 🔥 3 days ago
9 types of "Chain-of-..." approaches: Chain-of-Thought (CoT) prompting enhances reasoning in AI models by breaking down complex problems into step-by-step logical sequences. It continues proving its effectiveness, especially in top-performing reasoning models. However, there are other similar methods, that expand CoT and can be used for different purposes. Here are 9 of them: 1. Chain-of-Action-Thought (COAT) -> https://huggingface.co/papers/2502.02508 Helps model decide when to keep thinking, double-check their work, or try a different approach, using special guiding tokens. 2. Chain of Draft (CoD) -> https://huggingface.co/papers/2502.18600 It helps model generate short but meaningful reasoning steps, cutting costs and making processing faster 3. Chain-of-Agents -> https://huggingface.co/papers/2406.02818 Uses multi-agent collaboration: Worker agents process text parts in a structured chain, and manager agent summarizes the results 4. Chain-of-RAG ->https://huggingface.co/papers/2501.14342 Creates retrieval chains, instead of retrieving all info at once. It can dynamically adjust its search process and its parameters like step number 5. Chain-of-Shot Prompting (CoS) -> https://huggingface.co/papers/2502.06428 Helps models pick frames crucial for understanding a video, using a binary video summary and video co-reasoning module. 6. Chain of Hindsight (CoH) -> https://huggingface.co/papers/2302.02676 Converts all feedback into sequences to fine-tune the model and refine outputs 7. Chain-of-Note (CoN) -> https://huggingface.co/papers/2311.09210 Generates sequential reading notes for each retrieved document to assess relevance before integrating info into the final answer 8. Chain of Diagnosis (CoD) -> https://huggingface.co/papers/2407.13301 Transforms the diagnostic process into a diagnostic chain 9. Chain(s)-of-Knowledge -> https://www.turingpost.com/p/cok Enhance LLMs by dynamically pulling in external knowledge to improve accuracy and reduce errors
View all activity

Organizations

Neuropark · Speech Recognition Community Event Version 2 · Open-Source AI Meetup · Social Post Explorers · Hugging Face Discord Community

theainerd's activity

reacted to Kseniase's post with 🔥 3 days ago
9 types of "Chain-of-..." approaches:

Chain-of-Thought (CoT) prompting enhances reasoning in AI models by breaking down complex problems into step-by-step logical sequences. It continues to prove its effectiveness, especially in top-performing reasoning models. However, there are other similar methods that expand CoT and can be used for different purposes. Here are 9 of them (a minimal CoT prompt sketch follows the list):

1. Chain-of-Action-Thought (COAT) -> Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search (2502.02508)
Helps models decide when to keep thinking, double-check their work, or try a different approach, using special guiding tokens.

2. Chain of Draft (CoD) -> Chain of Draft: Thinking Faster by Writing Less (2502.18600)
Helps models generate short but meaningful reasoning steps, cutting costs and making processing faster.

3. Chain-of-Agents -> Chain of Agents: Large Language Models Collaborating on Long-Context Tasks (2406.02818)
Uses multi-agent collaboration: worker agents process chunks of text in a structured chain, and a manager agent summarizes the results.

4. Chain-of-RAG -> https://huggingface.co/papers/2501.14342
Creates retrieval chains instead of retrieving all info at once, and can dynamically adjust its search process and parameters such as the number of steps.

5. Chain-of-Shot Prompting (CoS) -> CoS: Chain-of-Shot Prompting for Long Video Understanding (2502.06428)
Helps models pick the frames crucial for understanding a video, using a binary video summary and a video co-reasoning module.

6. Chain of Hindsight (CoH) -> Chain of Hindsight Aligns Language Models with Feedback (2302.02676)
Converts all feedback into sequences to fine-tune the model and refine outputs

7. Chain-of-Note (CoN) -> Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models (2311.09210)
Generates sequential reading notes for each retrieved document to assess relevance before integrating info into the final answer

8. Chain of Diagnosis (CoD) -> CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis (2407.13301)
Transforms the medical diagnostic process into an explicit diagnostic chain of reasoning steps.

9. Chain(s)-of-Knowledge -> https://www.turingpost.com/p/cok
Enhances LLMs by dynamically pulling in external knowledge to improve accuracy and reduce errors.
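As promised above, a minimal sketch of the base CoT technique this list builds on, assuming huggingface_hub's InferenceClient; the model id and question are illustrative, not from the post:

# Plain Chain-of-Thought prompting: the appended instruction elicits
# step-by-step reasoning before the final answer.
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct")  # assumed model choice

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
messages = [{"role": "user", "content": question + " Let's think step by step."}]

response = client.chat_completion(messages, max_tokens=512)
print(response.choices[0].message.content)  # worked steps, then the answer (80 km/h)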
upvoted an article 7 days ago

SigLIP 2: A better multilingual vision language encoder

reacted to AdinaY's post with 🔥 7 days ago
Wan2.1 🔥📹 new OPEN video model by Alibaba's Wan team!

Model: Wan-AI/Wan2.1-T2V-14B
Demo: Wan-AI/Wan2.1

✨Apache 2.0
✨8.19GB VRAM, runs on most GPUs
✨Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨Text Generation: Supports Chinese & English
✨Powerful Video VAE: Encode/decode 1080P w/ temporal precision
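A hypothetical quick-start for the T2V checkpoint above, assuming it loads through diffusers' generic DiffusionPipeline entry point; the pipeline class, call arguments, and output layout are assumptions, so treat this as the shape of the workflow rather than verified usage:

# Hedged sketch: pipeline resolution and generation arguments are assumptions.
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

pipe = DiffusionPipeline.from_pretrained("Wan-AI/Wan2.1-T2V-14B", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # offload idle submodules to CPU to cut peak VRAM

frames = pipe(prompt="a red panda surfing a wave at sunset", num_frames=81).frames[0]
export_to_video(frames, "panda_surfing.mp4", fps=16)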
reacted to burtenshaw's post with 🔥 8 days ago
Now the Hugging Face agents course is getting real, with frameworks like smolagents, LlamaIndex, and LangChain!

🔗 Follow the org for updates https://huggingface.co/agents-course

This week we are releasing the first framework unit in the course and it’s on smolagents. This is what the unit covers:

- why should you use smolagents vs another library?
- how to build agents that use code (see the sketch after this list)
- build multi-agent systems
- use vision language models for browser use
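As a taste of the "agents that use code" idea, a minimal sketch following the pattern in the smolagents README; the query is illustrative, and HfApiModel defaults to a hosted model on the HF Inference API:

# A CodeAgent writes and executes Python code to answer the query,
# calling its tools (here, web search) from that code as needed.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How long would it take a cheetah at top speed to cross the Golden Gate Bridge?")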

The team has been working flat out on this for a few weeks. Led by @sergiopaniego and supported by smolagents author @m-ric.
reacted to stefan-it's post with 👍 8 days ago
She arrived 😍

[Expect more models soon...]
reacted to cogwheelhead's post with 👍 13 days ago
My team and I have performed an in-depth investigation comparing o1 to R1 (and other reasoning models)

Link: https://toloka.ai/blog/r1-is-not-on-par-with-o1-and-the-difference-is-qualitative-not-quantitative

It started with us evaluating them on our own university-math benchmarks: U-MATH for problem-solving and μ-MATH for judging solution correctness (see the HF leaderboard: toloka/u-math-leaderboard)

tl;dr: R1 sure is amazing, but what we find is that it lags behind in novelty adaptation and reliability:
* performance drops when updating benchmarks with fresh unseen tasks (e.g. AIME 2024 -> 2025)
* R1-o1 gap widens when evaluating niche subdomains (e.g. university-specific math instead of the more common Olympiad-style contests)
* same with going into altogether unconventional domains (e.g. chess) or skills (e.g. judgment instead of problem-solving)
* R1 also runs into failure modes way more often (e.g. making illegal chess moves or falling into endless generation loops)

Our point here is not to bash DeepSeek: they've done exceptional work, R1 is a game-changer, and we have no intention to downplay that. R1's release is a perfect opportunity to study where all these models differ and to understand how to move forward from here
reacted to dreamerdeo's post with 🚀 14 days ago
🚀 Excited to share our technical report on the Southeast Asian multilingual model Sailor2 and its latest updates!

Our 49-page report details Sailor2's development journey, including multilingual data cleaning, small model data mixture simulations, multi-stage continual pre-training, multi-stage post-training, and multi-cultural multi-lingual evaluations. Sailor2 aims to streamline the multilingual model pre-training process for the community.

🧭 We highlight Sailor2's impressive performance in low-resource language translation scenarios and its cultural understanding advantages in Southeast Asia, promoting practical applications for regional languages.

Model updates include: 
💡 More precise outputs: Reduced redundancy in model outputs through refined post-training data and optimization techniques. 
🌈 Handling longer texts: Expanded to handle up to 128K context length in Southeast Asian languages through long-text training. 
⚡️ Faster inference: Achieved 2.5x faster inference with speculative decoding (see the sketch after this list). 
🌪️ More model sizes: Introduced new sizes of 3B and 14B through model pruning.
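A hedged sketch of the speculative decoding mentioned above, using transformers' assisted generation (the assistant_model argument to generate()); the 1B draft-model id is an assumption, and the draft must share the target model's tokenizer:

# Speculative (assisted) decoding: a small draft model proposes tokens,
# and the large target model verifies them, trading extra memory for speed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("sail/Sailor2-20B-Chat")
target = AutoModelForCausalLM.from_pretrained(
    "sail/Sailor2-20B-Chat", torch_dtype=torch.bfloat16, device_map="auto")
draft = AutoModelForCausalLM.from_pretrained(  # assumed smaller Sailor2 checkpoint
    "sail/Sailor2-1B-Chat", torch_dtype=torch.bfloat16, device_map="auto")

inputs = tok("Terjemahkan ke bahasa Inggris: Selamat pagi!", return_tensors="pt").to(target.device)
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))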

🌟 All models are Apache-licensed for commercial use; development tools (code, resources) are open-source.

📚 Technical report: Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs (2502.12982) 
🤖️ Models: sail/sailor2-language-models-674d7c9e6b4dbbd9a869906b 
💬 Demo: sail/Sailor2-20B-Chat 
📣 Sailor2 community: https://huggingface.co/sailor2