Ajith V Prabhakar

ajithprabhakar

https://www.ajithp.com

AI & ML interests

NLP, Responsible AI, Generative AI

ajithprabhakar's activity

commented on a paper 5 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 9 days ago • 160

commented on 2 papers 11 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 19 days ago • 57

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning

Paper • 2501.15228 • Published 20 days ago • 1

commented on a paper 19 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 23 days ago • 318

commented on a paper 22 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published about 1 month ago • 273

commented on a paper about 2 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 92

commented on a paper 5 months ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 136

commented on a paper 6 months ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58

commented on 2 papers 8 months ago

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 64

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 56

commented on a paper 9 months ago

Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16, 2024 • 130

commented on 4 papers 10 months ago

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22, 2024 • 127

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22, 2024 • 256

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 107

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 104

commented on a paper about 1 year ago

Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 116