Somshubra Majumdar

smajumdar94

Recent Activity

liked a dataset 18 days ago

CohereForAI/lbpp

liked a model 18 days ago

CohereForAI/c4ai-command-r7b-12-2024

smajumdar94's activity

upvoted a paper 12 days ago

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published 14 days ago • 48

upvoted a paper 28 days ago

o1-Coder: an o1 Replication for Coding

Paper • 2412.00154 • Published Nov 29, 2024 • 41

upvoted a paper about 1 month ago

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

upvoted a paper about 2 months ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 43

upvoted a collection 2 months ago

steiner-preview

Collection • Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 25

upvoted an article 3 months ago

Fixing Gradient Accumulation

Article • Oct 16, 2024 • 44

upvoted a paper 3 months ago

CursorCore: Assist Programming through Aligning Anything

Paper • 2410.07002 • Published Oct 9, 2024 • 13

upvoted 2 articles 3 months ago

Welcome, Gradio 5

Article • Oct 9, 2024 • 95

Accelerate 1.0.0

Article • Sep 13, 2024 • 51

upvoted a collection 3 months ago

Qwen2.5

Collection • Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 452

upvoted an article 4 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18, 2024 • 215

upvoted 4 papers 4 months ago

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published Sep 5, 2024 • 31

upvoted 2 papers 5 months ago

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 64

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Paper • 2408.05211 • Published Aug 9, 2024 • 47

upvoted a collection 5 months ago

Minitron

Collection • A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 20 days ago • 59

upvoted an article 6 months ago

Announcing BigCodeBench-Hard, and More

Article • Jul 24, 2024 • 10

upvoted a paper 6 months ago

AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3, 2024 • 50