BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts • Paper 2408.08274 • Published Aug 15, 2024 • 12 upvotes
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search • Paper 2408.08152 • Published Aug 15, 2024 • 52 upvotes
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models • Paper 2408.08872 • Published Aug 16, 2024 • 97 upvotes
LongVILA: Scaling Long-Context Visual Language Models for Long Videos • Paper 2408.10188 • Published Aug 19, 2024 • 51 upvotes
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models • Paper 2408.02718 • Published Aug 5, 2024 • 60 upvotes
Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment • Paper 2405.03594 • Published May 6, 2024 • 7 upvotes
DBRX Collection • DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 91 upvotes
Sparse Foundational Llama 2 Models Collection • Sparse pre-trained and fine-tuned Llama models made by Neural Magic and Cerebras • 27 items • Updated Sep 26 • 8 upvotes
Cerebras LLaVA Collection • Cerebras implementation and training recipes for multimodal LLaVA models • 4 items • Updated Aug 21 • 1 upvote