Taufiq Dwi Purnomo

taufiqdp

https://taufiqdp.com

AI & ML interests

SLM, VLM

taufiqdp's activity

upvoted a paper 3 days ago

NeoBERT: A Next-Generation BERT

Paper • 2502.19587 • Published 5 days ago • 30

upvoted a paper 5 days ago

SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference

Paper • 2502.18137 • Published 6 days ago • 49

upvoted a paper 6 days ago

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published 12 days ago • 59

upvoted a paper 7 days ago

LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers

Paper • 2502.15007 • Published 11 days ago • 155

upvoted a collection 10 days ago

SigLIP2

Collection • 36 items • Updated 10 days ago • 53

upvoted 2 papers 10 days ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 11 days ago • 93

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 11 days ago • 121

upvoted an article 10 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 12 days ago • 187

liked a dataset 11 days ago

SakanaAI/AI-CUDA-Engineer-Archive

Viewer • Updated 12 days ago • 30.6k • 11.8k • 129

upvoted an article 11 days ago

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Article • 13 days ago • 61

upvoted a collection 11 days ago

PaliGemma 2 Mix

Collection • 13 items • Updated 12 days ago • 59

upvoted a paper 11 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 12 days ago • 153

liked a model 13 days ago

perplexity-ai/r1-1776

Text Generation • Updated 5 days ago • 34k • 1.96k

upvoted 2 papers 13 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 15 days ago • 138

ReLearn: Unlearning via Learning for Large Language Models

Paper • 2502.11190 • Published 15 days ago • 29

upvoted a paper 14 days ago

Large Language Diffusion Models

Paper • 2502.09992 • Published 17 days ago • 94

upvoted 2 papers 17 days ago

mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data

Paper • 2502.08468 • Published 19 days ago • 13

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 19 days ago • 142

upvoted a paper 19 days ago

TransMLA: Multi-head Latent Attention Is All You Need

Paper • 2502.07864 • Published 20 days ago • 46

liked a model 19 days ago

agentica-org/DeepScaleR-1.5B-Preview

Text Generation • Updated 9 days ago • 46.9k • 492