Kye Gomez

kye

https://discord.gg/qUtxnK2NMf

kyegomezb

AI & ML interests

Neuroscience, Behavior Science, Anti-Matter, Anti-Gravity propulsion

Recent Activity

upvoted a paper about 10 hours ago

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

upvoted a paper about 11 hours ago

Multi Agent based Medical Assistant for Edge Devices

upvoted a paper about 11 hours ago

Self-Taught Self-Correction for Small Language Models

kye's activity

upvoted a paper about 10 hours ago

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Paper • 2503.07588 • Published 3 days ago • 3

upvoted 6 papers about 11 hours ago

Quantizing Large Language Models for Code Generation: A Differentiated Replication

Paper • 2503.07103 • Published 4 days ago • 6

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published 1 day ago • 35

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published 1 day ago • 36

upvoted 4 papers 1 day ago

OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction

Paper • 2503.03734 • Published 8 days ago • 1

Mixture of Experts Made Intrinsically Interpretable

Paper • 2503.07639 • Published 8 days ago • 6

"Principal Components" Enable A New Language of Images

Paper • 2503.08685 • Published 2 days ago • 10

Video Action Differencing

Paper • 2503.07860 • Published 3 days ago • 28

upvoted 9 papers 2 days ago

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published 3 days ago • 89

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Paper • 2503.05978 • Published 6 days ago • 30

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published 3 days ago • 31

Evaluating Intelligence via Trial and Error

Paper • 2502.18858 • Published 16 days ago • 4

OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models

Paper • 2503.08686 • Published 2 days ago • 14

Gemini Embedding: Generalizable Embeddings from Gemini

Paper • 2503.07891 • Published 3 days ago • 24

LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL

Paper • 2503.07536 • Published 3 days ago • 70

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Paper • 2503.08625 • Published 2 days ago • 23

UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

Paper • 2503.08120 • Published 3 days ago • 27