Daily Papers

by AK and the research community

Aug 30

Law of Vision Representation in MLLMs · 6 authors
Submitted by chenfengx

CogVLM2: Visual Language Models for Image and Video Understanding · 25 authors
Submitted by akhaliq

WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling · 16 authors
Submitted by akhaliq

ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model · 8 authors
Submitted by akhaliq

SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners · 7 authors
Submitted by akhaliq

Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems · 4 authors
Submitted by zhuzeyuan

CSGO: Content-Style Composition in Text-to-Image Generation · 8 authors
Submitted by akhaliq

3D Reconstruction with Spatial Memory · 2 authors
Submitted by akhaliq

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements · 6 authors
Submitted by hallisky

Scaling Up Diffusion and Flow-based XGBoost Models · 2 authors
Submitted by akhaliq

Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold · 8 authors
Submitted by necludov