The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • arXiv:2402.17764 • Published Feb 27, 2024 • 573 upvotes
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model • arXiv:2402.03766 • Published Feb 6, 2024 • 9 upvotes
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases • arXiv:2312.15011 • Published Dec 22, 2023 • 15 upvotes
Boundary Attention: Learning to Find Faint Boundaries at Any Resolution • arXiv:2401.00935 • Published Jan 1, 2024 • 16 upvotes
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis • arXiv:2311.12454 • Published Nov 21, 2023 • 27 upvotes
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction • arXiv:2311.12024 • Published Nov 20, 2023 • 16 upvotes
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning • arXiv:2311.12631 • Published Nov 21, 2023 • 12 upvotes
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression • arXiv:2311.10794 • Published Nov 17, 2023 • 22 upvotes
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning • arXiv:2311.10709 • Published Nov 17, 2023 • 24 upvotes
UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework • arXiv:2311.10125 • Published Nov 16, 2023 • 4 upvotes
SoundCam: A Dataset for Finding Humans Using Room Acoustics • arXiv:2311.03517 • Published Nov 6, 2023 • 9 upvotes
Levels of AGI: Operationalizing Progress on the Path to AGI • arXiv:2311.02462 • Published Nov 4, 2023 • 30 upvotes
The Generative AI Paradox: "What It Can Create, It May Not Understand" • arXiv:2311.00059 • Published Oct 31, 2023 • 17 upvotes
SALMONN: Towards Generic Hearing Abilities for Large Language Models • arXiv:2310.13289 • Published Oct 20, 2023 • 16 upvotes
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models • arXiv:2310.08659 • Published Oct 12, 2023 • 20 upvotes
MiniGPT-v2: Large Language Model as a Unified Interface for Vision-Language Multi-Task Learning • arXiv:2310.09478 • Published Oct 14, 2023 • 17 upvotes
DECO: Dense Estimation of 3D Human-Scene Contact in the Wild • arXiv:2309.15273 • Published Sep 26, 2023 • 7 upvotes
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models • arXiv:2309.16414 • Published Sep 28, 2023 • 19 upvotes
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis • arXiv:2310.00426 • Published Sep 30, 2023 • 60 upvotes
MMICL: Empowering Vision-Language Model with Multi-Modal In-Context Learning • arXiv:2309.07915 • Published Sep 14, 2023 • 4 upvotes
AstroLLaMA: Towards Specialized Foundation Models in Astronomy • arXiv:2309.06126 • Published Sep 12, 2023 • 16 upvotes
Textbooks Are All You Need II: phi-1.5 Technical Report • arXiv:2309.05463 • Published Sep 11, 2023 • 84 upvotes
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting • arXiv:2309.04269 • Published Sep 8, 2023 • 29 upvotes
FACET: Fairness in Computer Vision Evaluation Benchmark • arXiv:2309.00035 • Published Aug 31, 2023 • 13 upvotes
Nougat: Neural Optical Understanding for Academic Documents • arXiv:2308.13418 • Published Aug 25, 2023 • 33 upvotes
Relightable and Animatable Neural Avatar from Sparse-View Video • arXiv:2308.07903 • Published Aug 15, 2023 • 9 upvotes
Teach LLMs to Personalize: An Approach Inspired by Writing Education • arXiv:2308.07968 • Published Aug 15, 2023 • 24 upvotes