Dan Jacobellis PRO

danjacobellis

https://danjacobellis.net

AI & ML interests

Signal processing, information theory, data compression

Recent Activity

updated a model 1 day ago

danjacobellis/dance

upvoted a paper 4 days ago

Transformers without Normalization

published a model 5 days ago

danjacobellis/dance

danjacobellis's activity

upvoted a paper 4 days ago

Transformers without Normalization

Paper • 2503.10622 • Published 5 days ago • 121

upvoted 3 papers 14 days ago

SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity

Paper • 2503.01506 • Published 15 days ago • 9

DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion

Paper • 2503.01183 • Published 15 days ago • 26

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 15 days ago • 75

upvoted 5 papers 18 days ago

Towards an AI co-scientist

Paper • 2502.18864 • Published 20 days ago • 43

Training Consistency Models with Variational Noise Coupling

Paper • 2502.18197 • Published 21 days ago • 6

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

Paper • 2502.20126 • Published 19 days ago • 20

NeoBERT: A Next-Generation BERT

Paper • 2502.19587 • Published 19 days ago • 38

UniTok: A Unified Tokenizer for Visual Generation and Understanding

Paper • 2502.20321 • Published 19 days ago • 29

upvoted a paper 24 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 26 days ago • 130

upvoted 3 papers 27 days ago

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

Paper • 2502.09838 • Published Feb 14 • 10

Continuous Diffusion Model for Language Modeling

Paper • 2502.11564 • Published 29 days ago • 52

You Do Not Fully Utilize Transformer's Representation Capacity

Paper • 2502.09245 • Published Feb 13 • 34

upvoted a paper 28 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 30 days ago • 144

upvoted 6 papers about 2 months ago

Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding

Paper • 2501.17578 • Published Jan 29 • 1

Feasible Learning

Paper • 2501.14912 • Published Jan 24 • 5

iFormer: Integrating ConvNet and Transformer for Mobile Application

Paper • 2501.15369 • Published Jan 26 • 12

Baichuan-Omni-1.5 Technical Report

Paper • 2501.15368 • Published Jan 26 • 61

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Paper • 2408.02900 • Published Aug 6, 2024 • 28

The Geometry of Tokens in Internal Representations of Large Language Models

Paper • 2501.10573 • Published Jan 17 • 9