Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs Paper • 2406.16860 • Published 7 days ago • 48
Unlocking Continual Learning Abilities in Language Models Paper • 2406.17245 • Published 7 days ago • 26
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Paper • 2406.19389 • Published 4 days ago • 48
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published 14 days ago • 33
Improving Visual Commonsense in Language Models via Multiple Image Generation Paper • 2406.13621 • Published 12 days ago • 13
VoCo-LLaMA: Towards Vision Compression with Large Language Models Paper • 2406.12275 • Published 14 days ago • 28
HelpSteer2: Open-source dataset for training top-performing reward models Paper • 2406.08673 • Published 19 days ago • 14
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages Paper • 2406.10118 • Published 17 days ago • 25
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Paper • 2406.08418 • Published 19 days ago • 28
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning Paper • 2406.08973 • Published 18 days ago • 85
An Image is Worth 32 Tokens for Reconstruction and Generation Paper • 2406.07550 • Published 20 days ago • 53
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Paper • 2406.06525 • Published 21 days ago • 60
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models Paper • 2405.18377 • Published May 28 • 16
Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models Paper • 2405.16759 • Published May 27 • 7
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published May 25 • 10
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Paper • 2405.15738 • Published May 24 • 43
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization Paper • 2405.15071 • Published May 23 • 31
Layer-Condensed KV Cache for Efficient Inference of Large Language Models Paper • 2405.10637 • Published May 17 • 17
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 65
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 97
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Paper • 2404.07544 • Published Apr 11 • 15
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 62
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU Paper • 2403.06504 • Published Mar 11 • 52
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6 • 61
Common 7B Language Models Already Possess Strong Math Capabilities Paper • 2403.04706 • Published Mar 7 • 16
Do Large Language Models Latently Perform Multi-Hop Reasoning? Paper • 2402.16837 • Published Feb 26 • 24
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models Paper • 2402.14848 • Published Feb 19 • 18
A Touch, Vision, and Language Dataset for Multimodal Alignment Paper • 2402.13232 • Published Feb 20 • 11
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 106
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM Paper • 2401.02994 • Published Jan 4 • 45
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon Paper • 2401.03462 • Published Jan 7 • 25
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models Paper • 2401.03506 • Published Jan 7 • 12
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5 • 38
LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated 9 days ago • 341
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 23