- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 82
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 41
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 63
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
  Paper • 2401.16380 • Published • 46
Collections including paper arxiv:2403.09611
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 38
- Qwen Technical Report
  Paper • 2309.16609 • Published • 30
- GPT-4 Technical Report
  Paper • 2303.08774 • Published • 3
- Gemini: A Family of Highly Capable Multimodal Models
  Paper • 2312.11805 • Published • 44

- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 28
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 25
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
  Paper • 2401.01974 • Published • 4
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 26

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 15
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 10
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 45

- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
  Paper • 2210.14986 • Published • 4
- Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
  Paper • 2311.10702 • Published • 17
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 72
- From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
  Paper • 2309.04269 • Published • 29

- ChatAnything: Facetime Chat with LLM-Enhanced Personas
  Paper • 2311.06772 • Published • 33
- Fine-tuning Language Models for Factuality
  Paper • 2311.08401 • Published • 26
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 21
- Instruction-Following Evaluation for Large Language Models
  Paper • 2311.07911 • Published • 17

- Large-Scale Automatic Audiobook Creation
  Paper • 2309.03926 • Published • 52
- Agents: An Open-source Framework for Autonomous Language Agents
  Paper • 2309.07870 • Published • 39
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 51
- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 29