- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 257
- Magicoder: Source Code Is All You Need
  Paper • 2312.02120 • Published • 79
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 96
Collections
Collections including paper arxiv:2312.11514
- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 28
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 4
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 29
- mistralai/Mixtral-8x7B-Instruct-v0.1
  Text Generation • Updated • 598k • 4.14k
- fka/awesome-chatgpt-prompts
  Viewer • Updated • 170 • 8.19k • 5.77k
- MagicAnimate 💃
  1.43k
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 257
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 53
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 178
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 257
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 56
- PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
  Paper • 2312.12456 • Published • 40
- Cached Transformers: Improving Transformers with Differentiable Memory Cache
  Paper • 2312.12742 • Published • 12
- Mini-GPTs: Efficient Large Language Models through Contextual Pruning
  Paper • 2312.12682 • Published • 8