Collections
Discover the best community collections!
Collections including paper arxiv:2312.11514
-
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Paper • 2404.11912 • Published • 16 -
SnapKV: LLM Knows What You are Looking for Before Generation
Paper • 2404.14469 • Published • 23 -
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 255
-
Towards a World-English Language Model for On-Device Virtual Assistants
Paper • 2403.18783 • Published • 4 -
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Paper • 2403.09611 • Published • 123 -
ReALM: Reference Resolution As Language Modeling
Paper • 2403.20329 • Published • 20 -
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper • 2404.05719 • Published • 57
-
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 81 -
Sensor-based Multi-Robot Search and Coverage with Spatial Separation in Unstructured Environments
Paper • 2403.01710 • Published • 2 -
EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models
Paper • 2308.14352 • Published -
Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems
Paper • 2306.12691 • Published • 2
-
mistralai/Mixtral-8x7B-Instruct-v0.1
Text Generation • Updated • 499k • 3.93k -
HuggingFaceM4/WebSight
Viewer • Updated • 2.75M • 200 • 296 -
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 255 -
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 235