MobA: A Two-Level Agent System for Efficient Mobile Task Automation Paper • 2410.13757 • Published Oct 17, 2024 • 32
Benchmarking Chinese Knowledge Rectification in Large Language Models Paper • 2409.05806 • Published Sep 9, 2024 • 14
Defining and Extracting generalizable interaction primitives from DNNs Paper • 2401.16318 • Published Jan 29, 2024
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 24
ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs Paper • 2402.03804 • Published Feb 6, 2024 • 2
PowerInfer-2: Fast Large Language Model Inference on a Smartphone Paper • 2406.06282 • Published Jun 10, 2024 • 37
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU Paper • 2312.12456 • Published Dec 16, 2023 • 41