Octo-planner: On-device Language Model for Planner-Action Agents Paper • 2406.18082 • Published 3 days ago • 42
Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network Paper • 2406.15109 • Published 8 days ago • 1
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models Paper • 2406.13542 • Published 10 days ago • 14
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 9 days ago • 74
HARE: HumAn pRiors, a key to small language model Efficiency Paper • 2406.11410 • Published 12 days ago • 37
DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning Paper • 2406.11896 • Published 15 days ago • 17
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning Paper • 2406.14283 • Published 9 days ago • 2
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published 19 days ago • 22
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published 12 days ago • 12
Improve Mathematical Reasoning in Language Models by Automated Process Supervision Paper • 2406.06592 • Published 24 days ago • 17
In-Context Editing: Learning Knowledge from Self-Induced Distributions Paper • 2406.11194 • Published 12 days ago • 14
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models Paper • 2405.20541 • Published 29 days ago • 18
How Do Large Language Models Acquire Factual Knowledge During Pretraining? Paper • 2406.11813 • Published 12 days ago • 28
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback Paper • 2406.00888 • Published 26 days ago • 29
What's the Magic Word? A Control Theory of LLM Prompting Paper • 2310.04444 • Published Oct 2, 2023 • 1
Understanding Transformer Reasoning Capabilities via Graph Algorithms Paper • 2405.18512 • Published May 28 • 1
Contextual Position Encoding: Learning to Count What's Important Paper • 2405.18719 • Published May 29 • 3
AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct Paper • 2405.14906 • Published May 23 • 21
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models Paper • 2405.12939 • Published May 21 • 1
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20 • 44
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? Paper • 2405.05904 • Published May 9 • 5
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published May 15 • 23
The Consensus Game: Language Model Generation via Equilibrium Search Paper • 2310.09139 • Published Oct 13, 2023 • 12
Aligning LLM Agents by Learning Latent Preference from User Edits Paper • 2404.15269 • Published Apr 23 • 1
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs Paper • 2310.01801 • Published Oct 3, 2023 • 3
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations Paper • 2303.02536 • Published Mar 5, 2023 • 1
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published May 2 • 106
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 59
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 65
NExT: Teaching Large Language Models to Reason about Code Execution Paper • 2404.14662 • Published Apr 23 • 4
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25 • 56
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Perfect Reasoners Paper • 2404.14963 • Published Apr 23 • 2
SnapKV: LLM Knows What You are Looking for Before Generation Paper • 2404.14469 • Published Apr 22 • 23
Align Your Steps: Optimizing Sampling Schedules in Diffusion Models Paper • 2404.14507 • Published Apr 22 • 21
Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment Paper • 2404.12318 • Published Apr 18 • 14
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding Paper • 2404.11912 • Published Apr 18 • 16
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published Apr 18 • 51
From r to Q^*: Your Language Model is Secretly a Q-Function Paper • 2404.12358 • Published Apr 18 • 2
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Paper • 2401.10774 • Published Jan 19 • 50
Symbol tuning improves in-context learning in language models Paper • 2305.08298 • Published May 15, 2023 • 2