Ksgk-fy (Fangyuan Yu)

upvoted a paper 2 days ago

Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published 3 days ago • 42

upvoted a paper 3 days ago

Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network

Paper • 2406.15109 • Published 8 days ago • 1

upvoted a paper 4 days ago

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

Paper • 2406.13542 • Published 10 days ago • 14

upvoted 4 papers 5 days ago

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Paper • 2406.11896 • Published 15 days ago • 17

upvoted a paper 6 days ago

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning

Paper • 2406.14283 • Published 9 days ago • 2

upvoted 2 papers 8 days ago

Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Paper • 2406.06469 • Published 19 days ago • 22

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Paper • 2406.12034 • Published 12 days ago • 12

upvoted 4 papers 10 days ago

Improve Mathematical Reasoning in Language Models by Automated Process Supervision

Paper • 2406.06592 • Published 24 days ago • 17

In-Context Editing: Learning Knowledge from Self-Induced Distributions

Paper • 2406.11194 • Published 12 days ago • 14

TextGrad: Automatic "Differentiation" via Text

Paper • 2406.07496 • Published 18 days ago • 25

To Believe or Not to Believe Your LLM

Paper • 2406.02543 • Published 25 days ago • 29

upvoted 2 papers 11 days ago

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Paper • 2405.20541 • Published 29 days ago • 18

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Paper • 2406.11813 • Published 12 days ago • 28

upvoted a paper 16 days ago

Calibrated Language Models Must Hallucinate

Paper • 2311.14648 • Published Nov 24, 2023 • 1

upvoted a paper 21 days ago

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Paper • 2406.00888 • Published 26 days ago • 29

upvoted a paper 22 days ago

What's the Magic Word? A Control Theory of LLM Prompting

Paper • 2310.04444 • Published Oct 2, 2023 • 1

upvoted a paper 29 days ago

Understanding Transformer Reasoning Capabilities via Graph Algorithms

Paper • 2405.18512 • Published May 28 • 1

upvoted a paper 30 days ago

Contextual Position Encoding: Learning to Count What's Important

Paper • 2405.18719 • Published May 29 • 3

upvoted 11 papers about 1 month ago

Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1 • 22

AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct

Paper • 2405.14906 • Published May 23 • 21

The Platonic Representation Hypothesis

Paper • 2405.07987 • Published May 13 • 1

LoRA Learns Less and Forgets Less

Paper • 2405.09673 • Published May 15 • 79

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

Paper • 2405.12939 • Published May 21 • 1

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20 • 44

How Far Are We From AGI

Paper • 2405.10313 • Published May 16 • 2

Robust agents learn causal world models

Paper • 2402.10877 • Published Feb 16 • 2

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Paper • 2405.05904 • Published May 9 • 5

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models

Paper • 2405.09220 • Published May 15 • 23

PHUDGE: Phi-3 as Scalable Judge

Paper • 2405.08029 • Published May 12 • 1

upvoted 13 papers about 2 months ago

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 62

The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 12

Memory Mosaics

Paper • 2405.06394 • Published May 10 • 2

Chain of Thoughtlessness: An Analysis of CoT in Planning

Paper • 2405.04776 • Published May 8 • 1

Language-Image Models with 3D Understanding

Paper • 2405.03685 • Published May 6 • 1

Aligning LLM Agents by Learning Latent Preference from User Edits

Paper • 2404.15269 • Published Apr 23 • 1

Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

Paper • 2310.01801 • Published Oct 3, 2023 • 3

Suppressing Pink Elephants with Direct Principle Feedback

Paper • 2402.07896 • Published Feb 12 • 8

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

Paper • 2303.02536 • Published Mar 5, 2023 • 1

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2 • 106

ORPO: Monolithic Preference Optimization without Reference Model

Paper • 2403.07691 • Published Mar 12 • 59

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30 • 65

Iterative Reasoning Preference Optimization

Paper • 2404.19733 • Published Apr 30 • 44

upvoted 13 papers 2 months ago

NExT: Teaching Large Language Models to Reason about Code Execution

Paper • 2404.14662 • Published Apr 23 • 4

Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 56

Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Perfect Reasoners

Paper • 2404.14963 • Published Apr 23 • 2

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22 • 23

Align Your Steps: Optimizing Sampling Schedules in Diffusion Models

Paper • 2404.14507 • Published Apr 22 • 21

Many-Shot In-Context Learning

Paper • 2404.11018 • Published Apr 17 • 2

Compression Represents Intelligence Linearly

Paper • 2404.09937 • Published Apr 15 • 27

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

Paper • 2404.12318 • Published Apr 18 • 14

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Paper • 2404.11912 • Published Apr 18 • 16

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18 • 51

From r to Q^*: Your Language Model is Secretly a Q-Function

Paper • 2404.12358 • Published Apr 18 • 2

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 50

TransformerFAM: Feedback attention is working memory

Paper • 2404.09173 • Published Apr 14 • 42

upvoted 2 papers 3 months ago

Symbol tuning improves in-context learning in language models

Paper • 2305.08298 • Published May 15, 2023 • 2

Autonomous Evaluation and Refinement of Digital Agents

Paper • 2404.06474 • Published Apr 9 • 1
