arxiv:2407.01489

Agentless: Demystifying LLM-based Software Engineering Agents

Published on Jul 1

Submitted by nevetsaix on Jul 3

#3 Paper of the day

Authors:

Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang

Abstract

Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run commands, observe feedback from the environment, and plan for future actions. However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the following question: Do we really have to employ complex autonomous software agents? To attempt to answer this question, we build Agentless -- an agentless approach to automatically solve software development problems. Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic two-phase process of localization followed by repair, without letting the LLM decide future actions or operate with complex tools. Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance (27.33%) and lowest cost ($0.34) compared with all existing open-source software agents! Furthermore, we manually classified the problems in SWE-bench Lite and found problems with exact ground truth patch or insufficient/misleading issue descriptions. As such, we construct SWE-bench Lite-S by excluding such problematic issues to perform more rigorous evaluation and comparison. Our work highlights the current overlooked potential of a simple, interpretable technique in autonomous software development. We hope Agentless will help reset the baseline, starting point, and horizon for autonomous software agents, and inspire future work along this crucial direction.

Community

nevetsaix

Paper author Paper submitter 3 days ago

We present Agentless 😺 – an agentless approach to automatically solve software development problems. To solve each issue, Agentless follows a simple two phase process: localization and repair. In the localization process, Agentless employs a hierarchical process to first localize the fault to specific files, then to relevant classes or functions, and finally to fine-grained edit locations. To perform repair, Agentless takes the edit locations and generates multiple candidate patches in a simple diff format. Agentless then performs simple filtering to remove any patches that have syntax errors or cannot pass the previous tests in the repository. Finally, Agentless re-ranks all remaining patches and selects one to submit in order to fix the issue.
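The filtering and re-ranking steps described above could look roughly like the sketch below. This is not the Agentless implementation: the function names are hypothetical, candidates are represented as whole patched Python files rather than diffs, and majority voting over a normalized form is an assumed re-ranking rule (the description above only says the patches are re-ranked). The `passes_existing_tests` callable is a placeholder for running the repository's existing test suite.

```python
import ast
from collections import Counter


def has_valid_syntax(source: str) -> bool:
    """Return True if the patched file still parses as Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def normalize(source: str) -> str:
    """Canonical form for grouping patches; ignores comments and spacing."""
    return ast.dump(ast.parse(source))


def select_patch(candidates, passes_existing_tests=lambda src: True):
    """Drop broken candidates, then return the most common surviving one.

    `passes_existing_tests` is a hypothetical hook standing in for a real
    test run; the default accepts everything.
    """
    survivors = [
        src for src in candidates
        if has_valid_syntax(src) and passes_existing_tests(src)
    ]
    if not survivors:
        return None
    # Assumed re-ranking rule: majority vote over normalized patches.
    votes = Counter(normalize(src) for src in survivors)
    winner, _ = votes.most_common(1)[0]
    # Return the first candidate that belongs to the winning group.
    return next(src for src in survivors if normalize(src) == winner)


candidates = [
    "def f(x):\n    return x + 1\n",
    "def f(x):\n    return x+1  # fix\n",   # same patch, different spacing
    "def f(x):\n    return x - 1\n",
    "def f(x:\n    return x\n",             # syntax error, filtered out
]
best = select_patch(candidates)
```

Here the first two candidates normalize to the same tree, so they outvote the third, and the fourth never reaches the vote because it fails to parse.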

We evaluate Agentless on the popular SWE-bench Lite benchmark and demonstrate that Agentless not only achieves the highest performance (27.33%) among all open-source approaches, but it does so at a fraction of the cost!

Overall, in an era focused on achieving top placements on leaderboards, our work highlights the overlooked potential of a simplistic, interpretable technique in autonomous software development. We hope Agentless will help reset the baseline, starting point, and horizon for autonomous software agents, and inspire future work along this crucial direction.

Source Code: https://github.com/OpenAutoCoder/Agentless

Generativekartik

3 days ago

Please tell me about this paper

tdene

3 days ago

@Generativekartik
Let me tell you about this paper.

When these researchers started working on this particular method, it lived up to its tagline of "Agentless".
Today, it no longer does. As of July 2024, this method would be considered a great example of an agentic workflow, and could easily be presented as such with some minor changes in tone.

Language evolves unfortunately quickly at times.

AhmedBou

2 days ago

Agentless: Demystifying LLM-based Software Engineering Agents

I hate it when I ask ChatGPT to generate a title for me and it uses the word Demystifying 😅

erinkhoo

2 days ago

Summary:

AGENTLESS is an approach to automatically solve software development problems without using complex autonomous agents or tools.
It uses a simple two-phase process of localization and repair to fix issues in code repositories.
In the localization phase, it hierarchically narrows down to suspicious files, classes/functions, and specific edit locations.
For repair, it generates multiple patch candidates using a simple diff format and filters/ranks them.
AGENTLESS achieved the highest performance (27.33% issues resolved) among open-source approaches on the SWE-bench Lite benchmark.
It had significantly lower cost ($0.34 per issue) compared to agent-based approaches.
AGENTLESS uses GPT-4o as its underlying language model.
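To make the hierarchical localization in the summary concrete, here is a minimal sketch of the three narrowing steps (file, then class/function, then line range). It is only an illustration: Agentless asks an LLM to make each choice, whereas this sketch swaps in a plain keyword-overlap score so that it runs without a model. The names `localize` and `keyword_score` and the toy repository are hypothetical.

```python
import ast
import re


def keyword_score(issue: str, text: str) -> int:
    """Stand-in for an LLM judgment: count identifiers shared with the issue."""
    issue_words = set(re.findall(r"[A-Za-z_]\w+", issue.lower()))
    text_words = set(re.findall(r"[A-Za-z_]\w+", text.lower()))
    return len(issue_words & text_words)


def localize(issue: str, repo: dict, score=keyword_score):
    """Narrow an issue to (file path, definition name, (start, end) lines).

    `repo` maps file paths to Python source text.
    """
    # Step 1: pick the most suspicious file.
    path = max(repo, key=lambda p: score(issue, p + "\n" + repo[p]))
    source = repo[path]
    # Step 2: pick the most suspicious class or function in that file.
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    defs = [n for n in ast.walk(ast.parse(source)) if isinstance(n, kinds)]
    if not defs:
        return path, None, None
    node = max(
        defs,
        key=lambda n: score(issue, ast.get_source_segment(source, n) or n.name),
    )
    # Step 3: report the line range that a repair step would edit.
    return path, node.name, (node.lineno, node.end_lineno)


repo = {
    "calc/area.py": (
        "def circle_area(r):\n    return 3.14 * r\n\n"
        "def square_area(s):\n    return s * s\n"
    ),
    "calc/io.py": "def read_number(prompt):\n    return float(input(prompt))\n",
}
location = localize("circle_area returns wrong value for radius r", repo)
```

Each step only looks inside what the previous step selected, which keeps the amount of code passed to the scorer (or, in the real system, the LLM) small.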

Novel insights:

Simplicity can be effective: AGENTLESS demonstrates that a simple, non-agent approach can outperform complex agent-based systems for software development tasks. This challenges the assumption that increasingly sophisticated agents are necessary.
Cost-efficiency: The dramatically lower cost of AGENTLESS ($0.34 per issue vs $3.34 for some agent approaches) while maintaining high performance is a significant discovery. This suggests potential for more economical automated software development tools.
Complementary to commercial solutions: AGENTLESS was able to fix some unique issues that even top commercial agent-based solutions could not, indicating it can be complementary to existing tools.
Benchmark insights: The detailed analysis of SWE-bench Lite revealed issues with the benchmark itself, including problems with exact patches in descriptions and misleading solutions. This led to the creation of a more rigorous SWE-bench Lite-S subset.
Localization importance: The hierarchical localization approach of AGENTLESS proved highly effective, highlighting the critical role of precise fault localization in automated software repair.
Potential for improvement: The upper bound of 41% solvable issues when considering all generated patches suggests significant room for improvement in patch selection and ranking techniques.

These insights challenge some existing assumptions in the field of automated software development and suggest new directions for research and tool development focused on simplicity, cost-efficiency, and precise localization.
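For a sense of scale, the percentages and per-issue costs quoted above can be turned into absolute numbers. The sketch below assumes SWE-bench Lite contains 300 problems, a figure that is not stated in this thread; the other inputs are the ones quoted in the summary.

```python
LITE_SIZE = 300  # assumption: number of problems in SWE-bench Lite

RESOLVED_RATE = 0.2733      # issues resolved by the submitted patch
UPPER_BOUND_RATE = 0.41     # issues solvable by some generated patch
COST_AGENTLESS = 0.34       # dollars per issue
COST_AGENT = 3.34           # dollars per issue, "some agent approaches"

resolved = round(RESOLVED_RATE * LITE_SIZE)
upper_bound = round(UPPER_BOUND_RATE * LITE_SIZE)
headroom = upper_bound - resolved            # lost to patch selection
cost_ratio = COST_AGENT / COST_AGENTLESS
total_agentless = round(COST_AGENTLESS * LITE_SIZE, 2)
total_agent = round(COST_AGENT * LITE_SIZE, 2)

print(f"resolved: {resolved}, upper bound: {upper_bound}, headroom: {headroom}")
print(f"cost ratio: {cost_ratio:.1f}x")
print(f"full benchmark run: ${total_agentless:.2f} vs ${total_agent:.2f}")
```

Under that assumption, 27.33% corresponds to 82 resolved issues and 41% to 123, so about 41 issues had a correct patch among the candidates that was not the one selected. The cost gap works out to roughly 9.8x per issue.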