Frank Xu

frankxu

AI & ML interests

None yet

frankxu's activity

upvoted a paper 4 days ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published 5 days ago • 42

authored a paper 5 months ago

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23 • 68

updated a dataset 5 months ago

OpenHands/eval-output-webarena

Updated Jul 20 • 20

updated a dataset 7 months ago

OpenHands/eval-output-miniwob

Updated Jun 10 • 15

New activity in OpenHands/evaluation 7 months ago

add webarena and miniwob results

#5 opened 7 months ago

updated a dataset 7 months ago

code-rag-bench/code-retrieval-stackoverflow-small

Viewer • Updated May 13 • 23.4k • 5

authored 6 papers 8 months ago

Active Retrieval Augmented Generation

Paper • 2305.06983 • Published May 11, 2023 • 3

Incorporating External Knowledge through Pre-training for Natural Language to Code Generation

Paper • 2004.09015 • Published Apr 20, 2020 • 1

A Systematic Evaluation of Large Language Models of Code

Paper • 2202.13169 • Published Feb 26, 2022 • 1

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

Paper • 2201.12431 • Published Jan 28, 2022

MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages

Paper • 2203.08388 • Published Mar 16, 2022

DocPrompting: Generating Code by Retrieving the Docs

Paper • 2207.05987 • Published Jul 13, 2022 • 1

authored 2 papers over 1 year ago

WebArena: A Realistic Web Environment for Building Autonomous Agents

Paper • 2307.13854 • Published Jul 25, 2023 • 23

Why do Nearest Neighbor Language Models Work?

Paper • 2301.02828 • Published Jan 7, 2023

updated 2 models over 2 years ago

frankxu/gpt-neo-125M-code

Text Generation • Updated Apr 13, 2022 • 11

frankxu/codelm-playground

Text Generation • Updated Apr 8, 2022 • 12