- Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration
  Paper • 1802.08802 • Published
- Mapping Natural Language Commands to Web Elements
  Paper • 1808.09132 • Published
- Learning to Navigate the Web
  Paper • 1812.09195 • Published
- Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations
  Paper • 1909.00031 • Published
Collections including paper arxiv:2404.07972
- More Agents Is All You Need
  Paper • 2402.05120 • Published • 51
- OS-Copilot: Towards Generalist Computer Agents with Self-Improvement
  Paper • 2402.07456 • Published • 41
- Generative Agents: Interactive Simulacra of Human Behavior
  Paper • 2304.03442 • Published • 11
- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
  Paper • 2310.04406 • Published • 8
- Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots
  Paper • 2405.07990 • Published • 16
- Large Language Models as Planning Domain Generators
  Paper • 2405.06650 • Published • 9
- AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation
  Paper • 2404.12753 • Published • 41
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
  Paper • 2404.07972 • Published • 43
- Octopus v2: On-device language model for super agent
  Paper • 2404.01744 • Published • 56
- Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
  Paper • 2404.05719 • Published • 80
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
  Paper • 2404.07972 • Published • 43
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 53
- AQuA: A Benchmarking Tool for Label Quality Assessment
  Paper • 2306.09467 • Published • 1
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
  Paper • 2404.07972 • Published • 43
- BLINK: Multimodal Large Language Models Can See but Not Perceive
  Paper • 2404.12390 • Published • 24
- Vision language models are blind
  Paper • 2407.06581 • Published • 82
- Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss
  Paper • 2404.02731 • Published • 1
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
  Paper • 2309.12284 • Published • 18
- RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis
  Paper • 2404.03204 • Published • 7
- Adapting LLaMA Decoder to Vision Transformer
  Paper • 2404.06773 • Published • 17
- Communicative Agents for Software Development
  Paper • 2307.07924 • Published • 2
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 35
- ReAct: Synergizing Reasoning and Acting in Language Models
  Paper • 2210.03629 • Published • 14