Papers - HAIO - a linxule Collection

updated Nov 2, 2024

For Human and Artificial Intelligence in Organizations

SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning

Paper • 2409.05556 • Published Sep 9, 2024 • 2
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6, 2024 • 46
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?

Paper • 2409.15277 • Published Sep 23, 2024 • 36
Learning Task Decomposition to Assist Humans in Competitive Programming

Paper • 2406.04604 • Published Jun 7, 2024 • 4
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published Aug 28, 2024 • 35
Scaling Synthetic Data Creation with 1,000,000,000 Personas

Paper • 2406.20094 • Published Jun 28, 2024 • 97
BERTopic: Neural topic modeling with a class-based TF-IDF procedure

Paper • 2203.05794 • Published Mar 11, 2022 • 1
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published Oct 28, 2024 • 30
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 48
JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 46
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published Oct 14, 2024 • 50
Agent-as-a-Judge: Evaluate Agents with Agents

Paper • 2410.10934 • Published Oct 14, 2024 • 19
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for Superior Planning and Decision-Making

Paper • 2409.16686 • Published Sep 25, 2024 • 10
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Paper • 2310.03714 • Published Oct 5, 2023 • 34
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

Paper • 2410.05080 • Published Oct 7, 2024 • 21
Benchmarking Agentic Workflow Generation

Paper • 2410.07869 • Published Oct 10, 2024 • 26
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15, 2024 • 39
The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends

Paper • 2409.14195 • Published Sep 21, 2024 • 13
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study

Paper • 2409.17580 • Published Sep 26, 2024 • 9
Style over Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Paper • 2409.15268 • Published Sep 23, 2024 • 13
Making Text Embedders Few-Shot Learners

Paper • 2409.15700 • Published Sep 24, 2024 • 30
LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench

Paper • 2409.13373 • Published Sep 20, 2024 • 3
Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs

Paper • 2408.00114 • Published Jul 31, 2024
LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18, 2024 • 32
Re-Reading Improves Reasoning in Language Models

Paper • 2309.06275 • Published Sep 12, 2023 • 3
LLMs Will Always Hallucinate, and We Need to Live With This

Paper • 2409.05746 • Published Sep 9, 2024 • 3
Planning In Natural Language Improves LLM Search For Code Generation

Paper • 2409.03733 • Published Sep 5, 2024
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10, 2024 • 66
Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation

Paper • 2409.03271 • Published Sep 5, 2024 • 2
Graph Retrieval-Augmented Generation: A Survey

Paper • 2408.08921 • Published Aug 15, 2024 • 4
Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

Paper • 2408.08946 • Published Aug 16, 2024 • 12
Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks

Paper • 2410.24032 • Published Oct 31, 2024 • 10
AAAR-1.0: Assessing AI's Potential to Assist Research

Paper • 2410.22394 • Published Oct 29, 2024 • 16