---
title: AgentReview
emoji: ๐
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.4.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: EMNLP 2024
---
# AgentReview
Official implementation for the [EMNLP 2024](https://2024.emnlp.org/) main track (Oral) paper -- **[AgentReview: Exploring Peer Review Dynamics with LLM Agents](https://arxiv.org/abs/2406.12708)**
[Demo](https://huggingface.co/spaces/Ahren09/AgentReview) | [Website](https://agentreview.github.io/) | [Paper](https://aclanthology.org/2024.emnlp-main.70/) | [arXiv](https://arxiv.org/abs/2406.12708) | [Code](https://github.com/Ahren09/AgentReview)
```bibtex
@inproceedings{jin2024agentreview,
title={AgentReview: Exploring Peer Review Dynamics with LLM Agents},
author={Jin, Yiqiao and Zhao, Qinlin and Wang, Yiyang and Chen, Hao and Zhu, Kaijie and Xiao, Yijia and Wang, Jindong},
booktitle={EMNLP},
year={2024}
}
```
<img src="static/img/Overview.png">
---
## Introduction
AgentReview is a pioneering large language model (LLM)-based framework for simulating peer review processes, developed to analyze and address the complex, multivariate factors influencing review outcomes. Unlike traditional statistical methods, AgentReview captures latent variables while respecting the privacy of sensitive peer review data.
### Academic Abstract
Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analyses often rely on exploration and statistics of existing peer review data, which do not adequately address the multivariate nature of the process, account for the latent variables, and are further constrained by privacy concerns due to the sensitive nature of the data. We introduce AgentReview, the first large language model (LLM) based peer review simulation framework, which effectively disentangles the impacts of multiple latent factors and addresses the privacy issue. Our study reveals significant insights, including a notable 37.1% variation in paper decisions due to reviewers' biases, supported by sociological theories such as the social influence theory, altruism fatigue, and authority bias. We believe that this study could offer valuable insights to improve the design of peer review mechanisms.
![Review Stage Design](static/img/ReviewPipeline.png)
## Getting Started
### Installation
**Download the data**
Download both zip files in this [Dropbox](https://www.dropbox.com/scl/fo/etzu5h8kwrx8vrcaep9tt/ALCnxFt2cT9aF477d-h1-E8?rlkey=9r5ep9psp8u4yaxxo9caf5nnc&st=aymhgu32&dl=0):
Unzip [AgentReview_Paper_Data.zip](https://www.dropbox.com/scl/fi/l17brtbzsy3xwflqd58ja/AgentReview_Paper_Data.zip?rlkey=vldiexmgzi7zycmz7pumgbooc&st=b6g3nkry&dl=0) under `data/`, which contains:
1. The PDF versions of the papers
2. The real-world peer reviews for ICLR 2020 - 2023
```bash
unzip AgentReview_Paper_Data.zip -d data/
```
(Optional) Unzip [AgentReview_LLM_Reviews.zip](https://www.dropbox.com/scl/fi/ckr0hpxyedx8u9s6235y6/AgentReview_LLM_Reviews.zip?rlkey=cgexir5xu38tm79eiph8ulbkq&st=q23x2trr&dl=0) under `outputs/`, which contains our LLM-generated review dataset:
```bash
unzip AgentReview_LLM_Reviews.zip -d outputs/
```
**Install Required Packages**:
```bash
cd AgentReview/
pip install -r requirements.txt
```
**Set environment variables**
If you use the OpenAI API, set `OPENAI_API_KEY`:
```bash
export OPENAI_API_KEY=... # Format: sk-...
```
If you use the Azure OpenAI API, set the following:
```bash
export AZURE_ENDPOINT=... # Format: https://<your-endpoint>.openai.azure.com/
export AZURE_DEPLOYMENT=... # Your Azure OpenAI deployment here
export AZURE_OPENAI_KEY=... # Your Azure OpenAI key here
```
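Before launching a run, it can help to confirm that one complete set of credentials is present. The following is a minimal sketch, not part of the AgentReview codebase: the variable names come from the instructions above, while `detect_backend` and `missing_vars` are hypothetical helpers introduced only for illustration.

```python
import os

# Variable names are taken from the setup instructions above.
# The helper functions below are hypothetical, not part of AgentReview.
OPENAI_VARS = ("OPENAI_API_KEY",)
AZURE_VARS = ("AZURE_ENDPOINT", "AZURE_DEPLOYMENT", "AZURE_OPENAI_KEY")


def detect_backend(env=None):
    """Return "azure" or "openai" if a complete set of variables is set, else None."""
    env = os.environ if env is None else env
    if all(env.get(name) for name in AZURE_VARS):
        return "azure"
    if all(env.get(name) for name in OPENAI_VARS):
        return "openai"
    return None


def missing_vars(names, env=None):
    """List the variables in `names` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    backend = detect_backend()
    if backend is None:
        print("No complete credential set found.")
        print("Missing for OpenAI:", missing_vars(OPENAI_VARS))
        print("Missing for Azure OpenAI:", missing_vars(AZURE_VARS))
    else:
        print("Detected backend:", backend)
```

Checking for the Azure variables first is an arbitrary choice in this sketch; swap the order if you want the OpenAI key to take precedence when both are set.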
**Running the Project**
Set the environment variables in `run.sh` and run it:
```bash
bash run.sh
```
**Note: all project files should be run from the `AgentReview` directory.**
**Demo**
A demo can be found in `notebooks/demo.ipynb`
### Customizing your own environment
You can add a new setting in `agentreview/experiment_config.py`, then add the setting as a new entry to the `all_settings` dictionary:
```python
all_settings = {
    "BASELINE": baseline_setting,
    "benign_Rx1": benign_Rx1_setting,
    # ... other existing settings ...
    "your_setting_name": your_setting,
}
```
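When selecting a setting by name, a small lookup wrapper can turn a typo into a readable error. This is only a sketch: the setting values here are empty placeholders because the real structure lives in `agentreview/experiment_config.py`, and `get_setting` is a hypothetical helper, not a function provided by the project.

```python
# Placeholder values; the real setting objects are defined in
# agentreview/experiment_config.py and their structure is not shown here.
baseline_setting = {}
your_setting = {}

all_settings = {
    "BASELINE": baseline_setting,
    "your_setting_name": your_setting,
}


def get_setting(name):
    """Hypothetical helper: look up a setting, listing valid names on failure."""
    try:
        return all_settings[name]
    except KeyError:
        valid = ", ".join(sorted(all_settings))
        raise KeyError(f"Unknown setting {name!r}; expected one of: {valid}") from None
```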
## Framework Overview
### Stage Design
Our simulation adopts a structured, 5-phase pipeline:
* **Phase I. Reviewer Assessment.** Each manuscript is evaluated by three reviewers independently.
* **Phase II. Author-Reviewer Discussion.** Authors submit rebuttals to address reviewers' concerns.
* **Phase III. Reviewer-AC Discussion.** The AC facilitates discussions among reviewers, prompting updates to their initial assessments.
* **Phase IV. Meta-Review Compilation.** The AC synthesizes the discussions into a meta-review.
* **Phase V. Paper Decision.** The AC makes the final decision on whether to accept or reject the paper, based on all gathered inputs.
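
The ordering of the five phases can be sketched as a simple enumeration. This is an illustrative model only, assuming the phases run strictly in sequence; the `Phase` enum and `next_phase` helper are hypothetical names and do not reflect how the AgentReview code represents stages internally.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five phases listed above, numbered in execution order."""
    REVIEWER_ASSESSMENT = 1
    AUTHOR_REVIEWER_DISCUSSION = 2
    REVIEWER_AC_DISCUSSION = 3
    META_REVIEW_COMPILATION = 4
    PAPER_DECISION = 5


def next_phase(phase):
    """Return the phase that follows `phase`, or None after the final decision."""
    if phase < Phase.PAPER_DECISION:
        return Phase(phase + 1)
    return None


if __name__ == "__main__":
    # Walk the pipeline from the first phase to the last.
    current = Phase.REVIEWER_ASSESSMENT
    while current is not None:
        print(f"Phase {current.value}: {current.name}")
        current = next_phase(current)
```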
## Note
- We use a fixed acceptance rate of 32%, corresponding to the actual acceptance rate of ICLR 2020 -- 2023. See [Conference Acceptance Rates](https://github.com/lixin4ever/Conference-Acceptance-Rate) for more information.
- The API may apply strict content filtering to some requests. You may need to adjust the content filtering settings to get the desired results.
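
To make the fixed 32% acceptance rate concrete, the sketch below accepts the top 32% of papers ranked by a numeric score. This is an assumption for illustration: the note above states only the rate, not how it is enforced, and the `decide` function, the score values, and the ranking rule are all hypothetical.

```python
import math

ACCEPTANCE_RATE = 0.32  # fixed rate stated in the note above


def decide(scores, rate=ACCEPTANCE_RATE):
    """Map paper id -> "Accept"/"Reject", accepting the top `rate` fraction by score.

    Assumption: papers are ranked by a single numeric score. The actual
    decision logic in AgentReview may differ.
    """
    n_accept = math.floor(len(scores) * rate)
    ranked = sorted(scores, key=scores.get, reverse=True)
    accepted = set(ranked[:n_accept])
    return {pid: ("Accept" if pid in accepted else "Reject") for pid in scores}


if __name__ == "__main__":
    # 100 made-up papers whose score equals their id.
    example_scores = {paper_id: float(paper_id) for paper_id in range(100)}
    decisions = decide(example_scores)
    print(sum(d == "Accept" for d in decisions.values()), "papers accepted")
```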
## License
This project is licensed under the Apache-2.0 License.
## Acknowledgements
The implementation is partially based on the [chatarena](https://github.com/Farama-Foundation/chatarena) framework.