library_name: transformers
tags: []
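| Setting | Model | Commonsense (Micro) | Commonsense (Macro) | Hard (Micro) | Hard (Macro) | Final Pass Rate |
|---|---|---|---|---|---|---|
| Direct Prompting | Llama3.1-8B | 60.1 | 0.0 | 7.9 | 2.8 | 0.0 |
| Direct Prompting | Qwen2-7B | 49.9 | 1.1 | 2.1 | 0.0 | 0.0 |
| Fine-tuning | Llama3.1-8B | 78.3 | 17.8 | 19.3 | 6.1 | 3.8 |
| Fine-tuning | Qwen2-7B | 59.0 | 0.6 | 0.2 | 0.0 | 0.0 |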

If you find our resources useful in your research, please cite our paper:

@article{xie2024revealing,
  title={Revealing the Barriers of Language Agents in Planning},
  author={Xie, Jian and Zhang, Kexun and Chen, Jiangjie and Yuan, Siyu and Zhang, Kai and Zhang, Yikai and Li, Lei and Xiao, Yanghua},
  journal={arXiv preprint arXiv:2410.12409},
  year={2024}
}