Ludwig Stumpp committed on
Commit
72edf21
1 Parent(s): b7e4ee9

Remove CodeT results for code-davinci-002, as they are not comparable to other HumanEval results due to additional explicit testing of outputs

Files changed (1)
  1. README.md +0 -1
README.md CHANGED
@@ -18,7 +18,6 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
  | [chatglm-6b](https://chatglm.cn/blog) | ChatGLM | yes | [985](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
  | [chinchilla-70b](https://arxiv.org/abs/2203.15556v1) | DeepMind | no | | | [0.808](https://arxiv.org/abs/2203.15556v1) | | | [0.774](https://arxiv.org/abs/2203.15556v1) | | | [0.675](https://arxiv.org/abs/2203.15556v1) | | | [0.749](https://arxiv.org/abs/2203.15556v1) | | |
  | [codex-12b / code-cushman-001](https://arxiv.org/abs/2107.03374) | OpenAI | no | | | | | [0.317](https://crfm.stanford.edu/helm/latest/?group=targeted_evaluations) | | | | | | | | | |
- | [code-davinci-002](https://arxiv.org/abs/2207.10397v2) | OpenAI | no | | | | | [0.658](https://arxiv.org/abs/2207.10397v2) | | | | | | | | | |
  | [codegen-16B-mono](https://huggingface.co/Salesforce/codegen-16B-mono) | Salesforce | yes | | | | | [0.293](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
  | [codegen-16B-multi](https://huggingface.co/Salesforce/codegen-16B-multi) | Salesforce | yes | | | | | [0.183](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
  | [codegx-13b](http://keg.cs.tsinghua.edu.cn/codegeex/) | Tsinghua University | no | | | | | [0.229](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
 