OpsEval/data_v2/huaweicloud_en_mc_gen.csv
update leaderboard 2025-02-27
name,zero_naive,zero_self_con,zero_cot,zero_cot_self_con,few_naive,few_self_con,few_cot,few_cot_self_con
GPT-3.5-turbo,40.0,40.0,,,55.0,55.0,,
GPT-4,40.0,40.0,,,55.0,55.0,,
JIUTIAN-75B-net,45.0,45.0,60.0,60.0,60.0,60.0,70.0,70.0
Deepseek-R1-Distill-Llama-8B,7.5,7.5,32.5,32.5,22.5,22.5,27.5,27.5
Deepseek-R1-Distill-Qwen-1.5B,10.0,10.0,12.5,12.5,12.5,12.5,22.5,22.5
Deepseek-R1-Distill-Qwen-14B,27.5,27.5,,,25.0,25.0,,
Deepseek-R1-Distill-Qwen-32B,42.5,42.5,,,30.0,30.0,,
Deepseek-R1-Distill-Qwen-7B,10.0,10.0,17.5,17.5,22.5,22.5,22.5,22.5