Spaces: bigcode/bigcodebench-leaderboard

Terry Zhuo committed on Jul 15, 2024

Commit

2a1a6c1

1 Parent(s): 7a7f67a

big update

Files changed (1)

app.py CHANGED

@@ -350,9 +350,9 @@ with main_block as demo:
                 gr.Markdown(
                     """
                 **Notes:**
-                - _Hard_ vs _Full_:
-                    - <u>Hard</u>: A subset of ~150 BigCodeBench tasks which is more user-facing and challenging.
-                    - <u>Full</u>: The full set of 1140 BigCodeBench tasks.
+                - _Hard Set_ vs _Full Set_:
+                    - <u>Hard Set</u>: A subset of ~150 BigCodeBench tasks which is more user-facing and challenging.
+                    - <u>Full Set</u>: The full set of 1140 BigCodeBench tasks.
                 - _Complete_ vs _Instruct_:
                     - <u>Complete</u>: Code Completion based on the (verbose) structured docstring. This split tests if the models are good at coding.
                     - <u>Instruct</u> (🔥Vibe Check🔥): Code Generation based on the (less verbose) NL-oriented instructions. This split tests if the models are really capable enough to understand human intents to code.