whoisjones's Collections
MastermindEval
Updated 8 days ago
Evaluating the reasoning capabilities of LLMs using the game of Mastermind (paper forthcoming).
whoisjones/mastermind_35_random • Updated Dec 18, 2024 • 37.1k • 56
whoisjones/mastermind_46_random • Updated Dec 18, 2024 • 36.1k • 55
whoisjones/mastermind_46_close • Updated Dec 18, 2024 • 36.1k • 62
whoisjones/mastermind_24_random • Updated Dec 18, 2024 • 30.4k • 54
whoisjones/mastermind_24_close • Updated Dec 18, 2024 • 30.4k • 61
whoisjones/mastermind_35_close • Updated Dec 18, 2024 • 37.1k • 51
whoisjones/mastermind_35 • Updated Dec 5, 2024 • 37.1k • 45
whoisjones/mastermind_46 • Updated Dec 5, 2024 • 36.1k • 44
whoisjones/mastermind_24 • Updated Dec 5, 2024 • 30.4k • 49
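
For readers unfamiliar with the game named in the collection description, the feedback rule of classic Mastermind can be sketched as follows. This is only an illustration of the standard game rule, not necessarily the scoring or data format used by MastermindEval. The function name `mastermind_feedback` and the encoding of colors as integers are assumptions made for this example.

```python
from collections import Counter


def mastermind_feedback(secret, guess):
    """Return (exact, partial) feedback for a guess in classic Mastermind.

    exact   = pegs with the right color in the right position
    partial = pegs with a right color in the wrong position
    Hypothetical helper for illustration; colors are arbitrary hashable values.
    """
    if len(secret) != len(guess):
        raise ValueError("secret and guess must have the same length")

    # Count positions where the guess matches the secret exactly.
    exact = sum(s == g for s, g in zip(secret, guess))

    # Multiset intersection gives the total color overlap, ignoring position.
    overlap = sum((Counter(secret) & Counter(guess)).values())

    # Colors that overlap but are not exact matches are misplaced.
    return exact, overlap - exact


if __name__ == "__main__":
    print(mastermind_feedback([1, 2, 3, 4], [1, 3, 5, 5]))
```

Using a multiset intersection avoids double-counting a color that appears more often in the guess than in the secret.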