AI & ML interests
Benchmarking
Organization Card
We follow https://arxiv.org/abs/2412.09385 to implement LLM peer review, with the goal of building a new benchmark.
models
None public yet
datasets
None public yet