Clémentine Fourrier

clefourrier

http://clefourrier.github.io

AI & ML interests

None yet

Recent Activity

upvoted an article about 21 hours ago

🌁#81: Key AI Concepts to Follow in 2025

upvoted an article about 21 hours ago

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

upvoted an article 3 days ago

Fine-tune a SmolLM on domain-specific synthetic data from a LLM

Articles

CO₂ Emissions and Models Performance: Insights from the Open LLM Leaderboard

Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard

Introduction to the Open Leaderboard for Japanese LLMs

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Judge Arena: Benchmarking LLMs as Evaluators

Introducing the Open FinLLM Leaderboard

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Let's talk about LLM evaluation

Introducing the Open Arabic LLM Leaderboard

Introducing the Open Leaderboard for Hebrew LLMs!

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Improving Prompt Consistency with Structured Generations

Introducing the Open Chain of Thought Leaderboard

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Introducing the Chatbot Guardrails Arena

Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

Introducing the Red-Teaming Resistance Leaderboard

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

2023, year of open LLMs

Open LLM Leaderboard: DROP deep dive

Overview of natively supported quantization schemes in 🤗 Transformers

What's going on with the Open LLM Leaderboard?

Introduction to Graph Machine Learning

clefourrier's activity

New activity in open-llm-leaderboard/open_llm_leaderboard 5 days ago

need help

#1061 opened 5 days ago by

Open LLM Leaderboard Results Dataset

#1046 opened 22 days ago by

Result PRs not appearing

#1054 opened 14 days ago by

sometimesanotion

[Feature] Remove "model voting"

#1059 opened 5 days ago by

i sincerely dont mean to sound impatient or entitled...

#1060 opened 5 days ago by

⌛ Too long a process of evolution

#1058 opened 7 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 8 days ago

Carbon Dioxide Emissions

#1057 opened 8 days ago by

New activity in gaia-benchmark/leaderboard 8 days ago

Leaderboard is down again

#26 opened 14 days ago by

New activity in open-llm-leaderboard/comparator 9 days ago

🚩 Report: Not working

#2 opened 9 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 9 days ago

[Bug] Got a NoneType error after submitting FreedomIntelligence/HuatuoGPT-o1-8B

#1055 opened 9 days ago by

[Bug] Setting model parameter bounds equal to each other shows no results

#1056 opened 9 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 11 days ago

14B model detected as 7B

#1049 opened 19 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 13 days ago

Wrong number of parameters reported in Open LLM Leaderboard

#1053 opened 14 days ago by

New activity in gaia-benchmark/leaderboard 16 days ago

leaderboard down

#25 opened 21 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 16 days ago

Requested feature: searchable tags for official supported languages for each model.

#1035 opened about 1 month ago by

Bug report? Parameters of the model

#1052 opened 16 days ago by

Suggestion: Adding outlier-resistant averaging methods

#968 opened 3 months ago by

Human Performance row

#1050 opened 18 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 23 days ago

Proposal for new column

#1032 opened about 1 month ago by

Llama-3.1 70B Math Hard doesn't match its dataset

#1041 opened 27 days ago by