Blog, Articles, and discussions

Introducing smolagents: simple agents that write actions in code.

By December 31, 2024 • 142

Community Articles

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

about 21 hours ago • 19

Finetuning Falcon 7b in a hybrid distributed fashion

3 days ago • 3

Debate Championship for LLMs

4 days ago • 4

Fine-tune ModernBERT for text classification using synthetic data

4 days ago • 17

🦸🏻#2: Your Go-To Vocabulary to Navigate the World of AI Agents and Agentic Workflows

6 days ago • 8

Unlocking the Power of Reasoning: Introducing CriticalThinker-LLaMA-3.1-8B-GGUF and Its Groundbreaking Dataset

7 days ago

🦸🏻#1: Open-endedness and AI Agents – A Path from Generative to Creative AI?

9 days ago • 5

Deriving DPO's Loss

10 days ago • 22

🌁#81: Key AI Concepts to Follow in 2025

11 days ago • 18

Introducing KaibanJS v0.13.0: Structured Output for Smarter Workflows

11 days ago

FineWeb2-C: Help Build Better Language Models in Your Language

11 days ago • 10

Tags generation dataset 🧠

14 days ago • 3

AI Agents in Action: Managing GitHub Issues with KaibanJS

14 days ago

Intelligence Potentiation: An Evolutionary Perspective on AI Agent Designs

15 days ago • 3

MINERVA: A Multi-Agent LLM System for Digital Scam Protection

15 days ago • 1

Mastering Iterative Prompting for Optimized AI Code Generation

16 days ago • 1

SILMA RAGQA V1.0: A Comprehensive Benchmark for Evaluating LLMs on RAG QA Use-Cases

16 days ago • 1

How to train a new language model from scratch using Transformers and Tokenizers

By February 14, 2020 • 25

Community Articles

Fine-tune a SmolLM on domain-specific synthetic data from a LLM

about 1 hour ago

✴️ ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use

about 2 hours ago • 4

Process Reinforcement through Implicit Rewards

about 13 hours ago • 2

