
Loubna Ben Allal

loubnabnl

https://loubnabnl.github.io/

AI & ML interests

LLMs, ML for code, Synthetic data

Articles

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 267

CodeGemma - an official Google release for code LLMs

Apr 9

• 99

Cosmopedia: how to create large-scale synthetic data for pre-training Large Language Models

Mar 20

• 66

loubnabnl's activity

updated a dataset about 1 hour ago

HuggingFaceTB/smoltalk

Viewer • Updated about 1 hour ago • 2.2M • 717 • 117

updated a Space about 21 hours ago

HuggingFaceTB/README • Running

posted an update about 21 hours ago

Post

462

Making SmolLM2 reproducible: open-sourcing our training & evaluation toolkit 🛠️ https://github.com/huggingface/smollm/

- Pre-training code with nanotron
- Evaluation suite with lighteval
- Synthetic data generation using distilabel (powers our new SFT dataset HuggingFaceTB/smoltalk)
- Post-training scripts with TRL & the alignment handbook
- On-device tools with llama.cpp for summarization, rewriting & agents

Apache 2.0 licensed. V2 pre-training data mix coming soon!

Which other tools should we add next?

updated 4 models about 22 hours ago

HuggingFaceTB/SmolLM2-360M-Instruct

Text Generation • Updated about 22 hours ago • 32.4k • 54

HuggingFaceTB/SmolLM2-360M

Text Generation • Updated about 22 hours ago • 7.34k • 24

HuggingFaceTB/SmolLM2-1.7B

Text Generation • Updated about 22 hours ago • 13.3k • 73

HuggingFaceTB/SmolLM2-1.7B-Instruct

Text Generation • Updated about 22 hours ago • 77.9k • 360

Reacted to prithivMLmods's post with 🔥 1 day ago

Post

2435

Weekend Dribble 📦🍺

Adapters for Product Ad Backdrops, Smooth Polaroids, Minimalist Sketch cards, Super Blends!!

🤏Demo on: prithivMLmods/FLUX-LoRA-DLC

Stranger Zones :
👉🏼{ Super Blend } : strangerzonehf/Flux-Super-Blend-LoRA

👉🏼{ Product Concept Ad } : prithivMLmods/Flux-Product-Ad-Backdrop
👉🏼{ Frosted Mock-ups } : prithivMLmods/Flux.1-Dev-Frosted-Container-LoRA
👉🏼{ Polaroid Plus } : prithivMLmods/Flux-Polaroid-Plus
👉🏼{Sketch Cards} : prithivMLmods/Flux.1-Dev-Sketch-Card-LoRA

👉Stranger Zone: https://huggingface.co/strangerzonehf

👉Flux LoRA Collections: prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

.
.
.
@prithivMLmods 🤗

Reacted to merve's post with ❤️🚀 1 day ago

Post

2536

your hugging face profile now has your recent activities 🤗

updated 2 models 1 day ago

HuggingFaceTB/SmolLM2-135M-Instruct

Text Generation • Updated 1 day ago • 19.6k • 60

HuggingFaceTB/SmolLM2-135M

Text Generation • Updated 1 day ago • 28k • 30

New activity in HuggingFaceTB/SmolLM2-135M 1 day ago

About Instruction model evaluation

#1 opened 24 days ago by ldwang

What MMLU (cloze) used for evaluation.

#2 opened 12 days ago by akhilfau

New activity in HuggingFaceTB/SmolLM2-135M-Instruct 1 day ago

add base model metadata

#4 opened 11 days ago by davanstrien

New activity in HuggingFaceTB/SmolLM2-360M 3 days ago

Reproducing Evaluation with lighteval

#1 opened 7 days ago by PatrickHaller

New activity in HuggingFaceTB/SmolLM2-360M-Instruct 3 days ago

finetuning

#2 opened 25 days ago by HassanStar

New activity in HuggingFaceTB/SmolLM2-1.7B-Instruct 3 days ago

What were the training datasets for this model?

#14 opened 17 days ago by Weyaxi

scale-up? or full datasets list?

#11 opened 19 days ago by lucyknada

Reacted to merve's post with 🔥 3 days ago

Post

2213

What a week! A recap for everything you missed ❄️
merve/nov-22-releases-673fbbcfc1c97c4f411def07
Multimodal ✨
> Mistral AI released Pixtral 124B, a gigantic open vision language model
> Llava-CoT (formerly known as Llava-o1) was released, a multimodal reproduction of o1 model by PKU
> OpenGVLab released MMPR: a new multimodal reasoning dataset
> Jina has released Jina-CLIP-v2 0.98B multilingual multimodal embeddings
> Apple released new SotA vision encoders AIMv2

LLMs 🦙
> AllenAI dropped a huge release of models, datasets and scripts for Tülu, a family of models based on Llama 3.1 aligned with SFT, DPO and a new technique they have developed called RLVR
> Jina has released embeddings-v3: new multilingual embeddings with longer context
> Hugging Face released SmolTalk: synthetic dataset used to align SmolLM2 using supervised fine-tuning
> Microsoft released orca-agentinstruct-1M-v1: a gigantic instruction dataset of 1M synthetic instruction pairs

Image Generation 🖼️
> Black Forest Labs released FLUX.1 Tools: four new models for different image modifications and two LoRAs to do image conditioning and better steer generations

Lastly, Hugging Face released a new library, Observers: a lightweight SDK for monitoring interactions with AI APIs and easily storing and browsing them 📚
$ pip install observers

2 replies

Articles

StarCoder2 and The Stack v2

Code Llama: Llama 2 learns to code

StarCoder: A State-of-the-Art LLM for Code

How to train a Language Model with Megatron-LM
