Marwa El Kamil

maghwa

AI & ML interests

None yet

Recent Activity

published a dataset 25 days ago

maghwa/tts_darija_sample_8

published a dataset 25 days ago

maghwa/coco-darija-test-20

updated a dataset 25 days ago

maghwa/coco-darija-test

maghwa's activity

upvoted 2 collections 2 months ago

PaliGemma FT Models

108 items • Updated Dec 13, 2024 • 31

Preference Datasets for DPO

This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024 • 40

upvoted an article 3 months ago

Article

Finding Moroccan Arabic (Darija) in Fineweb 2

By

and 3 others •

Dec 8, 2024

• 22

upvoted an article 5 months ago

Article

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Apr 19, 2024

• 137

upvoted a collection 6 months ago

Arabic Aya DPO Datasets

Our synthetic DPO datasets for Arabic Aya. • 5 items • Updated Jun 4, 2024 • 4

upvoted a paper 8 months ago

101 Billion Arabic Words Dataset

Paper • 2405.01590 • Published Apr 29, 2024 • 5

upvoted an article 8 months ago

Article

Tokenization Is A Dead Weight (Tokun Part 1)

Jun 27, 2024

• 17

upvoted 2 papers 9 months ago

Tokenization Falling Short: The Curse of Tokenization

Paper • 2406.11687 • Published Jun 17, 2024 • 16

CroissantLLM: A Truly Bilingual French-English Language Model

Paper • 2402.00786 • Published Feb 1, 2024 • 26

upvoted an article 9 months ago

Article

🥐CroissantLLM: A Truly Bilingual French-English Language Model

Feb 5, 2024

• 11

upvoted a collection 9 months ago

FrenchBench Evaluation datasets

These datasets are used to evaluate models on French performance using: https://github.com/EleutherAI/lm-evaluation-harness (from CroissantLLM paper) • 11 items • Updated Jun 7, 2024 • 7

upvoted an article 9 months ago

Article

Introducing the Open Arabic LLM Leaderboard

May 14, 2024

• 82

upvoted a paper 9 months ago

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 33

upvoted an article 9 months ago

Article

Let's talk about LLM evaluation

May 23, 2024

• 157

upvoted an article 10 months ago

Article

Text2SQL using Hugging Face Dataset Viewer API and Motherduck DuckDB-NSQL-7B

Apr 4, 2024

• 26

upvoted a paper 11 months ago

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17, 2024 • 45

upvoted 2 papers about 1 year ago

BloombergGPT: A Large Language Model for Finance

Paper • 2303.17564 • Published Mar 30, 2023 • 22

Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 46