Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Organizations

Posts 1

Post

1770

Wow, impressive 340B model by NVIDIA with a nice permissive license! 🚀 The technical report is full of insights, and it seems to use a learning rate schedule other than cosine, probably a variant of WSD (Warmup-Stable-Decay). Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911
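
For readers unfamiliar with the two schedules the post contrasts, here is a minimal sketch in Python. It is illustrative only: the function names, the linear decay shape, and every hyperparameter value are assumptions made for this example, and it is not the schedule from the Nemotron-4 technical report, which the post itself only guesses at.

```python
import math


def cosine_lr(step, total_steps, peak_lr, warmup_steps, min_lr=0.0):
    """Linear warmup, then cosine decay from peak_lr down to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


def wsd_lr(step, total_steps, peak_lr, warmup_steps, decay_steps, min_lr=0.0):
    """Warmup-Stable-Decay: linear warmup, constant plateau, final decay.

    The decay here is linear (an assumption); published WSD variants use
    other decay shapes as well.
    """
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    decay_start = total_steps - decay_steps
    if step < decay_start:
        return peak_lr  # stable phase: learning rate stays at its peak
    progress = min(1.0, (step - decay_start) / max(1, decay_steps))
    return peak_lr + (min_lr - peak_lr) * progress


if __name__ == "__main__":
    # Hypothetical settings, chosen only to make the shapes easy to compare.
    total, peak, warmup, decay = 1000, 1e-3, 100, 200
    for s in (0, 100, 500, 800, 900, 1000):
        print(s, cosine_lr(s, total, peak, warmup), wsd_lr(s, total, peak, warmup, decay))
```

The practical difference is that the cosine curve depends on the total step count from the first step after warmup, while the WSD plateau does not, so a run can be extended or branched into a decay phase from any plateau checkpoint.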

Articles 5

Article

192

Open R1: Update #2

Collections 2

Papers 3

arxiv:2502.02737

arxiv:2412.01152

arxiv:2405.18392

Models 12

eliebak/SmolLM-360M-Instruct-Q8_0-GGUF

Updated Aug 13, 2024 • 11

eliebak/the-tokenizer-v1.5

Updated Jul 4, 2024

eliebak/the-tokenizer-v2

Updated Jun 17, 2024

eliebak/wsd_124M_300B_fw

Text Generation • Updated Jun 11, 2024 • 64

eliebak/wsd_124M_300B_edu

Text Generation • Updated Jun 11, 2024 • 67

eliebak/wsd_124M_150B_edu

Text Generation • Updated Jun 11, 2024 • 64

eliebak/wsd_124M_150B_fw

Text Generation • Updated Jun 11, 2024 • 63

eliebak/cos_124M_150B_fw

Text Generation • Updated Jun 9, 2024 • 49

eliebak/cos_124M_150B_edu

Text Generation • Updated Jun 9, 2024 • 44

eliebak/debug-cos-100B

Text Generation • Updated Jun 8, 2024 • 43

Datasets 3

eliebak/very-smollm-corpus

Viewer • Updated Sep 9, 2024 • 4.58M • 32 • 2

eliebak/Buzz_wo_chatml_format

Viewer • Updated Jun 25, 2024 • 31.2M • 296 • 1

eliebak/Buzz_chatml_format

Viewer • Updated Jun 15, 2024 • 31.2M • 439