Yoshi Suhara

suhara

https://yoshi-suhara.com/

AI & ML interests

None yet

Recent Activity

updated a collection 11 days ago

Minitron

suhara's activity

updated a collection 11 days ago

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 11 days ago • 59

liked a model 26 days ago

nvidia/Hymba-1.5B-Instruct

Text Generation • Updated 4 days ago • 13.7k • 212

upvoted a paper about 1 month ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20 • 38

authored a paper about 1 month ago

Hymba: A Hybrid-head Architecture for Small Language Models

Paper • 2411.13676 • Published Nov 20 • 38

New activity in nvidia/Mistral-NeMo-Minitron-8B-Instruct 2 months ago

Stop token is missing in tokenizer vocab

#3 opened 2 months ago by

armin-cpl

Incorrect chat template?

#5 opened 2 months ago by

bartowski

liked 2 models 3 months ago

nvidia/Mistral-NeMo-Minitron-8B-Instruct

Text Generation • Updated Oct 9 • 2.41k • 70

nvidia/Llama-3_1-Nemotron-51B-Instruct

Text Generation • Updated Oct 13 • 128k • 196

upvoted a paper 3 months ago

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Paper • 2409.17481 • Published Sep 26 • 46

liked a dataset 3 months ago

nvidia/Aegis-AI-Content-Safety-Dataset-1.0

Viewer • Updated Jun 28 • 12k • 674 • 46

New activity in nvidia/Nemotron-Mini-4B-Instruct 3 months ago

Update tokenizer_config.json

#4 opened 3 months ago by

andrewwa-nvidia

updated a model 3 months ago

nvidia/Nemotron-Mini-4B-Instruct

Updated Sep 23 • 47 • 134

liked a model 3 months ago

nvidia/Nemotron-Mini-4B-Instruct

Updated Sep 23 • 47 • 134

upvoted a collection 4 months ago

Minitron

Collection

A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 11 days ago • 59

upvoted a paper 4 months ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21 • 57

liked a model over 3 years ago

google/pegasus-xsum

Summarization • Updated Jan 24, 2023 • 154k • 182