Mayank Mishra

mayank-mishra

AI & ML interests

Large Language Models, Distributed Training and Inference

Recent Activity

upvoted a paper 21 days ago
upvoted a collection 22 days ago
SmolLM2
New activity 30 days ago
ibm-granite/granite-3.0-2b-instruct

mayank-mishra's activity

New activity in ibm-granite/granite-3.0-2b-instruct 30 days ago

add base model metadata

#3 opened 30 days ago by davanstrien
New activity in ibm-granite/granite-3.0-8b-instruct 30 days ago

add base model metadata

#5 opened 30 days ago by davanstrien
New activity in ibm-granite/granite-3.0-1b-a400m-instruct 30 days ago

Add base model metadata

#2 opened 30 days ago by davanstrien
New activity in ibm/PowerMoE-3b 2 months ago

torch and llama.cpp integration

3
#1 opened 2 months ago by TobDeBer
New activity in cfahlgren1/model-release-heatmap 4 months ago

Add IBM

3
#5 opened 4 months ago by mayank-mishra
New activity in ibm-granite/granite-8b-code-instruct-128k 4 months ago

Fix: link to 128k paper

1
#1 opened 4 months ago by timrbula
New activity in meta-llama/Llama-3.1-405B 4 months ago

405B or 410B ?

2
#8 opened 4 months ago by alielfilali01
New activity in ibm-granite/granite-3b-code-instruct-2k 5 months ago
New activity in ibm-granite/granite-8b-code-instruct-4k 6 months ago

Input context length

3
#6 opened 6 months ago by dyoung

Official quants?

3
#2 opened 6 months ago by joshuaturner
New activity in ibm-granite/granite-3b-code-base-2k 6 months ago

Release GGUF models?

3
#5 opened 7 months ago by CosmicSound
New activity in ibm-granite/granite-3b-code-base-2k 6 months ago

Licensing

6
#4 opened 7 months ago by tonylek
New activity in ibm-granite/granite-34b-code-base-8k 6 months ago