Mayank Mishra

mayank-mishra

AI & ML interests

Large Language Models, Distributed Training and Inference

mayank-mishra's activity

New activity in ibm-granite/granite-8b-code-instruct 20 days ago

Input context length

3
#6 opened 20 days ago by dyoung
New activity in ibm-granite/granite-8b-code-instruct 22 days ago

Official quants?

3
#2 opened about 1 month ago by joshuaturner
New activity in ibm-granite/granite-3b-code-base 24 days ago

Release GGUF models?

3
#5 opened about 2 months ago by CosmicSound
New activity in ibm-granite/granite-20b-code-base-GGUF about 1 month ago

3b, 8b, and 34b versions of GGUF?

1
#1 opened about 1 month ago by tombenninger
New activity in ibm-granite/granite-3b-code-base about 1 month ago

Licensing

6
#4 opened about 2 months ago by tonylek
New activity in ibm-granite/granite-34b-code-base about 1 month ago

update transformers also for this ?

3
#1 opened about 1 month ago by talrid
New activity in ibm-granite/granite-3b-code-instruct about 1 month ago

Onnx Model Produces Different Output

2
#2 opened about 1 month ago by runski
New activity in ibm-granite/granite-8b-code-instruct about 1 month ago

Response is not good as expected

5
#3 opened about 1 month ago by skumarai
New activity in ibm-granite/granite-3b-code-base about 2 months ago

Why was the logo removed?

1
#6 opened about 2 months ago by mrfakename
New activity in ibm-granite/granite-8b-code-instruct about 2 months ago

Model template

3
#1 opened about 2 months ago by alex0dd
New activity in ibm-granite/granite-3b-code-base about 2 months ago

Context length

5
#3 opened about 2 months ago by mrfakename

Question

3
#2 opened about 2 months ago by mrfakename
New activity in ibm-granite/granite-3b-code-instruct about 2 months ago
New activity in ibm-granite/granite-3b-code-base about 2 months ago

Initial model card version

#1 opened about 2 months ago by amezasor
New activity in blog-explorers/README 3 months ago

[Support] Community Articles

28
#5 opened 3 months ago by victor
New activity in ibm/MoLFormer-XL-both-10pct 3 months ago
New activity in aurora-m/aurora-m-biden-harris-redteamed 4 months ago

Update README.md

1
#1 opened 4 months ago by cabbage972
New activity in tiiuae/falcon-180B 9 months ago

Is Gigatron open source?

#6 opened 10 months ago by mayank-mishra
New activity in mayank-mishra/starcoder-GPTQ-4bit-128g about 1 year ago

extreme slowdown and weird output.

9
#4 opened about 1 year ago by abhimortal6

Compatibility with HF Code extension

2
#2 opened about 1 year ago by yclicc
New activity in mosaicml/mpt-7b about 1 year ago
New activity in mayank-mishra/starcoderbase-GPTQ-8bit-128g about 1 year ago

Running this on consumer hardware

2
#1 opened about 1 year ago by piratos
New activity in bigcode/starcoder about 1 year ago

What are 0..7.bin?

2
#14 opened about 1 year ago by lozhnikov
New activity in bigcode/starcoderbase about 1 year ago

KeyError: 'gpt_bigcode'

1
#4 opened about 1 year ago by Bilibili
New activity in bigcode/gpt_bigcode-santacoder about 1 year ago
New activity in bigscience/bloom over 1 year ago

how can i train bloom

4
#111 opened almost 2 years ago by s3rgio27

How much GPU memory needed?

4
#109 opened almost 2 years ago by mazib
New activity in bigscience/bloom almost 2 years ago