- Update README.md (#57, opened about 1 month ago by inuwamobarak)
- Batched inference on multi-GPUs (#56, 1 reply, opened about 1 month ago by d-i-o-n)
- Badly Encoded Tokens/Mojibake (#55, 1 reply, opened about 1 month ago by muchanem)
- Denied permission to DL (#51, 4 replies, opened about 1 month ago by TimPine)
- mlx_lm.server gives wonky answers (#49, opened about 1 month ago by conleysa)
- Tokenizer mismatch all the time (#47, 2 replies, opened about 1 month ago by tian9)
- Instruct format? (#44, 3 replies, opened about 1 month ago by m-conrad-202)
- MPS support quantification (#39, 4 replies, opened about 1 month ago by tonimelisma)
- Problem with the tokenizer (#37, 2 replies, opened about 1 month ago by Douedos)
- Error while downloading the model (#32, opened about 1 month ago by amarnadh1998)
- Garbage responses (#30, 2 replies, opened about 1 month ago by RainmakerP)
- GPU requirements (#29, 8 replies, opened about 1 month ago by Gerald001)
- Can I run it on CPU? (#28, 4 replies, opened about 1 month ago by aljbali)
- ChatLLM.cpp fully supports Llama-3 now (#24, opened about 1 month ago by J22)
- Transformers pipeline update please (#23, opened about 1 month ago by ip210)
- Is it really good? (#20, 5 replies, opened about 1 month ago by urtuuuu)
- OMG insomnia in the community (#16, opened about 1 month ago by Languido)
- What is the conversation template? (#14, 8 replies, opened about 1 month ago by aeminkocal)
- Max output tokens? (#12, 3 replies, opened about 1 month ago by stri8ted)
- IAM READYYYYYY (#3, 2 replies, opened about 1 month ago by 10100101j)
- Non-English language capabilities (#2, 6 replies, opened about 1 month ago by oliviermills)