https://huggingface.co/deepseek-ai/DeepSeek-V3-Base

#515

by nicoboss - opened 7 days ago

Discussion

nicoboss

7 days ago

edited 7 days ago

Best and largest public base model ever: 685B and beats Claude 3.5 Sonnet.
No idea if it is supported by llama.cpp, but it needs to be handled manually due to its size anyway.

Models:

nicoboss

7 days ago

I just started downloading the model. I will let you know once I know if it is llama.cpp compatible.

nicoboss

7 days ago

As expected, it is not yet supported by llama.cpp:

INFO:hf-to-gguf:Loading model: DeepSeek-V3-Base
ERROR:hf-to-gguf:Model DeepseekV3ForCausalLM is not supported

I will archive it to hpool, and then we can do it as soon as llama.cpp implements support for it.
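For anyone who wants to check this before downloading everything: the architecture name in the error comes from the `architectures` field of the model's `config.json`. Below is a minimal sketch of such a pre-check. The `SUPPORTED_ARCHITECTURES` set and the function name are hypothetical, made up for illustration; this is not how the llama.cpp converter is implemented, and the real list of supported architectures lives in the converter itself.

```python
import json
from pathlib import Path

# Hypothetical allow-list for illustration only; the real converter
# maintains its own registry of supported architectures.
SUPPORTED_ARCHITECTURES = {"LlamaForCausalLM", "MistralForCausalLM"}


def unsupported_architectures(model_dir, supported=SUPPORTED_ARCHITECTURES):
    """Return the architectures listed in config.json that are not in `supported`."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Hugging Face configs store the model class names under "architectures".
    return [arch for arch in config.get("architectures", []) if arch not in supported]
```

Calling `unsupported_architectures("DeepSeek-V3-Base")` on a local copy would list `DeepseekV3ForCausalLM` for as long as that name is missing from the set you pass in.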

mradermacher

Owner 7 days ago

685! Well, it will probably be supported soon.

RichardErkhov

3 days ago

lmao what if I frankenmerge it like fatllama? how do you even run such a model... I wish I had my 1.5TB ram server ;(
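As a back-of-the-envelope answer to the "how do you even run it" part, here is a rough sketch of the weights-only footprint at a few bit widths. The 685B parameter count is the figure quoted in this thread; the bit widths are assumptions picked for illustration, and the estimate ignores KV cache, activations, and any per-block quantization overhead.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 685e9  # parameter count quoted in this thread

# Assumed bit widths, for illustration only.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_size_gb(PARAMS, bits):.0f} GB")
```

By this estimate, 16-bit weights alone would come close to filling a 1.5TB RAM server, while lower bit widths leave more headroom.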

mradermacher

Owner 3 days ago

lmao what if I frankenmerge it like fatllama?

Why would you do that, Richard? That is so totally out of character for you.
