HF model (16bit?)

#6
by bdambrosio - opened

I know, everyone wants everything. But I'm running Llama-2-70B from the HF fp16 weights (using Dettmers' bitsandbytes in 4-bit) and it works wonderfully. I'd love to try this one the same way.
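For reference, this is roughly the loading pattern I mean: a minimal sketch of on-the-fly 4-bit quantization with bitsandbytes via transformers. The exact quantization settings here (NF4, double quantization, fp16 compute) are illustrative choices, not ones anyone stated in this thread:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings; this thread doesn't specify which were used.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 is the common default
    bnb_4bit_use_double_quant=True,      # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "meta-llama/Llama-2-70b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",   # spread layers across available GPU(s) and CPU RAM
)
```

The source checkpoint's dtype shouldn't matter much here: whether it's stored as fp16 or fp32, the weights are quantized to 4-bit as they're loaded.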

Good to hear. I didn't release an fp16 because I already linked to the original Stability AI model:

[screenshot: README section linking to the original Stability AI model]

But now that I look at it again, I realise it's actually in fp32. Is that what you meant? You'd like an fp16 to save the disk space of downloading their fp32? Because I believe you can load an fp32 with bitsandbytes just like an fp16.

For now, I've updated my README to reflect the fact that it's actually fp32, not fp16.

OK, thanks. I'm downloading the fp32 now. It's huge; I don't understand enough to know whether it will be a problem to load (I only have 128GB RAM, 48GB VRAM), but we'll see.

bdambrosio changed discussion status to closed

I think it should load fine, as it'll still be in 4-bit after the conversion. The only downside is having to store twice as much data on disk as with an fp16, and waiting longer for it to download.
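To put rough numbers on that (assuming ~70B parameters, which is a ballpark assumption, not a figure from this thread), disk size is roughly parameter count times bytes per parameter:

```python
# Back-of-the-envelope checkpoint sizes, assuming ~70e9 parameters
# (an assumption, not a figure stated in this thread).
params = 70e9

for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("4-bit", 0.5)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB")

# fp32:  ~280 GB  (what's being downloaded)
# fp16:  ~140 GB  (half the disk space)
# 4-bit: ~35 GB   (the in-memory footprint after quantization, either way)
```

So the 48GB VRAM + 128GB RAM setup should be fine for the 4-bit result; the fp32 source only costs disk space and download time.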

I will see about making an fp16. I'll try PRing it to them first rather than making my own, and if they don't want it I'll release it myself.
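The conversion itself is just a load-and-resave. Here's a minimal sketch using standard transformers calls, where the model ID and output path are hypothetical placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "stabilityai/original-fp32-model"   # hypothetical placeholder ID
dst = "./model-fp16"

# torch_dtype=torch.float16 casts the fp32 weights to fp16 as they load;
# low_cpu_mem_usage avoids materializing a second full copy in RAM.
model = AutoModelForCausalLM.from_pretrained(
    src, torch_dtype=torch.float16, low_cpu_mem_usage=True
)
model.save_pretrained(dst, safe_serialization=True)  # write .safetensors shards

AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```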

And thanks very much for the Patreon subscription!

You are such a massive resource for the open-source LLM community. How could I not!
If I can load the fp32 I'll let you know; it's still downloading, even over 1Gbps+ fiber.

OK, loading the fp32 was no problem! Yay, thanks.
