Strange n_ctx size

#3
by 010O11 - opened

When I normally use 32K context, it gives me >>> n_ctx 32848 = 6247.16 MiB
but with this model I get >>> llama_new_context_with_model: total VRAM used: 38378.45 MiB (model: 10055.54 MiB, context: 28322.91 MiB) [TheBloke's Q6_K quant]

Owner

When you normally use 32K context, is that with a 7B Mistral-based model?

I believe more parameters --> more memory for the same amount of context. I may be wrong.
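
For reference, context memory is dominated by the KV cache, which scales with the layer count and KV-head width rather than the total parameter count. Here's a minimal sketch of that estimate, assuming Mistral-7B-style attention hyperparameters (32 layers, 8 KV heads, 128-dim heads) and an f16 cache; `kv_cache_mib` is an illustrative helper, not a llama.cpp API:

```python
def kv_cache_mib(n_ctx: int, n_layer: int, n_head_kv: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    """Estimate KV-cache size in MiB (f16 cache = 2 bytes per element)."""
    # One K and one V tensor per layer, each n_ctx * n_head_kv * head_dim elements.
    elems = 2 * n_layer * n_ctx * n_head_kv * head_dim
    return elems * bytes_per_elem / 1024**2

# Assumed Mistral-7B-style values: 32 layers, 8 KV heads, head_dim 128.
print(kv_cache_mib(32848, n_layer=32, n_head_kv=8, head_dim=128))  # ~4106 MiB
```

The ~6247 MiB you report is plausibly this raw cache plus compute scratch buffers.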

Yeah, the 'normally' numbers are from 7B models. Is a difference that huge possible? Sorry then, I wasn't aware; I just thought it seemed strangely big...

7B.Q8_0.GGUF     n_ctx 32848 = 6247.16 MiB
4x7B.Q4_K_M.GGUF n_ctx 32848 = 6275.18 MiB
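
If the 4x7B here is a Mixtral-style MoE, those near-identical figures are expected: the experts replace only the feed-forward blocks, so the attention layers, and hence the KV cache (2 × 32 × 32848 × 8 × 128 × 2 bytes ≈ 4106 MiB under the same assumed hyperparameters as above), match the 7B layout. That would make the 28322.91 MiB context figure in the first log the outlier; it may be worth checking whether that run used an f32 KV cache or unusually large compute buffers.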
