FYI: 11GB VRAM for 8k context with ExLlama @ ~29 tokens/s

#3 · opened by gardner
This comment has been hidden

This was meant to be posted in the GPTQ repo.

gardner changed discussion status to closed
