3000+ context, no output

#1
by underlines - opened

I set everything exactly as described: ExLlama loader, max_seq_len 8192, compress_pos_emb 4. Short contexts work fine, but when I paste a ~3000-token context for summarization, the model generates 0 tokens.
I tried all of the Vicuna prompt templates and the instruct, chat-instruct, and chat modes.
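
For reference, the settings roughly correspond to something like the following sketch using ExLlama's standalone Python API rather than the webui (a minimal sketch only; the model directory, prompt, and generation length are placeholders, and the attribute names assume the current exllama repo):

```python
# Minimal sketch of the settings in question, using ExLlama's standalone Python
# API instead of the webui. The model directory and prompt are placeholders.
import glob, os

from model import ExLlama, ExLlamaCache, ExLlamaConfig   # modules from the exllama repo
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_dir = "models/TheBloke_Vicuna-33B-1-1-preview-SuperHOT-8K-GPTQ"  # placeholder path

config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]
config.max_seq_len = 8192        # extended SuperHOT context window
config.compress_pos_emb = 4.0    # 8192 / 2048 = 4x position-embedding compression

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

# A ~3000-token summarization prompt goes here; with the broken ExLlama build
# this returned nothing, while short prompts worked fine.
prompt = "USER: Summarize the following text: ...\nASSISTANT:"
print(generator.generate_simple(prompt, max_new_tokens=512))
```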

https://huggingface.co/TheBloke/Vicuna-33B-1-1-preview-SuperHOT-8K-GPTQ/discussions/1#649c0950272ee9fd6b635ea3

For people using TheBloke's Runpod template: the template wasn't pulling the updated ExLlama, but that is now fixed. Restart your pods or update ExLlama manually.

underlines changed discussion status to closed
