3000+ context, no output
#1 opened by underlines
I set everything exactly as mentioned: ExLlama loader, max_seq_len 8192, compress_pos_emb 4. Short contexts work, but when I paste a ~3000-token context for summarization, zero tokens are generated.
I tried all the Vicuna prompt templates, as well as instruct, chat-instruct, and chat modes.
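For reference, the settings described above can also be passed as launch flags in text-generation-webui. This is a sketch, assuming a mid-2023 oobabooga build; flag names may differ across versions, so check `python server.py --help` on yours:

```shell
# Launch text-generation-webui with the ExLlama loader and 4x RoPE
# compression, so an 8192-token window maps onto a 2048-token base model
# (compress_pos_emb = max_seq_len / native context = 8192 / 2048 = 4).
python server.py \
  --loader exllama \
  --max_seq_len 8192 \
  --compress_pos_emb 4
```

The compress_pos_emb value should match the ratio between the extended and native context lengths, which is consistent with the settings reported in this thread.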
For people using TheBloke's Runpod template: it didn't update ExLlama, but this is now fixed. Restart your pods or update ExLlama.
underlines changed discussion status to closed