RuntimeError: The size of tensor a (4096) must match the size of tensor b (4097) at non-singleton dimension 3

#12
by Yurkoff - opened

I get the following error when the context length is greater than 4096 tokens:

RuntimeError: The size of tensor a (4096) must match the size of tensor b (4097) at non-singleton dimension 3

As I understand it, there are two parameters in the config that together extend the context length (prompt plus model response) to 16384 tokens:

"max_position_embeddings": 4096,
"rope_scaling": {
    "factor": 4.0,

Are there any ideas on how to fix this error?

The problem was in transformers version 4.38. In version 4.44 it has been fixed and the model works fine.
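If anyone else hits this, a quick sanity check before loading the model is to compare the installed transformers version against the 4.44 threshold mentioned above. A minimal sketch (the helper function is hypothetical, not part of transformers):

```python
# Sketch: compare a version string against the 4.44 threshold
# reported in this thread to fix the tensor-size mismatch.
def needs_upgrade(installed: str, required: tuple = (4, 44)) -> bool:
    # Compare only major.minor; ignore patch and pre-release suffixes.
    parts = tuple(int(x) for x in installed.split(".")[:2])
    return parts < required

print(needs_upgrade("4.38.0"))  # True  -> version that hits the RuntimeError
print(needs_upgrade("4.44.2"))  # False -> version reported to work fine
```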

Yurkoff changed discussion status to closed